I'm dropping the RFC tag now as I feel we are starting to have something in good shape that can be pushed for more testing in the near future.
This is v7 of my attempt to add support for a generic pci_host_bridge controller created from a description passed in the device tree.
Changes from v6:
- Made pci_register_io_range() return an error if PCI_IOBASE is not defined. When the cleanup of PCI_IOBASE use happens, that will catch all those architectures that don't use virtual mapping of I/O ranges (like x86) or don't support PCI at all.
- Improved the error handling in of_pci_range_to_resource() and made it propagate the error as well.
- Bail out of the parsing of PCI ranges if of_pci_range_to_resource() fails.
Changes from v5:
- Tested by Tanmay Inamdar, thanks Tanmay!
- Dropped v5 5/7 "pci: Use parent domain number when allocating child busses".
- Added a weak implementation of pcibios_fixup_bridge_ranges() in drivers/pci/host-bridge.c so that architectures that enable CONFIG_OF and CONFIG_PCI don't suddenly get compilation errors. While at it, changed the signature of the function so that an error can be returned.
- With the new error code in pcibios_fixup_bridge_ranges(), reworked the error handling in pci_host_bridge_of_get_ranges() and of_create_pci_host_bridge().
- Added linux/slab.h to the #include list.
- Revisited the error path in pci_create_root_bus[_in_domain]() and fixed the case where failing to allocate the bus would not return an error.
Changes from v4:
- Export pci_find_host_bridge() to be used by arch code. There is scope for making the arch/arm64 version of pci_domain_nr() the default weak implementation, but that would double the size of this series in order to handle all #define versions of the pci_domain_nr() function, so I suggest keeping that for a separate cleanup series.
Changes from v3:
- Dynamically allocate the bus_range resource in of_create_pci_host_bridge().
- Fix the domain number used when creating child busses.
- Changed the domain number allocator to use atomic operations.
- Use ERR_PTR() to propagate the error out of pci_create_root_bus_in_domain() and of_create_pci_host_bridge().
Changes from v2:
- Use range->cpu_addr when calling pci_address_to_pio().
- Introduce the pci_register_io_range() helper function in order to register I/O ranges ahead of their conversion to PIO values. This is needed as no information is being stored yet regarding the range mapping, making pci_address_to_pio() fail. The default weak implementation does nothing, to match the default weak implementation of pci_address_to_pio() that expects direct mapping of physical addresses into PIO values (the x86 view).
Changes from v1:
- Add patch to fix conversion of IO ranges into IO resources.
- Added a domain_nr member to the pci_host_bridge structure, and a new function to create a root bus in a given domain number. In order to facilitate that I propose changing the order of initialisation between pci_host_bridge and its related bus in pci_create_root_bus(), as sort of a revert of 7b5436635800. This is done in patch 1/4 and 2/4.
- Added a simple allocator of domain numbers in drivers/pci/host-bridge.c. The code will first try to get a domain id from of_alias_get_id(..., "pci-domain") and, if that fails, assign the next unallocated domain id.
- Changed the name of the function that creates the generic host bridge from pci_host_bridge_of_init to of_create_pci_host_bridge and exported it as a GPL symbol.
v6 thread here: https://lkml.org/lkml/2014/3/5/179
v5 thread here: https://lkml.org/lkml/2014/3/4/318
v4 thread here: https://lkml.org/lkml/2014/3/3/301
v3 thread here: https://lkml.org/lkml/2014/2/28/216
v2 thread here: https://lkml.org/lkml/2014/2/27/245
v1 thread here: https://lkml.org/lkml/2014/2/3/380
Best regards, Liviu
Liviu Dudau (6):
  pci: Introduce pci_register_io_range() helper function.
  pci: OF: Fix the conversion of IO ranges into IO resources.
  pci: Create pci_host_bridge before its associated bus in pci_create_root_bus.
  pci: Introduce a domain number for pci_host_bridge.
  pci: Export find_pci_host_bridge() function.
  pci: Add support for creating a generic host_bridge from device tree
 drivers/of/address.c       |  49 ++++++++++
 drivers/pci/host-bridge.c  | 161 ++++++++++++++++++++++++++++++++++-
 drivers/pci/probe.c        |  73 ++++++++++------
 include/linux/of_address.h |  14 +--
 include/linux/pci.h        |  17 ++++
 5 files changed, 278 insertions(+), 36 deletions(-)
Some architectures do not share x86's simple view of the PCI I/O space and instead use a range of addresses that map to bus addresses. For some architectures these ranges will be expressed by OF bindings in a device tree file.
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined, that signals lack of support for PCI and we return an error.
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Tanmay Inamdar <tinamdar@apm.com>
---
 drivers/of/address.c       | 9 +++++++++
 include/linux/of_address.h | 1 +
 2 files changed, 10 insertions(+)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 1a54f1f..be958ed 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -619,6 +619,15 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
 }
 EXPORT_SYMBOL(of_get_address);
 
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
+{
+#ifndef PCI_IOBASE
+	return -EINVAL;
+#else
+	return 0;
+#endif
+}
+
 unsigned long __weak pci_address_to_pio(phys_addr_t address)
 {
 	if (address > IO_SPACE_LIMIT)
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 5f6ed6b..40c418d 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -56,6 +56,7 @@ extern void __iomem *of_iomap(struct device_node *device, int index);
 extern const __be32 *of_get_address(struct device_node *dev, int index,
 			u64 *size, unsigned int *flags);
 
+extern int pci_register_io_range(phys_addr_t addr, resource_size_t size);
 extern unsigned long pci_address_to_pio(phys_addr_t addr);
 
 extern int of_pci_range_parser_init(struct of_pci_range_parser *parser,
On Fri, Mar 14, 2014 at 03:34:27PM +0000, Liviu Dudau wrote:
Some architectures do not share x86 simple view of the PCI I/O space and instead use a range of addresses that map to bus addresses. For some architectures these ranges will be expressed by OF bindings in a device tree file.
It's true that the current Linux "x86 view of PCI I/O space" is pretty simple and limited. But I don't think that's a fundamental x86 limitation (other than the fact that the actual INB/OUTB/etc. CPU instructions themselves are limited to a single 64K I/O port space).
Host bridges on x86 could have MMIO apertures that turn CPU memory accesses into PCI port accesses. We could implement any number of I/O port spaces this way, by making the kernel inb()/outb()/etc. interfaces smart enough to use the memory-mapped space instead of (or in addition to) the INB/OUTB/etc. instructions.
ia64 does this (see arch/ia64/include/asm/io.h for a little description) and I think maybe one or two other arches have something similar.
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined that signals lack of support for PCI and we return an error.
I don't quite see how you intend to use this, because this series doesn't include any non-stub implementation of pci_register_io_range().
Is this anything like the ia64 strategy I mentioned above? If so, it would be really nice to unify some of this stuff.
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Tanmay Inamdar <tinamdar@apm.com>
---
 drivers/of/address.c       | 9 +++++++++
 include/linux/of_address.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 1a54f1f..be958ed 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -619,6 +619,15 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
 }
 EXPORT_SYMBOL(of_get_address);
 
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
+{
+#ifndef PCI_IOBASE
+	return -EINVAL;
+#else
+	return 0;
+#endif
+}
+
 unsigned long __weak pci_address_to_pio(phys_addr_t address)
 {
 	if (address > IO_SPACE_LIMIT)
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 5f6ed6b..40c418d 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -56,6 +56,7 @@ extern void __iomem *of_iomap(struct device_node *device, int index);
 extern const __be32 *of_get_address(struct device_node *dev, int index,
 			u64 *size, unsigned int *flags);
 
+extern int pci_register_io_range(phys_addr_t addr, resource_size_t size);
 extern unsigned long pci_address_to_pio(phys_addr_t addr);
 
 extern int of_pci_range_parser_init(struct of_pci_range_parser *parser,
--
1.9.0
On Fri, 2014-04-04 at 18:19 -0600, Bjorn Helgaas wrote:
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined that signals lack of support for PCI and we return an error.
I don't quite see how you intend to use this, because this series doesn't include any non-stub implementation of pci_register_io_range().
Is this anything like the ia64 strategy I mentioned above? If so, it would be really nice to unify some of this stuff.
We also use two different strategies on ppc32 and ppc64
- On ppc32, inb/outb turn into an MMIO access to _IO_BASE + port
That _IO_BASE is a variable which is initialized to the ioremapped address of the IO space MMIO aperture of the first bridge we discover. Then port numbers are "fixed up" on all other bridges so that the addition _IO_BASE + port fits the ioremapped address of the IO space on that bridge. A bit messy... and breaks whenever drivers copy port numbers into variables of the wrong type such as shorts.
- On ppc64, we have more virtual space, so instead we reserve a range of address space (fixed) for IO space, it's always the same. Bridges IO spaces are then mapped into that range, so we always have a positive offset from _IO_BASE which makes things a bit more robust and less "surprising" than ppc32. Additionally, the first 64k are reserved. They are only mapped if we see an ISA bridge (which some older machines have). Otherwise it's left unmapped, so crappy drivers trying to hard code x86 IO ports will blow up immediately which I deem better than silently whacking the wrong hardware. In addition, we have a mechanism we use on powernv to re-route accesses to that first 64k to the power8 built-in LPC bus which can have some legacy IOs on it such as a UART or a RTC.
Cheers, Ben.
On Sun, Apr 06, 2014 at 10:49:53AM +0100, Benjamin Herrenschmidt wrote:
On Fri, 2014-04-04 at 18:19 -0600, Bjorn Helgaas wrote:
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined that signals lack of support for PCI and we return an error.
I don't quite see how you intend to use this, because this series doesn't include any non-stub implementation of pci_register_io_range().
Is this anything like the ia64 strategy I mentioned above? If so, it would be really nice to unify some of this stuff.
We also use two different strategies on ppc32 and ppc64
- On ppc32, inb/outb turn into an MMIO access to _IO_BASE + port
That _IO_BASE is a variable which is initialized to the ioremapped address of the IO space MMIO aperture of the first bridge we discover. Then port numbers are "fixed up" on all other bridges so that the addition _IO_BASE + port fits the ioremapped address of the IO space on that bridge. A bit messy... and breaks whenever drivers copy port numbers into variables of the wrong type such as shorts.
- On ppc64, we have more virtual space, so instead we reserve a range
of address space (fixed) for IO space, it's always the same. Bridges IO spaces are then mapped into that range, so we always have a positive offset from _IO_BASE which makes things a bit more robust and less "surprising" than ppc32. Additionally, the first 64k are reserved. They are only mapped if we see an ISA bridge (which some older machines have). Otherwise it's left unmapped, so crappy drivers trying to hard code x86 IO ports will blow up immediately which I deem better than silently whacking the wrong hardware. In addition, we have a mechanism we use on powernv to re-route accesses to that first 64k to the power8 built-in LPC bus which can have some legacy IOs on it such as a UART or a RTC.
Cheers, Ben.
Hi Benjamin,
Thanks for the summary, it is really useful as I was recently looking into the code in that area. One thing I was trying to understand is why ppc needs _IO_BASE at all rather than using the generic PCI_IOBASE?
Best regards, Liviu
On Mon, 2014-04-07 at 09:35 +0100, Liviu Dudau wrote:
Thanks for the summary, it is really useful as I was recently looking into the code in that area. One thing I was trying to understand is why ppc needs _IO_BASE at all rather than using the generic PCI_IOBASE?
Perhaps because our code predates it ? :-) I haven't looked much into the semantics of PCI_IOBASE yet...
Cheers, Ben.
On Monday 07 April 2014 19:13:28 Benjamin Herrenschmidt wrote:
On Mon, 2014-04-07 at 09:35 +0100, Liviu Dudau wrote:
Thanks for the summary, it is really useful as I was recently looking into the code in that area. One thing I was trying to understand is why ppc needs _IO_BASE at all rather than using the generic PCI_IOBASE?
Perhaps because our code predates it ? I haven't looked much into the semantics of PCI_IOBASE yet...
Yes, I'm pretty sure that's all there is to it. PCI_IOBASE just happened to be an identifier we picked for asm-generic, but the use on PowerPC is much older than the generic file.
Arnd
On Sat, Apr 05, 2014 at 01:19:53AM +0100, Bjorn Helgaas wrote:
On Fri, Mar 14, 2014 at 03:34:27PM +0000, Liviu Dudau wrote:
Some architectures do not share x86 simple view of the PCI I/O space and instead use a range of addresses that map to bus addresses. For some architectures these ranges will be expressed by OF bindings in a device tree file.
It's true that the current Linux "x86 view of PCI I/O space" is pretty simple and limited. But I don't think that's a fundamental x86 limitation (other than the fact that the actual INB/OUTB/etc. CPU instructions themselves are limited to a single 64K I/O port space).
Hi Bjorn,
Thanks for reviewing this series.
I might've taken too dim a view of the x86 world. I tend to split the existing architectures into the ones that have special I/O instructions and the ones that map a region of memory into CPU space and do I/O transactions there as simple reads/writes.
Host bridges on x86 could have MMIO apertures that turn CPU memory accesses into PCI port accesses. We could implement any number of I/O port spaces this way, by making the kernel inb()/outb()/etc. interfaces smart enough to use the memory-mapped space instead of (or in addition to) the INB/OUTB/etc. instructions.
Right, sorry for my ignorance then: how does the device driver *currently* do the I/O transfer transparently of the implementation mechanism? Or does it have intimate knowledge of whether the device is behind a host bridge and can do MMIO, or is on an ISA or CF bus and then needs INB/OUTB? And if we make inb/outb smarter, does that mean that we need to change the drivers?
ia64 does this (see arch/ia64/include/asm/io.h for a little description) and I think maybe one or two other arches have something similar.
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined that signals lack of support for PCI and we return an error.
I don't quite see how you intend to use this, because this series doesn't include any non-stub implementation of pci_register_io_range().
Is this anything like the ia64 strategy I mentioned above? If so, it would be really nice to unify some of this stuff.
After discussions with Arnd and Catalin I now have a new series that moves some of the code from the arm64 series into this one. I am putting it through testing right now, as I am going to have to depend on another series that makes PCI_IOBASE defined only for architectures that do MMIO in order to choose the correct default implementation for these functions. My hope is that I will be able to send the series this week.
Best regards, Liviu
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Tanmay Inamdar <tinamdar@apm.com>
---
 drivers/of/address.c       | 9 +++++++++
 include/linux/of_address.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 1a54f1f..be958ed 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -619,6 +619,15 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
 }
 EXPORT_SYMBOL(of_get_address);
 
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
+{
+#ifndef PCI_IOBASE
+	return -EINVAL;
+#else
+	return 0;
+#endif
+}
+
 unsigned long __weak pci_address_to_pio(phys_addr_t address)
 {
 	if (address > IO_SPACE_LIMIT)
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 5f6ed6b..40c418d 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -56,6 +56,7 @@ extern void __iomem *of_iomap(struct device_node *device, int index);
 extern const __be32 *of_get_address(struct device_node *dev, int index,
 			u64 *size, unsigned int *flags);
 
+extern int pci_register_io_range(phys_addr_t addr, resource_size_t size);
 extern unsigned long pci_address_to_pio(phys_addr_t addr);
 
 extern int of_pci_range_parser_init(struct of_pci_range_parser *parser,
--
1.9.0
On Monday 07 April 2014 09:31:20 Liviu Dudau wrote:
On Sat, Apr 05, 2014 at 01:19:53AM +0100, Bjorn Helgaas wrote:
Host bridges on x86 could have MMIO apertures that turn CPU memory accesses into PCI port accesses. We could implement any number of I/O port spaces this way, by making the kernel inb()/outb()/etc. interfaces smart enough to use the memory-mapped space instead of (or in addition to) the INB/OUTB/etc. instructions.
PowerPC actually has this already, as CONFIG_PPC_INDIRECT_PIO meaning that access to PIO registers is bus specific, and there is also CONFIG_PPC_INDIRECT_MMIO for the case where MMIO access is not native.
Right, sorry for my ignorance then: how does the device driver *currently* do the I/O transfer transparently of the implementation mechanism? Or does it have intimate knowledge of whether the device is behind a host bridge and can do MMIO, or is on an ISA or CF bus and then needs INB/OUTB? And if we make inb/outb smarter, does that mean that we need to change the drivers?
The idea of that would be to not change drivers.
My preference here would be to only have a generic function for those architectures that have the simple MMIO access all the time.
ia64 does this (see arch/ia64/include/asm/io.h for a little description) and I think maybe one or two other arches have something similar.
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined that signals lack of support for PCI and we return an error.
I don't quite see how you intend to use this, because this series doesn't include any non-stub implementation of pci_register_io_range().
Is this anything like the ia64 strategy I mentioned above? If so, it would be really nice to unify some of this stuff.
After discussions with Arnd and Catalin I now have a new series that moves some of the code from the arm64 series into this one. I am putting it through testing right now, as I am going to have to depend on another series that makes PCI_IOBASE defined only for architectures that do MMIO in order to choose the correct default implementation for these functions. My hope is that I will be able to send the series this week.
I think migrating other architectures to use the same code should be a separate effort from adding a generic implementation that can be used by arm64. It's probably a good idea to have patches to convert arm32 and/or microblaze.
Arnd
On Mon, Apr 07, 2014 at 12:36:15PM +0100, Arnd Bergmann wrote:
On Monday 07 April 2014 09:31:20 Liviu Dudau wrote:
On Sat, Apr 05, 2014 at 01:19:53AM +0100, Bjorn Helgaas wrote:
Host bridges on x86 could have MMIO apertures that turn CPU memory accesses into PCI port accesses. We could implement any number of I/O port spaces this way, by making the kernel inb()/outb()/etc. interfaces smart enough to use the memory-mapped space instead of (or in addition to) the INB/OUTB/etc. instructions.
PowerPC actually has this already, as CONFIG_PPC_INDIRECT_PIO meaning that access to PIO registers is bus specific, and there is also CONFIG_PPC_INDIRECT_MMIO for the case where MMIO access is not native.
Right, sorry for my ignorance then: how does the device driver *currently* do the I/O transfer transparently of the implementation mechanism? Or does it have intimate knowledge of whether the device is behind a host bridge and can do MMIO, or is on an ISA or CF bus and then needs INB/OUTB? And if we make inb/outb smarter, does that mean that we need to change the drivers?
The idea of that would be to not change drivers.
My preference here would be to only have a generic function for those architectures that have the simple MMIO access all the time.
ia64 does this (see arch/ia64/include/asm/io.h for a little description) and I think maybe one or two other arches have something similar.
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined that signals lack of support for PCI and we return an error.
I don't quite see how you intend to use this, because this series doesn't include any non-stub implementation of pci_register_io_range().
Is this anything like the ia64 strategy I mentioned above? If so, it would be really nice to unify some of this stuff.
After discussions with Arnd and Catalin I now have a new series that moves some of the code from the arm64 series into this one. I am putting it through testing right now, as I am going to have to depend on another series that makes PCI_IOBASE defined only for architectures that do MMIO in order to choose the correct default implementation for these functions. My hope is that I will be able to send the series this week.
I think migrating other architectures to use the same code should be a separate effort from adding a generic implementation that can be used by arm64. It's probably a good idea to have patches to convert arm32 and/or microblaze.
Agree. My updated series only moves the arm64 code into the framework to make the arm64 part a noop.
Liviu
Arnd
On Mon, Apr 7, 2014 at 5:36 AM, Arnd Bergmann <arnd@arndb.de> wrote:
I think migrating other architectures to use the same code should be a separate effort from adding a generic implementation that can be used by arm64. It's probably a good idea to have patches to convert arm32 and/or microblaze.
Let me reiterate that I am 100% in favor of replacing arch-specific code with more generic implementations.
However, I am not 100% in favor of doing it as separate efforts (although maybe I could be convinced). The reasons I hesitate are that (1) if only one architecture uses a new "generic" implementation, we really don't know whether it is generic enough, (2) until I see the patches to convert other architectures, I have to assume that I'm the one who will write them, and (3) as soon as we add the code to drivers/pci, it becomes partly my headache to maintain it, even if only one arch benefits from it.
Please don't think I'm questioning anyone's intent or good will. It's just that I understand the business pressures, and I know how hard it can be to justify this sort of work to one's management, especially after the immediate problem has been solved.
Bjorn
On Mon, Apr 07, 2014 at 06:58:24PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 5:36 AM, Arnd Bergmann <arnd@arndb.de> wrote:
I think migrating other architectures to use the same code should be a separate effort from adding a generic implementation that can be used by arm64. It's probably a good idea to have patches to convert arm32 and/or microblaze.
Let me reiterate that I am 100% in favor of replacing arch-specific code with more generic implementations.
However, I am not 100% in favor of doing it as separate efforts (although maybe I could be convinced). The reasons I hesitate are that (1) if only one architecture uses a new "generic" implementation, we really don't know whether it is generic enough, (2) until I see the patches to convert other architectures, I have to assume that I'm the one who will write them, and (3) as soon as we add the code to drivers/pci, it becomes partly my headache to maintain it, even if only one arch benefits from it.
Please don't think I'm questioning anyone's intent or good will. It's just that I understand the business pressures, and I know how hard it can be to justify this sort of work to one's management, especially after the immediate problem has been solved.
I understand your concern. I guess the only way to prove my good intentions is to shut up and show the code.
Liviu
Bjorn
On Tuesday 08 April 2014 10:50:39 Liviu Dudau wrote:
On Mon, Apr 07, 2014 at 06:58:24PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 5:36 AM, Arnd Bergmann <arnd@arndb.de> wrote:
I think migrating other architectures to use the same code should be a separate effort from adding a generic implementation that can be used by arm64. It's probably a good idea to have patches to convert arm32 and/or microblaze.
Let me reiterate that I am 100% in favor of replacing arch-specific code with more generic implementations.
However, I am not 100% in favor of doing it as separate efforts (although maybe I could be convinced). The reasons I hesitate are that (1) if only one architecture uses a new "generic" implementation, we really don't know whether it is generic enough, (2) until I see the patches to convert other architectures, I have to assume that I'm the one who will write them, and (3) as soon as we add the code to drivers/pci, it becomes partly my headache to maintain it, even if only one arch benefits from it.
Fair enough.
My approach to the asm-generic infrastructure has mostly been to ensure that whoever adds a new architecture has to make things easier for the next person. For the PCI code it's clearly your call to pick whatever works best for you.
Please don't think I'm questioning anyone's intent or good will. It's just that I understand the business pressures, and I know how hard it can be to justify this sort of work to one's management, especially after the immediate problem has been solved.
I understand your concern. I guess the only way to prove my good intentions is to shut up and show the code.
I'd suggest looking at architectures in this order then:
* microblaze (this one is easy and wants to share code with us)
* arm32-multiplatform (obviously interesting, but not as easy as microblaze)
* powerpc64 (they are keen on sharing, code is similar to what you have)
* mips (this is really platform specific, some want to share drivers with arm32, others should keep their current code. Note that platform selection on mips is compile-time only, they don't have to do it all the same way)
* powerpc32 (their code is currently different, might not be worth it)
My preference would be to have only the first two done initially and leave the other ones up to architecture maintainers, but Bjorn should say how much he wants to see get done.
Arnd
On Tue, Apr 8, 2014 at 4:22 AM, Arnd Bergmann <arnd@arndb.de> wrote:
On Tuesday 08 April 2014 10:50:39 Liviu Dudau wrote:
On Mon, Apr 07, 2014 at 06:58:24PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 5:36 AM, Arnd Bergmann <arnd@arndb.de> wrote:
I think migrating other architectures to use the same code should be a separate effort from adding a generic implementation that can be used by arm64. It's probably a good idea to have patches to convert arm32 and/or microblaze.
Let me reiterate that I am 100% in favor of replacing arch-specific code with more generic implementations.
However, I am not 100% in favor of doing it as separate efforts (although maybe I could be convinced). The reasons I hesitate are that (1) if only one architecture uses a new "generic" implementation, we really don't know whether it is generic enough, (2) until I see the patches to convert other architectures, I have to assume that I'm the one who will write them, and (3) as soon as we add the code to drivers/pci, it becomes partly my headache to maintain it, even if only one arch benefits from it.
Fair enough.
My approach to the asm-generic infrastructure has mostly been to ensure that whoever adds a new architecture has to make things easier for the next person.
That's a good rule. But if we add a generic implementation used only by one architecture, the overall complexity has increased (we added new unshared code), so the next person has to look at N+1 existing implementations. If we even convert one existing arch, that seems like an improvement: we have N implementations with one being used by at least two arches.
Bjorn
(sorry for replying to a months old thread)
On Mon, Apr 07, 2014 at 06:58:24PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 5:36 AM, Arnd Bergmann <arnd@arndb.de> wrote:
I think migrating other architectures to use the same code should be a separate effort from adding a generic implementation that can be used by arm64. It's probably a good idea to have patches to convert arm32 and/or microblaze.
Let me reiterate that I am 100% in favor of replacing arch-specific code with more generic implementations.
However, I am not 100% in favor of doing it as separate efforts (although maybe I could be convinced). The reasons I hesitate are that (1) if only one architecture uses a new "generic" implementation, we really don't know whether it is generic enough, (2) until I see the patches to convert other architectures, I have to assume that I'm the one who will write them, and (3) as soon as we add the code to drivers/pci, it becomes partly my headache to maintain it, even if only one arch benefits from it.
I agree and understand your point.
Please don't think I'm questioning anyone's intent or good will. It's just that I understand the business pressures, and I know how hard it can be to justify this sort of work to one's management, especially after the immediate problem has been solved.
But, unfortunately, that's something we failed to address in reasonable time (even though I was one of the proponents of the generic PCIe implementation). This work is very likely to slip further into the late part of this year and I am aware that several ARM partners are blocked on the (upstream) availability of PCIe support for the arm64 kernel.
Although a bit late, I'm raising this now and hopefully we'll come to a conclusion soon. Delaying arm64 PCIe support even further is not a real option, which leaves us with:
1. Someone else (with enough PCIe knowledge) volunteering to take over soon, or
2. Dropping Liviu's work and going for an arm64-specific implementation (most likely based on the arm32 implementation, see below)
First option is ideal but there is work to do as laid out by Arnd here:
http://article.gmane.org/gmane.linux.kernel/1679304
The latest patches from Liviu are here (they only target arm64 and there are additional comments to be addressed from the above thread):
http://linux-arm.org/git?p=linux-ld.git%3Ba=shortlog%3Bh=refs/heads/for-upst...
The main reason for the second option is timing. We could temporarily move Liviu's code under arch/arm64 with the hope that we generalise it later. However, the risk is even higher that once the code is in mainline, the generic implementation won't happen. In which case, I don't see much point in departing from the arm32 PCI API, making a bios32 clone the best second option.
For the alternative implementation above, we already have a heavily cut down version of the arm32 PCI support but only tested in a virtual environment so far:
https://git.kernel.org/cgit/linux/kernel/git/will/linux.git/log/?h=pci/bios3...
In conclusion, unless someone volunteers for the first option fairly soon, we'll post the alternative patches for review and take it from there.
Thanks.
On Thu, Jun 26, 2014 at 09:59:26AM +0100, Catalin Marinas wrote:
(sorry for replying to a months old thread)
On Mon, Apr 07, 2014 at 06:58:24PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 5:36 AM, Arnd Bergmann arnd@arndb.de wrote:
I think migrating other architectures to use the same code should be a separate effort from adding a generic implementation that can be used by arm64. It's probably a good idea to have patches to convert arm32 and/or microblaze.
Let me reiterate that I am 100% in favor of replacing arch-specific code with more generic implementations.
However, I am not 100% in favor of doing it as separate efforts (although maybe I could be convinced). The reasons I hesitate are that (1) if only one architecture uses a new "generic" implementation, we really don't know whether it is generic enough, (2) until I see the patches to convert other architectures, I have to assume that I'm the one who will write them, and (3) as soon as we add the code to drivers/pci, it becomes partly my headache to maintain it, even if only one arch benefits from it.
I agree and understand your point.
Please don't think I'm questioning anyone's intent or good will. It's just that I understand the business pressures, and I know how hard it can be to justify this sort of work to one's management, especially after the immediate problem has been solved.
But, unfortunately, that's something we failed to address in reasonable time (even though I was one of the proponents of the generic PCIe implementation). This work is very likely to slip further into the late part of this year and I am aware that several ARM partners are blocked on the (upstream) availability of PCIe support for the arm64 kernel.
Although a bit late, I'm raising this now and hopefully we'll come to a conclusion soon. Delaying arm64 PCIe support even further is not a real option, which leaves us with:
- Someone else (with enough PCIe knowledge) volunteering to take over soon or
- Dropping Liviu's work and going for an arm64-specific implementation (most likely based on the arm32 implementation, see below)
First option is ideal but there is work to do as laid out by Arnd here:
http://article.gmane.org/gmane.linux.kernel/1679304
The latest patches from Liviu are here (they only target arm64 and there are additional comments to be addressed from the above thread):
http://linux-arm.org/git?p=linux-ld.git%3Ba=shortlog%3Bh=refs/heads/for-upst...
The main reason for the second option is timing. We could temporarily move Liviu's code under arch/arm64 with the hope that we generalise it later. However, the risk is even higher that once the code is in mainline, the generic implementation won't happen. In which case, I don't see much point in departing from the arm32 PCI API, making a bios32 clone the best second option.
For the alternative implementation above, we already have a heavily cut down version of the arm32 PCI support but only tested in a virtual environment so far:
https://git.kernel.org/cgit/linux/kernel/git/will/linux.git/log/?h=pci/bios3...
In conclusion, unless someone volunteers for the first option fairly soon, we'll post the alternative patches for review and take it from there.
That would be a huge step backwards IMO and a huge disappointment. If you go with the alternative patches from Will you will basically reset every partner's implementation that has been built on top of my patches (when they did so with the understanding that my series will be the one ARM will support and publish) *and* make anyone's attempt to create a generic implementation harder, as they will have to undo this code to remove the arch-specific parts.
While part of the blame is mine to carry, as I have dragged some of this work on for too long, there are things I could not have done differently due to internal pressures in the project I work on. It is my intent to resume work on it as soon as possible, but life is such at the moment that I have to dedicate my time to other things.
Best regards, Liviu
Thanks.
-- Catalin
On Thu, Jun 26, 2014 at 10:30:29AM +0100, Liviu Dudau wrote:
On Thu, Jun 26, 2014 at 09:59:26AM +0100, Catalin Marinas wrote:
Although a bit late, I'm raising this now and hopefully we'll come to a conclusion soon. Delaying arm64 PCIe support even further is not a real option, which leaves us with:
- Someone else (with enough PCIe knowledge) volunteering to take over soon or
- Dropping Liviu's work and going for an arm64-specific implementation (most likely based on the arm32 implementation, see below)
[...]
In conclusion, unless someone volunteers for the first option fairly soon, we'll post the alternative patches for review and take it from there.
That would be a huge step backwards IMO and a huge disappointment. If you go with the alternative patches from Will you will basically reset every partner's implementation that has been built on top of my patches (when they did so with the understanding that my series will be the one ARM will support and publish) *and* make anyone's attempt to create a generic implementation harder, as they will have to undo this code to remove the arch-specific parts.
I fully agree and the alternative patchset is definitely _not_ my preferred solution. You can read this email as a request for help to complete the work (whether it comes from ARM, Linaro or other interested parties). I don't mean taking over the whole patchset but potentially helping with other arch conversion (microblaze, arm multi-platform).
(however, if the generic PCIe work won't happen in reasonable time, we need to set some deadline rather than keeping the patchset out of tree indefinitely)
On Thu, Jun 26, 2014 at 03:11:38PM +0100, Catalin Marinas wrote:
On Thu, Jun 26, 2014 at 10:30:29AM +0100, Liviu Dudau wrote:
On Thu, Jun 26, 2014 at 09:59:26AM +0100, Catalin Marinas wrote:
Although a bit late, I'm raising this now and hopefully we'll come to a conclusion soon. Delaying arm64 PCIe support even further is not a real option, which leaves us with:
- Someone else (with enough PCIe knowledge) volunteering to take over soon or
- Dropping Liviu's work and going for an arm64-specific implementation (most likely based on the arm32 implementation, see below)
[...]
In conclusion, unless someone volunteers for the first option fairly soon, we'll post the alternative patches for review and take it from there.
That would be a huge step backwards IMO and a huge disappointment. If you go with the alternative patches from Will you will basically reset every partner's implementation that has been built on top of my patches (when they did so with the understanding that my series will be the one ARM will support and publish) *and* make anyone's attempt to create a generic implementation harder, as they will have to undo this code to remove the arch-specific parts.
I fully agree and the alternative patchset is definitely _not_ my preferred solution. You can read this email as a request for help to complete the work (whether it comes from ARM, Linaro or other interested parties). I don't mean taking over the whole patchset but potentially helping with other arch conversion (microblaze, arm multi-platform).
I feel it's also worth pointing out that I didn't write that code with the intention of getting it merged, nor as a competing solution to what Liviu was proposing at the time. It was merely a development tool to enable some of the SMMU and GICv3 work that Marc and I have been working on recently.
Will
Adding Michal Simek...
On Thu, Jun 26, 2014 at 3:59 AM, Catalin Marinas catalin.marinas@arm.com wrote:
(sorry for replying to a months old thread)
On Mon, Apr 07, 2014 at 06:58:24PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 5:36 AM, Arnd Bergmann arnd@arndb.de wrote:
I think migrating other architectures to use the same code should be a separate effort from adding a generic implementation that can be used by arm64. It's probably a good idea to have patches to convert arm32 and/or microblaze.
Let me reiterate that I am 100% in favor of replacing arch-specific code with more generic implementations.
However, I am not 100% in favor of doing it as separate efforts (although maybe I could be convinced). The reasons I hesitate are that (1) if only one architecture uses a new "generic" implementation, we really don't know whether it is generic enough, (2) until I see the patches to convert other architectures, I have to assume that I'm the one who will write them, and (3) as soon as we add the code to drivers/pci, it becomes partly my headache to maintain it, even if only one arch benefits from it.
I agree and understand your point.
Please don't think I'm questioning anyone's intent or good will. It's just that I understand the business pressures, and I know how hard it can be to justify this sort of work to one's management, especially after the immediate problem has been solved.
But, unfortunately, that's something we failed to address in reasonable time (even though I was one of the proponents of the generic PCIe implementation). This work is very likely to slip further into the late part of this year and I am aware that several ARM partners are blocked on the (upstream) availability of PCIe support for the arm64 kernel.
Although a bit late, I'm raising this now and hopefully we'll come to a conclusion soon. Delaying arm64 PCIe support even further is not a real option, which leaves us with:
- Someone else (with enough PCIe knowledge) volunteering to take over soon or
Well, I might have 2 months ago, but now I'm pretty booked up.
- Dropping Liviu's work and going for an arm64-specific implementation (most likely based on the arm32 implementation, see below)
3. Keeping Liviu's patches, leaving in some of the architecture-specific bits. I know Arnd and I both commented on it still needing more common pieces, but compared to option 2 that would be way better.
Let's look at the patches in question:
3e71867 pci: Introduce pci_register_io_range() helper function.
6681dff pci: OF: Fix the conversion of IO ranges into IO resources.
Both OF patches. I'll happily merge them.
2d5dd85 pci: Create pci_host_bridge before its associated bus in pci_create_root_bus.
f6f2854 pci: Introduce a domain number for pci_host_bridge.
524a9f5 pci: Export find_pci_host_bridge() function.
These don't seem to be too controversial.
fb75718 pci: of: Parse and map the IRQ when adding the PCI device.
6 LOC. Hardly controversial.
920a685 pci: Add support for creating a generic host_bridge from device tree
This function could be moved to drivers/of/of_pci.c if having it in drivers/pci is too much maintenance burden. However, nearly the same code is already being duplicated in every DT enabled ARM PCI host driver and will continue as more PCI hosts are added. So this isn't really a question of converting other architectures to common PCI host infrastructure, but converting DT based PCI hosts to common infrastructure. ARM is the only arch moving host drivers to drivers/pci ATM. Until other architectures start doing that, converting them is pointless.
bcf5c10 Fix ioport_map() for !CONFIG_GENERIC_IOMAP cases.
Seems like an independent fix that should be applied regardless.
7cfde80 arm64: Add architecture support for PCI
What is here is really just a function of which option we pick.
First option is ideal but there is work to do as laid out by Arnd here:
I don't agree arm32 is harder than microblaze. Yes, converting ALL of arm would be, but that is not necessary. With Liviu's latest branch the hacks I previously needed are gone (thanks!), and this is all I need to get Versatile PCI working (under QEMU):
diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index 3d23418..22b7529 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -178,6 +178,7 @@ static inline void __iomem *__typesafe_io(unsigned long addr)
 
 /* PCI fixed i/o mapping */
 #define PCI_IO_VIRT_BASE	0xfee00000
+#define PCI_IOBASE		PCI_IO_VIRT_BASE
 
 #if defined(CONFIG_PCI)
 void pci_ioremap_set_mem_type(int mem_type);
Hum, so I guess now I've converted ARM...
Here's a branch with my changes[1]. And BTW, it also has multi-platform support for Versatile as moving the PCI host to DT (and drivers/pci/host) is about the last remaining obstacle.
Rob
[1] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git versatile-pci-v2
On Thursday 26 June 2014 19:44:21 Rob Herring wrote:
Although a bit late, I'm raising this now and hopefully we'll come to a conclusion soon. Delaying arm64 PCIe support even further is not a real option, which leaves us with:
- Someone else (with enough PCIe knowledge) volunteering to take over soon or
Well, I might have 2 months ago, but now I'm pretty booked up.
- Dropping Liviu's work and going for an arm64-specific implementation (most likely based on the arm32 implementation, see below)
- Keeping Liviu's patches, leaving in some of the architecture-specific bits. I know Arnd and I both commented on it still needing more common pieces, but compared to option 2 that would be way better.
Agreed.
920a685 pci: Add support for creating a generic host_bridge from device tree
This function could be moved to drivers/of/of_pci.c if having it in drivers/pci is too much maintenance burden. However, nearly the same code is already being duplicated in every DT enabled ARM PCI host driver and will continue as more PCI hosts are added. So this isn't really a question of converting other architectures to common PCI host infrastructure, but converting DT based PCI hosts to common infrastructure. ARM is the only arch moving host drivers to drivers/pci ATM. Until other architectures start doing that, converting them is pointless.
This is the most important part in my mind. Every implementation gets this code wrong at least initially, and we have to provide some generic helper to get the broken code out of host drivers. I don't care whether that helper is part of the PCI core or part of the OF core.
7cfde80 arm64: Add architecture support for PCI
What is here is really just a function of which option we pick.
First option is ideal but there is work to do as laid out by Arnd here:
I don't agree arm32 is harder than microblaze. Yes, converting ALL of arm would be, but that is not necessary. With Liviu's latest branch the hacks I previously needed are gone (thanks!), and this is all I need to get Versatile PCI working (under QEMU):
I meant converting all of arm32 would be harder, but I agree we don't have to do it. It would be nice to convert all of the drivers/pci/host drivers though, iow all multiplatform-enabled ones, and leaving the current arm32 pci implementation for the platforms we don't want to convert to multiplatform anyway (footbridge, iop, ixp4xx, ks8695 (?), pxa, sa1100).
Arnd
On Fri, Jun 27, 2014 at 12:03:34PM +0100, Arnd Bergmann wrote:
On Thursday 26 June 2014 19:44:21 Rob Herring wrote:
I don't agree arm32 is harder than microblaze. Yes, converting ALL of arm would be, but that is not necessary. With Liviu's latest branch the hacks I previously needed are gone (thanks!), and this is all I need to get Versatile PCI working (under QEMU):
I meant converting all of arm32 would be harder, but I agree we don't have to do it. It would be nice to convert all of the drivers/pci/host drivers though, iow all multiplatform-enabled ones, and leaving the current arm32 pci implementation for the platforms we don't want to convert to multiplatform anyway (footbridge, iop, ixp4xx, ks8695 (?), pxa, sa1100).
I'm more than happy to convert the generic host controller we merged recently, but I'd probably want the core changes merged first so that I know I'm not wasting my time!
Will
On Friday 27 June 2014 13:49:49 Will Deacon wrote:
On Fri, Jun 27, 2014 at 12:03:34PM +0100, Arnd Bergmann wrote:
On Thursday 26 June 2014 19:44:21 Rob Herring wrote:
I don't agree arm32 is harder than microblaze. Yes, converting ALL of arm would be, but that is not necessary. With Liviu's latest branch the hacks I previously needed are gone (thanks!), and this is all I need to get Versatile PCI working (under QEMU):
I meant converting all of arm32 would be harder, but I agree we don't have to do it. It would be nice to convert all of the drivers/pci/host drivers though, iow all multiplatform-enabled ones, and leaving the current arm32 pci implementation for the platforms we don't want to convert to multiplatform anyway (footbridge, iop, ixp4xx, ks8695 (?), pxa, sa1100).
I'm more than happy to convert the generic host controller we merged recently, but I'd probably want the core changes merged first so that I know I'm not wasting my time!
That is definitely fine with me, but it's Bjorn's decision.
Arnd
On Fri, Jun 27, 2014 at 02:16:28PM +0100, Arnd Bergmann wrote:
On Friday 27 June 2014 13:49:49 Will Deacon wrote:
On Fri, Jun 27, 2014 at 12:03:34PM +0100, Arnd Bergmann wrote:
On Thursday 26 June 2014 19:44:21 Rob Herring wrote:
I don't agree arm32 is harder than microblaze. Yes, converting ALL of arm would be, but that is not necessary. With Liviu's latest branch the hacks I previously needed are gone (thanks!), and this is all I need to get Versatile PCI working (under QEMU):
I meant converting all of arm32 would be harder, but I agree we don't have to do it. It would be nice to convert all of the drivers/pci/host drivers though, iow all multiplatform-enabled ones, and leaving the current arm32 pci implementation for the platforms we don't want to convert to multiplatform anyway (footbridge, iop, ixp4xx, ks8695 (?), pxa, sa1100).
I'm more than happy to convert the generic host controller we merged recently, but I'd probably want the core changes merged first so that I know I'm not wasting my time!
That is definitely fine with me, but it's Bjorn's decision.
Or the changes to the generic host controller can be part of Liviu's patch series (maybe together with other PCI host drivers like designware, as time allows).
On 06/27/2014 07:49 AM, Will Deacon wrote:
On Fri, Jun 27, 2014 at 12:03:34PM +0100, Arnd Bergmann wrote:
On Thursday 26 June 2014 19:44:21 Rob Herring wrote:
I don't agree arm32 is harder than microblaze. Yes, converting ALL of arm would be, but that is not necessary. With Liviu's latest branch the hacks I previously needed are gone (thanks!), and this is all I need to get Versatile PCI working (under QEMU):
I meant converting all of arm32 would be harder, but I agree we don't have to do it. It would be nice to convert all of the drivers/pci/host drivers though, iow all multiplatform-enabled ones, and leaving the current arm32 pci implementation for the platforms we don't want to convert to multiplatform anyway (footbridge, iop, ixp4xx, ks8695 (?), pxa, sa1100).
I'm more than happy to convert the generic host controller we merged recently, but I'd probably want the core changes merged first so that I know I'm not wasting my time!
Something like this untested patch...
Another issue I found still present is that pci_ioremap_io() needs some work to unify with the arm implementation. That's a matter of changing the function from taking an offset to taking an I/O resource. That should be a pretty mechanical change.
Also, there is a potential memory leak because there is no undo path for of_create_pci_host_bridge().
Rob
8<------------------------------------------------------------------
From 301b402631b2867ced38d5533586da0bd888045c Mon Sep 17 00:00:00 2001
From: Rob Herring <robh@kernel.org>
Date: Fri, 27 Jun 2014 09:46:09 -0500
Subject: [PATCH] pci: generic-host: refactor to use common range parsing

Signed-off-by: Rob Herring <robh@kernel.org>
---
 drivers/pci/host/pci-host-generic.c | 191 ++++++------------------------------
 1 file changed, 29 insertions(+), 162 deletions(-)

diff --git a/drivers/pci/host/pci-host-generic.c b/drivers/pci/host/pci-host-generic.c
index 44fe6aa..57fd1e1 100644
--- a/drivers/pci/host/pci-host-generic.c
+++ b/drivers/pci/host/pci-host-generic.c
@@ -32,16 +32,14 @@ struct gen_pci_cfg_bus_ops {
 struct gen_pci_cfg_windows {
 	struct resource				res;
-	struct resource				bus_range;
 	void __iomem				**win;
 
 	const struct gen_pci_cfg_bus_ops	*ops;
 };
 
 struct gen_pci {
-	struct pci_host_bridge			host;
+	struct pci_host_bridge			*host;
 	struct gen_pci_cfg_windows		cfg;
-	struct list_head			resources;
 };
 
 static void __iomem *gen_pci_map_cfg_bus_cam(struct pci_bus *bus,
@@ -50,7 +48,7 @@ static void __iomem *gen_pci_map_cfg_bus_cam(struct pci_bus *bus,
 {
 	struct pci_sys_data *sys = bus->sysdata;
 	struct gen_pci *pci = sys->private_data;
-	resource_size_t idx = bus->number - pci->cfg.bus_range.start;
+	resource_size_t idx = bus->number - bus->busn_res.start;
 
 	return pci->cfg.win[idx] + ((devfn << 8) | where);
 }
@@ -66,7 +64,7 @@ static void __iomem *gen_pci_map_cfg_bus_ecam(struct pci_bus *bus,
 {
 	struct pci_sys_data *sys = bus->sysdata;
 	struct gen_pci *pci = sys->private_data;
-	resource_size_t idx = bus->number - pci->cfg.bus_range.start;
+	resource_size_t idx = bus->number - bus->busn_res.start;
 
 	return pci->cfg.win[idx] + ((devfn << 12) | where);
 }
@@ -138,154 +136,32 @@ static const struct of_device_id gen_pci_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, gen_pci_of_match);
 
-static int gen_pci_calc_io_offset(struct device *dev,
-				  struct of_pci_range *range,
-				  struct resource *res,
-				  resource_size_t *offset)
-{
-	static atomic_t wins = ATOMIC_INIT(0);
-	int err, idx, max_win;
-	unsigned int window;
-
-	if (!PAGE_ALIGNED(range->cpu_addr))
-		return -EINVAL;
-
-	max_win = (IO_SPACE_LIMIT + 1) / SZ_64K;
-	idx = atomic_inc_return(&wins);
-	if (idx > max_win)
-		return -ENOSPC;
-
-	window = (idx - 1) * SZ_64K;
-	err = pci_ioremap_io(window, range->cpu_addr);
-	if (err)
-		return err;
-
-	of_pci_range_to_resource(range, dev->of_node, res);
-	res->start = window;
-	res->end = res->start + range->size - 1;
-	*offset = window - range->pci_addr;
-	return 0;
-}
-
-static int gen_pci_calc_mem_offset(struct device *dev,
-				   struct of_pci_range *range,
-				   struct resource *res,
-				   resource_size_t *offset)
-{
-	of_pci_range_to_resource(range, dev->of_node, res);
-	*offset = range->cpu_addr - range->pci_addr;
-	return 0;
-}
-
-static void gen_pci_release_of_pci_ranges(struct gen_pci *pci)
-{
-	struct pci_host_bridge_window *win;
-
-	list_for_each_entry(win, &pci->resources, list)
-		release_resource(win->res);
-
-	pci_free_resource_list(&pci->resources);
-}
-
-static int gen_pci_parse_request_of_pci_ranges(struct gen_pci *pci)
-{
-	struct of_pci_range range;
-	struct of_pci_range_parser parser;
-	int err, res_valid = 0;
-	struct device *dev = pci->host.dev.parent;
-	struct device_node *np = dev->of_node;
-
-	if (of_pci_range_parser_init(&parser, np)) {
-		dev_err(dev, "missing \"ranges\" property\n");
-		return -EINVAL;
-	}
-
-	for_each_of_pci_range(&parser, &range) {
-		struct resource *parent, *res;
-		resource_size_t offset;
-		u32 restype = range.flags & IORESOURCE_TYPE_BITS;
-
-		res = devm_kmalloc(dev, sizeof(*res), GFP_KERNEL);
-		if (!res) {
-			err = -ENOMEM;
-			goto out_release_res;
-		}
-
-		switch (restype) {
-		case IORESOURCE_IO:
-			parent = &ioport_resource;
-			err = gen_pci_calc_io_offset(dev, &range, res, &offset);
-			break;
-		case IORESOURCE_MEM:
-			parent = &iomem_resource;
-			err = gen_pci_calc_mem_offset(dev, &range, res, &offset);
-			res_valid |= !(res->flags & IORESOURCE_PREFETCH || err);
-			break;
-		default:
-			err = -EINVAL;
-			continue;
-		}
-
-		if (err) {
-			dev_warn(dev,
-				 "error %d: failed to add resource [type 0x%x, %lld bytes]\n",
-				 err, restype, range.size);
-			continue;
-		}
-
-		err = request_resource(parent, res);
-		if (err)
-			goto out_release_res;
-
-		pci_add_resource_offset(&pci->resources, res, offset);
-	}
-
-	if (!res_valid) {
-		dev_err(dev, "non-prefetchable memory resource required\n");
-		err = -EINVAL;
-		goto out_release_res;
-	}
-
-	return 0;
-
-out_release_res:
-	gen_pci_release_of_pci_ranges(pci);
-	return err;
-}
-
 static int gen_pci_parse_map_cfg_windows(struct gen_pci *pci)
 {
 	int err;
 	u8 bus_max;
 	resource_size_t busn;
-	struct resource *bus_range;
-	struct device *dev = pci->host.dev.parent;
+	struct pci_bus *bus = pci->host->bus;
+	struct resource *bus_range = &bus->busn_res;
+	struct device *dev = pci->host->dev.parent;
 	struct device_node *np = dev->of_node;
 
-	if (of_pci_parse_bus_range(np, &pci->cfg.bus_range))
-		pci->cfg.bus_range = (struct resource) {
-			.name	= np->name,
-			.start	= 0,
-			.end	= 0xff,
-			.flags	= IORESOURCE_BUS,
-		};
-
 	err = of_address_to_resource(np, 0, &pci->cfg.res);
 	if (err) {
 		dev_err(dev, "missing \"reg\" property\n");
 		return err;
 	}
 
-	pci->cfg.win = devm_kcalloc(dev, resource_size(&pci->cfg.bus_range),
+	pci->cfg.win = devm_kcalloc(dev, resource_size(bus_range),
 				    sizeof(*pci->cfg.win), GFP_KERNEL);
 	if (!pci->cfg.win)
 		return -ENOMEM;
 
 	/* Limit the bus-range to fit within reg */
-	bus_max = pci->cfg.bus_range.start +
+	bus_max = bus_range->start +
 		  (resource_size(&pci->cfg.res) >> pci->cfg.ops->bus_shift) - 1;
-	pci->cfg.bus_range.end = min_t(resource_size_t, pci->cfg.bus_range.end,
-				       bus_max);
+	pci_bus_update_busn_res_end(bus, min_t(resource_size_t,
					       bus_range->end, bus_max));
 
 	/* Map our Configuration Space windows */
 	if (!devm_request_mem_region(dev, pci->cfg.res.start,
@@ -293,7 +169,6 @@ static int gen_pci_parse_map_cfg_windows(struct gen_pci *pci)
 				     "Configuration Space"))
 		return -ENOMEM;
 
-	bus_range = &pci->cfg.bus_range;
 	for (busn = bus_range->start; busn <= bus_range->end; ++busn) {
 		u32 idx = busn - bus_range->start;
 		u32 sz = 1 << pci->cfg.ops->bus_shift;
@@ -305,34 +180,21 @@ static int gen_pci_parse_map_cfg_windows(struct gen_pci *pci)
 			return -ENOMEM;
 	}
 
-	/* Register bus resource */
-	pci_add_resource(&pci->resources, bus_range);
 	return 0;
 }
 
-static int gen_pci_setup(int nr, struct pci_sys_data *sys)
-{
-	struct gen_pci *pci = sys->private_data;
-	list_splice_init(&pci->resources, &sys->resources);
-	return 1;
-}
+/* Unused, temporary to satisfy ARM arch code */
+static struct pci_sys_data sys;
 
 static int gen_pci_probe(struct platform_device *pdev)
 {
-	int err;
+	int err, lastbus;
 	const char *type;
 	const struct of_device_id *of_id;
 	const int *prop;
 	struct device *dev = &pdev->dev;
 	struct device_node *np = dev->of_node;
 	struct gen_pci *pci = devm_kzalloc(dev, sizeof(*pci), GFP_KERNEL);
-	struct hw_pci hw = {
-		.nr_controllers	= 1,
-		.private_data	= (void **)&pci,
-		.setup		= gen_pci_setup,
-		.map_irq	= of_irq_parse_and_map_pci,
-		.ops		= &gen_pci_ops,
-	};
 
 	if (!pci)
 		return -ENOMEM;
@@ -353,23 +215,28 @@ static int gen_pci_probe(struct platform_device *pdev)
 
 	of_id = of_match_node(gen_pci_of_match, np);
 	pci->cfg.ops = of_id->data;
 
-	/* Parse our PCI ranges and request their resources */
-	err = gen_pci_parse_request_of_pci_ranges(pci);
-	if (err)
-		return err;
+	pci->host = of_create_pci_host_bridge(&pdev->dev, &gen_pci_ops, &sys);
+	if (IS_ERR(pci->host))
+		return PTR_ERR(pci->host);
 
 	/* Parse and map our Configuration Space windows */
 	err = gen_pci_parse_map_cfg_windows(pci);
-	if (err) {
-		gen_pci_release_of_pci_ranges(pci);
+	if (err)
 		return err;
-	}
 
-	pci_common_init_dev(dev, &hw);
+	pci_ioremap_io(0, pci->host->io_base);
+
+	pci_add_flags(PCI_ENABLE_PROC_DOMAINS);
+	pci_add_flags(PCI_REASSIGN_ALL_BUS | PCI_REASSIGN_ALL_RSRC);
+
+	lastbus = pci_scan_child_bus(pci->host->bus);
+	pci_bus_update_busn_res_end(pci->host->bus, lastbus);
+
+	pci_assign_unassigned_bus_resources(pci->host->bus);
+
+	pci_bus_add_devices(pci->host->bus);
+
 	return 0;
 }
Hi Rob,
Nice work!
On Fri, Jun 27, 2014 at 05:15:53PM +0100, Rob Herring wrote:
On 06/27/2014 07:49 AM, Will Deacon wrote:
On Fri, Jun 27, 2014 at 12:03:34PM +0100, Arnd Bergmann wrote:
On Thursday 26 June 2014 19:44:21 Rob Herring wrote:
I don't agree arm32 is harder than microblaze. Yes, converting ALL of arm would be, but that is not necessary. With Liviu's latest branch the hacks I previously needed are gone (thanks!), and this is all I need to get Versatile PCI working (under QEMU):
I meant converting all of arm32 would be harder, but I agree we don't have to do it. It would be nice to convert all of the drivers/pci/host drivers though, iow all multiplatform-enabled ones, and leaving the current arm32 pci implementation for the platforms we don't want to convert to multiplatform anyway (footbridge, iop, ixp4xx, ks8695 (?), pxa, sa1100).
I'm more than happy to convert the generic host controller we merged recently, but I'd probably want the core changes merged first so that I know I'm not wasting my time!
Something like this untested patch...
Another issue I found still present is that pci_ioremap_io() needs some work to unify with the arm implementation. That's a matter of changing the function from taking an offset to taking an I/O resource. That should be a pretty mechanical change.
Also, there is a potential memory leak because there is no undo path for of_create_pci_host_bridge().
Once Liviu reposts his series, I'll take a look at of_create_pci_host_bridge, as I'm not sure that it provides all the behaviour that we get from gen_pci_parse_request_of_pci_ranges right now (e.g. required a non-prefetchable mem resource).
Will
On Fri, Jun 27, 2014 at 01:44:21AM +0100, Rob Herring wrote:
On Thu, Jun 26, 2014 at 3:59 AM, Catalin Marinas catalin.marinas@arm.com wrote:
Although a bit late, I'm raising this now and hopefully we'll come to a conclusion soon. Delaying arm64 PCIe support even further is not a real option, which leaves us with:
- Someone else (with enough PCIe knowledge) volunteering to take over soon or
- Dropping Liviu's work and going for an arm64-specific implementation (most likely based on the arm32 implementation, see below)
- Keeping Liviu's patches, leaving in some of the architecture-specific bits. I know Arnd and I both commented on it still needing more common pieces, but compared to option 2 that would be way better.
Let's look at the patches in question:
3e71867 pci: Introduce pci_register_io_range() helper function.
6681dff pci: OF: Fix the conversion of IO ranges into IO resources.
Both OF patches. I'll happily merge them.
We just need to make sure that the changes the second patch makes to of_pci_range_to_resource() don't break its other users.
2d5dd85 pci: Create pci_host_bridge before its associated bus in pci_create_root_bus.
f6f2854 pci: Introduce a domain number for pci_host_bridge.
524a9f5 pci: Export find_pci_host_bridge() function.
These don't seem to be too controversial.
I think here there were discussions around introducing domain_nr to pci_host_bridge, particularly to the pci_create_root_bus_in_domain() API change. I don't think we reached any conclusion.
fb75718 pci: of: Parse and map the IRQ when adding the PCI device.
6 LOC. Hardly controversial.
I agree.
920a685 pci: Add support for creating a generic host_bridge from device tree
This function could be moved to drivers/of/of_pci.c if having it in drivers/pci is too much maintenance burden.
I think it makes sense. Currently drivers/pci/host-bridge.c doesn't have anything OF related, so of_pci.c looks more appropriate.
However, nearly the same code is already being duplicated in every DT enabled ARM PCI host driver and will continue as more PCI hosts are added. So this isn't really a question of converting other architectures to common PCI host infrastructure, but converting DT based PCI hosts to common infrastructure. ARM is the only arch moving host drivers to drivers/pci ATM. Until other architectures start doing that, converting them is pointless.
I agree. It's probably more important to convert some of the drivers/pci/host implementations to using the common parsing rather than a new architecture (this way we avoid even more code duplication).
bcf5c10 Fix ioport_map() for !CONFIG_GENERIC_IOMAP cases.
Seems like an independent fix that should be applied regardless.
Indeed. But it got stuck at the top of this series and hasn't been pushed upstream.
7cfde80 arm64: Add architecture support for PCI
What is here is really just a function of which option we pick.
With Liviu's latest version (not posted) and with of_create_pci_host_bridge() function moved to of_pci.c, I don't think there is much new functionality added to drivers/pci/. What I think we need is clarifying the domain_nr patch (and API change) and more users of the new generic code. As you said, it doesn't need to be a separate architecture but rather existing pci host drivers under drivers/pci. Of course, other arch conversion should follow shortly as well but even without an immediate conversion, I don't see too much additional maintenance burden for the core PCIe code (and code sharing between new PCIe host drivers is even more beneficial).
On Fri, Jun 27, 2014 at 8:14 AM, Catalin Marinas catalin.marinas@arm.com wrote:
... With Liviu's latest version (not posted) and with of_create_pci_host_bridge() function moved to of_pci.c, I don't think there is much new functionality added to drivers/pci/. What I think we need is clarifying the domain_nr patch (and API change) and more users of the new generic code. As you said, it doesn't need to be a separate architecture but rather existing pci host drivers under drivers/pci. Of course, other arch conversion should follow shortly as well but even without an immediate conversion, I don't see too much additional maintenance burden for the core PCIe code (and code sharing between new PCIe host drivers is even more beneficial).
Sorry, I haven't had time to follow this. It sounds like there are several pieces we could get out of the way easily. How about posting the actual patches again? Maybe re-order them so the easy pieces are first so they can get applied even if there are issues with later ones?
Bjorn
On Fri, Jun 27, 2014 at 03:55:04PM +0100, Bjorn Helgaas wrote:
On Fri, Jun 27, 2014 at 8:14 AM, Catalin Marinas catalin.marinas@arm.com wrote:
... With Liviu's latest version (not posted) and with of_create_pci_host_bridge() function moved to of_pci.c, I don't think there is much new functionality added to drivers/pci/. What I think we need is clarifying the domain_nr patch (and API change) and more users of the new generic code. As you said, it doesn't need to be a separate architecture but rather existing pci host drivers under drivers/pci. Of course, other arch conversion should follow shortly as well but even without an immediate conversion, I don't see too much additional maintenance burden for the core PCIe code (and code sharing between new PCIe host drivers is even more beneficial).
Sorry, I haven't had time to follow this. It sounds like there are several pieces we could get out of the way easily. How about posting the actual patches again? Maybe re-order them so the easy pieces are first so they can get applied even if there are issues with later ones?
OK, I will post a new series on Monday.
Thanks, Liviu
Bjorn
On Fri, Mar 14, 2014 at 9:34 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
Some architectures do not share x86's simple view of the PCI I/O space and instead use a range of CPU addresses that map to bus addresses. For some architectures these ranges will be expressed by OF bindings in a device tree file.
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined, that signals lack of support for PCI and we return an error.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Acked-by: Grant Likely grant.likely@linaro.org Tested-by: Tanmay Inamdar tinamdar@apm.com
 drivers/of/address.c       | 9 +++++++++
 include/linux/of_address.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 1a54f1f..be958ed 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -619,6 +619,15 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
 }
 EXPORT_SYMBOL(of_get_address);
 
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
+{
+#ifndef PCI_IOBASE
+	return -EINVAL;
+#else
+	return 0;
+#endif
+}
+
This isn't PCI code, so I'm fine with it in that sense, but I'm not sure the idea of a PCI_IOBASE #define is really what we need. It's not really determined by the processor architecture, it's determined by the platform. And a single address isn't enough in general, either, because if there are multiple host bridges, there's no reason the apertures that generate PCI I/O transactions need to be contiguous on the CPU side.
That's just a long way of saying that if we ever came up with a more generic way to handle I/O port spaces, PCI_IOBASE might go away. And I guess part of that rework could be changing this use of it along with the others.
 unsigned long __weak pci_address_to_pio(phys_addr_t address)
 {
 	if (address > IO_SPACE_LIMIT)
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 5f6ed6b..40c418d 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -56,6 +56,7 @@ extern void __iomem *of_iomap(struct device_node *device, int index);
 extern const __be32 *of_get_address(struct device_node *dev, int index,
 			u64 *size, unsigned int *flags);
 
+extern int pci_register_io_range(phys_addr_t addr, resource_size_t size);
 extern unsigned long pci_address_to_pio(phys_addr_t addr);
 
 extern int of_pci_range_parser_init(struct of_pci_range_parser *parser,
--
1.9.0
On Monday 07 April 2014 17:21:51 Bjorn Helgaas wrote:
On Fri, Mar 14, 2014 at 9:34 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
Some architectures do not share x86's simple view of the PCI I/O space and instead use a range of CPU addresses that map to bus addresses. For some architectures these ranges will be expressed by OF bindings in a device tree file.
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined, that signals lack of support for PCI and we return an error.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Acked-by: Grant Likely grant.likely@linaro.org Tested-by: Tanmay Inamdar tinamdar@apm.com
 drivers/of/address.c       | 9 +++++++++
 include/linux/of_address.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 1a54f1f..be958ed 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -619,6 +619,15 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
 }
 EXPORT_SYMBOL(of_get_address);
 
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
+{
+#ifndef PCI_IOBASE
+	return -EINVAL;
+#else
+	return 0;
+#endif
+}
This isn't PCI code, so I'm fine with it in that sense, but I'm not sure the idea of a PCI_IOBASE #define is really what we need. It's not really determined by the processor architecture, it's determined by the platform. And a single address isn't enough in general, either, because if there are multiple host bridges, there's no reason the apertures that generate PCI I/O transactions need to be contiguous on the CPU side.
That's just a long way of saying that if we ever came up with a more generic way to handle I/O port spaces, PCI_IOBASE might go away. And I guess part of that rework could be changing this use of it along with the others.
I'd rather not add a generic implementation of this at all, but keep it all within the host resource scanning code.
If we do add a generic implementation, my preference would be to use the version introduced for arm64, with a fallback of returning -EINVAL if the architecture doesn't implement it.
There is no way ever that returning '0' makes sense here: Either the architecture supports memory mapped I/O spaces and then we should be able to find an appropriate io_offset for it, or it doesn't support memory mapped I/O spaces and we should never even call this function.
Arnd
On Tue, Apr 08, 2014 at 12:21:51AM +0100, Bjorn Helgaas wrote:
On Fri, Mar 14, 2014 at 9:34 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
Some architectures do not share x86's simple view of the PCI I/O space and instead use a range of CPU addresses that map to bus addresses. For some architectures these ranges will be expressed by OF bindings in a device tree file.
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined, that signals lack of support for PCI and we return an error.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Acked-by: Grant Likely grant.likely@linaro.org Tested-by: Tanmay Inamdar tinamdar@apm.com
 drivers/of/address.c       | 9 +++++++++
 include/linux/of_address.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 1a54f1f..be958ed 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -619,6 +619,15 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
 }
 EXPORT_SYMBOL(of_get_address);
 
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
+{
+#ifndef PCI_IOBASE
+	return -EINVAL;
+#else
+	return 0;
+#endif
+}
+
This isn't PCI code, so I'm fine with it in that sense, but I'm not sure the idea of a PCI_IOBASE #define is really what we need. It's not really determined by the processor architecture, it's determined by the platform. And a single address isn't enough in general, either, because if there are multiple host bridges, there's no reason the apertures that generate PCI I/O transactions need to be contiguous on the CPU side.
It should not be only the platform's choice if the architecture doesn't support it. To my mind, PCI_IOBASE means "I support MMIO operations and this is the start of the virtual address range where my I/O ranges are mapped." It's the same as ppc's _IO_BASE. And pci_address_to_pio() will take care to give you the correct io_offset in the presence of multiple host bridges, while keeping the io resource in the range [0 .. host_bridge_io_range_size - 1].
That's just a long way of saying that if we ever came up with a more generic way to handle I/O port spaces, PCI_IOBASE might go away. And I guess part of that rework could be changing this use of it along with the others.
And I have a patch series that #defines PCI_IOBASE only in those architectures that support MMIO, where this macro makes sense. Also notice that the arm64 series has a patch that I'm going to roll into this one where ioport_map() gets fixed to include PCI_IOBASE when !CONFIG_GENERIC_IOMAP.
Best regards, Liviu
 unsigned long __weak pci_address_to_pio(phys_addr_t address)
 {
 	if (address > IO_SPACE_LIMIT)
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 5f6ed6b..40c418d 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -56,6 +56,7 @@ extern void __iomem *of_iomap(struct device_node *device, int index);
 extern const __be32 *of_get_address(struct device_node *dev, int index,
 			u64 *size, unsigned int *flags);
 
+extern int pci_register_io_range(phys_addr_t addr, resource_size_t size);
 extern unsigned long pci_address_to_pio(phys_addr_t addr);
 
 extern int of_pci_range_parser_init(struct of_pci_range_parser *parser,
--
1.9.0
On Tuesday 08 April 2014 10:49:33 Liviu Dudau wrote:
On Tue, Apr 08, 2014 at 12:21:51AM +0100, Bjorn Helgaas wrote:
On Fri, Mar 14, 2014 at 9:34 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
Some architectures do not share x86's simple view of the PCI I/O space and instead use a range of CPU addresses that map to bus addresses. For some architectures these ranges will be expressed by OF bindings in a device tree file.
Introduce a pci_register_io_range() helper function that can be used by the architecture code to keep track of the I/O ranges described by the PCI bindings. If the PCI_IOBASE macro is not defined, that signals lack of support for PCI and we return an error.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Acked-by: Grant Likely grant.likely@linaro.org Tested-by: Tanmay Inamdar tinamdar@apm.com
 drivers/of/address.c       | 9 +++++++++
 include/linux/of_address.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 1a54f1f..be958ed 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -619,6 +619,15 @@ const __be32 *of_get_address(struct device_node *dev, int index, u64 *size,
 }
 EXPORT_SYMBOL(of_get_address);
 
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
+{
+#ifndef PCI_IOBASE
+	return -EINVAL;
+#else
+	return 0;
+#endif
+}
This isn't PCI code, so I'm fine with it in that sense, but I'm not sure the idea of a PCI_IOBASE #define is really what we need. It's not really determined by the processor architecture, it's determined by the platform. And a single address isn't enough in general, either, because if there are multiple host bridges, there's no reason the apertures that generate PCI I/O transactions need to be contiguous on the CPU side.
It should not be only the platform's choice if the architecture doesn't support it. To my mind, PCI_IOBASE means "I support MMIO operations and this is the start of the virtual address range where my I/O ranges are mapped." It's the same as ppc's _IO_BASE. And pci_address_to_pio() will take care to give you the correct io_offset in the presence of multiple host bridges, while keeping the io resource in the range [0 .. host_bridge_io_range_size - 1].
There is a wide range of implementations across architectures:
a) no access to I/O ports at all (tile, s390, ...)
b) access to I/O ports only through special instructions (x86, ...)
c) all MMIO is mapped virtually contiguous to PCI_IOBASE or _IO_BASE (most ppc64, arm32 with MMU, arm64, ...)
d) only one PCI host can have an I/O space (mips, microblaze, ...)
e) each host controller can have its own method (ppc64 with indirect pio)
f) PIO token equals virtual address plus offset (some legacy ARM platforms, probably some other architectures), or physical address (sparc)
g) PIO token encodes address space number plus offset (ia64)
a) and b) are trivially handled by any implementation that falls back to 'return -EINVAL'. I believe that c) is the most appropriate solution: we should be able to adopt it for most of the architectures that have an MMU and make it the default implementation. d) seems like a good fallback for noMMU architectures: while we do need to support I/O spaces, multiple PCI domains and noMMU, the combination of all three should be extremely rare, and I'd leave it up to the architecture to support that if there is a real use case, rather than trying to put it into common code. Anything that currently requires e), f) or g) should, I think, keep doing that and not try to use the generic implementation.
Arnd
On Tue, Apr 8, 2014 at 4:11 AM, Arnd Bergmann arnd@arndb.de wrote:
There is a wide range of implementations across architectures:
a) no access to I/O ports at all (tile, s390, ...)
b) access to I/O ports only through special instructions (x86, ...)
c) all MMIO is mapped virtually contiguous to PCI_IOBASE or _IO_BASE (most ppc64, arm32 with MMU, arm64, ...)
d) only one PCI host can have an I/O space (mips, microblaze, ...)
e) each host controller can have its own method (ppc64 with indirect pio)
f) PIO token equals virtual address plus offset (some legacy ARM platforms, probably some other architectures), or physical address (sparc)
g) PIO token encodes address space number plus offset (ia64)
a) and b) are trivially handled by any implementation that falls back to 'return -EINVAL'. I believe that c) is the most appropriate solution: we should be able to adopt it for most of the architectures that have an MMU and make it the default implementation. d) seems like a good fallback for noMMU architectures: while we do need to support I/O spaces, multiple PCI domains and noMMU, the combination of all three should be extremely rare, and I'd leave it up to the architecture to support that if there is a real use case, rather than trying to put it into common code. Anything that currently requires e), f) or g) should, I think, keep doing that and not try to use the generic implementation.
Thanks for the nice summary. That's way more than I had figured out myself.
I don't know whether it'd be worth it, especially for something that's so close to obsolete, but it seems like it should be *possible* to generalize and unify this somewhat. I would argue that g) (which I wrote, so I know it better than the others) could fairly easily subsume c), d), and f), since it maps an ioport number to a virtual address for an MMIO access, but it doesn't assume that all the MMIO spaces are contiguous.
b), e), and maybe a) could be handled with an exception, e.g., inside inb(), look up the struct io_space (e.g., similar to what ia64 does in __ia64_mk_io_addr()), and if that struct contains a non-zero ops pointer, use that instead of doing the MMIO access. The ops pointer functions could use the x86 INB instruction or do the indirect PIO thing or whatever.
Bjorn
The ranges property for a host bridge controller in DT describes the mapping between the PCI bus addresses and the CPU physical addresses. The resource framework, however, expects I/O resources to start at a pseudo "port" address 0 (zero) and to have a maximum size of IO_SPACE_LIMIT. The conversion from PCI ranges to resources failed to take that into account.
In the process, move the function into drivers/of/address.c, as it now depends on the pci_address_to_pio() code, and make it return an error code.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com
Tested-by: Tanmay Inamdar tinamdar@apm.com
---
 drivers/of/address.c       | 40 ++++++++++++++++++++++++++++++++++++
 include/linux/of_address.h | 13 ++----------
 2 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index be958ed..4eabd30 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -728,3 +728,43 @@ void __iomem *of_iomap(struct device_node *np, int index)
 	return ioremap(res.start, resource_size(&res));
 }
 EXPORT_SYMBOL(of_iomap);
+
+/**
+ * of_pci_range_to_resource - Create a resource from an of_pci_range
+ * @range:	the PCI range that describes the resource
+ * @np:		device node where the range belongs to
+ * @res:	pointer to a valid resource that will be updated to
+ *		reflect the values contained in the range.
+ *
+ * Returns -EINVAL if the range cannot be converted to a resource.
+ *
+ * Note that if the range is an IO range, the resource will be converted
+ * using pci_address_to_pio() which can fail if it is called too early or
+ * if the range cannot be matched to any host bridge IO space (our case
+ * here). To guard against that we try to register the IO range first.
+ * If that fails we know that pci_address_to_pio() will do too.
+ */
+int of_pci_range_to_resource(struct of_pci_range *range,
+		struct device_node *np, struct resource *res)
+{
+	res->flags = range->flags;
+	if (res->flags & IORESOURCE_IO) {
+		unsigned long port = -1;
+		int err = pci_register_io_range(range->cpu_addr, range->size);
+		if (err)
+			return err;
+		port = pci_address_to_pio(range->cpu_addr);
+		if (port == (unsigned long)-1) {
+			res->start = (resource_size_t)OF_BAD_ADDR;
+			res->end = (resource_size_t)OF_BAD_ADDR;
+			return -EINVAL;
+		}
+		res->start = port;
+	} else {
+		res->start = range->cpu_addr;
+	}
+	res->end = res->start + range->size - 1;
+	res->parent = res->child = res->sibling = NULL;
+	res->name = np->full_name;
+	return 0;
+}
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 40c418d..a4b400d 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -23,17 +23,8 @@ struct of_pci_range {
 #define for_each_of_pci_range(parser, range)		\
 	for (; of_pci_range_parser_one(parser, range);)
 
-static inline void of_pci_range_to_resource(struct of_pci_range *range,
-					    struct device_node *np,
-					    struct resource *res)
-{
-	res->flags = range->flags;
-	res->start = range->cpu_addr;
-	res->end = range->cpu_addr + range->size - 1;
-	res->parent = res->child = res->sibling = NULL;
-	res->name = np->full_name;
-}
-
+extern int of_pci_range_to_resource(struct of_pci_range *range,
+		struct device_node *np, struct resource *res);
 /* Translate a DMA address from device space to CPU space */
 extern u64 of_translate_dma_address(struct device_node *dev,
 				    const __be32 *in_addr);
On Friday 14 March 2014, Liviu Dudau wrote:
+int of_pci_range_to_resource(struct of_pci_range *range,
struct device_node *np, struct resource *res)
+{
res->flags = range->flags;
if (res->flags & IORESOURCE_IO) {
unsigned long port = -1;
int err = pci_register_io_range(range->cpu_addr, range->size);
if (err)
return err;
port = pci_address_to_pio(range->cpu_addr);
if (port == (unsigned long)-1) {
res->start = (resource_size_t)OF_BAD_ADDR;
res->end = (resource_size_t)OF_BAD_ADDR;
return -EINVAL;
}
The error handling is inconsistent here: in one case you set the resource to OF_BAD_ADDR, in the other one you don't.
Arnd
On Fri, Mar 14, 2014 at 05:05:28PM +0000, Arnd Bergmann wrote:
On Friday 14 March 2014, Liviu Dudau wrote:
+int of_pci_range_to_resource(struct of_pci_range *range,
struct device_node *np, struct resource *res)
+{
res->flags = range->flags;
if (res->flags & IORESOURCE_IO) {
unsigned long port = -1;
int err = pci_register_io_range(range->cpu_addr, range->size);
if (err)
return err;
port = pci_address_to_pio(range->cpu_addr);
if (port == (unsigned long)-1) {
res->start = (resource_size_t)OF_BAD_ADDR;
res->end = (resource_size_t)OF_BAD_ADDR;
return -EINVAL;
}
The error handling is inconsistent here: in one case you set the resource to OF_BAD_ADDR, in the other one you don't.
Arnd
You are right, that was lazy of me. What about this version?
8<----------------------------------------------------
From acfd63b5c48b4a9066ab0b373633c5bb4feaadf5 Mon Sep 17 00:00:00 2001
From: Liviu Dudau Liviu.Dudau@arm.com
Date: Fri, 28 Feb 2014 12:40:46 +0000
Subject: [PATCH v7 2/6] pci: OF: Fix the conversion of IO ranges into IO resources.
The ranges property for a host bridge controller in DT describes the mapping between the PCI bus addresses and the CPU physical addresses. The resource framework, however, expects I/O resources to start at a pseudo "port" address 0 (zero) and to have a maximum size of IO_SPACE_LIMIT. The conversion from PCI ranges to resources failed to take that into account.
In the process, move the function into drivers/of/address.c, as it now depends on the pci_address_to_pio() code, and make it return an error code.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com
Tested-by: Tanmay Inamdar tinamdar@apm.com
---
 drivers/of/address.c       | 45 ++++++++++++++++++++++++++++++++++++
 include/linux/of_address.h | 13 ++---------
 2 files changed, 47 insertions(+), 11 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index be958ed..673c050 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -728,3 +728,48 @@ void __iomem *of_iomap(struct device_node *np, int index)
 	return ioremap(res.start, resource_size(&res));
 }
 EXPORT_SYMBOL(of_iomap);
+
+/**
+ * of_pci_range_to_resource - Create a resource from an of_pci_range
+ * @range:	the PCI range that describes the resource
+ * @np:		device node where the range belongs to
+ * @res:	pointer to a valid resource that will be updated to
+ *		reflect the values contained in the range.
+ *
+ * Returns -EINVAL if the range cannot be converted to a resource.
+ *
+ * Note that if the range is an IO range, the resource will be converted
+ * using pci_address_to_pio() which can fail if it is called too early or
+ * if the range cannot be matched to any host bridge IO space (our case
+ * here). To guard against that we try to register the IO range first.
+ * If that fails we know that pci_address_to_pio() will do too.
+ */
+int of_pci_range_to_resource(struct of_pci_range *range,
+		struct device_node *np, struct resource *res)
+{
+	int err;
+
+	res->flags = range->flags;
+	res->parent = res->child = res->sibling = NULL;
+	res->name = np->full_name;
+
+	if (res->flags & IORESOURCE_IO) {
+		unsigned long port;
+		err = pci_register_io_range(range->cpu_addr, range->size);
+		if (err)
+			goto invalid_range;
+		port = pci_address_to_pio(range->cpu_addr);
+		if (port == (unsigned long)-1) {
+			err = -EINVAL;
+			goto invalid_range;
+		}
+		res->start = port;
+	} else {
+		res->start = range->cpu_addr;
+	}
+	res->end = res->start + range->size - 1;
+	return 0;
+
+invalid_range:
+	res->start = (resource_size_t)OF_BAD_ADDR;
+	res->end = (resource_size_t)OF_BAD_ADDR;
+	return err;
+}
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 40c418d..a4b400d 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -23,17 +23,8 @@ struct of_pci_range {
 #define for_each_of_pci_range(parser, range)		\
 	for (; of_pci_range_parser_one(parser, range);)
 
-static inline void of_pci_range_to_resource(struct of_pci_range *range,
-					    struct device_node *np,
-					    struct resource *res)
-{
-	res->flags = range->flags;
-	res->start = range->cpu_addr;
-	res->end = range->cpu_addr + range->size - 1;
-	res->parent = res->child = res->sibling = NULL;
-	res->name = np->full_name;
-}
-
+extern int of_pci_range_to_resource(struct of_pci_range *range,
+		struct device_node *np, struct resource *res);
 /* Translate a DMA address from device space to CPU space */
 extern u64 of_translate_dma_address(struct device_node *dev,
 				    const __be32 *in_addr);
On Friday 14 March 2014, Liviu Dudau wrote:
You are right, that was lazy of me. What about this version?
Yes, that seems better. Thanks for fixing it up.
But back to the more important question that I realized we have not resolved yet:
You now have two completely independent allocation functions for logical I/O space numbers, and they will return different numbers for any nontrivial scenario.
+int of_pci_range_to_resource(struct of_pci_range *range,
struct device_node *np, struct resource *res)
+{
res->flags = range->flags;
res->parent = res->child = res->sibling = NULL;
res->name = np->full_name;
if (res->flags & IORESOURCE_IO) {
unsigned long port = -1;
int err = pci_register_io_range(range->cpu_addr, range->size);
if (err)
goto invalid_range;
port = pci_address_to_pio(range->cpu_addr);
if (port == (unsigned long)-1) {
err = -EINVAL;
goto invalid_range;
}
res->start = port;
} else {
This one concatenates the I/O spaces and assumes that each space starts at bus address zero, and takes little precaution to avoid filling up IO_SPACE_LIMIT if the sizes are too big.
+unsigned long pci_ioremap_io(const struct resource *res, phys_addr_t phys_addr)
+{
unsigned long start, len, virt_start;
int err;
if (res->end > IO_SPACE_LIMIT)
return -EINVAL;
/*
* try finding free space for the whole size first,
* fall back to 64K if not available
*/
len = resource_size(res);
start = bitmap_find_next_zero_area(pci_iospace, IO_SPACE_PAGES,
res->start / PAGE_SIZE, len / PAGE_SIZE, 0);
if (start == IO_SPACE_PAGES && len > SZ_64K) {
len = SZ_64K;
start = 0;
start = bitmap_find_next_zero_area(pci_iospace, IO_SPACE_PAGES,
start, len / PAGE_SIZE, 0);
}
/* no 64K area found */
if (start == IO_SPACE_PAGES)
return -ENOMEM;
/* ioremap physical aperture to virtual aperture */
virt_start = start * PAGE_SIZE + (unsigned long)PCI_IOBASE;
err = ioremap_page_range(virt_start, virt_start + len,
phys_addr, __pgprot(PROT_DEVICE_nGnRE));
if (err)
return err;
bitmap_set(pci_iospace, start, len / PAGE_SIZE);
/* return io_offset */
return start * PAGE_SIZE - res->start;
+}
While this one will try to fall back to smaller sizes, and will honor nonzero bus addresses.
I think we shouldn't even try to do the same thing twice, but instead just use a single allocator. I'd prefer the one I came up with, but I realize that I am biased here ;-)
Arnd
On Fri, Mar 14, 2014 at 06:46:23PM +0000, Arnd Bergmann wrote:
On Friday 14 March 2014, Liviu Dudau wrote:
You are right, that was lazy of me. What about this version?
Yes, that seems better. Thanks for fixing it up.
But back to the more important question that I realized we have not resolved yet:
You now have two completely independent allocation functions for logical I/O space numbers, and they will return different numbers for any nontrivial scenario.
+int of_pci_range_to_resource(struct of_pci_range *range,
struct device_node *np, struct resource *res)
+{
res->flags = range->flags;
res->parent = res->child = res->sibling = NULL;
res->name = np->full_name;
if (res->flags & IORESOURCE_IO) {
unsigned long port = -1;
int err = pci_register_io_range(range->cpu_addr, range->size);
if (err)
goto invalid_range;
port = pci_address_to_pio(range->cpu_addr);
if (port == (unsigned long)-1) {
err = -EINVAL;
goto invalid_range;
}
res->start = port;
} else {
This one concatenates the I/O spaces and assumes that each space starts at bus address zero, and takes little precaution to avoid filling up IO_SPACE_LIMIT if the sizes are too big.
Actually, you are attaching too much meaning to this one. pci_register_io_range() only tries to remember the ranges, nothing else. And it *does* check that the total sum of registered ranges does not exceed IO_SPACE_LIMIT. This is used before we have any resource mapped for that range (actually it is used to *create* the resource for the range) so there is no other helping hand.
It doesn't assume that the space starts at bus address zero; it ignores the bus address entirely. It only handles CPU addresses for the range, to help with generating logical IO ports. If you have overlapping CPU addresses with different bus addresses it will not work, but then I guess you will have other problems anyway.
+unsigned long pci_ioremap_io(const struct resource *res, phys_addr_t phys_addr)
+{
unsigned long start, len, virt_start;
int err;
if (res->end > IO_SPACE_LIMIT)
return -EINVAL;
/*
* try finding free space for the whole size first,
* fall back to 64K if not available
*/
len = resource_size(res);
start = bitmap_find_next_zero_area(pci_iospace, IO_SPACE_PAGES,
res->start / PAGE_SIZE, len / PAGE_SIZE, 0);
if (start == IO_SPACE_PAGES && len > SZ_64K) {
len = SZ_64K;
start = 0;
start = bitmap_find_next_zero_area(pci_iospace, IO_SPACE_PAGES,
start, len / PAGE_SIZE, 0);
}
/* no 64K area found */
if (start == IO_SPACE_PAGES)
return -ENOMEM;
/* ioremap physical aperture to virtual aperture */
virt_start = start * PAGE_SIZE + (unsigned long)PCI_IOBASE;
err = ioremap_page_range(virt_start, virt_start + len,
phys_addr, __pgprot(PROT_DEVICE_nGnRE));
if (err)
return err;
bitmap_set(pci_iospace, start, len / PAGE_SIZE);
/* return io_offset */
return start * PAGE_SIZE - res->start;
+}
While this one will try to fall back to smaller sizes, and will honor nonzero bus addresses.
Yes, because this one does the actual allocation. And it needs to know how the mapping from CPU to bus works.
I think we shouldn't even try to do the same thing twice, but instead just use a single allocator. I'd prefer the one I came up with, but I realize that I am biased here ;-)
I agree, but I think the two functions serve two different and distinct roles.
Best regards, Liviu
Arnd
On Friday 14 March 2014, Liviu Dudau wrote:
On Fri, Mar 14, 2014 at 06:46:23PM +0000, Arnd Bergmann wrote:
On Friday 14 March 2014, Liviu Dudau wrote:
int of_pci_range_to_resource(struct of_pci_range *range,
                             struct device_node *np, struct resource *res)
{
        res->flags = range->flags;
        res->parent = res->child = res->sibling = NULL;
        res->name = np->full_name;

        if (res->flags & IORESOURCE_IO) {
                unsigned long port = -1;
                int err = pci_register_io_range(range->cpu_addr, range->size);
                if (err)
                        goto invalid_range;
                port = pci_address_to_pio(range->cpu_addr);
                if (port == (unsigned long)-1) {
                        err = -EINVAL;
                        goto invalid_range;
                }
                res->start = port;
        } else {
This one concatenates the I/O spaces and assumes that each space starts at bus address zero, and takes little precaution to avoid filling up IO_SPACE_LIMIT if the sizes are too big.
Actually, you are attaching too much meaning to this one. pci_register_io_range() only tries to remember the ranges, nothing else. And it *does* check that the total sum of registered ranges does not exceed IO_SPACE_LIMIT. This is used before we have any resource mapped for that range (actually it is used to *create* the resource for the range) so there is no other helping hand.
It doesn't assume that space starts at bus address zero; it ignores the bus address. It only handles CPU addresses for the range, to help with generating logical I/O ports. If you have overlapping CPU addresses with different bus addresses it will not work, but then I guess you will have different problems anyway.
The problem is that it tries to set up a mapping so that pci_address_to_pio returns the actual port number, but the port that you assign to res->start above is assumed to match the 'start' variable below. If the value ends up different, the BARs that get set by the PCI bus scan are not in the same place that got ioremapped into the virtual I/O aperture.
unsigned long pci_ioremap_io(const struct resource *res, phys_addr_t phys_addr)
{
        unsigned long start, len, virt_start;
        int err;

        if (res->end > IO_SPACE_LIMIT)
                return -EINVAL;

        /*
         * try finding free space for the whole size first,
         * fall back to 64K if not available
         */
        len = resource_size(res);
        start = bitmap_find_next_zero_area(pci_iospace, IO_SPACE_PAGES,
                        res->start / PAGE_SIZE, len / PAGE_SIZE, 0);
        if (start == IO_SPACE_PAGES && len > SZ_64K) {
                len = SZ_64K;
                start = 0;
                start = bitmap_find_next_zero_area(pci_iospace, IO_SPACE_PAGES,
                                start, len / PAGE_SIZE, 0);
        }

        /* no 64K area found */
        if (start == IO_SPACE_PAGES)
                return -ENOMEM;

        /* ioremap physical aperture to virtual aperture */
        virt_start = start * PAGE_SIZE + (unsigned long)PCI_IOBASE;
        err = ioremap_page_range(virt_start, virt_start + len,
                        phys_addr, __pgprot(PROT_DEVICE_nGnRE));
        if (err)
                return err;

        bitmap_set(pci_iospace, start, len / PAGE_SIZE);

        /* return io_offset */
        return start * PAGE_SIZE - res->start;
}
Arnd
On Fri, Mar 14, 2014 at 07:16:10PM +0000, Arnd Bergmann wrote:
On Friday 14 March 2014, Liviu Dudau wrote:
On Fri, Mar 14, 2014 at 06:46:23PM +0000, Arnd Bergmann wrote:
On Friday 14 March 2014, Liviu Dudau wrote:
int of_pci_range_to_resource(struct of_pci_range *range,
                             struct device_node *np, struct resource *res)
{
        res->flags = range->flags;
        res->parent = res->child = res->sibling = NULL;
        res->name = np->full_name;

        if (res->flags & IORESOURCE_IO) {
                unsigned long port = -1;
                int err = pci_register_io_range(range->cpu_addr, range->size);
                if (err)
                        goto invalid_range;
                port = pci_address_to_pio(range->cpu_addr);
                if (port == (unsigned long)-1) {
                        err = -EINVAL;
                        goto invalid_range;
                }
                res->start = port;
        } else {
This one concatenates the I/O spaces and assumes that each space starts at bus address zero, and takes little precaution to avoid filling up IO_SPACE_LIMIT if the sizes are too big.
Actually, you are attaching too much meaning to this one. pci_register_io_range() only tries to remember the ranges, nothing else. And it *does* check that the total sum of registered ranges does not exceed IO_SPACE_LIMIT. This is used before we have any resource mapped for that range (actually it is used to *create* the resource for the range) so there is no other helping hand.
It doesn't assume that space starts at bus address zero; it ignores the bus address. It only handles CPU addresses for the range, to help with generating logical I/O ports. If you have overlapping CPU addresses with different bus addresses it will not work, but then I guess you will have different problems anyway.
The problem is that it tries to set up a mapping so that pci_address_to_pio returns the actual port number, but the port that you assign to res->start above is assumed to match the 'start' variable below. If the value ends up different, the BARs that get set by the PCI bus scan are not in the same place that got ioremapped into the virtual I/O aperture.
Yes, after writing a reply trying to justify why it would actually work, I've realised where the fault in my logic lies (short version: two host controllers will get different io_offsets, but the port numbers will both start from zero, leading to confusion about which host controller a resource belongs to).
I will try to split your function into two parts, one that calculates the io_offset and another that does the ioremap_page_range() side and replace my cooked function.
Best regards, Liviu
Before commit 7b5436635800 the pci_host_bridge was created before the root bus. As that commit added a needless dependency on the bus for pci_alloc_host_bridge(), the creation order was changed for no good reason. Revert the order of creation, as we are going to depend on the pci_host_bridge structure to retrieve the domain number of the root bus.
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Tanmay Inamdar <tinamdar@apm.com>
---
 drivers/pci/probe.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 6e34498..fd11c12 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -505,7 +505,7 @@ static void pci_release_host_bridge_dev(struct device *dev)
 	kfree(bridge);
 }
 
-static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
+static struct pci_host_bridge *pci_alloc_host_bridge(void)
 {
 	struct pci_host_bridge *bridge;
 
@@ -514,7 +514,6 @@ static struct pci_host_bridge *pci_alloc_host_bridge(struct pci_bus *b)
 		return NULL;
 
 	INIT_LIST_HEAD(&bridge->windows);
-	bridge->bus = b;
 
 	return bridge;
 }
@@ -1727,9 +1726,19 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 	char bus_addr[64];
 	char *fmt;
 
+	bridge = pci_alloc_host_bridge();
+	if (!bridge)
+		return NULL;
+
+	bridge->dev.parent = parent;
+	bridge->dev.release = pci_release_host_bridge_dev;
+	error = pcibios_root_bridge_prepare(bridge);
+	if (error)
+		goto err_out;
+
 	b = pci_alloc_bus();
 	if (!b)
-		return NULL;
+		goto err_out;
 
 	b->sysdata = sysdata;
 	b->ops = ops;
@@ -1738,26 +1747,15 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 	if (b2) {
 		/* If we already got to this bus through a different bridge, ignore it */
 		dev_dbg(&b2->dev, "bus already known\n");
-		goto err_out;
+		goto err_bus_out;
 	}
 
-	bridge = pci_alloc_host_bridge(b);
-	if (!bridge)
-		goto err_out;
-
-	bridge->dev.parent = parent;
-	bridge->dev.release = pci_release_host_bridge_dev;
+	bridge->bus = b;
 	dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b), bus);
-	error = pcibios_root_bridge_prepare(bridge);
-	if (error) {
-		kfree(bridge);
-		goto err_out;
-	}
-
 	error = device_register(&bridge->dev);
 	if (error) {
 		put_device(&bridge->dev);
-		goto err_out;
+		goto err_bus_out;
 	}
 	b->bridge = get_device(&bridge->dev);
 	device_enable_async_suspend(b->bridge);
@@ -1814,8 +1812,10 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 class_dev_reg_err:
 	put_device(&bridge->dev);
 	device_unregister(&bridge->dev);
-err_out:
+err_bus_out:
 	kfree(b);
+err_out:
+	kfree(bridge);
 	return NULL;
 }
Make it easier to discover the domain number of a bus by storing the number in pci_host_bridge for the root bus. Several architectures have their own way of storing this information, so it makes sense to try to unify the code. While at this, add a new function that creates a root bus in a given domain and make pci_create_root_bus() a wrapper around this function.
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Tested-by: Tanmay Inamdar <tinamdar@apm.com>
---
 drivers/pci/probe.c | 41 +++++++++++++++++++++++++++++++++--------
 include/linux/pci.h |  4 ++++
 2 files changed, 37 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index fd11c12..172c615 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1714,8 +1714,9 @@ void __weak pcibios_remove_bus(struct pci_bus *bus)
 {
 }
 
-struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
-		struct pci_ops *ops, void *sysdata, struct list_head *resources)
+struct pci_bus *pci_create_root_bus_in_domain(struct device *parent,
+		int domain, int bus, struct pci_ops *ops, void *sysdata,
+		struct list_head *resources)
 {
 	int error;
 	struct pci_host_bridge *bridge;
@@ -1728,30 +1729,34 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 
 	bridge = pci_alloc_host_bridge();
 	if (!bridge)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	bridge->dev.parent = parent;
 	bridge->dev.release = pci_release_host_bridge_dev;
+	bridge->domain_nr = domain;
 	error = pcibios_root_bridge_prepare(bridge);
 	if (error)
 		goto err_out;
 
 	b = pci_alloc_bus();
-	if (!b)
+	if (!b) {
+		error = -ENOMEM;
 		goto err_out;
+	}
 
 	b->sysdata = sysdata;
 	b->ops = ops;
 	b->number = b->busn_res.start = bus;
-	b2 = pci_find_bus(pci_domain_nr(b), bus);
+	b2 = pci_find_bus(bridge->domain_nr, bus);
 	if (b2) {
 		/* If we already got to this bus through a different bridge, ignore it */
 		dev_dbg(&b2->dev, "bus already known\n");
+		error = -EEXIST;
 		goto err_bus_out;
 	}
 
 	bridge->bus = b;
-	dev_set_name(&bridge->dev, "pci%04x:%02x", pci_domain_nr(b), bus);
+	dev_set_name(&bridge->dev, "pci%04x:%02x", bridge->domain_nr, bus);
 	error = device_register(&bridge->dev);
 	if (error) {
 		put_device(&bridge->dev);
@@ -1766,7 +1771,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 
 	b->dev.class = &pcibus_class;
 	b->dev.parent = b->bridge;
-	dev_set_name(&b->dev, "%04x:%02x", pci_domain_nr(b), bus);
+	dev_set_name(&b->dev, "%04x:%02x", bridge->domain_nr, bus);
 	error = device_register(&b->dev);
 	if (error)
 		goto class_dev_reg_err;
@@ -1816,7 +1821,27 @@ err_bus_out:
 	kfree(b);
 err_out:
 	kfree(bridge);
-	return NULL;
+	return ERR_PTR(error);
+}
+
+struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
+		struct pci_ops *ops, void *sysdata, struct list_head *resources)
+{
+	int domain_nr;
+	struct pci_bus *b = pci_alloc_bus();
+	if (!b)
+		return NULL;
+
+	b->sysdata = sysdata;
+	domain_nr = pci_domain_nr(b);
+	kfree(b);
+
+	b = pci_create_root_bus_in_domain(parent, domain_nr, bus,
+				ops, sysdata, resources);
+	if (IS_ERR(b))
+		return NULL;
+
+	return b;
+}
 
 int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 33aa2ca..1eed009 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -394,6 +394,7 @@ struct pci_host_bridge_window {
 struct pci_host_bridge {
 	struct device dev;
 	struct pci_bus *bus;		/* root bus */
+	int domain_nr;
 	struct list_head windows;	/* pci_host_bridge_windows */
 	void (*release_fn)(struct pci_host_bridge *);
 	void *release_data;
@@ -747,6 +748,9 @@ struct pci_bus *pci_scan_bus(int bus, struct pci_ops *ops, void *sysdata);
 struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 				    struct pci_ops *ops, void *sysdata,
 				    struct list_head *resources);
+struct pci_bus *pci_create_root_bus_in_domain(struct device *parent,
+				    int domain, int bus, struct pci_ops *ops,
+				    void *sysdata, struct list_head *resources);
 int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
 int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
 void pci_bus_release_busn_res(struct pci_bus *b);
On Fri, Mar 14, 2014 at 03:34:30PM +0000, Liviu Dudau wrote:
Make it easier to discover the domain number of a bus by storing the number in pci_host_bridge for the root bus. Several architectures have their own way of storing this information, so it makes sense to try to unify the code.
I like the idea of unifying the way we handle the domain number. But I'd like to see more of the strategy before committing to this approach.
This patch adds struct pci_host_bridge.domain_nr, but of course pci_domain_nr() doesn't use it. It can't today, because pci_create_root_bus() relies on pci_domain_nr() to fill in pci_host_bridge.domain_nr.
But I suppose the intent is that someday we can make pci_domain_nr() arch-independent somehow. I'd just like to see more of the path to that happening.
While at this, add a new function that creates a root bus in a given domain and make pci_create_root_bus() a wrapper around this function.
I'm a little concerned about adding a new "create root bus" interface, partly because we have quite a few already, and I'd like to reduce the number of them instead of adding more. And I think there might be other similar opportunities for unification, so I could easily see us adding new functions in the future to unify NUMA node info, ECAM info, etc.
I wonder if we need some sort of descriptor structure that the arch could fill in and pass to the PCI core. Then we could add new members without having to create new interfaces each time.
On Sat, Apr 05, 2014 at 01:00:07AM +0100, Bjorn Helgaas wrote:
On Fri, Mar 14, 2014 at 03:34:30PM +0000, Liviu Dudau wrote:
Make it easier to discover the domain number of a bus by storing the number in pci_host_bridge for the root bus. Several architectures have their own way of storing this information, so it makes sense to try to unify the code.
I like the idea of unifying the way we handle the domain number. But I'd like to see more of the strategy before committing to this approach.
*My* strategy is to get rid of pci_domain_nr(). I don't see why we need an arch-specific way of providing the number, especially after looking at the existing implementations that return a value from a variable that is never touched or incremented. My guess is that pci_domain_nr() was created to work around the fact that there was no domain_nr maintenance in the generic code.
This patch adds struct pci_host_bridge.domain_nr, but of course pci_domain_nr() doesn't use it. It can't today, because pci_create_root_bus() relies on pci_domain_nr() to fill in pci_host_bridge.domain_nr.
But I suppose the intent is that someday we can make pci_domain_nr() arch-independent somehow. I'd just like to see more of the path to that happening.
The path would be to send a patch that removes all existing pci_domain_nr() macros/inline functions and rely on the generic function.
While at this, add a new function that creates a root bus in a given domain and make pci_create_root_bus() a wrapper around this function.
I'm a little concerned about adding a new "create root bus" interface, partly because we have quite a few already, and I'd like to reduce the number of them instead of adding more. And I think there might be other similar opportunities for unification, so I could easily see us adding new functions in the future to unify NUMA node info, ECAM info, etc.
The reason for creating the wrapper function was to allow for explicit passing of domain_nr. If we find architectures where generic allocation of domain_nr doesn't work for them, we can make them use this wrapper to pass the domain_nr into the host bridge when creating the root bus.
I wonder if we need some sort of descriptor structure that the arch could fill in and pass to the PCI core. Then we could add new members without having to create new interfaces each time.
I'm trying to reduce the number of variables being passed between architectures and generic code. host_bridge (with the associated root bus) and domain_nr are needed. Is there anything else you have in mind that needs to be shared?
My approach would be in sharing of the data: PCI is a standard, and the core framework implements it. What is so broken in your architecture that you need to work around the core code? And I'm not talking about drivers and quirks, but architectural level support.
Best regards, Liviu
On Mon, 2014-04-07 at 09:46 +0100, Liviu Dudau wrote:
*My* strategy is to get rid of pci_domain_nr(). I don't see why we need an arch-specific way of providing the number, especially after looking at the existing implementations that return a value from a variable that is never touched or incremented. My guess is that pci_domain_nr() was created to work around the fact that there was no domain_nr maintenance in the generic code.
Well, there was no generic host bridge structure. There is one now, it should go there.
Cheers, Ben.
On Mon, Apr 07, 2014 at 10:14:18AM +0100, Benjamin Herrenschmidt wrote:
On Mon, 2014-04-07 at 09:46 +0100, Liviu Dudau wrote:
*My* strategy is to get rid of pci_domain_nr(). I don't see why we need an arch-specific way of providing the number, especially after looking at the existing implementations that return a value from a variable that is never touched or incremented. My guess is that pci_domain_nr() was created to work around the fact that there was no domain_nr maintenance in the generic code.
Well, there was no generic host bridge structure. There is one now, it should go there.
Exactly! Hence my patch. After it gets accepted I will go through architectures and remove their version of pci_domain_nr().
Best regards, Liviu
Cheers, Ben.
On Mon, Apr 7, 2014 at 4:07 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Mon, Apr 07, 2014 at 10:14:18AM +0100, Benjamin Herrenschmidt wrote:
On Mon, 2014-04-07 at 09:46 +0100, Liviu Dudau wrote:
*My* strategy is to get rid of pci_domain_nr(). I don't see why we need an arch-specific way of providing the number, especially after looking at the existing implementations that return a value from a variable that is never touched or incremented. My guess is that pci_domain_nr() was created to work around the fact that there was no domain_nr maintenance in the generic code.
Well, there was no generic host bridge structure. There is one now, it should go there.
Exactly! Hence my patch. After it gets accepted I will go through architectures and remove their version of pci_domain_nr().
Currently the arch has to supply pci_domain_nr() because that's the only way for the generic code to learn the domain. After you add pci_create_root_bus_in_domain(), the arch can supply the domain that way, and we won't need the arch-specific pci_domain_nr(). Right? That makes more sense to me; thanks for the explanation.
Let me try to explain my concern about the pci_create_root_bus_in_domain() interface. We currently have these interfaces:
pci_scan_root_bus() pci_scan_bus() pci_scan_bus_parented() pci_create_root_bus()
pci_scan_root_bus() is a higher-level interface than pci_create_root_bus(), so I'm trying to migrate toward it because it lets us remove a little code from the arch, e.g., pci_scan_child_bus() and pci_bus_add_devices().
I think we can only remove the arch-specific pci_domain_nr() if that arch uses pci_create_root_bus_in_domain(). When we convert an arch from using scan_bus interfaces to using pci_create_root_bus_in_domain(), we will have to move the rest of the scan_bus code (pci_scan_child_bus(), pci_bus_add_devices()) back into the arch code.
One alternative is to add an _in_domain() variant of each of these interfaces, but that doesn't seem very convenient either. My idea of passing in a structure would also require adding variants, so there's not really an advantage there, but I am thinking of the next unification effort, e.g., for NUMA node info. I don't really want to have to change all the _in_domain() interfaces to also take yet another parameter for the node number.
Bjorn
On Mon, Apr 07, 2014 at 11:44:51PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 4:07 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Mon, Apr 07, 2014 at 10:14:18AM +0100, Benjamin Herrenschmidt wrote:
On Mon, 2014-04-07 at 09:46 +0100, Liviu Dudau wrote:
*My* strategy is to get rid of pci_domain_nr(). I don't see why we need an arch-specific way of providing the number, especially after looking at the existing implementations that return a value from a variable that is never touched or incremented. My guess is that pci_domain_nr() was created to work around the fact that there was no domain_nr maintenance in the generic code.
Well, there was no generic host bridge structure. There is one now, it should go there.
Exactly! Hence my patch. After it gets accepted I will go through architectures and remove their version of pci_domain_nr().
Currently the arch has to supply pci_domain_nr() because that's the only way for the generic code to learn the domain. After you add pci_create_root_bus_in_domain(), the arch can supply the domain that way, and we won't need the arch-specific pci_domain_nr(). Right? That makes more sense to me; thanks for the explanation.
Right.
Let me try to explain my concern about the pci_create_root_bus_in_domain() interface. We currently have these interfaces:
pci_scan_root_bus() pci_scan_bus() pci_scan_bus_parented() pci_create_root_bus()
pci_scan_root_bus() is a higher-level interface than pci_create_root_bus(), so I'm trying to migrate toward it because it lets us remove a little code from the arch, e.g., pci_scan_child_bus() and pci_bus_add_devices().
I think we can only remove the arch-specific pci_domain_nr() if that arch uses pci_create_root_bus_in_domain(). When we convert an arch from using scan_bus interfaces to using pci_create_root_bus_in_domain(), we will have to move the rest of the scan_bus code (pci_scan_child_bus(), pci_bus_add_devices()) back into the arch code.
One alternative is to add an _in_domain() variant of each of these interfaces, but that doesn't seem very convenient either. My idea of passing in a structure would also require adding variants, so there's not really an advantage there, but I am thinking of the next unification effort, e.g., for NUMA node info. I don't really want to have to change all the _in_domain() interfaces to also take yet another parameter for the node number.
OK, what about this: all the functions that you have mentioned take a void *sysdata parameter. Should we convert this opaque pointer into a specific structure that holds the domain_nr and (in future) the NUMA node info?
Converting all the architectures is going to be a long-winded job, as everyone loved to add their own stuff to the structure pointed to by sysdata; breaking that will lead to frustration, but maybe the framework should take back ownership of it.
Best regards, Liviu
Bjorn
On Tue, Apr 8, 2014 at 4:20 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Mon, Apr 07, 2014 at 11:44:51PM +0100, Bjorn Helgaas wrote:
Let me try to explain my concern about the pci_create_root_bus_in_domain() interface. We currently have these interfaces:
pci_scan_root_bus() pci_scan_bus() pci_scan_bus_parented() pci_create_root_bus() ...
One alternative is to add an _in_domain() variant of each of these interfaces, but that doesn't seem very convenient either. My idea of passing in a structure would also require adding variants, so there's not really an advantage there, but I am thinking of the next unification effort, e.g., for NUMA node info. I don't really want to have to change all the _in_domain() interfaces to also take yet another parameter for the node number.
OK, what about this: all the functions that you have mentioned take a void *sysdata parameter. Should we convert this opaque pointer into a specific structure that holds the domain_nr and (in future) the NUMA node info?
I doubt if we can make sysdata itself generic because I suspect we need a way to have *some* arch-specific data. But maybe the arch could supply a structure containing a struct device *, domain, struct pci_ops *, list of resources, aperture info, etc. I wonder if struct pci_host_bridge would be a reasonable place to put this stuff, e.g., something like this:
struct pci_host_bridge {
        int domain;
        int node;
        struct device *dev;
        struct pci_ops *ops;
        struct list_head resources;
        void *sysdata;
        struct pci_bus *bus;    /* filled in by core, not by arch */
        ...                     /* other existing contents managed by core */
};
struct pci_bus *pci_scan_host_bridge(struct pci_host_bridge *bridge);
On Tue, Apr 08, 2014 at 05:28:39PM +0100, Bjorn Helgaas wrote:
On Tue, Apr 8, 2014 at 4:20 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Mon, Apr 07, 2014 at 11:44:51PM +0100, Bjorn Helgaas wrote:
Let me try to explain my concern about the pci_create_root_bus_in_domain() interface. We currently have these interfaces:
pci_scan_root_bus()
pci_scan_bus()
pci_scan_bus_parented()
pci_create_root_bus()
...
One alternative is to add an _in_domain() variant of each of these interfaces, but that doesn't seem very convenient either. My idea of passing in a structure would also require adding variants, so there's not really an advantage there, but I am thinking of the next unification effort, e.g., for NUMA node info. I don't really want to have to change all the _in_domain() interfaces to also take yet another parameter for the node number.
OK, what about this: all the functions that you have mentioned take a void *sysdata parameter. Should we convert this opaque pointer into a specific structure that holds the domain_nr and (in future) the NUMA node info?
I doubt if we can make sysdata itself generic because I suspect we need a way to have *some* arch-specific data. But maybe the arch could supply a structure containing a struct device *, domain, struct pci_ops *, list of resources, aperture info, etc. I wonder if struct pci_host_bridge would be a reasonable place to put this stuff, e.g., something like this:
struct pci_host_bridge {
        int domain;
        int node;
        struct device *dev;
        struct pci_ops *ops;
        struct list_head resources;
        void *sysdata;
        struct pci_bus *bus;    /* filled in by core, not by arch */
        ...                     /* other existing contents managed by core */
};
struct pci_bus *pci_scan_host_bridge(struct pci_host_bridge *bridge);
I'm really reluctant to give the arches more rope to hang themselves. I really dislike the use of xxxx_initcall() to do PCI initialisation ordering that is currently in widespread use through the arch code. As I hope to have proven with my arm64 code, you can have PCI support for an architecture without having to provide any arch specific code. We have enough correct code in the PCI framework; what would the architectures provide to the generic code that we cannot get by following the standard?
Of course, there are always arch specific corners and they need their data structures to make sense of those, but rather than having architectures fill in a structure *before* we can set up host bridges I think we need to reverse the order. Using your example structure, I don't think it is the arch's job to provide the list of resources or the domain number before we can scan the host bridge. We should be able to get those from somewhere else (like adding by default the ioport_resource and iomem_resource and managing domain numbers inside the core framework).
Best regards, Liviu
On Wed, Apr 9, 2014 at 6:07 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
I'm really reluctant to give the arches more rope to hang themselves.
If you mean the sysdata pointer is rope to hang themselves, I think it would be great if we didn't need sysdata at all. But I think it would be a huge amount of work to get rid of it completely, and keeping it would let us work at that incrementally.
I really dislike the use of xxxx_initcall() to do PCI initialisation ordering that is currently in widespread use through the arch code.
I certainly agree that initcall ordering is fragile and to be avoided when possible.
As I hope to have proven with my arm64 code, you can have PCI support for an architecture without having to provide any arch specific code. We have enough correct code in the PCI framework; what would the architectures provide to the generic code that we cannot get by following the standard?
PCI host bridges are not architected, i.e., the PCI/PCIe specs do not say anything about how to discover them or how to program them. So the arch or a driver for another bus (ACPI, OF, etc.) must enumerate them and discover the bridge apertures (those are in the resource list). And obviously the arch has to provide the root bus number and PCI config accessors.
Of course, there are always arch specific corners and they need their data structures to make sense of those, but rather than having architectures fill in a structure *before* we can set up host bridges I think we need to reverse the order. Using your example structure, I don't think it is the arch's job to provide the list of resources or the domain number before we can scan the host bridge. We should be able to get those from somewhere else (like adding by default the ioport_resource and iomem_resource and managing domain numbers inside the core framework).
It's possible we could manage domain numbers in the core. On ACPI systems, we currently use the ACPI _SEG value as the domain. In some cases, e.g., on ia64, config space access is done via firmware interfaces, and those interfaces expect the _SEG values. We could conceivably maintain a mapping between _SEG and domain, but I'm not sure there's an advantage there.
I probably don't understand what you intend by reversing the order. Are you suggesting something like new pcibios_*() interfaces the arch can use to get the host bridge apertures and domain number?
Bjorn
On Wednesday 09 April 2014 08:02:41 Bjorn Helgaas wrote:
It's possible we could manage domain numbers in the core. On ACPI systems, we currently use the ACPI _SEG value as the domain. In some cases, e.g., on ia64, config space access is done via firmware interfaces, and those interfaces expect the _SEG values. We could conceivably maintain a mapping between _SEG and domain, but I'm not sure there's an advantage there.
I think it's a safe assumption that we will never have more than one firmware trying to enumerate the domains, so it would be safe to let ACPI keep doing its own thing for domain numbers, have the DT code pick domain numbers using some method we come up with, and for everything else let the architecture code deal with it. There are probably very few systems that have multiple domains but use neither ACPI nor DT.
Arnd
On Wed, 2014-04-09 at 08:02 -0600, Bjorn Helgaas wrote:
It's possible we could manage domain numbers in the core. On ACPI systems, we currently use the ACPI _SEG value as the domain. In some cases, e.g., on ia64, config space access is done via firmware interfaces, and those interfaces expect the _SEG values. We could conceivably maintain a mapping between _SEG and domain, but I'm not sure there's an advantage there.
I'd rather keep the ability for the architecture to assign domain numbers.
I'm working on making them relate to the physical slot numbers on our new systems so we get predictable PCI IDs which helps with some stuff like the new network device naming scheme etc...
Predictability is a good thing :-)
Cheers, Ben.
On Wed, Apr 09, 2014 at 08:02:41AM -0600, Bjorn Helgaas wrote:
I'm really reluctant to give the arches more rope to hang themselves.
If you mean the sysdata pointer is rope to hang themselves, I think it would be great if we didn't need sysdata at all. But I think it would be a huge amount of work to get rid of it completely, and keeping it would let us work at that incrementally.
Agree. But then your suggestion was to wrap sysdata inside another structure, which to me constitutes additional rope.
I really dislike the use of xxxx_initcall() to do PCI initialisation ordering that is currently in widespread use through the arch code.
I certainly agree that initcall ordering is fragile and to be avoided when possible.
As I hope to have proven with my arm64 code, you can have PCI support for an architecture without having to provide any arch specific code. We have enough correct code in the PCI framework; what would the architectures provide to the generic code that we cannot get by following the standard?
PCI host bridges are not architected, i.e., the PCI/PCIe specs do not say anything about how to discover them or how to program them. So the arch or a driver for another bus (ACPI, OF, etc.) must enumerate them and discover the bridge apertures (those are in the resource list). And obviously the arch has to provide the root bus number and PCI config accessors.
The "what" for PCI host bridges is defined in the spec. The "how" is implementation defined. What I'm trying to change with the cleanup is the ordering of pci_host_bridge creation: the core creates the structure first ("what"), then the arch has the chance of adding specific data to it (ops, resources, etc.) ("how").
At the moment arm and powerpc do some horrible dances in trying to create their local idea of a host bridge before passing it to the core code.
As for the root bus number, maybe we can offer some sensible default strategy for numbering them and the arches that don't care too much can use that.
Of course, there are always arch specific corners and they need their data structures to make sense of those, but rather than having architectures fill in a structure *before* we can set up host bridges I think we need to reverse the order. Using your example structure, I don't think it is the arch's job to provide the list of resources or the domain number before we can scan the host bridge. We should be able to get those from somewhere else (like adding by default the ioport_resource and iomem_resource and managing domain numbers inside the core framework).
It's possible we could manage domain numbers in the core. On ACPI systems, we currently use the ACPI _SEG value as the domain. In some cases, e.g., on ia64, config space access is done via firmware interfaces, and those interfaces expect the _SEG values. We could conceivably maintain a mapping between _SEG and domain, but I'm not sure there's an advantage there.
I probably don't understand what you intend by reversing the order. Are you suggesting something like new pcibios_*() interfaces the arch can use to get the host bridge apertures and domain number?
Yes. Let's stop having the architectures do early inits so that they can prepare their host bridge structures to be ready for when the PCI framework calls them. Let's create the host bridge in the PCI core and use pcibios_*() to add to it where necessary, without having to race for position.
Best regards, Liviu
On Wed, Apr 9, 2014 at 7:27 PM, Liviu Dudau liviu@dudau.co.uk wrote:
On Wed, Apr 09, 2014 at 08:02:41AM -0600, Bjorn Helgaas wrote:
I'm really reluctant to give the arches more rope to hang themselves.
If you mean the sysdata pointer is rope to hang themselves, I think it would be great if we didn't need sysdata at all. But I think it would be a huge amount of work to get rid of it completely, and keeping it would let us work at that incrementally.
Agree. But then your suggestion was to wrap sysdata inside another structure, which to me constitutes additional rope.
I'll ponder this more, but I don't see your point here yet. The arch already supplies a sysdata pointer to pci_scan_root_bus(), and we stash it in every struct pci_bus already. My idea was just to pass it in differently, as a structure member rather than a separate argument. (And I'm not completely attached to my proposal; it was only to illustrate my concern about the explosion of interfaces if we have to add *_domain(), *_node(), etc.)
The "what" for PCI host bridges is defined in the spec. The "how" is implementation defined. What I'm trying to change with the cleanup is the ordering of pci_host_bridge creation: the core creates the structure first ("what"), then the arch has the chance of adding specific data to it (ops, resources, etc.) ("how").
At the moment arm and powerpc do some horrible dances in trying to create their local idea of a host bridge before passing it to the core code.
As for the root bus number, maybe we can offer some sensible default strategy for numbering them and the arches that don't care too much can use that.
I think we're at the limit of what can be accomplished with the abstractness of English.
My opinion is that it's reasonable for the arch to discover the host bridge properties first and pass them to the core, and it doesn't require unreasonable things of the arch. I know the arm PCI setup is complicated, partly because it deals with a huge number of machines that don't have a consistent firmware interface. The x86/ACPI setup is relatively simple because it deals with a simple firmware interface. But that's just my opinion, and maybe your code will show otherwise.
Bjorn
On Wednesday 09 April 2014 21:48:14 Bjorn Helgaas wrote:
On Wed, Apr 9, 2014 at 7:27 PM, Liviu Dudau liviu@dudau.co.uk wrote:
I'll ponder this more, but I don't see your point here yet. The arch already supplies a sysdata pointer to pci_scan_root_bus(), and we stash it in every struct pci_bus already. My idea was just to pass it in differently, as a structure member rather than a separate argument. (And I'm not completely attached to my proposal; it was only to illustrate my concern about the explosion of interfaces if we have to add *_domain(), *_node(), etc.)
As a minor variation of your suggestion, how about passing in a pointer to struct pci_host_bridge, and embed that within its own private structure? I think this is closer to how a lot of other subsystems do the abstraction.
I think we're at the limit of what can be accomplished with the abstractness of English.
My opinion is that it's reasonable for the arch to discover the host bridge properties first and pass them to the core, and it doesn't require unreasonable things of the arch. I know the arm PCI setup is complicated, partly because it deals with a huge number of machines that don't have a consistent firmware interface. The x86/ACPI setup is relatively simple because it deals with a simple firmware interface. But that's just my opinion, and maybe your code will show otherwise.
Makes sense. One of the areas where the PCI code shows its age is the way the various parts link together: there are function calls going back and forth between architecture specific files and generic files, rather than a hierarchy in which generic code is called by more specific code.
To do the probing properly, I think it's totally fine to have the core code expect stuff like the resources and domain number to be filled out already by whoever calls it, but then have wrappers around it that get this information from a firmware interface, or from hardwired architecture specific code where necessary.
Arnd
On Thu, Apr 10, 2014 at 2:00 AM, Arnd Bergmann arnd@arndb.de wrote:
On Wednesday 09 April 2014 21:48:14 Bjorn Helgaas wrote:
As a minor variation of your suggestion, how about passing in a pointer to struct pci_host_bridge, and embed that within its own private structure? I think this is closer to how a lot of other subsystems do the abstraction.
I'm not sure I'm following you; you mean the arch-specific sysdata structure would contain a pointer to struct pci_host_bridge?
I have to admit that I'm not up on how other subsystems handle this sort of abstraction. Do you have any pointers to good examples that I can study?
Bjorn
On Thursday 10 April 2014 07:50:52 Bjorn Helgaas wrote:
On Thu, Apr 10, 2014 at 2:00 AM, Arnd Bergmann arnd@arndb.de wrote:
I'm not sure I'm following you; you mean the arch-specific sysdata structure would contain a pointer to struct pci_host_bridge?
I have to admit that I'm not up on how other subsystems handle this sort of abstraction. Do you have any pointers to good examples that I can study?
What I mean is like this:
/* generic structure */
struct pci_host_bridge {
        int domain;
        int node;
        struct device *dev;
        struct pci_ops *ops;
        struct list_head resources;
        struct pci_bus *bus;    /* filled in by core, not by arch */
        ...                     /* other existing contents managed by core */
};

/* arm specific structure */
struct pci_sys_data {
        char io_res_name[12];
        /* Bridge swizzling */
        u8 (*swizzle)(struct pci_dev *, u8 *);
        /* IRQ mapping */
        int (*map_irq)(const struct pci_dev *, u8, u8);
        /* Resource alignment requirements */
        void (*add_bus)(struct pci_bus *bus);
        void (*remove_bus)(struct pci_bus *bus);
        void *private_data;     /* platform controller private data */

        /* not a pointer: */
        struct pci_host_bridge bridge;
};

static inline struct pci_sys_data *to_pci_sys_data(struct pci_host_bridge *bridge)
{
        return container_of(bridge, struct pci_sys_data, bridge);
}

/* arm specific, driver specific structure */
struct tegra_pcie {
        void __iomem *pads;
        void __iomem *afi;

        struct clk *pex_clk;
        struct clk *afi_clk;
        struct clk *pll_e;
        struct clk *cml_clk;

        struct tegra_msi msi;

        struct list_head ports;
        unsigned int num_ports;

        struct pci_sys_data sysdata;
};

static inline struct tegra_pcie *to_tegra_pcie(struct pci_sys_data *sysdata)
{
        return container_of(sysdata, struct tegra_pcie, sysdata);
}
This mirrors how we treat devices: a pci_device has an embedded device, and so on, in other subsystems we can have multiple layers.
In this example, the tegra pcie driver then allocates its own tegra_pcie structure, fills out the fields it needs, and registers it with the ARM architecture code, passing just the pci_sys_data pointer. That function in turn passes a pointer to the embedded pci_host_bridge down to the generic code. Ideally we should try to eliminate the architecture specific portion here, but that is a later step.
Arnd
On Thu, Apr 10, 2014 at 03:07:44PM +0100, Arnd Bergmann wrote:
So Arnd seems to agree with me: we should try to get out of architecture specific pci_sys_data and link the host bridge driver straight into the PCI core. The core then can call into arch code via pcibios_*() functions.
Arnd, am I reading correctly into what you are saying?
Liviu
On Thursday 10 April 2014 15:53:04 Liviu Dudau wrote:
On Thu, Apr 10, 2014 at 03:07:44PM +0100, Arnd Bergmann wrote:
So Arnd seems to agree with me: we should try to get out of architecture specific pci_sys_data and link the host bridge driver straight into the PCI core. The core then can call into arch code via pcibios_*() functions.
Arnd, am I reading correctly into what you are saying?
Half of it ;-)
I think it would be better to not have an architecture specific data structure, just like it would be better not to have architecture specific pcibios_* functions that get called by the PCI core. Note that the architecture specific functions are the ones that rely on the architecture specific data structures as well. If they only use the common fields, it should also be possible to share the code.
I also don't realistically think we can get there on a lot of architectures any time soon. Note that most architectures only have one PCI host implementation, so the architecture structure is the same as the host driver structure anyway.
For architectures like powerpc and arm that have people actively working on them, we have a chance to clean up that code in the way we want it (if we can agree on the direction), but it's still not trivial to do.
Speaking of arm32 in particular, I think we will end up with a split approach: modern platforms (multiplatform, possibly all DT based) using PCI core infrastructure directly and no architecture specific PCI code on the one side, and a variation of today's code for the legacy platforms on the other.
Arnd
On Thu, 2014-04-10 at 22:46 +0200, Arnd Bergmann wrote:
Half of it ;-)
I think it would be better to not have an architecture specific data structure, just like it would be better not to have architecture specific pcibios_* functions that get called by the PCI core. Note that the architecture specific functions are the ones that rely on the architecture specific data structures as well. If they only use the common fields, it should also be possible to share the code.
I don't understand... we'll never get rid of architecture specific hooks in one form or another.
We'll always need to do some things in an architecture- or host-bridge-specific way. Now if you don't want to call them arch hooks, then call them host bridge ops, but they are needed, and thus they will need some kind of architecture-specific extension to the base host bridge structure.
EEH is one big nasty example on powerpc.
Another random one that happens to be hot in my brain right now because we just finished debugging it: On powernv, we are just fixing a series of bugs caused by the generic code trying to do hot resets on PCIe "by hand" by directly toggling the secondary reset register in bridges.
Well, on our root complexes, this triggers a link loss which triggers a fatal EEH "ER_all" interrupt which we escalate into a fence and all hell breaks loose.
We need to mask some error traps in the hardware before doing something that can cause an intentional link loss... and unmask them when done. (Among other things, there are other issues on P7 with hot reset).
So hot reset must be an architecture hook.
PERST (fundamental reset) can *only* be a hook. The way to generate a PERST is not specified. In fact, on our machines, we have special GPIOs we can use to generate PERST on individual slots below a PLX bridge and a different method for slots directly on a PHB.
Eventually most of those hooks land into firmware, and as such it's akin to ACPI which also keeps a separate state structure and a pile of hooks.
I also don't realistically think we can get there on a lot of architectures any time soon. Note that most architectures only have one PCI host implementation, so the architecture structure is the same as the host driver structure anyway.
For architectures like powerpc and arm that have people actively working on them, we have a chance to clean up that code in the way we want it (if we can agree on the direction), but it's still not trivial to do.
Speaking of arm32 in particular, I think we will end up with a split approach: modern platforms (multiplatform, possibly all DT based) using PCI core infrastructure directly and no architecture specific PCI code on the one side, and a variation of today's code for the legacy platforms on the other.
Arnd
On Friday 11 April 2014 15:01:09 Benjamin Herrenschmidt wrote:
On Thu, 2014-04-10 at 22:46 +0200, Arnd Bergmann wrote:
Half of it ;-)
I think it would be better to not have an architecture specific data structure, just like it would be better not to have architecture specific pcibios_* functions that get called by the PCI core. Note that the architecture specific functions are the ones that rely on the architecture specific data structures as well. If they only use the common fields, it should also be possible to share the code.
I don't understand... we'll never get rid of architecture specific hooks in one form or another.
We'll always need to do some things in an architecture- or host-bridge-specific way. Now if you don't want to call them arch hooks, then call them host bridge ops, but they are needed, and thus they will need some kind of architecture-specific extension to the base host bridge structure.
Absolutely right. The thing I'd like to get rid of in the long run is global functions defined in the architecture code that are called by core code for host bridge specific functionality. In a lot of cases, they should not be needed if we can express the same things in a generic way. In other cases, we can use function pointers that are set at the time that the host bridge is registered.
EEH is one big nasty example on powerpc.
Another random one that happens to be hot in my brain right now because we just finished debugging it: On powernv, we are just fixing a series of bugs caused by the generic code trying to do hot resets on PCIe "by hand" by directly toggling the secondary reset register in bridges.
Well, on our root complexes, this triggers a link loss which triggers a fatal EEH "ER_all" interrupt which we escalate into a fence and all hell breaks loose.
We need to mask some error traps in the hardware before doing something that can cause an intentional link loss... and unmask them when done. (Among other things, there are other issues on P7 with hot reset).
So hot reset must be an architecture hook.
This sounds to me very much host bridge specific, not architecture specific. If you have the same host bridge in an ARM system, you'd want the same things to happen, and if you have another host bridge on PowerPC, you probably don't want that code to be called.
PERST (fundamental reset) can *only* be a hook. The way to generate a PERST is not specified. In fact, on our machines, we have special GPIOs we can use to generate PERST on individual slots below a PLX bridge and a different method for slots directly on a PHB.
Eventually most of those hooks land into firmware, and as such it's akin to ACPI which also keeps a separate state structure and a pile of hooks.
On PowerPC, there are currently a bunch of platform specific callbacks in the ppc_md: pcibios_after_init, pci_exclude_device, pcibios_fixup_resources, pcibios_fixup_bus, pcibios_enable_device_hook, pcibios_fixup_phb, pcibios_window_alignment, and possibly some more. There is some architecture specific code that gets called by the PCI core, with the main purpose of calling into these.
On ARM32, we have a similar set of callbacks in the architecture private pci_sys_data: swizzle, map_irq, align_resource, add_bus, remove_bus, and some more callbacks for setup in the hw_pci structure that is used at initialization time: setup, scan, preinit, postinit. Again, these are called from architecture specific code that gets called by the PCI core.
I'm sure some of the other architectures have similar things, most of them probably less verbose because there is less variability between subarchitectures.
I think a nice way to deal with these in the long run would be to have a generic 'struct pci_host_bridge_ops' that can be defined by the architecture or the platform, or a particular host bridge driver. We'd have to define exactly which function pointers would go in there, but a good start would be the set of functions that are today provided by each architecture. The reset method you describe above would also fit well into this.
A host bridge driver can fill out the pointers with its own functions, or put platform or architecture specific function pointers in there, that get called by the PCI core. There are multiple ways to deal with default implementations here, one way would be that the core just falls back to a generic implementation (which is currently the __weak function) if it sees a NULL pointer. Another way would be to require each driver to either fill out all pointers or none of them, in which case we would use a default pci_host_bridge_ops struct that contains the pointers to the global pcibios_*() functions.
Arnd
On Fri, 2014-04-11 at 10:36 +0200, Arnd Bergmann wrote:
EEH is one big nasty example on powerpc.
Another random one that happens to be hot in my brain right now because we just finished debugging it: On powernv, we are just fixing a series of bugs caused by the generic code trying to do hot resets on PCIe "by hand" by directly toggling the secondary reset register in bridges.
Well, on our root complexes, this triggers a link loss which triggers a fatal EEH "ER_all" interrupt which we escalate into a fence and all hell breaks loose.
We need to mask some error traps in the hardware before doing something that can cause an intentional link loss... and unmask them when done. (Among other things, there are other issues on P7 with hot reset).
So hot reset must be an architecture hook.
This sounds to me very much host bridge specific, not architecture specific. If you have the same host bridge in an ARM system, you'd want the same things to happen, and if you have another host bridge on PowerPC, you probably don't want that code to be called.
Yes, it is partially host bridge specific, partially firmware related (OPAL vs. ACPI, vs....) so it's a mixture here.
So I do agree, host bridge ops to replace most of these pcibios_* hooks does make sense.
PERST (fundamental reset) can *only* be a hook. The way to generate a PERST is not specified. In fact, on our machines, we have special GPIOs we can use to generate PERST on individual slots below a PLX bridge and a different method for slots directly on a PHB.
Eventually most of those hooks land into firmware, and as such it's akin to ACPI which also keeps a separate state structure and a pile of hooks.
On PowerPC, there are currently a bunch of platform specific callbacks in the ppc_md: pcibios_after_init, pci_exclude_device, pcibios_fixup_resources, pcibios_fixup_bus, pcibios_enable_device_hook, pcibios_fixup_phb, pcibios_window_alignment, and possibly some more. There is some architecture specific code that gets called by the PCI core, with the main purpose of calling into these.
Yes. Most of them could be made into host bridge hooks fairly easily.
The remaining ones are going to need somebody (probably me) to untangle :-)
On ARM32, we have a similar set of callbacks in the architecture private pci_sys_data: swizzle, map_irq, align_resource, add_bus, remove_bus, and some more callbacks for setup in the hw_pci structure that is used at initialization time: setup, scan, preinit, postinit. Again, these are called from architecture specific code that gets called by the PCI core.
I'm sure some of the other architectures have similar things, most of them probably less verbose because there is less variability between subarchitectures.
I think a nice way to deal with these in the long run would be to have a generic 'struct pci_host_bridge_ops' that can be defined by the architecture or the platform, or a particular host bridge driver.
Well, the host bridge needs either a "driver", or to be subclassed, or both...
We'd have to define exactly which function pointers would go in there, but a good start would be the set of functions that are today provided by each architecture. The reset method you describe above would also fit well into this.
A host bridge driver can fill out the pointers with its own functions, or put platform or architecture specific function pointers in there, that get called by the PCI core. There are multiple ways to deal with default implementations here, one way would be that the core just falls back to a generic implementation (which is currently the __weak function) if it sees a NULL pointer. Another way would be to require each driver to either fill out all pointers or none of them, in which case we would use a default pci_host_bridge_ops struct that contains the pointers to the global pcibios_*() functions.
Either works, we can start with the easy ones like window alignment, and move from there.
Ben.
On Thu, Apr 10, 2014 at 09:46:36PM +0100, Arnd Bergmann wrote:
On Thursday 10 April 2014 15:53:04 Liviu Dudau wrote:
On Thu, Apr 10, 2014 at 03:07:44PM +0100, Arnd Bergmann wrote:
This mirrors how we treat devices: a pci_device has an embedded device, and so on, in other subsystems we can have multiple layers.
In this example, the tegra pcie driver then allocates its own tegra_pcie structure, fills out the fields it needs, and registers it with the ARM architecture code, passing just the pci_sys_data pointer. That function in turn passes a pointer to the embedded pci_host_bridge down to the generic code. Ideally we should try to eliminate the architecture specific portion here, but that is a later step.
So Arnd seems to agree with me: we should try to get out of architecture specific pci_sys_data and link the host bridge driver straight into the PCI core. The core then can call into arch code via pcibios_*() functions.
Arnd, am I reading correctly into what you are saying?
Half of it ;-)
I think it would be better to not have an architecture specific data structure, just like it would be better not to have architecture specific pcibios_* functions that get called by the PCI core. Note that the architecture specific functions are the ones that rely on the architecture specific data structures as well. If they only use the common fields, it should also be possible to share the code.
While I've come to like the pcibios_*() interface (and yes, it could be formalised and abstracted into a pci_xxxx_ops structure), I don't like the fact that those functions use architectural data in order to function. I know it might sound strange, as they *are* supposed to be implemented by the arches, but in my mind the link between generic code and arch code for PCI should be made by the host bridge driver. That's how the PCI spec describes it, and I see no reason why we should not be able to adopt the same view.
To be more precise, what I would like to happen for some functions is for the PCI core to call a pci_host_bridge_ops method, which in turn calls the arch-specific code if it needs to. Why do I think that would be better? Because otherwise you add code on the architecture side to cope with a certain host bridge; then another host bridge comes in and you add more architecture code; and then, when you port host bridge X to arch B, you discover that you need to add code there as well for X. It all ends up in the mess we currently have, where the drivers in drivers/pci/host cannot be ported to a different architecture because they rely on infrastructure that is only present in arm32 and not properly documented.
I also don't realistically think we can get there on a lot of architectures any time soon. Note that most architectures only have one PCI host implementation, so the architecture structure is the same as the host driver structure anyway.
For architectures like powerpc and arm that have people actively working on them, we have a chance to clean up that code in the way we want it (if we can agree on the direction), but it's still not trivial to do.
Speaking of arm32 in particular, I think we will end up with a split approach: modern platforms (multiplatform, possibly all DT based) using PCI core infrastructure directly and no architecture specific PCI code on the one side, and a variation of today's code for the legacy platforms on the other.
Actually, if we could come up with a compromise for the pci_fixup_*() functions (are they still used by functional hardware?) then I think we could convert most of the arm32 arch code to redirect the calls to the infrastructure code. But yes, there might be a lot of resistance to change due to lack of resources when changing old platforms.
Best regards, Liviu
Arnd
On Friday 11 April 2014 10:22:25 Liviu Dudau wrote:
On Thu, Apr 10, 2014 at 09:46:36PM +0100, Arnd Bergmann wrote:
On Thursday 10 April 2014 15:53:04 Liviu Dudau wrote:
So Arnd seems to agree with me: we should try to get out of architecture specific pci_sys_data and link the host bridge driver straight into the PCI core. The core then can call into arch code via pcibios_*() functions.
Arnd, am I reading correctly into what you are saying?
Half of it ;-)
I think it would be better to not have an architecture specific data structure, just like it would be better not to have architecture specific pcibios_* functions that get called by the PCI core. Note that the architecture specific functions are the ones that rely on the architecture specific data structures as well. If they only use the common fields, it should also be possible to share the code.
While I've come to like the pcibios_*() interface (and yes, it could be formalised and abstracted into a pci_xxxx_ops structure), I don't like the fact that those functions use architectural data in order to function. I know it might sound strange, as they *are* supposed to be implemented by the arches, but in my mind the link between generic code and arch code for PCI should be made by the host bridge driver. That's how the PCI spec describes it, and I see no reason why we should not be able to adopt the same view.
Yes, that's a good goal for the architectures that need the complexity. I would also like to have a way to change as little as possible for the architectures that don't care about this because they only have one possible host controller implementation, which isn't necessarily a conflict.
To be more precise, what I would like to happen for some functions is for the PCI core to call a pci_host_bridge_ops method, which in turn calls the arch-specific code if it needs to. Why do I think that would be better? Because otherwise you add code on the architecture side to cope with a certain host bridge; then another host bridge comes in and you add more architecture code; and then, when you port host bridge X to arch B, you discover that you need to add code there as well for X. It all ends up in the mess we currently have, where the drivers in drivers/pci/host cannot be ported to a different architecture because they rely on infrastructure that is only present in arm32 and not properly documented.
Right. Now it was intentional that we started putting the host drivers into drivers/pci/host before cleaning it all up. We just had to start somewhere.
I also don't realistically think we can get there on a lot of architectures any time soon. Note that most architectures only have one PCI host implementation, so the architecture structure is the same as the host driver structure anyway.
For architectures like powerpc and arm that have people actively working on them, we have a chance to clean up that code in the way we want it (if we can agree on the direction), but it's still not trivial to do.
Speaking of arm32 in particular, I think we will end up with a split approach: modern platforms (multiplatform, possibly all DT based) using PCI core infrastructure directly and no architecture specific PCI code on the one side, and a variation of today's code for the legacy platforms on the other.
Actually, if we could come up with a compromise for the pci_fixup_*() functions (are they still used by functional hardware?) then I think we could convert most of the arm32 arch code to redirect the calls to the infrastructure code.
The fixups are used by hardware that we want to keep supporting, but I don't see a problem there. None of them rely on the architecture specific PCI implementation, and we could easily move the fixup code into a separate file. Also, I suspect they are all used only on platforms that won't be using CONFIG_ARCH_MULTIPLATFORM.
But yes, there might be a lot of resistance to change due to lack of resources when changing old platforms.
Well, it should be trivial to just create a pci_host_bridge_ops structure containing the currently global functions, and use that for everything registered through pci_common_init_dev(). We definitely have to support this method for things like iop/ixp/pxa/sa1100/footbridge, especially those that have their own concept of PCI domains.
For the more modern multiplatform stuff that uses DT for probing and has a driver in drivers/pci/host, we should be able to use completely distinct pci_host_bridge_ops structure that can be shared with arm64.
Arnd
On Mon, Apr 07, 2014 at 11:44:51PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 4:07 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Mon, Apr 07, 2014 at 10:14:18AM +0100, Benjamin Herrenschmidt wrote:
On Mon, 2014-04-07 at 09:46 +0100, Liviu Dudau wrote:
*My* strategy is to get rid of pci_domain_nr(). I don't see why we need an arch-specific way of providing the number, especially after looking at the existing implementations that return a value from a variable that is never touched or incremented. My guess is that pci_domain_nr() was created to work around the fact that there was no domain_nr maintenance in the generic code.
Well, there was no generic host bridge structure. There is one now, it should go there.
Exactly! Hence my patch. After it gets accepted I will go through architectures and remove their version of pci_domain_nr().
Currently the arch has to supply pci_domain_nr() because that's the only way for the generic code to learn the domain. After you add pci_create_root_bus_in_domain(), the arch can supply the domain that way, and we won't need the arch-specific pci_domain_nr(). Right? That makes more sense to me; thanks for the explanation.
Let me try to explain my concern about the pci_create_root_bus_in_domain() interface. We currently have these interfaces:
pci_scan_root_bus()
pci_scan_bus()
pci_scan_bus_parented()
pci_create_root_bus()
pci_scan_root_bus() is a higher-level interface than pci_create_root_bus(), so I'm trying to migrate toward it because it lets us remove a little code from the arch, e.g., pci_scan_child_bus() and pci_bus_add_devices().
I think we can only remove the arch-specific pci_domain_nr() if that arch uses pci_create_root_bus_in_domain(). When we convert an arch from using scan_bus interfaces to using pci_create_root_bus_in_domain(), we will have to move the rest of the scan_bus code (pci_scan_child_bus(), pci_bus_add_devices()) back into the arch code.
One alternative is to add an _in_domain() variant of each of these interfaces, but that doesn't seem very convenient either. My idea of passing in a structure would also require adding variants, so there's not really an advantage there, but I am thinking of the next unification effort, e.g., for NUMA node info. I don't really want to have to change all the _in_domain() interfaces to also take yet another parameter for the node number.
Resurrecting this thread as I'm about to send an updated patch:
TL;DR: Bjorn is concerned that my introduction of an _in_domain() version of pci_create_root_bus(), as a way to pass a domain number from the arch code down (or up?) into the generic PCI code, is incomplete: the other APIs he listed use the non-domain-aware version of pci_create_root_bus(), and since he plans to remove that function in favour of higher-level APIs like pci_scan_root_bus(), we would have to introduce _in_domain() versions of those higher-level functions as well.
After a bit of thought I believe the change I'm proposing is fine precisely because it is a low-level API. My intention is to automate the management of PCI domain numbers, and any architecture that wants to go against that should probably use the lower-abstraction API to better control the flow. So, in my updated v8 version of the patch I'm going to keep the suggestion *as is* and hope we can have a(nother) discussion and come to a conclusion.
Best regards, Liviu
Bjorn
On Mon, Apr 07, 2014 at 11:44:51PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 4:07 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Mon, Apr 07, 2014 at 10:14:18AM +0100, Benjamin Herrenschmidt wrote:
On Mon, 2014-04-07 at 09:46 +0100, Liviu Dudau wrote:
*My* strategy is to get rid of pci_domain_nr(). I don't see why we need an arch-specific way of providing the number, especially after looking at the existing implementations that return a value from a variable that is never touched or incremented. My guess is that pci_domain_nr() was created to work around the fact that there was no domain_nr maintenance in the generic code.
Well, there was no generic host bridge structure. There is one now, it should go there.
Exactly! Hence my patch. After it gets accepted I will go through architectures and remove their version of pci_domain_nr().
Currently the arch has to supply pci_domain_nr() because that's the only way for the generic code to learn the domain. After you add pci_create_root_bus_in_domain(), the arch can supply the domain that way, and we won't need the arch-specific pci_domain_nr(). Right? That makes more sense to me; thanks for the explanation.
Let me try to explain my concern about the pci_create_root_bus_in_domain() interface. We currently have these interfaces:
pci_scan_root_bus()
pci_scan_bus()
pci_scan_bus_parented()
pci_create_root_bus()
pci_scan_root_bus() is a higher-level interface than pci_create_root_bus(), so I'm trying to migrate toward it because it lets us remove a little code from the arch, e.g., pci_scan_child_bus() and pci_bus_add_devices().
I think we can only remove the arch-specific pci_domain_nr() if that arch uses pci_create_root_bus_in_domain(). When we convert an arch from using scan_bus interfaces to using pci_create_root_bus_in_domain(), we will have to move the rest of the scan_bus code (pci_scan_child_bus(), pci_bus_add_devices()) back into the arch code.
One alternative is to add an _in_domain() variant of each of these interfaces, but that doesn't seem very convenient either. My idea of passing in a structure would also require adding variants, so there's not really an advantage there, but I am thinking of the next unification effort, e.g., for NUMA node info. I don't really want to have to change all the _in_domain() interfaces to also take yet another parameter for the node number.
Bjorn,
I'm coming around to your way of thinking and I want to suggest a strategy for adding the domain number into the PCI framework.
My understanding is that when pci_host_bridge structure was introduced you were trying to keep the APIs unchanged and hence the creation of a bridge was hidden inside the pci_create_root_bus() function.
If we want to store the domain_nr information in the host bridge structure, together with a pointer to sysdata, then we need to break up the creation of the pci_host_bridge from the creation of a root bus. At that moment, pci_scan_root_bus() will need to be changed to accept a pci_host_bridge pointer, while pci_scan_bus() and pci_scan_bus_parented() will create the host bridge in the body of their function.
Did I understand your intentions correctly this time? Do you agree with this plan?
Best regards, Liviu
Bjorn
On Fri, Jul 4, 2014 at 8:57 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Mon, Apr 07, 2014 at 11:44:51PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 4:07 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Mon, Apr 07, 2014 at 10:14:18AM +0100, Benjamin Herrenschmidt wrote:
On Mon, 2014-04-07 at 09:46 +0100, Liviu Dudau wrote:
*My* strategy is to get rid of pci_domain_nr(). I don't see why we need an arch-specific way of providing the number, especially after looking at the existing implementations that return a value from a variable that is never touched or incremented. My guess is that pci_domain_nr() was created to work around the fact that there was no domain_nr maintenance in the generic code.
Well, there was no generic host bridge structure. There is one now, it should go there.
Exactly! Hence my patch. After it gets accepted I will go through architectures and remove their version of pci_domain_nr().
Currently the arch has to supply pci_domain_nr() because that's the only way for the generic code to learn the domain. After you add pci_create_root_bus_in_domain(), the arch can supply the domain that way, and we won't need the arch-specific pci_domain_nr(). Right? That makes more sense to me; thanks for the explanation.
Let me try to explain my concern about the pci_create_root_bus_in_domain() interface. We currently have these interfaces:
pci_scan_root_bus()
pci_scan_bus()
pci_scan_bus_parented()
pci_create_root_bus()
pci_scan_root_bus() is a higher-level interface than pci_create_root_bus(), so I'm trying to migrate toward it because it lets us remove a little code from the arch, e.g., pci_scan_child_bus() and pci_bus_add_devices().
I think we can only remove the arch-specific pci_domain_nr() if that arch uses pci_create_root_bus_in_domain(). When we convert an arch from using scan_bus interfaces to using pci_create_root_bus_in_domain(), we will have to move the rest of the scan_bus code (pci_scan_child_bus(), pci_bus_add_devices()) back into the arch code.
One alternative is to add an _in_domain() variant of each of these interfaces, but that doesn't seem very convenient either. My idea of passing in a structure would also require adding variants, so there's not really an advantage there, but I am thinking of the next unification effort, e.g., for NUMA node info. I don't really want to have to change all the _in_domain() interfaces to also take yet another parameter for the node number.
... My understanding is that when pci_host_bridge structure was introduced you were trying to keep the APIs unchanged and hence the creation of a bridge was hidden inside the pci_create_root_bus() function.
You mean pci_alloc_host_bridge()? Right; ideally I would have used pci_scan_root_bus() everywhere and gotten rid of pci_create_root_bus(). The outline of pci_scan_root_bus() is:
pci_create_root_bus()
pci_scan_child_bus()
pci_bus_add_devices()
The problem was that several arches do interesting things scattered among that core. The ACPI host bridge driver used on x86 and ia64 does resource allocation before pci_bus_add_devices(), as does parisc. Probably all arches should do this, but they don't.
And powerpc and sparc use of_scan_bus() or something similar instead of pci_scan_child_bus(). They probably *could* provide config space accessors that talk to OF and would allow pci_scan_child_bus() to work. But that seemed like too much work at the time.
If we want to store the domain_nr information in the host bridge structure, together with a pointer to sysdata, then we need to break up the creation of the pci_host_bridge from the creation of a root bus. At that moment, pci_scan_root_bus() will need to be changed to accept a pci_host_bridge pointer, while pci_scan_bus() and pci_scan_bus_parented() will create the host bridge in the body of their function.
It's hard to change an existing interface like pci_scan_root_bus() because it's called from so many places and you have to change them all at once. Then if something goes wrong, the revert makes a mess for everybody. But I think it makes sense to add a new interface that does what you want.
Bjorn
On Tue, Jul 08, 2014 at 02:11:36AM +0100, Bjorn Helgaas wrote:
On Fri, Jul 4, 2014 at 8:57 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Mon, Apr 07, 2014 at 11:44:51PM +0100, Bjorn Helgaas wrote:
On Mon, Apr 7, 2014 at 4:07 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Mon, Apr 07, 2014 at 10:14:18AM +0100, Benjamin Herrenschmidt wrote:
On Mon, 2014-04-07 at 09:46 +0100, Liviu Dudau wrote:
*My* strategy is to get rid of pci_domain_nr(). I don't see why we need an arch-specific way of providing the number, especially after looking at the existing implementations that return a value from a variable that is never touched or incremented. My guess is that pci_domain_nr() was created to work around the fact that there was no domain_nr maintenance in the generic code.
Well, there was no generic host bridge structure. There is one now, it should go there.
Exactly! Hence my patch. After it gets accepted I will go through architectures and remove their version of pci_domain_nr().
Currently the arch has to supply pci_domain_nr() because that's the only way for the generic code to learn the domain. After you add pci_create_root_bus_in_domain(), the arch can supply the domain that way, and we won't need the arch-specific pci_domain_nr(). Right? That makes more sense to me; thanks for the explanation.
Let me try to explain my concern about the pci_create_root_bus_in_domain() interface. We currently have these interfaces:
pci_scan_root_bus() pci_scan_bus() pci_scan_bus_parented() pci_create_root_bus()
pci_scan_root_bus() is a higher-level interface than pci_create_root_bus(), so I'm trying to migrate toward it because it lets us remove a little code from the arch, e.g., pci_scan_child_bus() and pci_bus_add_devices().
I think we can only remove the arch-specific pci_domain_nr() if that arch uses pci_create_root_bus_in_domain(). When we convert an arch from using scan_bus interfaces to using pci_create_root_bus_in_domain(), we will have to move the rest of the scan_bus code (pci_scan_child_bus(), pci_bus_add_devices()) back into the arch code.
One alternative is to add an _in_domain() variant of each of these interfaces, but that doesn't seem very convenient either. My idea of passing in a structure would also require adding variants, so there's not really an advantage there, but I am thinking of the next unification effort, e.g., for NUMA node info. I don't really want to have to change all the _in_domain() interfaces to also take yet another parameter for the node number.
... My understanding is that when the pci_host_bridge structure was introduced you were trying to keep the APIs unchanged, and hence the creation of a bridge was hidden inside the pci_create_root_bus() function.
You mean pci_alloc_host_bridge()? Right; ideally I would have used pci_scan_root_bus() everywhere and gotten rid of pci_create_root_bus(). The outline of pci_scan_root_bus() is:
pci_create_root_bus() pci_scan_child_bus() pci_bus_add_devices()
The problem was that several arches do interesting things scattered among that core. The ACPI host bridge driver used on x86 and ia64 does resource allocation before pci_bus_add_devices(), as does parisc. Probably all arches should do this, but they don't.
And powerpc and sparc use of_scan_bus() or something similar instead of pci_scan_child_bus(). They probably *could* provide config space accessors that talk to OF and would allow pci_scan_child_bus() to work. But that seemed like too much work at the time.
If we want to store the domain_nr information in the host bridge structure, together with a pointer to sysdata, then we need to break up the creation of the pci_host_bridge from the creation of a root bus. At that moment, pci_scan_root_bus() will need to be changed to accept a pci_host_bridge pointer, while pci_scan_bus() and pci_scan_bus_parented() will create the host bridge in the body of their function.
It's hard to change an existing interface like pci_scan_root_bus() because it's called from so many places and you have to change them all at once. Then if something goes wrong, the revert makes a mess for everybody. But I think it makes sense to add a new interface that does what you want.
OK, I understand your concern. It does sort of return us to the initial discussion, where you were arguing against adding a new set of functions for every existing function, but it makes sense from a transition point of view.
Best regards, Liviu
Bjorn
This is a useful function and we should make it visible outside the generic PCI code. Export it as a GPL symbol.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Tested-by: Tanmay Inamdar tinamdar@apm.com --- drivers/pci/host-bridge.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c index 06ace62..8708b652 100644 --- a/drivers/pci/host-bridge.c +++ b/drivers/pci/host-bridge.c @@ -17,12 +17,13 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus) return bus; }
-static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus) +struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus) { struct pci_bus *root_bus = find_pci_root_bus(bus);
return to_pci_host_bridge(root_bus->bridge); } +EXPORT_SYMBOL_GPL(find_pci_host_bridge);
void pci_set_host_bridge_release(struct pci_host_bridge *bridge, void (*release_fn)(struct pci_host_bridge *),
On Fri, Mar 14, 2014 at 03:34:31PM +0000, Liviu Dudau wrote:
This is a useful function and we should make it visible outside the generic PCI code. Export it as a GPL symbol.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Tested-by: Tanmay Inamdar tinamdar@apm.com
drivers/pci/host-bridge.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c index 06ace62..8708b652 100644 --- a/drivers/pci/host-bridge.c +++ b/drivers/pci/host-bridge.c @@ -17,12 +17,13 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus) return bus; } -static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus) +struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus) { struct pci_bus *root_bus = find_pci_root_bus(bus); return to_pci_host_bridge(root_bus->bridge); } +EXPORT_SYMBOL_GPL(find_pci_host_bridge);
Do you have a place where you actually need to use find_pci_host_bridge()? I'd rather not export it, even as _GPL, unless we have a user.
If we *do* export it, I'd like it to have a more conventional name, e.g., something starting with "pci_".
void pci_set_host_bridge_release(struct pci_host_bridge *bridge, void (*release_fn)(struct pci_host_bridge *), -- 1.9.0
On Sat, Apr 05, 2014 at 12:39:48AM +0100, Bjorn Helgaas wrote:
On Fri, Mar 14, 2014 at 03:34:31PM +0000, Liviu Dudau wrote:
This is a useful function and we should make it visible outside the generic PCI code. Export it as a GPL symbol.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Tested-by: Tanmay Inamdar tinamdar@apm.com
drivers/pci/host-bridge.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c index 06ace62..8708b652 100644 --- a/drivers/pci/host-bridge.c +++ b/drivers/pci/host-bridge.c @@ -17,12 +17,13 @@ static struct pci_bus *find_pci_root_bus(struct pci_bus *bus) return bus; } -static struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus) +struct pci_host_bridge *find_pci_host_bridge(struct pci_bus *bus) { struct pci_bus *root_bus = find_pci_root_bus(bus); return to_pci_host_bridge(root_bus->bridge); } +EXPORT_SYMBOL_GPL(find_pci_host_bridge);
Do you have a place where you actually need to use find_pci_host_bridge()? I'd rather not export it, even as _GPL, unless we have a user.
The code in the arm64 series is using it to implement pci_domain_nr().
If we *do* export it, I'd like it to have a more conventional name, e.g., something starting with "pci_".
Understood. pci_find_host_bridge() ?
Best regards, Liviu
void pci_set_host_bridge_release(struct pci_host_bridge *bridge, void (*release_fn)(struct pci_host_bridge *), -- 1.9.0
If we *do* export it, I'd like it to have a more conventional name, e.g., something starting with "pci_".
Understood. pci_find_host_bridge() ?
pci_get_host_bridge() and take a reference, IMHO? If your primary bus is not PCI then your PCI "host" bridge could be hot-swappable, so the API at least ought to get that right.
Alan
Several platforms use a rather generic version of parsing the device tree to find the host bridge ranges. Move the common code into the generic PCI code and use it to create a pci_host_bridge structure that can be used by arch code.
Based on early attempts by Andrew Murray to unify the code. Used powerpc and microblaze PCI code as starting point.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Tested-by: Tanmay Inamdar tinamdar@apm.com --- drivers/pci/host-bridge.c | 158 ++++++++++++++++++++++++++++++++++++ include/linux/pci.h | 13 +++ 2 files changed, 171 insertions(+)
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c index 8708b652..7cda90b 100644 --- a/drivers/pci/host-bridge.c +++ b/drivers/pci/host-bridge.c @@ -6,9 +6,14 @@ #include <linux/init.h> #include <linux/pci.h> #include <linux/module.h> +#include <linux/of_address.h> +#include <linux/of_pci.h> +#include <linux/slab.h>
#include "pci.h"
+static atomic_t domain_nr = ATOMIC_INIT(-1); + static struct pci_bus *find_pci_root_bus(struct pci_bus *bus) { while (bus->parent) @@ -92,3 +97,156 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res, res->end = region->end + offset; } EXPORT_SYMBOL(pcibios_bus_to_resource); + +#ifdef CONFIG_OF +/** + * Simple version of the platform specific code for filtering the list + * of resources obtained from the ranges declaration in DT. + * + * Platforms can override this function in order to impose stronger + * constraints onto the list of resources that a host bridge can use. + * The filtered list will then be used to create a root bus and associate + * it with the host bridge. + * + */ +int __weak pcibios_fixup_bridge_ranges(struct list_head *resources) +{ + return 0; +} + +/** + * pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT + * @dev: device node of the host bridge having the ranges property + * @resources: list where the range of resources will be added after DT parsing + * @io_base: pointer to a variable that will contain the physical address for + * the start of the I/O range. + * + * It is the caller's job to free the @resources list if an error is returned. + * + * This function will parse the "ranges" property of a PCI host bridge device + * node and set up the resource mapping based on its content. It is expected + * that the property conforms to the Power ePAPR document. + * + * Each architecture is then offered the chance of applying its own + * filtering of pci_host_bridge_windows based on its own restrictions by + * calling pcibios_fixup_bridge_ranges(). The filtered list of windows + * can then be used when creating a pci_host_bridge structure.
+ */ +static int pci_host_bridge_of_get_ranges(struct device_node *dev, + struct list_head *resources, resource_size_t *io_base) +{ + struct resource *res; + struct of_pci_range range; + struct of_pci_range_parser parser; + int err; + + pr_info("PCI host bridge %s ranges:\n", dev->full_name); + + /* Check for ranges property */ + err = of_pci_range_parser_init(&parser, dev); + if (err) + return err; + + pr_debug("Parsing ranges property...\n"); + for_each_of_pci_range(&parser, &range) { + /* Read next ranges element */ + pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ", + range.pci_space, range.pci_addr); + pr_debug("cpu_addr:0x%016llx size:0x%016llx\n", + range.cpu_addr, range.size); + + /* + * If we failed translation or got a zero-sized region + * then skip this range + */ + if (range.cpu_addr == OF_BAD_ADDR || range.size == 0) + continue; + + res = kzalloc(sizeof(struct resource), GFP_KERNEL); + if (!res) + return -ENOMEM; + + err = of_pci_range_to_resource(&range, dev, res); + if (err) + return err; + + if (resource_type(res) == IORESOURCE_IO) + *io_base = range.cpu_addr; + + pci_add_resource_offset(resources, res, + res->start - range.pci_addr); + } + + /* Apply architecture specific fixups for the ranges */ + return pcibios_fixup_bridge_ranges(resources); +} + +/** + * of_create_pci_host_bridge - Create a PCI host bridge structure using + * information passed in the DT. + * @parent: device owning this host bridge + * @ops: pci_ops associated with the host controller + * @host_data: opaque data structure used by the host controller. + * + * returns a pointer to the newly created pci_host_bridge structure, or + * NULL if the call failed. + * + * This function will try to obtain the host bridge domain number by + * using of_alias_get_id() call with "pci-domain" as a stem. If that + * fails, a local allocator will be used that will put each host bridge + * in a new domain. 
+ */ +struct pci_host_bridge * +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, void *host_data) +{ + int err, domain, busno; + struct resource *bus_range; + struct pci_bus *root_bus; + struct pci_host_bridge *bridge; + resource_size_t io_base; + LIST_HEAD(res); + + bus_range = kzalloc(sizeof(*bus_range), GFP_KERNEL); + if (!bus_range) + return ERR_PTR(-ENOMEM); + + domain = of_alias_get_id(parent->of_node, "pci-domain"); + if (domain == -ENODEV) + domain = atomic_inc_return(&domain_nr); + + err = of_pci_parse_bus_range(parent->of_node, bus_range); + if (err) { + dev_info(parent, "No bus range for %s, using default [0-255]\n", + parent->of_node->full_name); + bus_range->start = 0; + bus_range->end = 255; + bus_range->flags = IORESOURCE_BUS; + } + busno = bus_range->start; + pci_add_resource(&res, bus_range); + + /* now parse the rest of host bridge bus ranges */ + err = pci_host_bridge_of_get_ranges(parent->of_node, &res, &io_base); + if (err) + goto err_create; + + /* then create the root bus */ + root_bus = pci_create_root_bus_in_domain(parent, domain, busno, + ops, host_data, &res); + if (IS_ERR(root_bus)) { + err = PTR_ERR(root_bus); + goto err_create; + } + + bridge = to_pci_host_bridge(root_bus->bridge); + bridge->io_base = io_base; + + return bridge; + +err_create: + pci_free_resource_list(&res); + return ERR_PTR(err); +} +EXPORT_SYMBOL_GPL(of_create_pci_host_bridge); + +#endif /* CONFIG_OF */ diff --git a/include/linux/pci.h b/include/linux/pci.h index 1eed009..40ddd3d 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -395,6 +395,7 @@ struct pci_host_bridge { struct device dev; struct pci_bus *bus; /* root bus */ int domain_nr; + resource_size_t io_base; /* physical address for the start of I/O area */ struct list_head windows; /* pci_host_bridge_windows */ void (*release_fn)(struct pci_host_bridge *); void *release_data; @@ -1786,11 +1787,23 @@ static inline struct device_node *pci_bus_to_OF_node(struct pci_bus *bus) 
return bus ? bus->dev.of_node : NULL; }
+struct pci_host_bridge * +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, + void *host_data); + +int pcibios_fixup_bridge_ranges(struct list_head *resources); #else /* CONFIG_OF */ static inline void pci_set_of_node(struct pci_dev *dev) { } static inline void pci_release_of_node(struct pci_dev *dev) { } static inline void pci_set_bus_of_node(struct pci_bus *bus) { } static inline void pci_release_bus_of_node(struct pci_bus *bus) { } + +static inline struct pci_host_bridge * +of_create_pci_host_bridge(struct device *parent, struct pci_ops *ops, + void *host_data) +{ + return NULL; +} #endif /* CONFIG_OF */
#ifdef CONFIG_EEH
Hi Liviu,
On 2014-3-14 23:34, Liviu Dudau wrote:
Several platforms use a rather generic version of parsing the device tree to find the host bridge ranges. Move the common code into the generic PCI code and use it to create a pci_host_bridge structure that can be used by arch code.
Based on early attempts by Andrew Murray to unify the code. Used powerpc and microblaze PCI code as starting point.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Tested-by: Tanmay Inamdar tinamdar@apm.com
drivers/pci/host-bridge.c | 158 ++++++++++++++++++++++++++++++++++++ include/linux/pci.h | 13 +++ 2 files changed, 171 insertions(+)
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c index 8708b652..7cda90b 100644 --- a/drivers/pci/host-bridge.c +++ b/drivers/pci/host-bridge.c @@ -6,9 +6,14 @@ #include <linux/init.h> #include <linux/pci.h> #include <linux/module.h> +#include <linux/of_address.h> +#include <linux/of_pci.h> +#include <linux/slab.h> #include "pci.h" +static atomic_t domain_nr = ATOMIC_INIT(-1);
domain_nr will only be used inside the #ifdef CONFIG_OF block, and this will lead to a compile warning complaining that 'domain_nr' is defined but not used when CONFIG_OF=n (for example on x86).
How about moving the definition to --->
static struct pci_bus *find_pci_root_bus(struct pci_bus *bus) { while (bus->parent) @@ -92,3 +97,156 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res, res->end = region->end + offset; } EXPORT_SYMBOL(pcibios_bus_to_resource);
+#ifdef CONFIG_OF
here?
Thanks Hanjun
Hi Hanjun,
On Tue, Apr 08, 2014 at 01:57:54PM +0100, Hanjun Guo wrote:
Hi Liviu,
On 2014-3-14 23:34, Liviu Dudau wrote:
Several platforms use a rather generic version of parsing the device tree to find the host bridge ranges. Move the common code into the generic PCI code and use it to create a pci_host_bridge structure that can be used by arch code.
Based on early attempts by Andrew Murray to unify the code. Used powerpc and microblaze PCI code as starting point.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Tested-by: Tanmay Inamdar tinamdar@apm.com
drivers/pci/host-bridge.c | 158 ++++++++++++++++++++++++++++++++++++ include/linux/pci.h | 13 +++ 2 files changed, 171 insertions(+)
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c index 8708b652..7cda90b 100644 --- a/drivers/pci/host-bridge.c +++ b/drivers/pci/host-bridge.c @@ -6,9 +6,14 @@ #include <linux/init.h> #include <linux/pci.h> #include <linux/module.h> +#include <linux/of_address.h> +#include <linux/of_pci.h> +#include <linux/slab.h> #include "pci.h" +static atomic_t domain_nr = ATOMIC_INIT(-1);
domain_nr will only be used inside the #ifdef CONFIG_OF block, and this will lead to a compile warning complaining that 'domain_nr' is defined but not used when CONFIG_OF=n (for example on x86).
How about moving the definition to --->
static struct pci_bus *find_pci_root_bus(struct pci_bus *bus) { while (bus->parent) @@ -92,3 +97,156 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res, res->end = region->end + offset; } EXPORT_SYMBOL(pcibios_bus_to_resource);
+#ifdef CONFIG_OF
here?
I've already moved the definition there in my refresh.
Thanks for reviewing the code, Liviu
Thanks Hanjun