Following the discussion started here [1], I now have a proposal for tackling generic support for host bridges described via device tree. It is an initial stab at it, to try to get feedback and suggestions, but it is functional enough that I have PCI Express for arm64 working on an FPGA using the patch that I am also publishing that adds support for PCI for that platform.
Looking at the existing architectures that fit the requirements (use of device tree and PCI) yields the powerpc and microblaze as generic enough to make them candidates for conversion. I have a tentative patch for microblaze that I can only compile test it, unfortunately using qemu-microblaze leads to an early crash in the kernel.
As Bjorn has mentioned in the previous discussion, the idea is to add to struct pci_host_bridge enough data to be able to reduce the size or remove the architecture specific pci_controller structure. arm64 support actually manages to get rid of all the architecture static data and has no pci_controller structure defined. For host bridge drivers that means a change of API unless architectures decide to provide a compatibility layer (comments here please).
In order to initialise a host bridge with the new API, the following example code is sufficient for a _probe() function:
static int myhostbridge_probe(struct platform_device *pdev) { int err; struct device_node *dev; struct pci_host_bridge *bridge; struct resource bus_range; struct myhostbridge_port *pp; LIST_HEAD(resources);
dev = pdev->dev.of_node;
if (!of_device_is_available(dev)) { pr_warn("%s: disabled\n", dev->full_name); return -ENODEV; }
pp = kzalloc(sizeof(struct myhostbridge_port), GFP_KERNEL); if (!pp) return -ENOMEM;
err = of_pci_parse_bus_range(dev, &bus_range); if (err) { bus_range.start = 0; bus_range.end = 255; bus_range.flags = IORESOURCE_BUS; } pci_add_resource(&resources, &bus_range);
bridge = pci_host_bridge_of_init(&pdev->dev, 0, &myhostbridge_ops, pp, &resources); if (!bridge) { err = -EINVAL; goto bridge_init_fail; }
err = myhostbridge_setup(bridge->bus); if (err) goto bridge_init_fail;
/* * Add flags here, this is just an example */ pci_add_flags(PCI_ENABLE_PROC_DOMAINS | PCI_COMPAT_DOMAIN_0); pci_add_flags(PCI_REASSIGN_ALL_BUS | PCI_REASSIGN_ALL_RSRC);
bus_range.end = pci_scan_child_bus(bridge->bus); pci_bus_update_busn_res_end(bridge->bus, bus_range.end);
pci_assign_unassigned_bus_resources(bridge->bus);
pci_bus_add_devices(bridge->bus);
return 0;
bridge_init_fail: kfree(pp); pci_free_resource_list(&resources); return err; }
Best regards, Liviu Dudau
[1] http://thread.gmane.org/gmane.linux.kernel.pci/25946
Liviu Dudau (1): pci: Add support for creating a generic host_bridge from device tree
drivers/pci/host-bridge.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++ drivers/pci/probe.c | 11 ++++++ include/linux/pci.h | 14 ++++++++ 3 files changed, 117 insertions(+)
Several platforms use a rather generic version of parsing the device tree to find the host bridge ranges. Move that into the generic PCI code and use it to create a pci_host_bridge structure that can be used by arch code.
Based on early attempts by Andrew Murray to unify the code. Used powerpc and microblaze PCI code as starting point.
Signed-off-by: Liviu Dudau Liviu.Dudau@arm.com Cc: Catalin Marinas Catalin.Marinas@arm.com Cc: Will Deacon Will.Deacon@arm.com --- drivers/pci/host-bridge.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++ drivers/pci/probe.c | 11 ++++++ include/linux/pci.h | 14 ++++++++ 3 files changed, 117 insertions(+)
diff --git a/drivers/pci/host-bridge.c b/drivers/pci/host-bridge.c index 06ace62..9d11deb 100644 --- a/drivers/pci/host-bridge.c +++ b/drivers/pci/host-bridge.c @@ -6,6 +6,7 @@ #include <linux/init.h> #include <linux/pci.h> #include <linux/module.h> +#include <linux/of_address.h>
#include "pci.h"
@@ -91,3 +92,94 @@ void pcibios_bus_to_resource(struct pci_bus *bus, struct resource *res, res->end = region->end + offset; } EXPORT_SYMBOL(pcibios_bus_to_resource); + +/** + * pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT + * @dev: device node of the host bridge having the range property + * @resources: list where the range of resources will be added after DT parsing + * + * This function will parse the "ranges" property of a PCI host bridge device + * node and setup the resource mapping based on its content. It is expected + * that the property conforms with the Power ePAPR document. + * + * Each architecture will then apply their filtering based on the limitations + * of each platform. One general restriction seems to be the number of IO space + * ranges, the PCI framework makes intensive use of struct resource management, + * and for IORESOURCE_IO types they can only be requested if they are contained + * within the global ioport_resource, so that should be limited to one IO space + * range. + */ +static int pci_host_bridge_of_get_ranges(struct device_node *dev, + struct list_head *resources) +{ + struct resource *res; + struct of_pci_range range; + struct of_pci_range_parser parser; + int err; + + pr_info("PCI host bridge %s ranges:\n", dev->full_name); + + /* Check for ranges property */ + err = of_pci_range_parser_init(&parser, dev); + if (err) + return err; + + pr_debug("Parsing ranges property...\n"); + for_each_of_pci_range(&parser, &range) { + /* Read next ranges element */ + pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ", + range.pci_space, range.pci_addr); + pr_debug("cpu_addr:0x%016llx size:0x%016llx\n", + range.cpu_addr, range.size); + + /* If we failed translation or got a zero-sized region + * (some FW try to feed us with non sensical zero sized regions + * such as power3 which look like some kind of attempt + * at exposing the VGA memory hole) then skip this range + */ + if (range.cpu_addr == OF_BAD_ADDR || range.size == 0) + continue; + + res = kzalloc(sizeof(struct resource), GFP_KERNEL); + if (!res) { + err = -ENOMEM; + goto bridge_ranges_nomem; + } + + of_pci_range_to_resource(&range, dev, res); + + pci_add_resource_offset(resources, res, + range.cpu_addr - range.pci_addr); + } + + /* Apply architecture specific fixups for the ranges */ + pcibios_fixup_bridge_ranges(resources); + + return 0; + +bridge_ranges_nomem: + pci_free_resource_list(resources); + return err; +} + +struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops, + void *host_data, struct list_head *resources) +{ + struct pci_bus *root_bus; + struct pci_host_bridge *bridge; + + /* first parse the host bridge bus ranges */ + if (pci_host_bridge_of_get_ranges(parent->of_node, resources)) + return NULL; + + /* then create the root bus */ + root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources); + if (!root_bus) + return NULL; + + bridge = to_pci_host_bridge(root_bus->bridge); + + return bridge; +} +EXPORT_SYMBOL(pci_host_bridge_of_init); diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res; + /* + * IO resources are stored in the kernel with a CPU start + * address of zero. Adjust the data accordingly and remember + * the offset + */ + if (resource_type(res) == IORESOURCE_IO) { + bridge->io_offset = res->start; + res->end -= res->start; + window->offset -= res->start; + res->start = 0; + } offset = window->offset; if (res->flags & IORESOURCE_BUS) pci_bus_insert_busn_res(b, bus, res->end); diff --git a/include/linux/pci.h b/include/linux/pci.h index fb57c89..8953997 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -394,6 +394,8 @@ struct pci_host_bridge_window { struct pci_host_bridge { struct device dev; struct pci_bus *bus; /* root bus */ + resource_size_t io_offset; /* CPU address offset for io resources */ + int domain_nr; struct list_head windows; /* pci_host_bridge_windows */ void (*release_fn)(struct pci_host_bridge *); void *release_data; @@ -1762,11 +1764,23 @@ static inline struct device_node *pci_bus_to_OF_node(struct pci_bus *bus) return bus ? bus->dev.of_node : NULL; }
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops, + void *host_data, struct list_head *resources); + +void pcibios_fixup_bridge_ranges(struct list_head *resources); #else /* CONFIG_OF */ static inline void pci_set_of_node(struct pci_dev *dev) { } static inline void pci_release_of_node(struct pci_dev *dev) { } static inline void pci_set_bus_of_node(struct pci_bus *bus) { } static inline void pci_release_bus_of_node(struct pci_bus *bus) { } + +static inline struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, struct pci_ops *ops, + void *host_data, struct list_head *resources) +{ + return NULL; +} #endif /* CONFIG_OF */
#ifdef CONFIG_EEH
On Monday 03 February 2014 18:33:48 Liviu Dudau wrote:
+/**
- pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
- @dev: device node of the host bridge having the range property
- @resources: list where the range of resources will be added after DT parsing
- This function will parse the "ranges" property of a PCI host bridge device
- node and setup the resource mapping based on its content. It is expected
- that the property conforms with the Power ePAPR document.
- Each architecture will then apply their filtering based on the limitations
- of each platform. One general restriction seems to be the number of IO space
- ranges, the PCI framework makes intensive use of struct resource management,
- and for IORESOURCE_IO types they can only be requested if they are contained
- within the global ioport_resource, so that should be limited to one IO space
- range.
Actually we have quite a different set of restrictions around I/O space on ARM32 at the moment: Each host bridge can have its own 64KB range in an arbitrary location on MMIO space, and the total must not exceed 2MB of I/O space.
- */
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
struct list_head *resources)
+{
- struct resource *res;
- struct of_pci_range range;
- struct of_pci_range_parser parser;
- int err;
- pr_info("PCI host bridge %s ranges:\n", dev->full_name);
- /* Check for ranges property */
- err = of_pci_range_parser_init(&parser, dev);
- if (err)
return err;
- pr_debug("Parsing ranges property...\n");
- for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
range.cpu_addr, range.size);
/* If we failed translation or got a zero-sized region
* (some FW try to feed us with non sensical zero sized regions
* such as power3 which look like some kind of attempt
* at exposing the VGA memory hole) then skip this range
*/
if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
res = kzalloc(sizeof(struct resource), GFP_KERNEL);
if (!res) {
err = -ENOMEM;
goto bridge_ranges_nomem;
}
of_pci_range_to_resource(&range, dev, res);
pci_add_resource_offset(resources, res,
range.cpu_addr - range.pci_addr);
- }
I believe of_pci_range_to_resource() will return the MMIO aperture for the I/O space window here, which is not what you are supposed to pass into pci_add_resource_offset.
+EXPORT_SYMBOL(pci_host_bridge_of_init);
EXPORT_SYMBOL_GPL
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
offset = window->offset; if (res->flags & IORESOURCE_BUS)}
Won't this break all existing host bridges?
Arnd
Hi Arnd,
First of all, thanks for reviewing this!
On Mon, Feb 03, 2014 at 06:46:10PM +0000, Arnd Bergmann wrote:
On Monday 03 February 2014 18:33:48 Liviu Dudau wrote:
+/**
- pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
- @dev: device node of the host bridge having the range property
- @resources: list where the range of resources will be added after DT parsing
- This function will parse the "ranges" property of a PCI host bridge device
- node and setup the resource mapping based on its content. It is expected
- that the property conforms with the Power ePAPR document.
- Each architecture will then apply their filtering based on the limitations
- of each platform. One general restriction seems to be the number of IO space
- ranges, the PCI framework makes intensive use of struct resource management,
- and for IORESOURCE_IO types they can only be requested if they are contained
- within the global ioport_resource, so that should be limited to one IO space
- range.
Actually we have quite a different set of restrictions around I/O space on ARM32 at the moment: Each host bridge can have its own 64KB range in an arbitrary location on MMIO space, and the total must not exceed 2MB of I/O space.
And that is why the filtering is not (yet) imposed in the generic code. But once you use pci_request_region, that will call request_region which will check against ioport_resource as parent for the requested resource. That should fail if is is not in the correct range, so I don't know how arm arch code manages multiple IO ranges.
- */
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
struct list_head *resources)
+{
- struct resource *res;
- struct of_pci_range range;
- struct of_pci_range_parser parser;
- int err;
- pr_info("PCI host bridge %s ranges:\n", dev->full_name);
- /* Check for ranges property */
- err = of_pci_range_parser_init(&parser, dev);
- if (err)
return err;
- pr_debug("Parsing ranges property...\n");
- for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
range.cpu_addr, range.size);
/* If we failed translation or got a zero-sized region
* (some FW try to feed us with non sensical zero sized regions
* such as power3 which look like some kind of attempt
* at exposing the VGA memory hole) then skip this range
*/
if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
res = kzalloc(sizeof(struct resource), GFP_KERNEL);
if (!res) {
err = -ENOMEM;
goto bridge_ranges_nomem;
}
of_pci_range_to_resource(&range, dev, res);
pci_add_resource_offset(resources, res,
range.cpu_addr - range.pci_addr);
- }
I believe of_pci_range_to_resource() will return the MMIO aperture for the I/O space window here, which is not what you are supposed to pass into pci_add_resource_offset.
And that is why the code in probe.c has been added to deal with that. It is too early to do the adjustments here as all we have is the list of resources and that might get culled by the architecture fixup code. Remembering the io_offset will happen once the pci_host_bridge gets created, and the resources are then adjusted.
+EXPORT_SYMBOL(pci_host_bridge_of_init);
EXPORT_SYMBOL_GPL
Will change for v2, thanks!
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
offset = window->offset; if (res->flags & IORESOURCE_BUS)}
Won't this break all existing host bridges?
I am not sure. I believe not, due to what I've explained earlier, but you might be right.
The adjustment happens before the resource is added to the host bridge windows and translates it from MMIO range into IO range.
Best regards, Liviu
Arnd
To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Monday 03 February 2014 19:06:49 Liviu Dudau wrote:
On Mon, Feb 03, 2014 at 06:46:10PM +0000, Arnd Bergmann wrote:
On Monday 03 February 2014 18:33:48 Liviu Dudau wrote:
+/**
- pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
- @dev: device node of the host bridge having the range property
- @resources: list where the range of resources will be added after DT parsing
- This function will parse the "ranges" property of a PCI host bridge device
- node and setup the resource mapping based on its content. It is expected
- that the property conforms with the Power ePAPR document.
- Each architecture will then apply their filtering based on the limitations
- of each platform. One general restriction seems to be the number of IO space
- ranges, the PCI framework makes intensive use of struct resource management,
- and for IORESOURCE_IO types they can only be requested if they are contained
- within the global ioport_resource, so that should be limited to one IO space
- range.
Actually we have quite a different set of restrictions around I/O space on ARM32 at the moment: Each host bridge can have its own 64KB range in an arbitrary location on MMIO space, and the total must not exceed 2MB of I/O space.
And that is why the filtering is not (yet) imposed in the generic code. But once you use pci_request_region, that will call request_region which will check against ioport_resource as parent for the requested resource. That should fail if is is not in the correct range, so I don't know how arm arch code manages multiple IO ranges.
Let's try to come up with nomenclature so we can talk about this better
The ioport_resource is in "logical I/O space", which is a Linux fiction, it goes from 0 to IO_SPACE_LIMIT (2MB on ARM) and is mapped into "virtual I/O space", which start at (void __iomem *)PCI_IO_VIRT_BASE.
Each PCI domain can have its own "bus I/O aperture", which is typically between 0x1000 and 0xffff and reflects the address that is used in PCI transactions and in BARs. The aperture here reflects the subset of the 4GB bus I/O space that is actually mapped into a CPU visible "physical I/O aperture" using an inbound mapping of the host bridge. The physical I/O aperture in turn gets mapped to the virtual I/O space using pci_ioremap_io. The difference between a bus I/O address and a logical I/O address is stored in the io_offset.
So much for basic definitions. When a device driver calls pci_request_region, the port number it sees is the bus I/O port number adjusted using the io_offset to turn it into a logical I/O port number, which should always be within the host bridge window, which in turn is a subset of the ioport_resource.
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
struct list_head *resources)
+{
- struct resource *res;
- struct of_pci_range range;
- struct of_pci_range_parser parser;
- int err;
- pr_info("PCI host bridge %s ranges:\n", dev->full_name);
- /* Check for ranges property */
- err = of_pci_range_parser_init(&parser, dev);
- if (err)
return err;
- pr_debug("Parsing ranges property...\n");
- for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
range.cpu_addr, range.size);
/* If we failed translation or got a zero-sized region
* (some FW try to feed us with non sensical zero sized regions
* such as power3 which look like some kind of attempt
* at exposing the VGA memory hole) then skip this range
*/
if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
res = kzalloc(sizeof(struct resource), GFP_KERNEL);
if (!res) {
err = -ENOMEM;
goto bridge_ranges_nomem;
}
of_pci_range_to_resource(&range, dev, res);
pci_add_resource_offset(resources, res,
range.cpu_addr - range.pci_addr);
- }
I believe of_pci_range_to_resource() will return the MMIO aperture for the I/O space window here, which is not what you are supposed to pass into pci_add_resource_offset.
And that is why the code in probe.c has been added to deal with that. It is too early to do the adjustments here as all we have is the list of resources and that might get culled by the architecture fixup code. Remembering the io_offset will happen once the pci_host_bridge gets created, and the resources are then adjusted.
So you want to register an incorrect I/O resource first and then have it fixed up later, rather than registering the correct one from the start as everyone else?
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
} offset = window->offset; if (res->flags & IORESOURCE_BUS)
Won't this break all existing host bridges?
I am not sure. I believe not, due to what I've explained earlier, but you might be right.
The adjustment happens before the resource is added to the host bridge windows and translates it from MMIO range into IO range.
AFAICT, the resource_type of the resource you register above should be IORESOURCE_MEM, so you are not actually matching it here.
Arnd
On Mon, Feb 03, 2014 at 07:31:31PM +0000, Arnd Bergmann wrote:
On Monday 03 February 2014 19:06:49 Liviu Dudau wrote:
On Mon, Feb 03, 2014 at 06:46:10PM +0000, Arnd Bergmann wrote:
On Monday 03 February 2014 18:33:48 Liviu Dudau wrote:
+/**
- pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
- @dev: device node of the host bridge having the range property
- @resources: list where the range of resources will be added after DT parsing
- This function will parse the "ranges" property of a PCI host bridge device
- node and setup the resource mapping based on its content. It is expected
- that the property conforms with the Power ePAPR document.
- Each architecture will then apply their filtering based on the limitations
- of each platform. One general restriction seems to be the number of IO space
- ranges, the PCI framework makes intensive use of struct resource management,
- and for IORESOURCE_IO types they can only be requested if they are contained
- within the global ioport_resource, so that should be limited to one IO space
- range.
Actually we have quite a different set of restrictions around I/O space on ARM32 at the moment: Each host bridge can have its own 64KB range in an arbitrary location on MMIO space, and the total must not exceed 2MB of I/O space.
And that is why the filtering is not (yet) imposed in the generic code. But once you use pci_request_region, that will call request_region which will check against ioport_resource as parent for the requested resource. That should fail if is is not in the correct range, so I don't know how arm arch code manages multiple IO ranges.
Let's try to come up with nomenclature so we can talk about this better
The ioport_resource is in "logical I/O space", which is a Linux fiction, it goes from 0 to IO_SPACE_LIMIT (2MB on ARM) and is mapped into "virtual I/O space", which start at (void __iomem *)PCI_IO_VIRT_BASE.
Each PCI domain can have its own "bus I/O aperture", which is typically between 0x1000 and 0xffff and reflects the address that is used in PCI transactions and in BARs.
Actually, the bus I/O aperture can start from 0x0000 if you are talking about PCI bus addresses.
The aperture here reflects the subset of the 4GB bus I/O space that is actually mapped into a CPU visible "physical I/O aperture" using an inbound mapping of the host bridge. The physical I/O aperture in turn gets mapped to the virtual I/O space using pci_ioremap_io.
Agree.
The difference between a bus I/O address and a logical I/O address is stored in the io_offset.
Not exactly. If that would be true that means that for an I/O range that start at bus I/O address zero but physical I/O apperture starts at 0x40000000 the io_offset is zero. For me, the io_offset should be 0x40000000.
Let me see if I can summarise this correctly, using only CPU addresses:
0x0000 - IO_SPACE_LIMIT <- logical I/O address 0xPPPPPPPP - 0xPPPPPPPP+IO_SIZE <- physical address for PCI I/O space 0xVVVVVVVV - 0xVVVVVVVV+IO_SPACE_LIMIT <- virtual address for I/O
The io_offset then is 0xPPPPPPPP - logical I/O address. At least that is the intent of the io_offset variable that I introduced in pci_host_bridge.
The bus I/O address is generated by the host bridge, I think we can ignore it here as it tends to confuse the message.
So much for basic definitions. When a device driver calls pci_request_region, the port number it sees is the bus I/O port number adjusted using the io_offset to turn it into a logical I/O port number, which should always be within the host bridge window, which in turn is a subset of the ioport_resource.
My understanding is that device drivers all user port numbers that are logical I/O numbers, so no io_offset needs to be applied here. It is only when one wants to access the port, that the translation happens. First, inb or outb will add the PCI_IO_VIRT_BASE to generate the virtual address, the MMU will then convert that address to physical address and the host bridge will then translate the physical address into bus address.
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
struct list_head *resources)
+{
- struct resource *res;
- struct of_pci_range range;
- struct of_pci_range_parser parser;
- int err;
- pr_info("PCI host bridge %s ranges:\n", dev->full_name);
- /* Check for ranges property */
- err = of_pci_range_parser_init(&parser, dev);
- if (err)
return err;
- pr_debug("Parsing ranges property...\n");
- for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
range.cpu_addr, range.size);
/* If we failed translation or got a zero-sized region
* (some FW try to feed us with non sensical zero sized regions
* such as power3 which look like some kind of attempt
* at exposing the VGA memory hole) then skip this range
*/
if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
res = kzalloc(sizeof(struct resource), GFP_KERNEL);
if (!res) {
err = -ENOMEM;
goto bridge_ranges_nomem;
}
of_pci_range_to_resource(&range, dev, res);
pci_add_resource_offset(resources, res,
range.cpu_addr - range.pci_addr);
- }
I believe of_pci_range_to_resource() will return the MMIO aperture for the I/O space window here, which is not what you are supposed to pass into pci_add_resource_offset.
And that is why the code in probe.c has been added to deal with that. It is too early to do the adjustments here as all we have is the list of resources and that might get culled by the architecture fixup code. Remembering the io_offset will happen once the pci_host_bridge gets created, and the resources are then adjusted.
So you want to register an incorrect I/O resource first and then have it fixed up later, rather than registering the correct one from the start as everyone else?
The incorrect I/O resource is added to a temporary list of resources, it has not been attached yet to the list of windows in the bridge. What gets added is the I/O resource as described if it would be an ordinary resource.
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
}
Here, we correct for the fact that IORESOURCE_IO is not a normal resource, because Linux wants a logical I/O as start and end address, not the physical CPU address. We adjust to that and remember the offset.
offset = window->offset; if (res->flags & IORESOURCE_BUS)
Won't this break all existing host bridges?
I am not sure. I believe not, due to what I've explained earlier, but you might be right.
The adjustment happens before the resource is added to the host bridge windows and translates it from MMIO range into IO range.
AFAICT, the resource_type of the resource you register above should be IORESOURCE_MEM, so you are not actually matching it here.
No, all resources are added here. For IORESOURCE_IO we do an adjustment.
Best regards, Liviu
Arnd
To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Monday 03 February 2014 22:17:44 Liviu Dudau wrote:
On Mon, Feb 03, 2014 at 07:31:31PM +0000, Arnd Bergmann wrote:
Let's try to come up with nomenclature so we can talk about this better
The ioport_resource is in "logical I/O space", which is a Linux fiction, it goes from 0 to IO_SPACE_LIMIT (2MB on ARM) and is mapped into "virtual I/O space", which start at (void __iomem *)PCI_IO_VIRT_BASE.
Each PCI domain can have its own "bus I/O aperture", which is typically between 0x1000 and 0xffff and reflects the address that is used in PCI transactions and in BARs.
Actually, the bus I/O aperture can start from 0x0000 if you are talking about PCI bus addresses.
Right.
The aperture here reflects the subset of the 4GB bus I/O space that is actually mapped into a CPU visible "physical I/O aperture" using an inbound mapping of the host bridge. The physical I/O aperture in turn gets mapped to the virtual I/O space using pci_ioremap_io.
Agree.
The difference between a bus I/O address and a logical I/O address is stored in the io_offset.
Not exactly. If that would be true that means that for an I/O range that start at bus I/O address zero but physical I/O apperture starts at 0x40000000 the io_offset is zero. For me, the io_offset should be 0x40000000.
That's not how we do it on any of the existing host controllers. Typically the io_offset is zero for the first one, and may be either zero for all the others (meaning BARs get > 64KB values for secondary buses) or between 64KB and 2MB (meaning each bus starts at I/O port number 0).
Let me see if I can summarise this correctly, using only CPU addresses:
0x0000 - IO_SPACE_LIMIT <- logical I/O address 0xPPPPPPPP - 0xPPPPPPPP+IO_SIZE <- physical address for PCI I/O space 0xVVVVVVVV - 0xVVVVVVVV+IO_SPACE_LIMIT <- virtual address for I/O
The io_offset then is 0xPPPPPPPP - logical I/O address. At least that is the intent of the io_offset variable that I introduced in pci_host_bridge.
That is highly confusing then, because we already have something called io_offset with a different meaning. I would call 0xPPPPPPPP the io_phys_base if I had to come up with a variable name for it.
The bus I/O address is generated by the host bridge, I think we can ignore it here as it tends to confuse the message.
No, it's important because the PCI core code has to transform between bus I/O address and logical I/O address when accessing the BARs.
So much for basic definitions. When a device driver calls pci_request_region, the port number it sees is the bus I/O port number adjusted using the io_offset to turn it into a logical I/O port number, which should always be within the host bridge window, which in turn is a subset of the ioport_resource.
My understanding is that device drivers all user port numbers that are logical I/O numbers, so no io_offset needs to be applied here. It is only when one wants to access the port, that the translation happens. First, inb or outb will add the PCI_IO_VIRT_BASE to generate the virtual address, the MMU will then convert that address to physical address and the host bridge will then translate the physical address into bus address.
This is correct. The bus I/O number is not visible to the device driver, but it is what you put into the 'ranges' property in DT, and it gets used during PCI resource scanning.
And that is why the code in probe.c has been added to deal with that. It is too early to do the adjustments here as all we have is the list of resources and that might get culled by the architecture fixup code. Remembering the io_offset will happen once the pci_host_bridge gets created, and the resources are then adjusted.
So you want to register an incorrect I/O resource first and then have it fixed up later, rather than registering the correct one from the start as everyone else?
The incorrect I/O resource is added to a temporary list of resources, it has not been attached yet to the list of windows in the bridge. What gets added is the I/O resource as described if it would be an ordinary resource.
I'm not completely sure I'm following here, but let's work out the other things first, this will probably get clearer then.
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
}
Here, we correct for the fact that IORESOURCE_IO is not a normal resource, because Linux wants a logical I/O as start and end address, not the physical CPU address. We adjust to that and remember the offset.
But the offset (phys_base) doesn't actually matter to the PCI core or the driver. Why save it?
offset = window->offset; if (res->flags & IORESOURCE_BUS)
Won't this break all existing host bridges?
I am not sure. I believe not, due to what I've explained earlier, but you might be right.
The adjustment happens before the resource is added to the host bridge windows and translates it from MMIO range into IO range.
AFAICT, the resource_type of the resource you register above should be IORESOURCE_MEM, so you are not actually matching it here.
No, all resources are added here. For IORESOURCE_IO we do an adjustment.
But there should never be an IORESOURCE_IO resource structure that is not in IO space, i.e. within ioport_resource. Doing an "adjustment" is not an operation defined on this structure. What I meant above is that the pci range parser gets this right and gives you a resource that looks like { .flags = IORESOURCE_MEM, .start = phys_base, .end = phys_base + size - 1}, while the resource we want to register is { .flags = IORESOURCE_IO, .start = log_base, .end = log_base + size -1}. In the of_pci_range struct for the I/O space, the "pci_space" is IORESOURCE_IO (for the pci_addr), while the "flags" are IORESOURCE_MEM, to go along with the cpu_addr.
Arnd
On Tue, Feb 04, 2014 at 10:09:44AM +0000, Arnd Bergmann wrote:
On Monday 03 February 2014 22:17:44 Liviu Dudau wrote:
On Mon, Feb 03, 2014 at 07:31:31PM +0000, Arnd Bergmann wrote:
Let's try to come up with nomenclature so we can talk about this better
The ioport_resource is in "logical I/O space", which is a Linux fiction, it goes from 0 to IO_SPACE_LIMIT (2MB on ARM) and is mapped into "virtual I/O space", which start at (void __iomem *)PCI_IO_VIRT_BASE.
Each PCI domain can have its own "bus I/O aperture", which is typically between 0x1000 and 0xffff and reflects the address that is used in PCI transactions and in BARs.
Actually, the bus I/O aperture can start from 0x0000 if you are talking about PCI bus addresses.
Right.
The aperture here reflects the subset of the 4GB bus I/O space that is actually mapped into a CPU visible "physical I/O aperture" using an inbound mapping of the host bridge. The physical I/O aperture in turn gets mapped to the virtual I/O space using pci_ioremap_io.
Agree.
The difference between a bus I/O address and a logical I/O address is stored in the io_offset.
Not exactly. If that would be true that means that for an I/O range that start at bus I/O address zero but physical I/O apperture starts at 0x40000000 the io_offset is zero. For me, the io_offset should be 0x40000000.
That's not how we do it on any of the existing host controllers. Typically the io_offset is zero for the first one, and may be either zero for all the others (meaning BARs get > 64KB values for secondary buses) or between 64KB and 2MB (meaning each bus starts at I/O port number 0).
In that case it is probably worth to rename my variable into phys_io_offset.
I need to go back over my driver code. My assumptions were probably wrong wrt to meaning of the io_offset.
Let me see if I can summarise this correctly, using only CPU addresses:
0x0000 - IO_SPACE_LIMIT <- logical I/O address 0xPPPPPPPP - 0xPPPPPPPP+IO_SIZE <- physical address for PCI I/O space 0xVVVVVVVV - 0xVVVVVVVV+IO_SPACE_LIMIT <- virtual address for I/O
The io_offset then is 0xPPPPPPPP - logical I/O address. At least that is the intent of the io_offset variable that I introduced in pci_host_bridge.
That is highly confusing then, because we already have something called io_offset with a different meaning. I would call 0xPPPPPPPP the io_phys_base if I had to come up with a variable name for it.
The bus I/O address is generated by the host bridge, I think we can ignore it here as it tends to confuse the message.
No, it's important because the PCI core code has to transform between bus I/O address and logical I/O address when accessing the BARs.
So much for basic definitions. When a device driver calls pci_request_region, the port number it sees is the bus I/O port number adjusted using the io_offset to turn it into a logical I/O port number, which should always be within the host bridge window, which in turn is a subset of the ioport_resource.
My understanding is that device drivers all user port numbers that are logical I/O numbers, so no io_offset needs to be applied here. It is only when one wants to access the port, that the translation happens. First, inb or outb will add the PCI_IO_VIRT_BASE to generate the virtual address, the MMU will then convert that address to physical address and the host bridge will then translate the physical address into bus address.
This is correct. The bus I/O number is not visible to the device driver, but it is what you put into the 'ranges' property in DT, and it gets used during PCI resource scanning.
And that is why the code in probe.c has been added to deal with that. It is too early to do the adjustments here as all we have is the list of resources and that might get culled by the architecture fixup code. Remembering the io_offset will happen once the pci_host_bridge gets created, and the resources are then adjusted.
So you want to register an incorrect I/O resource first and then have it fixed up later, rather than registering the correct one from the start as everyone else?
The incorrect I/O resource is added to a temporary list of resources, it has not been attached yet to the list of windows in the bridge. What gets added is the I/O resource as described if it would be an ordinary resource.
I'm not completely sure I'm following here, but let's work out the other things first, this will probably get clearer then.
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
}
Here, we correct for the fact that IORESOURCE_IO is not a normal resource, because Linux wants a logical I/O as start and end address, not the physical CPU address. We adjust to that and remember the offset.
But the offset (phys_base) doesn't actually matter to the PCI core or the driver. Why save it?
Because I need it later for the host bridge ATR setup.
offset = window->offset; if (res->flags & IORESOURCE_BUS)
Won't this break all existing host bridges?
I am not sure. I believe not, due to what I've explained earlier, but you might be right.
The adjustment happens before the resource is added to the host bridge windows and translates it from MMIO range into IO range.
AFAICT, the resource_type of the resource you register above should be IORESOURCE_MEM, so you are not actually matching it here.
No, all resources are added here. For IORESOURCE_IO we do an adjustment.
But there should never be an IORESOURCE_IO resource structure that is not in IO space, i.e. within ioport_resource. Doing an "adjustment" is not an operation defined on this structure. What I meant above is that the pci range parser gets this right and gives you a resource that looks like { .flags = IORESOURCE_MEM, .start = phys_base, .end = phys_base + size - 1}, while the resource we want to register is { .flags = IORESOURCE_IO, .start = log_base, .end = log_base + size -1}. In the of_pci_range struct for the I/O space, the "pci_space" is IORESOURCE_IO (for the pci_addr), while the "flags" are IORESOURCE_MEM, to go along with the cpu_addr.
The pci range parser gives me a range with .flags = IORESOURCE_IO for IO space. It does not convert it to IORESOURCE_MEM. Hence the need for adjustment.
Best regards, Liviu
Arnd
To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 04 February 2014, Liviu Dudau wrote:
On Tue, Feb 04, 2014 at 10:09:44AM +0000, Arnd Bergmann wrote:
On Monday 03 February 2014 22:17:44 Liviu Dudau wrote:
On Mon, Feb 03, 2014 at 07:31:31PM +0000, Arnd Bergmann wrote:
The aperture here reflects the subset of the 4GB bus I/O space that is actually mapped into a CPU visible "physical I/O aperture" using an inbound mapping of the host bridge. The physical I/O aperture in turn gets mapped to the virtual I/O space using pci_ioremap_io.
Agree.
The difference between a bus I/O address and a logical I/O address is stored in the io_offset.
Not exactly. If that would be true that means that for an I/O range that start at bus I/O address zero but physical I/O apperture starts at 0x40000000 the io_offset is zero. For me, the io_offset should be 0x40000000.
That's not how we do it on any of the existing host controllers. Typically the io_offset is zero for the first one, and may be either zero for all the others (meaning BARs get > 64KB values for secondary buses) or between 64KB and 2MB (meaning each bus starts at I/O port number 0).
In that case it is probably worth to rename my variable into phys_io_offset.
I need to go back over my driver code. My assumptions were probably wrong wrt to meaning of the io_offset.
Ok. I'd still call it 'base' rather than 'offset', although the meaning isn't all that different.
But there should never be an IORESOURCE_IO resource structure that is not in IO space, i.e. within ioport_resource. Doing an "adjustment" is not an operation defined on this structure. What I meant above is that the pci range parser gets this right and gives you a resource that looks like { .flags = IORESOURCE_MEM, .start = phys_base, .end = phys_base + size - 1}, while the resource we want to register is { .flags = IORESOURCE_IO, .start = log_base, .end = log_base + size -1}. In the of_pci_range struct for the I/O space, the "pci_space" is IORESOURCE_IO (for the pci_addr), while the "flags" are IORESOURCE_MEM, to go along with the cpu_addr.
The pci range parser gives me a range with .flags = IORESOURCE_IO for IO space. It does not convert it to IORESOURCE_MEM. Hence the need for adjustment.
Ah, I see that now in the code too. This seems to be a bug in the range parser though: range->flags should not be initialized to of_bus_pci_get_flags(parser->range).
Arnd
Hello Liviu,
I did not get the first email of this particular patch on any of subscribed mailing lists (don't know why), hence replying here.
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops, + void *host_data, struct list_head *resources) +{ + struct pci_bus *root_bus; + struct pci_host_bridge *bridge; + + /* first parse the host bridge bus ranges */ + if (pci_host_bridge_of_get_ranges(parent->of_node, resources)) + return NULL; + + /* then create the root bus */ + root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources); + if (!root_bus) + return NULL; + + bridge = to_pci_host_bridge(root_bus->bridge); + + return bridge; +}
You are keeping the domain_nr inside pci_host_bridge structure. In above API, domain_nr is required in 'pci_find_bus' function called from 'pci_create_root_bus'. Since the bridge is allocated after creating root bus, 'pci_find_bus' always gets domain_nr as 0. This will cause problem for scanning multiple domains.
On Mon, Feb 3, 2014 at 10:46 AM, Arnd Bergmann arnd@arndb.de wrote:
On Monday 03 February 2014 18:33:48 Liviu Dudau wrote:
+/**
- pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
- @dev: device node of the host bridge having the range property
- @resources: list where the range of resources will be added after DT parsing
- This function will parse the "ranges" property of a PCI host bridge device
- node and setup the resource mapping based on its content. It is expected
- that the property conforms with the Power ePAPR document.
- Each architecture will then apply their filtering based on the limitations
- of each platform. One general restriction seems to be the number of IO space
- ranges, the PCI framework makes intensive use of struct resource management,
- and for IORESOURCE_IO types they can only be requested if they are contained
- within the global ioport_resource, so that should be limited to one IO space
- range.
Actually we have quite a different set of restrictions around I/O space on ARM32 at the moment: Each host bridge can have its own 64KB range in an arbitrary location on MMIO space, and the total must not exceed 2MB of I/O space.
- */
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
struct list_head *resources)
+{
struct resource *res;
struct of_pci_range range;
struct of_pci_range_parser parser;
int err;
pr_info("PCI host bridge %s ranges:\n", dev->full_name);
/* Check for ranges property */
err = of_pci_range_parser_init(&parser, dev);
if (err)
return err;
pr_debug("Parsing ranges property...\n");
for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
range.cpu_addr, range.size);
/* If we failed translation or got a zero-sized region
* (some FW try to feed us with non sensical zero sized regions
* such as power3 which look like some kind of attempt
* at exposing the VGA memory hole) then skip this range
*/
if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
res = kzalloc(sizeof(struct resource), GFP_KERNEL);
if (!res) {
err = -ENOMEM;
goto bridge_ranges_nomem;
}
of_pci_range_to_resource(&range, dev, res);
pci_add_resource_offset(resources, res,
range.cpu_addr - range.pci_addr);
}
I believe of_pci_range_to_resource() will return the MMIO aperture for the I/O space window here, which is not what you are supposed to pass into pci_add_resource_offset.
+EXPORT_SYMBOL(pci_host_bridge_of_init);
EXPORT_SYMBOL_GPL
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
} offset = window->offset; if (res->flags & IORESOURCE_BUS)
Won't this break all existing host bridges?
Arnd
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Wed, Feb 05, 2014 at 10:26:27PM +0000, Tanmay Inamdar wrote:
Hello Liviu,
I did not get the first email of this particular patch on any of subscribed mailing lists (don't know why), hence replying here.
Strange, it shows in the MARC and GMANE archive for linux-pci, probably a hickup on your receiving side?
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops,
- void *host_data, struct list_head *resources)
+{
- struct pci_bus *root_bus;
- struct pci_host_bridge *bridge;
- /* first parse the host bridge bus ranges */
- if (pci_host_bridge_of_get_ranges(parent->of_node, resources))
- return NULL;
- /* then create the root bus */
- root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources);
- if (!root_bus)
- return NULL;
- bridge = to_pci_host_bridge(root_bus->bridge);
- return bridge;
+}
You are keeping the domain_nr inside pci_host_bridge structure. In above API, domain_nr is required in 'pci_find_bus' function called from 'pci_create_root_bus'. Since the bridge is allocated after creating root bus, 'pci_find_bus' always gets domain_nr as 0. This will cause problem for scanning multiple domains.
Good catch. I was switching between creating a pci_controller in arch/arm64 and adding the needed bits in pci_host_bridge. After internal review I've decided to add the domain_nr to pci_host_bridge, but forgot to update the code everywhere.
Thanks for reviewing this, will fix in v2.
Do you find porting to the new API straight forward?
Best regards, Liviu
On Mon, Feb 3, 2014 at 10:46 AM, Arnd Bergmann arnd@arndb.de wrote:
On Monday 03 February 2014 18:33:48 Liviu Dudau wrote:
+/**
- pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
- @dev: device node of the host bridge having the range property
- @resources: list where the range of resources will be added after DT parsing
- This function will parse the "ranges" property of a PCI host bridge device
- node and setup the resource mapping based on its content. It is expected
- that the property conforms with the Power ePAPR document.
- Each architecture will then apply their filtering based on the limitations
- of each platform. One general restriction seems to be the number of IO space
- ranges, the PCI framework makes intensive use of struct resource management,
- and for IORESOURCE_IO types they can only be requested if they are contained
- within the global ioport_resource, so that should be limited to one IO space
- range.
Actually we have quite a different set of restrictions around I/O space on ARM32 at the moment: Each host bridge can have its own 64KB range in an arbitrary location on MMIO space, and the total must not exceed 2MB of I/O space.
- */
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
struct list_head *resources)
+{
struct resource *res;
struct of_pci_range range;
struct of_pci_range_parser parser;
int err;
pr_info("PCI host bridge %s ranges:\n", dev->full_name);
/* Check for ranges property */
err = of_pci_range_parser_init(&parser, dev);
if (err)
return err;
pr_debug("Parsing ranges property...\n");
for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
range.cpu_addr, range.size);
/* If we failed translation or got a zero-sized region
* (some FW try to feed us with non sensical zero sized regions
* such as power3 which look like some kind of attempt
* at exposing the VGA memory hole) then skip this range
*/
if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
res = kzalloc(sizeof(struct resource), GFP_KERNEL);
if (!res) {
err = -ENOMEM;
goto bridge_ranges_nomem;
}
of_pci_range_to_resource(&range, dev, res);
pci_add_resource_offset(resources, res,
range.cpu_addr - range.pci_addr);
}
I believe of_pci_range_to_resource() will return the MMIO aperture for the I/O space window here, which is not what you are supposed to pass into pci_add_resource_offset.
+EXPORT_SYMBOL(pci_host_bridge_of_init);
EXPORT_SYMBOL_GPL
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
} offset = window->offset; if (res->flags & IORESOURCE_BUS)
Won't this break all existing host bridges?
Arnd
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Thu, Feb 6, 2014 at 2:18 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Wed, Feb 05, 2014 at 10:26:27PM +0000, Tanmay Inamdar wrote:
Hello Liviu,
I did not get the first email of this particular patch on any of subscribed mailing lists (don't know why), hence replying here.
Strange, it shows in the MARC and GMANE archive for linux-pci, probably a hickup on your receiving side?
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops,
- void *host_data, struct list_head *resources)
+{
- struct pci_bus *root_bus;
- struct pci_host_bridge *bridge;
- /* first parse the host bridge bus ranges */
- if (pci_host_bridge_of_get_ranges(parent->of_node, resources))
- return NULL;
- /* then create the root bus */
- root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources);
- if (!root_bus)
- return NULL;
- bridge = to_pci_host_bridge(root_bus->bridge);
- return bridge;
+}
You are keeping the domain_nr inside pci_host_bridge structure. In above API, domain_nr is required in 'pci_find_bus' function called from 'pci_create_root_bus'. Since the bridge is allocated after creating root bus, 'pci_find_bus' always gets domain_nr as 0. This will cause problem for scanning multiple domains.
Good catch. I was switching between creating a pci_controller in arch/arm64 and adding the needed bits in pci_host_bridge. After internal review I've decided to add the domain_nr to pci_host_bridge, but forgot to update the code everywhere.
Thanks for reviewing this, will fix in v2.
Do you find porting to the new API straight forward?
It is quite straight forward for MEM regions but for IO regions it is not. You always assume IO resource starting at 0x0. IMO, this will cause problem for systems with multiple ports / IO windows. You can take a look at 'drivers/pci/host/pcie-designware.c'.
Also the manipulations of addresses for IO_RESOURCES can be dangerous. It can make some value negative. For example if my PCI IO range starts in 32 bit address range say 0x8000_0000 with CPU address 0xb0_8000_0000, window->offset after manipulation will become negative.
Personally I would like to do get all the PCI and CPU addresses from my device tree as is and do the 'pci_add_resource_offset'. This will help me setup my regions correctly without any further arithmetic. Please let me know what you think.
Best regards, Liviu
On Mon, Feb 3, 2014 at 10:46 AM, Arnd Bergmann arnd@arndb.de wrote:
On Monday 03 February 2014 18:33:48 Liviu Dudau wrote:
+/**
- pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
- @dev: device node of the host bridge having the range property
- @resources: list where the range of resources will be added after DT parsing
- This function will parse the "ranges" property of a PCI host bridge device
- node and setup the resource mapping based on its content. It is expected
- that the property conforms with the Power ePAPR document.
- Each architecture will then apply their filtering based on the limitations
- of each platform. One general restriction seems to be the number of IO space
- ranges, the PCI framework makes intensive use of struct resource management,
- and for IORESOURCE_IO types they can only be requested if they are contained
- within the global ioport_resource, so that should be limited to one IO space
- range.
Actually we have quite a different set of restrictions around I/O space on ARM32 at the moment: Each host bridge can have its own 64KB range in an arbitrary location on MMIO space, and the total must not exceed 2MB of I/O space.
- */
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
struct list_head *resources)
+{
struct resource *res;
struct of_pci_range range;
struct of_pci_range_parser parser;
int err;
pr_info("PCI host bridge %s ranges:\n", dev->full_name);
/* Check for ranges property */
err = of_pci_range_parser_init(&parser, dev);
if (err)
return err;
pr_debug("Parsing ranges property...\n");
for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
range.cpu_addr, range.size);
/* If we failed translation or got a zero-sized region
* (some FW try to feed us with non sensical zero sized regions
* such as power3 which look like some kind of attempt
* at exposing the VGA memory hole) then skip this range
*/
if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
res = kzalloc(sizeof(struct resource), GFP_KERNEL);
if (!res) {
err = -ENOMEM;
goto bridge_ranges_nomem;
}
of_pci_range_to_resource(&range, dev, res);
pci_add_resource_offset(resources, res,
range.cpu_addr - range.pci_addr);
}
I believe of_pci_range_to_resource() will return the MMIO aperture for the I/O space window here, which is not what you are supposed to pass into pci_add_resource_offset.
+EXPORT_SYMBOL(pci_host_bridge_of_init);
EXPORT_SYMBOL_GPL
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
} offset = window->offset; if (res->flags & IORESOURCE_BUS)
Won't this break all existing host bridges?
Arnd
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
--
| I would like to | | fix the world, | | but they're not | | giving me the | \ source code! /
¯\_(ツ)_/¯
-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590 ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782
On Sat, Feb 08, 2014 at 12:21:56AM +0000, Tanmay Inamdar wrote:
On Thu, Feb 6, 2014 at 2:18 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Wed, Feb 05, 2014 at 10:26:27PM +0000, Tanmay Inamdar wrote:
Hello Liviu,
I did not get the first email of this particular patch on any of subscribed mailing lists (don't know why), hence replying here.
Strange, it shows in the MARC and GMANE archive for linux-pci, probably a hickup on your receiving side?
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops,
- void *host_data, struct list_head *resources)
+{
- struct pci_bus *root_bus;
- struct pci_host_bridge *bridge;
- /* first parse the host bridge bus ranges */
- if (pci_host_bridge_of_get_ranges(parent->of_node, resources))
- return NULL;
- /* then create the root bus */
- root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources);
- if (!root_bus)
- return NULL;
- bridge = to_pci_host_bridge(root_bus->bridge);
- return bridge;
+}
You are keeping the domain_nr inside pci_host_bridge structure. In above API, domain_nr is required in 'pci_find_bus' function called from 'pci_create_root_bus'. Since the bridge is allocated after creating root bus, 'pci_find_bus' always gets domain_nr as 0. This will cause problem for scanning multiple domains.
Good catch. I was switching between creating a pci_controller in arch/arm64 and adding the needed bits in pci_host_bridge. After internal review I've decided to add the domain_nr to pci_host_bridge, but forgot to update the code everywhere.
Thanks for reviewing this, will fix in v2.
Do you find porting to the new API straight forward?
It is quite straight forward for MEM regions but for IO regions it is not. You always assume IO resource starting at 0x0. IMO, this will cause problem for systems with multiple ports / IO windows. You can take a look at 'drivers/pci/host/pcie-designware.c'.
Also the manipulations of addresses for IO_RESOURCES can be dangerous. It can make some value negative. For example if my PCI IO range starts in 32 bit address range say 0x8000_0000 with CPU address 0xb0_8000_0000, window->offset after manipulation will become negative.
Personally I would like to do get all the PCI and CPU addresses from my device tree as is and do the 'pci_add_resource_offset'. This will help me setup my regions correctly without any further arithmetic. Please let me know what you think.
Yes, that parsing code of ranges is incorrect in light of the current discussion. I will try to propose a fix in the v2 series.
Regarding your PCI IO range: if your bus address starts at 0x8000_0000, would that not cause problems with devices that use less than 32bits for address decoding? It is a rather unusual setup, I believe the x86 world (even PCI spec?) mandates IO bus ranges in the first 16MB of address range?
Best regards, Liviu
Best regards, Liviu
On Mon, Feb 3, 2014 at 10:46 AM, Arnd Bergmann arnd@arndb.de wrote:
On Monday 03 February 2014 18:33:48 Liviu Dudau wrote:
+/**
- pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
- @dev: device node of the host bridge having the range property
- @resources: list where the range of resources will be added after DT parsing
- This function will parse the "ranges" property of a PCI host bridge device
- node and setup the resource mapping based on its content. It is expected
- that the property conforms with the Power ePAPR document.
- Each architecture will then apply their filtering based on the limitations
- of each platform. One general restriction seems to be the number of IO space
- ranges, the PCI framework makes intensive use of struct resource management,
- and for IORESOURCE_IO types they can only be requested if they are contained
- within the global ioport_resource, so that should be limited to one IO space
- range.
Actually we have quite a different set of restrictions around I/O space on ARM32 at the moment: Each host bridge can have its own 64KB range in an arbitrary location on MMIO space, and the total must not exceed 2MB of I/O space.
- */
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
struct list_head *resources)
+{
struct resource *res;
struct of_pci_range range;
struct of_pci_range_parser parser;
int err;
pr_info("PCI host bridge %s ranges:\n", dev->full_name);
/* Check for ranges property */
err = of_pci_range_parser_init(&parser, dev);
if (err)
return err;
pr_debug("Parsing ranges property...\n");
for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
range.cpu_addr, range.size);
/* If we failed translation or got a zero-sized region
* (some FW try to feed us with non sensical zero sized regions
* such as power3 which look like some kind of attempt
* at exposing the VGA memory hole) then skip this range
*/
if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
res = kzalloc(sizeof(struct resource), GFP_KERNEL);
if (!res) {
err = -ENOMEM;
goto bridge_ranges_nomem;
}
of_pci_range_to_resource(&range, dev, res);
pci_add_resource_offset(resources, res,
range.cpu_addr - range.pci_addr);
}
I believe of_pci_range_to_resource() will return the MMIO aperture for the I/O space window here, which is not what you are supposed to pass into pci_add_resource_offset.
+EXPORT_SYMBOL(pci_host_bridge_of_init);
EXPORT_SYMBOL_GPL
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
} offset = window->offset; if (res->flags & IORESOURCE_BUS)
Won't this break all existing host bridges?
Arnd
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
--
| I would like to | | fix the world, | | but they're not | | giving me the | \ source code! /
¯\_(ツ)_/¯
-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590 ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782
-- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Saturday 08 February 2014, Liviu Dudau wrote:
Yes, that parsing code of ranges is incorrect in light of the current discussion. I will try to propose a fix in the v2 series.
Regarding your PCI IO range: if your bus address starts at 0x8000_0000, would that not cause problems with devices that use less than 32bits for address decoding? It is a rather unusual setup, I believe the x86 world (even PCI spec?) mandates IO bus ranges in the first 16MB of address range?
Actually even less than that. The common location of the I/O aperture is 0x0000-0xffff in bus space, which comes from the way the x86 instruction set handles it. You can have larger bus addresses, but you can easily get into trouble that way. One limitation is that the total I/O space size for all buses combined is 2MB on arm32 (to be extended to 16MB on arm64 as it stands right now), the other is that we probably have code assuming that io_offset is a non-negative number, i.e. the bus address is equal or less than the logial I/O port number, which in turn is limited to a few MB.
Arnd
On Sat, Feb 8, 2014 at 6:22 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Sat, Feb 08, 2014 at 12:21:56AM +0000, Tanmay Inamdar wrote:
On Thu, Feb 6, 2014 at 2:18 AM, Liviu Dudau Liviu.Dudau@arm.com wrote:
On Wed, Feb 05, 2014 at 10:26:27PM +0000, Tanmay Inamdar wrote:
Hello Liviu,
I did not get the first email of this particular patch on any of subscribed mailing lists (don't know why), hence replying here.
Strange, it shows in the MARC and GMANE archive for linux-pci, probably a hickup on your receiving side?
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops,
- void *host_data, struct list_head *resources)
+{
- struct pci_bus *root_bus;
- struct pci_host_bridge *bridge;
- /* first parse the host bridge bus ranges */
- if (pci_host_bridge_of_get_ranges(parent->of_node, resources))
- return NULL;
- /* then create the root bus */
- root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources);
- if (!root_bus)
- return NULL;
- bridge = to_pci_host_bridge(root_bus->bridge);
- return bridge;
+}
You are keeping the domain_nr inside pci_host_bridge structure. In above API, domain_nr is required in 'pci_find_bus' function called from 'pci_create_root_bus'. Since the bridge is allocated after creating root bus, 'pci_find_bus' always gets domain_nr as 0. This will cause problem for scanning multiple domains.
Good catch. I was switching between creating a pci_controller in arch/arm64 and adding the needed bits in pci_host_bridge. After internal review I've decided to add the domain_nr to pci_host_bridge, but forgot to update the code everywhere.
Thanks for reviewing this, will fix in v2.
Do you find porting to the new API straight forward?
It is quite straight forward for MEM regions but for IO regions it is not. You always assume IO resource starting at 0x0. IMO, this will cause problem for systems with multiple ports / IO windows. You can take a look at 'drivers/pci/host/pcie-designware.c'.
Also the manipulations of addresses for IO_RESOURCES can be dangerous. It can make some value negative. For example if my PCI IO range starts in 32 bit address range say 0x8000_0000 with CPU address 0xb0_8000_0000, window->offset after manipulation will become negative.
Personally I would like to do get all the PCI and CPU addresses from my device tree as is and do the 'pci_add_resource_offset'. This will help me setup my regions correctly without any further arithmetic. Please let me know what you think.
Yes, that parsing code of ranges is incorrect in light of the current discussion. I will try to propose a fix in the v2 series.
Okay. Thanks.
Regarding your PCI IO range: if your bus address starts at 0x8000_0000, would that not cause problems with devices that use less than 32bits for address decoding? It is a rather unusual setup, I believe the x86 world (even PCI spec?) mandates IO bus ranges in the first 16MB of address range?
Yes. It will cause problem with few devices. Anyways it was just an example.
Best regards, Liviu
Best regards, Liviu
On Mon, Feb 3, 2014 at 10:46 AM, Arnd Bergmann arnd@arndb.de wrote:
On Monday 03 February 2014 18:33:48 Liviu Dudau wrote:
+/**
- pci_host_bridge_of_get_ranges - Parse PCI host bridge resources from DT
- @dev: device node of the host bridge having the range property
- @resources: list where the range of resources will be added after DT parsing
- This function will parse the "ranges" property of a PCI host bridge device
- node and setup the resource mapping based on its content. It is expected
- that the property conforms with the Power ePAPR document.
- Each architecture will then apply their filtering based on the limitations
- of each platform. One general restriction seems to be the number of IO space
- ranges, the PCI framework makes intensive use of struct resource management,
- and for IORESOURCE_IO types they can only be requested if they are contained
- within the global ioport_resource, so that should be limited to one IO space
- range.
Actually we have quite a different set of restrictions around I/O space on ARM32 at the moment: Each host bridge can have its own 64KB range in an arbitrary location on MMIO space, and the total must not exceed 2MB of I/O space.
- */
+static int pci_host_bridge_of_get_ranges(struct device_node *dev,
struct list_head *resources)
+{
struct resource *res;
struct of_pci_range range;
struct of_pci_range_parser parser;
int err;
pr_info("PCI host bridge %s ranges:\n", dev->full_name);
/* Check for ranges property */
err = of_pci_range_parser_init(&parser, dev);
if (err)
return err;
pr_debug("Parsing ranges property...\n");
for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
range.cpu_addr, range.size);
/* If we failed translation or got a zero-sized region
* (some FW try to feed us with non sensical zero sized regions
* such as power3 which look like some kind of attempt
* at exposing the VGA memory hole) then skip this range
*/
if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
res = kzalloc(sizeof(struct resource), GFP_KERNEL);
if (!res) {
err = -ENOMEM;
goto bridge_ranges_nomem;
}
of_pci_range_to_resource(&range, dev, res);
pci_add_resource_offset(resources, res,
range.cpu_addr - range.pci_addr);
}
I believe of_pci_range_to_resource() will return the MMIO aperture for the I/O space window here, which is not what you are supposed to pass into pci_add_resource_offset.
+EXPORT_SYMBOL(pci_host_bridge_of_init);
EXPORT_SYMBOL_GPL
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 6e34498..16febae 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1787,6 +1787,17 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus, list_for_each_entry_safe(window, n, resources, list) { list_move_tail(&window->list, &bridge->windows); res = window->res;
/*
* IO resources are stored in the kernel with a CPU start
* address of zero. Adjust the data accordingly and remember
* the offset
*/
if (resource_type(res) == IORESOURCE_IO) {
bridge->io_offset = res->start;
res->end -= res->start;
window->offset -= res->start;
res->start = 0;
} offset = window->offset; if (res->flags & IORESOURCE_BUS)
Won't this break all existing host bridges?
Arnd
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
--
| I would like to | | fix the world, | | but they're not | | giving me the | \ source code! /
¯\_(ツ)_/¯
-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590 ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782
-- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--
| I would like to | | fix the world, | | but they're not | | giving me the | \ source code! /
¯\_(ツ)_/¯
-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590 ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782
On Thursday, February 06, 2014 7:18 PM, Liviu Dudau wrote:
On Wed, Feb 05, 2014 at 10:26:27PM +0000, Tanmay Inamdar wrote:
Hello Liviu,
I did not get the first email of this particular patch on any of subscribed mailing lists (don't know why), hence replying here.
Strange, it shows in the MARC and GMANE archive for linux-pci, probably a hickup on your receiving side?
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops,
- void *host_data, struct list_head *resources)
+{
- struct pci_bus *root_bus;
- struct pci_host_bridge *bridge;
- /* first parse the host bridge bus ranges */
- if (pci_host_bridge_of_get_ranges(parent->of_node, resources))
- return NULL;
- /* then create the root bus */
- root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources);
- if (!root_bus)
- return NULL;
- bridge = to_pci_host_bridge(root_bus->bridge);
- return bridge;
+}
You are keeping the domain_nr inside pci_host_bridge structure. In above API, domain_nr is required in 'pci_find_bus' function called from 'pci_create_root_bus'. Since the bridge is allocated after creating root bus, 'pci_find_bus' always gets domain_nr as 0. This will cause problem for scanning multiple domains.
Good catch. I was switching between creating a pci_controller in arch/arm64 and adding the needed bits in pci_host_bridge. After internal review I've decided to add the domain_nr to pci_host_bridge, but forgot to update the code everywhere.
Hi Liviu Dudau,
One more thing, I am reviewing and compiling your patch. Would you consider adding 'struct pci_sys_data' and 'struct hw_pci'?
Currently, 4 PCIe Host drivers (pci-mvebu.c, pci-tegra.c, pci-rcar-gen2.c, pcie-designware.c) are using 'struct pci_sys_data' and 'struct hw_pci' in their drivers. Without this, it makes build errors.
In arm32, 'struct pci_sys_data' and 'struct hw_pci' is defined in "arch/arm/include/asm/mach/pci.h".
Tanmay Inamdar, Your 'APM X-Gene PCIe' patch also needs 'struct pci_sys_data' and 'struct hw_pci'. With Liviu Dudau's patch, it will make build errors. Would you check this?
Thank you.
Best regards, Jingoo Han
Thanks for reviewing this, will fix in v2.
Do you find porting to the new API straight forward?
On
-----Original Message----- From: Jingoo Han [mailto:jg1.han@samsung.com] Sent: Thursday, February 13, 2014 5:10 PM To: 'Liviu Dudau'; 'Tanmay Inamdar' Cc: 'Arnd Bergmann'; devicetree@vger.kernel.org; 'linaro-kernel'; 'linux-pci'; 'Will Deacon'; 'LKML'; 'Catalin Marinas'; 'Bjorn Helgaas'; 'LAKML'; 'Jingoo Han' Subject: Re: [PATCH] pci: Add support for creating a generic host_bridge from device tree
On Thursday, February 06, 2014 7:18 PM, Liviu Dudau wrote:
On Wed, Feb 05, 2014 at 10:26:27PM +0000, Tanmay Inamdar wrote:
Hello Liviu,
I did not get the first email of this particular patch on any of subscribed mailing lists (don't know why), hence replying here.
Strange, it shows in the MARC and GMANE archive for linux-pci, probably a hickup on your receiving side?
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops,
- void *host_data, struct list_head *resources)
+{
- struct pci_bus *root_bus;
- struct pci_host_bridge *bridge;
- /* first parse the host bridge bus ranges */
- if (pci_host_bridge_of_get_ranges(parent->of_node, resources))
- return NULL;
- /* then create the root bus */
- root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources);
- if (!root_bus)
- return NULL;
- bridge = to_pci_host_bridge(root_bus->bridge);
- return bridge;
+}
You are keeping the domain_nr inside pci_host_bridge structure. In above API, domain_nr is required in 'pci_find_bus' function called from 'pci_create_root_bus'. Since the bridge is allocated after creating root bus, 'pci_find_bus' always gets domain_nr as 0. This will cause problem for scanning multiple domains.
Good catch. I was switching between creating a pci_controller in arch/arm64 and adding the needed bits in pci_host_bridge. After internal review I've decided to add the domain_nr to pci_host_bridge, but forgot to update the code everywhere.
Hi Liviu Dudau,
One more thing, I am reviewing and compiling your patch. Would you consider adding 'struct pci_sys_data' and 'struct hw_pci'?
Currently, 4 PCIe Host drivers (pci-mvebu.c, pci-tegra.c, pci-rcar-gen2.c, pcie-designware.c) are using 'struct pci_sys_data' and 'struct hw_pci' in their drivers. Without this, it makes build errors.
In arm32, 'struct pci_sys_data' and 'struct hw_pci' is defined in "arch/arm/include/asm/mach/pci.h".
Tanmay Inamdar, Your 'APM X-Gene PCIe' patch also needs 'struct pci_sys_data' and 'struct hw_pci'. With Liviu Dudau's patch, it will make build errors. Would you check this?
I mean the patch '[PATCH] arm64: Add architecture support for PCI'. With this patch, it makes build errors in PCIe Host drivers such as pcie-designware.c.
Best regards, Jingoo Han
On Thu, Feb 13, 2014 at 12:10 AM, Jingoo Han jg1.han@samsung.com wrote:
On Thursday, February 06, 2014 7:18 PM, Liviu Dudau wrote:
On Wed, Feb 05, 2014 at 10:26:27PM +0000, Tanmay Inamdar wrote:
Hello Liviu,
I did not get the first email of this particular patch on any of subscribed mailing lists (don't know why), hence replying here.
Strange, it shows in the MARC and GMANE archive for linux-pci, probably a hickup on your receiving side?
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops,
- void *host_data, struct list_head *resources)
+{
- struct pci_bus *root_bus;
- struct pci_host_bridge *bridge;
- /* first parse the host bridge bus ranges */
- if (pci_host_bridge_of_get_ranges(parent->of_node, resources))
- return NULL;
- /* then create the root bus */
- root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources);
- if (!root_bus)
- return NULL;
- bridge = to_pci_host_bridge(root_bus->bridge);
- return bridge;
+}
You are keeping the domain_nr inside pci_host_bridge structure. In above API, domain_nr is required in 'pci_find_bus' function called from 'pci_create_root_bus'. Since the bridge is allocated after creating root bus, 'pci_find_bus' always gets domain_nr as 0. This will cause problem for scanning multiple domains.
Good catch. I was switching between creating a pci_controller in arch/arm64 and adding the needed bits in pci_host_bridge. After internal review I've decided to add the domain_nr to pci_host_bridge, but forgot to update the code everywhere.
Hi Liviu Dudau,
One more thing, I am reviewing and compiling your patch. Would you consider adding 'struct pci_sys_data' and 'struct hw_pci'?
Currently, 4 PCIe Host drivers (pci-mvebu.c, pci-tegra.c, pci-rcar-gen2.c, pcie-designware.c) are using 'struct pci_sys_data' and 'struct hw_pci' in their drivers. Without this, it makes build errors.
In arm32, 'struct pci_sys_data' and 'struct hw_pci' is defined in "arch/arm/include/asm/mach/pci.h".
Tanmay Inamdar, Your 'APM X-Gene PCIe' patch also needs 'struct pci_sys_data' and 'struct hw_pci'. With Liviu Dudau's patch, it will make build errors. Would you check this?
X-Gene PCIe host driver is dependent on arm64 PCI patch. My previous approach was based on 32bit arm PCI support. With Liviu's approach, I will have to make changes in host driver to get rid of hw_pci and pci_sys_data which are no longer required.
IMO it should not cause build errors for PCI host drivers dependent on architectures other than arm64. Can you post the error?
Thank you.
Best regards, Jingoo Han
Thanks for reviewing this, will fix in v2.
Do you find porting to the new API straight forward?
-----Original Message----- From: Tanmay Inamdar [mailto:tinamdar@apm.com] Sent: Thursday, February 13, 2014 5:37 PM To: Jingoo Han Cc: Liviu Dudau; Arnd Bergmann; devicetree@vger.kernel.org; linaro-kernel; linux-pci; Will Deacon; LKML; Catalin Marinas; Bjorn Helgaas; LAKML Subject: Re: [PATCH] pci: Add support for creating a generic host_bridge from device tree
On Thu, Feb 13, 2014 at 12:10 AM, Jingoo Han jg1.han@samsung.com wrote:
On Thursday, February 06, 2014 7:18 PM, Liviu Dudau wrote:
On Wed, Feb 05, 2014 at 10:26:27PM +0000, Tanmay Inamdar wrote:
Hello Liviu,
I did not get the first email of this particular patch on any of subscribed mailing lists (don't know why), hence replying here.
Strange, it shows in the MARC and GMANE archive for linux-pci, probably a hickup on your receiving side?
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops,
- void *host_data, struct list_head *resources)
+{
- struct pci_bus *root_bus;
- struct pci_host_bridge *bridge;
- /* first parse the host bridge bus ranges */
- if (pci_host_bridge_of_get_ranges(parent->of_node, resources))
- return NULL;
- /* then create the root bus */
- root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources);
- if (!root_bus)
- return NULL;
- bridge = to_pci_host_bridge(root_bus->bridge);
- return bridge;
+}
You are keeping the domain_nr inside pci_host_bridge structure. In above API, domain_nr is required in 'pci_find_bus' function called from 'pci_create_root_bus'. Since the bridge is allocated after creating root bus, 'pci_find_bus' always gets domain_nr as 0. This will cause problem for scanning multiple domains.
Good catch. I was switching between creating a pci_controller in arch/arm64 and adding the needed bits in pci_host_bridge. After internal review I've decided to add the domain_nr to pci_host_bridge, but forgot to update the code everywhere.
Hi Liviu Dudau,
One more thing, I am reviewing and compiling your patch. Would you consider adding 'struct pci_sys_data' and 'struct hw_pci'?
Currently, 4 PCIe Host drivers (pci-mvebu.c, pci-tegra.c, pci-rcar-gen2.c, pcie-designware.c) are using 'struct pci_sys_data' and 'struct hw_pci' in their drivers. Without this, it makes build errors.
In arm32, 'struct pci_sys_data' and 'struct hw_pci' is defined in "arch/arm/include/asm/mach/pci.h".
Tanmay Inamdar, Your 'APM X-Gene PCIe' patch also needs 'struct pci_sys_data' and 'struct hw_pci'. With Liviu Dudau's patch, it will make build errors. Would you check this?
X-Gene PCIe host driver is dependent on arm64 PCI patch. My previous approach was based on 32bit arm PCI support. With Liviu's approach, I will have to make changes in host driver to get rid of hw_pci and pci_sys_data which are no longer required.
I want to use 'drivers/pci/host/pcie-designware.c' for both arm32 and arm64, without any code changes. However, it looks impossible.
I made 'drivers/pci/host/pcie-designware.c' based on 32bit arm PCI support. Then, with Liviu's patch, do I have to make new code for arm64, even though the same HW PCIe IP is used?
- For arm32 drivers/pci/host/pcie-designware.c
- For arm64 drivers/pci/host/pcie-designware-arm64.c
IMO it should not cause build errors for PCI host drivers dependent on architectures other than arm64. Can you post the error?
I post the build errors.
CC drivers/pci/host/pcie-designware.o CHK kernel/config_data.h drivers/pci/host/pcie-designware.c:72:52: warning: 'struct pci_sys_data' declared inside parameter list [enabled by default] static inline struct pcie_port *sys_to_pcie(struct pci_sys_data *sys) ^ drivers/pci/host/pcie-designware.c:72:52: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] drivers/pci/host/pcie-designware.c: In function 'sys_to_pcie' drivers/pci/host/pcie-designware.c:74:12: error: dereferencing pointer to incomplete type return sys->private_data; ^ drivers/pci/host/pcie-designware.c: In function 'dw_pcie_msi_map' drivers/pci/host/pcie-designware.c:384:2: error: implicit declaration of function 'set_irq_flags' [-Werror=implicit-function-declaration] set_irq_flags(irq, IRQF_VALID); ^ drivers/pci/host/pcie-designware.c:384:21: error: 'IRQF_VALID??undeclared (first use in this function) set_irq_flags(irq, IRQF_VALID); ^ drivers/pci/host/pcie-designware.c:384:21: note: each undeclared identifier is reported only once for each function it appears in drivers/pci/host/pcie-designware.c: In function 'dw_pcie_host_init' drivers/pci/host/pcie-designware.c:492:2: error: invalid use of undefined type 'struct hw_pci' dw_pci.nr_controllers = 1; ^ drivers/pci/host/pcie-designware.c:493:2: error: invalid use of undefined type 'struct hw_pci' dw_pci.private_data = (void **)&pp; ^ drivers/pci/host/pcie-designware.c:495:2: error: implicit declaration of function 'pci_common_init' [-Werror=implicit-function-declaration] pci_common_init(&dw_pci); ^ drivers/pci/host/pcie-designware.c:498:2: error: invalid use of undefined type 'struct hw_pci' dw_pci.domain++; ^ drivers/pci/host/pcie-designware.c: At top level: drivers/pci/host/pcie-designware.c:698:41: warning: 'struct pci_sys_data??declared inside parameter list [enabled by default] static int dw_pcie_setup(int nr, struct pci_sys_data *sys) ^ drivers/pci/host/pcie-designware.c: In function 'dw_pcie_setup' drivers/pci/host/pcie-designware.c:702:2: warning: passing argument 1 of 'sys_to_pcie' from incompatible pointer type [enabled by default] pp = sys_to_pcie(sys); ^ drivers/pci/host/pcie-designware.c:72:33: note: expected 'struct pci_sys_data *' but argument is of type 'struct pci_sys_data *' static inline struct pcie_port *sys_to_pcie(struct pci_sys_data *sys) ^ drivers/pci/host/pcie-designware.c:708:6: error: dereferencing pointer to incomplete type sys->io_offset = global_io_offset - pp->config.io_bus_addr; ^ drivers/pci/host/pcie-designware.c:711:31: error: dereferencing pointer to incomplete type pci_add_resource_offset(&sys->resources, &pp->io, ^ drivers/pci/host/pcie-designware.c:712:9: error: dereferencing pointer to incomplete type sys->io_offset); ^ drivers/pci/host/pcie-designware.c:715:5: error: dereferencing pointer to incomplete type sys->mem_offset = pp->mem.start - pp->config.mem_bus_addr; ^ drivers/pci/host/pcie-designware.c:716:30: error: dereferencing pointer to incomplete type pci_add_resource_offset(&sys->resources, &pp->mem, sys->mem_offset); ^ drivers/pci/host/pcie-designware.c:716:56: error: dereferencing pointer to incomplete type pci_add_resource_offset(&sys->resources, &pp->mem, sys->mem_offset); ^ drivers/pci/host/pcie-designware.c: At top level: drivers/pci/host/pcie-designware.c:721:56: warning: 'struct pci_sys_data' declared inside parameter list [enabled by default] static struct pci_bus *dw_pcie_scan_bus(int nr, struct pci_sys_data *sys) ^ drivers/pci/host/pcie-designware.c: In function 'dw_pcie_scan_bus' drivers/pci/host/pcie-designware.c:724:9: warning: passing argument 1 of 'sys_to_pcie' from incompatible pointer type [enabled by default] struct pcie_port *pp = sys_to_pcie(sys); ^ drivers/pci/host/pcie-designware.c:72:33: note: expected 'struct pci_sys_data *' but argument is of type 'sruct pci_sys_data *' static inline struct pcie_port *sys_to_pcie(struct pci_sys_data *sys) ^ drivers/pci/host/pcie-designware.c:727:24: error: dereferencing pointer to incomplete type pp->root_bus_nr = sys->busnr; ^ drivers/pci/host/pcie-designware.c:728:36: error: dereferencing pointer to incomplete type bus = pci_scan_root_bus(NULL, sys->busnr, &dw_pcie_ops, ^ drivers/pci/host/pcie-designware.c:729:15: error: dereferencing pointer to incomplete type sys, &sys->resources); ^ drivers/pci/host/pcie-designware.c: At top level: drivers/pci/host/pcie-designware.c:755:15: error: variable 'dw_pci' has initializer but incomplete type static struct hw_pci dw_pci = { ^ drivers/pci/host/pcie-designware.c:756:2: error: unknown field 'setup' specified in initializer .setup = dw_pcie_setup, ^ drivers/pci/host/pcie-designware.c:756:2: warning: excess elements in struct initializer [enabled by default] drivers/pci/host/pcie-designware.c:756:2: warning: (near initialization for 'dw_pci' [enabled by default] drivers/pci/host/pcie-designware.c:757:2: error: unknown field 'scan' specified in initializer .scan = dw_pcie_scan_bus, ^ drivers/pci/host/pcie-designware.c:757:2: warning: excess elements in struct initializer [enabled by default] drivers/pci/host/pcie-designware.c:757:2: warning: (near initialization for 'dw_pci' [enabled by default] drivers/pci/host/pcie-designware.c:758:2: error: unknown field 'map_irq' specified in initializer .map_irq = dw_pcie_map_irq, ^ drivers/pci/host/pcie-designware.c:758:2: warning: excess elements in struct initializer [enabled by default] drivers/pci/host/pcie-designware.c:758:2: warning: (near initialization for 'dw_pci' [enabled by default] drivers/pci/host/pcie-designware.c:759:2: error: unknown field 'add_bus' specified in initializer .add_bus = dw_pcie_add_bus, ^ drivers/pci/host/pcie-designware.c:759:2: warning: excess elements in struct initializer [enabled by default] drivers/pci/host/pcie-designware.c:759:2: warning: (near initialization for 'dw_pci' [enabled by default] cc1: some warnings being treated as errors make[3]: *** [drivers/pci/host/pcie-designware.o] Error 1
Thanks for reviewing this, will fix in v2.
Do you find porting to the new API straight forward?
On Thursday 13 February 2014 17:57:41 Jingoo Han wrote:
I want to use 'drivers/pci/host/pcie-designware.c' for both arm32 and arm64, without any code changes. However, it looks impossible.
It is impossible at the moment, and I agree we have to fix that.
I made 'drivers/pci/host/pcie-designware.c' based on 32bit arm PCI support. Then, with Liviu's patch, do I have to make new code for arm64, even though the same HW PCIe IP is used?
For arm32 drivers/pci/host/pcie-designware.c
For arm64 drivers/pci/host/pcie-designware-arm64.c
As a start, I'd suggest using "#ifdef CONFIG_ARM" in the driver, but sharing as much code as you can. We should try to make the #else section of the #ifdef architecture independent and get have the arm64 implementation shared with any architecture that doesn't have or want its own pcibios32.c implementation.
I am reviewing and compiling your patch. Would you consider adding 'struct pci_sys_data' and 'struct hw_pci'?
I would rather get rid of struct hw_pci for architecture independent drivers and add a different registration method on arm32 that is compatible with what we come up with on arm64. The main purpose of hw_pci is to allow multiple PCI controllers to be initialized at once, but we don't actually need that for any of the "modern" platforms where we already have a probe function that gets called once for each controller.
As a start, we could add a pci_host_bridge_register() function like the one below to arm32 and migrate the drivers/pci/host/ drivers over to use it with little effort. Instead of filling out hw_pci, these drivers would allocate (by embedding in their device struct) and fill out pci_sys_data directly. After that, we can gradually move more code out of the arm32 implementation into common code, if it doesn't already exist there, up to the point where a host driver no longer has to call any function in bios32.c.
Arnd
diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c index 317da88..12c2178 100644 --- a/arch/arm/kernel/bios32.c +++ b/arch/arm/kernel/bios32.c @@ -514,6 +514,26 @@ static void pcibios_init_hw(struct device *parent, struct hw_pci *hw, } }
+static void pci_common_bus_probe(struct pci_bus *bus) +{ + if (!pci_has_flag(PCI_PROBE_ONLY)) { + /* + * Size the bridge windows. + */ + pci_bus_size_bridges(bus); + + /* + * Assign resources. + */ + pci_bus_assign_resources(bus); + } + + /* + * Tell drivers about devices found. + */ + pci_bus_add_devices(bus); +} + void pci_common_init_dev(struct device *parent, struct hw_pci *hw) { struct pci_sys_data *sys; @@ -528,27 +548,38 @@ void pci_common_init_dev(struct device *parent, struct hw_pci *hw)
pci_fixup_irqs(pcibios_swizzle, pcibios_map_irq);
- list_for_each_entry(sys, &head, node) { - struct pci_bus *bus = sys->bus; + list_for_each_entry(sys, &head, node) + pci_common_bus_probe(sys->bus); +}
- if (!pci_has_flag(PCI_PROBE_ONLY)) { - /* - * Size the bridge windows. - */ - pci_bus_size_bridges(bus);
- /* - * Assign resources. - */ - pci_bus_assign_resources(bus); - }
- /* - * Tell drivers about devices found. - */ - pci_bus_add_devices(bus); - } + +int pci_host_bridge_register(struct device *parent, struct pci_sys_data *sys, struct pci_ops *ops, int (*setup)(int nr, struct pci_sys_data *)) +{ + int ret; + + pci_add_flags(PCI_REASSIGN_ALL_RSRC); + INIT_LIST_HEAD(&sys->resources); + + ret = setup(0, sys); + if (ret) + return ret; + + ret = pcibios_init_resources(0, sys); + if (ret) + return ret; + + sys->bus = pci_scan_root_bus(parent, sys->busnr, ops, sys, &sys->resources); + if (!sys->bus) + return -ENODEV; + + pci_fixup_irqs(pcibios_swizzle, pcibios_map_irq); + + pci_common_bus_probe(sys->bus); + return ret; } +EXPORT_SYMBOL_GPL(pci_host_bridge_register);
#ifndef CONFIG_PCI_HOST_ITE8152 void pcibios_set_master(struct pci_dev *dev)
On Thu, Feb 13, 2014 at 12:27:05PM +0100, Arnd Bergmann wrote:
I would rather get rid of struct hw_pci for architecture independent drivers and add a different registration method on arm32 that is compatible with what we come up with on arm64. The main purpose of hw_pci is to allow multiple PCI controllers to be initialized at once, but we don't actually need that for any of the "modern" platforms where we already have a probe function that gets called once for each controller.
No. The main purpose of hw_pci is as a container to support multiple different platform specific PCI implementations in one kernel. It's exactly what you need for single zImage.
On Thursday 13 February 2014 11:53:27 Russell King - ARM Linux wrote:
On Thu, Feb 13, 2014 at 12:27:05PM +0100, Arnd Bergmann wrote:
I would rather get rid of struct hw_pci for architecture independent drivers and add a different registration method on arm32 that is compatible with what we come up with on arm64. The main purpose of hw_pci is to allow multiple PCI controllers to be initialized at once, but we don't actually need that for any of the "modern" platforms where we already have a probe function that gets called once for each controller.
No. The main purpose of hw_pci is as a container to support multiple different platform specific PCI implementations in one kernel. It's exactly what you need for single zImage.
Well, we definitely need something to manage the assignment of domains, bus numbers and I/O space windows, but the main issue I see with existing hw_pci container is that it assumes that you can pass give it all host bridges for a given domain at once. The problem with this is that the pci host bridge drivers don't interact with one another, so a system that needs two different PCI host drivers can't use hw_pci to register them both, unless we come up with some extra infrastructure.
Also, the calling conventions for pci_common_init_dev() mean that it's hard to propagate -EPROBE_DEFER errors back to the driver probe function, so it seems easier to come up with something new that deals with all issues at once and that is outside of architecture specific code.
Arnd
On Thu, Feb 13, 2014 at 08:57:41AM +0000, Jingoo Han wrote:
-----Original Message----- From: Tanmay Inamdar [mailto:tinamdar@apm.com] Sent: Thursday, February 13, 2014 5:37 PM To: Jingoo Han Cc: Liviu Dudau; Arnd Bergmann; devicetree@vger.kernel.org; linaro-kernel; linux-pci; Will Deacon; LKML; Catalin Marinas; Bjorn Helgaas; LAKML Subject: Re: [PATCH] pci: Add support for creating a generic host_bridge from device tree
On Thu, Feb 13, 2014 at 12:10 AM, Jingoo Han jg1.han@samsung.com wrote:
On Thursday, February 06, 2014 7:18 PM, Liviu Dudau wrote:
On Wed, Feb 05, 2014 at 10:26:27PM +0000, Tanmay Inamdar wrote:
Hello Liviu,
I did not get the first email of this particular patch on any of subscribed mailing lists (don't know why), hence replying here.
Strange, it shows in the MARC and GMANE archive for linux-pci, probably a hickup on your receiving side?
+struct pci_host_bridge * +pci_host_bridge_of_init(struct device *parent, int busno, struct pci_ops *ops,
- void *host_data, struct list_head *resources)
+{
- struct pci_bus *root_bus;
- struct pci_host_bridge *bridge;
- /* first parse the host bridge bus ranges */
- if (pci_host_bridge_of_get_ranges(parent->of_node, resources))
- return NULL;
- /* then create the root bus */
- root_bus = pci_create_root_bus(parent, busno, ops, host_data, resources);
- if (!root_bus)
- return NULL;
- bridge = to_pci_host_bridge(root_bus->bridge);
- return bridge;
+}
You are keeping the domain_nr inside pci_host_bridge structure. In above API, domain_nr is required in 'pci_find_bus' function called from 'pci_create_root_bus'. Since the bridge is allocated after creating root bus, 'pci_find_bus' always gets domain_nr as 0. This will cause problem for scanning multiple domains.
Good catch. I was switching between creating a pci_controller in arch/arm64 and adding the needed bits in pci_host_bridge. After internal review I've decided to add the domain_nr to pci_host_bridge, but forgot to update the code everywhere.
Hi Liviu Dudau,
One more thing, I am reviewing and compiling your patch. Would you consider adding 'struct pci_sys_data' and 'struct hw_pci'?
Currently, 4 PCIe Host drivers (pci-mvebu.c, pci-tegra.c, pci-rcar-gen2.c, pcie-designware.c) are using 'struct pci_sys_data' and 'struct hw_pci' in their drivers. Without this, it makes build errors.
In arm32, 'struct pci_sys_data' and 'struct hw_pci' is defined in "arch/arm/include/asm/mach/pci.h".
Tanmay Inamdar, Your 'APM X-Gene PCIe' patch also needs 'struct pci_sys_data' and 'struct hw_pci'. With Liviu Dudau's patch, it will make build errors. Would you check this?
X-Gene PCIe host driver is dependent on arm64 PCI patch. My previous approach was based on 32bit arm PCI support. With Liviu's approach, I will have to make changes in host driver to get rid of hw_pci and pci_sys_data which are no longer required.
I want to use 'drivers/pci/host/pcie-designware.c' for both arm32 and arm64, without any code changes. However, it looks impossible.
I made 'drivers/pci/host/pcie-designware.c' based on 32bit arm PCI support. Then, with Liviu's patch, do I have to make new code for arm64, even though the same HW PCIe IP is used?
Hi Jingoo,
Arnd has asked about the transition path for 32bit arm PCI host bridges and I don't think we came up with a solution yet. My preferred solution would be to modify the arch/arm API to be more in line with the generic code and not have to use hw_pci and pci_sys_data anymore. That should fix your problem of having a single code base for your host controller.
If you look at the pci_sys_data and hw_pci structures, you can see that there is a bit of duplication between the two and also that some of the data contained in that structure is generic enough to be contained in the PCI infrastructure code rather than down in the arch-dependent code.
I confess that I have no in-depth knowledge of the reasons why the arm code looks like this and if it is still required with the current infrastructure code. Russell can provide us with entertaining stories here, I'm sure, on why the code looks like it does.
I have not done any work in this area yet, but if I were to update the generic arm code I would try to make the code use the pci_host_bridge structure rather than pci_sys_data. The hw_pci will disappear with the new APIs.
For arm32 drivers/pci/host/pcie-designware.c
For arm64 drivers/pci/host/pcie-designware-arm64.c
IMO it should not cause build errors for PCI host drivers dependent on architectures other than arm64. Can you post the error?
I post the build errors.
IMHO it is not worth trying to have the host bridge code cope with two APIs. It's going to lead to problems no matter what.
Best regards, Liviu
CC drivers/pci/host/pcie-designware.o CHK kernel/config_data.h drivers/pci/host/pcie-designware.c:72:52: warning: 'struct pci_sys_data' declared inside parameter list [enabled by default] static inline struct pcie_port *sys_to_pcie(struct pci_sys_data *sys) ^ drivers/pci/host/pcie-designware.c:72:52: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] drivers/pci/host/pcie-designware.c: In function 'sys_to_pcie' drivers/pci/host/pcie-designware.c:74:12: error: dereferencing pointer to incomplete type return sys->private_data; ^ drivers/pci/host/pcie-designware.c: In function 'dw_pcie_msi_map' drivers/pci/host/pcie-designware.c:384:2: error: implicit declaration of function 'set_irq_flags' [-Werror=implicit-function-declaration] set_irq_flags(irq, IRQF_VALID); ^ drivers/pci/host/pcie-designware.c:384:21: error: 'IRQF_VALID??undeclared (first use in this function) set_irq_flags(irq, IRQF_VALID); ^ drivers/pci/host/pcie-designware.c:384:21: note: each undeclared identifier is reported only once for each function it appears in drivers/pci/host/pcie-designware.c: In function 'dw_pcie_host_init' drivers/pci/host/pcie-designware.c:492:2: error: invalid use of undefined type 'struct hw_pci' dw_pci.nr_controllers = 1; ^ drivers/pci/host/pcie-designware.c:493:2: error: invalid use of undefined type 'struct hw_pci' dw_pci.private_data = (void **)&pp; ^ drivers/pci/host/pcie-designware.c:495:2: error: implicit declaration of function 'pci_common_init' [-Werror=implicit-function-declaration] pci_common_init(&dw_pci); ^ drivers/pci/host/pcie-designware.c:498:2: error: invalid use of undefined type 'struct hw_pci' dw_pci.domain++; ^ drivers/pci/host/pcie-designware.c: At top level: drivers/pci/host/pcie-designware.c:698:41: warning: 'struct pci_sys_data??declared inside parameter list [enabled by default] static int dw_pcie_setup(int nr, struct pci_sys_data *sys) ^ drivers/pci/host/pcie-designware.c: In function 'dw_pcie_setup' drivers/pci/host/pcie-designware.c:702:2: warning: passing argument 1 of 'sys_to_pcie' from incompatible pointer type [enabled by default] pp = sys_to_pcie(sys); ^ drivers/pci/host/pcie-designware.c:72:33: note: expected 'struct pci_sys_data *' but argument is of type 'struct pci_sys_data *' static inline struct pcie_port *sys_to_pcie(struct pci_sys_data *sys) ^ drivers/pci/host/pcie-designware.c:708:6: error: dereferencing pointer to incomplete type sys->io_offset = global_io_offset - pp->config.io_bus_addr; ^ drivers/pci/host/pcie-designware.c:711:31: error: dereferencing pointer to incomplete type pci_add_resource_offset(&sys->resources, &pp->io, ^ drivers/pci/host/pcie-designware.c:712:9: error: dereferencing pointer to incomplete type sys->io_offset); ^ drivers/pci/host/pcie-designware.c:715:5: error: dereferencing pointer to incomplete type sys->mem_offset = pp->mem.start - pp->config.mem_bus_addr; ^ drivers/pci/host/pcie-designware.c:716:30: error: dereferencing pointer to incomplete type pci_add_resource_offset(&sys->resources, &pp->mem, sys->mem_offset); ^ drivers/pci/host/pcie-designware.c:716:56: error: dereferencing pointer to incomplete type pci_add_resource_offset(&sys->resources, &pp->mem, sys->mem_offset); ^ drivers/pci/host/pcie-designware.c: At top level: drivers/pci/host/pcie-designware.c:721:56: warning: 'struct pci_sys_data' declared inside parameter list [enabled by default] static struct pci_bus *dw_pcie_scan_bus(int nr, struct pci_sys_data *sys) ^ drivers/pci/host/pcie-designware.c: In function 'dw_pcie_scan_bus' drivers/pci/host/pcie-designware.c:724:9: warning: passing argument 1 of 'sys_to_pcie' from incompatible pointer type [enabled by default] struct pcie_port *pp = sys_to_pcie(sys); ^ drivers/pci/host/pcie-designware.c:72:33: note: expected 'struct pci_sys_data *' but argument is of type 'sruct pci_sys_data *' static inline struct pcie_port *sys_to_pcie(struct pci_sys_data *sys) ^ drivers/pci/host/pcie-designware.c:727:24: error: dereferencing pointer to incomplete type pp->root_bus_nr = sys->busnr; ^ drivers/pci/host/pcie-designware.c:728:36: error: dereferencing pointer to incomplete type bus = pci_scan_root_bus(NULL, sys->busnr, &dw_pcie_ops, ^ drivers/pci/host/pcie-designware.c:729:15: error: dereferencing pointer to incomplete type sys, &sys->resources); ^ drivers/pci/host/pcie-designware.c: At top level: drivers/pci/host/pcie-designware.c:755:15: error: variable 'dw_pci' has initializer but incomplete type static struct hw_pci dw_pci = { ^ drivers/pci/host/pcie-designware.c:756:2: error: unknown field 'setup' specified in initializer .setup = dw_pcie_setup, ^ drivers/pci/host/pcie-designware.c:756:2: warning: excess elements in struct initializer [enabled by default] drivers/pci/host/pcie-designware.c:756:2: warning: (near initialization for 'dw_pci' [enabled by default] drivers/pci/host/pcie-designware.c:757:2: error: unknown field 'scan' specified in initializer .scan = dw_pcie_scan_bus, ^ drivers/pci/host/pcie-designware.c:757:2: warning: excess elements in struct initializer [enabled by default] drivers/pci/host/pcie-designware.c:757:2: warning: (near initialization for 'dw_pci' [enabled by default] drivers/pci/host/pcie-designware.c:758:2: error: unknown field 'map_irq' specified in initializer .map_irq = dw_pcie_map_irq, ^ drivers/pci/host/pcie-designware.c:758:2: warning: excess elements in struct initializer [enabled by default] drivers/pci/host/pcie-designware.c:758:2: warning: (near initialization for 'dw_pci' [enabled by default] drivers/pci/host/pcie-designware.c:759:2: error: unknown field 'add_bus' specified in initializer .add_bus = dw_pcie_add_bus, ^ drivers/pci/host/pcie-designware.c:759:2: warning: excess elements in struct initializer [enabled by default] drivers/pci/host/pcie-designware.c:759:2: warning: (near initialization for 'dw_pci' [enabled by default] cc1: some warnings being treated as errors make[3]: *** [drivers/pci/host/pcie-designware.o] Error 1
Thanks for reviewing this, will fix in v2.
Do you find porting to the new API straight forward?
linaro-kernel@lists.linaro.org