This is v2 of my attempt to add support for a generic pci_host_bridge controller created
from a description passed in the device tree.
Changes from v1:
- Add patch to fix conversion of IO ranges into IO resources.
- Added a domain_nr member to pci_host_bridge structure, and a new function
to create a root bus in a given domain number. In order to facilitate that
I propose changing the order of initialisation between pci_host_bridge and
it's related bus in pci_create_root_bus() as sort of a rever of 7b5436635800.
This is done in patch 1/4 and 2/4.
- Added a simple allocator of domain numbers in drivers/pci/host-bridge.c. The
code will first try to get a domain id from of_alias_get_id(..., "pci-domain")
and if that fails assign the next unallocated domain id.
- Changed the name of the function that creates the generic host bridge from
pci_host_bridge_of_init to of_create_pci_host_bridge and exported as GPL symbol.
v1 thread here: https://lkml.org/lkml/2014/2/3/380
The following is an edit of the original blurb:
Following the discussion started here [1], I now have a proposal for tackling
generic support for host bridges described via device tree. It is an initial
stab at it, to try to get feedback and suggestions, but it is functional enough
that I have PCI Express for arm64 working on an FPGA using the patch that I am
also publishing that adds support for PCI for that platform.
Looking at the existing architectures that fit the requirements (use of device
tree and PCI) yields the powerpc and microblaze as generic enough to make them
candidates for conversion. I have a tentative patch for microblaze that I can
only compile test it, unfortunately using qemu-microblaze leads to an early
crash in the kernel.
As Bjorn has mentioned in the previous discussion, the idea is to add to
struct pci_host_bridge enough data to be able to reduce the size or remove the
architecture specific pci_controller structure. arm64 support actually manages
to get rid of all the architecture static data and has no pci_controller structure
defined. For host bridge drivers that means a change of API unless architectures
decide to provide a compatibility layer (comments here please).
In order to initialise a host bridge with the new API, the following example
code is sufficient for a _probe() function:
static int myhostbridge_probe(struct platform_device *pdev)
{
int err;
struct device_node *dev;
struct pci_host_bridge *bridge;
struct myhostbridge_port *pp;
resource_size_t lastbus;
dev = pdev->dev.of_node;
if (!of_device_is_available(dev)) {
pr_warn("%s: disabled\n", dev->full_name);
return -ENODEV;
}
pp = kzalloc(sizeof(struct myhostbridge_port), GFP_KERNEL);
if (!pp)
return -ENOMEM;
bridge = of_create_pci_host_bridge(&pdev->dev, &myhostbridge_ops, pp);
if (!bridge) {
err = -EINVAL;
goto bridge_init_fail;
}
err = myhostbridge_setup(bridge->bus);
if (err)
goto bridge_init_fail;
/* We always enable PCI domains and we keep domain 0 backward
* compatible in /proc for video cards
*/
pci_add_flags(PCI_ENABLE_PROC_DOMAINS | PCI_COMPAT_DOMAIN_0);
pci_add_flags(PCI_REASSIGN_ALL_BUS | PCI_REASSIGN_ALL_RSRC);
lastbus = pci_scan_child_bus(bridge->bus);
pci_bus_update_busn_res_end(bridge->bus, lastbus);
pci_assign_unassigned_bus_resources(bridge->bus);
pci_bus_add_devices(bridge->bus);
return 0;
bridge_init_fail:
kfree(pp);
return err;
}
[1] http://thread.gmane.org/gmane.linux.kernel.pci/25946
Best regards,
Liviu
Liviu Dudau (4):
pci: OF: Fix the conversion of IO ranges into IO resources.
pci: Create pci_host_bridge before its associated bus in pci_create_root_bus.
pci: Introduce a domain number for pci_host_bridge.
pci: Add support for creating a generic host_bridge from device tree
drivers/of/address.c | 31 +++++++++++
drivers/pci/host-bridge.c | 134 +++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/probe.c | 66 ++++++++++++++--------
include/linux/of_address.h | 13 +----
include/linux/pci.h | 17 ++++++
5 files changed, 227 insertions(+), 34 deletions(-)
--
1.9.0
Hi,
This patch adds support for PCI to AArch64. It is based on my v5 patch
that adds support for creating generic host bridge structure from
device tree. With that in place, I was able to boot a platform that
has PCIe host bridge support and use a PCIe network card.
Changes from v4:
- Fixed the pci_domain_nr() implementation for arm64. Now we use
find_pci_host_bride() to find the host bridge before we retrieve
the domain number.
Changes from v3:
- Added Acks accumulated so far ;)
- Still carrying Catalin's patch for moving the PCI_IO_BASE until it
lands in linux-next or mainline, in order to ease applying the series
Changes from v2:
- Implement an arch specific version of pci_register_io_range() and
pci_address_to_pio().
- Return 1 from pci_proc_domain().
Changes from v1:
- Added Catalin's patch for moving the PCI_IO_BASE location and extend
its size to 16MB
- Integrated Arnd's version of pci_ioremap_io that uses a bitmap for
keeping track of assigned IO space and returns an io_offset. At the
moment the code is added in arch/arm64 but it can be moved in drivers/pci.
- Added a fix for the generic ioport_map() function when !CONFIG_GENERIC_IOMAP
as suggested by Arnd.
v4 thread here: https://lkml.org/lkml/2014/3/3/298
v3 thread here: https://lkml.org/lkml/2014/2/28/211
v2 thread here: https://lkml.org/lkml/2014/2/27/255
v1 thread here: https://lkml.org/lkml/2014/2/3/389
The API used is different from the one used by ARM architecture. There is
no pci_common_init_dev() function and no hw_pci structure, as that is no
longer needed. Once the last signature is added to the legal agreement, I
will post the host bridge driver code that I am using. Meanwhile, here
is an example of what the probe function looks like, posted as an example:
static int myhostbridge_probe(struct platform_device *pdev)
{
int err;
struct device_node *dev;
struct pci_host_bridge *bridge;
struct myhostbridge_port *pp;
resource_size_t lastbus;
dev = pdev->dev.of_node;
if (!of_device_is_available(dev)) {
pr_warn("%s: disabled\n", dev->full_name);
return -ENODEV;
}
pp = kzalloc(sizeof(struct myhostbridge_port), GFP_KERNEL);
if (!pp)
return -ENOMEM;
bridge = of_create_pci_host_bridge(&pdev->dev, &myhostbridge_ops, pp);
if (IS_ERR(bridge)) {
err = PTR_ERR(bridge);
goto bridge_create_fail;
}
err = myhostbridge_setup(bridge->bus);
if (err)
goto bridge_setup_fail;
/* We always enable PCI domains and we keep domain 0 backward
* compatible in /proc for video cards
*/
pci_add_flags(PCI_ENABLE_PROC_DOMAINS);
pci_add_flags(PCI_REASSIGN_ALL_BUS | PCI_REASSIGN_ALL_RSRC);
lastbus = pci_scan_child_bus(bridge->bus);
pci_bus_update_busn_res_end(bridge->bus, lastbus);
pci_assign_unassigned_bus_resources(bridge->bus);
pci_bus_add_devices(bridge->bus);
return 0;
bridge_setup_fail:
put_device(&bridge->dev);
device_unregister(&bridge->dev);
bridge_create_fail:
kfree(pp);
return err;
}
Best regards,
Liviu
Catalin Marinas (1):
arm64: Extend the PCI I/O space to 16MB
Liviu Dudau (2):
Fix ioport_map() for !CONFIG_GENERIC_IOMAP cases.
arm64: Add architecture support for PCI
Documentation/arm64/memory.txt | 16 ++--
arch/arm64/Kconfig | 19 ++++-
arch/arm64/include/asm/Kbuild | 1 +
arch/arm64/include/asm/io.h | 5 +-
arch/arm64/include/asm/pci.h | 49 ++++++++++++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/pci.c | 178 +++++++++++++++++++++++++++++++++++++++++
include/asm-generic/io.h | 2 +-
8 files changed, 261 insertions(+), 10 deletions(-)
create mode 100644 arch/arm64/include/asm/pci.h
create mode 100644 arch/arm64/kernel/pci.c
--
1.9.0
This patch adds cpufreq suspend/resume calls to dpm_{suspend|resume}() for
handling suspend/resume of cpufreq governors.
Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found an issue where tunables
configuration for clusters/sockets with non-boot CPUs was getting lost after
suspend/resume, as we were notifying governors with CPUFREQ_GOV_POLICY_EXIT on
removal of the last cpu for that policy and so deallocating memory for tunables.
This is fixed by this patch as we don't allow any operation on governors after
device suspend and before device resume now.
We could have added these callbacks at dpm_{suspend|resume}_noirq() level but
the problem here is that most of the devices (i.e. devices with ->suspend()
callbacks) have already been suspended by now and so if drivers want to change
frequency before suspending, then it might not be possible for many platforms
(which depend on other peripherals like i2c, regulators, etc).
Reported-and-tested-by: Lan Tianyu <tianyu.lan(a)intel.com>
Reported-by: Jinhyuk Choi <jinchoi(a)broadcom.com>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
V5->V6: 1-2-3/7 merged into 1/5
drivers/base/power/main.c | 5 +++
drivers/cpufreq/cpufreq.c | 111 +++++++++++++++++++++++-----------------------
include/linux/cpufreq.h | 8 ++++
3 files changed, 69 insertions(+), 55 deletions(-)
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 42355e4..86d5e4f 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -29,6 +29,7 @@
#include <linux/async.h>
#include <linux/suspend.h>
#include <trace/events/power.h>
+#include <linux/cpufreq.h>
#include <linux/cpuidle.h>
#include <linux/timer.h>
@@ -866,6 +867,8 @@ void dpm_resume(pm_message_t state)
mutex_unlock(&dpm_list_mtx);
async_synchronize_full();
dpm_show_time(starttime, state, NULL);
+
+ cpufreq_resume();
}
/**
@@ -1434,6 +1437,8 @@ int dpm_suspend(pm_message_t state)
might_sleep();
+ cpufreq_suspend();
+
mutex_lock(&dpm_list_mtx);
pm_transition = state;
async_error = 0;
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 56b7b1b..2e43c08 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -26,7 +26,7 @@
#include <linux/module.h>
#include <linux/mutex.h>
#include <linux/slab.h>
-#include <linux/syscore_ops.h>
+#include <linux/suspend.h>
#include <linux/tick.h>
#include <trace/events/power.h>
@@ -47,6 +47,9 @@ static LIST_HEAD(cpufreq_policy_list);
static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
#endif
+/* Flag to suspend/resume CPUFreq governors */
+static bool cpufreq_suspended;
+
static inline bool has_target(void)
{
return cpufreq_driver->target_index || cpufreq_driver->target;
@@ -1576,82 +1579,77 @@ static struct subsys_interface cpufreq_interface = {
};
/**
- * cpufreq_bp_suspend - Prepare the boot CPU for system suspend.
+ * cpufreq_suspend() - Suspend CPUFreq governors
*
- * This function is only executed for the boot processor. The other CPUs
- * have been put offline by means of CPU hotplug.
+ * Called during system wide Suspend/Hibernate cycles for suspending governors
+ * as some platforms can't change frequency after this point in suspend cycle.
+ * Because some of the devices (like: i2c, regulators, etc) they use for
+ * changing frequency are suspended quickly after this point.
*/
-static int cpufreq_bp_suspend(void)
+void cpufreq_suspend(void)
{
- int ret = 0;
-
- int cpu = smp_processor_id();
struct cpufreq_policy *policy;
- pr_debug("suspending cpu %u\n", cpu);
+ if (!cpufreq_driver)
+ return;
- /* If there's no policy for the boot CPU, we have nothing to do. */
- policy = cpufreq_cpu_get(cpu);
- if (!policy)
- return 0;
+ if (!has_target())
+ return;
- if (cpufreq_driver->suspend) {
- ret = cpufreq_driver->suspend(policy);
- if (ret)
- printk(KERN_ERR "cpufreq: suspend failed in ->suspend "
- "step on CPU %u\n", policy->cpu);
+ pr_debug("%s: Suspending Governors\n", __func__);
+
+ list_for_each_entry(policy, &cpufreq_policy_list, policy_list) {
+ if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
+ pr_err("%s: Failed to stop governor for policy: %p\n",
+ __func__, policy);
+ else if (cpufreq_driver->suspend
+ && cpufreq_driver->suspend(policy))
+ pr_err("%s: Failed to suspend driver: %p\n", __func__,
+ policy);
}
- cpufreq_cpu_put(policy);
- return ret;
+ cpufreq_suspended = true;
}
/**
- * cpufreq_bp_resume - Restore proper frequency handling of the boot CPU.
+ * cpufreq_resume() - Resume CPUFreq governors
*
- * 1.) resume CPUfreq hardware support (cpufreq_driver->resume())
- * 2.) schedule call cpufreq_update_policy() ASAP as interrupts are
- * restored. It will verify that the current freq is in sync with
- * what we believe it to be. This is a bit later than when it
- * should be, but nonethteless it's better than calling
- * cpufreq_driver->get() here which might re-enable interrupts...
- *
- * This function is only executed for the boot CPU. The other CPUs have not
- * been turned on yet.
+ * Called during system wide Suspend/Hibernate cycle for resuming governors that
+ * are suspended with cpufreq_suspend().
*/
-static void cpufreq_bp_resume(void)
+void cpufreq_resume(void)
{
- int ret = 0;
-
- int cpu = smp_processor_id();
struct cpufreq_policy *policy;
- pr_debug("resuming cpu %u\n", cpu);
+ if (!cpufreq_driver)
+ return;
- /* If there's no policy for the boot CPU, we have nothing to do. */
- policy = cpufreq_cpu_get(cpu);
- if (!policy)
+ if (!has_target())
return;
- if (cpufreq_driver->resume) {
- ret = cpufreq_driver->resume(policy);
- if (ret) {
- printk(KERN_ERR "cpufreq: resume failed in ->resume "
- "step on CPU %u\n", policy->cpu);
- goto fail;
- }
- }
+ pr_debug("%s: Resuming Governors\n", __func__);
- schedule_work(&policy->update);
+ cpufreq_suspended = false;
-fail:
- cpufreq_cpu_put(policy);
-}
+ list_for_each_entry(policy, &cpufreq_policy_list, policy_list) {
+ if (__cpufreq_governor(policy, CPUFREQ_GOV_START)
+ || __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS))
+ pr_err("%s: Failed to start governor for policy: %p\n",
+ __func__, policy);
+ else if (cpufreq_driver->resume
+ && cpufreq_driver->resume(policy))
+ pr_err("%s: Failed to resume driver: %p\n", __func__,
+ policy);
-static struct syscore_ops cpufreq_syscore_ops = {
- .suspend = cpufreq_bp_suspend,
- .resume = cpufreq_bp_resume,
-};
+ /*
+ * schedule call cpufreq_update_policy() for boot CPU, i.e. last
+ * policy in list. It will verify that the current freq is in
+ * sync with what we believe it to be.
+ */
+ if (list_is_last(&policy->policy_list, &cpufreq_policy_list))
+ schedule_work(&policy->update);
+ }
+}
/**
* cpufreq_get_current_driver - return current driver's name
@@ -1868,6 +1866,10 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
struct cpufreq_governor *gov = NULL;
#endif
+ /* Don't start any governor operations if we are entering suspend */
+ if (cpufreq_suspended)
+ return 0;
+
if (policy->governor->max_transition_latency &&
policy->cpuinfo.transition_latency >
policy->governor->max_transition_latency) {
@@ -2407,7 +2409,6 @@ static int __init cpufreq_core_init(void)
cpufreq_global_kobject = kobject_create();
BUG_ON(!cpufreq_global_kobject);
- register_syscore_ops(&cpufreq_syscore_ops);
return 0;
}
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 4d89e0e..94ed907 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -296,6 +296,14 @@ cpufreq_verify_within_cpu_limits(struct cpufreq_policy *policy)
policy->cpuinfo.max_freq);
}
+#ifdef CONFIG_CPU_FREQ
+void cpufreq_suspend(void);
+void cpufreq_resume(void);
+#else
+static inline void cpufreq_suspend(void) {}
+static inline void cpufreq_resume(void) {}
+#endif
+
/*********************************************************************
* CPUFREQ NOTIFIER INTERFACE *
*********************************************************************/
--
1.7.12.rc2.18.g61b472e
We call __find_governor() during addition of first CPU of every policy from
__cpufreq_add_dev() to find the last governor used for this CPU before it was
hotplugged-out.
After that we call cpufreq_parse_governor() in cpufreq_init_policy() either with
this governor or default governor. And right after that policy->governor is set
to NULL.
There is no functional problem here with this code, but it is just that some of
the code required in cpufreq_init_policy() is being called by its caller, i.e.
__cpufreq_add_dev(). So, it would make more sense to get all these together at a
single place to make code more readable.
So, instead of doing this move the relevant parts to cpufreq_init_policy()
policy only and initialize policy->governor to NULL at the beginning.
In order to clean the code a bit more, some of the #ifdefs for
CONFIG_HOTPLUG_CPU are also removed which used to maintain the last governor
used for any CPU. Keeping that code wouldn't harm us much but would improve the
cleanliness of code.
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
V1->V2:
- Improved commit logs
- Removed unrelated whitespace changes
- Removed some #ifdef CONFIG_HOTPLUG_CPU to clean code a bit.
drivers/cpufreq/cpufreq.c | 38 ++++++++++++++------------------------
1 file changed, 14 insertions(+), 24 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 56b7b1b..fff2968 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -42,10 +42,8 @@ static DEFINE_RWLOCK(cpufreq_driver_lock);
DEFINE_MUTEX(cpufreq_governor_lock);
static LIST_HEAD(cpufreq_policy_list);
-#ifdef CONFIG_HOTPLUG_CPU
/* This one keeps track of the previously set governor of a removed CPU */
static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
-#endif
static inline bool has_target(void)
{
@@ -879,18 +877,25 @@ err_out_kobj_put:
static void cpufreq_init_policy(struct cpufreq_policy *policy)
{
+ struct cpufreq_governor *gov = NULL;
struct cpufreq_policy new_policy;
int ret = 0;
memcpy(&new_policy, policy, sizeof(*policy));
+ /* Update governor of new_policy to the governor used before hotplug */
+ gov = __find_governor(per_cpu(cpufreq_cpu_governor, policy->cpu));
+ if (gov)
+ pr_debug("Restoring governor %s for cpu %d\n",
+ policy->governor->name, policy->cpu);
+ else
+ gov = CPUFREQ_DEFAULT_GOVERNOR;
+
+ new_policy.governor = gov;
+
/* Use the default policy if its valid. */
if (cpufreq_driver->setpolicy)
- cpufreq_parse_governor(policy->governor->name,
- &new_policy.policy, NULL);
-
- /* assure that the starting sequence is run in cpufreq_set_policy */
- policy->governor = NULL;
+ cpufreq_parse_governor(gov->name, &new_policy.policy, NULL);
/* set default policy */
ret = cpufreq_set_policy(policy, &new_policy);
@@ -949,6 +954,8 @@ static struct cpufreq_policy *cpufreq_policy_restore(unsigned int cpu)
read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+ policy->governor = NULL;
+
return policy;
}
@@ -1036,7 +1043,6 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
unsigned long flags;
#ifdef CONFIG_HOTPLUG_CPU
struct cpufreq_policy *tpolicy;
- struct cpufreq_governor *gov;
#endif
if (cpu_is_offline(cpu))
@@ -1094,7 +1100,6 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
else
policy->cpu = cpu;
- policy->governor = CPUFREQ_DEFAULT_GOVERNOR;
cpumask_copy(policy->cpus, cpumask_of(cpu));
init_completion(&policy->kobj_unregister);
@@ -1179,15 +1184,6 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
CPUFREQ_START, policy);
-#ifdef CONFIG_HOTPLUG_CPU
- gov = __find_governor(per_cpu(cpufreq_cpu_governor, cpu));
- if (gov) {
- policy->governor = gov;
- pr_debug("Restoring governor %s for cpu %d\n",
- policy->governor->name, cpu);
- }
-#endif
-
if (!frozen) {
ret = cpufreq_add_dev_interface(policy, dev);
if (ret)
@@ -1312,11 +1308,9 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
}
}
-#ifdef CONFIG_HOTPLUG_CPU
if (!cpufreq_driver->setpolicy)
strncpy(per_cpu(cpufreq_cpu_governor, cpu),
policy->governor->name, CPUFREQ_NAME_LEN);
-#endif
down_read(&policy->rwsem);
cpus = cpumask_weight(policy->cpus);
@@ -1955,9 +1949,7 @@ EXPORT_SYMBOL_GPL(cpufreq_register_governor);
void cpufreq_unregister_governor(struct cpufreq_governor *governor)
{
-#ifdef CONFIG_HOTPLUG_CPU
int cpu;
-#endif
if (!governor)
return;
@@ -1965,14 +1957,12 @@ void cpufreq_unregister_governor(struct cpufreq_governor *governor)
if (cpufreq_disabled())
return;
-#ifdef CONFIG_HOTPLUG_CPU
for_each_present_cpu(cpu) {
if (cpu_online(cpu))
continue;
if (!strcmp(per_cpu(cpufreq_cpu_governor, cpu), governor->name))
strcpy(per_cpu(cpufreq_cpu_governor, cpu), "\0");
}
-#endif
mutex_lock(&cpufreq_governor_mutex);
list_del(&governor->governor_list);
--
1.7.12.rc2.18.g61b472e