CPPC:
====
CPPC (Collaborative Processor Performance Control) is a new way to control CPU performance using an abstract continuous scale, as opposed to a discretized P-state scale which is tied to CPU frequency only. It is defined in the ACPI 5.0+ spec. In brief, the basic operation involves:

- OS makes a CPU performance request. (Can provide min and max tolerable bounds)
- Platform (such as BMC) is free to optimize the request within the requested bounds depending on power/thermal budgets etc.
- Platform conveys its decision back to OS.
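The three steps above can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct perf_request`, `platform_resolve()` and the single `budget_cap` number are hypothetical names invented for illustration. They are not part of this patchset or of the ACPI spec, and a real platform is free to apply any policy within the requested bounds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical request layout for illustration; not the kernel's structure. */
struct perf_request {
	uint32_t min_perf;	/* lowest level the OS will tolerate */
	uint32_t desired_perf;	/* level the OS would like to get */
	uint32_t max_perf;	/* highest level the OS will tolerate */
};

/*
 * Model of the platform's decision: grant the desired level if the
 * power/thermal budget allows it, otherwise fall back to the budget,
 * but never leave the [min_perf, max_perf] window given by the OS.
 * The returned value is what the platform conveys back to the OS.
 */
static uint32_t platform_resolve(const struct perf_request *req,
				 uint32_t budget_cap)
{
	uint32_t granted;

	assert(req->min_perf <= req->max_perf);

	granted = req->desired_perf;
	if (granted > budget_cap)
		granted = budget_cap;
	if (granted > req->max_perf)
		granted = req->max_perf;
	if (granted < req->min_perf)
		granted = req->min_perf;

	return granted;
}
```

With a request of min 20, desired 80, max 100, a budget of 60 yields 60, a generous budget yields the desired 80, and a budget below the minimum still yields 20 because the OS bound wins in this model.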
The communication between OS and platform occurs through another medium called the Platform Communication Channel (PCC). This is a generic mailbox-like mechanism which includes doorbell semantics to indicate register updates. See drivers/mailbox/pcc.c.
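As a rough illustration of the doorbell semantics, the sketch below models a shared region with a command word, a status word and a doorbell flag. Everything here is assumed for the example: `struct shared_region`, `CMD_COMPLETE`, `MAX_POLLS`, `platform_service()` and `send_cmd()` are hypothetical names, the "platform" is a plain function call rather than a remote processor, and the real PCC layout and status bits are the ones defined by the ACPI spec and drivers/mailbox/pcc.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_COMPLETE	0x1u	/* hypothetical completion bit for this sketch */
#define MAX_POLLS	500	/* bounded polling, so a dead platform cannot hang us */

struct shared_region {
	uint16_t command;
	uint16_t status;
	int doorbell;		/* stands in for the doorbell register */
};

/* Stand-in for the remote processor reacting to a doorbell ring. */
static void platform_service(struct shared_region *shm)
{
	if (shm->doorbell) {
		shm->doorbell = 0;
		shm->status |= CMD_COMPLETE;
	}
}

/* Returns 0 once the platform flags completion, -1 if it never does. */
static int send_cmd(struct shared_region *shm, uint16_t cmd, int platform_alive)
{
	int polls;

	assert(shm != NULL);

	shm->command = cmd;	/* write the command into the shared region */
	shm->status = 0;	/* clear the completion flag */
	shm->doorbell = 1;	/* ring the doorbell */

	for (polls = 0; polls < MAX_POLLS; polls++) {
		if (platform_alive)
			platform_service(shm);
		if (shm->status & CMD_COMPLETE)
			return 0;
	}

	return -1;
}
```

The bounded loop mirrors the general idea of retrying a fixed number of times before reporting an error instead of waiting forever.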
This patchset introduces a CPPC based CPUFreq driver that works with existing governors such as ondemand. The CPPC table parsing and the CPPC communication semantics are abstracted into separate files to allow future CPPC based drivers to implement their own governors if required.
Initial patchsets included an adaptation of the PID governor from intel_pstate.c. However, recent experiments led to extensive modifications of the algorithm used to calculate CPU busyness. Until it is verified that these changes are worthwhile, the existing governors should provide a good enough starting point for ARM64 servers.
Finer details about the PCC and CPPC spec are available in the latest ACPI 5.1 specification.[2]
Testing:
=======
This was tested on an SBSA compatible ARMv8 server with CPPCv2 firmware running on a remote processor. I verified that each CPU's performance limits were detected and that the ondemand governor made new performance requests proportional to the load on each CPU. I also verified that the acpi_processor driver correctly maps physical CPU IDs to logical CPU IDs, which helps in picking up the proper _CPC details from a processor object in the case where physical CPU IDs are not contiguous.
Changes since V7:
- Simplified new Kconfig options for PSS and idle.
- Separated patch to enable ACPI processor on ARM64.
- Removed redundant Kconfig cross deps on PCC.
- Decoupled processor_perflib from new PSS Kconfig option.

Changes since V6:
- Separated PSS and CST from ACPI processor driver in two patches.
- Made new Kconfig symbols auto selectable from Arch Kconfigs.

Changes since V5:
- Checkpatch cleanups.
- Change pss_init to pss_perf_init, as recommended by Srinivas Pandruvada.
- Explicit comment explaining why postcore_initcall is used for the PCC mailbox.
- Fold acpi_processor_syscore_init/exit into CONFIG_ACPI_CST.
- Added patch with dummy functions used by ACPI_HOTPLUG_CPU.

Changes since V4:
- Misc cleanups. Addressed feedback from Rafael.
- Made acpi_processor.c independent of C-states, P-states and others.
- Per CPU scanning for _CPC is now made from acpi_processor.c.
- Added new Kconfig options for legacy C states and P states to enable future support for newer alternatives as defined in the ACPI spec 6.0.

Changes since V3:
- Split CPPC backend methods into separate files.
- Add frontend driver which plugs into existing CPUfreq governors.
- Simplify PCC driver by moving communication space mapping and read/write into client drivers.

Changes since V2:
- Select driver if !X86, since intel_pstate will use HWP extensions instead.
- Added more comments.
- Added Freq domain awareness and PSD parsing.

Changes since V1:
- Create a new driver based on Dirk's suggestion.
- Fold in CPPC backend hooks into main driver.

Changes since V0: [1]
- Split intel_pstate.c into a generic PID governor and platform specific backend.
- Add CPPC accessors as PID backend.
[1] - http://lwn.net/Articles/608715/
[2] - http://www.uefi.org/sites/default/files/resources/ACPI_5_1release.pdf
[3] - https://patches.linaro.org/40705/
Ashwin Chaugule (9):
  PCC: Initialize PCC Mailbox earlier at boot
  ACPI: Split out ACPI PSS from ACPI Processor driver
  ACPI: Decouple ACPI idle and ACPI processor drivers
  ACPI: Introduce CPU performance controls using CPPC
  CPPC: Add a CPUFreq driver for use with CPPC
  ACPI: Add weak routines for ACPI CPU Hotplug
  CPPC: Probe for CPPC tables for each ACPI Processor object
  PCC: Disable compilation by default
  ACPI: Allow selection of the ACPI processor driver for ARM64
 drivers/acpi/Kconfig            |  35 +-
 drivers/acpi/Makefile           |   7 +-
 drivers/acpi/acpi_processor.c   |  18 +
 drivers/acpi/cppc_acpi.c        | 812 ++++++++++++++++++++++++++++++++++++++++
 drivers/acpi/processor_driver.c |  90 +++--
 drivers/cpufreq/Kconfig.arm     |  17 +
 drivers/cpufreq/Makefile        |   2 +
 drivers/cpufreq/cppc_cpufreq.c  | 197 ++++++++++
 drivers/mailbox/Kconfig         |   1 +
 drivers/mailbox/pcc.c           |   8 +-
 include/acpi/cppc_acpi.h        | 137 +++++++
 include/acpi/processor.h        |  63 +++-
 12 files changed, 1345 insertions(+), 42 deletions(-)
 create mode 100644 drivers/acpi/cppc_acpi.c
 create mode 100644 drivers/cpufreq/cppc_cpufreq.c
 create mode 100644 include/acpi/cppc_acpi.h
This change initializes the PCC Mailbox earlier than the ACPI processor driver. This enables drivers introduced in follow-up patches (e.g. CPPC) to be probed via the ACPI processor driver interface. The CPPC probe requires the PCC channel to be initialized so that it can query each CPU's performance capabilities.
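The ordering argument can be illustrated with a small userspace model of leveled init calls. This is a sketch, not kernel code: `run_initcalls()`, `boot_model()` and the `*_model()` functions are hypothetical, and placing the processor probe at the subsys level is an assumption made for the example. The level numbers follow the kernel's initcall ordering, in which postcore runs before subsys, which runs before device.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of initcall levels; lower levels run first. */
enum level { LVL_POSTCORE = 2, LVL_SUBSYS = 4, LVL_DEVICE = 6 };

static int pcc_ready;
static int probe_saw_pcc = -1;

/* Hypothetical stand-ins for the mailbox init and the processor probe. */
static void pcc_init_model(void) { pcc_ready = 1; }
static void processor_probe_model(void) { probe_saw_pcc = pcc_ready; }

struct initcall {
	enum level level;
	void (*fn)(void);
};

/* Run every registered call, one level at a time, in ascending order. */
static void run_initcalls(const struct initcall *calls, size_t n)
{
	int lvl;
	size_t i;

	assert(calls != NULL);

	for (lvl = 0; lvl <= 7; lvl++)
		for (i = 0; i < n; i++)
			if ((int)calls[i].level == lvl)
				calls[i].fn();
}

/* Returns 1 if the probe found the mailbox ready, 0 otherwise. */
static int boot_model(enum level pcc_level)
{
	struct initcall calls[2];

	calls[0].level = LVL_SUBSYS;	/* assumed level of the probe */
	calls[0].fn = processor_probe_model;
	calls[1].level = pcc_level;
	calls[1].fn = pcc_init_model;

	pcc_ready = 0;
	probe_saw_pcc = -1;
	run_initcalls(calls, 2);

	return probe_saw_pcc;
}
```

In this model, registering the mailbox at the device level means the probe runs before the mailbox is ready, while the postcore level makes it available in time.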
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Reviewed-by: Al Stone <al.stone@linaro.org>
---
 drivers/mailbox/pcc.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
index 26d121d..f814313 100644
--- a/drivers/mailbox/pcc.c
+++ b/drivers/mailbox/pcc.c
@@ -352,4 +352,10 @@ static int __init pcc_init(void)

 	return 0;
 }
-device_initcall(pcc_init);
+
+/*
+ * Make pcc init postcore so that users of this mailbox
+ * such as the ACPI Processor driver have it available
+ * at their init.
+ */
+postcore_initcall(pcc_init);
The ACPI processor driver is currently tied too closely to the ACPI P-states (PSS) and other related constructs for controlling CPU performance.
The newer ACPI specification (v5.1 onwards) introduces alternative methods to PSS. These new mechanisms are described within each ACPI Processor object and so they need to be scanned whenever a new Processor object is detected. This patch introduces a new Kconfig symbol to allow for finer configurability between the two options for controlling performance states. There is no change in functionality and the option is auto-selected by the architectures which support it.
The following patches introduce CPPC, a newer method of controlling CPU performance. The OS is not expected to support both CPPC and PSS at runtime, so the Kconfig option lets us make the two mutually exclusive at compile time.
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
---
 drivers/acpi/Kconfig            | 15 ++++---
 drivers/acpi/Makefile           |  5 ++-
 drivers/acpi/processor_driver.c | 86 +++++++++++++++++++++++++++--------------
 include/acpi/processor.h        | 28 +++++++++++++-
 4 files changed, 96 insertions(+), 38 deletions(-)
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 114cf48..d6e2a86 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -189,17 +189,20 @@ config ACPI_DOCK
 	  This driver supports ACPI-controlled docking stations and removable
 	  drive bays such as the IBM Ultrabay and the Dell Module Bay.

+config ACPI_CPU_FREQ_PSS
+	bool
+	select THERMAL
+
 config ACPI_PROCESSOR
 	tristate "Processor"
-	select THERMAL
-	select CPU_IDLE
 	depends on X86 || IA64
+	select CPU_IDLE
+	select ACPI_CPU_FREQ_PSS
 	default y
 	help
-	  This driver installs ACPI as the idle handler for Linux and uses
-	  ACPI C2 and C3 processor states to save power on systems that
-	  support it.  It is required by several flavors of cpufreq
-	  performance-state drivers.
+	  This driver adds support for the ACPI Processor package. It is required
+	  by several flavors of cpufreq performance-state, thermal, throttling and
+	  idle drivers.

 	  To compile this driver as a module, choose M here:
 	  the module will be called processor.
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 8321430..7e97aef 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -80,8 +80,9 @@ obj-$(CONFIG_ACPI_CUSTOM_METHOD)+= custom_method.o
 obj-$(CONFIG_ACPI_BGRT)		+= bgrt.o

 # processor has its own "processor." module_param namespace
-processor-y			:= processor_driver.o processor_throttling.o
-processor-y			+= processor_idle.o processor_thermal.o
+processor-y			:= processor_driver.o processor_idle.o
+processor-$(CONFIG_ACPI_CPU_FREQ_PSS)	+= processor_throttling.o	\
+	processor_thermal.o
 processor-$(CONFIG_CPU_FREQ)	+= processor_perflib.o

 obj-$(CONFIG_ACPI_PROCESSOR_AGGREGATOR) += acpi_pad.o
diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index d9f7158..16d44ad 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -163,34 +163,24 @@ static struct notifier_block __refdata acpi_cpu_notifier = {
 	    .notifier_call = acpi_cpu_soft_notify,
 };

-static int __acpi_processor_start(struct acpi_device *device)
+#ifdef CONFIG_ACPI_CPU_FREQ_PSS
+static int acpi_pss_perf_init(struct acpi_processor *pr,
+		struct acpi_device *device)
 {
-	struct acpi_processor *pr = acpi_driver_data(device);
-	acpi_status status;
 	int result = 0;

-	if (!pr)
-		return -ENODEV;
-
-	if (pr->flags.need_hotplug_init)
-		return 0;
-
-#ifdef CONFIG_CPU_FREQ
 	acpi_processor_ppc_has_changed(pr, 0);
-#endif
+
 	acpi_processor_get_throttling_info(pr);

 	if (pr->flags.throttling)
 		pr->flags.limit = 1;

-	if (!cpuidle_get_driver() || cpuidle_get_driver() == &acpi_idle_driver)
-		acpi_processor_power_init(pr);
-
 	pr->cdev = thermal_cooling_device_register("Processor", device,
 						   &processor_cooling_ops);
 	if (IS_ERR(pr->cdev)) {
 		result = PTR_ERR(pr->cdev);
-		goto err_power_exit;
+		return result;
 	}

 	dev_dbg(&device->dev, "registered as cooling_device%d\n",
@@ -204,6 +194,7 @@ static int __acpi_processor_start(struct acpi_device *device)
 			"Failed to create sysfs link 'thermal_cooling'\n");
 		goto err_thermal_unregister;
 	}
+
 	result = sysfs_create_link(&pr->cdev->device.kobj,
 				   &device->dev.kobj,
 				   "device");
@@ -213,17 +204,61 @@ static int __acpi_processor_start(struct acpi_device *device)
 		goto err_remove_sysfs_thermal;
 	}

-	status = acpi_install_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
-					     acpi_processor_notify, device);
-	if (ACPI_SUCCESS(status))
-		return 0;
-
 	sysfs_remove_link(&pr->cdev->device.kobj, "device");
  err_remove_sysfs_thermal:
 	sysfs_remove_link(&device->dev.kobj, "thermal_cooling");
  err_thermal_unregister:
 	thermal_cooling_device_unregister(pr->cdev);
- err_power_exit:
+
+	return result;
+}
+
+static void acpi_pss_perf_exit(struct acpi_processor *pr,
+		struct acpi_device *device)
+{
+	if (pr->cdev) {
+		sysfs_remove_link(&device->dev.kobj, "thermal_cooling");
+		sysfs_remove_link(&pr->cdev->device.kobj, "device");
+		thermal_cooling_device_unregister(pr->cdev);
+		pr->cdev = NULL;
+	}
+}
+#else
+static inline int acpi_pss_perf_init(struct acpi_processor *pr,
+		struct acpi_device *device)
+{
+	return 0;
+}
+
+static inline void acpi_pss_perf_exit(struct acpi_processor *pr,
+		struct acpi_device *device) {}
+#endif /* CONFIG_ACPI_CPU_FREQ_PSS */
+
+static int __acpi_processor_start(struct acpi_device *device)
+{
+	struct acpi_processor *pr = acpi_driver_data(device);
+	acpi_status status;
+	int result = 0;
+
+	if (!pr)
+		return -ENODEV;
+
+	if (pr->flags.need_hotplug_init)
+		return 0;
+
+	if (!cpuidle_get_driver() || cpuidle_get_driver() == &acpi_idle_driver)
+		acpi_processor_power_init(pr);
+
+	result = acpi_pss_perf_init(pr, device);
+	if (result)
+		goto err_power_exit;
+
+	status = acpi_install_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
+					     acpi_processor_notify, device);
+	if (ACPI_SUCCESS(status))
+		return 0;
+
+err_power_exit:
 	acpi_processor_power_exit(pr);
 	return result;
 }
@@ -252,15 +287,10 @@ static int acpi_processor_stop(struct device *dev)
 	pr = acpi_driver_data(device);
 	if (!pr)
 		return 0;
-
 	acpi_processor_power_exit(pr);

-	if (pr->cdev) {
-		sysfs_remove_link(&device->dev.kobj, "thermal_cooling");
-		sysfs_remove_link(&pr->cdev->device.kobj, "device");
-		thermal_cooling_device_unregister(pr->cdev);
-		pr->cdev = NULL;
-	}
+	acpi_pss_perf_exit(pr, device);
+
 	return 0;
 }

diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index 4188a4d..b6c9178 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -318,6 +318,7 @@ int acpi_get_cpuid(acpi_handle, int type, u32 acpi_id);
 void acpi_processor_set_pdc(acpi_handle handle);

 /* in processor_throttling.c */
+#ifdef CONFIG_ACPI_CPU_FREQ_PSS
 int acpi_processor_tstate_has_changed(struct acpi_processor *pr);
 int acpi_processor_get_throttling_info(struct acpi_processor *pr);
 extern int acpi_processor_set_throttling(struct acpi_processor *pr,
@@ -330,6 +331,29 @@ extern void acpi_processor_reevaluate_tstate(struct acpi_processor *pr,
 			unsigned long action);
 extern const struct file_operations acpi_processor_throttling_fops;
 extern void acpi_processor_throttling_init(void);
+#else
+static inline int acpi_processor_tstate_has_changed(struct acpi_processor *pr)
+{
+	return 0;
+}
+
+static inline int acpi_processor_get_throttling_info(struct acpi_processor *pr)
+{
+	return -ENODEV;
+}
+
+static inline int acpi_processor_set_throttling(struct acpi_processor *pr,
+		int state, bool force)
+{
+	return -ENODEV;
+}
+
+static inline void acpi_processor_reevaluate_tstate(struct acpi_processor *pr,
+		unsigned long action) {}
+
+static inline void acpi_processor_throttling_init(void) {}
+#endif /* CONFIG_ACPI_CPU_FREQ_PSS */
+
 /* in processor_idle.c */
 int acpi_processor_power_init(struct acpi_processor *pr);
 int acpi_processor_power_exit(struct acpi_processor *pr);
@@ -348,7 +372,7 @@ static inline void acpi_processor_syscore_exit(void) {}
 /* in processor_thermal.c */
 int acpi_processor_get_limit_info(struct acpi_processor *pr);
 extern const struct thermal_cooling_device_ops processor_cooling_ops;
-#ifdef CONFIG_CPU_FREQ
+#if defined(CONFIG_ACPI_CPU_FREQ_PSS) & defined(CONFIG_CPU_FREQ)
 void acpi_thermal_cpufreq_init(void);
 void acpi_thermal_cpufreq_exit(void);
 #else
@@ -360,6 +384,6 @@ static inline void acpi_thermal_cpufreq_exit(void)
 {
 	return;
 }
-#endif
+#endif	/* CONFIG_ACPI_CPU_FREQ_PSS */

 #endif
This patch introduces a new Kconfig symbol, ACPI_PROCESSOR_IDLE, which is auto selected by architectures which support the ACPI based C states for CPU Idle management.
The processor_idle driver in its present form contains declarations specific to X86 and IA64. Since there are no reasonable defaults for other architectures e.g. ARM64, the driver is selected only for X86 or IA64.
This helps in decoupling the ACPI processor_driver from the ACPI processor_idle driver which is useful for the upcoming alternative patchwork for controlling CPU Performance (CPPC) and CPU Idle (LPI).
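The decoupling relies on a common C idiom: when the option is off, callers still compile against the same function names, but get static inline stubs that return an error. A minimal standalone sketch of that idiom follows. `CONFIG_DEMO_IDLE`, `struct demo_processor`, `demo_power_init()` and `demo_processor_start()` are hypothetical names used only for illustration, and treating -ENODEV from the optional backend as non-fatal is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical switch standing in for a Kconfig symbol.
 * It is deliberately left undefined here, so the stub branch is built.
 */
/* #define CONFIG_DEMO_IDLE 1 */

struct demo_processor {
	int idle_ready;
};

#ifdef CONFIG_DEMO_IDLE
/* Real implementation, only built when the option is selected. */
static int demo_power_init(struct demo_processor *pr)
{
	pr->idle_ready = 1;
	return 0;
}
#else
/* Stub: same signature, no code pulled in, reports "not available". */
static inline int demo_power_init(struct demo_processor *pr)
{
	(void)pr;
	return -ENODEV;
}
#endif /* CONFIG_DEMO_IDLE */

/* The core driver calls the hook without any #ifdef of its own. */
static int demo_processor_start(struct demo_processor *pr)
{
	int ret;

	assert(pr != NULL);

	ret = demo_power_init(pr);
	if (ret && ret != -ENODEV)
		return ret;	/* a real failure of the backend */

	return 0;		/* a missing optional backend is not fatal here */
}
```

The benefit of this layout is that the conditional compilation stays in one header, and the calling code reads the same whether or not the idle backend is built.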
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
---
 drivers/acpi/Kconfig     |  6 +++++-
 drivers/acpi/Makefile    |  3 ++-
 include/acpi/processor.h | 26 ++++++++++++++++++++++++--
 3 files changed, 31 insertions(+), 4 deletions(-)
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index d6e2a86..54e9729 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -193,10 +193,14 @@ config ACPI_CPU_FREQ_PSS
 	bool
 	select THERMAL

+config ACPI_PROCESSOR_IDLE
+	bool
+	select CPU_IDLE
+
 config ACPI_PROCESSOR
 	tristate "Processor"
 	depends on X86 || IA64
-	select CPU_IDLE
+	select ACPI_PROCESSOR_IDLE
 	select ACPI_CPU_FREQ_PSS
 	default y
 	help
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 7e97aef..3ea59ae 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -80,7 +80,8 @@ obj-$(CONFIG_ACPI_CUSTOM_METHOD)+= custom_method.o
 obj-$(CONFIG_ACPI_BGRT)		+= bgrt.o

 # processor has its own "processor." module_param namespace
-processor-y			:= processor_driver.o processor_idle.o
+processor-y			:= processor_driver.o
+processor-$(CONFIG_ACPI_PROCESSOR_IDLE) += processor_idle.o
 processor-$(CONFIG_ACPI_CPU_FREQ_PSS)	+= processor_throttling.o	\
 	processor_thermal.o
 processor-$(CONFIG_CPU_FREQ)	+= processor_perflib.o
diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index b6c9178..2c4e7a9 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -355,13 +355,35 @@ static inline void acpi_processor_throttling_init(void) {}
 #endif /* CONFIG_ACPI_CPU_FREQ_PSS */

 /* in processor_idle.c */
+extern struct cpuidle_driver acpi_idle_driver;
+#ifdef CONFIG_ACPI_PROCESSOR_IDLE
 int acpi_processor_power_init(struct acpi_processor *pr);
 int acpi_processor_power_exit(struct acpi_processor *pr);
 int acpi_processor_cst_has_changed(struct acpi_processor *pr);
 int acpi_processor_hotplug(struct acpi_processor *pr);
-extern struct cpuidle_driver acpi_idle_driver;
+#else
+static inline int acpi_processor_power_init(struct acpi_processor *pr)
+{
+	return -ENODEV;
+}
+
+static inline int acpi_processor_power_exit(struct acpi_processor *pr)
+{
+	return -ENODEV;
+}
+
+static inline int acpi_processor_cst_has_changed(struct acpi_processor *pr)
+{
+	return -ENODEV;
+}
+
+static inline int acpi_processor_hotplug(struct acpi_processor *pr)
+{
+	return -ENODEV;
+}
+#endif /* CONFIG_ACPI_PROCESSOR_IDLE */

-#ifdef CONFIG_PM_SLEEP
+#if defined(CONFIG_PM_SLEEP) & defined(CONFIG_ACPI_PROCESSOR_IDLE)
 void acpi_processor_syscore_init(void);
 void acpi_processor_syscore_exit(void);
 #else
CPPC stands for Collaborative Processor Performance Control and is defined in the ACPI v5.0+ spec. It describes CPU performance controls on an abstract and continuous scale, allowing the platform (e.g. a remote power processor) to flexibly optimize CPU performance with its knowledge of power budgets and other architecture specific knowledge.
This patch adds a shim which exports commonly used functions to get and set CPPC specific controls for each CPU. This enables CPUFreq drivers to gather per CPU performance data and use it with existing governors, or even allows for customized governors which are implemented inside CPUFreq drivers.
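To make the "abstract and continuous scale" concrete, here is one way a consumer of such a shim could turn a load percentage into a desired performance level between the lowest and highest levels a CPU reports. It is a sketch only: `struct demo_perf_caps` and `demo_desired_perf()` are hypothetical and are not the interfaces added by this patch, and the linear mapping is an assumption chosen for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability pair; the real structure lives in cppc_acpi.h. */
struct demo_perf_caps {
	uint32_t lowest_perf;
	uint32_t highest_perf;
};

/*
 * Linearly map a load percentage (0..100) onto the abstract
 * performance scale [lowest_perf, highest_perf].
 */
static uint32_t demo_desired_perf(const struct demo_perf_caps *caps,
				  unsigned int load_pct)
{
	uint64_t span;

	assert(caps->lowest_perf <= caps->highest_perf);

	if (load_pct > 100)
		load_pct = 100;

	span = (uint64_t)(caps->highest_perf - caps->lowest_perf);

	return caps->lowest_perf + (uint32_t)(span * load_pct / 100);
}
```

Because the scale is abstract, nothing in this mapping refers to a frequency. How a given level is delivered is left to the platform.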
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Reviewed-by: Al Stone <al.stone@linaro.org>
---
 drivers/acpi/Kconfig     |  14 +
 drivers/acpi/Makefile    |   1 +
 drivers/acpi/cppc_acpi.c | 812 +++++++++++++++++++++++++++++++++++++++++++++++
 include/acpi/cppc_acpi.h | 137 ++++++++
 4 files changed, 964 insertions(+)
 create mode 100644 drivers/acpi/cppc_acpi.c
 create mode 100644 include/acpi/cppc_acpi.h
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 54e9729..c6ec903 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -197,6 +197,20 @@ config ACPI_PROCESSOR_IDLE
 	bool
 	select CPU_IDLE

+config ACPI_CPPC_LIB
+	bool
+	depends on ACPI_PROCESSOR
+	depends on !ACPI_CPU_FREQ_PSS
+	select MAILBOX
+	select PCC
+	help
+	  This file implements common functionality to parse
+	  CPPC tables as described in the ACPI 5.1+ spec. The
+	  routines implemented are meant to be used by other
+	  drivers to control CPU performance using CPPC semantics.
+	  If your platform does not support CPPC in firmware,
+	  leave this option disabled.
+
 config ACPI_PROCESSOR
 	tristate "Processor"
 	depends on X86 || IA64
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 3ea59ae..4c393a69 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -78,6 +78,7 @@ obj-$(CONFIG_ACPI_HED)		+= hed.o
 obj-$(CONFIG_ACPI_EC_DEBUGFS)	+= ec_sys.o
 obj-$(CONFIG_ACPI_CUSTOM_METHOD)+= custom_method.o
 obj-$(CONFIG_ACPI_BGRT)		+= bgrt.o
+obj-$(CONFIG_ACPI_CPPC_LIB)	+= cppc_acpi.o

 # processor has its own "processor." module_param namespace
 processor-y			:= processor_driver.o
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
new file mode 100644
index 0000000..9c89767
--- /dev/null
+++ b/drivers/acpi/cppc_acpi.c
@@ -0,0 +1,812 @@
+/*
+ * CPPC (Collaborative Processor Performance Control) methods used
+ * by CPUfreq drivers.
+ *
+ * (C) Copyright 2014, 2015 Linaro Ltd.
+ * Author: Ashwin Chaugule <ashwin.chaugule@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ *
+ * CPPC describes a few methods for controlling CPU performance using
+ * information from a per CPU table called CPC. This table is described in
+ * the ACPI v5.0+ specification. The table consists of a list of
+ * registers which may be memory mapped or hardware registers and also may
+ * include some static integer values.
+ *
+ * CPU performance is on an abstract continuous scale as against a discretized
+ * P-state scale which is tied to CPU frequency only. In brief, the basic
+ * operation involves:
+ *
+ * - OS makes a CPU performance request. (Can provide min and max bounds)
+ *
+ * - Platform (such as BMC) is free to optimize request within requested bounds
+ *   depending on power/thermal budgets etc.
+ *
+ * - Platform conveys its decision back to OS
+ *
+ * The communication between OS and platform occurs through another medium
+ * called (PCC) Platform Communication Channel. This is a generic mailbox like
+ * mechanism which includes doorbell semantics to indicate register updates.
+ * See drivers/mailbox/pcc.c for details on PCC.
+ *
+ * Finer details about the PCC and CPPC spec are available in the latest
+ * ACPI 5.1 specification.
+ */
+
+#define pr_fmt(fmt)	"ACPI CPPC: " fmt
+
+#include <linux/cpufreq.h>
+#include <linux/delay.h>
+
+#include <acpi/cppc_acpi.h>
+/*
+ * Lock to provide mutually exclusive access to the PCC
+ * channel. e.g. When the remote updates the shared region
+ * with new data, the reader needs to be protected from
+ * other CPUs activity on the same channel.
+ */
+static DEFINE_SPINLOCK(pcc_lock);
+
+static DEFINE_PER_CPU(struct cpc_desc *, cpc_desc_ptr);
+
+/* This layer handles all the PCC specifics for CPPC. */
+static struct mbox_chan *pcc_channel;
+static void __iomem *pcc_comm_addr;
+static u64 comm_base_addr;
+static int pcc_subspace_idx = -1;
+static u16 pcc_cmd_delay;
+static int pcc_channel_acquired;
+
+#define NUM_RETRIES 500
+
+static int send_pcc_cmd(u16 cmd)
+{
+	int err, result = 0;
+	int retries = NUM_RETRIES;
+	struct acpi_pcct_hw_reduced *pcct_ss = pcc_channel->con_priv;
+	struct acpi_pcct_shared_memory *generic_comm_base =
+		(struct acpi_pcct_shared_memory *) pcc_comm_addr;
+	u32 cmd_latency = pcct_ss->latency;
+
+	/* Write to the shared comm region. */
+	writew(cmd, &generic_comm_base->command);
+
+	/* Flip CMD COMPLETE bit */
+	writew(0, &generic_comm_base->status);
+
+	err = mbox_send_message(pcc_channel, &cmd);
+	if (err < 0) {
+		pr_err("Err sending PCC mbox message. cmd:%d, ret:%d\n",
+				cmd, err);
+		return err;
+	}
+
+	/* Wait for a nominal time to let platform processes command. */
+	udelay(cmd_latency);
+
+	/* Retry in case the remote processor was too slow to catch up. */
+	while (retries--) {
+		result = readw_relaxed(&generic_comm_base->status)
+			& PCC_CMD_COMPLETE ? 0 : -EIO;
+		if (!result) {
+			/* Success. */
+			retries = NUM_RETRIES;
+			break;
+		}
+	}
+
+	mbox_client_txdone(pcc_channel, result);
+	return result;
+}
+
+static void cppc_chan_tx_done(struct mbox_client *cl, void *mssg, int ret)
+{
+	if (ret)
+		pr_debug("TX did not complete: CMD sent:%x, ret:%d\n",
+				*(u16 *)mssg, ret);
+	else
+		pr_debug("TX completed. CMD sent:%x, ret:%d\n",
+				*(u16 *)mssg, ret);
+}
+
+struct mbox_client cppc_mbox_cl = {
+	.tx_done = cppc_chan_tx_done,
+	.knows_txdone = true,
+};
+
+static int acpi_get_psd(struct cpc_desc *cpc_ptr, acpi_handle handle)
+{
+	int result = 0;
+	acpi_status status = AE_OK;
+	struct acpi_buffer buffer = {ACPI_ALLOCATE_BUFFER, NULL};
+	struct acpi_buffer format = {sizeof("NNNNN"), "NNNNN"};
+	struct acpi_buffer state = {0, NULL};
+	union acpi_object *psd = NULL;
+	struct acpi_psd_package *pdomain;
+
+	status = acpi_evaluate_object(handle, "_PSD", NULL, &buffer);
+	if (ACPI_FAILURE(status))
+		return -ENODEV;
+
+	psd = buffer.pointer;
+	if (!psd || (psd->type != ACPI_TYPE_PACKAGE)) {
+		pr_err("Invalid _PSD data\n");
+		result = -ENODATA;
+		goto end;
+	}
+
+	if (psd->package.count != 1) {
+		pr_err("Invalid _PSD data\n");
+		result = -ENODATA;
+		goto end;
+	}
+
+	pdomain = &(cpc_ptr->domain_info);
+
+	state.length = sizeof(struct acpi_psd_package);
+	state.pointer = pdomain;
+
+	status = acpi_extract_package(&(psd->package.elements[0]),
+			&format, &state);
+	if (ACPI_FAILURE(status)) {
+		pr_err("Invalid _PSD data\n");
+		result = -ENODATA;
+		goto end;
+	}
+
+	if (pdomain->num_entries != ACPI_PSD_REV0_ENTRIES) {
+		pr_err("Unknown _PSD:num_entries\n");
+		result = -ENODATA;
+		goto end;
+	}
+
+	if (pdomain->revision != ACPI_PSD_REV0_REVISION) {
+		pr_err("Unknown _PSD:revision\n");
+		result = -ENODATA;
+		goto end;
+	}
+
+	if (pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ALL &&
+			pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ANY &&
+			pdomain->coord_type != DOMAIN_COORD_TYPE_HW_ALL) {
+		pr_err("Invalid _PSD:coord_type\n");
+		result = -ENODATA;
+		goto end;
+	}
+end:
+	kfree(buffer.pointer);
+	return result;
+}
+
+int acpi_get_psd_map(struct cpudata **all_cpu_data)
+{
+	int count_target;
+	int retval = 0;
+	unsigned int i, j;
+	cpumask_var_t covered_cpus;
+	struct cpudata *pr, *match_pr;
+	struct acpi_psd_package *pdomain;
+	struct acpi_psd_package *match_pdomain;
+	struct cpc_desc *cpc_ptr, *match_cpc_ptr;
+
+	if (!zalloc_cpumask_var(&covered_cpus, GFP_KERNEL))
+		return -ENOMEM;
+
+	/*
+	 * Now that we have _PSD data from all CPUs, lets setup P-state
+	 * domain info.
+	 */
+	for_each_possible_cpu(i) {
+		pr = all_cpu_data[i];
+		if (!pr)
+			continue;
+
+		if (cpumask_test_cpu(i, covered_cpus))
+			continue;
+
+		cpc_ptr = per_cpu(cpc_desc_ptr, i);
+		if (!cpc_ptr)
+			continue;
+
+		pdomain = &(cpc_ptr->domain_info);
+		cpumask_set_cpu(i, pr->shared_cpu_map);
+		cpumask_set_cpu(i, covered_cpus);
+		if (pdomain->num_processors <= 1)
+			continue;
+
+		/* Validate the Domain info */
+		count_target = pdomain->num_processors;
+		if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ALL)
+			pr->shared_type = CPUFREQ_SHARED_TYPE_ALL;
+		else if (pdomain->coord_type == DOMAIN_COORD_TYPE_HW_ALL)
+			pr->shared_type = CPUFREQ_SHARED_TYPE_HW;
+		else if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ANY)
+			pr->shared_type = CPUFREQ_SHARED_TYPE_ANY;
+
+		for_each_possible_cpu(j) {
+			if (i == j)
+				continue;
+
+			match_cpc_ptr = per_cpu(cpc_desc_ptr, j);
+			if (!match_cpc_ptr)
+				continue;
+
+			match_pdomain = &(match_cpc_ptr->domain_info);
+			if (match_pdomain->domain != pdomain->domain)
+				continue;
+
+			/* Here i and j are in the same domain */
+
+			if (match_pdomain->num_processors != count_target) {
+				retval = -EINVAL;
+				goto err_ret;
+			}
+
+			if (pdomain->coord_type != match_pdomain->coord_type) {
+				retval = -EINVAL;
+				goto err_ret;
+			}
+
+			cpumask_set_cpu(j, covered_cpus);
+			cpumask_set_cpu(j, pr->shared_cpu_map);
+		}
+
+		for_each_possible_cpu(j) {
+			if (i == j)
+				continue;
+
+			match_pr = all_cpu_data[j];
+			if (!match_pr)
+				continue;
+
+			match_cpc_ptr = per_cpu(cpc_desc_ptr, j);
+			if (!match_cpc_ptr)
+				continue;
+
+			match_pdomain = &(match_cpc_ptr->domain_info);
+			if (match_pdomain->domain != pdomain->domain)
+				continue;
+
+			match_pr->shared_type = pr->shared_type;
+			cpumask_copy(match_pr->shared_cpu_map,
+					pr->shared_cpu_map);
+		}
+	}
+
+err_ret:
+	for_each_possible_cpu(i) {
+		pr = all_cpu_data[i];
+		if (!pr)
+			continue;
+
+		/* Assume no coordination on any error parsing domain info */
+		if (retval) {
+			cpumask_clear(pr->shared_cpu_map);
+			cpumask_set_cpu(i, pr->shared_cpu_map);
+			pr->shared_type = CPUFREQ_SHARED_TYPE_ALL;
+		}
+	}
+
+	free_cpumask_var(covered_cpus);
+	return retval;
+}
+EXPORT_SYMBOL_GPL(acpi_get_psd_map);
+
+static int register_pcc_channel(unsigned pcc_subspace_idx)
+{
+	struct acpi_pcct_subspace *cppc_ss;
+	unsigned int len;
+
+	if (pcc_subspace_idx >= 0) {
+		pcc_channel = pcc_mbox_request_channel(&cppc_mbox_cl,
+				pcc_subspace_idx);
+
+		if (IS_ERR(pcc_channel)) {
+			pr_err("No PCC communication channel found\n");
+			return -ENODEV;
+		}
+
+		/*
+		 * The PCC mailbox controller driver should
+		 * have parsed the PCCT (global table of all
+		 * PCC channels) and stored pointers to the
+		 * subspace communication region in con_priv.
+		 */
+		cppc_ss = pcc_channel->con_priv;
+
+		if (!cppc_ss) {
+			pr_err("No PCC subspace found for CPPC\n");
+			return -ENODEV;
+		}
+
+		/*
+		 * This is the shared communication region
+		 * for the OS and Platform to communicate over.
+		 */
+		comm_base_addr = cppc_ss->base_address;
+		len = cppc_ss->length;
+		pcc_cmd_delay = cppc_ss->min_turnaround_time;
+
+		pcc_comm_addr = ioremap(comm_base_addr, len);
+		if (!pcc_comm_addr) {
+			pr_err("Failed to ioremap PCC comm region mem\n");
+			return -ENOMEM;
+		}
+
+		/* Set flag so that we dont come here for each CPU. */
+		pcc_channel_acquired = 1;
+
+	} else
+		/*
+		 * For the case where registers are not defined as PCC regs.
+		 * Assuming all regs are FFH / SystemIO.
+		 */
+		pr_debug("No PCC subspace detected in any CPC entries.\n");
+
+	return 0;
+}
+
+/**
+ * acpi_cppc_processor_probe - The _CPC table is a per CPU table
+ * which a bunch of entries which may be registers or integers.
+ * An example table looks like the following.
+ *
+ *	Name(_CPC, Package()
+ *			{
+ *			17,
+ *			NumEntries
+ *			1,
+ *			// Revision
+ *			ResourceTemplate(){Register(PCC, 32, 0, 0x120, 2)},
+ *			// Highest Performance
+ *			ResourceTemplate(){Register(PCC, 32, 0, 0x124, 2)},
+ *			// Nominal Performance
+ *			ResourceTemplate(){Register(PCC, 32, 0, 0x128, 2)},
+ *			// Lowest Nonlinear Performance
+ *			ResourceTemplate(){Register(PCC, 32, 0, 0x12C, 2)},
+ *			// Lowest Performance
+ *			ResourceTemplate(){Register(PCC, 32, 0, 0x130, 2)},
+ *			// Guaranteed Performance Register
+ *			ResourceTemplate(){Register(PCC, 32, 0, 0x110, 2)},
+ *			// Desired Performance Register
+ *			ResourceTemplate(){Register(SystemMemory, 0, 0, 0, 0)},
+ *			..
+ *			..
+ *			..
+ *
+ *		}
+ * Each Register() encodes how to access that specific register.
+ * e.g. a sample PCC entry has the following encoding:
+ *
+ *	Register (
+ *		PCC,
+ *		AddressSpaceKeyword
+ *		8,
+ *		//RegisterBitWidth
+ *		8,
+ *		//RegisterBitOffset
+ *		0x30,
+ *		//RegisterAddress
+ *		9
+ *		//AccessSize (subspace ID)
+ *		0
+ *		)
+ *	}
+ *
+ * This function walks through all the per CPU _CPC entries and extracts
+ * the Register details.
+ *
+ * Return: 0 for success or negative value for err.
+ */
+int acpi_cppc_processor_probe(struct acpi_processor *pr)
+{
+	struct acpi_buffer output = {ACPI_ALLOCATE_BUFFER, NULL};
+	union acpi_object *out_obj, *cpc_obj;
+	struct cpc_desc *cpc_ptr;
+	struct cpc_reg *gas_t;
+	acpi_handle handle = pr->handle;
+	unsigned int num_ent, i, cpc_rev, ret = 0;
+	acpi_status status;
+
+	/* Parse the ACPI _CPC table for this cpu. */
+	if (!acpi_has_method(handle, "_CPC")) {
+		pr_debug("_CPC table not found\n");
+		ret = -ENODEV;
+		goto out_buf_free;
+	}
+
+	status = acpi_evaluate_object(handle, "_CPC", NULL, &output);
+	if (ACPI_FAILURE(status)) {
+		ret = -ENODEV;
+		goto out_buf_free;
+	}
+
+	out_obj = (union acpi_object *) output.pointer;
+	if (out_obj->type != ACPI_TYPE_PACKAGE) {
+		ret = -ENODEV;
+		goto out_buf_free;
+	}
+
+	cpc_ptr = kzalloc(sizeof(struct cpc_desc), GFP_KERNEL);
+	if (!cpc_ptr)
+		return -ENOMEM;
+
+	/* First entry is NumEntries. */
+	cpc_obj = &out_obj->package.elements[0];
+	if (cpc_obj->type == ACPI_TYPE_INTEGER) {
+		num_ent = cpc_obj->integer.value;
+	} else {
+		pr_debug("Unexpected entry type(%d) for NumEntries\n",
+				cpc_obj->type);
+		goto out_free;
+	}
+
+	/* Only support CPPCv2. Bail otherwise. */
+	if (num_ent != CPPC_NUM_ENT) {
+		pr_err("Firmware exports %d entries. Expected: %d\n",
+				num_ent, CPPC_NUM_ENT);
+		ret = -EINVAL;
+		goto out_free;
+	}
+
+	/* Second entry should be revision. */
+	cpc_obj = &out_obj->package.elements[1];
+	if (cpc_obj->type == ACPI_TYPE_INTEGER) {
+		cpc_rev = cpc_obj->integer.value;
+	} else {
+		pr_debug("Unexpected entry type(%d) for Revision\n",
+				cpc_obj->type);
+		goto out_free;
+	}
+
+	if (cpc_rev != CPPC_REV) {
+		pr_err("Firmware exports revision:%d. Expected:%d\n",
+				cpc_rev, CPPC_REV);
+		goto out_free;
+	}
+
+	/* Iterate through remaining entries in _CPC */
+	for (i = 2; i < num_ent; i++) {
+		cpc_obj = &out_obj->package.elements[i];
+
+		if (cpc_obj->type == ACPI_TYPE_INTEGER) {
+			cpc_ptr->cpc_regs[i-2].type =
+				ACPI_TYPE_INTEGER;
+			cpc_ptr->cpc_regs[i-2].cpc_entry.int_value =
+				cpc_obj->integer.value;
+		} else if (cpc_obj->type == ACPI_TYPE_BUFFER) {
+			gas_t = (struct cpc_reg *)
+				cpc_obj->buffer.pointer;
+
+			/*
+			 * The PCC Subspace index is encoded inside
+			 * the CPC table entries. The same PCC index
+			 * will be used for all the PCC entries,
+			 * so extract it only once.
+			 */
+			if (gas_t->space_id ==
+					ACPI_ADR_SPACE_PLATFORM_COMM) {
+				if (pcc_subspace_idx < 0)
+					pcc_subspace_idx =
+						gas_t->access_width;
+				else if (pcc_subspace_idx !=
+						gas_t->access_width) {
+					/*
+					 * Mismatched PCC id detected.
+					 * Firmware bug.
+					 */
+					goto out_free;
+				}
+			}
+
+			cpc_ptr->cpc_regs[i-2].type =
+				ACPI_TYPE_BUFFER;
+			cpc_ptr->cpc_regs[i-2].cpc_entry.reg =
+				(struct cpc_reg) {
+					.space_id = gas_t->space_id,
+					.length = gas_t->length,
+					.bit_width = gas_t->bit_width,
+					.bit_offset = gas_t->bit_offset,
+					.address = gas_t->address,
+					.access_width =
+						gas_t->access_width,
+				};
+		} else {
+			pr_debug("Error in entry:%d in CPC table.\n", i);
+			ret = -EINVAL;
+			goto out_free;
+		}
+	}
+
+	/* Plug it into this CPUs CPC descriptor. */
+	per_cpu(cpc_desc_ptr, pr->id) = cpc_ptr;
+
+	/* Parse PSD data for this CPU */
+	ret = acpi_get_psd(cpc_ptr, handle);
+	if (ret)
+		goto out_free;
+
+	/* Register PCC channel once for all CPUs. */
+	if (!pcc_channel_acquired) {
+		ret = register_pcc_channel(pcc_subspace_idx);
+		if (ret)
+			goto out_free;
+	}
+
+	/* Everything looks okay */
+	pr_info("Successfully parsed CPC struct for CPU: %d\n", pr->id);
+
+	kfree(output.pointer);
+	return 0;
+
+out_free:
+	cpc_ptr = per_cpu(cpc_desc_ptr, pr->id);
+	kfree(cpc_ptr);
+
+out_buf_free:
+	kfree(output.pointer);
+	return -ENODEV;
+}
+EXPORT_SYMBOL_GPL(acpi_cppc_processor_probe);
+
+static u64 cpc_trans(struct cpc_register_resource *reg, int cmd, u64 write_val,
+		bool is_pcc)
+{
+	u64 addr;
+	u64 read_val = 0;
+
+	/* PCC communication addr space begins at byte offset 0x8. */
+	addr = is_pcc ? (u64)pcc_comm_addr + 0x8 + reg->cpc_entry.reg.address :
+		reg->cpc_entry.reg.address;
+
+	if (reg->type == ACPI_TYPE_BUFFER) {
+		switch (reg->cpc_entry.reg.bit_width) {
+		case 8:
+			if (cmd == CMD_READ)
+				read_val = readb((void *) (addr));
+			else if (cmd == CMD_WRITE)
+				writeb(write_val, (void *)(addr));
+			else
+				pr_debug("Unsupported cmd type: %d\n", cmd);
+			break;
+		case 16:
+			if (cmd == CMD_READ)
+				read_val = readw((void *) (addr));
+			else if (cmd == CMD_WRITE)
+				writew(write_val, (void *)(addr));
+			else
+				pr_debug("Unsupported cmd type: %d\n", cmd);
+			break;
+		case 32:
+			if (cmd == CMD_READ)
+				read_val = readl((void *) (addr));
+			else if (cmd == CMD_WRITE)
+				writel(write_val, (void *)(addr));
+			else
+				pr_debug("Unsupported cmd type: %d\n", cmd);
+			break;
+		case 64:
+			if (cmd == CMD_READ)
+				read_val = readq((void *) (addr));
+			else if (cmd == CMD_WRITE)
+				writeq(write_val, (void *)(addr));
+			else
+				pr_debug("Unsupported cmd type: %d\n", cmd);
+			break;
+		default:
+			pr_debug("Unsupported bit width for CPC cmd:%d\n",
+					cmd);
+			break;
+		}
+	} else if (reg->type == ACPI_TYPE_INTEGER) {
+		if (cmd == CMD_READ)
+			read_val = reg->cpc_entry.int_value;
+		else if (cmd == CMD_WRITE)
+			reg->cpc_entry.int_value = write_val;
+		else
+			pr_debug("Unsupported cmd type: %d\n", cmd);
+	} else
+		pr_debug("Unsupported CPC entry type:%d\n", reg->type);
+
+	return read_val;
+}
+
+/**
+ * cppc_get_perf_caps - Get a CPUs performance capabilities.
+ * @cpunum: CPU from which to get capabilities info.
+ * @perf_caps: ptr to cppc_perf_caps. See cppc_acpi.h
+ *
+ * Return - 0 for success with perf_caps populated else
+ *	-ERRNO.
+ */ +int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps) +{ + struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum); + struct cpc_register_resource *highest_reg, *lowest_reg, *ref_perf, + *nom_perf; + u64 min, max, ref, nom; + bool is_pcc = false; + int ret; + + if (!cpc_desc) { + pr_debug("No CPC descriptor for CPU:%d\n", cpunum); + return -ENODEV; + } + + highest_reg = &cpc_desc->cpc_regs[HIGHEST_PERF]; + lowest_reg = &cpc_desc->cpc_regs[LOWEST_PERF]; + ref_perf = &cpc_desc->cpc_regs[REFERENCE_PERF]; + nom_perf = &cpc_desc->cpc_regs[NOMINAL_PERF]; + + spin_lock(&pcc_lock); + + /* Are any of the regs PCC ?*/ + if ((highest_reg->cpc_entry.reg.space_id == + ACPI_ADR_SPACE_PLATFORM_COMM) || + (lowest_reg->cpc_entry.reg.space_id == + ACPI_ADR_SPACE_PLATFORM_COMM) || + (ref_perf->cpc_entry.reg.space_id == + ACPI_ADR_SPACE_PLATFORM_COMM) || + (nom_perf->cpc_entry.reg.space_id == + ACPI_ADR_SPACE_PLATFORM_COMM)) + is_pcc = true; + + if (is_pcc) { + /* + * Min time OS should wait before sending + * next command. + */ + udelay(pcc_cmd_delay); + /* Ring doorbell */ + ret = send_pcc_cmd(CMD_READ); + if (ret) { + spin_unlock(&pcc_lock); + return -EIO; + } + } + + max = cpc_trans(highest_reg, CMD_READ, 0, is_pcc); + perf_caps->highest_perf = max; + + min = cpc_trans(lowest_reg, CMD_READ, 0, is_pcc); + perf_caps->lowest_perf = min; + + ref = cpc_trans(ref_perf, CMD_READ, 0, is_pcc); + perf_caps->reference_perf = ref; + + nom = cpc_trans(nom_perf, CMD_READ, 0, is_pcc); + perf_caps->nominal_perf = nom; + + if (!ref) + perf_caps->reference_perf = perf_caps->nominal_perf; + + spin_unlock(&pcc_lock); + + if (!perf_caps->highest_perf || + !perf_caps->lowest_perf || + !perf_caps->reference_perf || + !perf_caps->nominal_perf) { + return -EINVAL; + } + + return 0; +} +EXPORT_SYMBOL_GPL(cppc_get_perf_caps); + +/** + * cppc_get_perf_ctrs - Read a CPUs performance feedback counters. + * @cpunum: CPU from which to read counters. 
+ * @perf_fb_ctrs: ptr to cppc_perf_fb_ctrs. See cppc_acpi.h + * + * Return - 0 for success with perf_fb_ctrs populated else + * -ERRNO. + */ +int cppc_get_perf_ctrs(int cpunum, struct cppc_perf_fb_ctrs *perf_fb_ctrs) +{ + struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum); + struct cpc_register_resource *delivered_reg, *reference_reg; + u64 delivered, reference; + bool is_pcc = false; + int ret; + + if (!cpc_desc) { + pr_debug("No CPC descriptor for CPU:%d\n", cpunum); + return -ENODEV; + } + + delivered_reg = &cpc_desc->cpc_regs[DELIVERED_CTR]; + reference_reg = &cpc_desc->cpc_regs[REFERENCE_CTR]; + + spin_lock(&pcc_lock); + + /* Are any of the regs PCC ?*/ + if ((delivered_reg->cpc_entry.reg.space_id == + ACPI_ADR_SPACE_PLATFORM_COMM) || + (reference_reg->cpc_entry.reg.space_id == + ACPI_ADR_SPACE_PLATFORM_COMM)) + is_pcc = true; + + if (is_pcc) { + /* + * Min time OS should wait before sending + * next command. + */ + udelay(pcc_cmd_delay); + /* Ring doorbell */ + ret = send_pcc_cmd(CMD_READ); + if (ret) { + spin_unlock(&pcc_lock); + return -EIO; + } + } + + delivered = cpc_trans(delivered_reg, CMD_READ, 0, is_pcc); + reference = cpc_trans(reference_reg, CMD_READ, 0, is_pcc); + + spin_unlock(&pcc_lock); + + if (!delivered || !reference) + return -EINVAL; + + perf_fb_ctrs->delivered = delivered; + perf_fb_ctrs->reference = reference; + + perf_fb_ctrs->delivered -= perf_fb_ctrs->prev_delivered; + perf_fb_ctrs->reference -= perf_fb_ctrs->prev_reference; + + perf_fb_ctrs->prev_delivered = delivered; + perf_fb_ctrs->prev_reference = reference; + + return 0; +} +EXPORT_SYMBOL_GPL(cppc_get_perf_ctrs); + +/** + * cppc_set_perf - Set a CPUs performance controls. + * @cpu: CPU for which to set performance controls. + * @perf_ctrls: ptr to cppc_perf_ctrls. See cppc_acpi.h + * + * Return: 0 for success, -ERRNO otherwise. 
+ */ +int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls) +{ + struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu); + struct cpc_register_resource *desired_reg; + int ret = 0; + bool is_pcc = false; + + if (!cpc_desc) { + pr_debug("No CPC descriptor for CPU:%d\n", cpu); + return -ENODEV; + } + + desired_reg = &cpc_desc->cpc_regs[DESIRED_PERF]; + + spin_lock(&pcc_lock); + + /* Is this a PCC reg ?*/ + if (desired_reg->cpc_entry.reg.space_id == + ACPI_ADR_SPACE_PLATFORM_COMM) + is_pcc = true; + + cpc_trans(desired_reg, CMD_WRITE, + perf_ctrls->desired_perf, is_pcc); + + if (is_pcc) { + /* + * Min time OS should wait before sending + * next command. + */ + udelay(pcc_cmd_delay); + /* Ring doorbell */ + ret = send_pcc_cmd(CMD_READ); + } + + spin_unlock(&pcc_lock); + + return ret; +} +EXPORT_SYMBOL_GPL(cppc_set_perf); diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h new file mode 100644 index 0000000..b97d2b6 --- /dev/null +++ b/include/acpi/cppc_acpi.h @@ -0,0 +1,137 @@ +/* + * CPPC (Collaborative Processor Performance Control) methods used + * by CPUfreq drivers. + * + * (C) Copyright 2014, 2015 Linaro Ltd. + * Author: Ashwin Chaugule ashwin.chaugule@linaro.org + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; version 2 + * of the License. + */ + +#ifndef _CPPC_ACPI_H +#define _CPPC_ACPI_H + +#include <linux/acpi.h> +#include <linux/mailbox_controller.h> +#include <linux/mailbox_client.h> +#include <linux/types.h> + +#include <acpi/processor.h> + +/* Only support CPPCv2 for now. */ +#define CPPC_NUM_ENT 21 +#define CPPC_REV 2 + +#define PCC_CMD_COMPLETE 1 +#define MAX_CPC_REG_ENT 19 + +/* CPPC specific PCC commands. */ +#define CMD_READ 0 +#define CMD_WRITE 1 + +/* Each register has the folowing format. 
*/ +struct cpc_reg { + u8 descriptor; + u16 length; + u8 space_id; + u8 bit_width; + u8 bit_offset; + u8 access_width; + u64 __iomem address; +} __packed; + +/* + * Each entry in the CPC table is either + * of type ACPI_TYPE_BUFFER or + * ACPI_TYPE_INTEGER. + */ +struct cpc_register_resource { + acpi_object_type type; + union { + struct cpc_reg reg; + u64 int_value; + } cpc_entry; +}; + +/* Container to hold the CPC details for each CPU */ +struct cpc_desc { + int num_entries; + int version; + struct cpc_register_resource cpc_regs[MAX_CPC_REG_ENT]; + struct acpi_psd_package domain_info; +}; + +/* These are indexes into the per-cpu cpc_regs[]. Order is important. */ +enum cppc_regs { + HIGHEST_PERF, + NOMINAL_PERF, + LOW_NON_LINEAR_PERF, + LOWEST_PERF, + GUARANTEED_PERF, + DESIRED_PERF, + MIN_PERF, + MAX_PERF, + PERF_REDUC_TOLERANCE, + TIME_WINDOW, + CTR_WRAP_TIME, + REFERENCE_CTR, + DELIVERED_CTR, + PERF_LIMITED, + ENABLE, + AUTO_SEL_ENABLE, + AUTO_ACT_WINDOW, + ENERGY_PERF, + REFERENCE_PERF, +}; + +/* + * Categorization of registers as described + * in the ACPI v.5.1 spec. + * XXX: Only filling up ones which are used by governors + * today. + */ +struct cppc_perf_caps { + u32 highest_perf; + u32 nominal_perf; + u32 reference_perf; + u32 lowest_perf; +}; + +struct cppc_perf_ctrls { + u32 max_perf; + u32 min_perf; + u32 desired_perf; +}; + +struct cppc_perf_fb_ctrs { + u64 reference; + u64 prev_reference; + u64 delivered; + u64 prev_delivered; +}; + +/* Per CPU container for runtime CPPC management. 
*/ +struct cpudata { + int cpu; + struct cppc_perf_caps perf_caps; + struct cppc_perf_ctrls perf_ctrls; + struct cppc_perf_fb_ctrs perf_fb_ctrs; + struct cpufreq_policy *cur_policy; + unsigned int shared_type; + cpumask_var_t shared_cpu_map; +}; + +extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs); +extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls); +extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps); +extern int acpi_get_psd_map(struct cpudata **); + +/* Methods to interact with the PCC mailbox controller. */ +extern struct mbox_chan * + pcc_mbox_request_channel(struct mbox_client *, unsigned int); +extern int mbox_send_message(struct mbox_chan *chan, void *mssg); + +#endif /* _CPPC_ACPI_H*/
On Wednesday, August 05, 2015 09:40:27 AM Ashwin Chaugule wrote:
CPPC stands for Collaborative Processor Performance Controls and is defined in the ACPI v5.0+ spec. It describes CPU performance controls on an abstract and continuous scale allowing the platform (e.g. remote power processor) to flexibly optimize CPU performance with its knowledge of power budgets and other architecture specific knowledge.
This patch adds a shim which exports commonly used functions to get and set CPPC specific controls for each CPU. This enables CPUFreq drivers to gather per-CPU performance data and use it with existing governors, or even to implement customized governors inside CPUFreq drivers.
Signed-off-by: Ashwin Chaugule ashwin.chaugule@linaro.org Reviewed-by: Al Stone al.stone@linaro.org
drivers/acpi/Kconfig | 14 + drivers/acpi/Makefile | 1 + drivers/acpi/cppc_acpi.c | 812 +++++++++++++++++++++++++++++++++++++++++++++++ include/acpi/cppc_acpi.h | 137 ++++++++ 4 files changed, 964 insertions(+) create mode 100644 drivers/acpi/cppc_acpi.c create mode 100644 include/acpi/cppc_acpi.h
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index 54e9729..c6ec903 100644 --- a/drivers/acpi/Kconfig +++ b/drivers/acpi/Kconfig @@ -197,6 +197,20 @@ config ACPI_PROCESSOR_IDLE bool select CPU_IDLE +config ACPI_CPPC_LIB
- bool
- depends on ACPI_PROCESSOR
- depends on !ACPI_CPU_FREQ_PSS
- select MAILBOX
- select PCC
- help
This file implements common functionality to parse
It's better to start with "If this option is enabled".
CPPC tables as described in the ACPI 5.1+ spec. The
routines implemented are meant to be used by other
drivers to control CPU performance using CPPC semantics.
If your platform does not support CPPC in firmware,
leave this option disabled.
config ACPI_PROCESSOR tristate "Processor" depends on X86 || IA64 diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile index 3ea59ae..4c393a69 100644 --- a/drivers/acpi/Makefile +++ b/drivers/acpi/Makefile @@ -78,6 +78,7 @@ obj-$(CONFIG_ACPI_HED) += hed.o obj-$(CONFIG_ACPI_EC_DEBUGFS) += ec_sys.o obj-$(CONFIG_ACPI_CUSTOM_METHOD)+= custom_method.o obj-$(CONFIG_ACPI_BGRT) += bgrt.o +obj-$(CONFIG_ACPI_CPPC_LIB) += cppc_acpi.o # processor has its own "processor." module_param namespace processor-y := processor_driver.o diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c new file mode 100644 index 0000000..9c89767 --- /dev/null +++ b/drivers/acpi/cppc_acpi.c @@ -0,0 +1,812 @@ +/*
- CPPC (Collaborative Processor Performance Control) methods used
- by CPUfreq drivers.
One line please.
- (C) Copyright 2014, 2015 Linaro Ltd.
- Author: Ashwin Chaugule ashwin.chaugule@linaro.org
- This program is free software; you can redistribute it and/or
- modify it under the terms of the GNU General Public License
- as published by the Free Software Foundation; version 2
- of the License.
- CPPC describes a few methods for controlling CPU performance using
- information from a per CPU table called CPC. This table is described in
- the ACPI v5.0+ specification. The table consists of a list of
- registers which may be memory mapped or hardware registers and also may
- include some static integer values.
- CPU performance is on an abstract continuous scale as against a discretized
- P-state scale which is tied to CPU frequency only. In brief, the basic
- operation involves:
- OS makes a CPU performance request. (Can provide min and max bounds)
- Platform (such as BMC) is free to optimize request within requested bounds
- depending on power/thermal budgets etc.
- Platform conveys its decision back to OS
- The communication between OS and platform occurs through another medium
- called (PCC) Platform Communication Channel. This is a generic mailbox like
- mechanism which includes doorbell semantics to indicate register updates.
- See drivers/mailbox/pcc.c for details on PCC.
- Finer details about the PCC and CPPC spec are available in the latest
- ACPI 5.1 specification.
ACPI 5.1 is not the latest any more. I'd say "ACPI 6.0 or later" to be on the safe side.
- */
+#define pr_fmt(fmt) "ACPI CPPC: " fmt
+#include <linux/cpufreq.h> +#include <linux/delay.h>
+#include <acpi/cppc_acpi.h> +/*
- Lock to provide mutually exclusive access to the PCC
- channel. e.g. When the remote updates the shared region
- with new data, the reader needs to be protected from
- other CPUs activity on the same channel.
- */
+static DEFINE_SPINLOCK(pcc_lock);
+static DEFINE_PER_CPU(struct cpc_desc *, cpc_desc_ptr);
A description of what the per-CPU thing is and how it is used would be good to have here.
+/* This layer handles all the PCC specifics for CPPC. */ +static struct mbox_chan *pcc_channel; +static void __iomem *pcc_comm_addr; +static u64 comm_base_addr; +static int pcc_subspace_idx = -1; +static u16 pcc_cmd_delay; +static int pcc_channel_acquired;
+#define NUM_RETRIES 500
How did you get that number?
+static int send_pcc_cmd(u16 cmd) +{
- int err, result = 0;
- int retries = NUM_RETRIES;
- struct acpi_pcct_hw_reduced *pcct_ss = pcc_channel->con_priv;
- struct acpi_pcct_shared_memory *generic_comm_base =
(struct acpi_pcct_shared_memory *) pcc_comm_addr;
- u32 cmd_latency = pcct_ss->latency;
- /* Write to the shared comm region. */
- writew(cmd, &generic_comm_base->command);
- /* Flip CMD COMPLETE bit */
- writew(0, &generic_comm_base->status);
- err = mbox_send_message(pcc_channel, &cmd);
- if (err < 0) {
pr_err("Err sending PCC mbox message. cmd:%d, ret:%d\n",
cmd, err);
return err;
- }
- /* Wait for a nominal time to let platform processes command. */
- udelay(cmd_latency);
- /* Retry in case the remote processor was too slow to catch up. */
- while (retries--) {
It looks like this can be written as
for (retries = NUM_RETRIES; retries > 0; retries--) {
result = readw_relaxed(&generic_comm_base->status)
& PCC_CMD_COMPLETE ? 0 : -EIO;
I'm not sure why you need the ternary operator here.
You could just do
if (readw_relaxed(&generic_comm_base->status) & PCC_CMD_COMPLETE) { result = 0; break; }
and set "result" to -EIO beforehand.
if (!result) {
/* Success. */
retries = NUM_RETRIES;
We break out of the loop in the next statement, so why is this needed?
BTW, why do you need both "err" and "result"? Why not use "result" everywhere?
break;
}
- }
- mbox_client_txdone(pcc_channel, result);
- return result;
+}
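To make the loop suggestion above concrete, here is a minimal user-space sketch of the polling part only. It is not the driver code: read_status() is a hypothetical stand-in for readw_relaxed(), wait_for_cmd_complete() is a name made up for the sketch, and the mailbox calls are left out. The point is that presetting "result" to -EIO removes both the ternary and the reset of "retries".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define PCC_CMD_COMPLETE 1	/* same bit as in the patch */
#define NUM_RETRIES      500	/* value taken from the patch */

/*
 * Hypothetical stand-in for readw_relaxed(): a plain volatile load,
 * so that the sketch builds in user space.
 */
static uint16_t read_status(const volatile uint16_t *status)
{
	return *status;
}

/*
 * Poll the status word until the completion bit is set.
 * Returns 0 on completion, -EIO if the bit never shows up.
 */
static int wait_for_cmd_complete(const volatile uint16_t *status)
{
	int result = -EIO;	/* assume failure until the bit is seen */
	int retries;

	assert(status != NULL);

	for (retries = NUM_RETRIES; retries > 0; retries--) {
		if (read_status(status) & PCC_CMD_COMPLETE) {
			result = 0;
			break;
		}
	}

	return result;
}
```

With this shape a single variable carries the outcome, so the caller can pass it straight to mbox_client_txdone() and return it.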
+static void cppc_chan_tx_done(struct mbox_client *cl, void *mssg, int ret) +{
- if (ret)
pr_debug("TX did not complete: CMD sent:%x, ret:%d\n",
*(u16 *)mssg, ret);
- else
pr_debug("TX completed. CMD sent:%x, ret:%d\n",
*(u16 *)mssg, ret);
It would be good to identify the client somehow in these messages. Otherwise they may not be very useful.
+}
+struct mbox_client cppc_mbox_cl = {
- .tx_done = cppc_chan_tx_done,
- .knows_txdone = true,
+};
+static int acpi_get_psd(struct cpc_desc *cpc_ptr, acpi_handle handle) +{
- int result = 0;
- acpi_status status = AE_OK;
- struct acpi_buffer buffer = {ACPI_ALLOCATE_BUFFER, NULL};
- struct acpi_buffer format = {sizeof("NNNNN"), "NNNNN"};
- struct acpi_buffer state = {0, NULL};
- union acpi_object *psd = NULL;
- struct acpi_psd_package *pdomain;
- status = acpi_evaluate_object(handle, "_PSD", NULL, &buffer);
- if (ACPI_FAILURE(status))
return -ENODEV;
- psd = buffer.pointer;
- if (!psd || (psd->type != ACPI_TYPE_PACKAGE)) {
pr_err("Invalid _PSD data\n");
result = -ENODATA;
goto end;
- }
acpi_evaluate_object_typed() can be used here and then you save one "if".
- if (psd->package.count != 1) {
pr_err("Invalid _PSD data\n");
result = -ENODATA;
goto end;
- }
- pdomain = &(cpc_ptr->domain_info);
- state.length = sizeof(struct acpi_psd_package);
- state.pointer = pdomain;
So beyond this point, if there's an error, you always set "result" to -ENODATA. Why not set it to -ENODATA upfront and then reset it to 0 on success only? That would save you a bunch of statements.
- status = acpi_extract_package(&(psd->package.elements[0]),
&format, &state);
- if (ACPI_FAILURE(status)) {
pr_err("Invalid _PSD data\n");
Why is pr_err() the right priority for this, and what can users learn from the error message?
The same applies pretty much everywhere below.
result = -ENODATA;
goto end;
- }
- if (pdomain->num_entries != ACPI_PSD_REV0_ENTRIES) {
pr_err("Unknown _PSD:num_entries\n");
result = -ENODATA;
goto end;
- }
- if (pdomain->revision != ACPI_PSD_REV0_REVISION) {
pr_err("Unknown _PSD:revision\n");
result = -ENODATA;
goto end;
- }
- if (pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ALL &&
pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ANY &&
pdomain->coord_type != DOMAIN_COORD_TYPE_HW_ALL) {
pr_err("Invalid _PSD:coord_type\n");
result = -ENODATA;
goto end;
- }
+end:
- kfree(buffer.pointer);
- return result;
+}
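As an illustration of the "set -ENODATA upfront" idea, here is a small user-space sketch of the validation part alone. struct psd_info, psd_validate() and the macro names are hypothetical, and the constant values are assumptions made for the sketch rather than quotes from the ACPI headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, named differently from the kernel macros on purpose. */
#define PSD_NUM_ENTRIES 5
#define PSD_REVISION    0
#define COORD_SW_ALL    0xfc
#define COORD_SW_ANY    0xfd
#define COORD_HW_ALL    0xfe

/* Hypothetical mirror of the five-integer _PSD package. */
struct psd_info {
	uint64_t num_entries;
	uint64_t revision;
	uint64_t domain;
	uint64_t coord_type;
	uint64_t num_processors;
};

static int psd_validate(const struct psd_info *psd)
{
	int result = -ENODATA;	/* every failure below shares this code */

	assert(psd != NULL);

	if (psd->num_entries != PSD_NUM_ENTRIES)
		goto end;

	if (psd->revision != PSD_REVISION)
		goto end;

	if (psd->coord_type != COORD_SW_ALL &&
	    psd->coord_type != COORD_SW_ANY &&
	    psd->coord_type != COORD_HW_ALL)
		goto end;

	result = 0;		/* only reached when all checks passed */
end:
	/* The real function would kfree(buffer.pointer) here. */
	return result;
}
```

Each check shrinks to a condition and a goto, and the single cleanup label stays where it was.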
+int acpi_get_psd_map(struct cpudata **all_cpu_data) +{
- int count_target;
- int retval = 0;
- unsigned int i, j;
- cpumask_var_t covered_cpus;
- struct cpudata *pr, *match_pr;
- struct acpi_psd_package *pdomain;
- struct acpi_psd_package *match_pdomain;
- struct cpc_desc *cpc_ptr, *match_cpc_ptr;
- if (!zalloc_cpumask_var(&covered_cpus, GFP_KERNEL))
return -ENOMEM;
- /*
* Now that we have _PSD data from all CPUs, lets setup P-state
* domain info.
*/
- for_each_possible_cpu(i) {
pr = all_cpu_data[i];
if (!pr)
continue;
if (cpumask_test_cpu(i, covered_cpus))
continue;
cpc_ptr = per_cpu(cpc_desc_ptr, i);
if (!cpc_ptr)
continue;
Well, is this actually safe? What if we have CPPC control for some CPUs in a domain only?
pdomain = &(cpc_ptr->domain_info);
cpumask_set_cpu(i, pr->shared_cpu_map);
cpumask_set_cpu(i, covered_cpus);
if (pdomain->num_processors <= 1)
continue;
/* Validate the Domain info */
count_target = pdomain->num_processors;
if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ALL)
pr->shared_type = CPUFREQ_SHARED_TYPE_ALL;
else if (pdomain->coord_type == DOMAIN_COORD_TYPE_HW_ALL)
pr->shared_type = CPUFREQ_SHARED_TYPE_HW;
else if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ANY)
pr->shared_type = CPUFREQ_SHARED_TYPE_ANY;
for_each_possible_cpu(j) {
if (i == j)
continue;
match_cpc_ptr = per_cpu(cpc_desc_ptr, j);
if (!match_cpc_ptr)
continue;
match_pdomain = &(match_cpc_ptr->domain_info);
if (match_pdomain->domain != pdomain->domain)
continue;
/* Here i and j are in the same domain */
if (match_pdomain->num_processors != count_target) {
retval = -EINVAL;
So we do bail out here, so why don't we bail out on any errors? Why do we silently ignore some of them (like NULL cpc_ptr above)?
goto err_ret;
}
if (pdomain->coord_type != match_pdomain->coord_type) {
retval = -EINVAL;
goto err_ret;
}
cpumask_set_cpu(j, covered_cpus);
cpumask_set_cpu(j, pr->shared_cpu_map);
}
for_each_possible_cpu(j) {
Why do we need a separate loop over all CPUs for this? Could not the loops be combined?
if (i == j)
continue;
match_pr = all_cpu_data[j];
if (!match_pr)
continue;
match_cpc_ptr = per_cpu(cpc_desc_ptr, j);
if (!match_cpc_ptr)
continue;
match_pdomain = &(match_cpc_ptr->domain_info);
if (match_pdomain->domain != pdomain->domain)
continue;
match_pr->shared_type = pr->shared_type;
cpumask_copy(match_pr->shared_cpu_map,
pr->shared_cpu_map);
}
- }
+err_ret:
- for_each_possible_cpu(i) {
pr = all_cpu_data[i];
if (!pr)
continue;
/* Assume no coordination on any error parsing domain info */
if (retval) {
cpumask_clear(pr->shared_cpu_map);
cpumask_set_cpu(i, pr->shared_cpu_map);
pr->shared_type = CPUFREQ_SHARED_TYPE_ALL;
}
- }
- free_cpumask_var(covered_cpus);
- return retval;
+} +EXPORT_SYMBOL_GPL(acpi_get_psd_map);
+static int register_pcc_channel(unsigned pcc_subspace_idx) +{
- struct acpi_pcct_subspace *cppc_ss;
- unsigned int len;
- if (pcc_subspace_idx >= 0) {
I'd check the reverse (i.e. < 0) here and return immediately if that's the case.
pcc_channel = pcc_mbox_request_channel(&cppc_mbox_cl,
pcc_subspace_idx);
if (IS_ERR(pcc_channel)) {
pr_err("No PCC communication channel found\n");
return -ENODEV;
}
/*
* The PCC mailbox controller driver should
* have parsed the PCCT (global table of all
* PCC channels) and stored pointers to the
* subspace communication region in con_priv.
*/
cppc_ss = pcc_channel->con_priv;
if (!cppc_ss) {
pr_err("No PCC subspace found for CPPC\n");
return -ENODEV;
}
/*
* This is the shared communication region
* for the OS and Platform to communicate over.
*/
comm_base_addr = cppc_ss->base_address;
len = cppc_ss->length;
pcc_cmd_delay = cppc_ss->min_turnaround_time;
pcc_comm_addr = ioremap(comm_base_addr, len);
if (!pcc_comm_addr) {
pr_err("Failed to ioremap PCC comm region mem\n");
return -ENOMEM;
}
/* Set flag so that we dont come here for each CPU. */
pcc_channel_acquired = 1;
Shouldn't pcc_channel_acquired be a bool variable instead?
- } else
/*
* For the case where registers are not defined as PCC regs.
* Assuming all regs are FFH / SystemIO.
*/
pr_debug("No PCC subspace detected in any CPC entries.\n");
- return 0;
+}
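One more thing about the check above: the parameter is declared "unsigned", so "pcc_subspace_idx >= 0" is always true and the else branch can never run. A rough user-space sketch of the early-return shape, with a signed index and a bool flag, could look like the following. request_channel() is a hypothetical stand-in for pcc_mbox_request_channel() plus ioremap(), and the "16 subspaces" limit exists only to give the sketch a failure path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static bool channel_acquired;	/* bool rather than int */

/* Hypothetical stand-in: pretend only subspaces 0..15 exist. */
static int request_channel(int idx)
{
	return idx < 16 ? 0 : -ENODEV;
}

static int register_channel(int subspace_idx)
{
	int ret;

	/* No PCC registers in any _CPC entry: nothing to set up. */
	if (subspace_idx < 0)
		return 0;

	ret = request_channel(subspace_idx);
	if (ret)
		return ret;

	/* Set the flag only after everything has succeeded. */
	channel_acquired = true;
	return 0;
}
```

Returning early also removes one level of indentation from the rest of the function.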
+/**
- acpi_cppc_processor_probe - The _CPC table is a per CPU table
One line description here, please.
- with a bunch of entries which may be registers or integers.
Move the example to a separate comment above the kerneldoc.
- An example table looks like the following.
- Name(_CPC, Package()
{
17,
NumEntries
1,
// Revision
ResourceTemplate(){Register(PCC, 32, 0, 0x120, 2)},
// Highest Performance
ResourceTemplate(){Register(PCC, 32, 0, 0x124, 2)},
// Nominal Performance
ResourceTemplate(){Register(PCC, 32, 0, 0x128, 2)},
// Lowest Nonlinear Performance
ResourceTemplate(){Register(PCC, 32, 0, 0x12C, 2)},
// Lowest Performance
ResourceTemplate(){Register(PCC, 32, 0, 0x130, 2)},
// Guaranteed Performance Register
ResourceTemplate(){Register(PCC, 32, 0, 0x110, 2)},
// Desired Performance Register
ResourceTemplate(){Register(SystemMemory, 0, 0, 0, 0)},
..
..
..
}
- Each Register() encodes how to access that specific register.
- e.g. a sample PCC entry has the following encoding:
- Register (
PCC,
AddressSpaceKeyword
8,
//RegisterBitWidth
8,
//RegisterBitOffset
0x30,
//RegisterAddress
9
//AccessSize (subspace ID)
0
)
}
- This function walks through all the per CPU _CPC entries and extracts
- the Register details.
- Return: 0 for success or negative value for err.
And the argument needs to be documented in the kerneldoc too.
- */
+int acpi_cppc_processor_probe(struct acpi_processor *pr) +{
- struct acpi_buffer output = {ACPI_ALLOCATE_BUFFER, NULL};
- union acpi_object *out_obj, *cpc_obj;
- struct cpc_desc *cpc_ptr;
- struct cpc_reg *gas_t;
- acpi_handle handle = pr->handle;
- unsigned int num_ent, i, cpc_rev, ret = 0;
- acpi_status status;
- /* Parse the ACPI _CPC table for this cpu. */
- if (!acpi_has_method(handle, "_CPC")) {
pr_debug("_CPC table not found\n");
ret = -ENODEV;
goto out_buf_free;
- }
You don't need to do the above (the below will fail if _CPC is not present) and I'm not sure if the debug message is worth it.
- status = acpi_evaluate_object(handle, "_CPC", NULL, &output);
- if (ACPI_FAILURE(status)) {
ret = -ENODEV;
goto out_buf_free;
- }
- out_obj = (union acpi_object *) output.pointer;
- if (out_obj->type != ACPI_TYPE_PACKAGE) {
ret = -ENODEV;
goto out_buf_free;
- }
Again, acpi_evaluate_object_typed() would save you one branch.
- cpc_ptr = kzalloc(sizeof(struct cpc_desc), GFP_KERNEL);
- if (!cpc_ptr)
return -ENOMEM;
- /* First entry is NumEntries. */
- cpc_obj = &out_obj->package.elements[0];
- if (cpc_obj->type == ACPI_TYPE_INTEGER) {
num_ent = cpc_obj->integer.value;
- } else {
pr_debug("Unexpected entry type(%d) for NumEntries\n",
cpc_obj->type);
goto out_free;
- }
- /* Only support CPPCv2. Bail otherwise. */
- if (num_ent != CPPC_NUM_ENT) {
pr_err("Firmware exports %d entries. Expected: %d\n",
num_ent, CPPC_NUM_ENT);
ret = -EINVAL;
Why -EINVAL? It doesn't mean "invalid argument" surely?
goto out_free;
- }
- /* Second entry should be revision. */
- cpc_obj = &out_obj->package.elements[1];
- if (cpc_obj->type == ACPI_TYPE_INTEGER) {
cpc_rev = cpc_obj->integer.value;
- } else {
pr_debug("Unexpected entry type(%d) for Revision\n",
cpc_obj->type);
goto out_free;
- }
- if (cpc_rev != CPPC_REV) {
pr_err("Firmware exports revision:%d. Expected:%d\n",
cpc_rev, CPPC_REV);
goto out_free;
- }
- /* Iterate through remaining entries in _CPC */
- for (i = 2; i < num_ent; i++) {
cpc_obj = &out_obj->package.elements[i];
if (cpc_obj->type == ACPI_TYPE_INTEGER) {
cpc_ptr->cpc_regs[i-2].type =
ACPI_TYPE_INTEGER;
cpc_ptr->cpc_regs[i-2].cpc_entry.int_value =
cpc_obj->integer.value;
} else if (cpc_obj->type == ACPI_TYPE_BUFFER) {
gas_t = (struct cpc_reg *)
cpc_obj->buffer.pointer;
/*
* The PCC Subspace index is encoded inside
* the CPC table entries. The same PCC index
* will be used for all the PCC entries,
* so extract it only once.
*/
if (gas_t->space_id ==
ACPI_ADR_SPACE_PLATFORM_COMM) {
Please don't break lines like this. I know that it'll be more than 80 chars, but that's OK. Or if you really care, you can move that code to a helper function.
if (pcc_subspace_idx < 0)
pcc_subspace_idx =
gas_t->access_width;
else if (pcc_subspace_idx !=
gas_t->access_width) {
/*
* Mismatched PCC id detected.
* Firmware bug.
*/
goto out_free;
}
}
cpc_ptr->cpc_regs[i-2].type =
ACPI_TYPE_BUFFER;
cpc_ptr->cpc_regs[i-2].cpc_entry.reg =
(struct cpc_reg) {
.space_id = gas_t->space_id,
.length = gas_t->length,
.bit_width = gas_t->bit_width,
.bit_offset = gas_t->bit_offset,
.address = gas_t->address,
.access_width =
gas_t->access_width,
Why don't you use memcpy() for copying this?
};
} else {
pr_debug("Error in entry:%d in CPC table.\n", i);
ret = -EINVAL;
goto out_free;
}
- }
- /* Plug it into this CPUs CPC descriptor. */
- per_cpu(cpc_desc_ptr, pr->id) = cpc_ptr;
- /* Parse PSD data for this CPU */
- ret = acpi_get_psd(cpc_ptr, handle);
- if (ret)
goto out_free;
- /* Register PCC channel once for all CPUs. */
- if (!pcc_channel_acquired) {
ret = register_pcc_channel(pcc_subspace_idx);
So here's a question: What if pcc_subspace_idx for the new CPU is different from the one we've registered the channel with?
Also, is this guaranteed to be run sequentially for all of the different CPUs?
If not, what if they race with each other here and the channel is registered twice as a result?
if (ret)
goto out_free;
- }
- /* Everything looks okay */
- pr_info("Successfully parsed CPC struct for CPU: %d\n", pr->id);
- kfree(output.pointer);
- return 0;
+out_free:
- cpc_ptr = per_cpu(cpc_desc_ptr, pr->id);
- kfree(cpc_ptr);
+out_buf_free:
- kfree(output.pointer);
- return -ENODEV;
+} +EXPORT_SYMBOL_GPL(acpi_cppc_processor_probe);
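The error paths in this function also look fragile to me. As posted, the -ENOMEM return skips freeing the _CPC buffer, out_free frees whatever is in the per-CPU slot rather than the local allocation, and the final "return -ENODEV" throws away the value stored in "ret". Below is a user-space sketch of the usual unwind pattern, with malloc()/free() standing in for the kernel allocators. probe(), struct desc and registered_desc are names made up for the sketch, and -ENODEV is used for the mismatch only as a placeholder.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct desc {
	int num_entries;
};

/* Stands in for the per-CPU descriptor pointer. */
static struct desc *registered_desc;

/* 'firmware_entries' stands in for the NumEntries value read from _CPC. */
static int probe(int firmware_entries, int expected_entries)
{
	void *buf;		/* stands in for output.pointer */
	struct desc *d;
	int ret;		/* signed, since it carries negative errnos */

	buf = malloc(64);
	if (!buf)
		return -ENOMEM;

	d = calloc(1, sizeof(*d));
	if (!d) {
		ret = -ENOMEM;
		goto out_buf_free;
	}

	if (firmware_entries != expected_entries) {
		ret = -ENODEV;	/* placeholder error code for the sketch */
		goto out_free;
	}

	d->num_entries = firmware_entries;
	registered_desc = d;	/* publish only after all checks passed */
	d = NULL;		/* ownership moved, do not free below */
	ret = 0;

out_free:
	free(d);		/* free(NULL) is a no-op */
out_buf_free:
	free(buf);
	return ret;
}
```

Success and failure share one exit, the descriptor is published only once nothing can fail any more, and the caller sees the specific error code.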
+static u64 cpc_trans(struct cpc_register_resource *reg, int cmd, u64 write_val,
bool is_pcc)
+{
- u64 addr;
- u64 read_val = 0;
- /* PCC communication addr space begins at byte offset 0x8. */
- addr = is_pcc ? (u64)pcc_comm_addr + 0x8 + reg->cpc_entry.reg.address :
reg->cpc_entry.reg.address;
Move the above to a separate function and document the formula.
- if (reg->type == ACPI_TYPE_BUFFER) {
Quite a bit of code duplication below. Any chance to reduce it?
switch (reg->cpc_entry.reg.bit_width) {
case 8:
if (cmd == CMD_READ)
read_val = readb((void *) (addr));
else if (cmd == CMD_WRITE)
writeb(write_val, (void *)(addr));
else
pr_debug("Unsupported cmd type: %d\n", cmd);
break;
case 16:
if (cmd == CMD_READ)
read_val = readw((void *) (addr));
else if (cmd == CMD_WRITE)
writew(write_val, (void *)(addr));
else
pr_debug("Unsupported cmd type: %d\n", cmd);
break;
case 32:
if (cmd == CMD_READ)
read_val = readl((void *) (addr));
else if (cmd == CMD_WRITE)
writel(write_val, (void *)(addr));
else
pr_debug("Unsupported cmd type: %d\n", cmd);
break;
case 64:
if (cmd == CMD_READ)
read_val = readq((void *) (addr));
else if (cmd == CMD_WRITE)
writeq(write_val, (void *)(addr));
else
pr_debug("Unsupported cmd type: %d\n", cmd);
break;
default:
pr_debug("Unsupported bit width for CPC cmd:%d\n",
cmd);
break;
}
- } else if (reg->type == ACPI_TYPE_INTEGER) {
if (cmd == CMD_READ)
read_val = reg->cpc_entry.int_value;
else if (cmd == CMD_WRITE)
reg->cpc_entry.int_value = write_val;
else
pr_debug("Unsupported cmd type: %d\n", cmd);
- } else
pr_debug("Unsupported CPC entry type:%d\n", reg->type);
- return read_val;
+}
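For the two comments above (address formula in a helper, less duplication), one possible shape is sketched below in user-space C. reg_addr(), reg_access() and the REG_ACCESS() macro are hypothetical names, plain volatile accesses stand in for readb()/writeb() and friends, and the 0x8 offset is simply the one used in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCC_DATA_OFFSET 0x8	/* start of the PCC communication space */

enum reg_cmd { REG_READ, REG_WRITE };

/*
 * PCC registers are offsets into the shared region, past its header:
 *     addr = pcc_base + PCC_DATA_OFFSET + reg_address
 * Any other register carries its address directly.
 */
static uintptr_t reg_addr(uintptr_t pcc_base, uint64_t reg_address, bool is_pcc)
{
	return is_pcc ? pcc_base + PCC_DATA_OFFSET + reg_address : reg_address;
}

/* One access of the given C type, read or write depending on cmd. */
#define REG_ACCESS(type, addr, cmd, val)				\
	do {								\
		volatile type *p = (volatile type *)(addr);		\
		if ((cmd) == REG_READ)					\
			*(val) = *p;					\
		else							\
			*p = (type)*(val);				\
	} while (0)

/* Returns 0 on success, -EINVAL for an unknown command or bit width. */
static int reg_access(void *addr, unsigned int bit_width, enum reg_cmd cmd,
		      uint64_t *val)
{
	assert(addr != NULL && val != NULL);

	/* Validate the command once instead of in every case. */
	if (cmd != REG_READ && cmd != REG_WRITE)
		return -EINVAL;

	switch (bit_width) {
	case 8:
		REG_ACCESS(uint8_t, addr, cmd, val);
		break;
	case 16:
		REG_ACCESS(uint16_t, addr, cmd, val);
		break;
	case 32:
		REG_ACCESS(uint32_t, addr, cmd, val);
		break;
	case 64:
		REG_ACCESS(uint64_t, addr, cmd, val);
		break;
	default:
		return -EINVAL;
	}

	return 0;
}
```

Returning an error code, with the value passed through a pointer, also lets callers tell an unsupported width apart from a register that really reads as 0, which the posted cpc_trans() cannot do.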
+/**
+ * cppc_get_perf_caps - Get a CPUs performance capabilities.
+ * @cpunum: CPU from which to get capabilities info.
+ * @perf_caps: ptr to cppc_perf_caps. See cppc_acpi.h
+ * Return - 0 for success with perf_caps populated else
+ *	-ERRNO.
+ */
+int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
+{
+	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
+	struct cpc_register_resource *highest_reg, *lowest_reg, *ref_perf,
+				     *nom_perf;
+	u64 min, max, ref, nom;
+	bool is_pcc = false;
+	int ret;
+
+	if (!cpc_desc) {
+		pr_debug("No CPC descriptor for CPU:%d\n", cpunum);
+		return -ENODEV;
+	}
+
+	highest_reg = &cpc_desc->cpc_regs[HIGHEST_PERF];
+	lowest_reg = &cpc_desc->cpc_regs[LOWEST_PERF];
+	ref_perf = &cpc_desc->cpc_regs[REFERENCE_PERF];
+	nom_perf = &cpc_desc->cpc_regs[NOMINAL_PERF];
+
+	spin_lock(&pcc_lock);
Are we only going to acquire this spinlock from IRQ context or from process context or from both? If from both, what prevents deadlocks from happening if the below is interrupted and the interrupt context attempts to acquire the lock?
+	/* Are any of the regs PCC ?*/
+	if ((highest_reg->cpc_entry.reg.space_id ==
+				ACPI_ADR_SPACE_PLATFORM_COMM) ||
+			(lowest_reg->cpc_entry.reg.space_id ==
+			 ACPI_ADR_SPACE_PLATFORM_COMM) ||
+			(ref_perf->cpc_entry.reg.space_id ==
+			 ACPI_ADR_SPACE_PLATFORM_COMM) ||
+			(nom_perf->cpc_entry.reg.space_id ==
+			 ACPI_ADR_SPACE_PLATFORM_COMM))
+		is_pcc = true;
+
+	if (is_pcc) {
+		/*
+		 * Min time OS should wait before sending
+		 * next command.
+		 */
+		udelay(pcc_cmd_delay);
+		/* Ring doorbell */
+		ret = send_pcc_cmd(CMD_READ);
+		if (ret) {
+			spin_unlock(&pcc_lock);
+			return -EIO;
+		}
+	}
+
+	max = cpc_trans(highest_reg, CMD_READ, 0, is_pcc);
+	perf_caps->highest_perf = max;
+
+	min = cpc_trans(lowest_reg, CMD_READ, 0, is_pcc);
+	perf_caps->lowest_perf = min;
+
+	ref = cpc_trans(ref_perf, CMD_READ, 0, is_pcc);
+	perf_caps->reference_perf = ref;
+
+	nom = cpc_trans(nom_perf, CMD_READ, 0, is_pcc);
+	perf_caps->nominal_perf = nom;
+
+	if (!ref)
+		perf_caps->reference_perf = perf_caps->nominal_perf;
+
+	spin_unlock(&pcc_lock);
+
+	if (!perf_caps->highest_perf ||
+			!perf_caps->lowest_perf ||
+			!perf_caps->reference_perf ||
+			!perf_caps->nominal_perf) {
+		return -EINVAL;
Again, why -EINVAL?
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cppc_get_perf_caps);
+/**
+ * cppc_get_perf_ctrs - Read a CPUs performance feedback counters.
+ * @cpunum: CPU from which to read counters.
+ * @perf_fb_ctrs: ptr to cppc_perf_fb_ctrs. See cppc_acpi.h
+ * Return - 0 for success with perf_fb_ctrs populated else
+ *	-ERRNO.
+ */
+int cppc_get_perf_ctrs(int cpunum, struct cppc_perf_fb_ctrs *perf_fb_ctrs)
+{
+	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
+	struct cpc_register_resource *delivered_reg, *reference_reg;
+	u64 delivered, reference;
+	bool is_pcc = false;
+	int ret;
+
+	if (!cpc_desc) {
+		pr_debug("No CPC descriptor for CPU:%d\n", cpunum);
+		return -ENODEV;
+	}
+
+	delivered_reg = &cpc_desc->cpc_regs[DELIVERED_CTR];
+	reference_reg = &cpc_desc->cpc_regs[REFERENCE_CTR];
+
+	spin_lock(&pcc_lock);
+
+	/* Are any of the regs PCC ?*/
+	if ((delivered_reg->cpc_entry.reg.space_id ==
+				ACPI_ADR_SPACE_PLATFORM_COMM) ||
+			(reference_reg->cpc_entry.reg.space_id ==
+			 ACPI_ADR_SPACE_PLATFORM_COMM))
+		is_pcc = true;
+
+	if (is_pcc) {
+		/*
+		 * Min time OS should wait before sending
+		 * next command.
+		 */
+		udelay(pcc_cmd_delay);
+		/* Ring doorbell */
+		ret = send_pcc_cmd(CMD_READ);
+		if (ret) {
+			spin_unlock(&pcc_lock);
+			return -EIO;
+		}
The above looks like some duplicated code. Any chance to move it into a separate routine and call from both places?
+	}
+
+	delivered = cpc_trans(delivered_reg, CMD_READ, 0, is_pcc);
+	reference = cpc_trans(reference_reg, CMD_READ, 0, is_pcc);
+
+	spin_unlock(&pcc_lock);
+
+	if (!delivered || !reference)
+		return -EINVAL;
Why -EINVAL?
+	perf_fb_ctrs->delivered = delivered;
+	perf_fb_ctrs->reference = reference;
+
+	perf_fb_ctrs->delivered -= perf_fb_ctrs->prev_delivered;
+	perf_fb_ctrs->reference -= perf_fb_ctrs->prev_reference;
+
+	perf_fb_ctrs->prev_delivered = delivered;
+	perf_fb_ctrs->prev_reference = reference;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cppc_get_perf_ctrs);
+/**
+ * cppc_set_perf - Set a CPUs performance controls.
+ * @cpu: CPU for which to set performance controls.
+ * @perf_ctrls: ptr to cppc_perf_ctrls. See cppc_acpi.h
+ * Return: 0 for success, -ERRNO otherwise.
+ */
+int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls)
+{
+	struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpu);
+	struct cpc_register_resource *desired_reg;
+	int ret = 0;
+	bool is_pcc = false;
+
+	if (!cpc_desc) {
+		pr_debug("No CPC descriptor for CPU:%d\n", cpu);
+		return -ENODEV;
+	}
+
+	desired_reg = &cpc_desc->cpc_regs[DESIRED_PERF];
+
+	spin_lock(&pcc_lock);
+
+	/* Is this a PCC reg ?*/
+	if (desired_reg->cpc_entry.reg.space_id ==
+			ACPI_ADR_SPACE_PLATFORM_COMM)
+		is_pcc = true;
+
+	cpc_trans(desired_reg, CMD_WRITE,
+			perf_ctrls->desired_perf, is_pcc);
+
+	if (is_pcc) {
+		/*
+		 * Min time OS should wait before sending
+		 * next command.
+		 */
+		udelay(pcc_cmd_delay);
+		/* Ring doorbell */
+		ret = send_pcc_cmd(CMD_READ);
+	}
+
+	spin_unlock(&pcc_lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cppc_set_perf);
The header looks OK to me.
That's it for now, I need to move to other stuff probably for the rest of this week.
Thanks, Rafael
Hi Rafael,

On 25 August 2015 at 21:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
On Wednesday, August 05, 2015 09:40:27 AM Ashwin Chaugule wrote:
CPPC stands for Collaborative Processor Performance Controls and is defined in the ACPI v5.0+ spec. It describes CPU performance controls on an abstract and continuous scale allowing the platform (e.g. remote power processor) to flexibly optimize CPU performance with its knowledge of power budgets and other architecture specific knowledge.
This patch adds a shim which exports commonly used functions to get and set CPPC specific controls for each CPU. This enables CPUFreq drivers to gather per CPU performance data and use it with existing governors, or even allows for customized governors which are implemented inside CPUFreq drivers.
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Reviewed-by: Al Stone <al.stone@linaro.org>

 drivers/acpi/Kconfig     |  14 +
 drivers/acpi/Makefile    |   1 +
 drivers/acpi/cppc_acpi.c | 812 +++++++++++++++++++++++++++++++++++++++++++++++
 include/acpi/cppc_acpi.h | 137 ++++++++
 4 files changed, 964 insertions(+)
 create mode 100644 drivers/acpi/cppc_acpi.c
 create mode 100644 include/acpi/cppc_acpi.h

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 54e9729..c6ec903 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -197,6 +197,20 @@ config ACPI_PROCESSOR_IDLE
 	bool
 	select CPU_IDLE

+config ACPI_CPPC_LIB
+	bool
+	depends on ACPI_PROCESSOR
+	depends on !ACPI_CPU_FREQ_PSS
+	select MAILBOX
+	select PCC
+	help
+	  This file implements common functionality to parse
It's better to start with "If this option is enabled".
Done.
+/*
+ * CPPC (Collaborative Processor Performance Control) methods used
+ * by CPUfreq drivers.
One line please.
Done.
+ * Finer details about the PCC and CPPC spec are available in the latest
+ * ACPI 5.1 specification.
ACPI 5.1 is not the latest any more. I'd say "ACPI 6.0 or later" to be on the safe side.
Done.
+static DEFINE_PER_CPU(struct cpc_desc *, cpc_desc_ptr);
A description of what the per-CPU thing is and how it is used would be good to have here.
Done.
+/* This layer handles all the PCC specifics for CPPC. */
+static struct mbox_chan *pcc_channel;
+static void __iomem *pcc_comm_addr;
+static u64 comm_base_addr;
+static int pcc_subspace_idx = -1;
+static u16 pcc_cmd_delay;
+static int pcc_channel_acquired;
+#define NUM_RETRIES 500
How did you get that number?
Loosely based on pcc-cpufreq.c, which implements an out-of-ACPI-spec CPPC + PCC-ish driver. I added a comment now to describe what it's for. In reality, on silicon, we hope there's no more than a couple of retries at worst, but it's hard to tell what's out there.
/* Retry in case the remote processor was too slow to catch up. */
while (retries--) {
It looks like this can be written as
for (retries = NUM_RETRIES; retries > 0; retries--) {
result = readw_relaxed(&generic_comm_base->status)
& PCC_CMD_COMPLETE ? 0 : -EIO;
I'm not sure why do you need the ternary operator here.
You could just do
if (readw_relaxed(&generic_comm_base->status) & PCC_CMD_COMPLETE) { result = 0; break; }
and set "result" to -EIO beforehand.
if (!result) {
/* Success. */
retries = NUM_RETRIES;
We break out of the loop in the next statement, so why is this needed?
BTW, why do you need both "err" and "result"? Why not to use "result" everywhere?
True. Done.
break;
}
}
mbox_client_txdone(pcc_channel, result);
return result;
+}
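The loop shape suggested in the review above (preset the result to -EIO, clear it only when the completion bit is seen) can be sketched in plain userspace C. This is an illustration, not the patch code: `poll_cmd_complete`, `status_reader`, `fake_platform` and `fake_read_status` are hypothetical names standing in for the `readw_relaxed()` poll of the shared status word, and the `PCC_CMD_COMPLETE` bit value is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RETRIES      500
#define PCC_CMD_COMPLETE 0x1 /* assumed bit value, for illustration only */

/* Hypothetical stand-in for readw_relaxed(&generic_comm_base->status). */
typedef uint16_t (*status_reader)(void *ctx);

/* Poll until the completion bit is seen or the retry budget runs out. */
static int poll_cmd_complete(status_reader read_status, void *ctx)
{
	int result = -EIO; /* assume failure until completion is observed */
	int retries;

	assert(read_status != NULL);

	for (retries = NUM_RETRIES; retries > 0; retries--) {
		if (read_status(ctx) & PCC_CMD_COMPLETE) {
			result = 0;
			break;
		}
	}
	return result;
}

/* Test double: reports completion after a configurable number of reads. */
struct fake_platform {
	int reads_until_done;
};

static uint16_t fake_read_status(void *ctx)
{
	struct fake_platform *p = ctx;

	if (p->reads_until_done > 0) {
		p->reads_until_done--;
		return 0;
	}
	return PCC_CMD_COMPLETE;
}
```

With a single `result` variable there is no need to reset `retries` on success or to keep a separate `err`.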
+static void cppc_chan_tx_done(struct mbox_client *cl, void *mssg, int ret)
+{
if (ret)
pr_debug("TX did not complete: CMD sent:%x, ret:%d\n",
*(u16 *)mssg, ret);
else
pr_debug("TX completed. CMD sent:%x, ret:%d\n",
*(u16 *)mssg, ret);
It would be good to identify the client somehow in these messages. Otherwise they may not be quite useful.
For more details, I'd have to pack the CPU id in the PCC cmd field and unpack it here. But from the PCC point of view, CPPC as a whole is a client, so the pr_fmt prefix at least helps to identify it. Seemed helpful enough for debug so far.
psd = buffer.pointer;
if (!psd || (psd->type != ACPI_TYPE_PACKAGE)) {
pr_err("Invalid _PSD data\n");
result = -ENODATA;
goto end;
}
acpi_evaluate_object_typed() can be used here and then you save one "if".
Ok. I suppose it helps readability here, although that function has many more if's inside it. :)
if (psd->package.count != 1) {
pr_err("Invalid _PSD data\n");
result = -ENODATA;
goto end;
}
pdomain = &(cpc_ptr->domain_info);
state.length = sizeof(struct acpi_psd_package);
state.pointer = pdomain;
So beyond this point, if there's an error, you always set "result" to -ENODATA. Why not to set it to -ENODATA upfront and then reset it to 0 on success only? That would save you a bunch of statements.
True. Done.
status = acpi_extract_package(&(psd->package.elements[0]),
&format, &state);
if (ACPI_FAILURE(status)) {
pr_err("Invalid _PSD data\n");
Why is that error priority and what can users see from the error message?
Same pretty much everywhere below?
So, I ported all this PSD stuff over from processor_perflib.c assuming it "just works" there. FWIW I couldn't reuse that function since it is tied too closely to _PSS structures. This err would indicate the PSD package itself is screwed up, otherwise the errs below indicate specific entries within PSD could be wrong. I'll make them pr_debugs here though.
result = -ENODATA;
goto end;
}
if (pdomain->num_entries != ACPI_PSD_REV0_ENTRIES) {
pr_err("Unknown _PSD:num_entries\n");
result = -ENODATA;
goto end;
}
if (pdomain->revision != ACPI_PSD_REV0_REVISION) {
pr_err("Unknown _PSD:revision\n");
result = -ENODATA;
goto end;
}
if (pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ALL &&
pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ANY &&
pdomain->coord_type != DOMAIN_COORD_TYPE_HW_ALL) {
pr_err("Invalid _PSD:coord_type\n");
result = -ENODATA;
goto end;
}
+end:
kfree(buffer.pointer);
return result;
+}
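The "set -ENODATA up front, reset to 0 only on success" suggestion for the _PSD checks could look roughly like the sketch below. It is a userspace approximation under stated assumptions: `struct psd_info` and `validate_psd` are hypothetical names, and the constant values are illustrative placeholders for the definitions in the ACPI headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values; the real definitions live in the ACPI headers. */
#define PSD_REV0_ENTRIES  5
#define PSD_REV0_REVISION 0
#define COORD_SW_ALL      0xfc
#define COORD_SW_ANY      0xfd
#define COORD_HW_ALL      0xfe

/* Hypothetical mirror of the fields checked in the quoted code. */
struct psd_info {
	uint64_t num_entries;
	uint64_t revision;
	uint64_t coord_type;
};

static int validate_psd(const struct psd_info *psd)
{
	int result = -ENODATA; /* every failure path shares this code */

	assert(psd != NULL);

	if (psd->num_entries != PSD_REV0_ENTRIES)
		goto end;
	if (psd->revision != PSD_REV0_REVISION)
		goto end;
	if (psd->coord_type != COORD_SW_ALL &&
	    psd->coord_type != COORD_SW_ANY &&
	    psd->coord_type != COORD_HW_ALL)
		goto end;

	result = 0; /* all checks passed */
end:
	/* common cleanup (kfree(buffer.pointer) in the patch) goes here */
	return result;
}
```

Each failing check then needs only a `goto end`, which removes the repeated `result = -ENODATA` assignments.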
+int acpi_get_psd_map(struct cpudata **all_cpu_data)
+{
int count_target;
int retval = 0;
unsigned int i, j;
cpumask_var_t covered_cpus;
struct cpudata *pr, *match_pr;
struct acpi_psd_package *pdomain;
struct acpi_psd_package *match_pdomain;
struct cpc_desc *cpc_ptr, *match_cpc_ptr;
if (!zalloc_cpumask_var(&covered_cpus, GFP_KERNEL))
return -ENOMEM;
/*
* Now that we have _PSD data from all CPUs, lets setup P-state
* domain info.
*/
for_each_possible_cpu(i) {
pr = all_cpu_data[i];
if (!pr)
continue;
if (cpumask_test_cpu(i, covered_cpus))
continue;
cpc_ptr = per_cpu(cpc_desc_ptr, i);
if (!cpc_ptr)
continue;
Well, is this actually safe? What if we have CPPC control for some CPUs in a domain only?
I don't think that's possible, since we can't have CPPC and any other scheme (e.g. PSS) actively running at the same time. Also, in this case, IIUC there could be some CPUs in a domain that are present but not available at bootup, so their cpc_desc ptr could be NULL.
pdomain = &(cpc_ptr->domain_info);
cpumask_set_cpu(i, pr->shared_cpu_map);
cpumask_set_cpu(i, covered_cpus);
if (pdomain->num_processors <= 1)
continue;
/* Validate the Domain info */
count_target = pdomain->num_processors;
if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ALL)
pr->shared_type = CPUFREQ_SHARED_TYPE_ALL;
else if (pdomain->coord_type == DOMAIN_COORD_TYPE_HW_ALL)
pr->shared_type = CPUFREQ_SHARED_TYPE_HW;
else if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ANY)
pr->shared_type = CPUFREQ_SHARED_TYPE_ANY;
for_each_possible_cpu(j) {
if (i == j)
continue;
match_cpc_ptr = per_cpu(cpc_desc_ptr, j);
if (!match_cpc_ptr)
continue;
match_pdomain = &(match_cpc_ptr->domain_info);
if (match_pdomain->domain != pdomain->domain)
continue;
/* Here i and j are in the same domain */
if (match_pdomain->num_processors != count_target) {
retval = -EINVAL;
So we do bail out here, so why don't we bail out on any errors? Why do we silently ignore some of them (like NULL cpc_ptr above)?
I think the idea is that you can't have a system with matching PSDs and mismatching entries within. processor_perflib.c has the same assumption.
goto err_ret;
}
if (pdomain->coord_type != match_pdomain->coord_type) {
retval = -EINVAL;
goto err_ret;
}
cpumask_set_cpu(j, covered_cpus);
cpumask_set_cpu(j, pr->shared_cpu_map);
}
for_each_possible_cpu(j) {
Why do we need a separate loop over all CPUs for this? Could not the loops be combined?
Without getting too fancy, I don't see how to avoid this O(n^2) looping.
+static int register_pcc_channel(unsigned pcc_subspace_idx)
+{
struct acpi_pcct_subspace *cppc_ss;
unsigned int len;
if (pcc_subspace_idx >= 0) {
I'd check the reverse (ie. < 0) here and return immediately if that's the case.
Ok.
pcc_channel = pcc_mbox_request_channel(&cppc_mbox_cl,
pcc_subspace_idx);
if (IS_ERR(pcc_channel)) {
pr_err("No PCC communication channel found\n");
return -ENODEV;
}
/*
* The PCC mailbox controller driver should
* have parsed the PCCT (global table of all
* PCC channels) and stored pointers to the
* subspace communication region in con_priv.
*/
cppc_ss = pcc_channel->con_priv;
if (!cppc_ss) {
pr_err("No PCC subspace found for CPPC\n");
return -ENODEV;
}
/*
* This is the shared communication region
* for the OS and Platform to communicate over.
*/
comm_base_addr = cppc_ss->base_address;
len = cppc_ss->length;
pcc_cmd_delay = cppc_ss->min_turnaround_time;
pcc_comm_addr = ioremap(comm_base_addr, len);
if (!pcc_comm_addr) {
pr_err("Failed to ioremap PCC comm region mem\n");
return -ENOMEM;
}
/* Set flag so that we dont come here for each CPU. */
pcc_channel_acquired = 1;
Should pcc_channel_acquired be a bool variable rather?
Sure.
} else
/*
* For the case where registers are not defined as PCC regs.
* Assuming all regs are FFH / SystemIO.
*/
pr_debug("No PCC subspace detected in any CPC entries.\n");
return 0;
+}
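The early-return form requested in the review might be shaped as below. This is a hedged userspace sketch: `register_channel`, `acquire_channel` and `channel_acquired` are hypothetical stand-ins for the channel request and ioremap steps. The index parameter is declared as a signed `int` here, because an `unsigned` value can never compare below zero and the check would otherwise have no effect.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static bool channel_acquired;

/* Hypothetical stand-in for pcc_mbox_request_channel() plus ioremap(). */
static int acquire_channel(int idx)
{
	assert(idx >= 0);
	return idx < 4 ? 0 : -ENODEV; /* pretend only subspaces 0..3 exist */
}

static int register_channel(int subspace_idx)
{
	int ret;

	/* No PCC registers were described: nothing to set up. */
	if (subspace_idx < 0)
		return 0;

	ret = acquire_channel(subspace_idx);
	if (ret)
		return ret;

	/* Remember success so later CPUs skip the registration. */
	channel_acquired = true;
	return 0;
}
```

Returning early also removes one level of indentation from the main body.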
+/**
+ * acpi_cppc_processor_probe - The _CPC table is a per CPU table
One line description here, please.
Done.
+ * which a bunch of entries which may be registers or integers.
Move the example to a separate comment above the kerneldoc.
Ok.
+ * This function walks through all the per CPU _CPC entries and extracts
+ * the Register details.
+ * Return: 0 for success or negative value for err.
And the argument needs to be documented in the kerneldoc too.
Gah! Right.
+ */
+int acpi_cppc_processor_probe(struct acpi_processor *pr)
+{
struct acpi_buffer output = {ACPI_ALLOCATE_BUFFER, NULL};
union acpi_object *out_obj, *cpc_obj;
struct cpc_desc *cpc_ptr;
struct cpc_reg *gas_t;
acpi_handle handle = pr->handle;
unsigned int num_ent, i, cpc_rev, ret = 0;
acpi_status status;
/* Parse the ACPI _CPC table for this cpu. */
if (!acpi_has_method(handle, "_CPC")) {
pr_debug("_CPC table not found\n");
ret = -ENODEV;
goto out_buf_free;
}
You don't need to do the above (the below will fail if _CPC is not present) and I'm not sure if the debug message is worth it.
Ok.
status = acpi_evaluate_object(handle, "_CPC", NULL, &output);
if (ACPI_FAILURE(status)) {
ret = -ENODEV;
goto out_buf_free;
}
out_obj = (union acpi_object *) output.pointer;
if (out_obj->type != ACPI_TYPE_PACKAGE) {
ret = -ENODEV;
goto out_buf_free;
}
Again, acpi_evaluate_object_typed() would save you one branch.
Ok.
/* Only support CPPCv2. Bail otherwise. */
if (num_ent != CPPC_NUM_ENT) {
pr_err("Firmware exports %d entries. Expected: %d\n",
num_ent, CPPC_NUM_ENT);
ret = -EINVAL;
Why -EINVAL? It doesn't mean "invalid argument" surely?
:) Changed to -EFAULT.
/*
* The PCC Subspace index is encoded inside
* the CPC table entries. The same PCC index
* will be used for all the PCC entries,
* so extract it only once.
*/
if (gas_t->space_id ==
ACPI_ADR_SPACE_PLATFORM_COMM) {
Please don't break lines like this. I know that it'll be more than 80 chars, but that's OK. Or if you really care, you can move that code to a helper function.
Works for me. Thanks.
if (pcc_subspace_idx < 0)
pcc_subspace_idx =
gas_t->access_width;
else if (pcc_subspace_idx !=
gas_t->access_width) {
/*
* Mismatched PCC id detected.
* Firmware bug.
*/
goto out_free;
}
}
cpc_ptr->cpc_regs[i-2].type =
ACPI_TYPE_BUFFER;
cpc_ptr->cpc_regs[i-2].cpc_entry.reg =
(struct cpc_reg) {
.space_id = gas_t->space_id,
.length = gas_t->length,
.bit_width = gas_t->bit_width,
.bit_offset = gas_t->bit_offset,
.address = gas_t->address,
.access_width =
gas_t->access_width,
Why don't you use memcpy() for copying this?
Will do. I think previously I had gas_t as a generic register type, which has a slightly different layout than the PCC register.
/* Register PCC channel once for all CPUs. */
if (!pcc_channel_acquired) {
ret = register_pcc_channel(pcc_subspace_idx);
So here's a question: What if pcc_subspace_idx for the new CPU is different from the one we've registered the channel with?
That would be a bug in the CPC tables. CPPC being one client of PCC is assigned only one PCC subspace, so all CPUs should have the same PCC subspace id. This is caught in the check above.
Also, is this guaranteed to be run sequentially for all of the different CPUs?
Yes. IIUC it's called sequentially when the processor_driver detects a Processor object.
If not, what if they race with each other here and the channel is registered twice as a result?
I couldn't find a place in the ACPI boot flow where the Processor object probing could happen in parallel, but you're more familiar with this than me. :)
/* PCC communication addr space begins at byte offset 0x8. */
addr = is_pcc ? (u64)pcc_comm_addr + 0x8 + reg->cpc_entry.reg.address :
reg->cpc_entry.reg.address;
Move the above to a separate function and document the formula.
Done.
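As a rough illustration of the helper being asked for, the address formula can be isolated and documented as follows. This is only a sketch: `cpc_reg_addr` and `PCC_COMM_SPACE_OFFSET` are hypothetical names, and plain 64-bit integers stand in for the ioremapped base address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the 0x8 byte offset used in the quoted code. */
#define PCC_COMM_SPACE_OFFSET 0x8

/*
 * Compute the address used to access a CPC register.
 *
 * PCC registers:   pcc_comm_base + 0x8 + register offset, since the
 *                  communication space begins at byte offset 0x8 of the
 *                  shared region.
 * Other registers: the register address is used as is.
 */
static uint64_t cpc_reg_addr(uint64_t pcc_comm_base, uint64_t reg_address,
			     bool is_pcc)
{
	if (is_pcc) {
		/* Guard against wrap-around in this toy model. */
		assert(pcc_comm_base <= UINT64_MAX - PCC_COMM_SPACE_OFFSET -
					reg_address);
		return pcc_comm_base + PCC_COMM_SPACE_OFFSET + reg_address;
	}
	return reg_address;
}
```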
if (reg->type == ACPI_TYPE_BUFFER) {
Quite a bit of code duplication below. Any chance to reduce it?
Will rethink. Doubt I can avoid the switch-case though.
switch (reg->cpc_entry.reg.bit_width) {
case 8:
if (cmd == CMD_READ)
read_val = readb((void *) (addr));
else if (cmd == CMD_WRITE)
writeb(write_val, (void *)(addr));
else
pr_debug("Unsupported cmd type: %d\n", cmd);
break;
case 16:
if (cmd == CMD_READ)
read_val = readw((void *) (addr));
else if (cmd == CMD_WRITE)
writew(write_val, (void *)(addr));
else
pr_debug("Unsupported cmd type: %d\n", cmd);
break;
case 32:
if (cmd == CMD_READ)
read_val = readl((void *) (addr));
else if (cmd == CMD_WRITE)
writel(write_val, (void *)(addr));
else
pr_debug("Unsupported cmd type: %d\n", cmd);
break;
case 64:
if (cmd == CMD_READ)
read_val = readq((void *) (addr));
else if (cmd == CMD_WRITE)
writeq(write_val, (void *)(addr));
else
pr_debug("Unsupported cmd type: %d\n", cmd);
break;
default:
pr_debug("Unsupported bit width for CPC cmd:%d\n",
cmd);
break;
}
} else if (reg->type == ACPI_TYPE_INTEGER) {
if (cmd == CMD_READ)
read_val = reg->cpc_entry.int_value;
else if (cmd == CMD_WRITE)
reg->cpc_entry.int_value = write_val;
else
pr_debug("Unsupported cmd type: %d\n", cmd);
} else
pr_debug("Unsupported CPC entry type:%d\n", reg->type);
return read_val;
+}
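One possible way to cut the duplication discussed above, without dropping the switch, is to keep one width switch per direction and dispatch on the command separately. The sketch below is a userspace approximation, not the patch: `mem_read`, `mem_write`, `width_supported` and `reg_access` are hypothetical names, and volatile pointer accesses stand in for readb()/readw()/readl()/readq() and their write counterparts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum cpc_cmd { CMD_READ, CMD_WRITE };

static bool width_supported(unsigned int bits)
{
	return bits == 8 || bits == 16 || bits == 32 || bits == 64;
}

/* One switch for reads; the caller has already validated the width. */
static uint64_t mem_read(const volatile void *addr, unsigned int bits)
{
	switch (bits) {
	case 8:  return *(const volatile uint8_t *)addr;
	case 16: return *(const volatile uint16_t *)addr;
	case 32: return *(const volatile uint32_t *)addr;
	default: return *(const volatile uint64_t *)addr;
	}
}

/* One switch for writes; the value is truncated to the access width. */
static void mem_write(volatile void *addr, unsigned int bits, uint64_t val)
{
	switch (bits) {
	case 8:  *(volatile uint8_t *)addr = (uint8_t)val; break;
	case 16: *(volatile uint16_t *)addr = (uint16_t)val; break;
	case 32: *(volatile uint32_t *)addr = (uint32_t)val; break;
	default: *(volatile uint64_t *)addr = val; break;
	}
}

/* Command and width are each validated exactly once. */
static int reg_access(volatile void *addr, unsigned int bits,
		      enum cpc_cmd cmd, uint64_t *val)
{
	assert(addr != NULL && val != NULL);

	if (!width_supported(bits))
		return -EINVAL;

	switch (cmd) {
	case CMD_READ:
		*val = mem_read(addr, bits);
		return 0;
	case CMD_WRITE:
		mem_write(addr, bits, *val);
		return 0;
	}
	return -EINVAL;
}
```

This keeps a switch on the bit width, as anticipated in the reply, but the "unsupported cmd" handling appears once instead of once per width.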
+/**
+ * cppc_get_perf_caps - Get a CPUs performance capabilities.
+ * @cpunum: CPU from which to get capabilities info.
+ * @perf_caps: ptr to cppc_perf_caps. See cppc_acpi.h
+ * Return - 0 for success with perf_caps populated else
+ *	-ERRNO.
+ */
+int cppc_get_perf_caps(int cpunum, struct cppc_perf_caps *perf_caps)
+{
struct cpc_desc *cpc_desc = per_cpu(cpc_desc_ptr, cpunum);
struct cpc_register_resource *highest_reg, *lowest_reg, *ref_perf,
*nom_perf;
u64 min, max, ref, nom;
bool is_pcc = false;
int ret;
if (!cpc_desc) {
pr_debug("No CPC descriptor for CPU:%d\n", cpunum);
return -ENODEV;
}
highest_reg = &cpc_desc->cpc_regs[HIGHEST_PERF];
lowest_reg = &cpc_desc->cpc_regs[LOWEST_PERF];
ref_perf = &cpc_desc->cpc_regs[REFERENCE_PERF];
nom_perf = &cpc_desc->cpc_regs[NOMINAL_PERF];
spin_lock(&pcc_lock);
Are we only going to acquire this spinlock from IRQ context or from process context or from both? If from both, what prevents deadlocks from happening if the below is interrupted and the interrupt context attempts to acquire the lock?
IIUC, process context only. Looking around at other cpufreq drivers (e.g. pcc-cpufreq.c), I don't think the deadlock is a possibility here either.
if (!perf_caps->highest_perf ||
!perf_caps->lowest_perf ||
!perf_caps->reference_perf ||
!perf_caps->nominal_perf) {
return -EINVAL;
Again, why -EINVAL?
Changed to -EFAULT.
if (is_pcc) {
/*
* Min time OS should wait before sending
* next command.
*/
udelay(pcc_cmd_delay);
/* Ring doorbell */
ret = send_pcc_cmd(CMD_READ);
if (ret) {
spin_unlock(&pcc_lock);
return -EIO;
}
The above looks like some duplicated code. Any chance to move it into a separate routine and call from both places?
Yep. Done.
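A shared routine for the "wait, then ring the doorbell" sequence could be shaped roughly as follows. This is a userspace sketch under assumptions: `pcc_sync` is a hypothetical helper name, and `stub_delay` and `stub_send_pcc_cmd` are stubs standing in for `udelay()` and `send_pcc_cmd()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static int send_calls;       /* counts doorbell rings, for the demo */
static int next_send_result; /* value the stubbed send returns next */

/* Stub standing in for udelay(). */
static void stub_delay(unsigned int usecs)
{
	(void)usecs;
}

/* Stub standing in for send_pcc_cmd(). */
static int stub_send_pcc_cmd(int cmd)
{
	(void)cmd;
	send_calls++;
	return next_send_result;
}

/*
 * Wait out the minimum turnaround time and ring the doorbell, but only
 * when at least one register lives in PCC space. Returns 0 or -EIO; the
 * caller stays responsible for dropping its lock on error.
 */
static int pcc_sync(bool is_pcc, int cmd, unsigned int min_turnaround_us)
{
	if (!is_pcc)
		return 0;

	stub_delay(min_turnaround_us);
	return stub_send_pcc_cmd(cmd) ? -EIO : 0;
}
```

Each caller then reduces to a single call followed by one error check, with the unlock kept next to the matching lock.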
if (!delivered || !reference)
return -EINVAL;
Why -EINVAL?
:) Changed to -EFAULT.
The header looks OK to me.
Great!
That's it for now, I need to move to other stuff probably for the rest of this week.
Thanks for the follow up! I'll update this patch and resend for review sometime next week.
Regards, Ashwin.
This driver utilizes the methods introduced in the previous patch - "ACPI: Introduce CPU performance controls using CPPC" and enables usage with existing CPUFreq governors.
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Reviewed-by: Al Stone <al.stone@linaro.org>
---
 drivers/cpufreq/Kconfig.arm    |  17 ++++
 drivers/cpufreq/Makefile       |   2 +
 drivers/cpufreq/cppc_cpufreq.c | 197 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 216 insertions(+)
 create mode 100644 drivers/cpufreq/cppc_cpufreq.c

diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index cc8a71c..e8bb41b 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -261,3 +261,20 @@ config ARM_PXA2xx_CPUFREQ
 	  This add the CPUFreq driver support for Intel PXA2xx SOCs.

 	  If in doubt, say N.
+
+config ACPI_CPPC_CPUFREQ
+	tristate "CPUFreq driver based on the ACPI CPPC spec"
+	depends on ACPI
+	select ACPI_CPPC_LIB
+	default n
+	help
+	  This adds a CPUFreq driver which uses CPPC methods
+	  as described in the ACPIv5.1 spec. CPPC stands for
+	  Collaborative Processor Performance Controls. It
+	  is based on an abstract continuous scale of CPU
+	  performance values which allows the remote power
+	  processor to flexibly optimize for power and
+	  performance. CPPC relies on power management firmware
+	  support for its operation.
+
+	  If in doubt, say N.
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 2169bf7..676660e 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -78,6 +78,8 @@ obj-$(CONFIG_ARM_SA1110_CPUFREQ)	+= sa1110-cpufreq.o
 obj-$(CONFIG_ARM_SPEAR_CPUFREQ)		+= spear-cpufreq.o
 obj-$(CONFIG_ARM_TEGRA_CPUFREQ)		+= tegra-cpufreq.o
 obj-$(CONFIG_ARM_VEXPRESS_SPC_CPUFREQ)	+= vexpress-spc-cpufreq.o
+obj-$(CONFIG_ACPI_CPPC_CPUFREQ) += cppc_cpufreq.o
+

 ##################################################################################
 # PowerPC platform drivers
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
new file mode 100644
index 0000000..0933d4f
--- /dev/null
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -0,0 +1,197 @@
+/*
+ * CPPC (Collaborative Processor Performance Control) driver for
+ * interfacing with the CPUfreq layer and governors. See
+ * cppc_acpi.c for CPPC specific methods.
+ *
+ * (C) Copyright 2014, 2015 Linaro Ltd.
+ * Author: Ashwin Chaugule <ashwin.chaugule@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#define pr_fmt(fmt)	"CPPC Cpufreq:"	fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/delay.h>
+#include <linux/cpu.h>
+#include <linux/cpufreq.h>
+#include <linux/vmalloc.h>
+
+#include <acpi/cppc_acpi.h>
+
+static struct cpudata **all_cpu_data;
+
+static int cppc_cpufreq_set_target(struct cpufreq_policy *policy,
+		unsigned int target_freq,
+		unsigned int relation)
+{
+	struct cpudata *cpu;
+	struct cpufreq_freqs freqs;
+	int ret;
+
+	cpu = all_cpu_data[policy->cpu];
+
+	cpu->perf_ctrls.desired_perf = target_freq;
+	freqs.old = policy->cur;
+	freqs.new = target_freq;
+
+	cpufreq_freq_transition_begin(policy, &freqs);
+	ret = cppc_set_perf(cpu->cpu, &cpu->perf_ctrls);
+	cpufreq_freq_transition_end(policy, &freqs, ret != 0);
+
+	if (ret) {
+		pr_debug("Failed to set target on CPU:%d. ret:%d\n",
+				cpu->cpu, ret);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static unsigned int cppc_cpufreq_get_perf(unsigned int cpu_num)
+{
+	struct cpudata *cpu;
+	int32_t delivered_perf = 1;
+	int ret;
+
+	cpu = all_cpu_data[cpu_num];
+	if (!cpu)
+		return 0;
+
+	ret = cppc_get_perf_ctrs(cpu_num, &cpu->perf_fb_ctrs);
+
+	if (ret) {
+		pr_err("Err reading CPU%d, perf counters. ret:%d\n",
+				cpu->cpu, ret);
+		return -ENODEV;
+	}
+	delivered_perf = cpu->perf_caps.reference_perf *
+		cpu->perf_fb_ctrs.delivered;
+	delivered_perf /= cpu->perf_fb_ctrs.reference;
+	pr_debug("delivered_perf: %d\n", delivered_perf);
+	pr_debug("reference_perf: %d\n",
+			cpu->perf_caps.reference_perf);
+	pr_debug("delivered, reference deltas: %lld, %lld\n",
+			cpu->perf_fb_ctrs.delivered,
+			cpu->perf_fb_ctrs.reference);
+
+	return delivered_perf;
+}
+
+static int cppc_verify_policy(struct cpufreq_policy *policy)
+{
+	cpufreq_verify_within_cpu_limits(policy);
+	return 0;
+}
+
+static void cppc_cpufreq_stop_cpu(struct cpufreq_policy *policy)
+{
+	int cpu_num = policy->cpu;
+	struct cpudata *cpu = all_cpu_data[cpu_num];
+	int ret;
+
+	pr_info("CPU %d exiting\n", cpu_num);
+
+	cpu->perf_ctrls.desired_perf = cpu->perf_caps.lowest_perf;
+
+	ret = cppc_set_perf(cpu_num, &cpu->perf_ctrls);
+	if (ret)
+		pr_err("Err setting perf value:%d on CPU:%d. ret:%d\n",
+				cpu->perf_caps.lowest_perf, cpu_num, ret);
+}
+
+static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
+{
+	struct cpudata *cpu;
+	int ret;
+
+	cpu = all_cpu_data[policy->cpu];
+
+	cpu->cpu = policy->cpu;
+	ret = cppc_get_perf_caps(policy->cpu, &cpu->perf_caps);
+
+	if (ret) {
+		pr_err("Err reading CPU%d, perf capabilities. ret:%d\n",
+				cpu->cpu, ret);
+		return -ENODEV;
+	}
+
+	policy->min = cpu->perf_caps.lowest_perf;
+	policy->max = cpu->perf_caps.highest_perf;
+	/* cpuinfo and default policy values */
+	policy->cpuinfo.min_freq = cpu->perf_caps.lowest_perf;
+	policy->cpuinfo.max_freq = cpu->perf_caps.highest_perf;
+
+	if (policy->shared_type == CPUFREQ_SHARED_TYPE_ALL ||
+			policy->shared_type == CPUFREQ_SHARED_TYPE_ANY)
+		cpumask_copy(policy->cpus, cpu->shared_cpu_map);
+
+	cpumask_set_cpu(policy->cpu, policy->cpus);
+	cpu->cur_policy = policy;
+
+	pr_debug("Initialized on cpu:%d\n", cpu->cpu);
+	return 0;
+}
+
+static struct cpufreq_driver cppc_cpufreq_driver = {
+	.flags = CPUFREQ_CONST_LOOPS,
+	.verify = cppc_verify_policy,
+	.target = cppc_cpufreq_set_target,
+	.get = cppc_cpufreq_get_perf,
+	.init = cppc_cpufreq_cpu_init,
+	.stop_cpu = cppc_cpufreq_stop_cpu,
+	.name = "cppc_cpufreq",
+};
+
+static int __init cppc_cpufreq_init(void)
+{
+	int i, rc = 0;
+	struct cpudata *cpu;
+
+	if (acpi_disabled)
+		return -ENODEV;
+
+	all_cpu_data = vzalloc(sizeof(void *) * num_possible_cpus());
+	if (!all_cpu_data)
+		return -ENOMEM;
+
+	for_each_possible_cpu(i) {
+		all_cpu_data[i] = kzalloc(sizeof(struct cpudata), GFP_KERNEL);
+		if (!all_cpu_data[i])
+			goto out;
+
+		cpu = all_cpu_data[i];
+		if (!zalloc_cpumask_var(&cpu->shared_cpu_map,
+					GFP_KERNEL)) {
+			pr_debug("No memory for shared_cpus cpumask\n");
+			goto out;
+		}
+	}
+
+	rc = acpi_get_psd_map(all_cpu_data);
+	if (rc) {
+		pr_err("Error parsing PSD data\n");
+		goto out;
+	}
+
+	rc = cpufreq_register_driver(&cppc_cpufreq_driver);
+	if (rc)
+		goto out;
+
+	pr_info("Registration completed\n");
+	return rc;
+
+out:
+	for_each_possible_cpu(i)
+		if (all_cpu_data[i])
+			kfree(all_cpu_data[i]);
+
+	vfree(all_cpu_data);
+	return -ENODEV;
+}
+
+late_initcall(cppc_cpufreq_init);
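The arithmetic behind the driver's `.get` callback (delivered performance = reference performance scaled by the ratio of the two counter deltas) can be tried out in isolation. The sketch below is a userspace approximation under assumptions: `struct fb_ctrs` and `delivered_perf` are hypothetical names, 64-bit unsigned arithmetic is used throughout, and a zero reference delta is treated as "no data", which is a choice made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping: counter values from the last read. */
struct fb_ctrs {
	uint64_t prev_delivered;
	uint64_t prev_reference;
};

/*
 * delivered_perf = reference_perf * delta(delivered) / delta(reference)
 *
 * The deltas are taken against the previous sample, which is then
 * replaced by the current one.
 */
static uint64_t delivered_perf(struct fb_ctrs *c, uint64_t reference_perf,
			       uint64_t delivered_now, uint64_t reference_now)
{
	uint64_t d, r;

	assert(c != NULL);

	d = delivered_now - c->prev_delivered;
	r = reference_now - c->prev_reference;

	c->prev_delivered = delivered_now;
	c->prev_reference = reference_now;

	if (!r)
		return 0; /* no reference ticks elapsed: avoid dividing by 0 */

	return reference_perf * d / r;
}
```

For example, with a reference performance of 100, a delivered delta of 600 and a reference delta of 1000, the CPU ran at 60 on the abstract scale over that interval.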
Add weak functions for architectures which do not support hot-adding and removing CPUs which aren't detected at bootup. (e.g. via MADT).
This helps preserve the Kconfig dependency from:
commit cbfc1bae55bb ("[ACPI] ACPI_HOTPLUG_CPU Kconfig dependency update")
to prevent:
HOTPLUG_CPU=y ACPI_PROCESSOR=y ACPI_HOTPLUG_CPU=n
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
---
 drivers/acpi/acpi_processor.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 92a5f73..e1e0285 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -164,6 +164,24 @@ static int acpi_processor_errata(void)
    -------------------------------------------------------------------------- */

 #ifdef CONFIG_ACPI_HOTPLUG_CPU
+int __weak acpi_map_cpu(acpi_handle handle,
+		phys_cpuid_t physid, int *pcpu)
+{
+	return -ENODEV;
+}
+
+int __weak acpi_unmap_cpu(int cpu)
+{
+	return -ENODEV;
+}
+
+int __weak arch_register_cpu(int cpu)
+{
+	return -ENODEV;
+}
+
+void __weak arch_unregister_cpu(int cpu) {}
+
 static int acpi_processor_hotadd_init(struct acpi_processor *pr)
 {
 	unsigned long long sta;
For each detected ACPI Processor object (ACPI0007), search its device handle for CPPC specific tables (i.e. _CPC) and extract CPU specific performance capabilities.
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Reviewed-by: Al Stone <al.stone@linaro.org>
---
 drivers/acpi/processor_driver.c | 4 ++++
 include/acpi/processor.h        | 9 +++++++++
 2 files changed, 13 insertions(+)

diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index 16d44ad..ac3dd51 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -246,6 +246,10 @@ static int __acpi_processor_start(struct acpi_device *device)
 	if (pr->flags.need_hotplug_init)
 		return 0;

+	result = acpi_cppc_processor_probe(pr);
+	if (result)
+		return -ENODEV;
+
 	if (!cpuidle_get_driver() || cpuidle_get_driver() == &acpi_idle_driver)
 		acpi_processor_power_init(pr);

diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index 2c4e7a9..9b3977f 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -314,6 +314,15 @@ phys_cpuid_t acpi_get_phys_id(acpi_handle, int type, u32 acpi_id);
 int acpi_map_cpuid(phys_cpuid_t phys_id, u32 acpi_id);
 int acpi_get_cpuid(acpi_handle, int type, u32 acpi_id);

+#ifdef CONFIG_ACPI_CPPC_LIB
+extern int acpi_cppc_processor_probe(struct acpi_processor *pr);
+#else
+static inline int acpi_cppc_processor_probe(struct acpi_processor *pr)
+{
+	return 0;
+}
+#endif	/* CONFIG_ACPI_CPPC_LIB */
+
 /* in processor_pdc.c */
 void acpi_processor_set_pdc(acpi_handle handle);
PCC is made selectable only by clients which use it, e.g. CPPC. Default it to disabled so that it is not included accidentally on platforms which don't use it.
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Reviewed-by: Al Stone <al.stone@linaro.org>
---
 drivers/mailbox/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/mailbox/Kconfig b/drivers/mailbox/Kconfig
index e269f08..bbec500 100644
--- a/drivers/mailbox/Kconfig
+++ b/drivers/mailbox/Kconfig
@@ -46,6 +46,7 @@ config OMAP_MBOX_KFIFO_SIZE
 config PCC
 	bool "Platform Communication Channel Driver"
 	depends on ACPI
+	default n
 	help
 	  ACPI 5.0+ spec defines a generic mode of communication
 	  between the OS and a platform such as the BMC. This medium
Now that the ACPI processor driver has been decoupled from the C states and P states functionality, make it selectable on ARM64 so that it can be used by others e.g. CPPC.
The C states and P states code is selected only on X86 or IA64 until the relevant support is added on ARM64.
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
---
 drivers/acpi/Kconfig | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index c6ec903..9787172 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -213,9 +213,9 @@ config ACPI_CPPC_LIB

 config ACPI_PROCESSOR
 	tristate "Processor"
-	depends on X86 || IA64
-	select ACPI_PROCESSOR_IDLE
-	select ACPI_CPU_FREQ_PSS
+	depends on X86 || IA64 || ARM64
+	select ACPI_PROCESSOR_IDLE if X86 || IA64
+	select ACPI_CPU_FREQ_PSS if X86 || IA64
 	default y
 	help
 	  This driver adds support for the ACPI Processor package. It is required
On Wednesday, August 05, 2015 09:40:23 AM Ashwin Chaugule wrote:
CPPC:
CPPC (Collaborative Processor Performance Control) is a new way to control CPU performance using an abstract continuous scale, as opposed to a discretized P-state scale which is tied to CPU frequency only. It is defined in the ACPI 5.0+ spec. In brief, the basic operation involves:
- OS makes a CPU performance request. (Can provide min and max tolerable bounds)
- Platform (such as BMC) is free to optimize request within requested bounds depending on power/thermal budgets etc.
- Platform conveys its decision back to OS
The communication between OS and platform occurs through another medium called the Platform Communication Channel (PCC). This is a generic mailbox-like mechanism which includes doorbell semantics to indicate register updates. See drivers/mailbox/pcc.c
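The request/decision exchange described above can be sketched roughly as follows. This is only an illustration under stated assumptions: struct perf_request, platform_decide() and the budget_cap parameter are hypothetical names invented for this example. They are not taken from the patchset or from the ACPI spec, and real firmware is free to apply any policy within the requested bounds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical request layout; field names are illustrative only. */
struct perf_request {
	uint32_t min_perf;     /* lowest performance the OS will tolerate */
	uint32_t max_perf;     /* highest performance the OS will tolerate */
	uint32_t desired_perf; /* level the OS would like to run at */
};

/* Clamp v into the inclusive range [lo, hi]. */
static uint32_t clamp_u32(uint32_t v, uint32_t lo, uint32_t hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Hypothetical platform-side policy: grant the desired level unless a
 * power/thermal budget caps it, but never leave the OS-supplied bounds.
 * The return value is the decision conveyed back to the OS.
 */
static uint32_t platform_decide(const struct perf_request *req,
				uint32_t budget_cap)
{
	uint32_t target = req->desired_perf;

	assert(req->min_perf <= req->max_perf);

	if (target > budget_cap)
		target = budget_cap;

	return clamp_u32(target, req->min_perf, req->max_perf);
}
```

With bounds of 20 and 80 and a desired level of 70, a generous budget grants 70, a budget of 50 grants 50, and a budget below the minimum still yields 20, because the platform stays within the bounds the OS supplied.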
This patchset introduces a CPPC based CPUFreq driver that works with existing governors such as ondemand. The CPPC table parsing and the CPPC communication semantics are abstracted into separate files to allow future CPPC based drivers to implement their own governors if required.
Initial patchsets included an adaptation of the PID governor from intel_pstate.c. However, recent experiments led to extensive modifications of the algorithm used to calculate CPU busyness. Until it is verified that these changes are worthwhile, the existing governors should provide a good enough starting point for ARM64 servers.
Finer details about the PCC and CPPC spec are available in the latest ACPI 5.1 specification.[2]
Testing:
This was tested on an SBSA compatible ARMv8 server with CPPCv2 firmware running on a remote processor. I verified that each CPU's performance limits were detected and that new performance requests were made by the ondemand governor proportional to the load on each CPU. I also verified that using the acpi_processor driver correctly maps the physical CPU ids to logical CPU ids, which helps in picking up the proper _CPC details from a processor object, in the case where CPU physical ids may not be contiguous.
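To illustrate why the physical-to-logical mapping matters when physical ids are not contiguous, here is a minimal sketch. The table contents and the phys_to_logical() helper are made up for this example. It is not the kernel's acpi_map_cpuid() implementation, only the general idea of looking up a logical index by physical id rather than assuming the two are equal.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical table: the array index is the logical CPU number and the
 * value is the physical id reported by firmware. The ids are invented
 * and deliberately non-contiguous.
 */
static const uint64_t phys_ids[] = { 0x000, 0x001, 0x100, 0x101 };

/* Return the logical CPU number for phys_id, or -1 if it is unknown. */
static int phys_to_logical(uint64_t phys_id)
{
	size_t i;

	for (i = 0; i < sizeof(phys_ids) / sizeof(phys_ids[0]); i++) {
		if (phys_ids[i] == phys_id)
			return (int)i;
	}
	return -1;
}
```

In this sketch, physical id 0x100 maps to logical CPU 2, so per-CPU data such as _CPC details would be stored under index 2 rather than under the raw physical id.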
Changes since V7:
- Simplified new kconfig options for PSS and idle.
- Separated patch to enable acpi processor on ARM64.
- Removed redundant kconfig cross deps on PCC.
- Decoupled processor_perflib from new PSS kconfig option.
Changes since V6:
- Separated PSS and CST from ACPI processor driver in two patches.
- Made new Kconfig symbols auto selectable from Arch Kconfigs.
Changes since V5:
- Checkpatch cleanups.
- Change pss_init to pss_perf_init. Rec by Srinivas Pandruvada.
- Explicit comment explaining why postcore_initcall to pcc mailbox.
- Fold acpi_processor_syscore_init/exit into CONFIG_ACPI_CST.
- Added patch with dummy functions used by ACPI_HOTPLUG_CPU.
Changes since V4:
- Misc cleanups. Addressed feedback from Rafael.
- Made acpi_processor.c independent of C-states, P-states and others.
- Per CPU scanning for _CPC is now made from acpi_processor.c
- Added new Kconfig options for legacy C states and P states to enable future support for newer alternatives as defined in the ACPI spec 6.0.
Changes since V3:
- Split CPPC backend methods into separate files.
- Add frontend driver which plugs into existing CPUfreq governors.
- Simplify PCC driver by moving communication space mapping and read/write into client drivers.
Changes since V2:
- Select driver if !X86, since intel_pstate will use HWP extensions instead.
- Added more comments.
- Added Freq domain awareness and PSD parsing.
Changes since V1:
- Create a new driver based on Dirk's suggestion.
- Fold in CPPC backend hooks into main driver.
Changes since V0: [1]
- Split intel_pstate.c into a generic PID governor and platform specific backend.
- Add CPPC accessors as PID backend.
[1] - http://lwn.net/Articles/608715/
[2] - http://www.uefi.org/sites/default/files/resources/ACPI_5_1release.pdf
[3] - https://patches.linaro.org/40705/
Ashwin Chaugule (9):
  PCC: Initialize PCC Mailbox earlier at boot
  ACPI: Split out ACPI PSS from ACPI Processor driver
  ACPI: Decouple ACPI idle and ACPI processor drivers
  ACPI: Introduce CPU performance controls using CPPC
  CPPC: Add a CPUFreq driver for use with CPPC
  ACPI: Add weak routines for ACPI CPU Hotplug
  CPPC: Probe for CPPC tables for each ACPI Processor object
  PCC: Disable compilation by default
  ACPI: Allow selection of the ACPI processor driver for ARM64
I've queued up [1-3/9] for 4.3, but I still have a couple of questions/comments regarding [4/9] and the rest of the series (I'll respond to the patch messages with those).
Thanks!
On 25 August 2015 at 20:24, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
On Wednesday, August 05, 2015 09:40:23 AM Ashwin Chaugule wrote:
[...]
I've queued up [1-3/9] for 4.3, but I still have a couple of questions/comments regarding [4/9] and the rest of the series (I'll respond to the patch messages with those).
Thanks! Would you mind taking [8/9] too? It just defaults PCC to disabled.
Cheers, Ashwin.
On Tuesday, August 25, 2015 08:03:06 PM Ashwin Chaugule wrote:
On 25 August 2015 at 20:24, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
On Wednesday, August 05, 2015 09:40:23 AM Ashwin Chaugule wrote:
[...]
I've queued up [1-3/9] for 4.3, but I still have a couple of questions/comments regarding [4/9] and the rest of the series (I'll respond to the patch messages with those).
Thanks! Would you mind taking [8/9] too? It just defaults PCC to disabled.
OK, I'll do that.
Thanks, Rafael
On 25 August 2015 at 20:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
On Tuesday, August 25, 2015 08:03:06 PM Ashwin Chaugule wrote:
On 25 August 2015 at 20:24, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
On Wednesday, August 05, 2015 09:40:23 AM Ashwin Chaugule wrote:
[...]
I've queued up [1-3/9] for 4.3, but I still have a couple of questions/comments regarding [4/9] and the rest of the series (I'll respond to the patch messages with those).
Thanks! Would you mind taking [8/9] too? It just defaults PCC to disabled.
OK, I'll do that.
Very much appreciated!