This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
On platforms not capable of supporting FIQ, the signal to generate a backtrace falls back to using IRQ for propagation instead (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that do not (vexpress-a9 and Qualcomm Snapdragon 600).
Changes since v7:
* Fixed boot regression on vexpress-a9 (reported by Russell King).
* Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
* Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
Changes since v6:
* Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested the offending patch in isolation using the defconfig identified by the autobuilder.
Changes since v5:
* Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
* Modified usr_entry to optionally avoid calling into the trace code and used this in the FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
* Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
* Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
Changes since v4:
* Rebased on 3.17-rc4.
* Removed a spurious line from the final "glue it together" patch that broke the build.
Changes since v3:
* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
* Really fix bad pt_regs pointer generation in __fiq_abt.
* Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King).
* Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
Changes since v2:
* Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
Changes since v1:
* Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre).
* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
* Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
* This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
* Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
* Add arm64 version of fiq.h (review of Russell King).
* Removed conditional branching and code from irq-gic.c; this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King).
Daniel Thompson (4):
  irqchip: gic: Make gic_raise_softirq() FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  ARM: add basic support for on-demand backtrace of other CPUs
  arm: smp: Handle ipi_cpu_backtrace() using FIQ (if available)

 arch/arm/include/asm/irq.h      |   5 ++
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  64 +++++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 drivers/irqchip/irq-gic.c       | 171 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 6 files changed, 246 insertions(+), 13 deletions(-)
--
1.9.3
Currently calling printk() from a FIQ can result in deadlock on irq_controller_lock within gic_raise_softirq(). This occurs because printk(), which is otherwise structured to survive calls from FIQ/NMI, calls this function to raise an IPI when it needs to wake_up_klogd().
This patch fixes the problem by introducing an additional rwlock and using that to prevent softirqs being raised whilst the b.L switcher is updating the cpu map.
Other parts of the code are not updated to use the new fiq_safe_cpu_map_lock because other users of gic_cpu_map either rely on external locking or upon irq_controller_lock. Both locks are held by the b.L switcher code.
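In outline the locking ends up as below (a sketch only, with the bodies elided; the names match the full diff that follows):

static DEFINE_RWLOCK(fiq_safe_cpu_map_lock);

/*
 * Runs in IRQ *and* FIQ/NMI context. read_lock() may nest on the same
 * CPU, so a FIQ arriving while the interrupted context already holds
 * the read lock still makes progress; the old raw_spin_lock() would
 * spin on itself forever here.
 */
static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
{
	unsigned long flags;

	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
	/* ... map logical CPUs and write GIC_DIST_SOFTINT ... */
	read_unlock_irqrestore(&fiq_safe_cpu_map_lock, flags);
}

/*
 * b.L switcher path: called with IRQs and FIQs locally disabled, so
 * taking the write lock cannot deadlock against a local FIQ.
 */
void gic_migrate_target(unsigned int new_cpu_id)
{
	write_lock(&fiq_safe_cpu_map_lock);
	/* ... update gic_cpu_map[] ... */
	write_unlock(&fiq_safe_cpu_map_lock);
}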
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/irqchip/irq-gic.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 38493ff..0db62a6 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,13 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock may be locked for reading by FIQ handlers. Thus although
+ * read locking may be used liberally, write locking must take place
+ * only when local FIQ handling is disabled.
+ */
+static DEFINE_RWLOCK(fiq_safe_cpu_map_lock);
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +631,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -639,7 +646,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	read_unlock_irqrestore(&fiq_safe_cpu_map_lock, flags);
 }
 #endif
 
@@ -687,7 +694,7 @@ int gic_get_cpu_id(unsigned int cpu)
  * Migrate all peripheral interrupts with a target matching the current CPU
  * to the interface corresponding to @new_cpu_id.  The CPU interface mapping
  * is also updated.  Targets to other CPU interfaces are unchanged.
- * This must be called with IRQs locally disabled.
+ * This must be called with IRQ and FIQ locally disabled.
  */
 void gic_migrate_target(unsigned int new_cpu_id)
 {
@@ -709,6 +716,7 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	ror_val = (cur_cpu_id - new_cpu_id) & 31;
 
 	raw_spin_lock(&irq_controller_lock);
+	write_lock(&fiq_safe_cpu_map_lock);
 
 	/* Update the target interface for this logical CPU */
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
@@ -728,6 +736,7 @@ void gic_migrate_target(unsigned int new_cpu_id)
 		}
 	}
 
+	write_unlock(&fiq_safe_cpu_map_lock);
 	raw_spin_unlock(&irq_controller_lock);
 
 	/*
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
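Concretely, the contract with architecture code looks like this (a sketch: the handler hook is part of this patch's diff below, while the non-zero mask value shown is the one a later patch in this series picks):

/* <asm/smp.h>: bit N set means IPI number N is raised as group 0,
 * i.e. delivered as FIQ; zero means everything stays on IRQ. */
#define SMP_IPI_FIQ_MASK 0x0100

/* Default FIQ handler (arch/arm/kernel/traps.c): ack the FIQ IPIs. */
asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
{
	nmi_enter();
	gic_handle_fiq_ipi();	/* ack+eoi any pending group 0 IPI */
	nmi_exit();
}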
All GIC hardware except GICv1 without TrustZone support provides a means to group exceptions into group 0 and group 1, but this hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups is not deployed and all IPIs will be raised via IRQ.
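The "not deployed" check is just a write-then-read-back probe of the EnableGrp1 bit; RAZ/WI behaviour makes this safe everywhere (a sketch of the trick the diff below uses):

/*
 * EnableGrp1 is RAZ/WI both when the distributor registers are
 * secure-only and on GICv1 parts that never implemented the bit, so
 * reading it back after setting it tells us whether grouping can be
 * driven from the kernel.
 */
writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
if (readl_relaxed(base + GIC_DIST_CTRL) & GICD_ENABLE_GRP1) {
	/* grouping usable: IRQs to group 1, FIQ IPIs stay in group 0 */
} else {
	/* bit read back as zero: leave grouping alone, IPIs use IRQ */
}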
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 156 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 159 insertions(+), 10 deletions(-)
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 0c8b108..4dc45b3 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>
 
 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 
 	nmi_enter();
 
-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif
 
 	nmi_exit();
 
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 0db62a6..6bc08d6 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,8 +39,10 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>
 
 #include <asm/cputype.h>
+#include <asm/fiq.h>
 #include <asm/irq.h>
 #include <asm/exception.h>
 #include <asm/smp_plat.h>
@@ -48,6 +50,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"
 
+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -331,6 +337,93 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake	= gic_set_wake,
 };
 
+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(void __iomem *base, unsigned int hwirq,
+			      int group)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have the
+	 * EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+/*
+ * Test which group an interrupt belongs to.
+ *
+ * Returns 0 if the controller does not support grouping.
+ */
+static int gic_get_group_irq(void __iomem *base, unsigned int hwirq)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_val;
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+
+	return (grp_val >> (hwirq % 32)) & 1;
+}
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -362,15 +455,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;
 
 	/*
-	 * Preserve bypass disable bits to be written back later
-	 */
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;
 
-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }
 
@@ -394,7 +496,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
 	gic_dist_config(base, gic_irqs, NULL);
 
-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }
 
 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -403,6 +521,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;
 
 	/*
 	 * Get what the GIC says our CPU mask is.
@@ -421,6 +540,19 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 
 	gic_cpu_config(dist_base, NULL);
 
+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(dist_base, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -510,7 +642,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_save(unsigned int gic_nr)
@@ -630,6 +763,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long flags, map = 0;
+	unsigned long softint;
 
 	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
 
@@ -644,7 +778,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	dmb(ishst);
 
 	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	softint = map << 16 | irq;
+	if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
+		softint |= 0x8000;
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	read_unlock_irqrestore(&fiq_safe_cpu_map_lock, flags);
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 13eed92..a906fb7 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc
 
 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023
@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20
 
 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff
@@ -117,5 +122,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
Add basic infrastructure for triggering a backtrace of other CPUs via an IPI, preferably at FIQ level. It is intended that this shall be used for cases where we have detected that something has already failed in the kernel.
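For context, the main in-tree consumer of this hook is the magic sysrq 'l' handler, which does roughly the following (paraphrased from drivers/tty/sysrq.c; not part of this patch):

static void sysrq_handle_showallcpus(int key)
{
	/*
	 * trigger_all_cpu_backtrace() expands to
	 * arch_trigger_all_cpu_backtrace(true) once the architecture
	 * provides the hook, and returns false when it does not.
	 */
	if (!trigger_all_cpu_backtrace())
		show_stack(NULL, NULL);	/* fall back: current CPU only */
}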
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm/include/asm/irq.h |  5 ++++
 arch/arm/kernel/smp.c      | 62 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 53c15de..be1d07d 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 #endif
 
+#ifdef CONFIG_SMP
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+#endif
+
 #endif
 
 #endif
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 13396d3..14c594a 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -72,8 +72,12 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_IRQ_WORK,
 	IPI_COMPLETION,
+	IPI_CPU_BACKTRACE,
 };
 
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
 static DECLARE_COMPLETION(cpu_running);
 
 static struct smp_operations smp_ops;
@@ -535,6 +539,21 @@ static void ipi_cpu_stop(unsigned int cpu)
 		cpu_relax();
 }
 
+static void ipi_cpu_backtrace(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+		arch_spin_lock(&lock);
+		printk(KERN_WARNING "FIQ backtrace for cpu %d\n", cpu);
+		show_regs(regs);
+		arch_spin_unlock(&lock);
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+}
+
 static DEFINE_PER_CPU(struct completion *, cpu_completion);
 
 int register_ipi_completion(struct completion *completion, int cpu)
@@ -614,6 +633,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		irq_exit();
 		break;
 
+	case IPI_CPU_BACKTRACE:
+		irq_enter();
+		ipi_cpu_backtrace(regs);
+		irq_exit();
+		break;
+
 	default:
 		printk(KERN_CRIT "CPU%u: Unknown IPI message 0x%x\n",
 		       cpu, ipinr);
@@ -708,3 +733,40 @@ static int __init register_cpufreq_notifier(void)
 core_initcall(register_cpufreq_notifier);
 
 #endif
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	static unsigned long backtrace_flag;
+	int i, cpu = get_cpu();
+
+	if (test_and_set_bit(0, &backtrace_flag)) {
+		/*
+		 * If there is already a trigger_all_cpu_backtrace() in progress
+		 * (backtrace_flag == 1), don't output double cpu dump infos.
+		 */
+		put_cpu();
+		return;
+	}
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+	if (!include_self)
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_info("Sending FIQ to %s CPUs:\n",
+			(include_self ? "all" : "other"));
+		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
+	}
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+
+		mdelay(1);
+	}
+
+	clear_bit(0, &backtrace_flag);
+	smp_mb__after_atomic();
+	put_cpu();
+}
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM but these are currently independent of each other.
This patch plumbs these features together, making it possible, on platforms that support it, to trigger a backtrace using FIQ.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm/include/asm/smp.h | 3 +++
 arch/arm/kernel/smp.c      | 4 +++-
 arch/arm/kernel/traps.c    | 3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 18f5a55..b076584 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -18,6 +18,8 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif
 
+#define SMP_IPI_FIQ_MASK 0x0100
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)
 
 struct seq_file;
@@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
 
+extern void ipi_cpu_backtrace(struct pt_regs *regs);
 extern int register_ipi_completion(struct completion *completion, int cpu);
 
 struct smp_operations {
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 14c594a..e923843 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -539,7 +539,7 @@ static void ipi_cpu_stop(unsigned int cpu)
 		cpu_relax();
 }
 
-static void ipi_cpu_backtrace(struct pt_regs *regs)
+void ipi_cpu_backtrace(struct pt_regs *regs)
 {
 	int cpu = smp_processor_id();
 
@@ -580,6 +580,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 	unsigned int cpu = smp_processor_id();
 	struct pt_regs *old_regs = set_irq_regs(regs);
 
+	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));
+
 	if ((unsigned)ipinr < NR_IPI) {
 		trace_ipi_entry(ipi_types[ipinr]);
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 4dc45b3..9eb05be 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#ifdef CONFIG_SMP
+	ipi_cpu_backtrace(regs);
+#endif
 
 	nmi_exit();
Hi Thomas, Hi Jason,
[Today I was *planning* to ask if patches 1 & 2 are OK for the irqchip tree. However, just to be on the safe side, I ran some build tests and they picked up something I overlooked last time. So instead of a poke I've put out a new patchset. Just to be sure, I also ran build tests on patches 1 & 2 on their own; I couldn't find any mid-series build regressions, so taking just the first two patches should present no problems.]
This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
On platforms not capable of supporting FIQ, the signal to generate a backtrace falls back to using IRQ for propagation instead (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that do not (vexpress-a9 and Qualcomm Snapdragon 600).
v8:
* Fixed build on arm64 caused by a spurious include file in irq-gic.c.
v7-2 (accidentally released twice with same number):
* Fixed boot regression on vexpress-a9 (reported by Russell King).
* Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
* Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
v7:
* Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested the offending patch in isolation using the defconfig identified by the autobuilder.
v6:
* Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
* Modified usr_entry to optionally avoid calling into the trace code and used this in the FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
* Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
* Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
v5:
* Rebased on 3.17-rc4.
* Removed a spurious line from the final "glue it together" patch that broke the build.
v4:
* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
* Really fix bad pt_regs pointer generation in __fiq_abt.
* Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King).
* Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
v3:
* Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
v2:
* Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre).
* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
* Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
* This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
* Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
* Add arm64 version of fiq.h (review of Russell King).
* Removed conditional branching and code from irq-gic.c; this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King).
Daniel Thompson (4):
  irqchip: gic: Make gic_raise_softirq() FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  ARM: add basic support for on-demand backtrace of other CPUs
  arm: smp: Handle ipi_cpu_backtrace() using FIQ (if available)

 arch/arm/include/asm/irq.h      |   5 ++
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  64 +++++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 drivers/irqchip/irq-gic.c       | 170 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 6 files changed, 245 insertions(+), 13 deletions(-)
--
1.9.3
Currently calling printk() from a FIQ can result in deadlock on irq_controller_lock within gic_raise_softirq(). This occurs because printk(), which is otherwise structured to survive calls from FIQ/NMI, calls this function to raise an IPI when it needs to wake_up_klogd().
This patch fixes the problem by introducing an additional rwlock and using that to prevent softirqs being raised whilst the b.L switcher is updating the cpu map.
Other parts of the code are not updated to use the new fiq_safe_cpu_map_lock because other users of gic_cpu_map either rely on external locking or upon irq_controller_lock. Both locks are held by the b.L switcher code.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/irqchip/irq-gic.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 38493ff28fa5..0db62a6f1ee3 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,13 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock may be locked for reading by FIQ handlers. Thus although
+ * read locking may be used liberally, write locking must take place
+ * only when local FIQ handling is disabled.
+ */
+static DEFINE_RWLOCK(fiq_safe_cpu_map_lock);
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +631,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
 
 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -639,7 +646,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	read_unlock_irqrestore(&fiq_safe_cpu_map_lock, flags);
 }
 #endif
 
@@ -687,7 +694,7 @@ int gic_get_cpu_id(unsigned int cpu)
  * Migrate all peripheral interrupts with a target matching the current CPU
  * to the interface corresponding to @new_cpu_id.  The CPU interface mapping
  * is also updated.  Targets to other CPU interfaces are unchanged.
- * This must be called with IRQs locally disabled.
+ * This must be called with IRQ and FIQ locally disabled.
  */
 void gic_migrate_target(unsigned int new_cpu_id)
 {
@@ -709,6 +716,7 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	ror_val = (cur_cpu_id - new_cpu_id) & 31;
 
 	raw_spin_lock(&irq_controller_lock);
+	write_lock(&fiq_safe_cpu_map_lock);
 
 	/* Update the target interface for this logical CPU */
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
@@ -728,6 +736,7 @@ void gic_migrate_target(unsigned int new_cpu_id)
 		}
 	}
 
+	write_unlock(&fiq_safe_cpu_map_lock);
 	raw_spin_unlock(&irq_controller_lock);
 
 	/*
On Fri, 14 Nov 2014, Daniel Thompson wrote:
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 38493ff28fa5..0db62a6f1ee3 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,13 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock may be locked for reading by FIQ handlers. Thus although
+ * read locking may be used liberally, write locking must take place
+ * only when local FIQ handling is disabled.
+ */
+static DEFINE_RWLOCK(fiq_safe_cpu_map_lock);
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +631,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
Just for the record:
You might have noticed that you replace a raw lock with a non raw one. That's not an issue on mainline, but that pretty much renders that code broken for RT.
Surely nothing I worry too much about given the current state of RT.
Thanks,
tglx
On 24/11/14 18:20, Thomas Gleixner wrote:
On Fri, 14 Nov 2014, Daniel Thompson wrote:
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 38493ff28fa5..0db62a6f1ee3 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,13 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock may be locked for reading by FIQ handlers. Thus although
+ * read locking may be used liberally, write locking must take place
+ * only when local FIQ handling is disabled.
+ */
+static DEFINE_RWLOCK(fiq_safe_cpu_map_lock);
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +631,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
Just for the record:
You might have noticed that you replace a raw lock with a non raw one. That's not an issue on mainline, but that pretty much renders that code broken for RT.
Indeed. For that reason I've been pretty anxious to hear your views on this one.
Older versions of this patch did retain the raw lock but the code ends up looking a bit weird and resulted in negative comments during review:
	if (in_nmi())
		raw_spin_lock(&fiq_exclusive_cpu_map_lock);
	else
		raw_spin_lock_irqsave(&irq_controller_lock, flags);
The above form relies for correctness on the fact that the b.L switcher code can take both locks and already runs with FIQ disabled.
Surely nothing I worry too much about given the current state of RT.
Hobby or not, I don't want to make your work here any harder. I could go back to the old form.
Alternatively I could provide a patch to go in -rt that converts the rw locks to spin locks but that just sounds like a maintenance hassle for you.
Daniel.
On Mon, 24 Nov 2014, Thomas Gleixner wrote:
On Fri, 14 Nov 2014, Daniel Thompson wrote:
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 38493ff28fa5..0db62a6f1ee3 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,13 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock may be locked for reading by FIQ handlers. Thus although
+ * read locking may be used liberally, write locking must take place
+ * only when local FIQ handling is disabled.
+ */
+static DEFINE_RWLOCK(fiq_safe_cpu_map_lock);
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +631,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
Just for the record:
You might have noticed that you replace a raw lock with a non raw one. That's not an issue on mainline, but that pretty much renders that code broken for RT. Surely nothing I worry too much about given the current state of RT.
And having a second thought here. Looking at the protection scope independent of the spin vs. rw lock:

gic_raise_softirq()

	lock();

	/* Does not need any protection */
	for_each_cpu(cpu, mask)
		map |= gic_cpu_map[cpu];

	/*
	 * Can be outside the lock region as well, as it makes sure
	 * that previous writes (usually the IPI data) are visible
	 * before the write to the SOFTINT register.
	 */
	dmb(ishst);

	/* Why needs this protection? */
	write(map, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

	unlock();

gic_migrate_target()

	....
	lock();

	/* Migrate all peripheral interrupts */

	unlock();
So what's the point of that protection?
gic_raise_softirq() is used to send IPIs, which are PPIs on the target CPUs so they are not affected from the migration of the peripheral interrupts at all.
The write to the SOFTINT register in gic_migrate_target() is not inside the lock region. So what's serialized by the lock in gic_raise_softirq() at all?
Either I'm missing something really important here or this locking exercise in gic_raise_softirq() and therefore the rwlock conversion is completely pointless.
Thanks,
tglx
On 24/11/14 18:48, Thomas Gleixner wrote:
On Mon, 24 Nov 2014, Thomas Gleixner wrote:
On Fri, 14 Nov 2014, Daniel Thompson wrote:
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 38493ff28fa5..0db62a6f1ee3 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,13 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock may be locked for reading by FIQ handlers. Thus although
+ * read locking may be used liberally, write locking must take place
+ * only when local FIQ handling is disabled.
+ */
+static DEFINE_RWLOCK(fiq_safe_cpu_map_lock);
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +631,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
Just for the record:
You might have noticed that you replace a raw lock with a non raw one. That's not an issue on mainline, but that pretty much renders that code broken for RT. Surely nothing I worry too much about given the current state of RT.
And having a second thought here. Looking at the protection scope independent of the spin vs. rw lock:

gic_raise_softirq()

	lock();

	/* Does not need any protection */
	for_each_cpu(cpu, mask)
		map |= gic_cpu_map[cpu];

	/*
	 * Can be outside the lock region as well, as it makes sure
	 * that previous writes (usually the IPI data) are visible
	 * before the write to the SOFTINT register.
	 */
	dmb(ishst);

	/* Why needs this protection? */
	write(map, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
If gic_cpu_map changed value during the execution of this function then this could raise an IPI on the wrong CPU.
Most of the rest of this mail is explaining how this could happen.
unlock();
gic_migrate_target()
.... lock();
Also: Value of gic_cpu_map is updated.
/* Migrate all peripheral interrupts */
unlock();
Also: /* Migrate all IPIs pending on the old core */
So what's the point of that protection?
gic_raise_softirq() is used to send IPIs, which are PPIs on the target CPUs so they are not affected from the migration of the peripheral interrupts at all.
The write to the SOFTINT register in gic_migrate_target() is not inside the lock region. So what's serialized by the lock in gic_raise_softirq() at all?
At the point that gic_migrate_target() takes the lock it knows that no further IPIs can be submitted.
Once the gic_cpu_map is updated we can permit new IPIs to be submitted because these will be routed to the correct core.
As a result we don't actually need to hold the lock to migrate the pending IPIs since we know that no new IPIs can possibly be sent to the wrong core.
Either I'm missing something really important here or this locking exercise in gic_raise_softirq() and therefore the rwlock conversion is completely pointless.
I did want to remove the lock too. However when I reviewed this code I concluded the lock was still required. Without it I think it is possible for gic_raise_softirq() to raise an IPI on the old core *after* the code to migrate pending IPIs has been run.
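Spelling that race out as an interleaving (a sketch):

    CPU0: gic_raise_softirq()             CPU1: gic_migrate_target()
	map |= gic_cpu_map[cpu];            /* reads the *old* map */
					    gic_cpu_map[cpu] = 1 << new_cpu_id;
					    /* ...migrate pending IPIs... */
	writel_relaxed(map << 16 | irq,
		       ...GIC_DIST_SOFTINT);  <-- IPI raised on the old core
					          after the migration has run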
Daniel.
On Mon, 24 Nov 2014, Daniel Thompson wrote:
I did want to remove the lock too. However when I reviewed this code I concluded the lock was still required. Without it I think it is possible for gic_raise_softirq() to raise an IPI on the old core *after* the code to migrate pending IPIs has been run.
And I bet it took you quite some time to figure that out from that overly documented abuse of irq_controller_lock. See my other reply.
Thanks,
tglx
On 24/11/14 20:41, Thomas Gleixner wrote:
On Mon, 24 Nov 2014, Daniel Thompson wrote:
I did want to remove the lock too. However when I reviewed this code I concluded the lock was still required. Without it I think it is possible for gic_raise_softirq() to raise an IPI on the old core *after* the code to migrate pending IPIs has been run.
And I bet it took you quite some time to figure that out from that overly documented abuse of irq_controller_lock. See my other reply.
Yes. It did take quite some time, although compared to some of the other FIQ/NMI-safety reviews I've been doing recently it could be worse. ;-)
On Mon, 24 Nov 2014, Thomas Gleixner wrote:
On Mon, 24 Nov 2014, Thomas Gleixner wrote:
On Fri, 14 Nov 2014, Daniel Thompson wrote:
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 38493ff28fa5..0db62a6f1ee3 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,13 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock may be locked for reading by FIQ handlers. Thus although
+ * read locking may be used liberally, write locking must take place
+ * only when local FIQ handling is disabled.
+ */
+static DEFINE_RWLOCK(fiq_safe_cpu_map_lock);
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +631,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
Just for the record:
You might have noticed that you replace a raw lock with a non raw one. That's not an issue on mainline, but that pretty much renders that code broken for RT. Surely nothing I worry too much about given the current state of RT.
And having a second thought here. Looking at the protection scope independent of the spin vs. rw lock:

gic_raise_softirq()

	lock();

	/* Does not need any protection */
	for_each_cpu(cpu, mask)
		map |= gic_cpu_map[cpu];

	/*
	 * Can be outside the lock region as well, as it makes sure
	 * that previous writes (usually the IPI data) are visible
	 * before the write to the SOFTINT register.
	 */
	dmb(ishst);

	/* Why needs this protection? */
	write(map, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

	unlock();

gic_migrate_target()

	....
	lock();

	/* Migrate all peripheral interrupts */

	unlock();
So what's the point of that protection?
gic_raise_softirq() is used to send IPIs, which are PPIs on the target CPUs so they are not affected from the migration of the peripheral interrupts at all.
The write to the SOFTINT register in gic_migrate_target() is not inside the lock region. So what's serialized by the lock in gic_raise_softirq() at all?
Either I'm missing something really important here or this locking exercise in gic_raise_softirq() and therefore the rwlock conversion is completely pointless.
Thanks to Marc I've now figured out what I was missing. That stuff is part of the bl switcher horror. Well documented, as all of that is ...
So the lock protects against an IPI being sent to the current cpu while the target map is redirected and the pending state of the current cpu is migrated to another cpu.
It's not your fault that the initial authors of that just abused irq_controller_lock for that purpose instead of introducing a separate lock with a clear description of the protection scope in the first place.
Now you came up with the rw lock to handle the following FIQ related case:

    gic_raise_softirq()
	lock(x);
	    ---> FIQ
		 handle_fiq()
		     gic_raise_softirq()
			 lock(x);	<-- Live lock
Now the rwlock lets you avoid that, and it only lets you avoid that because rwlocks are not fair.
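In the same notation (a sketch): the nested read succeeds precisely because an unfair rwlock admits new readers even while a writer is already spinning for the lock:

    gic_raise_softirq()
	read_lock(x);
			<--- write_lock(x) starts spinning on another CPU
	    ---> FIQ
		 handle_fiq()
		     gic_raise_softirq()
			 read_lock(x);	<-- succeeds, because readers are
					    not queued behind the writer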
So while I cannot come up with a brilliant replacement, it would be really helpful documentation-wise if you could do the following:
1) Create a patch which introduces irq_migration_lock as a raw spinlock and replaces the usage of irq_controller_lock in gic_raise_softirq() and gic_migrate_target() along with a proper explanation in the code and the changelog of course.
2) Make the rwlock conversion on top of that with a proper documentation in the code of the only relevant reason (See above).
The protection scope which prevents IPIs being sent while switching over is still the same and not affected.
That's not the first time that I've stumbled over this bl switcher mess, which got bolted into the kernel mindlessly.
If the scope of the issue had been clear up front, I wouldn't have complained about the RT relevance of this, as it is simple to either disable FIQs for RT or just handle the above case differently.
Thanks,
tglx
On 24/11/14 20:38, Thomas Gleixner wrote:
On Mon, 24 Nov 2014, Thomas Gleixner wrote:
On Mon, 24 Nov 2014, Thomas Gleixner wrote:
On Fri, 14 Nov 2014, Daniel Thompson wrote:
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 38493ff28fa5..0db62a6f1ee3 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,13 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 
 /*
+ * This lock may be locked for reading by FIQ handlers. Thus although
+ * read locking may be used liberally, write locking must take place
+ * only when local FIQ handling is disabled.
+ */
+static DEFINE_RWLOCK(fiq_safe_cpu_map_lock);
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +631,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;
 
-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
Just for the record:
You might have noticed that you replace a raw lock with a non raw one. That's not an issue on mainline, but that pretty much renders that code broken for RT. Surely nothing I worry too much about given the current state of RT.
And having a second thought here. Looking at the protection scope independent of the spin vs. rw lock:

gic_raise_softirq()

	lock();

	/* Does not need any protection */
	for_each_cpu(cpu, mask)
		map |= gic_cpu_map[cpu];

	/*
	 * Can be outside the lock region as well, as it makes sure
	 * that previous writes (usually the IPI data) are visible
	 * before the write to the SOFTINT register.
	 */
	dmb(ishst);

	/* Why needs this protection? */
	write(map, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

	unlock();

gic_migrate_target()

	....
	lock();

	/* Migrate all peripheral interrupts */

	unlock();
So what's the point of that protection?
gic_raise_softirq() is used to send IPIs, which are PPIs on the target CPUs so they are not affected from the migration of the peripheral interrupts at all.
The write to the SOFTINT register in gic_migrate_target() is not inside the lock region. So what's serialized by the lock in gic_raise_softirq() at all?
Either I'm missing something really important here or this locking exercise in gic_raise_softirq() and therefore the rwlock conversion is completely pointless.
Thanks to Marc I've now figured out what I was missing. That stuff is part of the bl switcher horror. Well documented, as all of that is ...
So the lock protects against an IPI being sent to the current cpu while the target map is redirected and the pending state of the current cpu is migrated to another cpu.
It's not your fault that the initial authors of that just abused irq_controller_lock for that purpose instead of introducing a separate lock with a clear description of the protection scope in the first place.
Now you came up with the rw lock to handle the following FIQ related case:

    gic_raise_softirq()
	lock(x);
	    ---> FIQ
		 handle_fiq()
		     gic_raise_softirq()
			 lock(x);	<-- Live lock
Now the rwlock lets you avoid that, and it only lets you avoid that because rwlocks are not fair.
So while I cannot come up with a brilliant replacement, it would be really helpful documentation-wise if you could do the following:
1) Create a patch which introduces irq_migration_lock as a raw spinlock and replaces the usage of irq_controller_lock in gic_raise_softirq() and gic_migrate_target() along with a proper explanation in the code and the changelog of course.
Replace irq_controller_lock or augment it with a new one?
gic_raise_softirq() cannot share a single r/w lock with gic_set_affinity(), because gic_set_affinity() would have to take it for writing and that would bring the deadlock back for a badly timed FIQ.
Thus if we want calls to gic_raise_softirq() to be FIQ-safe then there must be two locks taken in gic_migrate_target().
We can eliminate irq_controller_lock but we cannot replace it with one r/w lock.
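In trace form (a sketch; gic_set_affinity() standing in for any writer under such a single-lock scheme):

    gic_set_affinity()
	write_lock(x);
	    ---> FIQ
		 handle_fiq()
		     gic_raise_softirq()
			 read_lock(x);	<-- spins forever, the interrupted
					    writer can never complete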
2) Make the rwlock conversion on top of that with a proper documentation in the code of the only relevant reason (See above).
The protection scope which prevents IPIs being sent while switching over is still the same and not affected.
That's not the first time that I've stumbled over this bl switcher mess, which got bolted into the kernel mindlessly.
If the scope of the issue had been clear up front, I wouldn't have complained about the RT relevance of this, as it is simple to either disable FIQs for RT or just handle the above case differently.
Thanks,
tglx
On Mon, 24 Nov 2014, Daniel Thompson wrote:
On 24/11/14 20:38, Thomas Gleixner wrote:
So while I cannot come up with a brilliant replacement, it would be really helpful documentation-wise if you could do the following:
1) Create a patch which introduces irq_migration_lock as a raw spinlock and replaces the usage of irq_controller_lock in gic_raise_softirq() and gic_migrate_target() along with a proper explanation in the code and the changelog of course.
Replace irq_controller_lock or augment it with a new one?
Replace it in gic_raise_softirq() and add it to gic_migrate_target() as you did with the RW lock.
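I.e. the end state in gic_migrate_target() would nest like this (a sketch only, keeping the v8 lock name for illustration):

void gic_migrate_target(unsigned int new_cpu_id)
{
	/* ... */
	raw_spin_lock(&irq_controller_lock);	/* peripheral migration */
	write_lock(&fiq_safe_cpu_map_lock);	/* fence out IPI raisers */

	gic_cpu_map[cpu] = 1 << new_cpu_id;
	/* ... retarget peripheral interrupts ... */

	write_unlock(&fiq_safe_cpu_map_lock);
	raw_spin_unlock(&irq_controller_lock);
}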
Thanks,
tglx
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware except GICv1 without TrustZone support provides a means to group exceptions into group 0 and group 1, but this hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups is not deployed and all IPIs will be raised via IRQ.
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Jon Medhurst <tixy@linaro.org>
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 155 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 158 insertions(+), 10 deletions(-)
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 0c8b10801d36..4dc45b38e56e 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>
 
 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 
 	nmi_enter();
 
-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif
 
 	nmi_exit();
 
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 0db62a6f1ee3..fe6a35f891ac 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>
 
 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -48,6 +49,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"
 
+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -331,6 +336,93 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake	= gic_set_wake,
 };
 
+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(void __iomem *base, unsigned int hwirq,
+			      int group)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have the
+	 * EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+/*
+ * Test which group an interrupt belongs to.
+ *
+ * Returns 0 if the controller does not support grouping.
+ */
+static int gic_get_group_irq(void __iomem *base, unsigned int hwirq)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_val;
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+
+	return (grp_val >> (hwirq % 32)) & 1;
+}
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -362,15 +454,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;
 
 	/*
-	 * Preserve bypass disable bits to be written back later
-	 */
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;
 
-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }
 
@@ -394,7 +495,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
 	gic_dist_config(base, gic_irqs, NULL);
 
-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }
 
 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -403,6 +520,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;
 
 	/*
 	 * Get what the GIC says our CPU mask is.
@@ -421,6 +539,19 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 
 	gic_cpu_config(dist_base, NULL);
 
+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(dist_base, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -510,7 +641,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_save(unsigned int gic_nr)
@@ -630,6 +762,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long flags, map = 0;
+	unsigned long softint;
 
 	read_lock_irqsave(&fiq_safe_cpu_map_lock, flags);
 
@@ -644,7 +777,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	dmb(ishst);
 
 	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	softint = map << 16 | irq;
+	if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
+		softint |= 0x8000;
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	read_unlock_irqrestore(&fiq_safe_cpu_map_lock, flags);
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 13eed92c7d24..a906fb7ac11f 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc
#define GICC_ENABLE 0x1 +#define GICC_ENABLE_GRP1 0x2 +#define GICC_ACK_CTL 0x4 +#define GICC_FIQ_EN 0x8 +#define GICC_COMMON_BPR 0x10 #define GICC_INT_PRI_THRESHOLD 0xf0 #define GICC_IAR_INT_ID_MASK 0x3ff #define GICC_INT_SPURIOUS 1023 @@ -44,6 +48,7 @@ #define GIC_DIST_SGI_PENDING_SET 0xf20
#define GICD_ENABLE 0x1 +#define GICD_ENABLE_GRP1 0x2 #define GICD_DISABLE 0x0 #define GICD_INT_ACTLOW_LVLTRIG 0x0 #define GICD_INT_EN_CLR_X32 0xffffffff @@ -117,5 +122,8 @@ static inline void __init register_routable_domain_ops { gic_routable_irq_domain_ops = ops; } + +void gic_handle_fiq_ipi(void); + #endif /* __ASSEMBLY */ #endif
Add basic infrastructure for triggering a backtrace of other CPUs via an IPI, preferably at FIQ level. It is intended that this shall be used for cases where we have detected that something has already failed in the kernel.
Signed-off-by: Russell King rmk+kernel@arm.linux.org.uk Signed-off-by: Daniel Thompson daniel.thompson@linaro.org --- arch/arm/include/asm/irq.h | 5 ++++ arch/arm/kernel/smp.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 67 insertions(+)
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h index 53c15dec7af6..be1d07d59ee9 100644 --- a/arch/arm/include/asm/irq.h +++ b/arch/arm/include/asm/irq.h @@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *); extern void set_handle_irq(void (*handle_irq)(struct pt_regs *)); #endif
+#ifdef CONFIG_SMP +extern void arch_trigger_all_cpu_backtrace(bool); +#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x) +#endif + #endif
#endif diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 13396d3d600e..14c594a12bef 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -72,8 +72,12 @@ enum ipi_msg_type { IPI_CPU_STOP, IPI_IRQ_WORK, IPI_COMPLETION, + IPI_CPU_BACKTRACE, };
+/* For reliability, we're prepared to waste bits here. */ +static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly; + static DECLARE_COMPLETION(cpu_running);
static struct smp_operations smp_ops; @@ -535,6 +539,21 @@ static void ipi_cpu_stop(unsigned int cpu) cpu_relax(); }
+static void ipi_cpu_backtrace(struct pt_regs *regs) +{ + int cpu = smp_processor_id(); + + if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) { + static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED; + + arch_spin_lock(&lock); + printk(KERN_WARNING "FIQ backtrace for cpu %d\n", cpu); + show_regs(regs); + arch_spin_unlock(&lock); + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); + } +} + static DEFINE_PER_CPU(struct completion *, cpu_completion);
int register_ipi_completion(struct completion *completion, int cpu) @@ -614,6 +633,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs) irq_exit(); break;
+ case IPI_CPU_BACKTRACE: + irq_enter(); + ipi_cpu_backtrace(regs); + irq_exit(); + break; + default: printk(KERN_CRIT "CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr); @@ -708,3 +733,40 @@ static int __init register_cpufreq_notifier(void) core_initcall(register_cpufreq_notifier);
#endif + +void arch_trigger_all_cpu_backtrace(bool include_self) +{ + static unsigned long backtrace_flag; + int i, cpu = get_cpu(); + + if (test_and_set_bit(0, &backtrace_flag)) { + /* + * If there is already a trigger_all_cpu_backtrace() in progress + * (backtrace_flag == 1), don't output double cpu dump infos. + */ + put_cpu(); + return; + } + + cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask); + if (!include_self) + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); + + if (!cpumask_empty(to_cpumask(backtrace_mask))) { + pr_info("Sending FIQ to %s CPUs:\n", + (include_self ? "all" : "other")); + smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE); + } + + /* Wait for up to 10 seconds for all CPUs to do the backtrace */ + for (i = 0; i < 10 * 1000; i++) { + if (cpumask_empty(to_cpumask(backtrace_mask))) + break; + + mdelay(1); + } + + clear_bit(0, &backtrace_flag); + smp_mb__after_atomic(); + put_cpu(); +}
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM but these are currently independent of each other.
This patch plumbs these features together, making it possible, on platforms that support it, to trigger a backtrace using FIQ.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org --- arch/arm/include/asm/smp.h | 3 +++ arch/arm/kernel/smp.c | 4 +++- arch/arm/kernel/traps.c | 3 +++ 3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h index 18f5a554134f..b076584ac0fa 100644 --- a/arch/arm/include/asm/smp.h +++ b/arch/arm/include/asm/smp.h @@ -18,6 +18,8 @@ # error "<asm/smp.h> included in non-SMP build" #endif
+#define SMP_IPI_FIQ_MASK 0x0100 + #define raw_smp_processor_id() (current_thread_info()->cpu)
struct seq_file; @@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu); extern void arch_send_call_function_ipi_mask(const struct cpumask *mask); extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
+extern void ipi_cpu_backtrace(struct pt_regs *regs); extern int register_ipi_completion(struct completion *completion, int cpu);
struct smp_operations { diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 14c594a12bef..e923843562d9 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -539,7 +539,7 @@ static void ipi_cpu_stop(unsigned int cpu) cpu_relax(); }
-static void ipi_cpu_backtrace(struct pt_regs *regs) +void ipi_cpu_backtrace(struct pt_regs *regs) { int cpu = smp_processor_id();
@@ -580,6 +580,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs) unsigned int cpu = smp_processor_id(); struct pt_regs *old_regs = set_irq_regs(regs);
+ BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE)); + if ((unsigned)ipinr < NR_IPI) { trace_ipi_entry(ipi_types[ipinr]); __inc_irq_stat(cpu, ipi_irqs[ipinr]); diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 4dc45b38e56e..9eb05be9526e 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs) #ifdef CONFIG_ARM_GIC gic_handle_fiq_ipi(); #endif +#ifdef CONFIG_SMP + ipi_cpu_backtrace(regs); +#endif
nmi_exit();
On 14/11/14 12:35, Daniel Thompson wrote:
Hi Thomas, Hi Jason,
[Today I was *planning* to ask if patches 1 & 2 are OK for the irqchip tree. However just to be on the safe side I ran some build tests and they picked up something I overlooked last time. So instead of a poke I've put out a new patchset. Just to be sure I also ran some build tests on just patches 1 & 2 on their own. I couldn't find any mid-series build regressions so taking just the first two patches should present no problems.]
Are there any comments on patches 1 & 2?
Daniel.
This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
On platforms not capable of supporting FIQ we fall back to using IRQ to propagate the signal to generate a backtrace (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that do not (vexpress-a9 and Qualcomm Snapdragon 600).
v8:
* Fixed build on arm64 caused by a spurious include file in irq-gic.c.
v7-2 (accidentally released twice with same number):
* Fixed boot regression on vexpress-a9 (reported by Russell King).
* Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
* Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
v7:
* Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested offending patch in isolation using defconfig identified by the autobuilder.
v6:
* Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
* Modified usr_entry to optionally avoid calling into the trace code and used this in FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
* Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
* Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
v5:
* Rebased on 3.17-rc4.
* Removed a spurious line from the final "glue it together" patch that broke the build.
v4:
* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
* Really fix bad pt_regs pointer generation in __fiq_abt.
* Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King)
* Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
v3:
* Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
v2:
* Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre)
* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
* Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
* This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
* Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
* Add arm64 version of fiq.h (review of Russell King)
* Removed conditional branching and code from irq-gic.c, this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King)
Daniel Thompson (4): irqchip: gic: Make gic_raise_softirq() FIQ-safe irqchip: gic: Introduce plumbing for IPI FIQ ARM: add basic support for on-demand backtrace of other CPUs arm: smp: Handle ipi_cpu_backtrace() using FIQ (if available)
arch/arm/include/asm/irq.h | 5 ++ arch/arm/include/asm/smp.h | 3 + arch/arm/kernel/smp.c | 64 +++++++++++++++ arch/arm/kernel/traps.c | 8 +- drivers/irqchip/irq-gic.c | 170 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 ++ 6 files changed, 245 insertions(+), 13 deletions(-)
-- 1.9.3
Hi Thomas, Hi Jason,
This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
On platforms not capable of supporting FIQ we fall back to using IRQ to propagate the signal to generate a backtrace (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that do not (vexpress-a9 and Qualcomm Snapdragon 600).
v9:
* Improved documentation and structure of initial patch (now initial two patches) to make gic_raise_softirq() safe to call from FIQ (Thomas Gleixner).
* Avoid masking interrupts during gic_raise_softirq(). The use of the read lock makes this redundant (because we can safely re-enter the function).
v8:
* Fixed build on arm64 caused by a spurious include file in irq-gic.c.
v7-2 (accidentally released twice with same number):
* Fixed boot regression on vexpress-a9 (reported by Russell King).
* Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
* Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
v7:
* Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested offending patch in isolation using defconfig identified by the autobuilder.
v6:
* Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
* Modified usr_entry to optionally avoid calling into the trace code and used this in FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
* Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
* Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
v5:
* Rebased on 3.17-rc4.
* Removed a spurious line from the final "glue it together" patch that broke the build.
v4:
* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
* Really fix bad pt_regs pointer generation in __fiq_abt.
* Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King)
* Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
v3:
* Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
v2:
* Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre)
* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
* Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
* This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
* Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
* Add arm64 version of fiq.h (review of Russell King)
* Removed conditional branching and code from irq-gic.c, this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King)
Daniel Thompson (5): irqchip: gic: Finer grain locking for gic_raise_softirq irqchip: gic: Make gic_raise_softirq() FIQ-safe irqchip: gic: Introduce plumbing for IPI FIQ ARM: add basic support for on-demand backtrace of other CPUs arm: smp: Handle ipi_cpu_backtrace() using FIQ (if available)
arch/arm/include/asm/irq.h | 5 ++ arch/arm/include/asm/smp.h | 3 + arch/arm/kernel/smp.c | 64 ++++++++++++++ arch/arm/kernel/traps.c | 8 +- drivers/irqchip/irq-gic.c | 192 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 ++ 6 files changed, 265 insertions(+), 15 deletions(-)
-- 1.9.3
irq_controller_lock is used for multiple purposes within the gic driver. Primarily it is used to make register read-modify-write sequences atomic. It is also used by gic_raise_softirq() in order that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
The second usage of irq_controller_lock is difficult to discern when reviewing the code because the migration itself takes place outside the lock.
This patch makes the second usage more explicit by splitting it out into a separate lock and providing better comments.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com --- drivers/irqchip/irq-gic.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 38493ff28fa5..bb4bc20573ea 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -73,6 +73,12 @@ struct gic_chip_data { static DEFINE_RAW_SPINLOCK(irq_controller_lock);
/* + * This lock is used by the big.LITTLE migration code to ensure no + * IPIs can be pended on the old core after the map has been updated. + */ +static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock); + +/* * The GIC mapping of CPU interfaces does not necessarily match * the logical CPU numbering. Let's use a mapping as returned * by the GIC itself. @@ -624,7 +630,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) int cpu; unsigned long flags, map = 0;
- raw_spin_lock_irqsave(&irq_controller_lock, flags); + raw_spin_lock_irqsave(&cpu_map_migration_lock, flags);
/* Convert our logical CPU mask into a physical one. */ for_each_cpu(cpu, mask) @@ -639,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) /* this always happens on GIC0 */ writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
- raw_spin_unlock_irqrestore(&irq_controller_lock, flags); + raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags); } #endif
@@ -710,8 +716,17 @@ void gic_migrate_target(unsigned int new_cpu_id)
raw_spin_lock(&irq_controller_lock);
- /* Update the target interface for this logical CPU */ + /* + * Update the target interface for this logical CPU + * + * From the point we release the cpu_map_migration_lock any new + * SGIs will be pended on the new cpu which makes the set of SGIs + * pending on the old cpu static. That means we can defer the + * migration until after we have released the irq_controller_lock. + */ + raw_spin_lock(&cpu_map_migration_lock); gic_cpu_map[cpu] = 1 << new_cpu_id; + raw_spin_unlock(&cpu_map_migration_lock);
/* * Find all the peripheral interrupts targetting the current
Hi Daniel,
On 25/11/14 17:26, Daniel Thompson wrote:
irq_controller_lock is used for multiple purposes within the gic driver. Primarily it is used to make register read-modify-write sequences atomic. It is also used by gic_raise_softirq() in order that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
The second usage of irq_controller_lock is difficult to discern when reviewing the code because the migration itself takes place outside the lock.
This patch makes the second usage more explicit by splitting it out into a separate lock and providing better comments.
While we're at it, how about an additional patch that would make this lock disappear entirely when the big-little stuff is not compiled in, which is likely to be the case on a lot of (dare I say most?) systems? That will save expensive barriers that we definitely could do without.
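A minimal sketch of the kind of change being suggested here, assuming the big.LITTLE switcher (CONFIG_BL_SWITCHER) is the only writer of the CPU map; the gic_migration_lock()/gic_migration_unlock() helper names are invented for illustration and any real patch may look different:

	#ifdef CONFIG_BL_SWITCHER
	static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);

	static inline void gic_migration_lock(unsigned long *flags)
	{
		raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
	}

	static inline void gic_migration_unlock(unsigned long flags)
	{
		raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
	}
	#else
	/*
	 * No switcher: gic_cpu_map[] is immutable after boot, so keep the
	 * IRQ masking but drop the lock and its barriers entirely.
	 */
	static inline void gic_migration_lock(unsigned long *flags)
	{
		local_irq_save(*flags);
	}

	static inline void gic_migration_unlock(unsigned long flags)
	{
		local_irq_restore(flags);
	}
	#endif

gic_raise_softirq() would then call these helpers instead of taking the lock directly.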
It otherwise looks good to me.
Thanks,
M.
On Tue, 25 Nov 2014, Marc Zyngier wrote:
Hi Daniel,
On 25/11/14 17:26, Daniel Thompson wrote:
irq_controller_lock is used for multiple purposes within the gic driver. Primarily it is used to make register read-modify-write sequences atomic. It is also used by gic_raise_softirq() in order that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
The second usage of irq_controller_lock is difficult to discern when reviewing the code because the migration itself takes place outside the lock.
This patch makes the second usage more explicit by splitting it out into a separate lock and providing better comments.
While we're at it, how about an additional patch that would make this lock disappear entirely when the big-little stuff is not compiled in, which is likely to be the case on a lot of (dare I say most?) systems? That will save expensive barriers that we definitely could do without.
For the record, I reviewed and ACKed a patch doing exactly that a while ago:
http://lkml.org/lkml/2014/8/13/486
As far as I can see, no follow-ups happened.
Nicolas
On 25/11/14 20:17, Nicolas Pitre wrote:
On Tue, 25 Nov 2014, Marc Zyngier wrote:
Hi Daniel,
On 25/11/14 17:26, Daniel Thompson wrote:
irq_controller_lock is used for multiple purposes within the gic driver. Primarily it is used to make register read-modify-write sequences atomic. It is also used by gic_raise_softirq() in order that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
The second usage of irq_controller_lock is difficult to discern when reviewing the code because the migration itself takes place outside the lock.
This patch makes the second usage more explicit by splitting it out into a separate lock and providing better comments.
While we're at it, how about an additional patch that would make this lock disappear entirely when the big-little stuff is not compiled in, which is likely to be the case on a lot of (dare I say most?) systems? That will save expensive barriers that we definitely could do without.
For the record, I reviewed and ACKed a patch doing exactly that a while ago:
Well remembered! That patch had a different motivation but is very similar to mine... so much so I might steal bits of it.
I'll make sure I put Stephen on Cc: when I respin with the changes Marc requested.
On 11/25/2014 01:10 PM, Daniel Thompson wrote:
On 25/11/14 20:17, Nicolas Pitre wrote:
On Tue, 25 Nov 2014, Marc Zyngier wrote:
Hi Daniel,
On 25/11/14 17:26, Daniel Thompson wrote:
irq_controller_lock is used for multiple purposes within the gic driver. Primarily it is used to make register read-modify-write sequences atomic. It is also used by gic_raise_softirq() in order that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
The second usage of irq_controller_lock is difficult to discern when reviewing the code because the migration itself takes place outside the lock.
This patch makes the second usage more explicit by splitting it out into a separate lock and providing better comments.
While we're at it, how about an additional patch that would make this lock disappear entirely when the big-little stuff is not compiled in, which is likely to be the case on a lot of (dare I say most?) systems? That will save expensive barriers that we definitely could do without.
For the record, I reviewed and ACKed a patch doing exactly that a while ago:
Well remembered! That patch had a different motivation but is very similar to mine... so much so I might steal bits of it.
I'll make sure I put Stephen on Cc: when I respin with the changes Marc requested.
I don't get a random Cc here? :-)
Anyway, yes please let's merge that patch.
On 25/11/14 17:40, Marc Zyngier wrote:
Hi Daniel,
On 25/11/14 17:26, Daniel Thompson wrote:
irq_controller_lock is used for multiple purposes within the gic driver. Primarily it is used to make register read-modify-write sequences atomic. It is also used by gic_raise_softirq() in order that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
The second usage of irq_controller_lock is difficult to discern when reviewing the code because the migration itself takes place outside the lock.
This patch makes the second usage more explicit by splitting it out into a separate lock and providing better comments.
While we're at it, how about an additional patch that would make this lock disappear entirely when the big-little stuff is not compiled in, which is likely to be the case on a lot of (dare I say most?) systems? That will save expensive barriers that we definitely could do without.
Will do.
It otherwise looks good to me.
Thanks,
M.
It is currently possible for FIQ handlers to re-enter gic_raise_softirq() and lock up.
	gic_raise_softirq()
	   lock(x);
	---> FIQ
	        handle_fiq()
	           gic_raise_softirq()
	              lock(x);		<-- Lockup
Calling printk() from a FIQ handler can trigger this problem because printk() raises an IPI when it needs to wake_up_klogd(). More generally, IPIs are the only means for FIQ handlers to safely defer work to a less restrictive calling context, so the function used to raise them really needs to be FIQ-safe.
This patch fixes the problem by converting the cpu_map_migration_lock into a rwlock, making it safe to re-enter the function.
Having made it safe to re-enter gic_raise_softirq() we no longer need to mask interrupts during gic_raise_softirq() because the b.L migration is always performed from task context.
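The property relied upon is that read locks nest: a FIQ that interrupts a reader and re-enters gic_raise_softirq() simply takes the read lock again, while the writer only ever runs in task context with FIQs masked, so it can never be interrupted while holding the lock. A minimal sketch of the two sides under those assumptions (the function names here are illustrative, not the literal patch below):

	/*
	 * Reader: safe in any context, including FIQ/NMI. A nested
	 * read_lock() on the same rwlock always succeeds, so re-entry
	 * cannot deadlock.
	 */
	static void sketch_raise_ipi(void)
	{
		read_lock(&cpu_map_migration_lock);
		/* sample gic_cpu_map[] and write GIC_DIST_SOFTINT here */
		read_unlock(&cpu_map_migration_lock);
	}

	/*
	 * Writer: task context only, with IRQs and FIQs locally masked,
	 * so no FIQ can arrive on this CPU while the write lock is held;
	 * readers on other CPUs just spin briefly.
	 */
	static void sketch_migrate_map(unsigned int cpu, unsigned int new_cpu_id)
	{
		write_lock(&cpu_map_migration_lock);
		gic_cpu_map[cpu] = 1 << new_cpu_id;
		write_unlock(&cpu_map_migration_lock);
	}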
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com --- drivers/irqchip/irq-gic.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index bb4bc20573ea..a53aa11e4f17 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -75,8 +75,11 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock); /* * This lock is used by the big.LITTLE migration code to ensure no * IPIs can be pended on the old core after the map has been updated. + * + * This lock may be locked for reading from FIQ handlers and therefore + * must not be locked for writing when FIQs are enabled. */ -static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock); +static DEFINE_RWLOCK(cpu_map_migration_lock);
/* * The GIC mapping of CPU interfaces does not necessarily match @@ -625,12 +628,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic) #endif
#ifdef CONFIG_SMP +/* + * Raise the specified IPI on all cpus set in mask. + * + * This function is safe to call from all calling contexts, including + * FIQ handlers. It relies on read locks being multiply acquirable to + * avoid deadlocks when the function is re-entered at different + * exception levels. + */ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; - unsigned long flags, map = 0; + unsigned long map = 0;
- raw_spin_lock_irqsave(&cpu_map_migration_lock, flags); + read_lock(&cpu_map_migration_lock);
/* Convert our logical CPU mask into a physical one. */ for_each_cpu(cpu, mask) @@ -645,7 +656,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) /* this always happens on GIC0 */ writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
- raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags); + read_unlock(&cpu_map_migration_lock); } #endif
@@ -693,7 +704,8 @@ int gic_get_cpu_id(unsigned int cpu) * Migrate all peripheral interrupts with a target matching the current CPU * to the interface corresponding to @new_cpu_id. The CPU interface mapping * is also updated. Targets to other CPU interfaces are unchanged. - * This must be called with IRQs locally disabled. + * This must be called from a task context and with IRQ and FIQ locally + * disabled. */ void gic_migrate_target(unsigned int new_cpu_id) { @@ -724,9 +736,9 @@ void gic_migrate_target(unsigned int new_cpu_id) * pending on the old cpu static. That means we can defer the * migration until after we have released the irq_controller_lock. */ - raw_spin_lock(&cpu_map_migration_lock); + write_lock(&cpu_map_migration_lock); gic_cpu_map[cpu] = 1 << new_cpu_id; - raw_spin_unlock(&cpu_map_migration_lock); + write_unlock(&cpu_map_migration_lock);
/* * Find all the peripheral interrupts targetting the current
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware except GICv1-without-TrustZone-support provides a means to group exceptions into group 0 and group 1, but the hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups does not deploy and all IPIs will be raised via IRQ.
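The detection logic the patch relies on can be distilled into a simple probe, sketched below (the gic_has_group1() helper is invented for illustration; the register and bit names come from <linux/irqchip/arm-gic.h>): write EnableGrp1 and read it back, counting on the register being RAZ/WI where the feature is absent or secure-only.

	static bool gic_has_group1(void __iomem *dist_base)
	{
		/* Try to enable both groups... */
		writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
			       dist_base + GIC_DIST_CTRL);

		/*
		 * ...and read the result back. If EnableGrp1 is RAZ/WI
		 * (GICv1 without TrustZone support, or a non-secure view
		 * of the distributor) the bit reads back as zero and
		 * grouping stays unused.
		 */
		return !!(readl_relaxed(dist_base + GIC_DIST_CTRL) &
			  GICD_ENABLE_GRP1);
	}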
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com Tested-by: Jon Medhurst tixy@linaro.org --- arch/arm/kernel/traps.c | 5 +- drivers/irqchip/irq-gic.c | 155 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 +++ 3 files changed, 158 insertions(+), 10 deletions(-)
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 0c8b10801d36..4dc45b38e56e 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -26,6 +26,7 @@ #include <linux/init.h> #include <linux/sched.h> #include <linux/irq.h> +#include <linux/irqchip/arm-gic.h>
#include <linux/atomic.h> #include <asm/cacheflush.h> @@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
nmi_enter();
- /* nop. FIQ handlers for special arch/arm features can be added here. */ +#ifdef CONFIG_ARM_GIC + gic_handle_fiq_ipi(); +#endif
nmi_exit();
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index a53aa11e4f17..dfec7a4c1c64 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -39,6 +39,7 @@ #include <linux/slab.h> #include <linux/irqchip/chained_irq.h> #include <linux/irqchip/arm-gic.h> +#include <linux/ratelimit.h>
#include <asm/cputype.h> #include <asm/irq.h> @@ -48,6 +49,10 @@ #include "irq-gic-common.h" #include "irqchip.h"
+#ifndef SMP_IPI_FIQ_MASK +#define SMP_IPI_FIQ_MASK 0 +#endif + union gic_base { void __iomem *common_base; void __percpu * __iomem *percpu_base; @@ -333,6 +338,93 @@ static struct irq_chip gic_chip = { .irq_set_wake = gic_set_wake, };
+/* + * Shift an interrupt between Group 0 and Group 1. + * + * In addition to changing the group we also modify the priority to + * match what "ARM strongly recommends" for a system where no Group 1 + * interrupt must ever preempt a Group 0 interrupt. + * + * It is safe to call this function on systems which do not support + * grouping (it will have no effect). + */ +static void gic_set_group_irq(void __iomem *base, unsigned int hwirq, + int group) +{ + unsigned int grp_reg = hwirq / 32 * 4; + u32 grp_mask = BIT(hwirq % 32); + u32 grp_val; + + unsigned int pri_reg = (hwirq / 4) * 4; + u32 pri_mask = BIT(7 + ((hwirq % 4) * 8)); + u32 pri_val; + + /* + * Systems which do not support grouping will not have + * the EnableGrp1 bit set. + */ + if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))) + return; + + raw_spin_lock(&irq_controller_lock); + + grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg); + pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg); + + if (group) { + grp_val |= grp_mask; + pri_val |= pri_mask; + } else { + grp_val &= ~grp_mask; + pri_val &= ~pri_mask; + } + + writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg); + writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg); + + raw_spin_unlock(&irq_controller_lock); +} + +/* + * Test which group an interrupt belongs to. + * + * Returns 0 if the controller does not support grouping. + */ +static int gic_get_group_irq(void __iomem *base, unsigned int hwirq) +{ + unsigned int grp_reg = hwirq / 32 * 4; + u32 grp_val; + + grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg); + + return (grp_val >> (hwirq % 32)) & 1; +} + +/* + * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI, + * otherwise do nothing. + */ +void gic_handle_fiq_ipi(void) +{ + struct gic_chip_data *gic = &gic_data[0]; + void __iomem *cpu_base = gic_data_cpu_base(gic); + unsigned long irqstat, irqnr; + + if (WARN_ON(!in_nmi())) + return; + + while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) & + SMP_IPI_FIQ_MASK) { + irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK); + writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI); + + irqnr = irqstat & GICC_IAR_INT_ID_MASK; + WARN_RATELIMIT(irqnr > 16, + "Unexpected irqnr %lu (bad prioritization?)\n", + irqnr); + } +} + void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq) { if (gic_nr >= MAX_GIC_NR) @@ -364,15 +456,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic) static void gic_cpu_if_up(void) { void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]); - u32 bypass = 0; + void __iomem *dist_base = gic_data_dist_base(&gic_data[0]); + u32 ctrl = 0;
/* - * Preserve bypass disable bits to be written back later - */ - bypass = readl(cpu_base + GIC_CPU_CTRL); - bypass &= GICC_DIS_BYPASS_MASK; + * Preserve bypass disable bits to be written back later + */ + ctrl = readl(cpu_base + GIC_CPU_CTRL); + ctrl &= GICC_DIS_BYPASS_MASK;
- writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); + /* + * If EnableGrp1 is set in the distributor then enable group 1 + * support for this CPU (and route group 0 interrupts to FIQ). + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) + ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL | + GICC_ENABLE_GRP1; + + writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); }
@@ -396,7 +497,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
gic_dist_config(base, gic_irqs, NULL);
- writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL); + /* + * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only, + * bit 1 ignored) depending on current mode. + */ + writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL); + + /* + * Set all global interrupts to be group 1 if (and only if) it + * is possible to enable group 1 interrupts. This register is RAZ/WI + * if not accessible or not implemented, however some GICv1 devices + * do not implement the EnableGrp1 bit making it unsafe to set + * this register unconditionally. + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)) + for (i = 32; i < gic_irqs; i += 32) + writel_relaxed(0xffffffff, + base + GIC_DIST_IGROUP + i * 4 / 32); }
static void gic_cpu_init(struct gic_chip_data *gic) @@ -405,6 +522,7 @@ static void gic_cpu_init(struct gic_chip_data *gic) void __iomem *base = gic_data_cpu_base(gic); unsigned int cpu_mask, cpu = smp_processor_id(); int i; + unsigned long secure_irqs, secure_irq;
/* * Get what the GIC says our CPU mask is. @@ -423,6 +541,19 @@ static void gic_cpu_init(struct gic_chip_data *gic)
gic_cpu_config(dist_base, NULL);
+ /* + * If the distributor is configured to support interrupt grouping + * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK + * to be group1 and ensure any remaining group 0 interrupts have + * the right priority. + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) { + secure_irqs = SMP_IPI_FIQ_MASK; + writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0); + for_each_set_bit(secure_irq, &secure_irqs, 16) + gic_set_group_irq(dist_base, secure_irq, 0); + } + writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK); gic_cpu_if_up(); } @@ -512,7 +643,8 @@ static void gic_dist_restore(unsigned int gic_nr) writel_relaxed(gic_data[gic_nr].saved_spi_enable[i], dist_base + GIC_DIST_ENABLE_SET + i * 4);
- writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL); + writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, + dist_base + GIC_DIST_CTRL); }
static void gic_cpu_save(unsigned int gic_nr) @@ -640,6 +772,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; unsigned long map = 0; + unsigned long softint;
read_lock(&cpu_map_migration_lock);
@@ -654,7 +787,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) dmb(ishst);
/* this always happens on GIC0 */ - writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT); + softint = map << 16 | irq; + if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq)) + softint |= 0x8000; + writel_relaxed(softint, + gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
read_unlock(&cpu_map_migration_lock); } diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h index 13eed92c7d24..a906fb7ac11f 100644 --- a/include/linux/irqchip/arm-gic.h +++ b/include/linux/irqchip/arm-gic.h @@ -22,6 +22,10 @@ #define GIC_CPU_IDENT 0xfc
#define GICC_ENABLE 0x1 +#define GICC_ENABLE_GRP1 0x2 +#define GICC_ACK_CTL 0x4 +#define GICC_FIQ_EN 0x8 +#define GICC_COMMON_BPR 0x10 #define GICC_INT_PRI_THRESHOLD 0xf0 #define GICC_IAR_INT_ID_MASK 0x3ff #define GICC_INT_SPURIOUS 1023 @@ -44,6 +48,7 @@ #define GIC_DIST_SGI_PENDING_SET 0xf20
#define GICD_ENABLE 0x1 +#define GICD_ENABLE_GRP1 0x2 #define GICD_DISABLE 0x0 #define GICD_INT_ACTLOW_LVLTRIG 0x0 #define GICD_INT_EN_CLR_X32 0xffffffff @@ -117,5 +122,8 @@ static inline void __init register_routable_domain_ops { gic_routable_irq_domain_ops = ops; } + +void gic_handle_fiq_ipi(void); + #endif /* __ASSEMBLY */ #endif
I would be quite happy if grouping support for the gic were mainlined. Then only the dance to get the old gic version 1 working with FIQs would be needed...
I have one comment inline below on what seems like a race to me.
On Tuesday, 25 November 2014, 17:26:39, Daniel Thompson wrote:
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware except GICv1-without-TrustZone-support provides a means to group exceptions into group 0 and group 1, but the hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups does not deploy and all IPIs will be raised via IRQ.
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com Tested-by: Jon Medhurst tixy@linaro.org
arch/arm/kernel/traps.c | 5 +- drivers/irqchip/irq-gic.c | 155 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 +++ 3 files changed, 158 insertions(+), 10 deletions(-)
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 0c8b10801d36..4dc45b38e56e 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -26,6 +26,7 @@ #include <linux/init.h> #include <linux/sched.h> #include <linux/irq.h> +#include <linux/irqchip/arm-gic.h>
#include <linux/atomic.h> #include <asm/cacheflush.h> @@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
nmi_enter();
- /* nop. FIQ handlers for special arch/arm features can be added here. */ +#ifdef CONFIG_ARM_GIC + gic_handle_fiq_ipi(); +#endif
nmi_exit();
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index a53aa11e4f17..dfec7a4c1c64 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -39,6 +39,7 @@ #include <linux/slab.h> #include <linux/irqchip/chained_irq.h> #include <linux/irqchip/arm-gic.h> +#include <linux/ratelimit.h>
#include <asm/cputype.h> #include <asm/irq.h> @@ -48,6 +49,10 @@ #include "irq-gic-common.h" #include "irqchip.h"
+#ifndef SMP_IPI_FIQ_MASK +#define SMP_IPI_FIQ_MASK 0 +#endif
union gic_base { void __iomem *common_base; void __percpu * __iomem *percpu_base; @@ -333,6 +338,93 @@ static struct irq_chip gic_chip = { .irq_set_wake = gic_set_wake, };
+/* + * Shift an interrupt between Group 0 and Group 1. + * + * In addition to changing the group we also modify the priority to + * match what "ARM strongly recommends" for a system where no Group 1 + * interrupt must ever preempt a Group 0 interrupt. + * + * It is safe to call this function on systems which do not support + * grouping (it will have no effect). + */ +static void gic_set_group_irq(void __iomem *base, unsigned int hwirq, + int group) +{ + unsigned int grp_reg = hwirq / 32 * 4; + u32 grp_mask = BIT(hwirq % 32); + u32 grp_val; + + unsigned int pri_reg = (hwirq / 4) * 4; + u32 pri_mask = BIT(7 + ((hwirq % 4) * 8)); + u32 pri_val; + + /* + * Systems which do not support grouping will not have + * the EnableGrp1 bit set. + */ + if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))) + return; + + raw_spin_lock(&irq_controller_lock);
Assumption: The interrupt in question is not masked over here?
+ grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg); + pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg); + + if (group) { + grp_val |= grp_mask; + pri_val |= pri_mask; + } else { + grp_val &= ~grp_mask; + pri_val &= ~pri_mask; + } + + writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
If the assumption is true, then there is a race if the interrupt in question hits here with an undefined priority setting. Recommended workaround would be masking the interrupt in question.
+ writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg); + + raw_spin_unlock(&irq_controller_lock); +}
+/*
- Test which group an interrupt belongs to.
- Returns 0 if the controller does not support grouping.
- */
+static int gic_get_group_irq(void __iomem *base, unsigned int hwirq) +{
- unsigned int grp_reg = hwirq / 32 * 4;
- u32 grp_val;
- grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
- return (grp_val >> (hwirq % 32)) & 1;
+}
+/*
- Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
- otherwise do nothing.
- */
+void gic_handle_fiq_ipi(void) +{
- struct gic_chip_data *gic = &gic_data[0];
- void __iomem *cpu_base = gic_data_cpu_base(gic);
- unsigned long irqstat, irqnr;
- if (WARN_ON(!in_nmi()))
return;
- while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
SMP_IPI_FIQ_MASK) {
irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
irqnr = irqstat & GICC_IAR_INT_ID_MASK;
WARN_RATELIMIT(irqnr > 16,
"Unexpected irqnr %lu (bad prioritization?)\n",
irqnr);
- }
+}
void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq) { if (gic_nr >= MAX_GIC_NR) @@ -364,15 +456,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic) static void gic_cpu_if_up(void) { void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
- u32 bypass = 0;
void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
u32 ctrl = 0;
/*
- Preserve bypass disable bits to be written back later
- */
- bypass = readl(cpu_base + GIC_CPU_CTRL);
- bypass &= GICC_DIS_BYPASS_MASK;
* Preserve bypass disable bits to be written back later
*/
- ctrl = readl(cpu_base + GIC_CPU_CTRL);
- ctrl &= GICC_DIS_BYPASS_MASK;
- writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
- /*
* If EnableGrp1 is set in the distributor then enable group 1
* support for this CPU (and route group 0 interrupts to FIQ).
*/
- if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
GICC_ENABLE_GRP1;
- writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
}
@@ -396,7 +497,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
gic_dist_config(base, gic_irqs, NULL);
- writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
- /*
* Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
* bit 1 ignored) depending on current mode.
*/
- writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
- /*
* Set all global interrupts to be group 1 if (and only if) it
* is possible to enable group 1 interrupts. This register is RAZ/WI
* if not accessible or not implemented, however some GICv1 devices
* do not implement the EnableGrp1 bit making it unsafe to set
* this register unconditionally.
*/
- if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
for (i = 32; i < gic_irqs; i += 32)
writel_relaxed(0xffffffff,
base + GIC_DIST_IGROUP + i * 4 / 32);
}
static void gic_cpu_init(struct gic_chip_data *gic) @@ -405,6 +522,7 @@ static void gic_cpu_init(struct gic_chip_data *gic) void __iomem *base = gic_data_cpu_base(gic); unsigned int cpu_mask, cpu = smp_processor_id(); int i;
unsigned long secure_irqs, secure_irq;
/*
- Get what the GIC says our CPU mask is.
@@ -423,6 +541,19 @@ static void gic_cpu_init(struct gic_chip_data *gic)
gic_cpu_config(dist_base, NULL);
- /*
* If the distributor is configured to support interrupt grouping
* then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
* to be group1 and ensure any remaining group 0 interrupts have
* the right priority.
*/
- if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
secure_irqs = SMP_IPI_FIQ_MASK;
writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
for_each_set_bit(secure_irq, &secure_irqs, 16)
gic_set_group_irq(dist_base, secure_irq, 0);
- }
- writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK); gic_cpu_if_up();
} @@ -512,7 +643,8 @@ static void gic_dist_restore(unsigned int gic_nr) writel_relaxed(gic_data[gic_nr].saved_spi_enable[i], dist_base + GIC_DIST_ENABLE_SET + i * 4);
- writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
- writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
dist_base + GIC_DIST_CTRL);
}
static void gic_cpu_save(unsigned int gic_nr) @@ -640,6 +772,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; unsigned long map = 0;
unsigned long softint;
read_lock(&cpu_map_migration_lock);
@@ -654,7 +787,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) dmb(ishst);
/* this always happens on GIC0 */
- writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) +
GIC_DIST_SOFTINT); + softint = map << 16 | irq;
if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
softint |= 0x8000;
writel_relaxed(softint,
gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
read_unlock(&cpu_map_migration_lock);
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 13eed92c7d24..a906fb7ac11f 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc

 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023

@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20

 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff

@@ -117,5 +122,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
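For context, the gic_get_group_irq() helper used in the gic_raise_softirq() hunk above is not shown in this excerpt. A minimal sketch consistent with the GICD_IGROUPR layout (an illustration, not the literal code from the patch) might be:

static int gic_get_group_irq(void __iomem *base, unsigned int hwirq)
{
	unsigned int grp_reg = hwirq / 32 * 4;
	u32 grp_mask = BIT(hwirq % 32);

	/* Non-zero if hwirq is currently configured as Group 1 */
	return !!(readl_relaxed(base + GIC_DIST_IGROUP + grp_reg) & grp_mask);
}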
On 26/11/14 15:09, Tim Sander wrote:
I would be quite happy if grouping support for the GIC were mainlined. Then only the dance to get the old GIC version 1 working with FIQs would be needed...
You mention "the dance"...
Are you familiar with this work from Marek Vasut? https://lkml.org/lkml/2014/7/15/550
Marek blushed a bit when it was written and it wasn't very popular in code review... however it does arrange for memory to be mapped in a manner that allows FIQ to be deployed by the kernel on early GICv1 devices.
+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(void __iomem *base, unsigned int hwirq,
+			      int group)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have
+	 * the EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
Assumption: The interrupt in question is not masked over here?
At present this function is called only during initialization and all interrupts are globally disabled at that stage in the boot.
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
If the assumption is true, then there is a race if the interrupt in question hits here with an undefined priority setting. The recommended workaround would be to mask the interrupt in question.
An interesting question!
Firstly, as mentioned above, such a race is impossible with the code proposed so far.
I do have some code sitting around, written but untested, that makes it possible to set the group based on a flag passed during request_irq() (something requested by tglx in a review from a month or two back). That also means the interrupt is disabled during the call.
I think that means that neither now nor in the immediate future would such a race be possible.
Daniel.
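For illustration, a minimal sketch (not part of the patchset) of the masking workaround discussed above, should it ever be needed. It reuses the GIC_DIST_ENABLE_SET/CLEAR registers already defined in irqchip/arm-gic.h and assumes it runs in the same file as gic_set_group_irq(); the wrapper name is hypothetical:

static void gic_set_group_irq_masked(void __iomem *base, unsigned int hwirq,
				     int group)
{
	unsigned int reg = (hwirq / 32) * 4;
	u32 mask = BIT(hwirq % 32);
	bool was_enabled;

	/* Mask the interrupt at the distributor while it is regrouped. */
	was_enabled = readl_relaxed(base + GIC_DIST_ENABLE_SET + reg) & mask;
	if (was_enabled)
		writel_relaxed(mask, base + GIC_DIST_ENABLE_CLEAR + reg);

	gic_set_group_irq(base, hwirq, group);

	/* Re-enable only if the interrupt was enabled on entry. */
	if (was_enabled)
		writel_relaxed(mask, base + GIC_DIST_ENABLE_SET + reg);
}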
On Wednesday, 26 November 2014, 15:48:47, Daniel Thompson wrote:
On 26/11/14 15:09, Tim Sander wrote:
I would be quite happy if grouping support for the GIC were mainlined. Then only the dance to get the old GIC version 1 working with FIQs would be needed...
You mention "the dance"...
Are you familiar with this work from Marek Vasut? https://lkml.org/lkml/2014/7/15/550
The world is a small place, isn't it? Unfortunately yes... and that is not because Marek is not a nice guy (quite the contrary) but because of the way it solves the problem we had with the GIC in the socfpga. There should have been some pins from the FPGA fabric to the "legacy" FIQ interrupt "pins" of the core. Unfortunately these were forgotten...

Marek also had an approach similar to yours, checking if the irq is wrongly signalled. In our workload the performance was much too poor to consider it a solution (which is contrary to Harro Haan's findings, but we have a magnitude higher FIQ load). So he got a hint from a French guy (I forgot the name) who had the idea to use a non-secure mapping to read the irq id, as fiq ids must not be read by non-secure reads.

This leads to the question I was also asking Marc Zyngier at LinuxCon: whether this approach is mainlinable in any way.

And just to get the message out there, especially to ARM: yes, there are users of FIQ interrupts who want to use Linux in combination with FIQs and who don't want to resort to Cortex-R cores without an MMU. And seeing that ARM is deprecating the use of FIQ on ARM64, I wonder what a solution to have IRQs not masked by Linux looks like for upcoming processor generations.

Marek blushed a bit when it was written and it wasn't very popular in code review... however it does arrange for memory to be mapped in a manner that allows FIQ to be deployed by the kernel on early GICv1 devices.

In a way I made him indirectly do it by asking the right questions to the silicon vendor.
Add basic infrastructure for triggering a backtrace of other CPUs via an IPI, preferably at FIQ level. It is intended that this shall be used for cases where we have detected that something has already failed in the kernel.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm/include/asm/irq.h |  5 ++++
 arch/arm/kernel/smp.c      | 62 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 53c15dec7af6..be1d07d59ee9 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 #endif

+#ifdef CONFIG_SMP
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+#endif
+
 #endif

 #endif

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 13396d3d600e..14c594a12bef 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -72,8 +72,12 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_IRQ_WORK,
 	IPI_COMPLETION,
+	IPI_CPU_BACKTRACE,
 };

+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
 static DECLARE_COMPLETION(cpu_running);

 static struct smp_operations smp_ops;

@@ -535,6 +539,21 @@ static void ipi_cpu_stop(unsigned int cpu)
 		cpu_relax();
 }

+static void ipi_cpu_backtrace(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+		arch_spin_lock(&lock);
+		printk(KERN_WARNING "FIQ backtrace for cpu %d\n", cpu);
+		show_regs(regs);
+		arch_spin_unlock(&lock);
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+}
+
 static DEFINE_PER_CPU(struct completion *, cpu_completion);

 int register_ipi_completion(struct completion *completion, int cpu)

@@ -614,6 +633,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		irq_exit();
 		break;

+	case IPI_CPU_BACKTRACE:
+		irq_enter();
+		ipi_cpu_backtrace(regs);
+		irq_exit();
+		break;
+
 	default:
 		printk(KERN_CRIT "CPU%u: Unknown IPI message 0x%x\n",
 		       cpu, ipinr);

@@ -708,3 +733,40 @@ static int __init register_cpufreq_notifier(void)
 core_initcall(register_cpufreq_notifier);
 #endif
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	static unsigned long backtrace_flag;
+	int i, cpu = get_cpu();
+
+	if (test_and_set_bit(0, &backtrace_flag)) {
+		/*
+		 * If there is already a trigger_all_cpu_backtrace() in progress
+		 * (backtrace_flag == 1), don't output double cpu dump infos.
+		 */
+		put_cpu();
+		return;
+	}
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+	if (!include_self)
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_info("Sending FIQ to %s CPUs:\n",
+			(include_self ? "all" : "other"));
+		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
+	}
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+
+		mdelay(1);
+	}
+
+	clear_bit(0, &backtrace_flag);
+	smp_mb__after_atomic();
+	put_cpu();
+}
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM but these are currently independent of each other.
This patch plumbs together these features making it possible, on platforms that support it, to trigger backtrace using FIQ.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm/include/asm/smp.h | 3 +++
 arch/arm/kernel/smp.c      | 4 +++-
 arch/arm/kernel/traps.c    | 3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 18f5a554134f..b076584ac0fa 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -18,6 +18,8 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif

+#define SMP_IPI_FIQ_MASK 0x0100
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)

 struct seq_file;

@@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);

+extern void ipi_cpu_backtrace(struct pt_regs *regs);
 extern int register_ipi_completion(struct completion *completion, int cpu);

 struct smp_operations {

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 14c594a12bef..e923843562d9 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -539,7 +539,7 @@ static void ipi_cpu_stop(unsigned int cpu)
 		cpu_relax();
 }

-static void ipi_cpu_backtrace(struct pt_regs *regs)
+void ipi_cpu_backtrace(struct pt_regs *regs)
 {
 	int cpu = smp_processor_id();

@@ -580,6 +580,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 	unsigned int cpu = smp_processor_id();
 	struct pt_regs *old_regs = set_irq_regs(regs);

+	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));
+
 	if ((unsigned)ipinr < NR_IPI) {
 		trace_ipi_entry(ipi_types[ipinr]);
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 4dc45b38e56e..9eb05be9526e 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#ifdef CONFIG_SMP
+	ipi_cpu_backtrace(regs);
+#endif
nmi_exit();
Hi Daniel
On Tuesday, 25 November 2014, 17:26:41, Daniel Thompson wrote:
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM but these are currently independent of each other.
This patch plumbs together these features making it possible, on platforms that support it, to trigger backtrace using FIQ.
Does this ipi handler interfere in any way with set_fiq_handler?
As far as I know there is only one FIQ handler vector so I guess there is a potential conflict. But I have not worked with IPIs so I might be completely wrong.
Regards Tim
On Wed, Nov 26, 2014 at 01:46:52PM +0100, Tim Sander wrote:
Hi Daniel
On Tuesday, 25 November 2014, 17:26:41, Daniel Thompson wrote:
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM but these are currently independent of each other.
This patch plumbs together these features making it possible, on platforms that support it, to trigger backtrace using FIQ.
Does this ipi handler interfere in any way with set_fiq_handler?
As far as I know there is only one FIQ handler vector so I guess there is a potential conflict. But I have not worked with IPIs so I might be completely wrong.
First, the code in arch/arm/kernel/fiq.c should work with this new FIQ code in that the new FIQ code is used as the "default" handler (as opposed to the original handler which was a total no-op.)
Secondly, use of arch/arm/kernel/fiq.c in a SMP system is really not a good idea: the FIQ registers are private to each CPU in the system, and there is no infrastructure to allow fiq.c to ensure that it loads the right CPU with the register information for the provided handler.
So, use of arch/arm/kernel/fiq.c and the IPI's use of FIQ /should/ be mutually exclusive.
On 26/11/14 13:12, Russell King - ARM Linux wrote:
On Wed, Nov 26, 2014 at 01:46:52PM +0100, Tim Sander wrote:
Hi Daniel
On Tuesday, 25 November 2014, 17:26:41, Daniel Thompson wrote:
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM but these are currently independent of each other.
This patch plumbs together these features making it possible, on platforms that support it, to trigger backtrace using FIQ.
Does this ipi handler interfere in any way with set_fiq_handler?
As far as I know there is only one FIQ handler vector so I guess there is a potential conflict. But I have not worked with IPIs so I might be completely wrong.
First, the code in arch/arm/kernel/fiq.c should work with this new FIQ code in that the new FIQ code is used as the "default" handler (as opposed to the original handler which was a total no-op.)
Secondly, use of arch/arm/kernel/fiq.c in a SMP system is really not a good idea: the FIQ registers are private to each CPU in the system, and there is no infrastructure to allow fiq.c to ensure that it loads the right CPU with the register information for the provided handler.
So, use of arch/arm/kernel/fiq.c and the IPI's use of FIQ /should/ be mutually exclusive.
Agree with the above. Just to add...
I am currently working to get NMI features from x86 land running on top of the new default FIQ handler: arch_trigger_all_cpu_backtrace (with Russell's patch), perf, hard lockup detector, kgdb.
However I don't think anything I'm doing makes it very much harder than it already is to use arch/arm/kernel/fiq.c. That said, other than setting the GIC up nicely, I am not doing anything to make it easier either.

I'd like to end up somewhere where if you want the NMI features (and have a suitable device) you just use the default handler and it all just works. If you need *Fast* Interrupt reQuests, proper old school "I want to write an overclocked I2C slave in software" craziness, and you can pass on the improved debug features, then set_fiq_handler() is still there and still needs extremely careful handling.

The only thing I might have done to make your life worse is not provide the code to dynamically shunt all the debug and performance monitoring features back to group 1. All except the hard lockup detector will have logic to fall back statically. This means making it dynamic shouldn't be that hard. However, since there is no code in the upstream kernel that would use it, I don't plan to go there myself.
Daniel.
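For readers unfamiliar with the fiq.c interface under discussion, a minimal sketch of the classic claim_fiq()/set_fiq_handler() flow follows. The handler symbols, the fiq number parameter and the function name are hypothetical, and error handling is elided; this is an illustration, not code from the patchset:

#include <linux/string.h>
#include <asm/fiq.h>

/* Assembly stub copied into the FIQ vector; symbols are hypothetical. */
extern unsigned char my_fiq_handler_start, my_fiq_handler_end;

static struct fiq_handler my_fh = {
	.name = "my-fiq",
};

static int install_my_fiq(int my_fiq_irq)
{
	struct pt_regs regs;

	if (claim_fiq(&my_fh))
		return -EBUSY;

	/* Copy the handler into the FIQ vector. */
	set_fiq_handler(&my_fiq_handler_start,
			&my_fiq_handler_end - &my_fiq_handler_start);

	/* Preload the banked FIQ registers, e.g. a buffer pointer in r8. */
	memset(&regs, 0, sizeof(regs));
	set_fiq_regs(&regs);

	enable_fiq(my_fiq_irq);
	return 0;
}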
Hi Daniel, Russell
On Wednesday, 26 November 2014, 16:17:06, Daniel Thompson wrote:
On 26/11/14 13:12, Russell King - ARM Linux wrote:
On Wed, Nov 26, 2014 at 01:46:52PM +0100, Tim Sander wrote:
Hi Daniel
On Tuesday, 25 November 2014, 17:26:41, Daniel Thompson wrote:
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM but these are currently independent of each other.
This patch plumbs together these features making it possible, on platforms that support it, to trigger backtrace using FIQ.
Does this ipi handler interfere in any way with set_fiq_handler?
As far as I know there is only one FIQ handler vector so I guess there is a potential conflict. But I have not worked with IPIs so I might be completely wrong.
First, the code in arch/arm/kernel/fiq.c should work with this new FIQ code in that the new FIQ code is used as the "default" handler (as opposed to the original handler which was a total no-op.)
Secondly, use of arch/arm/kernel/fiq.c in a SMP system is really not a good idea: the FIQ registers are private to each CPU in the system, and there is no infrastructure to allow fiq.c to ensure that it loads the right CPU with the register information for the provided handler.
Well, given the races in the GIC v1 I have seen in the chips on my desk, initializing with for_each_possible_cpu(cpu) work_on_cpu(cpu,..) is rather easy.
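For illustration, a minimal sketch (not from the thread) of the per-CPU setup Tim describes, assuming every CPU is online when it runs and that the same banked register state is wanted on each CPU; the function names are hypothetical:

#include <linux/cpu.h>
#include <linux/workqueue.h>
#include <asm/fiq.h>

static long load_fiq_regs_on_cpu(void *data)
{
	/* Runs on the target CPU: load its private banked FIQ registers. */
	set_fiq_regs(data);
	return 0;
}

static void load_fiq_regs_all_cpus(struct pt_regs *regs)
{
	int cpu;

	for_each_online_cpu(cpu)
		work_on_cpu(cpu, load_fiq_regs_on_cpu, regs);
}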
So, use of arch/arm/kernel/fiq.c and the IPI's use of FIQ /should/ be mutually exclusive.
Yes, but I disagree with the assessment that this is a decision between SMP and non-SMP usage or the availability of the GIC.
Agree with the above. Just to add...
I am currently working to get NMI features from x86 land running on top of the new default FIQ handler: arch_trigger_all_cpu_backtrace (with Russell's patch), perf, hard lockup detector, kgdb.
However I don't think anything I'm doing makes it very much harder than it already is to use arch/arm/kernel/fiq.c. That said, other than setting the GIC up nicely, I am not doing anything to make it easier either.

I'd like to end up somewhere where if you want the NMI features (and have a suitable device) you just use the default handler and it all just works. If you need *Fast* Interrupt reQuests, proper old school "I want to write an overclocked I2C slave in software" craziness, and you can pass on the improved debug features, then set_fiq_handler() is still there and still needs extremely careful handling.
Well, I am not against these features, as they presumably improve the backtrace, but it would be nice to have a config option which switches between set_fiq_handler usage and the other conflicting usages of the FIQ.
The only thing I might have done to make your life worse is not provide the code to dynamically shunt all the debug and performance monitoring features back to group 1. All except the hard lockup detector will have logic to fall back statically. This means making it dynamic shouldn't be that hard. However, since there is no code in the upstream kernel that would use it, I don't plan to go there myself.
I don't think this needs to be dynamic, but from a user perspective a config option would be really nice.
Tim
On Fri, Nov 28, 2014 at 10:10:04AM +0100, Tim Sander wrote:
Hi Daniel, Russell
On Wednesday, 26 November 2014, 16:17:06, Daniel Thompson wrote:
On 26/11/14 13:12, Russell King - ARM Linux wrote:
On Wed, Nov 26, 2014 at 01:46:52PM +0100, Tim Sander wrote:
Hi Daniel
On Tuesday, 25 November 2014, 17:26:41, Daniel Thompson wrote:
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM but these are currently independent of each other.
This patch plumbs together these features making it possible, on platforms that support it, to trigger backtrace using FIQ.
Does this ipi handler interfere in any way with set_fiq_handler?
As far as I know there is only one FIQ handler vector so I guess there is a potential conflict. But I have not worked with IPIs so I might be completely wrong.
First, the code in arch/arm/kernel/fiq.c should work with this new FIQ code in that the new FIQ code is used as the "default" handler (as opposed to the original handler which was a total no-op.)
Secondly, use of arch/arm/kernel/fiq.c in a SMP system is really not a good idea: the FIQ registers are private to each CPU in the system, and there is no infrastructure to allow fiq.c to ensure that it loads the right CPU with the register information for the provided handler.
Well, given the races in the GIC v1 I have seen in the chips on my desk, initializing with for_each_possible_cpu(cpu) work_on_cpu(cpu,..) is rather easy.
So, use of arch/arm/kernel/fiq.c and the IPI's use of FIQ /should/ be mutually exclusive.
Yes, but I disagree with the assessment that this is a decision between SMP and non-SMP usage or the availability of the GIC.
The two things are mutually exclusive. You can either have FIQ being used for debug purposes, where we decode the FIQ reason and call some function (which means that we will only service one FIQ at a time) or you can use it in exclusive mode (provided by fiq.c) where your handler has sole usage of the vector, and benefits from fast and immediate servicing of the event.
You can't have fast and immediate servicing of the event _and_ debug usage at the same time.
Well, I am not against these features, as they presumably improve the backtrace, but it would be nice to have a config option which switches between set_fiq_handler usage and the other conflicting usages of the FIQ.
You have a config option already. CONFIG_FIQ.
Hi Russell, Daniel

On Friday, 28 November 2014, 10:08:28, Russell King - ARM Linux wrote:
On Fri, Nov 28, 2014 at 10:10:04AM +0100, Tim Sander wrote:
Hi Daniel, Russell
On Wednesday, 26 November 2014, 16:17:06, Daniel Thompson wrote:
On 26/11/14 13:12, Russell King - ARM Linux wrote:
On Wed, Nov 26, 2014 at 01:46:52PM +0100, Tim Sander wrote:
Hi Daniel
On Tuesday, 25 November 2014, 17:26:41, Daniel Thompson wrote:
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM but these are currently independent of each other.
This patch plumbs together these features making it possible, on platforms that support it, to trigger backtrace using FIQ.
Does this ipi handler interfere in any way with set_fiq_handler?
As far as I know there is only one FIQ handler vector so I guess there is a potential conflict. But I have not worked with IPIs so I might be completely wrong.
First, the code in arch/arm/kernel/fiq.c should work with this new FIQ code in that the new FIQ code is used as the "default" handler (as opposed to the original handler which was a total no-op.)
Secondly, use of arch/arm/kernel/fiq.c in a SMP system is really not a good idea: the FIQ registers are private to each CPU in the system, and there is no infrastructure to allow fiq.c to ensure that it loads the right CPU with the register information for the provided handler.
Well, given the races in the GIC v1 I have seen in the chips on my desk, initializing with for_each_possible_cpu(cpu) work_on_cpu(cpu,..) is rather easy.
So, use of arch/arm/kernel/fiq.c and the IPI's use of FIQ /should/ be mutually exclusive.
Yes, but I disagree with the assessment that this is a decision between SMP and non-SMP usage or the availability of the GIC.
The two things are mutually exclusive. You can either have FIQ being used for debug purposes, where we decode the FIQ reason and call some function (which means that we will only service one FIQ at a time) or you can use it in exclusive mode (provided by fiq.c) where your handler has sole usage of the vector, and benefits from fast and immediate servicing of the event.
As far as I am aware, the CONFIG_FIQ symbol is not pulled in by all ARM platforms. Since there are ARM platforms which don't use this symbol although the hardware is fully capable of handling FIQ requests, I would expect that, after adding CONFIG_FIQ to a platform, this platform honors the set_fiq_handler functionality.
You can't have fast and immediate servicing of the event _and_ debug usage at the same time.
Well, I am not against these features, as they presumably improve the backtrace, but it would be nice to have a config option which switches between set_fiq_handler usage and the other conflicting usages of the FIQ.
You have a config option already. CONFIG_FIQ.
Yes, but if the FIQ handler is also used for IPI, set_fiq_handler gets IPI interrupts (with the patch starting this thread)? So I think that the patch needs to look like:

--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
+#ifndef CONFIG_FIQ
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#endif

As otherwise, if the platform has CONFIG_SMP, CONFIG_FIQ and CONFIG_ARM_GIC, the GIC will get reprogrammed to deliver FIQs to the handler set by set_fiq_handler?
Best regards Tim
On Mon, Dec 01, 2014 at 11:32:00AM +0100, Tim Sander wrote:
Hi Russell, Daniel

On Friday, 28 November 2014, 10:08:28, Russell King - ARM Linux wrote:
The two things are mutually exclusive. You can either have FIQ being used for debug purposes, where we decode the FIQ reason and call some function (which means that we will only service one FIQ at a time) or you can use it in exclusive mode (provided by fiq.c) where your handler has sole usage of the vector, and benefits from fast and immediate servicing of the event.
As far as I am aware, the CONFIG_FIQ symbol is not pulled in by all ARM platforms. Since there are ARM platforms which don't use this symbol although the hardware is fully capable of handling FIQ requests, I would expect that, after adding CONFIG_FIQ to a platform, this platform honors the set_fiq_handler functionality.
That whole paragraph doesn't make much sense to me.
Look, in my mind it is very simple. If you are using CONFIG_FIQ on a SMP platform, your life will be very difficult. The FIQ code enabled by that symbol is not designed to be used on SMP systems, *period*.
If you decide to enable CONFIG_FIQ, and you use that code on a SMP platform, I'm going to say right now so it's totally clear: if you encounter a problem, I don't want to know about it. The code is not designed for use on that situation.
Therefore, as far as I'm concerned, the two facilities are mutually exclusive.
I had thought about whether the IPI FIQ should be disabled when a replacement FIQ handler is installed, but I deem it not to be a use case that the mainline kernel needs to be concerned about.
Yes, but if the FIQ handler is also used for IPI, set_fiq_handler gets IPI interrupts (with the patch starting this thread)? So I think that the patch needs to look like:

--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
+#ifndef CONFIG_FIQ
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#endif
No. With a single zImage kernel, you could very well have SMP and FIQ both enabled, but have a non-SMP platform using FIQ, but also support SMP platforms as well. Your change prevents that happening.
Hi Russell
On Monday, 1 December 2014, 10:38:32, Russell King - ARM Linux wrote:
On Mon, Dec 01, 2014 at 11:32:00AM +0100, Tim Sander wrote:
Hi Russell, Daniel

On Friday, 28 November 2014, 10:08:28, Russell King - ARM Linux wrote:
The two things are mutually exclusive. You can either have FIQ being used for debug purposes, where we decode the FIQ reason and call some function (which means that we will only service one FIQ at a time) or you can use it in exclusive mode (provided by fiq.c) where your handler has sole usage of the vector, and benefits from fast and immediate servicing of the event.
As far as I am aware, the CONFIG_FIQ symbol is not pulled in by all ARM platforms. Since there are ARM platforms which don't use this symbol although the hardware is fully capable of handling FIQ requests, I would expect that, after adding CONFIG_FIQ to a platform, this platform honors the set_fiq_handler functionality.
That whole paragraph doesn't make much sense to me.
Look, in my mind it is very simple. If you are using CONFIG_FIQ on a SMP platform, your life will be very difficult. The FIQ code enabled by that symbol is not designed to be used on SMP systems, *period*.
Well, the only extra thing you have to do is set up the FIQ registers on every CPU, but I would not call that very difficult. Other than that I am not aware of any problems that are not also present on a uniprocessor system. So I have a hard time following your reasoning why SMP is different from UP with regard to CONFIG_FIQ.
If you decide to enable CONFIG_FIQ, and you use that code on a SMP platform, I'm going to say right now so it's totally clear: if you encounter a problem, I don't want to know about it. The code is not designed for use on that situation.
Even when using the FIQ on a Linux SMP system you have not heard from me before, as I knew that this is not your problem (and that is not to say that there were none!). The only interface Linux has been making available is set_fiq_handler. So it was clear that the FIQ is its own domain, otherwise untouched by the kernel. Now the line gets blurred with the Linux kernel moving to use the FIQ. And with the decisions forthcoming it's not only grabbing land, it also claims a previously public path for its own. So it doesn't help that it's planting some flowers along the way. So please be nice to the natural inhabitants...

And I really don't get that neither ARM nor the kernel community sees fast interrupts as a worthwhile use case. Unfortunately, the interrupt latencies with Linux are at least an order of magnitude higher than what the pure hardware, even with longer pipelines, can deliver.
Therefore, as far as I'm concerned, the two facilities are mutually exclusive.

Well, you can't have your cake and eat it too.

I had thought about whether the IPI FIQ should be disabled when a replacement FIQ handler is installed, but I deem it not to be a use case that the mainline kernel needs to be concerned about.
That would be nice.
Yes, but if the FIQ handler is also used for IPI, set_fiq_handler gets IPI interrupts (with the patch starting this thread)? So I think that the patch needs to look like:

--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
+#ifndef CONFIG_FIQ
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#endif
No. With a single zImage kernel, you could very well have SMP and FIQ both enabled, but have a non-SMP platform using FIQ, but also support SMP platforms as well. Your change prevents that happening.
Ah, well I have to get used to this "new" devicetree thingy, where one size fits all...

Still, if you boot a single process system which has FIQ available and has a GIC with such a kernel, then you also reprogram the IPIs as FIQs. But I guess that's not a problem as Linux does not self-IPI the kernel as other OSes do?
Best regards Tim
On 01/12/14 13:54, Tim Sander wrote:
Look, in my mind it is very simple. If you are using CONFIG_FIQ on a SMP platform, your life will be very difficult. The FIQ code enabled by that symbol is not designed to be used on SMP systems, *period*.
Well, the only extra thing you have to do is set up the FIQ registers on every CPU, but I would not call that very difficult. Other than that I am not aware of any problems that are not also present on a uniprocessor system. So I have a hard time following your reasoning why SMP is different from UP with regard to CONFIG_FIQ.
If you decide to enable CONFIG_FIQ, and you use that code on a SMP platform, I'm going to say right now so it's totally clear: if you encounter a problem, I don't want to know about it. The code is not designed for use on that situation.
Even when using the FIQ on a Linux SMP system you have not heard from me before, as I knew that this is not your problem (and that is not to say that there were none!). The only interface Linux has been making available is set_fiq_handler. So it was clear that the FIQ is its own domain, otherwise untouched by the kernel. Now the line gets blurred with the Linux kernel moving to use the FIQ. And with the decisions forthcoming it's not only grabbing land, it also claims a previously public path for its own. So it doesn't help that it's planting some flowers along the way. So please be nice to the natural inhabitants...
Surely only upstream code could claim to be a natural inhabitant.
Whenever I've been working on code that, for whatever reason, cannot be upstreamed I'd probably best be regarded as a tourist.
And I really don't get that neither ARM nor the kernel community sees fast interrupts as a worthwhile use case. Unfortunately, the interrupt latencies with Linux are at least an order of magnitude higher than what the pure hardware, even with longer pipelines, can deliver.

Therefore, as far as I'm concerned, the two facilities are mutually exclusive.

Well, you can't have your cake and eat it too.

I had thought about whether the IPI FIQ should be disabled when a replacement FIQ handler is installed, but I deem it not to be a use case that the mainline kernel needs to be concerned about.
That would be nice.
Just to be clear, this is exactly the dynamic switching that I mentioned a couple of mails ago.
As I said, such code should not be especially hard to write but, with the current mainline kernel, the code would be unreachable and, as a result, likely also to be more or less untested.
Hi Daniel
On Monday, 1 December 2014, 14:13:52, Daniel Thompson wrote:
On 01/12/14 13:54, Tim Sander wrote:
Look, in my mind it is very simple. If you are using CONFIG_FIQ on a SMP platform, your life will be very difficult. The FIQ code enabled by that symbol is not designed to be used on SMP systems, *period*.
Well, the only extra thing you have to do is set up the FIQ registers on every CPU, but I would not call that very difficult. Other than that I am not aware of any problems that are not also present on a uniprocessor system. So I have a hard time following your reasoning why SMP is different from UP with regard to CONFIG_FIQ.
If you decide to enable CONFIG_FIQ, and you use that code on a SMP platform, I'm going to say right now so it's totally clear: if you encounter a problem, I don't want to know about it. The code is not designed for use on that situation.
Even when using the FIQ on a Linux SMP system you have not heard from me before, as I knew that this is not your problem (and that is not to say that there were none!). The only interface Linux has been making available is set_fiq_handler. So it was clear that the FIQ is its own domain, otherwise untouched by the kernel. Now the line gets blurred with the Linux kernel moving to use the FIQ. And with the decisions forthcoming it's not only grabbing land, it also claims a previously public path for its own. So it doesn't help that it's planting some flowers along the way. So please be nice to the natural inhabitants...
Surely only upstream code could claim to be a natural inhabitant.
Well, from a kernel developer perspective this might be true, but there are things, e.g. the stuff the nice guys at Free Electrons did, which are quite reasonable but would be laughed at if one tried to include them in the kernel: http://free-electrons.com/blog/fiq-handlers-in-the-arm-linux-kernel/ Still, this shows very much that you can build quite powerful systems which combine the power of Linux with the lowest latency the bare hardware can give you.
Whenever I've been working on code that, for whatever reason, cannot be upstreamed I'd probably best be regarded as a tourist.
I think there is application-specific code which needs all the power the hardware gives you in a given power envelope and is so optimized for a special use case that integration into the kernel makes no sense. So I would hope for a more constructive mindset.
And I really don't get that neither ARM nor the kernel community sees fast interrupts as a worthwhile use case. Unfortunately, the interrupt latencies with Linux are at least an order of magnitude higher than what the pure hardware, even with longer pipelines, can deliver.

Therefore, as far as I'm concerned, the two facilities are mutually exclusive.

Well, you can't have your cake and eat it too.

I had thought about whether the IPI FIQ should be disabled when a replacement FIQ handler is installed, but I deem it not to be a use case that the mainline kernel needs to be concerned about.
That would be nice.
Just to be clear, this is exactly the dynamic switching that I mentioned a couple of mails ago.
OK, my takeaway is that there is currently not enough interest from your side to implement it, but you would support some changes if submitted?
As I said, such code should not be especially hard to write but, with the current mainline kernel, the code would be unreachable and, as a result, likely also to be more or less untested.

Well, my misconception was that this might be done by adding some ifdefs, but as Russell pointed out, that is not the way to go.
Best regards Tim
On 03/12/14 13:41, Tim Sander wrote:
Even when using the FIQ on a Linux SMP system you have not heard from me before, as I knew that this is not your problem (and that is not to say that there were none!). The only interface Linux has been making available is set_fiq_handler. So it was clear that the FIQ is its own domain, otherwise untouched by the kernel. Now the line gets blurred with the Linux kernel moving to use the FIQ. And with the decisions forthcoming it's not only grabbing land, it also claims a previously public path for its own. So it doesn't help that it's planting some flowers along the way. So please be nice to the natural inhabitants...
Surely only upstream code could claim to be a natural inhabitant.
Well, from a kernel developer perspective this might be true, but there are things, e.g. the stuff the nice guys at Free Electrons did, which are quite reasonable but would be laughed at if one tried to include them in the kernel: http://free-electrons.com/blog/fiq-handlers-in-the-arm-linux-kernel/ Still, this shows very much that you can build quite powerful systems which combine the power of Linux with the lowest latency the bare hardware can give you.
Whenever I've been working on code that, for whatever reason, cannot be upstreamed I'd probably best be regarded as a tourist.
I think there is application-specific code which needs all the power the hardware gives you in a given power envelope and is so optimized for a special use case that integration into the kernel makes no sense. So I would hope for a more constructive mindset.
A bad choice of words on my part (although in truth it remains an accurate description of my own experience of working on code not destined to be upstreamed).
However I certainly want to be constructive.
And I really don't get that neither ARM nor the kernel community sees fast interrupts as a worthwhile use case. Unfortunately, the interrupt latencies with Linux are at least an order of magnitude higher than what the pure hardware, even with longer pipelines, can deliver.

Therefore, as far as I'm concerned, the two facilities are mutually exclusive.

Well, you can't have your cake and eat it too.

I had thought about whether the IPI FIQ should be disabled when a replacement FIQ handler is installed, but I deem it not to be a use case that the mainline kernel needs to be concerned about.
That would be nice.
Just to be clear, this is exactly the dynamic switching that I mentioned a couple of mails ago.
OK, my takeaway is that there is currently not enough interest from your side to implement it, but you would support some changes if submitted?
I'd take a good look at them (assuming I'm on Cc: or my mail filters pick them out). I may still have some concerns about testing it in the absence of an upstream user but otherwise I would expect to be supportive.
As I said, such code should not be especially hard to write but, with the current mainline kernel, the code would be unreachable and, as a result, likely also to be more or less untested.

Well, my misconception was that this might be done by adding some ifdefs, but as Russell pointed out, that is not the way to go.

Whether it's dynamic or not, a change that does not provide some benefit to the upstream kernel is always going to be much harder to sell to the people who have to maintain it, because they derive no benefit from maintaining it.
Daniel.
On Mon, Dec 01, 2014 at 02:54:10PM +0100, Tim Sander wrote:
Hi Russell
On Monday, 1 December 2014, 10:38:32, Russell King - ARM Linux wrote:
That whole paragraph doesn't make much sense to me.
Look, in my mind it is very simple. If you are using CONFIG_FIQ on a SMP platform, your life will be very difficult. The FIQ code enabled by that symbol is not designed to be used on SMP systems, *period*.
Well, the only extra thing you have to do is set up the FIQ registers on every CPU, but I would not call that very difficult. Other than that I am not aware of any problems that are not also present on a uniprocessor system. So I have a hard time following your reasoning why SMP is different from UP with regard to CONFIG_FIQ.
One of the things which FIQ handlers can do is they have their own private registers which they can modify on each invocation of the FIQ handler - for example, as a software DMA pointer.
Each CPU has its own private set of FIQ registers, so merely copying the registers to each CPU will only set their initial state: updates by one CPU to the register set will not be seen by a different CPU.
If you decide to enable CONFIG_FIQ, and you use that code on a SMP platform, I'm going to say right now so it's totally clear: if you encounter a problem, I don't want to know about it. The code is not designed for use on that situation.
Even when using the FIQ on a Linux SMP system you have not heard from me before, as I knew that this is not your problem (and that is not to say that there were none!). The only interface Linux has been making available is set_fiq_handler. So it was clear that the FIQ is its own domain, otherwise untouched by the kernel.

Correct, because FIQs have very little use in Linux. They have been used in the past to implement:
- software DMA to floppy disk controllers (see arch/arm/lib/floppydma.S)
- audio DMA (arch/arm/mach-imx/ssi-fiq.S)
- s2c24xx SPI DMA (drivers/spi/spi-s3c24xx-fiq.S)
- Keyboard (yes, how that qualifies for FIQ I don't know) (arch/arm/mach-omap1/ams-delta-fiq-handler.S)
The first three do exactly what I describe above, and none of these users are SMP platforms. Hence, the FIQ code which we currently have does exactly what we need it to for the platforms we have.
Now, you're talking about using this in a SMP context - that's a totally new use for this code which - as I have said several times now - is not really something that this code is intended to support.
And I really don't get that neither ARM nor the kernel community sees fast interrupts as a worthwhile use case. Unfortunately, the interrupt latencies with Linux are at least an order of magnitude higher than what the pure hardware, even with longer pipelines, can deliver.
First point: fast interrupts won't be fast if you load them up with all the interrupt demux and locking that normal interrupts have; if you start doing that, then you end up having to make /all/ IRQ-safe locks in the kernel not only disable normal interrupts, but also disable the FIQs as well.
At that point, FIQs are no longer "fast" - they will be subject to exactly the same latencies as normal interrupts.
Second point: we have embraced FIQs where it is appropriate to do so, but within the restrictions that FIQs present - that is, to keep them fast, we have to avoid the problem in the first point above, which means normal C code called from FIQs /can't/ take any kernel lock what so ever without potentially causing a deadlock.
Even if you think you can (why would UP have locks if it's not SMP) debugging facilities such as the lock validator will bite you if you try taking a lock in FIQ context which was already taken in the parent context.
Third point: FIQs are not available on a lot of ARM platforms. Hardware which routes interrupts to FIQs is very limited, normally it's only a few interrupts which appear there. Moreover, with more modern platforms where the kernel runs in the non-secure side, FIQs are /totally/ and /completely/ unavailable there - FIQs are only available for the secure monitor to use.
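To make the locking hazard in the second point concrete, here is a deliberately broken sketch (the lock and function are hypothetical, not from the thread) of what goes wrong when C code called from FIQ takes a kernel lock:

static DEFINE_SPINLOCK(demo_lock);	/* hypothetical driver lock */

void driver_irq_path(void)
{
	unsigned long flags;

	/* Masks normal IRQs on this CPU, but FIQs remain enabled... */
	spin_lock_irqsave(&demo_lock, flags);

	/*
	 * ...so a FIQ arriving here, whose handler also tries to take
	 * demo_lock, spins forever: a classic self-deadlock.
	 */

	spin_unlock_irqrestore(&demo_lock, flags);
}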
No. With a single zImage kernel, you could very well have SMP and FIQ both enabled, but have a non-SMP platform using FIQ, but also support SMP platforms as well. Your change prevents that happening.
Ah, well I have to get used to this "new" devicetree thingy, where one size fits all...
No, you're conflating different things there. It doesn't have much to do with DT vs non-DT, because this same problem existed before DT came along, since there were platforms which could be both UP and SMP.
Still, if you boot a single process system which has FIQ available and has a GIC with such a kernel, then you also reprogram the IPIs as FIQs. But I guess that's not a problem as Linux does not self-IPI the kernel as other OSes do?
I'm really sorry, but your above paragraph doesn't make much sense to me. "single process system" - if there's only one process, there's no point having a scheduler (it has nothing to schedule) and so I guess you're not talking about Linux there.
Or do you mean "single processor system" (in other words, uniprocessor or UP). In that case, the kernel doesn't use IPIs, because, by definition, there's no other processors for it to signal to.
Hi Russell, Thomas
I have some replies below, but I just post my most important question up here, which is my current takeaway from this discussion: Would patches be accepted which, as Daniel Thompson pointed out, dynamically switch the FIQ IPIs off when set_fiq_handler is called (given that the FIQ IPI patches are properly merged beforehand)?
On Monday, 1 December 2014, 15:02:40, Russell King - ARM Linux wrote:
On Mon, Dec 01, 2014 at 02:54:10PM +0100, Tim Sander wrote:
Hi Russell
On Monday, 1 December 2014, 10:38:32, Russell King - ARM Linux wrote:
That whole paragraph doesn't make much sense to me.
Look, in my mind it is very simple. If you are using CONFIG_FIQ on a SMP platform, your life will be very difficult. The FIQ code enabled by that symbol is not designed to be used on SMP systems, *period*.
Well, the only extra thing you have to do is set up the FIQ registers on every CPU, but I would not call that very difficult. Other than that I am not aware of any problems that are not also present on a uniprocessor system. So I have a hard time following your reasoning why SMP is different from UP with regard to CONFIG_FIQ.
One of the things which FIQ handlers can do is they have their own private registers which they can modify on each invocation of the FIQ handler - for example, as a software DMA pointer.
Each CPU has its own private set of FIQ registers, so merely copying the registers to each CPU will only set their initial state: updates by one CPU to the register set will not be seen by a different CPU.
If you decide to enable CONFIG_FIQ, and you use that code on a SMP platform, I'm going to say right now so it's totally clear: if you encounter a problem, I don't want to know about it. The code is not designed for use on that situation.
Even when using the FIQ on a Linux SMP system you have not heard from me before, as I knew that this is not your problem (and that is not to say that there were none!). The only interface Linux has been making available is set_fiq_handler. So it was clear that the FIQ is its own domain, otherwise untouched by the kernel.
Correct, because FIQs have very little use in Linux. They have been used in the past to implement:
- software DMA to floppy disk controllers (see arch/arm/lib/floppydma.S)
- audio DMA (arch/arm/mach-imx/ssi-fiq.S)
- s2c24xx SPI DMA (drivers/spi/spi-s3c24xx-fiq.S)
- Keyboard (yes, how that qualifies for FIQ I don't know) (arch/arm/mach-omap1/ams-delta-fiq-handler.S)
The first three do exactly what I describe above, and none of these users are SMP platforms. Hence, the FIQ code which we currently have does exactly what we need it to for the platforms we have.
Now, you're talking about using this in a SMP context - that's a totally new use for this code which - as I have said several times now - is not really something that this code is intended to support.
Yes, but as I said, the only additional problem is the separate registers for each core. Given the quirks of the current GIC version 1, this is really a minor problem: https://lkml.org/lkml/2014/7/15/550
And I really don't get that neither ARM nor the kernel community sees fast interrupts as a worthwhile use case. Unfortunately, the interrupt latencies with Linux are at least an order of magnitude higher than what the pure hardware, even with longer pipelines, can deliver.
First point: fast interrupts won't be fast if you load them up with all the interrupt demux and locking that normal interrupts have; if you start doing that, then you end up having to make /all/ IRQ-safe locks in the kernel not only disable normal interrupts, but also disable the FIQs as well.
I just want to have a CPU context where IRQs are not switched off by Linux. It would be nice to use Linux infrastructure like printk, but that's just not that important. And no, I don't want to use any IRQ demuxing. That's why it would be nice to disable the FIQ IPIs dynamically if other uses are set.
At that point, FIQs are no longer "fast" - they will be subject to exactly the same latencies as normal interrupts.
Well, the main difference I am after is to have one interrupt which is not masked in any way and which is as fast as the hardware can get (which on a Cortex-A9 is, depending on implementation, between 500ns and a couple of µs).
Second point: we have embraced FIQs where it is appropriate to do so, but within the restrictions that FIQs present - that is, to keep them fast, we have to avoid the problem in the first point above, which means normal C code called from FIQs /can't/ take any kernel lock what so ever without potentially causing a deadlock.
Yes, I am aware of that. I think that's one of the main reasons why the FIQ has been mainly unused by Linux.
Even if you think you can (why would UP have locks if it's not SMP) debugging facilities such as the lock validator will bite you if you try taking a lock in FIQ context which was already taken in the parent context.
Well, no Linux context in FIQ at all. That's why I was using a daisy-chained normal interrupt to hand off the normal stuff in Linux context.
Third point: FIQs are not available on a lot of ARM platforms. Hardware which routes interrupts to FIQs is very limited, normally it's only a few interrupts which appear there. Moreover, with more modern platforms where the kernel runs in the non-secure side, FIQs are /totally/ and /completely/ unavailable there - FIQs are only available for the secure monitor to use.
I am fully aware that ARM started to mix up FIQ and Secure Mode, confusing even some silicon vendors, some of which sadly have the FIQ missing. But aside from that, I know that i.MX6, Xilinx Zynq and Altera SoC all have a FIQ available. The only one I know of missing the FIQ is the Sitara, which had the FIQ documented in first revisions of the spec (but not anymore). So from my totally empirical, unscientific view, 3 of 4 CPUs have FIQ functionality.

Or do you mean that the platform is capable of delivering FIQ but has no CONFIG_FIQ set? In that case there is indeed only a small fraction which has this config option in use.
No. With a single zImage kernel, you could very well have SMP and FIQ both enabled, but have a non-SMP platform using FIQ, but also support SMP platforms as well. Your change prevents that happening.
Ah, well I have to get used to this "new" devicetree thingy, where one size fits all...
No, you're conflating different things there. It doesn't have much to do with DT vs non-DT, because this same problem existed before DT came along, since there were platforms which could be both UP and SMP.
Agreed.
Still, if you boot a single process system which has FIQ available and has a GIC with such a kernel, then you also reprogram the IPIs as FIQs. But I guess that's not a problem as Linux does not self-IPI the kernel as other OSes do?
I'm really sorry, but your above paragraph doesn't make much sense to me. "single process system" - if there's only one process, there's no point having a scheduler (it has nothing to schedule) and so I guess you're not talking about Linux there.
Or do you mean "single processor system" (in other words, uniprocessor or UP)? In that case, the kernel doesn't use IPIs because, by definition, there are no other processors for it to signal to.
I am sorry, chalk that up to being a non-native English speaker; I indeed meant a single processor system, as with a single process I would definitely not bother to run Linux at all.
Best regards, Tim
Hi Thomas, Hi Jason: Patches 1 to 4 are for you.
This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
On platforms not capable of supporting FIQ, we fall back to using IRQ to propagate the signal to generate a backtrace (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that are not (vexpress-a9 and Qualcomm Snapdragon 600).
v10:
* Add a further patch to optimize away some of the locking on systems where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with exynos_defconfig (which is the only defconfig to set this option).
* Whitespace fixes in patch 4. That patch previously used spaces for alignment of new constants but the rest of the file used tabs.
v9:
* Improved documentation and structure of initial patch (now initial two patches) to make gic_raise_softirq() safe to call from FIQ (Thomas Gleixner).
* Avoid masking interrupts during gic_raise_softirq(). The use of the read lock makes this redundant (because we can safely re-enter the function).
v8:
* Fixed build on arm64 caused by a spurious include file in irq-gic.c.
v7-2 (accidentally released twice with same number):
* Fixed boot regression on vexpress-a9 (reported by Russell King).
* Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
* Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
v7:
* Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested offending patch in isolation using defconfig identified by the autobuilder.
v6:
* Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
* Modified usr_entry to optionally avoid calling into the trace code and used this in the FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
* Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
* Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
v5:
* Rebased on 3.17-rc4.
* Removed a spurious line from the final "glue it together" patch that broke the build.
v4:
* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
* Really fix bad pt_regs pointer generation in __fiq_abt.
* Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King)
* Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
v3:
* Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
v2:
* Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre)
* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
* Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
* This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
* Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
* Add arm64 version of fiq.h (review of Russell King)
* Removed conditional branching and code from irq-gic.c; this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King)
Daniel Thompson (6):
  irqchip: gic: Finer grain locking for gic_raise_softirq
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  ARM: add basic support for on-demand backtrace of other CPUs
  arm: smp: Handle ipi_cpu_backtrace() using FIQ (if available)
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  64 +++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 drivers/irqchip/irq-gic.c       | 207 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 6 files changed, 280 insertions(+), 15 deletions(-)
-- 1.9.3
irq_controller_lock is used for multiple purposes within the gic driver. Primarily it is used to make register read-modify-write sequences atomic. It is also used by gic_raise_softirq() in order that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
The second usage of irq_controller_lock is difficult to discern when reviewing the code because the migration itself takes place outside the lock.
This patch makes the second usage more explicit by splitting it out into a separate lock and providing better comments.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com --- drivers/irqchip/irq-gic.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 38493ff28fa5..94d77118efa8 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -73,6 +73,12 @@ struct gic_chip_data { static DEFINE_RAW_SPINLOCK(irq_controller_lock);
/* + * This lock is used by the big.LITTLE migration code to ensure no IPIs + * can be pended on the old core after the map has been updated. + */ +static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock); + +/* * The GIC mapping of CPU interfaces does not necessarily match * the logical CPU numbering. Let's use a mapping as returned * by the GIC itself. @@ -624,7 +630,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) int cpu; unsigned long flags, map = 0;
- raw_spin_lock_irqsave(&irq_controller_lock, flags); + raw_spin_lock_irqsave(&cpu_map_migration_lock, flags);
/* Convert our logical CPU mask into a physical one. */ for_each_cpu(cpu, mask) @@ -639,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) /* this always happens on GIC0 */ writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
- raw_spin_unlock_irqrestore(&irq_controller_lock, flags); + raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags); } #endif
@@ -710,8 +716,17 @@ void gic_migrate_target(unsigned int new_cpu_id)
raw_spin_lock(&irq_controller_lock);
- /* Update the target interface for this logical CPU */ + /* + * Update the target interface for this logical CPU + * + * From the point we release the cpu_map_migration_lock any new + * SGIs will be pended on the new cpu which makes the set of SGIs + * pending on the old cpu static. That means we can defer the + * migration until after we have released the irq_controller_lock. + */ + raw_spin_lock(&cpu_map_migration_lock); gic_cpu_map[cpu] = 1 << new_cpu_id; + raw_spin_unlock(&cpu_map_migration_lock);
/* * Find all the peripheral interrupts targetting the current
Currently gic_raise_softirq() unconditionally takes and releases a lock whose only purpose is to synchronize with the b.L switcher.
Remove this lock if the b.L switcher is not compiled in.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com --- drivers/irqchip/irq-gic.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 94d77118efa8..e875da93f24a 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -76,8 +76,23 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock); * This lock is used by the big.LITTLE migration code to ensure no IPIs * can be pended on the old core after the map has been updated. */ +#ifdef CONFIG_BL_SWITCHER static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+static inline void bl_migration_lock(unsigned long *flags) +{ + raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags); +} + +static inline void bl_migration_unlock(unsigned long flags) +{ + raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags); +} +#else +static inline void bl_migration_lock(unsigned long *flags) {} +static inline void bl_migration_unlock(unsigned long flags) {} +#endif + /* * The GIC mapping of CPU interfaces does not necessarily match * the logical CPU numbering. Let's use a mapping as returned @@ -630,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) int cpu; unsigned long flags, map = 0;
- raw_spin_lock_irqsave(&cpu_map_migration_lock, flags); + bl_migration_lock(&flags);
/* Convert our logical CPU mask into a physical one. */ for_each_cpu(cpu, mask) @@ -645,7 +660,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) /* this always happens on GIC0 */ writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
- raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags); + bl_migration_unlock(flags); } #endif
It is currently possible for FIQ handlers to re-enter gic_raise_softirq() and lock up.
   gic_raise_softirq()
      lock(x);
-~-> FIQ
        handle_fiq()
           gic_raise_softirq()
              lock(x);        <-- Lockup
Calling printk() from a FIQ handler can trigger this problem because printk() raises an IPI when it needs to wake_up_klogd(). More generally, IPIs are the only means for FIQ handlers to safely defer work to a less restrictive calling context, so the function that raises them really needs to be FIQ-safe.
This patch fixes the problem by converting cpu_map_migration_lock into a rwlock, making it safe to re-enter the function.
Having made it safe to re-enter gic_raise_softirq(), we no longer need to mask interrupts during the call because the b.L migration is always performed from task context.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com --- drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++------------- 1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index e875da93f24a..5d72823bc5e9 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock); /* * This lock is used by the big.LITTLE migration code to ensure no IPIs * can be pended on the old core after the map has been updated. + * + * This lock may be locked for reading from both IRQ and FIQ handlers + * and therefore must not be locked for writing when these are enabled. */ #ifdef CONFIG_BL_SWITCHER -static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock); +static DEFINE_RWLOCK(cpu_map_migration_lock);
-static inline void bl_migration_lock(unsigned long *flags) +static inline void bl_migration_lock(void) { - raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags); + read_lock(&cpu_map_migration_lock); }
-static inline void bl_migration_unlock(unsigned long flags) +static inline void bl_migration_unlock(void) { - raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags); + read_unlock(&cpu_map_migration_lock); } #else -static inline void bl_migration_lock(unsigned long *flags) {} -static inline void bl_migration_unlock(unsigned long flags) {} +static inline void bl_migration_lock(void) {} +static inline void bl_migration_unlock(void) {} #endif
/* @@ -640,12 +643,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic) #endif
#ifdef CONFIG_SMP +/* + * Raise the specified IPI on all cpus set in mask. + * + * This function is safe to call from all calling contexts, including + * FIQ handlers. It relies on bl_migration_lock() being multiply acquirable + * to avoid deadlocks when the function is re-entered at different + * exception levels. + */ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; - unsigned long flags, map = 0; + unsigned long map = 0;
- bl_migration_lock(&flags); + bl_migration_lock();
/* Convert our logical CPU mask into a physical one. */ for_each_cpu(cpu, mask) @@ -660,7 +671,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) /* this always happens on GIC0 */ writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
- bl_migration_unlock(flags); + bl_migration_unlock(); } #endif
@@ -708,7 +719,8 @@ int gic_get_cpu_id(unsigned int cpu) * Migrate all peripheral interrupts with a target matching the current CPU * to the interface corresponding to @new_cpu_id. The CPU interface mapping * is also updated. Targets to other CPU interfaces are unchanged. - * This must be called with IRQs locally disabled. + * This must be called from a task context and with IRQ and FIQ locally + * disabled. */ void gic_migrate_target(unsigned int new_cpu_id) { @@ -739,9 +751,9 @@ void gic_migrate_target(unsigned int new_cpu_id) * pending on the old cpu static. That means we can defer the * migration until after we have released the irq_controller_lock. */ - raw_spin_lock(&cpu_map_migration_lock); + write_lock(&cpu_map_migration_lock); gic_cpu_map[cpu] = 1 << new_cpu_id; - raw_spin_unlock(&cpu_map_migration_lock); + write_unlock(&cpu_map_migration_lock);
/* * Find all the peripheral interrupts targetting the current
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware except GICv1-without-TrustZone provides a means to group exceptions into group 0 and group 1, but this hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups does not deploy and all IPIs will be raised via IRQ.
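[Editorial aside: the detection trick the commit message relies on can be sketched as a tiny hypothetical helper — the patch below open-codes the same test rather than using a function like this. Where grouping is unavailable the EnableGrp1 bit cannot be set from the kernel, so it reads back as zero after being written.]

    #include <linux/io.h>
    #include <linux/types.h>
    #include <linux/irqchip/arm-gic.h>

    /* Illustrative helper (not in the patch): try to set EnableGrp1
     * and read it back; it is RAZ/WI on hardware where grouping is
     * unavailable to the kernel, so the read-back tells us whether
     * the group 0/1 machinery actually deployed. */
    static bool gic_grouping_deployed(void __iomem *dist_base)
    {
    	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
    		       dist_base + GIC_DIST_CTRL);
    	return !!(readl_relaxed(dist_base + GIC_DIST_CTRL) &
    		  GICD_ENABLE_GRP1);
    }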
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com Tested-by: Jon Medhurst tixy@linaro.org --- arch/arm/kernel/traps.c | 5 +- drivers/irqchip/irq-gic.c | 155 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 +++ 3 files changed, 158 insertions(+), 10 deletions(-)
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 0c8b10801d36..4dc45b38e56e 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -26,6 +26,7 @@ #include <linux/init.h> #include <linux/sched.h> #include <linux/irq.h> +#include <linux/irqchip/arm-gic.h>
#include <linux/atomic.h> #include <asm/cacheflush.h> @@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
nmi_enter();
- /* nop. FIQ handlers for special arch/arm features can be added here. */ +#ifdef CONFIG_ARM_GIC + gic_handle_fiq_ipi(); +#endif
nmi_exit();
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 5d72823bc5e9..978e5e48d5c1 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -39,6 +39,7 @@ #include <linux/slab.h> #include <linux/irqchip/chained_irq.h> #include <linux/irqchip/arm-gic.h> +#include <linux/ratelimit.h>
#include <asm/cputype.h> #include <asm/irq.h> @@ -48,6 +49,10 @@ #include "irq-gic-common.h" #include "irqchip.h"
+#ifndef SMP_IPI_FIQ_MASK +#define SMP_IPI_FIQ_MASK 0 +#endif + union gic_base { void __iomem *common_base; void __percpu * __iomem *percpu_base; @@ -348,6 +353,93 @@ static struct irq_chip gic_chip = { .irq_set_wake = gic_set_wake, };
+/* + * Shift an interrupt between Group 0 and Group 1. + * + * In addition to changing the group we also modify the priority to + * match what "ARM strongly recommends" for a system where no Group 1 + * interrupt must ever preempt a Group 0 interrupt. + * + * If is safe to call this function on systems which do not support + * grouping (it will have no effect). + */ +static void gic_set_group_irq(void __iomem *base, unsigned int hwirq, + int group) +{ + unsigned int grp_reg = hwirq / 32 * 4; + u32 grp_mask = BIT(hwirq % 32); + u32 grp_val; + + unsigned int pri_reg = (hwirq / 4) * 4; + u32 pri_mask = BIT(7 + ((hwirq % 4) * 8)); + u32 pri_val; + + /* + * Systems which do not support grouping will have not have + * the EnableGrp1 bit set. + */ + if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))) + return; + + raw_spin_lock(&irq_controller_lock); + + grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg); + pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg); + + if (group) { + grp_val |= grp_mask; + pri_val |= pri_mask; + } else { + grp_val &= ~grp_mask; + pri_val &= ~pri_mask; + } + + writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg); + writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg); + + raw_spin_unlock(&irq_controller_lock); +} + +/* + * Test which group an interrupt belongs to. + * + * Returns 0 if the controller does not support grouping. + */ +static int gic_get_group_irq(void __iomem *base, unsigned int hwirq) +{ + unsigned int grp_reg = hwirq / 32 * 4; + u32 grp_val; + + grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg); + + return (grp_val >> (hwirq % 32)) & 1; +} + +/* + * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI, + * otherwise do nothing. + */ +void gic_handle_fiq_ipi(void) +{ + struct gic_chip_data *gic = &gic_data[0]; + void __iomem *cpu_base = gic_data_cpu_base(gic); + unsigned long irqstat, irqnr; + + if (WARN_ON(!in_nmi())) + return; + + while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) & + SMP_IPI_FIQ_MASK) { + irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK); + writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI); + + irqnr = irqstat & GICC_IAR_INT_ID_MASK; + WARN_RATELIMIT(irqnr > 16, + "Unexpected irqnr %lu (bad prioritization?)\n", + irqnr); + } +} + void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq) { if (gic_nr >= MAX_GIC_NR) @@ -379,15 +471,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic) static void gic_cpu_if_up(void) { void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]); - u32 bypass = 0; + void __iomem *dist_base = gic_data_dist_base(&gic_data[0]); + u32 ctrl = 0;
/* - * Preserve bypass disable bits to be written back later - */ - bypass = readl(cpu_base + GIC_CPU_CTRL); - bypass &= GICC_DIS_BYPASS_MASK; + * Preserve bypass disable bits to be written back later + */ + ctrl = readl(cpu_base + GIC_CPU_CTRL); + ctrl &= GICC_DIS_BYPASS_MASK;
- writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); + /* + * If EnableGrp1 is set in the distributor then enable group 1 + * support for this CPU (and route group 0 interrupts to FIQ). + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) + ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL | + GICC_ENABLE_GRP1; + + writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); }
@@ -411,7 +512,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
gic_dist_config(base, gic_irqs, NULL);
- writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL); + /* + * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only, + * bit 1 ignored) depending on current mode. + */ + writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL); + + /* + * Set all global interrupts to be group 1 if (and only if) it + * is possible to enable group 1 interrupts. This register is RAZ/WI + * if not accessible or not implemented, however some GICv1 devices + * do not implement the EnableGrp1 bit making it unsafe to set + * this register unconditionally. + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)) + for (i = 32; i < gic_irqs; i += 32) + writel_relaxed(0xffffffff, + base + GIC_DIST_IGROUP + i * 4 / 32); }
static void gic_cpu_init(struct gic_chip_data *gic) @@ -420,6 +537,7 @@ static void gic_cpu_init(struct gic_chip_data *gic) void __iomem *base = gic_data_cpu_base(gic); unsigned int cpu_mask, cpu = smp_processor_id(); int i; + unsigned long secure_irqs, secure_irq;
/* * Get what the GIC says our CPU mask is. @@ -438,6 +556,19 @@ static void gic_cpu_init(struct gic_chip_data *gic)
gic_cpu_config(dist_base, NULL);
+ /* + * If the distributor is configured to support interrupt grouping + * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK + * to be group1 and ensure any remaining group 0 interrupts have + * the right priority. + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) { + secure_irqs = SMP_IPI_FIQ_MASK; + writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0); + for_each_set_bit(secure_irq, &secure_irqs, 16) + gic_set_group_irq(dist_base, secure_irq, 0); + } + writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK); gic_cpu_if_up(); } @@ -527,7 +658,8 @@ static void gic_dist_restore(unsigned int gic_nr) writel_relaxed(gic_data[gic_nr].saved_spi_enable[i], dist_base + GIC_DIST_ENABLE_SET + i * 4);
- writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL); + writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, + dist_base + GIC_DIST_CTRL); }
static void gic_cpu_save(unsigned int gic_nr) @@ -655,6 +787,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; unsigned long map = 0; + unsigned long softint;
bl_migration_lock();
@@ -669,7 +802,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) dmb(ishst);
/* this always happens on GIC0 */ - writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT); + softint = map << 16 | irq; + if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq)) + softint |= 0x8000; + writel_relaxed(softint, + gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
bl_migration_unlock(); } diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h index 13eed92c7d24..e83d292d4dbc 100644 --- a/include/linux/irqchip/arm-gic.h +++ b/include/linux/irqchip/arm-gic.h @@ -22,6 +22,10 @@ #define GIC_CPU_IDENT 0xfc
#define GICC_ENABLE 0x1 +#define GICC_ENABLE_GRP1 0x2 +#define GICC_ACK_CTL 0x4 +#define GICC_FIQ_EN 0x8 +#define GICC_COMMON_BPR 0x10 #define GICC_INT_PRI_THRESHOLD 0xf0 #define GICC_IAR_INT_ID_MASK 0x3ff #define GICC_INT_SPURIOUS 1023 @@ -44,6 +48,7 @@ #define GIC_DIST_SGI_PENDING_SET 0xf20
#define GICD_ENABLE 0x1 +#define GICD_ENABLE_GRP1 0x2 #define GICD_DISABLE 0x0 #define GICD_INT_ACTLOW_LVLTRIG 0x0 #define GICD_INT_EN_CLR_X32 0xffffffff @@ -117,5 +122,8 @@ static inline void __init register_routable_domain_ops { gic_routable_irq_domain_ops = ops; } + +void gic_handle_fiq_ipi(void); + #endif /* __ASSEMBLY */ #endif
Daniel,
I've been a bit swamped this cycle and haven't kept as close an eye on this as I should have. :( fwiw, it's looking really good. I have one question below:
On Wed, Nov 26, 2014 at 04:23:28PM +0000, Daniel Thompson wrote:
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware except GICv1-without-TrustZone provides a means to group exceptions into group 0 and group 1, but this hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups does not deploy and all IPIs will be raised via IRQ.
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com Tested-by: Jon Medhurst tixy@linaro.org
arch/arm/kernel/traps.c | 5 +- drivers/irqchip/irq-gic.c | 155 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 +++ 3 files changed, 158 insertions(+), 10 deletions(-)
...
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 5d72823bc5e9..978e5e48d5c1 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c
...
+/*
+ * Test which group an interrupt belongs to.
+ *
+ * Returns 0 if the controller does not support grouping.
+ */
+static int gic_get_group_irq(void __iomem *base, unsigned int hwirq)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_val;
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+
+	return (grp_val >> (hwirq % 32)) & 1;
+}
...
@@ -669,7 +802,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	dmb(ishst);
 
 	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	softint = map << 16 | irq;
+	if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
+		softint |= 0x8000;
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	bl_migration_unlock();
 }
Is it worth the code complication to optimize this if the controller doesn't support grouping? Maybe set group_enabled at init so the above would become:
	softint = map << 16 | irq;
	if (group_enabled &&
	    gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
		softint |= 0x8000;
	writel_relaxed(...);
thx,
Jason.
On 26/11/14 17:42, Jason Cooper wrote:
Daniel,
I've been a bit swamped this cycle and haven't kept as close an eye on this as I should have. :( fwiw, it's looking really good.
I'll treat that as good news. Thanks.
I have one question below:
On Wed, Nov 26, 2014 at 04:23:28PM +0000, Daniel Thompson wrote:
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware except GICv1-without-TrustZone provides a means to group exceptions into group 0 and group 1, but this hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups does not deploy and all IPIs will be raised via IRQ.
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com Tested-by: Jon Medhurst tixy@linaro.org
arch/arm/kernel/traps.c | 5 +- drivers/irqchip/irq-gic.c | 155 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 +++ 3 files changed, 158 insertions(+), 10 deletions(-)
...
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 5d72823bc5e9..978e5e48d5c1 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c
...
+/*
+ * Test which group an interrupt belongs to.
+ *
+ * Returns 0 if the controller does not support grouping.
+ */
+static int gic_get_group_irq(void __iomem *base, unsigned int hwirq)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_val;
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+
+	return (grp_val >> (hwirq % 32)) & 1;
+}
...
@@ -669,7 +802,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	dmb(ishst);
 
 	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	softint = map << 16 | irq;
+	if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
+		softint |= 0x8000;
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	bl_migration_unlock();
 }
Is it worth the code complication to optimize this if the controller doesn't support grouping? Maybe set group_enabled at init so the above would become:
	softint = map << 16 | irq;
	if (group_enabled &&
	    gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
		softint |= 0x8000;
	writel_relaxed(...);
No objections.
However, given this code always calls gic_get_group_irq() with irq < 16, we might be able to do even better than this. The lower 16 bits of IGROUP[0] are constant after boot, so if we keep a shadow copy around instead of just a boolean then we can avoid the register read on all code paths.
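[Editorial aside: in concrete terms, the shadow-copy idea might look something like the following sketch. The standalone variable and helper here are hypothetical; Daniel's actual v11 diff, later in the thread, folds the shadow into struct gic_chip_data as igroup0_shadow.]

    #include <linux/bitops.h>
    #include <linux/io.h>
    #include <linux/types.h>
    #include <linux/irqchip/arm-gic.h>

    /* Mirror of the low bits of GIC_DIST_IGROUP[0]; fixed after boot,
     * so it would be written once during gic_cpu_init(). */
    static u32 igroup0_shadow;

    static void raise_sgi(void __iomem *dist_base, unsigned long map,
    		      unsigned int irq)
    {
    	unsigned long softint = map << 16 | irq;

    	/* Group 1 SGI: set the NSATT bit without an MMIO read. */
    	if (igroup0_shadow & BIT(irq))
    		softint |= 0x8000;
    	writel_relaxed(softint, dist_base + GIC_DIST_SOFTINT);
    }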
Daniel.
Daniel,
On Thu, Nov 27, 2014 at 01:39:01PM +0000, Daniel Thompson wrote:
On 26/11/14 17:42, Jason Cooper wrote:
On Wed, Nov 26, 2014 at 04:23:28PM +0000, Daniel Thompson wrote:
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware except GICv1-without-TrustZone provides a means to group exceptions into group 0 and group 1, but this hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups does not deploy and all IPIs will be raised via IRQ.
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com Tested-by: Jon Medhurst tixy@linaro.org
arch/arm/kernel/traps.c | 5 +- drivers/irqchip/irq-gic.c | 155 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 +++ 3 files changed, 158 insertions(+), 10 deletions(-)
...
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 5d72823bc5e9..978e5e48d5c1 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c
...
+/*
+ * Test which group an interrupt belongs to.
+ *
+ * Returns 0 if the controller does not support grouping.
+ */
+static int gic_get_group_irq(void __iomem *base, unsigned int hwirq)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_val;
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+
+	return (grp_val >> (hwirq % 32)) & 1;
+}
...
@@ -669,7 +802,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	dmb(ishst);
 
 	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	softint = map << 16 | irq;
+	if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
+		softint |= 0x8000;
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	bl_migration_unlock();
 }
Is it worth the code complication to optimize this if the controller doesn't support grouping? Maybe set group_enabled at init so the above would become:
	softint = map << 16 | irq;
	if (group_enabled &&
	    gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
		softint |= 0x8000;
	writel_relaxed(...);
No objections.
However, given this code always calls gic_get_group_irq() with irq < 16, we might be able to do even better than this. The lower 16 bits of IGROUP[0] are constant after boot, so if we keep a shadow copy around instead of just a boolean then we can avoid the register read on all code paths.
Hmm, I'd look at that as a performance enhancement. I'm more concerned about performance regressions for current users of the gic (non-group enabled).
Let's go ahead and do the change (well, a working facsimile) I suggested above, and we can do a follow on patch to increase performance for the group enabled use case.
If there are no objections, I'd like to try to get this in for v3.19, but it's really late. So we'll see how it goes.
thx,
Jason.
On 27/11/14 18:06, Jason Cooper wrote:
Daniel,
On Thu, Nov 27, 2014 at 01:39:01PM +0000, Daniel Thompson wrote:
On 26/11/14 17:42, Jason Cooper wrote:
On Wed, Nov 26, 2014 at 04:23:28PM +0000, Daniel Thompson wrote:
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware except GICv1-without-TrustZone provides a means to group exceptions into group 0 and group 1, but this hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups does not deploy and all IPIs will be raised via IRQ.
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com Tested-by: Jon Medhurst tixy@linaro.org
arch/arm/kernel/traps.c | 5 +- drivers/irqchip/irq-gic.c | 155 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 +++ 3 files changed, 158 insertions(+), 10 deletions(-)
...
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 5d72823bc5e9..978e5e48d5c1 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c
...
+/*
+ * Test which group an interrupt belongs to.
+ *
+ * Returns 0 if the controller does not support grouping.
+ */
+static int gic_get_group_irq(void __iomem *base, unsigned int hwirq)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_val;
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+
+	return (grp_val >> (hwirq % 32)) & 1;
+}
...
@@ -669,7 +802,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	dmb(ishst);
 
 	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	softint = map << 16 | irq;
+	if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
+		softint |= 0x8000;
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
 
 	bl_migration_unlock();
 }
Is it worth the code complication to optimize this if the controller doesn't support grouping? Maybe set group_enabled at init so the above would become:
	softint = map << 16 | irq;
	if (group_enabled &&
	    gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
		softint |= 0x8000;
	writel_relaxed(...);
No objections.
However, given this code always calls gic_get_group_irq() with irq < 16, we might be able to do even better than this. The lower 16 bits of IGROUP[0] are constant after boot, so if we keep a shadow copy around instead of just a boolean then we can avoid the register read on all code paths.
Hmm, I'd look at that as a performance enhancement. I'm more concerned about performance regressions for current users of the gic (non-group enabled).
"Current users of the gic" doesn't imply "non-group enabled". Whether or not grouping is enabled is a property of the hardware or (secure) bootloader.
If we are seriously worried about a performance regression here we actually have to care about both cases.
Let's go ahead and do the change (well, a working facsimile) I suggested above, and we can do a follow on patch to increase performance for the group enabled use case.
Hmnnn...
I have a new patch ready to go that shadows IGROUP[0]. It looks OK to me and I think it is actually fewer lines of code than v10 because we can remove gic_get_group_irq() completely.
The code in question ends up looking like:
	softint = map << 16 | irq;
	if (gic->igroup0_shadow & BIT(irq))
		softint |= 0x8000;
	writel_relaxed(...);
This should end up with the same (data) cache profile as your proposal in the non-group case and should normally be a win for the grouped case. I even added an informative comment to make clear that the shadowing is purely an optimization and nothing to do with working around stupid hardware ;-).
I hope you don't mind but I'm about to share a patchset based on the above so you can see it in full and decide if you like it. I don't object to adding an extra boolean (and will do that if you don't like the above) but I think this code is better.
If there are no objections, I'd like to try to get this in for v3.19, but it's really late. So we'll see how it goes.
I like that too. I also agree it's pretty late, and that's one of the reasons why I'm turning round new patchsets for each bit of feedback.
Daniel.
On 27/11/14 19:42, Daniel Thompson wrote:
Hmm, I'd look at that as a performance enhancement. I'm more concerned about performance regressions for current users of the gic (non-group enabled).
"Current users of the gic" doesn't imply "non-group enabled". Whether or not grouping is enabled is a property of the hardware or (secure) bootloader.
If we are seriously worried about a performance regression here we actually have to care about both cases.
Let's go ahead and do the change (well, a working facsimile) I suggested above, and we can do a follow on patch to increase performance for the group enabled use case.
Hmnnn...
I have a new patch ready to go that shadows IGROUP[0]. It looks OK to me and I think it is actually fewer lines of code than v10 because we can remove gic_get_group_irq() completely.
Finally from me. If you are worried about large "last minute changes" involved in v11, here is the v10 -> v11 diff.
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 978e5e48d5c1..5c36aefa67ea 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -70,6 +70,7 @@ struct gic_chip_data { #endif struct irq_domain *domain; unsigned int gic_irqs; + u32 igroup0_shadow; #ifdef CONFIG_GIC_NON_BANKED void __iomem *(*get_base)(union gic_base *); #endif @@ -363,9 +364,10 @@ static struct irq_chip gic_chip = { * If is safe to call this function on systems which do not support * grouping (it will have no effect). */ -static void gic_set_group_irq(void __iomem *base, unsigned int hwirq, - int group) +static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq, + int group) { + void __iomem *base = gic_data_dist_base(gic); unsigned int grp_reg = hwirq / 32 * 4; u32 grp_mask = BIT(hwirq % 32); u32 grp_val; @@ -395,25 +397,14 @@ static void gic_set_group_irq(void __iomem *base, unsigned int hwirq, }
writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg); + if (grp_reg == 0) + gic->igroup0_shadow = grp_val; + writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
raw_spin_unlock(&irq_controller_lock); }
-/* - * Test which group an interrupt belongs to. - * - * Returns 0 if the controller does not support grouping. - */ -static int gic_get_group_irq(void __iomem *base, unsigned int hwirq) -{ - unsigned int grp_reg = hwirq / 32 * 4; - u32 grp_val; - - grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg); - - return (grp_val >> (hwirq % 32)) & 1; -}
/* * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI, @@ -565,8 +556,9 @@ static void gic_cpu_init(struct gic_chip_data *gic) if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) { secure_irqs = SMP_IPI_FIQ_MASK; writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0); + gic->igroup0_shadow = ~secure_irqs; for_each_set_bit(secure_irq, &secure_irqs, 16) - gic_set_group_irq(dist_base, secure_irq, 0); + gic_set_group_irq(gic, secure_irq, 0); }
writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK); @@ -801,10 +793,12 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) */ dmb(ishst);
- /* this always happens on GIC0 */ + /* We avoid a readl here by using the shadow copy of IGROUP[0] */ softint = map << 16 | irq; - if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq)) + if (gic_data[0].igroup0_shadow & BIT(irq)) softint |= 0x8000; + + /* This always happens on GIC0 */ writel_relaxed(softint, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
Add basic infrastructure for triggering a backtrace of other CPUs via an IPI, preferably at FIQ level. It is intended that this shall be used for cases where we have detected that something has already failed in the kernel.
Signed-off-by: Russell King rmk+kernel@arm.linux.org.uk Signed-off-by: Daniel Thompson daniel.thompson@linaro.org --- arch/arm/include/asm/irq.h | 5 ++++ arch/arm/kernel/smp.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 67 insertions(+)
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h index 53c15dec7af6..be1d07d59ee9 100644 --- a/arch/arm/include/asm/irq.h +++ b/arch/arm/include/asm/irq.h @@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *); extern void set_handle_irq(void (*handle_irq)(struct pt_regs *)); #endif
+#ifdef CONFIG_SMP +extern void arch_trigger_all_cpu_backtrace(bool); +#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x) +#endif + #endif
#endif diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 13396d3d600e..14c594a12bef 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -72,8 +72,12 @@ enum ipi_msg_type { IPI_CPU_STOP, IPI_IRQ_WORK, IPI_COMPLETION, + IPI_CPU_BACKTRACE, };
+/* For reliability, we're prepared to waste bits here. */ +static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly; + static DECLARE_COMPLETION(cpu_running);
static struct smp_operations smp_ops; @@ -535,6 +539,21 @@ static void ipi_cpu_stop(unsigned int cpu) cpu_relax(); }
+static void ipi_cpu_backtrace(struct pt_regs *regs) +{ + int cpu = smp_processor_id(); + + if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) { + static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED; + + arch_spin_lock(&lock); + printk(KERN_WARNING "FIQ backtrace for cpu %d\n", cpu); + show_regs(regs); + arch_spin_unlock(&lock); + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); + } +} + static DEFINE_PER_CPU(struct completion *, cpu_completion);
int register_ipi_completion(struct completion *completion, int cpu) @@ -614,6 +633,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs) irq_exit(); break;
+ case IPI_CPU_BACKTRACE: + irq_enter(); + ipi_cpu_backtrace(regs); + irq_exit(); + break; + default: printk(KERN_CRIT "CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr); @@ -708,3 +733,40 @@ static int __init register_cpufreq_notifier(void) core_initcall(register_cpufreq_notifier);
#endif + +void arch_trigger_all_cpu_backtrace(bool include_self) +{ + static unsigned long backtrace_flag; + int i, cpu = get_cpu(); + + if (test_and_set_bit(0, &backtrace_flag)) { + /* + * If there is already a trigger_all_cpu_backtrace() in progress + * (backtrace_flag == 1), don't output double cpu dump infos. + */ + put_cpu(); + return; + } + + cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask); + if (!include_self) + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); + + if (!cpumask_empty(to_cpumask(backtrace_mask))) { + pr_info("Sending FIQ to %s CPUs:\n", + (include_self ? "all" : "other")); + smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE); + } + + /* Wait for up to 10 seconds for all CPUs to do the backtrace */ + for (i = 0; i < 10 * 1000; i++) { + if (cpumask_empty(to_cpumask(backtrace_mask))) + break; + + mdelay(1); + } + + clear_bit(0, &backtrace_flag); + smp_mb__after_atomic(); + put_cpu(); +}
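[Editorial aside: for context, generic kernel code of this era reaches this implementation through the trigger_all_cpu_backtrace() wrapper in <linux/nmi.h>, which expands to the arch hook only when the architecture defines arch_trigger_all_cpu_backtrace (as the patch above now does for ARM) and returns false otherwise. A rough usage sketch with a hypothetical caller:]

    #include <linux/kernel.h>
    #include <linux/nmi.h>

    /* Hypothetical caller: something has wedged and we want every
     * CPU's registers and backtrace in the log. */
    static void report_wedged_system(void)
    {
    	pr_crit("system wedged, requesting backtraces\n");
    	if (!trigger_all_cpu_backtrace())
    		dump_stack();	/* no arch support: current CPU only */
    }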
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM, but these are currently independent of each other.
This patch plumbs the two features together, making it possible, on platforms that support it, to trigger a backtrace using FIQ.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org --- arch/arm/include/asm/smp.h | 3 +++ arch/arm/kernel/smp.c | 4 +++- arch/arm/kernel/traps.c | 3 +++ 3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h index 18f5a554134f..b076584ac0fa 100644 --- a/arch/arm/include/asm/smp.h +++ b/arch/arm/include/asm/smp.h @@ -18,6 +18,8 @@ # error "<asm/smp.h> included in non-SMP build" #endif
+#define SMP_IPI_FIQ_MASK 0x0100 + #define raw_smp_processor_id() (current_thread_info()->cpu)
struct seq_file; @@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu); extern void arch_send_call_function_ipi_mask(const struct cpumask *mask); extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
+extern void ipi_cpu_backtrace(struct pt_regs *regs); extern int register_ipi_completion(struct completion *completion, int cpu);
struct smp_operations { diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 14c594a12bef..e923843562d9 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -539,7 +539,7 @@ static void ipi_cpu_stop(unsigned int cpu) cpu_relax(); }
-static void ipi_cpu_backtrace(struct pt_regs *regs) +void ipi_cpu_backtrace(struct pt_regs *regs) { int cpu = smp_processor_id();
@@ -580,6 +580,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs) unsigned int cpu = smp_processor_id(); struct pt_regs *old_regs = set_irq_regs(regs);
+ BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE)); + if ((unsigned)ipinr < NR_IPI) { trace_ipi_entry(ipi_types[ipinr]); __inc_irq_stat(cpu, ipi_irqs[ipinr]); diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 4dc45b38e56e..9eb05be9526e 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs) #ifdef CONFIG_ARM_GIC gic_handle_fiq_ipi(); #endif +#ifdef CONFIG_SMP + ipi_cpu_backtrace(regs); +#endif
nmi_exit();
Hi Thomas, Hi Jason: Patches 1 to 4 are for you.
This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
On platforms not capable of supporting FIQ, we fall back to using IRQ to propagate the signal to generate a backtrace (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that are not (vexpress-a9 and Qualcomm Snapdragon 600).
v11:
* Optimized gic_raise_softirq() by replacing a register read with a memory read (Jason Cooper).
v10:
* Add a further patch to optimize away some of the locking on systems where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with exynos_defconfig (which is the only defconfig to set this option).
* Whitespace fixes in patch 4. That patch previously used spaces for alignment of new constants but the rest of the file used tabs.
v9:
* Improved documentation and structure of initial patch (now initial two patches) to make gic_raise_softirq() safe to call from FIQ (Thomas Gleixner).
* Avoid masking interrupts during gic_raise_softirq(). The use of the read lock makes this redundant (because we can safely re-enter the function).
v8:
* Fixed build on arm64 caused by a spurious include file in irq-gic.c.
v7-2 (accidentally released twice with same number):
* Fixed boot regression on vexpress-a9 (reported by Russell King).
* Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
* Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
v7:
* Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested offending patch in isolation using defconfig identified by the autobuilder.
v6:
* Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
* Modified usr_entry to optionally avoid calling into the trace code and used this in the FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
* Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
* Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
v5:
* Rebased on 3.17-rc4.
* Removed a spurious line from the final "glue it together" patch that broke the build.
v4:
* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
* Really fix bad pt_regs pointer generation in __fiq_abt.
* Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King)
* Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
v3:
* Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
v2:
* Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre)
* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
* Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
* This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
* Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
* Add arm64 version of fiq.h (review of Russell King)
* Removed conditional branching and code from irq-gic.c; this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King)
Daniel Thompson (6):
  irqchip: gic: Finer grain locking for gic_raise_softirq
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  ARM: add basic support for on-demand backtrace of other CPUs
  arm: smp: Handle ipi_cpu_backtrace() using FIQ (if available)
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  64 +++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 6 files changed, 275 insertions(+), 16 deletions(-)
-- 1.9.3
irq_controller_lock is used for multiple purposes within the gic driver. Primarily it is used to make register read-modify-write sequences atomic. It is also used by gic_raise_softirq() in order that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
The second usage of irq_controller_lock is difficult to discern when reviewing the code because the migration itself takes place outside the lock.
This patch makes the second usage more explicit by splitting it out into a separate lock and providing better comments.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com --- drivers/irqchip/irq-gic.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 38493ff28fa5..94d77118efa8 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,12 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);

 /*
+ * This lock is used by the big.LITTLE migration code to ensure no IPIs
+ * can be pended on the old core after the map has been updated.
+ */
+static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +630,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;

-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	raw_spin_lock_irqsave(&cpu_map_migration_lock, flags);

 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -639,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
 }
 #endif

@@ -710,8 +716,17 @@ void gic_migrate_target(unsigned int new_cpu_id)

 	raw_spin_lock(&irq_controller_lock);

-	/* Update the target interface for this logical CPU */
+	/*
+	 * Update the target interface for this logical CPU
+	 *
+	 * From the point we release the cpu_map_migration_lock any new
+	 * SGIs will be pended on the new cpu which makes the set of SGIs
+	 * pending on the old cpu static. That means we can defer the
+	 * migration until after we have released the irq_controller_lock.
+	 */
+	raw_spin_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
+	raw_spin_unlock(&cpu_map_migration_lock);

 	/*
 	 * Find all the peripheral interrupts targetting the current
Currently gic_raise_softirq() unconditionally takes and releases a lock whose only purpose is to synchronize with the b.L switcher.
Remove this lock if the b.L switcher is not compiled in.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Jason Cooper jason@lakedaemon.net
Cc: Russell King linux@arm.linux.org.uk
Cc: Marc Zyngier marc.zyngier@arm.com
---
 drivers/irqchip/irq-gic.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 94d77118efa8..e875da93f24a 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -76,8 +76,23 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
  * This lock is used by the big.LITTLE migration code to ensure no IPIs
  * can be pended on the old core after the map has been updated.
  */
+#ifdef CONFIG_BL_SWITCHER
 static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);

+static inline void bl_migration_lock(unsigned long *flags)
+{
+	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+}
+
+static inline void bl_migration_unlock(unsigned long flags)
+{
+	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+}
+#else
+static inline void bl_migration_lock(unsigned long *flags) {}
+static inline void bl_migration_unlock(unsigned long flags) {}
+#endif
+
 /*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
@@ -630,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;

-	raw_spin_lock_irqsave(&cpu_map_migration_lock, flags);
+	bl_migration_lock(&flags);

 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -645,7 +660,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

-	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+	bl_migration_unlock(flags);
 }
 #endif
On Thu, 27 Nov 2014, Daniel Thompson wrote:
Currently gic_raise_softirq() unconditionally takes and releases a lock whose only purpose is to synchronize with the b.L switcher.
Remove this lock if the b.L switcher is not compiled in.
I think the patches are in the wrong order. We optimize for the sane use case first, i.e. BL=n. So you want to make the locking of irq_controller_lock in gic_raise_softirq() conditional in the first place, which should have been done when this was introduced.
Once you have isolated that you can apply your split lock patch for the BL=y nonsense.
Adding more locks first and then optimizing them out does not make any sense.
Thanks,
tglx
On 27/11/14 21:37, Thomas Gleixner wrote:
On Thu, 27 Nov 2014, Daniel Thompson wrote:
Currently gic_raise_softirq() unconditionally takes and releases a lock whose only purpose is to synchronize with the b.L switcher.
Remove this lock if the b.L switcher is not compiled in.
I think the patches are in the wrong order. We optimize for the sane use case first, i.e. BL=n. So you want to make the locking of irq_controller_lock in gic_raise_softirq() conditional in the first place, which should have been done when this was introduced.
Once you have isolated that you can apply your split lock patch for the BL=y nonsense.
Adding more locks first and then optimizing them out does not make any sense.
You originally described the current dual-purpose use of irq_controller_lock as an abuse of the lock. Does it really make more sense to optimize before we correct the abuse?
How about just squashing them together? It reduces the combined diffstat by ~10%...
Daniel.
It is currently possible for FIQ handlers to re-enter gic_raise_softirq() and lock up.
  gic_raise_softirq()
     lock(x);
-~-> FIQ
        handle_fiq()
           gic_raise_softirq()
              lock(x);          <-- Lockup
Calling printk() from a FIQ handler can trigger this problem because printk() raises an IPI when it needs to wake_up_klogd(). More generally, IPIs are the only means for FIQ handlers to safely defer work to less restrictive calling contexts, so the function to raise them really needs to be FIQ-safe.
This patch fixes the problem by converting the cpu_map_migration_lock into a rwlock, making it safe to re-enter the function.
Having made it safe to re-enter gic_raise_softirq() we no longer need to mask interrupts during gic_raise_softirq() because the b.L migration is always performed from task context.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Jason Cooper jason@lakedaemon.net
Cc: Russell King linux@arm.linux.org.uk
Cc: Marc Zyngier marc.zyngier@arm.com
---
 drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index e875da93f24a..5d72823bc5e9 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock); /* * This lock is used by the big.LITTLE migration code to ensure no IPIs * can be pended on the old core after the map has been updated. + * + * This lock may be locked for reading from both IRQ and FIQ handlers + * and therefore must not be locked for writing when these are enabled. */ #ifdef CONFIG_BL_SWITCHER -static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock); +static DEFINE_RWLOCK(cpu_map_migration_lock);
-static inline void bl_migration_lock(unsigned long *flags) +static inline void bl_migration_lock(void) { - raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags); + read_lock(&cpu_map_migration_lock); }
-static inline void bl_migration_unlock(unsigned long flags) +static inline void bl_migration_unlock(void) { - raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags); + read_unlock(&cpu_map_migration_lock); } #else -static inline void bl_migration_lock(unsigned long *flags) {} -static inline void bl_migration_unlock(unsigned long flags) {} +static inline void bl_migration_lock(void) {} +static inline void bl_migration_unlock(void) {} #endif
/* @@ -640,12 +643,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic) #endif
#ifdef CONFIG_SMP +/* + * Raise the specified IPI on all cpus set in mask. + * + * This function is safe to call from all calling contexts, including + * FIQ handlers. It relies on bl_migration_lock() being multiply acquirable + * to avoid deadlocks when the function is re-entered at different + * exception levels. + */ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; - unsigned long flags, map = 0; + unsigned long map = 0;
- bl_migration_lock(&flags); + bl_migration_lock();
/* Convert our logical CPU mask into a physical one. */ for_each_cpu(cpu, mask) @@ -660,7 +671,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) /* this always happens on GIC0 */ writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
- bl_migration_unlock(flags); + bl_migration_unlock(); } #endif
@@ -708,7 +719,8 @@ int gic_get_cpu_id(unsigned int cpu) * Migrate all peripheral interrupts with a target matching the current CPU * to the interface corresponding to @new_cpu_id. The CPU interface mapping * is also updated. Targets to other CPU interfaces are unchanged. - * This must be called with IRQs locally disabled. + * This must be called from a task context and with IRQ and FIQ locally + * disabled. */ void gic_migrate_target(unsigned int new_cpu_id) { @@ -739,9 +751,9 @@ void gic_migrate_target(unsigned int new_cpu_id) * pending on the old cpu static. That means we can defer the * migration until after we have released the irq_controller_lock. */ - raw_spin_lock(&cpu_map_migration_lock); + write_lock(&cpu_map_migration_lock); gic_cpu_map[cpu] = 1 << new_cpu_id; - raw_spin_unlock(&cpu_map_migration_lock); + write_unlock(&cpu_map_migration_lock);
/* * Find all the peripheral interrupts targetting the current
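A minimal sketch of why the rwlock conversion above is sufficient (illustrative names, not code from the patch): rwlock_t read sides may nest, so a FIQ that interrupts a reader and re-enters as another reader cannot deadlock, while the write side only ever runs from task context with IRQ and FIQ masked:

#include <linux/spinlock.h>

static DEFINE_RWLOCK(map_lock);	/* stands in for cpu_map_migration_lock */

static void raise_ipi_sketch(void)
{
	/* Read sides may nest, so re-entry from a FIQ taken here is harmless. */
	read_lock(&map_lock);
	/* ... pend the SGI via GIC_DIST_SOFTINT ... */
	read_unlock(&map_lock);
}

static void migrate_sketch(void)
{
	/*
	 * Task context with IRQ and FIQ masked: no local reader can fire
	 * while the write lock is held, so the holder never waits on itself.
	 */
	write_lock(&map_lock);
	/* ... update the CPU interface map ... */
	write_unlock(&map_lock);
}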
On Thu, 27 Nov 2014, Daniel Thompson wrote:
It is currently possible for FIQ handlers to re-enter gic_raise_softirq() and lock up.
  gic_raise_softirq()
     lock(x);
-~-> FIQ
        handle_fiq()
           gic_raise_softirq()
              lock(x);          <-- Lockup
Calling printk() from a FIQ handler can trigger this problem because printk() raises an IPI when it needs to wake_up_klogd(). More generally, IPIs are the only means for FIQ handlers to safely defer work to less restrictive calling contexts, so the function to raise them really needs to be FIQ-safe.
That's not really true. irq_work can be used from FIQ/NMI context and it was specifically designed for that purpose.
Now printk is a different issue, but there is work in progress to make printk safe from FIQ/NMI context as well. This is not an ARM specific issue. Any architecture which has NMI like facilities has the problem of doing printk from that context. Steven is working on a mitigation for that. https://lkml.org/lkml/2014/11/18/1146
Thanks,
tglx
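For reference, the irq_work pattern Thomas describes above looks roughly like the sketch below (illustrative only; note that on ARM SMP arch_irq_work_raise() is itself implemented with an IPI, so this only becomes FIQ-safe once gic_raise_softirq() is FIQ-safe, which is what the patch under discussion provides):

#include <linux/irq_work.h>
#include <linux/printk.h>

static void deferred_report(struct irq_work *work)
{
	/* Runs later, in IRQ context, where printk() is permitted. */
	pr_info("deferred from NMI/FIQ context\n");
}

static struct irq_work report_work;

static void setup_sketch(void)
{
	init_irq_work(&report_work, deferred_report);
}

/* Callable from NMI/FIQ: merely queues the work and raises an IPI. */
static void fiq_handler_sketch(void)
{
	irq_work_queue(&report_work);
}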
On 27/11/14 21:45, Thomas Gleixner wrote:
On Thu, 27 Nov 2014, Daniel Thompson wrote:
It is currently possible for FIQ handlers to re-enter gic_raise_softirq() and lock up.
  gic_raise_softirq()
     lock(x);
-~-> FIQ
        handle_fiq()
           gic_raise_softirq()
              lock(x);          <-- Lockup
Calling printk() from a FIQ handler can trigger this problem because printk() raises an IPI when it needs to wake_up_klogd(). More generally, IPIs are the only means for FIQ handlers to safely defer work to less restrictive calling contexts, so the function to raise them really needs to be FIQ-safe.
That's not really true. irq_work can be used from FIQ/NMI context and it was specifically designed for that purpose.
Actually we cannot currently issue irq_work from FIQ context; that's exactly what this patch fixes and is why wake_up_klogd() currently locks up. ARM implements arch_irq_work_raise() using IPIs (at least on SMP).
I'll fix the wording to make this more explicit.
Now printk is a different issue, but there is work in progress to make printk safe from FIQ/NMI context as well. This is not an ARM specific issue. Any architecture which has NMI like facilities has the problem of doing printk from that context. Steven is working on a mitigation for that. https://lkml.org/lkml/2014/11/18/1146
Thanks. I'll watch that with interest.
In that case I'll drop the printk() rationale entirely. The rationale above should be enough motivation because the only other thing likely to printk() from interrupt is the hard lockup detector and, because that uses perf events, it requires irq_work to be free from lockups way before it ever thinks about calling printk().
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware, except GICv1 implementations without TrustZone support, provides a means to group exceptions into group 0 and group 1, but this hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups is not deployed and all IPIs will be raised via IRQ.
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Jason Cooper jason@lakedaemon.net
Cc: Russell King linux@arm.linux.org.uk
Cc: Marc Zyngier marc.zyngier@arm.com
Tested-by: Jon Medhurst tixy@linaro.org
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 151 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 153 insertions(+), 11 deletions(-)
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 0c8b10801d36..4dc45b38e56e 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -26,6 +26,7 @@ #include <linux/init.h> #include <linux/sched.h> #include <linux/irq.h> +#include <linux/irqchip/arm-gic.h>
#include <linux/atomic.h> #include <asm/cacheflush.h> @@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
nmi_enter();
- /* nop. FIQ handlers for special arch/arm features can be added here. */ +#ifdef CONFIG_ARM_GIC + gic_handle_fiq_ipi(); +#endif
nmi_exit();
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 5d72823bc5e9..5c36aefa67ea 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -39,6 +39,7 @@ #include <linux/slab.h> #include <linux/irqchip/chained_irq.h> #include <linux/irqchip/arm-gic.h> +#include <linux/ratelimit.h>
#include <asm/cputype.h> #include <asm/irq.h> @@ -48,6 +49,10 @@ #include "irq-gic-common.h" #include "irqchip.h"
+#ifndef SMP_IPI_FIQ_MASK +#define SMP_IPI_FIQ_MASK 0 +#endif + union gic_base { void __iomem *common_base; void __percpu * __iomem *percpu_base; @@ -65,6 +70,7 @@ struct gic_chip_data { #endif struct irq_domain *domain; unsigned int gic_irqs; + u32 igroup0_shadow; #ifdef CONFIG_GIC_NON_BANKED void __iomem *(*get_base)(union gic_base *); #endif @@ -348,6 +354,83 @@ static struct irq_chip gic_chip = { .irq_set_wake = gic_set_wake, };
+/* + * Shift an interrupt between Group 0 and Group 1. + * + * In addition to changing the group we also modify the priority to + * match what "ARM strongly recommends" for a system where no Group 1 + * interrupt must ever preempt a Group 0 interrupt. + * + * If is safe to call this function on systems which do not support + * grouping (it will have no effect). + */ +static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq, + int group) +{ + void __iomem *base = gic_data_dist_base(gic); + unsigned int grp_reg = hwirq / 32 * 4; + u32 grp_mask = BIT(hwirq % 32); + u32 grp_val; + + unsigned int pri_reg = (hwirq / 4) * 4; + u32 pri_mask = BIT(7 + ((hwirq % 4) * 8)); + u32 pri_val; + + /* + * Systems which do not support grouping will have not have + * the EnableGrp1 bit set. + */ + if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))) + return; + + raw_spin_lock(&irq_controller_lock); + + grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg); + pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg); + + if (group) { + grp_val |= grp_mask; + pri_val |= pri_mask; + } else { + grp_val &= ~grp_mask; + pri_val &= ~pri_mask; + } + + writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg); + if (grp_reg == 0) + gic->igroup0_shadow = grp_val; + + writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg); + + raw_spin_unlock(&irq_controller_lock); +} + + +/* + * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI, + * otherwise do nothing. + */ +void gic_handle_fiq_ipi(void) +{ + struct gic_chip_data *gic = &gic_data[0]; + void __iomem *cpu_base = gic_data_cpu_base(gic); + unsigned long irqstat, irqnr; + + if (WARN_ON(!in_nmi())) + return; + + while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) & + SMP_IPI_FIQ_MASK) { + irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK); + writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI); + + irqnr = irqstat & GICC_IAR_INT_ID_MASK; + WARN_RATELIMIT(irqnr > 16, + "Unexpected irqnr %lu (bad prioritization?)\n", + irqnr); + } +} + void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq) { if (gic_nr >= MAX_GIC_NR) @@ -379,15 +462,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic) static void gic_cpu_if_up(void) { void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]); - u32 bypass = 0; + void __iomem *dist_base = gic_data_dist_base(&gic_data[0]); + u32 ctrl = 0;
/* - * Preserve bypass disable bits to be written back later - */ - bypass = readl(cpu_base + GIC_CPU_CTRL); - bypass &= GICC_DIS_BYPASS_MASK; + * Preserve bypass disable bits to be written back later + */ + ctrl = readl(cpu_base + GIC_CPU_CTRL); + ctrl &= GICC_DIS_BYPASS_MASK;
- writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); + /* + * If EnableGrp1 is set in the distributor then enable group 1 + * support for this CPU (and route group 0 interrupts to FIQ). + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) + ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL | + GICC_ENABLE_GRP1; + + writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); }
@@ -411,7 +503,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
gic_dist_config(base, gic_irqs, NULL);
- writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL); + /* + * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only, + * bit 1 ignored) depending on current mode. + */ + writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL); + + /* + * Set all global interrupts to be group 1 if (and only if) it + * is possible to enable group 1 interrupts. This register is RAZ/WI + * if not accessible or not implemented, however some GICv1 devices + * do not implement the EnableGrp1 bit making it unsafe to set + * this register unconditionally. + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)) + for (i = 32; i < gic_irqs; i += 32) + writel_relaxed(0xffffffff, + base + GIC_DIST_IGROUP + i * 4 / 32); }
static void gic_cpu_init(struct gic_chip_data *gic) @@ -420,6 +528,7 @@ static void gic_cpu_init(struct gic_chip_data *gic) void __iomem *base = gic_data_cpu_base(gic); unsigned int cpu_mask, cpu = smp_processor_id(); int i; + unsigned long secure_irqs, secure_irq;
/* * Get what the GIC says our CPU mask is. @@ -438,6 +547,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
gic_cpu_config(dist_base, NULL);
+ /* + * If the distributor is configured to support interrupt grouping + * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK + * to be group1 and ensure any remaining group 0 interrupts have + * the right priority. + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) { + secure_irqs = SMP_IPI_FIQ_MASK; + writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0); + gic->igroup0_shadow = ~secure_irqs; + for_each_set_bit(secure_irq, &secure_irqs, 16) + gic_set_group_irq(gic, secure_irq, 0); + } + writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK); gic_cpu_if_up(); } @@ -527,7 +650,8 @@ static void gic_dist_restore(unsigned int gic_nr) writel_relaxed(gic_data[gic_nr].saved_spi_enable[i], dist_base + GIC_DIST_ENABLE_SET + i * 4);
- writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL); + writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, + dist_base + GIC_DIST_CTRL); }
static void gic_cpu_save(unsigned int gic_nr) @@ -655,6 +779,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; unsigned long map = 0; + unsigned long softint;
bl_migration_lock();
@@ -668,8 +793,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) */ dmb(ishst);
- /* this always happens on GIC0 */ - writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT); + /* We avoid a readl here by using the shadow copy of IGROUP[0] */ + softint = map << 16 | irq; + if (gic_data[0].igroup0_shadow & BIT(irq)) + softint |= 0x8000; + + /* This always happens on GIC0 */ + writel_relaxed(softint, + gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
bl_migration_unlock(); } diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h index 13eed92c7d24..e83d292d4dbc 100644 --- a/include/linux/irqchip/arm-gic.h +++ b/include/linux/irqchip/arm-gic.h @@ -22,6 +22,10 @@ #define GIC_CPU_IDENT 0xfc
#define GICC_ENABLE 0x1 +#define GICC_ENABLE_GRP1 0x2 +#define GICC_ACK_CTL 0x4 +#define GICC_FIQ_EN 0x8 +#define GICC_COMMON_BPR 0x10 #define GICC_INT_PRI_THRESHOLD 0xf0 #define GICC_IAR_INT_ID_MASK 0x3ff #define GICC_INT_SPURIOUS 1023 @@ -44,6 +48,7 @@ #define GIC_DIST_SGI_PENDING_SET 0xf20
#define GICD_ENABLE 0x1 +#define GICD_ENABLE_GRP1 0x2 #define GICD_DISABLE 0x0 #define GICD_INT_ACTLOW_LVLTRIG 0x0 #define GICD_INT_EN_CLR_X32 0xffffffff @@ -117,5 +122,8 @@ static inline void __init register_routable_domain_ops { gic_routable_irq_domain_ops = ops; } + +void gic_handle_fiq_ipi(void); + #endif /* __ASSEMBLY */ #endif
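As a worked example of the offset/mask arithmetic in gic_set_group_irq() above (derived purely from the code shown; the helper below is hypothetical): for hwirq 9 the group bit lands in IGROUP[0] and the priority mask selects the top bit of the second byte of the third priority register:

#include <linux/bitops.h>
#include <linux/printk.h>
#include <linux/types.h>

/*
 * Hypothetical helper mirroring gic_set_group_irq()'s arithmetic.
 * For hwirq = 9: grp_reg = 0 (IGROUP[0]), grp_mask = BIT(9),
 * pri_reg = 8 (the third priority register), pri_mask = BIT(15),
 * i.e. the top bit of priority byte 1, the byte that belongs to irq 9.
 */
static void show_group_math(unsigned int hwirq)
{
	unsigned int grp_reg = hwirq / 32 * 4;		/* 32 group bits per register */
	u32 grp_mask = BIT(hwirq % 32);
	unsigned int pri_reg = (hwirq / 4) * 4;		/* 4 priority bytes per register */
	u32 pri_mask = BIT(7 + (hwirq % 4) * 8);	/* MSB of this irq's byte */

	pr_debug("hwirq %u: IGROUP+%#x mask %#x, PRI+%#x mask %#x\n",
		 hwirq, grp_reg, grp_mask, pri_reg, pri_mask);
}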
Add basic infrastructure for triggering a backtrace of other CPUs via an IPI, preferably at FIQ level. It is intended that this shall be used for cases where we have detected that something has already failed in the kernel.
Signed-off-by: Russell King rmk+kernel@arm.linux.org.uk
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
---
 arch/arm/include/asm/irq.h |  5 ++++
 arch/arm/kernel/smp.c      | 62 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h index 53c15dec7af6..be1d07d59ee9 100644 --- a/arch/arm/include/asm/irq.h +++ b/arch/arm/include/asm/irq.h @@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *); extern void set_handle_irq(void (*handle_irq)(struct pt_regs *)); #endif
+#ifdef CONFIG_SMP +extern void arch_trigger_all_cpu_backtrace(bool); +#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x) +#endif + #endif
#endif diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 13396d3d600e..14c594a12bef 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -72,8 +72,12 @@ enum ipi_msg_type { IPI_CPU_STOP, IPI_IRQ_WORK, IPI_COMPLETION, + IPI_CPU_BACKTRACE, };
+/* For reliability, we're prepared to waste bits here. */ +static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly; + static DECLARE_COMPLETION(cpu_running);
static struct smp_operations smp_ops; @@ -535,6 +539,21 @@ static void ipi_cpu_stop(unsigned int cpu) cpu_relax(); }
+static void ipi_cpu_backtrace(struct pt_regs *regs) +{ + int cpu = smp_processor_id(); + + if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) { + static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED; + + arch_spin_lock(&lock); + printk(KERN_WARNING "FIQ backtrace for cpu %d\n", cpu); + show_regs(regs); + arch_spin_unlock(&lock); + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); + } +} + static DEFINE_PER_CPU(struct completion *, cpu_completion);
int register_ipi_completion(struct completion *completion, int cpu) @@ -614,6 +633,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs) irq_exit(); break;
+ case IPI_CPU_BACKTRACE: + irq_enter(); + ipi_cpu_backtrace(regs); + irq_exit(); + break; + default: printk(KERN_CRIT "CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr); @@ -708,3 +733,40 @@ static int __init register_cpufreq_notifier(void) core_initcall(register_cpufreq_notifier);
#endif + +void arch_trigger_all_cpu_backtrace(bool include_self) +{ + static unsigned long backtrace_flag; + int i, cpu = get_cpu(); + + if (test_and_set_bit(0, &backtrace_flag)) { + /* + * If there is already a trigger_all_cpu_backtrace() in progress + * (backtrace_flag == 1), don't output double cpu dump infos. + */ + put_cpu(); + return; + } + + cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask); + if (!include_self) + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); + + if (!cpumask_empty(to_cpumask(backtrace_mask))) { + pr_info("Sending FIQ to %s CPUs:\n", + (include_self ? "all" : "other")); + smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE); + } + + /* Wait for up to 10 seconds for all CPUs to do the backtrace */ + for (i = 0; i < 10 * 1000; i++) { + if (cpumask_empty(to_cpumask(backtrace_mask))) + break; + + mdelay(1); + } + + clear_bit(0, &backtrace_flag); + smp_mb__after_atomic(); + put_cpu(); +}
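For context, a sketch of how this entry point is normally reached (assuming the generic wiring in include/linux/nmi.h and the sysrq driver, neither of which this patch modifies): because arch/arm/include/asm/irq.h now #defines arch_trigger_all_cpu_backtrace, the generic helper resolves to it, and sysrq-l ends up here:

/* Paraphrased from include/linux/nmi.h (v3.18-era): */
#ifdef arch_trigger_all_cpu_backtrace
static inline bool trigger_all_cpu_backtrace(void)
{
	arch_trigger_all_cpu_backtrace(true);	/* include the calling CPU */
	return true;
}
#endif

A quick way to exercise it from a shell is: echo l > /proc/sysrq-trigger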
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM, but these are currently independent of each other.
This patch plumbs the two features together, making it possible, on platforms that support it, to trigger a backtrace using FIQ.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
---
 arch/arm/include/asm/smp.h | 3 +++
 arch/arm/kernel/smp.c      | 4 +++-
 arch/arm/kernel/traps.c    | 3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h index 18f5a554134f..b076584ac0fa 100644 --- a/arch/arm/include/asm/smp.h +++ b/arch/arm/include/asm/smp.h @@ -18,6 +18,8 @@ # error "<asm/smp.h> included in non-SMP build" #endif
+#define SMP_IPI_FIQ_MASK 0x0100 + #define raw_smp_processor_id() (current_thread_info()->cpu)
struct seq_file; @@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu); extern void arch_send_call_function_ipi_mask(const struct cpumask *mask); extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
+extern void ipi_cpu_backtrace(struct pt_regs *regs); extern int register_ipi_completion(struct completion *completion, int cpu);
struct smp_operations { diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 14c594a12bef..e923843562d9 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -539,7 +539,7 @@ static void ipi_cpu_stop(unsigned int cpu) cpu_relax(); }
-static void ipi_cpu_backtrace(struct pt_regs *regs) +void ipi_cpu_backtrace(struct pt_regs *regs) { int cpu = smp_processor_id();
@@ -580,6 +580,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs) unsigned int cpu = smp_processor_id(); struct pt_regs *old_regs = set_irq_regs(regs);
+ BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE)); + if ((unsigned)ipinr < NR_IPI) { trace_ipi_entry(ipi_types[ipinr]); __inc_irq_stat(cpu, ipi_irqs[ipinr]); diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 4dc45b38e56e..9eb05be9526e 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs) #ifdef CONFIG_ARM_GIC gic_handle_fiq_ipi(); #endif +#ifdef CONFIG_SMP + ipi_cpu_backtrace(regs); +#endif
nmi_exit();
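The BUILD_BUG_ON() above ties SMP_IPI_FIQ_MASK to the IPI number at compile time. A quick worked check of the constants (assuming enum ipi_msg_type numbering as in v3.18, with IPI_WAKEUP = 0 through IPI_COMPLETION = 7):

/* IPI_CPU_BACKTRACE is appended as the ninth enum entry, so:  */
/*   IPI_CPU_BACKTRACE      = 8                                */
/*   BIT(IPI_CPU_BACKTRACE) = 1 << 8 = 0x0100                  */
/*   SMP_IPI_FIQ_MASK       = 0x0100  => the two sides agree.  */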
Hi Thomas, Hi Jason: Patches 1 to 3 are for you.
This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
On platforms not capable of supporting FIQ, the signal to generate a backtrace falls back to using IRQ for propagation instead (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that do not (vexpress-a9 and Qualcomm Snapdragon 600).
v12:
* Squash first two patches into a single one and re-describe (Thomas Gleixner).
* Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe" (Thomas Gleixner).
v11:
* Optimized gic_raise_softirq() by replacing a register read with a memory read (Jason Cooper).
v10:
* Add a further patch to optimize away some of the locking on systems where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with exynos_defconfig (which is the only defconfig to set this option).
* Whitespace fixes in patch 4. That patch previously used spaces for alignment of new constants but the rest of the file used tabs.
v9:
* Improved documentation and structure of initial patch (now initial two patches) to make gic_raise_softirq() safe to call from FIQ (Thomas Gleixner).
* Avoid masking interrupts during gic_raise_softirq(). The use of the read lock makes this redundant (because we can safely re-enter the function).
v8:
* Fixed build on arm64 caused by a spurious include file in irq-gic.c.
v7-2 (accidentally released twice with same number):
* Fixed boot regression on vexpress-a9 (reported by Russell King).
* Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
* Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
v7:
* Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested offending patch in isolation using defconfig identified by the autobuilder.
v6:
* Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
* Modified usr_entry to optionally avoid calling into the trace code and used this in FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
* Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
* Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
v5:
* Rebased on 3.17-rc4.
* Removed a spurious line from the final "glue it together" patch that broke the build.
v4:
* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
* Really fix bad pt_regs pointer generation in __fiq_abt.
* Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King)
* Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
v3:
* Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
v2:
* Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre)
* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
* Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
* This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
* Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
* Add arm64 version of fiq.h (review of Russell King)
* Removed conditional branching and code from irq-gic.c, this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King)
Daniel Thompson (5):
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  ARM: add basic support for on-demand backtrace of other CPUs
  arm: smp: Handle ipi_cpu_backtrace() using FIQ (if available)
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  64 +++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 6 files changed, 275 insertions(+), 16 deletions(-)
--
1.9.3
Currently gic_raise_softirq() is locked using irq_controller_lock. This lock is primarily used to make register read-modify-write sequences atomic, but gic_raise_softirq() uses it instead to ensure that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
This is sub-optimal in two closely related ways:
1. No locking at all is required on systems where the b.L switcher is not configured.
2. Finer grain locking can be used on systems where the b.L switcher is present.
This patch resolves both of the above by introducing a separate finer grain lock and providing conditionally compiled inlines to lock/unlock it.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Jason Cooper jason@lakedaemon.net
Cc: Russell King linux@arm.linux.org.uk
Cc: Marc Zyngier marc.zyngier@arm.com
---
 drivers/irqchip/irq-gic.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 38493ff28fa5..e875da93f24a 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -73,6 +73,27 @@ struct gic_chip_data { static DEFINE_RAW_SPINLOCK(irq_controller_lock);
/* + * This lock is used by the big.LITTLE migration code to ensure no IPIs + * can be pended on the old core after the map has been updated. + */ +#ifdef CONFIG_BL_SWITCHER +static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock); + +static inline void bl_migration_lock(unsigned long *flags) +{ + raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags); +} + +static inline void bl_migration_unlock(unsigned long flags) +{ + raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags); +} +#else +static inline void bl_migration_lock(unsigned long *flags) {} +static inline void bl_migration_unlock(unsigned long flags) {} +#endif + +/* * The GIC mapping of CPU interfaces does not necessarily match * the logical CPU numbering. Let's use a mapping as returned * by the GIC itself. @@ -624,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) int cpu; unsigned long flags, map = 0;
- raw_spin_lock_irqsave(&irq_controller_lock, flags); + bl_migration_lock(&flags);
/* Convert our logical CPU mask into a physical one. */ for_each_cpu(cpu, mask) @@ -639,7 +660,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) /* this always happens on GIC0 */ writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
- raw_spin_unlock_irqrestore(&irq_controller_lock, flags); + bl_migration_unlock(flags); } #endif
@@ -710,8 +731,17 @@ void gic_migrate_target(unsigned int new_cpu_id)
raw_spin_lock(&irq_controller_lock);
- /* Update the target interface for this logical CPU */ + /* + * Update the target interface for this logical CPU + * + * From the point we release the cpu_map_migration_lock any new + * SGIs will be pended on the new cpu which makes the set of SGIs + * pending on the old cpu static. That means we can defer the + * migration until after we have released the irq_controller_lock. + */ + raw_spin_lock(&cpu_map_migration_lock); gic_cpu_map[cpu] = 1 << new_cpu_id; + raw_spin_unlock(&cpu_map_migration_lock);
/* * Find all the peripheral interrupts targetting the current
It is currently possible for FIQ handlers to re-enter gic_raise_softirq() and lock up.
  gic_raise_softirq()
     lock(x);
-~-> FIQ
        handle_fiq()
           gic_raise_softirq()
              lock(x);          <-- Lockup
arch/arm/ uses IPIs to implement arch_irq_work_raise(), so this issue makes it difficult for FIQ handlers to safely defer work to less restrictive calling contexts.
This patch fixes the problem by converting the cpu_map_migration_lock into a rwlock, making it safe to re-enter the function.
Note that having made it safe to re-enter gic_raise_softirq() we no longer need to mask interrupts during gic_raise_softirq() because the b.L migration is always performed from task context.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Jason Cooper jason@lakedaemon.net
Cc: Russell King linux@arm.linux.org.uk
Cc: Marc Zyngier marc.zyngier@arm.com
---
 drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index e875da93f24a..5d72823bc5e9 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock); /* * This lock is used by the big.LITTLE migration code to ensure no IPIs * can be pended on the old core after the map has been updated. + * + * This lock may be locked for reading from both IRQ and FIQ handlers + * and therefore must not be locked for writing when these are enabled. */ #ifdef CONFIG_BL_SWITCHER -static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock); +static DEFINE_RWLOCK(cpu_map_migration_lock);
-static inline void bl_migration_lock(unsigned long *flags) +static inline void bl_migration_lock(void) { - raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags); + read_lock(&cpu_map_migration_lock); }
-static inline void bl_migration_unlock(unsigned long flags) +static inline void bl_migration_unlock(void) { - raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags); + read_unlock(&cpu_map_migration_lock); } #else -static inline void bl_migration_lock(unsigned long *flags) {} -static inline void bl_migration_unlock(unsigned long flags) {} +static inline void bl_migration_lock(void) {} +static inline void bl_migration_unlock(void) {} #endif
/* @@ -640,12 +643,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic) #endif
#ifdef CONFIG_SMP +/* + * Raise the specified IPI on all cpus set in mask. + * + * This function is safe to call from all calling contexts, including + * FIQ handlers. It relies on bl_migration_lock() being multiply acquirable + * to avoid deadlocks when the function is re-entered at different + * exception levels. + */ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; - unsigned long flags, map = 0; + unsigned long map = 0;
- bl_migration_lock(&flags); + bl_migration_lock();
/* Convert our logical CPU mask into a physical one. */ for_each_cpu(cpu, mask) @@ -660,7 +671,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) /* this always happens on GIC0 */ writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
- bl_migration_unlock(flags); + bl_migration_unlock(); } #endif
@@ -708,7 +719,8 @@ int gic_get_cpu_id(unsigned int cpu) * Migrate all peripheral interrupts with a target matching the current CPU * to the interface corresponding to @new_cpu_id. The CPU interface mapping * is also updated. Targets to other CPU interfaces are unchanged. - * This must be called with IRQs locally disabled. + * This must be called from a task context and with IRQ and FIQ locally + * disabled. */ void gic_migrate_target(unsigned int new_cpu_id) { @@ -739,9 +751,9 @@ void gic_migrate_target(unsigned int new_cpu_id) * pending on the old cpu static. That means we can defer the * migration until after we have released the irq_controller_lock. */ - raw_spin_lock(&cpu_map_migration_lock); + write_lock(&cpu_map_migration_lock); gic_cpu_map[cpu] = 1 << new_cpu_id; - raw_spin_unlock(&cpu_map_migration_lock); + write_unlock(&cpu_map_migration_lock);
/* * Find all the peripheral interrupts targetting the current
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware, except GICv1 implementations without TrustZone support, provides a means to group exceptions into group 0 and group 1, but this hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside the "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups is not deployed and all IPIs will be raised via IRQ.
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Jason Cooper jason@lakedaemon.net
Cc: Russell King linux@arm.linux.org.uk
Cc: Marc Zyngier marc.zyngier@arm.com
Tested-by: Jon Medhurst tixy@linaro.org
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 151 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 153 insertions(+), 11 deletions(-)
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 0c8b10801d36..4dc45b38e56e 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -26,6 +26,7 @@ #include <linux/init.h> #include <linux/sched.h> #include <linux/irq.h> +#include <linux/irqchip/arm-gic.h>
#include <linux/atomic.h> #include <asm/cacheflush.h> @@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
nmi_enter();
- /* nop. FIQ handlers for special arch/arm features can be added here. */ +#ifdef CONFIG_ARM_GIC + gic_handle_fiq_ipi(); +#endif
nmi_exit();
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 5d72823bc5e9..5c36aefa67ea 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -39,6 +39,7 @@ #include <linux/slab.h> #include <linux/irqchip/chained_irq.h> #include <linux/irqchip/arm-gic.h> +#include <linux/ratelimit.h>
#include <asm/cputype.h> #include <asm/irq.h> @@ -48,6 +49,10 @@ #include "irq-gic-common.h" #include "irqchip.h"
+#ifndef SMP_IPI_FIQ_MASK +#define SMP_IPI_FIQ_MASK 0 +#endif + union gic_base { void __iomem *common_base; void __percpu * __iomem *percpu_base; @@ -65,6 +70,7 @@ struct gic_chip_data { #endif struct irq_domain *domain; unsigned int gic_irqs; + u32 igroup0_shadow; #ifdef CONFIG_GIC_NON_BANKED void __iomem *(*get_base)(union gic_base *); #endif @@ -348,6 +354,83 @@ static struct irq_chip gic_chip = { .irq_set_wake = gic_set_wake, };
+/* + * Shift an interrupt between Group 0 and Group 1. + * + * In addition to changing the group we also modify the priority to + * match what "ARM strongly recommends" for a system where no Group 1 + * interrupt must ever preempt a Group 0 interrupt. + * + * If is safe to call this function on systems which do not support + * grouping (it will have no effect). + */ +static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq, + int group) +{ + void __iomem *base = gic_data_dist_base(gic); + unsigned int grp_reg = hwirq / 32 * 4; + u32 grp_mask = BIT(hwirq % 32); + u32 grp_val; + + unsigned int pri_reg = (hwirq / 4) * 4; + u32 pri_mask = BIT(7 + ((hwirq % 4) * 8)); + u32 pri_val; + + /* + * Systems which do not support grouping will have not have + * the EnableGrp1 bit set. + */ + if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))) + return; + + raw_spin_lock(&irq_controller_lock); + + grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg); + pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg); + + if (group) { + grp_val |= grp_mask; + pri_val |= pri_mask; + } else { + grp_val &= ~grp_mask; + pri_val &= ~pri_mask; + } + + writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg); + if (grp_reg == 0) + gic->igroup0_shadow = grp_val; + + writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg); + + raw_spin_unlock(&irq_controller_lock); +} + + +/* + * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI, + * otherwise do nothing. + */ +void gic_handle_fiq_ipi(void) +{ + struct gic_chip_data *gic = &gic_data[0]; + void __iomem *cpu_base = gic_data_cpu_base(gic); + unsigned long irqstat, irqnr; + + if (WARN_ON(!in_nmi())) + return; + + while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) & + SMP_IPI_FIQ_MASK) { + irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK); + writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI); + + irqnr = irqstat & GICC_IAR_INT_ID_MASK; + WARN_RATELIMIT(irqnr > 16, + "Unexpected irqnr %lu (bad prioritization?)\n", + irqnr); + } +} + void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq) { if (gic_nr >= MAX_GIC_NR) @@ -379,15 +462,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic) static void gic_cpu_if_up(void) { void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]); - u32 bypass = 0; + void __iomem *dist_base = gic_data_dist_base(&gic_data[0]); + u32 ctrl = 0;
/* - * Preserve bypass disable bits to be written back later - */ - bypass = readl(cpu_base + GIC_CPU_CTRL); - bypass &= GICC_DIS_BYPASS_MASK; + * Preserve bypass disable bits to be written back later + */ + ctrl = readl(cpu_base + GIC_CPU_CTRL); + ctrl &= GICC_DIS_BYPASS_MASK;
- writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); + /* + * If EnableGrp1 is set in the distributor then enable group 1 + * support for this CPU (and route group 0 interrupts to FIQ). + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) + ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL | + GICC_ENABLE_GRP1; + + writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); }
@@ -411,7 +503,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
gic_dist_config(base, gic_irqs, NULL);
- writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL); + /* + * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only, + * bit 1 ignored) depending on current mode. + */ + writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL); + + /* + * Set all global interrupts to be group 1 if (and only if) it + * is possible to enable group 1 interrupts. This register is RAZ/WI + * if not accessible or not implemented, however some GICv1 devices + * do not implement the EnableGrp1 bit making it unsafe to set + * this register unconditionally. + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)) + for (i = 32; i < gic_irqs; i += 32) + writel_relaxed(0xffffffff, + base + GIC_DIST_IGROUP + i * 4 / 32); }
static void gic_cpu_init(struct gic_chip_data *gic) @@ -420,6 +528,7 @@ static void gic_cpu_init(struct gic_chip_data *gic) void __iomem *base = gic_data_cpu_base(gic); unsigned int cpu_mask, cpu = smp_processor_id(); int i; + unsigned long secure_irqs, secure_irq;
/* * Get what the GIC says our CPU mask is. @@ -438,6 +547,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
gic_cpu_config(dist_base, NULL);
+ /* + * If the distributor is configured to support interrupt grouping + * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK + * to be group1 and ensure any remaining group 0 interrupts have + * the right priority. + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) { + secure_irqs = SMP_IPI_FIQ_MASK; + writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0); + gic->igroup0_shadow = ~secure_irqs; + for_each_set_bit(secure_irq, &secure_irqs, 16) + gic_set_group_irq(gic, secure_irq, 0); + } + writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK); gic_cpu_if_up(); } @@ -527,7 +650,8 @@ static void gic_dist_restore(unsigned int gic_nr) writel_relaxed(gic_data[gic_nr].saved_spi_enable[i], dist_base + GIC_DIST_ENABLE_SET + i * 4);
- writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL); + writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, + dist_base + GIC_DIST_CTRL); }
static void gic_cpu_save(unsigned int gic_nr) @@ -655,6 +779,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; unsigned long map = 0; + unsigned long softint;
bl_migration_lock();
@@ -668,8 +793,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) */ dmb(ishst);
- /* this always happens on GIC0 */ - writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT); + /* We avoid a readl here by using the shadow copy of IGROUP[0] */ + softint = map << 16 | irq; + if (gic_data[0].igroup0_shadow & BIT(irq)) + softint |= 0x8000; + + /* This always happens on GIC0 */ + writel_relaxed(softint, + gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
bl_migration_unlock(); } diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h index 13eed92c7d24..e83d292d4dbc 100644 --- a/include/linux/irqchip/arm-gic.h +++ b/include/linux/irqchip/arm-gic.h @@ -22,6 +22,10 @@ #define GIC_CPU_IDENT 0xfc
#define GICC_ENABLE 0x1 +#define GICC_ENABLE_GRP1 0x2 +#define GICC_ACK_CTL 0x4 +#define GICC_FIQ_EN 0x8 +#define GICC_COMMON_BPR 0x10 #define GICC_INT_PRI_THRESHOLD 0xf0 #define GICC_IAR_INT_ID_MASK 0x3ff #define GICC_INT_SPURIOUS 1023 @@ -44,6 +48,7 @@ #define GIC_DIST_SGI_PENDING_SET 0xf20
#define GICD_ENABLE 0x1 +#define GICD_ENABLE_GRP1 0x2 #define GICD_DISABLE 0x0 #define GICD_INT_ACTLOW_LVLTRIG 0x0 #define GICD_INT_EN_CLR_X32 0xffffffff @@ -117,5 +122,8 @@ static inline void __init register_routable_domain_ops { gic_routable_irq_domain_ops = ops; } + +void gic_handle_fiq_ipi(void); + #endif /* __ASSEMBLY */ #endif
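The fallback described above ("the code to change groups is not deployed") rests on the RAZ/WI probe in gic_dist_init(): write EnableGrp1 and see whether it sticks. A minimal sketch of that probe as a standalone helper (hypothetical; register names from include/linux/irqchip/arm-gic.h as extended by this patch):

#include <linux/io.h>
#include <linux/irqchip/arm-gic.h>
#include <linux/types.h>

/*
 * Try to switch the distributor into grouped mode; on hardware where
 * EnableGrp1 is read-as-zero/write-ignored (secure monitor present,
 * or no grouping support) the bit will not stick, and all IPIs fall
 * back to being raised via IRQ.
 */
static bool gic_grouping_usable(void __iomem *dist_base)
{
	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
		       dist_base + GIC_DIST_CTRL);
	return !!(readl_relaxed(dist_base + GIC_DIST_CTRL) &
		  GICD_ENABLE_GRP1);
}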
Add basic infrastructure for triggering a backtrace of other CPUs via an IPI, preferably at FIQ level. It is intended that this shall be used for cases where we have detected that something has already failed in the kernel.
Signed-off-by: Russell King rmk+kernel@arm.linux.org.uk
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
---
 arch/arm/include/asm/irq.h |  5 ++++
 arch/arm/kernel/smp.c      | 62 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h index 53c15dec7af6..be1d07d59ee9 100644 --- a/arch/arm/include/asm/irq.h +++ b/arch/arm/include/asm/irq.h @@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *); extern void set_handle_irq(void (*handle_irq)(struct pt_regs *)); #endif
+#ifdef CONFIG_SMP +extern void arch_trigger_all_cpu_backtrace(bool); +#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x) +#endif + #endif
#endif diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 13396d3d600e..14c594a12bef 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -72,8 +72,12 @@ enum ipi_msg_type { IPI_CPU_STOP, IPI_IRQ_WORK, IPI_COMPLETION, + IPI_CPU_BACKTRACE, };
+/* For reliability, we're prepared to waste bits here. */ +static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly; + static DECLARE_COMPLETION(cpu_running);
static struct smp_operations smp_ops; @@ -535,6 +539,21 @@ static void ipi_cpu_stop(unsigned int cpu) cpu_relax(); }
+static void ipi_cpu_backtrace(struct pt_regs *regs) +{ + int cpu = smp_processor_id(); + + if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) { + static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED; + + arch_spin_lock(&lock); + printk(KERN_WARNING "FIQ backtrace for cpu %d\n", cpu); + show_regs(regs); + arch_spin_unlock(&lock); + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); + } +} + static DEFINE_PER_CPU(struct completion *, cpu_completion);
int register_ipi_completion(struct completion *completion, int cpu) @@ -614,6 +633,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs) irq_exit(); break;
+ case IPI_CPU_BACKTRACE: + irq_enter(); + ipi_cpu_backtrace(regs); + irq_exit(); + break; + default: printk(KERN_CRIT "CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr); @@ -708,3 +733,40 @@ static int __init register_cpufreq_notifier(void) core_initcall(register_cpufreq_notifier);
#endif + +void arch_trigger_all_cpu_backtrace(bool include_self) +{ + static unsigned long backtrace_flag; + int i, cpu = get_cpu(); + + if (test_and_set_bit(0, &backtrace_flag)) { + /* + * If there is already a trigger_all_cpu_backtrace() in progress + * (backtrace_flag == 1), don't output double cpu dump infos. + */ + put_cpu(); + return; + } + + cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask); + if (!include_self) + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); + + if (!cpumask_empty(to_cpumask(backtrace_mask))) { + pr_info("Sending FIQ to %s CPUs:\n", + (include_self ? "all" : "other")); + smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE); + } + + /* Wait for up to 10 seconds for all CPUs to do the backtrace */ + for (i = 0; i < 10 * 1000; i++) { + if (cpumask_empty(to_cpumask(backtrace_mask))) + break; + + mdelay(1); + } + + clear_bit(0, &backtrace_flag); + smp_mb__after_atomic(); + put_cpu(); +}
Previous changes have introduced both a replacement default FIQ handler and an implementation of arch_trigger_all_cpu_backtrace for ARM, but these are currently independent of each other.
This patch plumbs the two features together, making it possible, on platforms that support it, to trigger a backtrace using FIQ.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
---
 arch/arm/include/asm/smp.h | 3 +++
 arch/arm/kernel/smp.c      | 4 +++-
 arch/arm/kernel/traps.c    | 3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 18f5a554134f..b076584ac0fa 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -18,6 +18,8 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif

+#define SMP_IPI_FIQ_MASK 0x0100
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)

 struct seq_file;
@@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);

+extern void ipi_cpu_backtrace(struct pt_regs *regs);
 extern int register_ipi_completion(struct completion *completion, int cpu);

 struct smp_operations {
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 14c594a12bef..e923843562d9 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -539,7 +539,7 @@ static void ipi_cpu_stop(unsigned int cpu)
 		cpu_relax();
 }

-static void ipi_cpu_backtrace(struct pt_regs *regs)
+void ipi_cpu_backtrace(struct pt_regs *regs)
 {
 	int cpu = smp_processor_id();

@@ -580,6 +580,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 	unsigned int cpu = smp_processor_id();
 	struct pt_regs *old_regs = set_irq_regs(regs);

+	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));
+
 	if ((unsigned)ipinr < NR_IPI) {
 		trace_ipi_entry(ipi_types[ipinr]);
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 4dc45b38e56e..9eb05be9526e 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#ifdef CONFIG_SMP
+	ipi_cpu_backtrace(regs);
+#endif

 	nmi_exit();
On 28/11/14 16:16, Daniel Thompson wrote:
Hi Thomas, Hi Jason: Patches 1 to 3 are for you.
This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
Are the irq parts of this patchset looking OK now?
Daniel.
On platforms not capable of supporting FIQ, the signal to generate a backtrace falls back to using IRQ for propagation instead (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that are not (vexpress-a9 and Qualcomm Snapdragon 600).
v12:
* Squash first two patches into a single one and re-describe (Thomas Gleixner).
* Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe" (Thomas Gleixner).
v11:
* Optimized gic_raise_softirq() by replacing a register read with a memory read (Jason Cooper).
v10:
* Add a further patch to optimize away some of the locking on systems where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with exynos_defconfig (which is the only defconfig to set this option).
* Whitespace fixes in patch 4. That patch previously used spaces for alignment of new constants but the rest of the file used tabs.
v9:
* Improved documentation and structure of initial patch (now initial two patches) to make gic_raise_softirq() safe to call from FIQ (Thomas Gleixner).
* Avoid masking interrupts during gic_raise_softirq(). The use of the read lock makes this redundant (because we can safely re-enter the function).
v8:
* Fixed build on arm64 caused by a spurious include file in irq-gic.c.
v7-2 (accidentally released twice with same number):
* Fixed boot regression on vexpress-a9 (reported by Russell King).
* Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
* Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
v7:
* Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested offending patch in isolation using defconfig identified by the autobuilder.
v6:
* Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
* Modified usr_entry to optionally avoid calling into the trace code and used this in FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
* Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
* Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
v5:
* Rebased on 3.17-rc4.
* Removed a spurious line from the final "glue it together" patch that broke the build.
v4:
* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
* Really fix bad pt_regs pointer generation in __fiq_abt.
* Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King).
* Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
v3:
* Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
v2:
* Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre).
* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
* Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
* This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
* Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
* Add arm64 version of fiq.h (review of Russell King).
* Removed conditional branching and code from irq-gic.c, this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King).
Daniel Thompson (5):
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  ARM: add basic support for on-demand backtrace of other CPUs
  arm: smp: Handle ipi_cpu_backtrace() using FIQ (if available)

 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  64 +++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 6 files changed, 275 insertions(+), 16 deletions(-)
-- 1.9.3
Hi Thomas, Hi Jason: Patches 1 to 3 are for you and, apart from the rebase, these three patches haven't been changed since the last time I posted them.
This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
On platforms not capable of supporting FIQ, the signal to generate a backtrace falls back to using IRQ for propagation instead (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that are not (vexpress-a9 and Qualcomm Snapdragon 600).
v13:
* Updated the code to print the backtrace to replicate Steven Rostedt's x86 work to make SysRq-l safe. This is pretty much a total rewrite of patches 4 and 5.
v12:
* Squash first two patches into a single one and re-describe (Thomas Gleixner).
* Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe" (Thomas Gleixner).
v11:
* Optimized gic_raise_softirq() by replacing a register read with a memory read (Jason Cooper).
v10:
* Add a further patch to optimize away some of the locking on systems where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with exynos_defconfig (which is the only defconfig to set this option).
* Whitespace fixes in patch 4. That patch previously used spaces for alignment of new constants but the rest of the file used tabs.
v9:
* Improved documentation and structure of initial patch (now initial two patches) to make gic_raise_softirq() safe to call from FIQ (Thomas Gleixner).
* Avoid masking interrupts during gic_raise_softirq(). The use of the read lock makes this redundant (because we can safely re-enter the function).
v8:
* Fixed build on arm64 caused by a spurious include file in irq-gic.c.
v7-2 (accidentally released twice with same number):
* Fixed boot regression on vexpress-a9 (reported by Russell King).
* Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
* Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
v7:
* Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested offending patch in isolation using defconfig identified by the autobuilder.
v6:
* Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
* Modified usr_entry to optionally avoid calling into the trace code and used this in FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
* Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
* Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
v5:
* Rebased on 3.17-rc4.
* Removed a spurious line from the final "glue it together" patch that broke the build.
v4:
* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
* Really fix bad pt_regs pointer generation in __fiq_abt.
* Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King)
* Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
v3:
* Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
v2:
* Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre)
* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
* Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
* This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
* Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
* Add arm64 version of fiq.h (review of Russell King)
* Removed conditional branching and code from irq-gic.c, this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King)
Daniel Thompson (5):
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  ARM: Add support for on-demand backtrace of other CPUs
  ARM: Fix on-demand backtrace triggered by IRQ

 arch/arm/include/asm/hardirq.h  |   2 +-
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           | 164 ++++++++++++++++++++++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 7 files changed, 376 insertions(+), 17 deletions(-)
-- 1.9.3
Currently gic_raise_softirq() is locked using the irq_controller_lock. This lock is primarily used to make register read-modify-write sequences atomic, but gic_raise_softirq() uses it instead to ensure that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
This is sub-optimal in two closely related ways:
1. No locking at all is required on systems where the b.L switcher is not configured.
2. Finer grain locking can be used on systems where the b.L switcher is present.
This patch resolves both of the above by introducing a separate finer grain lock and providing conditionally compiled inlines to lock/unlock it.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Jason Cooper jason@lakedaemon.net
Cc: Russell King linux@arm.linux.org.uk
Cc: Marc Zyngier marc.zyngier@arm.com
---
 drivers/irqchip/irq-gic.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d617ee5a3d8a..a9ed64dcc84b 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,27 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);

 /*
+ * This lock is used by the big.LITTLE migration code to ensure no IPIs
+ * can be pended on the old core after the map has been updated.
+ */
+#ifdef CONFIG_BL_SWITCHER
+static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+
+static inline void bl_migration_lock(unsigned long *flags)
+{
+	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+}
+
+static inline void bl_migration_unlock(unsigned long flags)
+{
+	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+}
+#else
+static inline void bl_migration_lock(unsigned long *flags) {}
+static inline void bl_migration_unlock(unsigned long flags) {}
+#endif
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;

-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	bl_migration_lock(&flags);

 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -639,7 +660,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	bl_migration_unlock(flags);
 }
 #endif

@@ -710,8 +731,17 @@ void gic_migrate_target(unsigned int new_cpu_id)

 	raw_spin_lock(&irq_controller_lock);

-	/* Update the target interface for this logical CPU */
+	/*
+	 * Update the target interface for this logical CPU
+	 *
+	 * From the point we release the cpu_map_migration_lock any new
+	 * SGIs will be pended on the new cpu which makes the set of SGIs
+	 * pending on the old cpu static. That means we can defer the
+	 * migration until after we have released the irq_controller_lock.
+	 */
+	raw_spin_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
+	raw_spin_unlock(&cpu_map_migration_lock);

 	/*
	 * Find all the peripheral interrupts targetting the current
It is currently possible for FIQ handlers to re-enter gic_raise_softirq() and lock up.
   gic_raise_softirq()
      lock(x);
-~-> FIQ
        handle_fiq()
           gic_raise_softirq()
              lock(x);		<-- Lockup
arch/arm/ uses IPIs to implement arch_irq_work_raise(), thus this issue renders it difficult for FIQ handlers to safely defer work to less restrictive calling contexts.
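To make the hazard concrete, this is roughly the ARM IRQ-work hook of this era that can end up in gic_raise_softirq() (an illustrative paraphrase of arch/arm/kernel/smp.c; check the exact tree for details):

/*
 * irq_work_queue() called from a FIQ handler lands here, which raises
 * an IPI, which is why gic_raise_softirq() must tolerate being
 * re-entered from FIQ.
 */
#ifdef CONFIG_IRQ_WORK
void arch_irq_work_raise(void)
{
	if (arch_irq_work_has_interrupt())
		smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
}
#endif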
This patch fixes the problem by converting the cpu_map_migration_lock into an rwlock, making it safe to re-enter the function.
Note that having made it safe to re-enter gic_raise_softirq() we no longer need to mask interrupts during gic_raise_softirq() because the b.L migration is always performed from task context.
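As a caller-side sketch of why taking the lock for writing is then safe (the real call site is the b.L switcher in arch/arm/common/bL_switcher.c; this paraphrase and the variable name are illustrative, not the exact code):

/*
 * The switcher thread masks both IRQ and FIQ before calling
 * gic_migrate_target(), so no reader (IRQ or FIQ handler) can be live
 * on this CPU while the rwlock is held for writing.
 */
local_irq_disable();
local_fiq_disable();

gic_migrate_target(new_cpu_id);	/* takes write_lock() internally */

local_fiq_enable();
local_irq_enable();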
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Jason Cooper jason@lakedaemon.net
Cc: Russell King linux@arm.linux.org.uk
Cc: Marc Zyngier marc.zyngier@arm.com
---
 drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index a9ed64dcc84b..c172176499f6 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock);
 /*
  * This lock is used by the big.LITTLE migration code to ensure no IPIs
  * can be pended on the old core after the map has been updated.
+ *
+ * This lock may be locked for reading from both IRQ and FIQ handlers
+ * and therefore must not be locked for writing when these are enabled.
  */
 #ifdef CONFIG_BL_SWITCHER
-static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+static DEFINE_RWLOCK(cpu_map_migration_lock);

-static inline void bl_migration_lock(unsigned long *flags)
+static inline void bl_migration_lock(void)
 {
-	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+	read_lock(&cpu_map_migration_lock);
 }

-static inline void bl_migration_unlock(unsigned long flags)
+static inline void bl_migration_unlock(void)
 {
-	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+	read_unlock(&cpu_map_migration_lock);
 }
 #else
-static inline void bl_migration_lock(unsigned long *flags) {}
-static inline void bl_migration_unlock(unsigned long flags) {}
+static inline void bl_migration_lock(void) {}
+static inline void bl_migration_unlock(void) {}
 #endif

 /*
@@ -640,12 +643,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
 #endif

 #ifdef CONFIG_SMP
+/*
+ * Raise the specified IPI on all cpus set in mask.
+ *
+ * This function is safe to call from all calling contexts, including
+ * FIQ handlers. It relies on bl_migration_lock() being multiply acquirable
+ * to avoid deadlocks when the function is re-entered at different
+ * exception levels.
+ */
 static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
-	unsigned long flags, map = 0;
+	unsigned long map = 0;

-	bl_migration_lock(&flags);
+	bl_migration_lock();

 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -660,7 +671,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

-	bl_migration_unlock(flags);
+	bl_migration_unlock();
 }
 #endif

@@ -708,7 +719,8 @@ int gic_get_cpu_id(unsigned int cpu)
  * Migrate all peripheral interrupts with a target matching the current CPU
  * to the interface corresponding to @new_cpu_id. The CPU interface mapping
  * is also updated. Targets to other CPU interfaces are unchanged.
- * This must be called with IRQs locally disabled.
+ * This must be called from a task context and with IRQ and FIQ locally
+ * disabled.
  */
 void gic_migrate_target(unsigned int new_cpu_id)
 {
@@ -739,9 +751,9 @@ void gic_migrate_target(unsigned int new_cpu_id)
 	 * pending on the old cpu static. That means we can defer the
 	 * migration until after we have released the irq_controller_lock.
 	 */
-	raw_spin_lock(&cpu_map_migration_lock);
+	write_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
-	raw_spin_unlock(&cpu_map_migration_lock);
+	write_unlock(&cpu_map_migration_lock);

 	/*
	 * Find all the peripheral interrupts targetting the current
Currently it is not possible to exploit FIQ for systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware except GICv1-without-TrustZone provides a means to group exceptions into group 0 and group 1, but this hardware functionality is unavailable to the kernel when a secure monitor is present, because access to the grouping registers is prohibited outside the "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the code to change groups does not deploy and all IPIs will be raised via IRQ.
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
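A useful way to read the diff below is as repeated applications of a single probe idiom. A minimal sketch, assuming dist_base points at the distributor and using the constants this patch adds:

/*
 * Attempt to set EnableGrp1 and read it back. Where grouping is
 * unavailable (secure monitor present, or simply not implemented)
 * the bit is RAZ/WI, so the read-back tells us whether the FIQ
 * plumbing can deploy.
 */
writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, dist_base + GIC_DIST_CTRL);
if (readl_relaxed(dist_base + GIC_DIST_CTRL) & GICD_ENABLE_GRP1)
	pr_info("GIC: grouping available, IPIs in SMP_IPI_FIQ_MASK use FIQ\n");
else
	pr_info("GIC: no grouping, all IPIs delivered as IRQ\n");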
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Jason Cooper jason@lakedaemon.net
Cc: Russell King linux@arm.linux.org.uk
Cc: Marc Zyngier marc.zyngier@arm.com
Tested-by: Jon Medhurst tixy@linaro.org
---
 arch/arm/kernel/traps.c         |   5 +-
 drivers/irqchip/irq-gic.c       | 151 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 +++
 3 files changed, 153 insertions(+), 11 deletions(-)
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 788e23fe64d8..b35e220ae1b1 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>

 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)

 	nmi_enter();

-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif

 	nmi_exit();
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index c172176499f6..c4f4a8827ed8 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>

 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -48,6 +49,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"

+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -65,6 +70,7 @@ struct gic_chip_data {
 #endif
 	struct irq_domain *domain;
 	unsigned int gic_irqs;
+	u32 igroup0_shadow;
 #ifdef CONFIG_GIC_NON_BANKED
 	void __iomem *(*get_base)(union gic_base *);
 #endif
@@ -348,6 +354,83 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake	= gic_set_wake,
 };

+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq,
+			      int group)
+{
+	void __iomem *base = gic_data_dist_base(gic);
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have
+	 * the EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	if (grp_reg == 0)
+		gic->igroup0_shadow = grp_val;
+
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -379,15 +462,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;

 	/*
-	 * Preserve bypass disable bits to be written back later
-	 */
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;

-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }

@@ -411,7 +503,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)

 	gic_dist_config(base, gic_irqs, NULL);

-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }

 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -420,6 +528,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;

 	/*
	 * Get what the GIC says our CPU mask is.
@@ -438,6 +547,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)

 	gic_cpu_config(dist_base, NULL);

+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		gic->igroup0_shadow = ~secure_irqs;
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(gic, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -527,7 +650,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);

-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }

 static void gic_cpu_save(unsigned int gic_nr)
@@ -655,6 +779,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long map = 0;
+	unsigned long softint;

 	bl_migration_lock();

@@ -668,8 +793,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 */
 	dmb(ishst);

-	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	/* We avoid a readl here by using the shadow copy of IGROUP[0] */
+	softint = map << 16 | irq;
+	if (gic_data[0].igroup0_shadow & BIT(irq))
+		softint |= 0x8000;
+
+	/* This always happens on GIC0 */
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

 	bl_migration_unlock();
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 71d706d5f169..7690f70049a3 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc

 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023
@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20

 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff
@@ -121,5 +126,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
Duplicate the x86 code to trigger a backtrace using an NMI and hook it up to IPI on ARM. Where it is possible for the hardware to do so, the IPI will be delivered at FIQ level.
Also provided are a few small items of plumbing to hook up the new code.
Note that the code copied from x86 has been deliberately modified as little as possible (to make extracting out the common code easier in future).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Steven Rostedt rostedt@goodmis.org
---
 arch/arm/include/asm/hardirq.h |   2 +-
 arch/arm/include/asm/irq.h     |   5 ++
 arch/arm/include/asm/smp.h     |   3 +
 arch/arm/kernel/smp.c          | 151 +++++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/traps.c        |   3 +
 5 files changed, 163 insertions(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/hardirq.h b/arch/arm/include/asm/hardirq.h
index fe3ea776dc34..5df33e30ae1b 100644
--- a/arch/arm/include/asm/hardirq.h
+++ b/arch/arm/include/asm/hardirq.h
@@ -5,7 +5,7 @@
 #include <linux/threads.h>
 #include <asm/irq.h>

-#define NR_IPI	8
+#define NR_IPI	9

 typedef struct {
 	unsigned int __softirq_pending;
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 53c15dec7af6..be1d07d59ee9 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 #endif

+#ifdef CONFIG_SMP
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+#endif
+
 #endif

 #endif
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 18f5a554134f..b076584ac0fa 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -18,6 +18,8 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif

+#define SMP_IPI_FIQ_MASK 0x0100
+
 #define raw_smp_processor_id() (current_thread_info()->cpu)

 struct seq_file;
@@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);

+extern void ipi_cpu_backtrace(struct pt_regs *regs);
 extern int register_ipi_completion(struct completion *completion, int cpu);

 struct smp_operations {
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 5e6052e18850..12667eb68198 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -26,6 +26,7 @@
 #include <linux/completion.h>
 #include <linux/cpufreq.h>
 #include <linux/irq_work.h>
+#include <linux/seq_buf.h>

 #include <linux/atomic.h>
 #include <asm/smp.h>
@@ -72,6 +73,7 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_IRQ_WORK,
 	IPI_COMPLETION,
+	IPI_CPU_BACKTRACE,
 };

 static DECLARE_COMPLETION(cpu_running);
@@ -444,6 +446,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
 	S(IPI_CPU_STOP, "CPU stop interrupts"),
 	S(IPI_IRQ_WORK, "IRQ work interrupts"),
 	S(IPI_COMPLETION, "completion interrupts"),
+	S(IPI_CPU_BACKTRACE, "backtrace interrupts"),
 };

 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
@@ -558,6 +561,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 	unsigned int cpu = smp_processor_id();
 	struct pt_regs *old_regs = set_irq_regs(regs);

+	BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE));
+
 	if ((unsigned)ipinr < NR_IPI) {
 		trace_ipi_entry(ipi_types[ipinr]);
 		__inc_irq_stat(cpu, ipi_irqs[ipinr]);
@@ -611,6 +616,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		irq_exit();
 		break;

+	case IPI_CPU_BACKTRACE:
+		irq_enter();
+		ipi_cpu_backtrace(regs);
+		irq_exit();
+		break;
+
 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n",
 		        cpu, ipinr);
@@ -705,3 +716,143 @@ static int __init register_cpufreq_notifier(void)
 core_initcall(register_cpufreq_notifier);
 #endif
+
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+static cpumask_t printtrace_mask;
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
+
+/* Safe printing in NMI context */
+static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+
+/* "in progress" flag of arch_trigger_all_cpu_backtrace */
+static unsigned long backtrace_flag;
+
+/*
+ * It is not safe to call printk() directly from NMI handlers.
+ * It may be fine if the NMI detected a lock up and we have no choice
+ * but to do so, but doing a NMI on all other CPUs to get a back trace
+ * can be done with a sysrq-l. We don't want that to lock up, which
+ * can happen if the NMI interrupts a printk in progress.
+ *
+ * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
+ * the content into a per cpu seq_buf buffer. Then when the NMIs are
+ * all done, we can safely dump the contents of the seq_buf to a printk()
+ * from a non NMI context.
+ */
+static int nmi_vprintk(const char *fmt, va_list args)
+{
+	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	unsigned int len = seq_buf_used(&s->seq);
+
+	seq_buf_vprintf(&s->seq, fmt, args);
+	return seq_buf_used(&s->seq) - len;
+}
+
+void ipi_cpu_backtrace(struct pt_regs *regs)
+{
+	int cpu;
+
+	cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		printk_func_t printk_func_save = this_cpu_read(printk_func);
+
+		/* Replace printk to write into the NMI seq */
+		this_cpu_write(printk_func, nmi_vprintk);
+		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
+		show_regs(regs);
+		this_cpu_write(printk_func, printk_func_save);
+
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+}
+
+static void print_seq_line(struct nmi_seq_buf *s, int start, int end)
+{
+	const char *buf = s->buffer + start;
+
+	printk("%.*s", (end - start) + 1, buf);
+}
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	struct nmi_seq_buf *s;
+	int len;
+	int cpu;
+	int i;
+	int this_cpu = get_cpu();
+
+	if (test_and_set_bit(0, &backtrace_flag)) {
+		/*
+		 * If there is already a trigger_all_cpu_backtrace() in progress
+		 * (backtrace_flag == 1), don't output double cpu dump infos.
+		 */
+		put_cpu();
+		return;
+	}
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+	if (!include_self)
+		cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask));
+
+	cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask));
+	/*
+	 * Set up per_cpu seq_buf buffers that the NMIs running on the other
+	 * CPUs will write to.
+	 */
+	for_each_cpu(cpu, to_cpumask(backtrace_mask)) {
+		s = &per_cpu(nmi_print_seq, cpu);
+		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
+	}
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_info("Sending NMI to %s CPUs:\n",
+			(include_self ? "all" : "other"));
+		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
+	}
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+		mdelay(1);
+		touch_softlockup_watchdog();
+	}
+
+	/*
+	 * Now that all the NMIs have triggered, we can dump out their
+	 * back traces safely to the console.
+	 */
+	for_each_cpu(cpu, &printtrace_mask) {
+		int last_i = 0;
+
+		s = &per_cpu(nmi_print_seq, cpu);
+		len = seq_buf_used(&s->seq);
+		if (!len)
+			continue;
+
+		/* Print line by line. */
+		for (i = 0; i < len; i++) {
+			if (s->buffer[i] == '\n') {
+				print_seq_line(s, last_i, i);
+				last_i = i + 1;
+			}
+		}
+		/* Check if there was a partial line. */
+		if (last_i < len) {
+			print_seq_line(s, last_i, len - 1);
+			pr_cont("\n");
+		}
+	}
+
+	clear_bit(0, &backtrace_flag);
+	smp_mb__after_atomic();
+	put_cpu();
+}
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index b35e220ae1b1..1836415b8a5c 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
 #ifdef CONFIG_ARM_GIC
 	gic_handle_fiq_ipi();
 #endif
+#ifdef CONFIG_SMP
+	ipi_cpu_backtrace(regs);
+#endif

 	nmi_exit();
On Mon, 5 Jan 2015 14:54:58 +0000 Daniel Thompson daniel.thompson@linaro.org wrote:
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+static cpumask_t printtrace_mask;
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
+
+/* Safe printing in NMI context */
+static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+
+/* "in progress" flag of arch_trigger_all_cpu_backtrace */
+static unsigned long backtrace_flag;
+
+/*
+ * It is not safe to call printk() directly from NMI handlers.
+ * It may be fine if the NMI detected a lock up and we have no choice
+ * but to do so, but doing a NMI on all other CPUs to get a back trace
+ * can be done with a sysrq-l. We don't want that to lock up, which
+ * can happen if the NMI interrupts a printk in progress.
+ *
+ * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
+ * the content into a per cpu seq_buf buffer. Then when the NMIs are
+ * all done, we can safely dump the contents of the seq_buf to a printk()
+ * from a non NMI context.
+ */
+static int nmi_vprintk(const char *fmt, va_list args)
+{
+	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	unsigned int len = seq_buf_used(&s->seq);
+
+	seq_buf_vprintf(&s->seq, fmt, args);
+	return seq_buf_used(&s->seq) - len;
+}
This is the same code as in x86. I wonder if we should move the duplicate code into kernel/printk/ and have it compiled if the arch requests it (CONFIG_ARCH_WANT_NMI_PRINTK or something). That way we don't have 20 copies of the same nmi_vprintk() and later find that we need to change it, and have to change it in 20 different archs.
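As a rough sketch of the direction being suggested (the option name CONFIG_ARCH_WANT_NMI_PRINTK is the suggestion above, and the placement in kernel/printk/printk.c is illustrative, not a settled design):

/*
 * Hypothetical shared implementation in kernel/printk/printk.c,
 * compiled only when an arch opts in. The body is the nmi_vprintk()
 * quoted from the patch; only the guard and the export are new.
 */
#ifdef CONFIG_ARCH_WANT_NMI_PRINTK
static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);

int nmi_vprintk(const char *fmt, va_list args)
{
	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
	unsigned int len = seq_buf_used(&s->seq);

	seq_buf_vprintf(&s->seq, fmt, args);
	return seq_buf_used(&s->seq) - len;
}
#endif /* CONFIG_ARCH_WANT_NMI_PRINTK */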
-- Steve
On 05/01/15 15:19, Steven Rostedt wrote:
On Mon, 5 Jan 2015 14:54:58 +0000 Daniel Thompson daniel.thompson@linaro.org wrote:
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+static cpumask_t printtrace_mask;
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
+
+/* Safe printing in NMI context */
+static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+
+/* "in progress" flag of arch_trigger_all_cpu_backtrace */
+static unsigned long backtrace_flag;
+
+/*
+ * It is not safe to call printk() directly from NMI handlers.
+ * It may be fine if the NMI detected a lock up and we have no choice
+ * but to do so, but doing a NMI on all other CPUs to get a back trace
+ * can be done with a sysrq-l. We don't want that to lock up, which
+ * can happen if the NMI interrupts a printk in progress.
+ *
+ * Instead, we redirect the vprintk() to this nmi_vprintk() that writes
+ * the content into a per cpu seq_buf buffer. Then when the NMIs are
+ * all done, we can safely dump the contents of the seq_buf to a printk()
+ * from a non NMI context.
+ */
+static int nmi_vprintk(const char *fmt, va_list args)
+{
+	struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
+	unsigned int len = seq_buf_used(&s->seq);
+
+	seq_buf_vprintf(&s->seq, fmt, args);
+	return seq_buf_used(&s->seq) - len;
+}
This is the same code as in x86. I wonder if we should move the duplicate code into kernel/printk/ and have it compiled if the arch requests it (CONFIG_ARCH_WANT_NMI_PRINTK or something). That way we don't have 20 copies of the same nmi_vprintk() and later find that we need to change it, and have to change it in 20 different archs.
Sounds like a good idea. I'll take a look at this.
Daniel.
On Mon, Jan 05, 2015 at 10:19:25AM -0500, Steven Rostedt wrote:
On Mon, 5 Jan 2015 14:54:58 +0000 Daniel Thompson daniel.thompson@linaro.org wrote:
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+static cpumask_t printtrace_mask;
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
Am I missing something or does this limit us to 4096 characters of backtrace output per CPU?
This is the same code as in x86. I wonder if we should move the duplicate code into kernel/printk/ and have it compiled if the arch requests it (CONFIG_ARCH_WANT_NMI_PRINTK or something). That way we don't have 20 copies of the same nmi_vprintk() and later find that we need to change it, and have to change it in 20 different archs.
Agreed, though I wonder about the buffer size.
On Fri, 9 Jan 2015 16:48:01 +0000 Russell King - ARM Linux linux@arm.linux.org.uk wrote:
On Mon, Jan 05, 2015 at 10:19:25AM -0500, Steven Rostedt wrote:
On Mon, 5 Jan 2015 14:54:58 +0000 Daniel Thompson daniel.thompson@linaro.org wrote:
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+static cpumask_t printtrace_mask;
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
Am I missing something or does this limit us to 4096 characters of backtrace output per CPU?
This is the same code as in x86. I wonder if we should move the duplicate code into kernel/printk/ and have it compiled if the arch requests it (CONFIG_ARCH_WANT_NMI_PRINTK or something). That way we don't have 20 copies of the same nmi_vprintk() and later find that we need to change it, and have to change it in 20 different archs.
Agreed, though I wonder about the buffer size.
Have we had kernel back traces bigger than that? Since the stack size is limited to page size, it would seem dangerous if backtraces filled up a page size itself, as most function frames are bigger than the typical 60 bytes of data per line.
We could change that hard coded 4096 to PAGE_SIZE, for those archs with bigger pages.
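Concretely, the suggestion amounts to a one-line change (illustrative only; the buffer would then cost exactly one page per CPU, identical to today on 4 KiB-page arm but larger on 64 KiB-page configurations):

/* Illustrative one-liner: size the per-cpu buffer to the arch page size. */
#define NMI_BUF_SIZE	PAGE_SIZE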
Also, if the backtrace were to fill up that much: most of the pertinent data from a back trace is at the beginning of the trace. Seldom do we care about the top most callers (bottom of the output).
-- Steve
On 11/01/15 23:37, Steven Rostedt wrote:
On Fri, 9 Jan 2015 16:48:01 +0000 Russell King - ARM Linux linux@arm.linux.org.uk wrote:
On Mon, Jan 05, 2015 at 10:19:25AM -0500, Steven Rostedt wrote:
On Mon, 5 Jan 2015 14:54:58 +0000 Daniel Thompson daniel.thompson@linaro.org wrote:
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+static cpumask_t printtrace_mask;
+
+#define NMI_BUF_SIZE		4096
+
+struct nmi_seq_buf {
+	unsigned char		buffer[NMI_BUF_SIZE];
+	struct seq_buf		seq;
+};
Am I missing something or does this limit us to 4096 characters of backtrace output per CPU?
This is the same code as in x86. I wonder if we should move the duplicate code into kernel/printk/ and have it compiled if the arch requests it (CONFIG_ARCH_WANT_NMI_PRINTK or something). That way we don't have 20 copies of the same nmi_vprintk() and later find that we need to change it, and have to change it in 20 different archs.
Agreed, though I wonder about the buffer size.
Have we had kernel back traces bigger than that? Since the stack size is limited to page size, it would seem dangerous if backtraces filled up a page size itself, as most function frames are bigger than the typical 60 bytes of data per line.
We could change that hard coded 4096 to PAGE_SIZE, for those archs with bigger pages.
I've just updated the patchset with a couple of patches to common up the printk code between arm and x86.
Just for the record I haven't changed the hard coded 4096 as part of this. I'd be quite happy to but I didn't want to introduce any "secret" changes to the code whilst the patch header claims I am just copying stuff.
Daniel.
Also, if the backtrace were to fill up that much: most of the pertinent data from a back trace is at the beginning of the trace. Seldom do we care about the top most callers (bottom of the output).
-- Steve
On Tue, 13 Jan 2015 10:36:29 +0000 Daniel Thompson daniel.thompson@linaro.org wrote:
We could change that hard coded 4096 to PAGE_SIZE, for those archs with bigger pages.
I've just updated the patchset with a couple of patches to common up the printk code between arm and x86.
Just for the record I haven't changed the hard coded 4096 as part of this. I'd be quite happy to but I didn't want to introduce any "secret" changes to the code whilst the patch header claims I am just copying stuff.
Adding a separate patch would be fine by me.
-- Steve
Currently, if arch_trigger_all_cpu_backtrace() is called with interrupts disabled and on a platform that delivers IPI_CPU_BACKTRACE using regular IRQ requests, the system will wedge for ten seconds waiting for the current CPU to react to a masked interrupt.
This patch resolves this issue by calling directly into the backtrace dump code instead of generating an IPI.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Steven Rostedt rostedt@goodmis.org
---
 arch/arm/kernel/smp.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 12667eb68198..644f654f7a7e 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -767,7 +767,10 @@ void ipi_cpu_backtrace(struct pt_regs *regs)
 		/* Replace printk to write into the NMI seq */
 		this_cpu_write(printk_func, nmi_vprintk);
 		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
-		show_regs(regs);
+		if (regs != NULL)
+			show_regs(regs);
+		else
+			dump_stack();
 		this_cpu_write(printk_func, printk_func_save);

 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
@@ -812,6 +815,16 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 		seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE);
 	}

+	/*
+	 * If irqs are disabled on the current processor then, if
+	 * IPI_CPU_BACKTRACE is delivered using IRQ, we won't be able to
+	 * react to IPI_CPU_BACKTRACE until we leave this function. We avoid
+	 * the potential timeout (not to mention the failure to print useful
+	 * information) by calling the backtrace directly.
+	 */
+	if (include_self && irqs_disabled())
+		ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL);
+
 	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
 		pr_info("Sending NMI to %s CPUs:\n",
 			(include_self ? "all" : "other"));
Hi Thomas, Hi Jason: Patches 1 to 3 are for you (and should be separable from the rest of the series). The patches haven't changed since the last time I posted them. The changes in v14 tidy up the later part of the patch set in order to share more code between x86 and arm.
This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
On platforms not capable of supporting FIQ, the signal to generate a backtrace falls back to using IRQ for propagation instead (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that are not (vexpress-a9 and Qualcomm Snapdragon 600).
v14:
* Moved nmi_vprintk() and friends from arch/x86/kernel/apic/hw_nmi.c to printk.c (Steven Rostedt).
v13:
* Updated the code to print the backtrace to replicate Steven Rostedt's x86 work to make SysRq-l safe. This is pretty much a total rewrite of patches 4 and 5.
v12:
* Squash first two patches into a single one and re-describe (Thomas Gleixner).
* Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe" (Thomas Gleixner).
v11:
* Optimized gic_raise_softirq() by replacing a register read with a memory read (Jason Cooper).
v10:
* Add a further patch to optimize away some of the locking on systems where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with exynos_defconfig (which is the only defconfig to set this option).
* Whitespace fixes in patch 4. That patch previously used spaces for alignment of new constants but the rest of the file used tabs.
v9:
* Improved documentation and structure of initial patch (now initial two patches) to make gic_raise_softirq() safe to call from FIQ (Thomas Gleixner).
* Avoid masking interrupts during gic_raise_softirq(). The use of the read lock makes this redundant (because we can safely re-enter the function).
v8:
* Fixed build on arm64 caused by a spurious include file in irq-gic.c.
v7-2 (accidentally released twice with same number):
* Fixed boot regression on vexpress-a9 (reported by Russell King).
* Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
* Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
v7:
* Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested offending patch in isolation using defconfig identified by the autobuilder.
v6:
* Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
* Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
* Modified usr_entry to optionally avoid calling into the trace code and used this in FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
* Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
* Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
* Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
v5:
* Rebased on 3.17-rc4.
* Removed a spurious line from the final "glue it together" patch that broke the build.
v4:
* Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
* Really fix bad pt_regs pointer generation in __fiq_abt.
* Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King)
* Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
v3:
* Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
* Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
v2:
* Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
* Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre)
* Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
* Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
* Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
* This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
* Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
* Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
* Add arm64 version of fiq.h (review of Russell King)
* Removed conditional branching and code from irq-gic.c, this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King)
Daniel Thompson (7):
  irqchip: gic: Optimize locking in gic_raise_softirq
  irqchip: gic: Make gic_raise_softirq FIQ-safe
  irqchip: gic: Introduce plumbing for IPI FIQ
  printk: Simple implementation for NMI backtracing
  x86/nmi: Use common printk functions
  ARM: Add support for on-demand backtrace of other CPUs
  ARM: Fix on-demand backtrace triggered by IRQ

 arch/Kconfig                    |   3 +
 arch/arm/Kconfig                |   1 +
 arch/arm/include/asm/hardirq.h  |   2 +-
 arch/arm/include/asm/irq.h      |   5 +
 arch/arm/include/asm/smp.h      |   3 +
 arch/arm/kernel/smp.c           |  84 +++++++++++++++++
 arch/arm/kernel/traps.c         |   8 +-
 arch/x86/Kconfig                |   1 +
 arch/x86/kernel/apic/hw_nmi.c   |  94 ++-----------------
 drivers/irqchip/irq-gic.c       | 203 +++++++++++++++++++++++++++++++++++++---
 include/linux/irqchip/arm-gic.h |   8 ++
 include/linux/printk.h          |  22 +++++
 kernel/printk/printk.c          | 122 ++++++++++++++++++++++++
 13 files changed, 452 insertions(+), 104 deletions(-)
-- 1.9.3
Currently gic_raise_softirq() is locked using the irq_controller_lock. This lock is primarily used to make register read-modify-write sequences atomic, but gic_raise_softirq() uses it instead to ensure that the big.LITTLE migration logic can figure out when it is safe to migrate interrupts between physical cores.
This is sub-optimal in two closely related ways:
1. No locking at all is required on systems where the b.L switcher is not configured.
2. Finer grain locking can be used on systems where the b.L switcher is present.
This patch resolves both of the above by introducing a separate finer grain lock and providing conditionally compiled inlines to lock/unlock it.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Jason Cooper jason@lakedaemon.net
Cc: Russell King linux@arm.linux.org.uk
Cc: Marc Zyngier marc.zyngier@arm.com
---
 drivers/irqchip/irq-gic.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d617ee5a3d8a..a9ed64dcc84b 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -73,6 +73,27 @@ struct gic_chip_data {
 static DEFINE_RAW_SPINLOCK(irq_controller_lock);

 /*
+ * This lock is used by the big.LITTLE migration code to ensure no IPIs
+ * can be pended on the old core after the map has been updated.
+ */
+#ifdef CONFIG_BL_SWITCHER
+static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock);
+
+static inline void bl_migration_lock(unsigned long *flags)
+{
+	raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags);
+}
+
+static inline void bl_migration_unlock(unsigned long flags)
+{
+	raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags);
+}
+#else
+static inline void bl_migration_lock(unsigned long *flags) {}
+static inline void bl_migration_unlock(unsigned long flags) {}
+#endif
+
+/*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering. Let's use a mapping as returned
  * by the GIC itself.
@@ -624,7 +645,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	int cpu;
 	unsigned long flags, map = 0;

-	raw_spin_lock_irqsave(&irq_controller_lock, flags);
+	bl_migration_lock(&flags);

 	/* Convert our logical CPU mask into a physical one. */
 	for_each_cpu(cpu, mask)
@@ -639,7 +660,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	/* this always happens on GIC0 */
 	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

-	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
+	bl_migration_unlock(flags);
 }
 #endif

@@ -710,8 +731,17 @@ void gic_migrate_target(unsigned int new_cpu_id)

 	raw_spin_lock(&irq_controller_lock);

-	/* Update the target interface for this logical CPU */
+	/*
+	 * Update the target interface for this logical CPU
+	 *
+	 * From the point we release the cpu_map_migration_lock any new
+	 * SGIs will be pended on the new cpu which makes the set of SGIs
+	 * pending on the old cpu static. That means we can defer the
+	 * migration until after we have released the irq_controller_lock.
+	 */
+	raw_spin_lock(&cpu_map_migration_lock);
 	gic_cpu_map[cpu] = 1 << new_cpu_id;
+	raw_spin_unlock(&cpu_map_migration_lock);

 	/*
	 * Find all the peripheral interrupts targetting the current
It is currently possible for FIQ handlers to re-enter gic_raise_softirq() and lock up.
    gic_raise_softirq()
       lock(x);
-~-> FIQ
        handle_fiq()
           gic_raise_softirq()
              lock(x);		<-- Lockup
arch/arm/ uses IPIs to implement arch_irq_work_raise(), so this issue makes it difficult for FIQ handlers to safely defer work to less restrictive calling contexts.
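For reference, this is roughly how that path is reached: the arm implementation of arch_irq_work_raise() simply sends an IPI to the current CPU, so deferring work from a FIQ handler lands straight back in the interrupt controller driver. A sketch of the mainline arch/arm/kernel/smp.c code of this era:

/*
 * Sketch (mainline arch/arm/kernel/smp.c): raising irq_work from
 * FIQ context re-enters the GIC driver via the IPI path, which is
 * why gic_raise_softirq() must tolerate re-entry.
 */
#ifdef CONFIG_IRQ_WORK
void arch_irq_work_raise(void)
{
	if (arch_irq_work_has_interrupt())
		smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
}
#endif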
This patch fixes the problem by converting the cpu_map_migration_lock into an rwlock, making it safe to re-enter the function.
Note that having made it safe to re-enter gic_raise_softirq() we no longer need to mask interrupts during gic_raise_softirq() because the b.L migration is always performed from task context.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com --- drivers/irqchip/irq-gic.c | 38 +++++++++++++++++++++++++------------- 1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index a9ed64dcc84b..c172176499f6 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -75,22 +75,25 @@ static DEFINE_RAW_SPINLOCK(irq_controller_lock); /* * This lock is used by the big.LITTLE migration code to ensure no IPIs * can be pended on the old core after the map has been updated. + * + * This lock may be locked for reading from both IRQ and FIQ handlers + * and therefore must not be locked for writing when these are enabled. */ #ifdef CONFIG_BL_SWITCHER -static DEFINE_RAW_SPINLOCK(cpu_map_migration_lock); +static DEFINE_RWLOCK(cpu_map_migration_lock);
-static inline void bl_migration_lock(unsigned long *flags) +static inline void bl_migration_lock(void) { - raw_spin_lock_irqsave(&cpu_map_migration_lock, *flags); + read_lock(&cpu_map_migration_lock); }
-static inline void bl_migration_unlock(unsigned long flags) +static inline void bl_migration_unlock(void) { - raw_spin_unlock_irqrestore(&cpu_map_migration_lock, flags); + read_unlock(&cpu_map_migration_lock); } #else -static inline void bl_migration_lock(unsigned long *flags) {} -static inline void bl_migration_unlock(unsigned long flags) {} +static inline void bl_migration_lock(void) {} +static inline void bl_migration_unlock(void) {} #endif
/* @@ -640,12 +643,20 @@ static void __init gic_pm_init(struct gic_chip_data *gic) #endif
#ifdef CONFIG_SMP +/* + * Raise the specified IPI on all cpus set in mask. + * + * This function is safe to call from all calling contexts, including + * FIQ handlers. It relies on bl_migration_lock() being multiply acquirable + * to avoid deadlocks when the function is re-entered at different + * exception levels. + */ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; - unsigned long flags, map = 0; + unsigned long map = 0;
- bl_migration_lock(&flags); + bl_migration_lock();
/* Convert our logical CPU mask into a physical one. */ for_each_cpu(cpu, mask) @@ -660,7 +671,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) /* this always happens on GIC0 */ writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
- bl_migration_unlock(flags); + bl_migration_unlock(); } #endif
@@ -708,7 +719,8 @@ int gic_get_cpu_id(unsigned int cpu) * Migrate all peripheral interrupts with a target matching the current CPU * to the interface corresponding to @new_cpu_id. The CPU interface mapping * is also updated. Targets to other CPU interfaces are unchanged. - * This must be called with IRQs locally disabled. + * This must be called from a task context and with IRQ and FIQ locally + * disabled. */ void gic_migrate_target(unsigned int new_cpu_id) { @@ -739,9 +751,9 @@ void gic_migrate_target(unsigned int new_cpu_id) * pending on the old cpu static. That means we can defer the * migration until after we have released the irq_controller_lock. */ - raw_spin_lock(&cpu_map_migration_lock); + write_lock(&cpu_map_migration_lock); gic_cpu_map[cpu] = 1 << new_cpu_id; - raw_spin_unlock(&cpu_map_migration_lock); + write_unlock(&cpu_map_migration_lock);
/* * Find all the peripheral interrupts targetting the current
Currently it is not possible to exploit FIQ on systems with a GIC, even if the systems are otherwise capable of it. This patch makes it possible for IPIs to be delivered using FIQ.
To do so it modifies the register state so that normal interrupts are placed in group 1 and specific IPIs are placed into group 0. It also configures the controller to raise group 0 interrupts using the FIQ signal. It provides a means for architecture code to define which IPIs shall use FIQ and to acknowledge any IPIs that are raised.
All GIC hardware except GICv1 without TrustZone support provides a means to group exceptions into group 0 and group 1, but the hardware functionality is unavailable to the kernel when a secure monitor is present because access to the grouping registers is prohibited outside "secure world". However, when grouping is not available (or, in the case of early GICv1 implementations, is very hard to configure) the group-changing code is not deployed and all IPIs will be raised via IRQ.
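Put concretely, the driver needs no explicit feature flag: it writes the EnableGrp1 bit during distributor initialisation and later tests whether the bit stuck. A hypothetical helper capturing the idea (the patch below open-codes the equivalent test at each site; register names are those introduced by this patch):

/*
 * Probe for group 1 support by exploiting RAZ/WI behaviour: where
 * the grouping registers are reserved (GICv1 without TrustZone
 * support) or secured away from the kernel, the write is ignored
 * and the read-back reports the bit as zero.
 */
static bool gic_has_group1(void __iomem *dist_base)
{
	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
		       dist_base + GIC_DIST_CTRL);
	return !!(readl_relaxed(dist_base + GIC_DIST_CTRL) &
		  GICD_ENABLE_GRP1);
}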
It has been tested and shown working on two systems capable of supporting grouping (Freescale i.MX6 and STiH416). It has also been tested for boot regressions on two systems that do not support grouping (vexpress-a9 and Qualcomm Snapdragon 600).
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jason Cooper jason@lakedaemon.net Cc: Russell King linux@arm.linux.org.uk Cc: Marc Zyngier marc.zyngier@arm.com Tested-by: Jon Medhurst tixy@linaro.org --- arch/arm/kernel/traps.c | 5 +- drivers/irqchip/irq-gic.c | 151 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 +++ 3 files changed, 153 insertions(+), 11 deletions(-)
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 788e23fe64d8..b35e220ae1b1 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -26,6 +26,7 @@ #include <linux/init.h> #include <linux/sched.h> #include <linux/irq.h> +#include <linux/irqchip/arm-gic.h>
#include <linux/atomic.h> #include <asm/cacheflush.h> @@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)
nmi_enter();
- /* nop. FIQ handlers for special arch/arm features can be added here. */ +#ifdef CONFIG_ARM_GIC + gic_handle_fiq_ipi(); +#endif
nmi_exit();
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index c172176499f6..c4f4a8827ed8 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -39,6 +39,7 @@ #include <linux/slab.h> #include <linux/irqchip/chained_irq.h> #include <linux/irqchip/arm-gic.h> +#include <linux/ratelimit.h>
#include <asm/cputype.h> #include <asm/irq.h> @@ -48,6 +49,10 @@ #include "irq-gic-common.h" #include "irqchip.h"
+#ifndef SMP_IPI_FIQ_MASK +#define SMP_IPI_FIQ_MASK 0 +#endif + union gic_base { void __iomem *common_base; void __percpu * __iomem *percpu_base; @@ -65,6 +70,7 @@ struct gic_chip_data { #endif struct irq_domain *domain; unsigned int gic_irqs; + u32 igroup0_shadow; #ifdef CONFIG_GIC_NON_BANKED void __iomem *(*get_base)(union gic_base *); #endif @@ -348,6 +354,83 @@ static struct irq_chip gic_chip = { .irq_set_wake = gic_set_wake, };
+/* + * Shift an interrupt between Group 0 and Group 1. + * + * In addition to changing the group we also modify the priority to + * match what "ARM strongly recommends" for a system where no Group 1 + * interrupt must ever preempt a Group 0 interrupt. + * + * It is safe to call this function on systems which do not support + * grouping (it will have no effect). + */ +static void gic_set_group_irq(struct gic_chip_data *gic, unsigned int hwirq, + int group) +{ + void __iomem *base = gic_data_dist_base(gic); + unsigned int grp_reg = hwirq / 32 * 4; + u32 grp_mask = BIT(hwirq % 32); + u32 grp_val; + + unsigned int pri_reg = (hwirq / 4) * 4; + u32 pri_mask = BIT(7 + ((hwirq % 4) * 8)); + u32 pri_val; + + /* + * Systems which do not support grouping will not have + * the EnableGrp1 bit set. + */ + if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))) + return; + + raw_spin_lock(&irq_controller_lock); + + grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg); + pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg); + + if (group) { + grp_val |= grp_mask; + pri_val |= pri_mask; + } else { + grp_val &= ~grp_mask; + pri_val &= ~pri_mask; + } + + writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg); + if (grp_reg == 0) + gic->igroup0_shadow = grp_val; + + writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg); + + raw_spin_unlock(&irq_controller_lock); +} + + +/* + * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI, + * otherwise do nothing. + */ +void gic_handle_fiq_ipi(void) +{ + struct gic_chip_data *gic = &gic_data[0]; + void __iomem *cpu_base = gic_data_cpu_base(gic); + unsigned long irqstat, irqnr; + + if (WARN_ON(!in_nmi())) + return; + + while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) & + SMP_IPI_FIQ_MASK) { + irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK); + writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI); + + irqnr = irqstat & GICC_IAR_INT_ID_MASK; + WARN_RATELIMIT(irqnr > 16, + "Unexpected irqnr %lu (bad prioritization?)\n", + irqnr); + } +} + void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq) { if (gic_nr >= MAX_GIC_NR) @@ -379,15 +462,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic) static void gic_cpu_if_up(void) { void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]); - u32 bypass = 0; + void __iomem *dist_base = gic_data_dist_base(&gic_data[0]); + u32 ctrl = 0;
/* - * Preserve bypass disable bits to be written back later - */ - bypass = readl(cpu_base + GIC_CPU_CTRL); - bypass &= GICC_DIS_BYPASS_MASK; + * Preserve bypass disable bits to be written back later + */ + ctrl = readl(cpu_base + GIC_CPU_CTRL); + ctrl &= GICC_DIS_BYPASS_MASK;
- writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); + /* + * If EnableGrp1 is set in the distributor then enable group 1 + * support for this CPU (and route group 0 interrupts to FIQ). + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) + ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL | + GICC_ENABLE_GRP1; + + writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL); }
@@ -411,7 +503,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
gic_dist_config(base, gic_irqs, NULL);
- writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL); + /* + * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only, + * bit 1 ignored) depending on current mode. + */ + writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL); + + /* + * Set all global interrupts to be group 1 if (and only if) it + * is possible to enable group 1 interrupts. This register is RAZ/WI + * if not accessible or not implemented, however some GICv1 devices + * do not implement the EnableGrp1 bit making it unsafe to set + * this register unconditionally. + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)) + for (i = 32; i < gic_irqs; i += 32) + writel_relaxed(0xffffffff, + base + GIC_DIST_IGROUP + i * 4 / 32); }
static void gic_cpu_init(struct gic_chip_data *gic) @@ -420,6 +528,7 @@ static void gic_cpu_init(struct gic_chip_data *gic) void __iomem *base = gic_data_cpu_base(gic); unsigned int cpu_mask, cpu = smp_processor_id(); int i; + unsigned long secure_irqs, secure_irq;
/* * Get what the GIC says our CPU mask is. @@ -438,6 +547,20 @@ static void gic_cpu_init(struct gic_chip_data *gic)
gic_cpu_config(dist_base, NULL);
+ /* + * If the distributor is configured to support interrupt grouping + * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK + * to be group1 and ensure any remaining group 0 interrupts have + * the right priority. + */ + if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) { + secure_irqs = SMP_IPI_FIQ_MASK; + writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0); + gic->igroup0_shadow = ~secure_irqs; + for_each_set_bit(secure_irq, &secure_irqs, 16) + gic_set_group_irq(gic, secure_irq, 0); + } + writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK); gic_cpu_if_up(); } @@ -527,7 +650,8 @@ static void gic_dist_restore(unsigned int gic_nr) writel_relaxed(gic_data[gic_nr].saved_spi_enable[i], dist_base + GIC_DIST_ENABLE_SET + i * 4);
- writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL); + writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, + dist_base + GIC_DIST_CTRL); }
static void gic_cpu_save(unsigned int gic_nr) @@ -655,6 +779,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) { int cpu; unsigned long map = 0; + unsigned long softint;
bl_migration_lock();
@@ -668,8 +793,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq) */ dmb(ishst);
- /* this always happens on GIC0 */ - writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT); + /* We avoid a readl here by using the shadow copy of IGROUP[0] */ + softint = map << 16 | irq; + if (gic_data[0].igroup0_shadow & BIT(irq)) + softint |= 0x8000; + + /* This always happens on GIC0 */ + writel_relaxed(softint, + gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
bl_migration_unlock(); } diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h index 71d706d5f169..7690f70049a3 100644 --- a/include/linux/irqchip/arm-gic.h +++ b/include/linux/irqchip/arm-gic.h @@ -22,6 +22,10 @@ #define GIC_CPU_IDENT 0xfc
#define GICC_ENABLE 0x1 +#define GICC_ENABLE_GRP1 0x2 +#define GICC_ACK_CTL 0x4 +#define GICC_FIQ_EN 0x8 +#define GICC_COMMON_BPR 0x10 #define GICC_INT_PRI_THRESHOLD 0xf0 #define GICC_IAR_INT_ID_MASK 0x3ff #define GICC_INT_SPURIOUS 1023 @@ -44,6 +48,7 @@ #define GIC_DIST_SGI_PENDING_SET 0xf20
#define GICD_ENABLE 0x1 +#define GICD_ENABLE_GRP1 0x2 #define GICD_DISABLE 0x0 #define GICD_INT_ACTLOW_LVLTRIG 0x0 #define GICD_INT_EN_CLR_X32 0xffffffff @@ -121,5 +126,8 @@ static inline void __init register_routable_domain_ops { gic_routable_irq_domain_ops = ops; } + +void gic_handle_fiq_ipi(void); + #endif /* __ASSEMBLY */ #endif
Currently there is quite a pile of code sitting in arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI. The code is inaccessible to backtrace implementations for other architectures, which is a shame because they would probably like to be safe too.
Copy this code into printk. We'll port the x86 NMI backtrace to it in a later patch.
Incidentally, technically I think it might be safe to call prepare_nmi_printk() from NMI, provided care is taken to honour the return code. complete_nmi_printk() cannot be called from NMI but could be scheduled using irq_work_queue(). However, honouring the return code means it is sometimes impossible to get the message out, so I'd say using this code in such a way should probably attract sympathy and/or derision rather than admiration.
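Were someone determined to try it anyway, the deferral might look roughly like the sketch below. The helper names and the irq_work plumbing are illustrative only, not part of this patch. Note also that on arm queuing irq_work itself raises an IPI through gic_raise_softirq(), which is FIQ-safe only once the earlier patches in this series are applied.

#include <linux/cpumask.h>
#include <linux/irq_work.h>
#include <linux/printk.h>

static struct cpumask nmi_printk_cpus;
static struct irq_work nmi_printk_flush_work;

/* Runs later, in IRQ context, where printk() is safe again. */
static void nmi_printk_flush(struct irq_work *work)
{
	complete_nmi_printk(&nmi_printk_cpus);
}

/* Call once from init code. */
static void nmi_printk_deferral_init(void)
{
	init_irq_work(&nmi_printk_flush_work, nmi_printk_flush);
}

/*
 * Called from the NMI handler: honour the return code! If the
 * buffers are already claimed the message must be dropped.
 */
static bool nmi_printk_try_begin(void)
{
	if (prepare_nmi_printk(&nmi_printk_cpus) != 0)
		return false;

	irq_work_queue(&nmi_printk_flush_work);
	return true;
}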
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Steven Rostedt rostedt@goodmis.org --- arch/Kconfig | 3 ++ include/linux/printk.h | 22 +++++++++ kernel/printk/printk.c | 122 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 147 insertions(+)
diff --git a/arch/Kconfig b/arch/Kconfig index 05d7a8a458d5..50c9412a77d0 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -309,6 +309,9 @@ config ARCH_WANT_OLD_COMPAT_IPC select ARCH_WANT_COMPAT_IPC_PARSE_VERSION bool
+config ARCH_WANT_NMI_PRINTK + bool + config HAVE_ARCH_SECCOMP_FILTER bool help diff --git a/include/linux/printk.h b/include/linux/printk.h index c8f170324e64..539ea5a8f219 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -219,6 +219,28 @@ static inline void show_regs_print_info(const char *log_lvl) } #endif
+#ifdef CONFIG_ARCH_WANT_NMI_PRINTK +extern __printf(1, 0) int nmi_vprintk(const char *fmt, va_list args); + +struct cpumask; +extern int prepare_nmi_printk(struct cpumask *cpus); +extern void complete_nmi_printk(struct cpumask *cpus); + +/* + * Replace printk to write into the NMI seq. + * + * To avoid include hell this is a macro rather than an inline function + * (printk_func is not declared in this header file). + */ +#define this_cpu_begin_nmi_printk() ({ \ + printk_func_t orig = this_cpu_read(printk_func); \ + this_cpu_write(printk_func, nmi_vprintk); \ + orig; \ +}) +#define this_cpu_end_nmi_printk(fn) this_cpu_write(printk_func, fn) + +#endif + extern asmlinkage void dump_stack(void) __cold;
#ifndef pr_fmt diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 02d6b6d28796..774119e27e0b 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1805,6 +1805,127 @@ asmlinkage int printk_emit(int facility, int level, } EXPORT_SYMBOL(printk_emit);
+#ifdef CONFIG_ARCH_WANT_NMI_PRINTK + +#define NMI_BUF_SIZE 4096 + +struct nmi_seq_buf { + unsigned char buffer[NMI_BUF_SIZE]; + struct seq_buf seq; +}; + +/* Safe printing in NMI context */ +static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq); + +/* "in progress" flag of NMI printing */ +static unsigned long nmi_print_flag; + +/* + * It is not safe to call printk() directly from NMI handlers. + * It may be fine if the NMI detected a lock up and we have no choice + * but to do so, but doing a NMI on all other CPUs to get a back trace + * can be done with a sysrq-l. We don't want that to lock up, which + * can happen if the NMI interrupts a printk in progress. + * + * Instead, we redirect the vprintk() to this nmi_vprintk() that writes + * the content into a per cpu seq_buf buffer. Then when the NMIs are + * all done, we can safely dump the contents of the seq_buf to a printk() + * from a non NMI context. + * + * This is not a generic printk() implementation and must be used with + * great care. In particular there is a static limit on the quantity of + * data that may be emitted during NMI, only one client can be active at + * one time (arbitrated by the return value of begin_nmi_printk() and + * it is required that something at task or interrupt context be scheduled + * to issue the output. + */ +int nmi_vprintk(const char *fmt, va_list args) +{ + struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq); + unsigned int len = seq_buf_used(&s->seq); + + seq_buf_vprintf(&s->seq, fmt, args); + return seq_buf_used(&s->seq) - len; +} +EXPORT_SYMBOL_GPL(nmi_vprintk); + +/* + * Check for concurrent usage and set up per_cpu seq_buf buffers that the NMIs + * running on the other CPUs will write to. Provides the mask of CPUs it is + * safe to write from (i.e. a copy of the online mask). + */ +int prepare_nmi_printk(struct cpumask *cpus) +{ + struct nmi_seq_buf *s; + int cpu; + + if (test_and_set_bit(0, &nmi_print_flag)) { + /* + * If something is already using the NMI print facility we + * can't allow a second one... + */ + return -EBUSY; + } + + cpumask_copy(cpus, cpu_online_mask); + + for_each_cpu(cpu, cpus) { + s = &per_cpu(nmi_print_seq, cpu); + seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE); + } + + return 0; +} +EXPORT_SYMBOL_GPL(prepare_nmi_printk); + +static void print_seq_line(struct nmi_seq_buf *s, int start, int end) +{ + const char *buf = s->buffer + start; + + printk("%.*s", (end - start) + 1, buf); +} + +void complete_nmi_printk(struct cpumask *cpus) +{ + struct nmi_seq_buf *s; + int len; + int cpu; + int i; + + /* + * Now that all the NMIs have triggered, we can dump out their + * back traces safely to the console. + */ + for_each_cpu(cpu, cpus) { + int last_i = 0; + + s = &per_cpu(nmi_print_seq, cpu); + + len = seq_buf_used(&s->seq); + if (!len) + continue; + + /* Print line by line. */ + for (i = 0; i < len; i++) { + if (s->buffer[i] == '\n') { + print_seq_line(s, last_i, i); + last_i = i + 1; + } + } + /* Check if there was a partial line. */ + if (last_i < len) { + print_seq_line(s, last_i, len - 1); + pr_cont("\n"); + } + } + + clear_bit(0, &nmi_print_flag); + smp_mb__after_atomic(); +} +EXPORT_SYMBOL_GPL(complete_nmi_printk); + +#endif /* CONFIG_ARCH_WANT_NMI_PRINTK */ + int vprintk_default(const char *fmt, va_list args) { int r; @@ -1829,6 +1950,7 @@ EXPORT_SYMBOL_GPL(vprintk_default); */ DEFINE_PER_CPU(printk_func_t, printk_func) = vprintk_default;
+ /** * printk - print a kernel message * @fmt: format string
Much of the code sitting in arch/x86/kernel/apic/hw_nmi.c to support safe all-cpu backtracing from NMI has been copied to printk.c to make it accessible to other architectures.
Port the x86 NMI backtrace to the generic code.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Steven Rostedt rostedt@goodmis.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: x86@kernel.org --- arch/x86/Kconfig | 1 + arch/x86/kernel/apic/hw_nmi.c | 94 ++++--------------------------------------- 2 files changed, 8 insertions(+), 87 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index ba397bde7948..f36d3058968e 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -138,6 +138,7 @@ config X86 select HAVE_ACPI_APEI_NMI if ACPI select ACPI_LEGACY_TABLES_LOOKUP if ACPI select X86_FEATURE_NAMES if PROC_FS + select ARCH_WANT_NMI_PRINTK if X86_LOCAL_APIC
config INSTRUCTION_DECODER def_bool y diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c index 6873ab925d00..7d4a5c8ac510 100644 --- a/arch/x86/kernel/apic/hw_nmi.c +++ b/arch/x86/kernel/apic/hw_nmi.c @@ -32,26 +32,6 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh) static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly; static cpumask_t printtrace_mask;
-#define NMI_BUF_SIZE 4096 - -struct nmi_seq_buf { - unsigned char buffer[NMI_BUF_SIZE]; - struct seq_buf seq; -}; - -/* Safe printing in NMI context */ -static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq); - -/* "in progress" flag of arch_trigger_all_cpu_backtrace */ -static unsigned long backtrace_flag; - -static void print_seq_line(struct nmi_seq_buf *s, int start, int end) -{ - const char *buf = s->buffer + start; - - printk("%.*s", (end - start) + 1, buf); -} - void arch_trigger_all_cpu_backtrace(bool include_self) { struct nmi_seq_buf *s; @@ -60,28 +40,18 @@ void arch_trigger_all_cpu_backtrace(bool include_self) int i; int this_cpu = get_cpu();
- if (test_and_set_bit(0, &backtrace_flag)) { + if (0 != prepare_nmi_printk(to_cpumask(backtrace_mask))) { /* - * If there is already a trigger_all_cpu_backtrace() in progress - * (backtrace_flag == 1), don't output double cpu dump infos. + * If there is already an nmi printk sequence in + * progress then just give up... */ put_cpu(); return; }
- cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask); if (!include_self) cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask)); - cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask)); - /* - * Set up per_cpu seq_buf buffers that the NMIs running on the other - * CPUs will write to. - */ - for_each_cpu(cpu, to_cpumask(backtrace_mask)) { - s = &per_cpu(nmi_print_seq, cpu); - seq_buf_init(&s->seq, s->buffer, NMI_BUF_SIZE); - }
if (!cpumask_empty(to_cpumask(backtrace_mask))) { pr_info("sending NMI to %s CPUs:\n", @@ -97,73 +67,23 @@ void arch_trigger_all_cpu_backtrace(bool include_self) touch_softlockup_watchdog(); }
- /* - * Now that all the NMIs have triggered, we can dump out their - * back traces safely to the console. - */ - for_each_cpu(cpu, &printtrace_mask) { - int last_i = 0; - - s = &per_cpu(nmi_print_seq, cpu); - len = seq_buf_used(&s->seq); - if (!len) - continue; - - /* Print line by line. */ - for (i = 0; i < len; i++) { - if (s->buffer[i] == '\n') { - print_seq_line(s, last_i, i); - last_i = i + 1; - } - } - /* Check if there was a partial line. */ - if (last_i < len) { - print_seq_line(s, last_i, len - 1); - pr_cont("\n"); - } - } - - clear_bit(0, &backtrace_flag); - smp_mb__after_atomic(); + complete_nmi_printk(&printtrace_mask); put_cpu(); }
-/* - * It is not safe to call printk() directly from NMI handlers. - * It may be fine if the NMI detected a lock up and we have no choice - * but to do so, but doing a NMI on all other CPUs to get a back trace - * can be done with a sysrq-l. We don't want that to lock up, which - * can happen if the NMI interrupts a printk in progress. - * - * Instead, we redirect the vprintk() to this nmi_vprintk() that writes - * the content into a per cpu seq_buf buffer. Then when the NMIs are - * all done, we can safely dump the contents of the seq_buf to a printk() - * from a non NMI context. - */ -static int nmi_vprintk(const char *fmt, va_list args) -{ - struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq); - unsigned int len = seq_buf_used(&s->seq); - - seq_buf_vprintf(&s->seq, fmt, args); - return seq_buf_used(&s->seq) - len; -} - static int arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs) { int cpu; + printk_func_t orig;
cpu = smp_processor_id();
if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) { - printk_func_t printk_func_save = this_cpu_read(printk_func); - - /* Replace printk to write into the NMI seq */ - this_cpu_write(printk_func, nmi_vprintk); + orig = this_cpu_begin_nmi_printk(); printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu); show_regs(regs); - this_cpu_write(printk_func, printk_func_save); + this_cpu_end_nmi_printk(orig);
cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); return NMI_HANDLED;
Duplicate the x86 code to trigger a backtrace using an NMI and hook it up to IPI on ARM. Where it is possible for the hardware to do so the IPI will be delivered at FIQ level.
Also provided are a few small items of plumbing to hook up the new code.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Steven Rostedt rostedt@goodmis.org --- arch/arm/Kconfig | 1 + arch/arm/include/asm/hardirq.h | 2 +- arch/arm/include/asm/irq.h | 5 +++ arch/arm/include/asm/smp.h | 3 ++ arch/arm/kernel/smp.c | 71 ++++++++++++++++++++++++++++++++++++++++++ arch/arm/kernel/traps.c | 3 ++ 6 files changed, 84 insertions(+), 1 deletion(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 97d07ed60a0b..91d62731b52d 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -11,6 +11,7 @@ config ARM select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF select ARCH_WANT_IPC_PARSE_VERSION + select ARCH_WANT_NMI_PRINTK select BUILDTIME_EXTABLE_SORT if MMU select CLONE_BACKWARDS select CPU_PM if (SUSPEND || CPU_IDLE) diff --git a/arch/arm/include/asm/hardirq.h b/arch/arm/include/asm/hardirq.h index fe3ea776dc34..5df33e30ae1b 100644 --- a/arch/arm/include/asm/hardirq.h +++ b/arch/arm/include/asm/hardirq.h @@ -5,7 +5,7 @@ #include <linux/threads.h> #include <asm/irq.h>
-#define NR_IPI 8 +#define NR_IPI 9
typedef struct { unsigned int __softirq_pending; diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h index 53c15dec7af6..be1d07d59ee9 100644 --- a/arch/arm/include/asm/irq.h +++ b/arch/arm/include/asm/irq.h @@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *); extern void set_handle_irq(void (*handle_irq)(struct pt_regs *)); #endif
+#ifdef CONFIG_SMP +extern void arch_trigger_all_cpu_backtrace(bool); +#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x) +#endif + #endif
#endif diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h index 18f5a554134f..b076584ac0fa 100644 --- a/arch/arm/include/asm/smp.h +++ b/arch/arm/include/asm/smp.h @@ -18,6 +18,8 @@ # error "<asm/smp.h> included in non-SMP build" #endif
+#define SMP_IPI_FIQ_MASK 0x0100 + #define raw_smp_processor_id() (current_thread_info()->cpu)
struct seq_file; @@ -79,6 +81,7 @@ extern void arch_send_call_function_single_ipi(int cpu); extern void arch_send_call_function_ipi_mask(const struct cpumask *mask); extern void arch_send_wakeup_ipi_mask(const struct cpumask *mask);
+extern void ipi_cpu_backtrace(struct pt_regs *regs); extern int register_ipi_completion(struct completion *completion, int cpu);
struct smp_operations { diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 5e6052e18850..afb094a1e6d4 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -26,6 +26,7 @@ #include <linux/completion.h> #include <linux/cpufreq.h> #include <linux/irq_work.h> +#include <linux/seq_buf.h>
#include <linux/atomic.h> #include <asm/smp.h> @@ -72,6 +73,7 @@ enum ipi_msg_type { IPI_CPU_STOP, IPI_IRQ_WORK, IPI_COMPLETION, + IPI_CPU_BACKTRACE, };
static DECLARE_COMPLETION(cpu_running); @@ -444,6 +446,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = { S(IPI_CPU_STOP, "CPU stop interrupts"), S(IPI_IRQ_WORK, "IRQ work interrupts"), S(IPI_COMPLETION, "completion interrupts"), + S(IPI_CPU_BACKTRACE, "backtrace interrupts"), };
static void smp_cross_call(const struct cpumask *target, unsigned int ipinr) @@ -558,6 +561,8 @@ void handle_IPI(int ipinr, struct pt_regs *regs) unsigned int cpu = smp_processor_id(); struct pt_regs *old_regs = set_irq_regs(regs);
+ BUILD_BUG_ON(SMP_IPI_FIQ_MASK != BIT(IPI_CPU_BACKTRACE)); + if ((unsigned)ipinr < NR_IPI) { trace_ipi_entry(ipi_types[ipinr]); __inc_irq_stat(cpu, ipi_irqs[ipinr]); @@ -611,6 +616,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs) irq_exit(); break;
+ case IPI_CPU_BACKTRACE: + irq_enter(); + ipi_cpu_backtrace(regs); + irq_exit(); + break; + default: pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr); @@ -705,3 +716,63 @@ static int __init register_cpufreq_notifier(void) core_initcall(register_cpufreq_notifier);
#endif + +/* For reliability, we're prepared to waste bits here. */ +static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly; +static cpumask_t printtrace_mask; + +void arch_trigger_all_cpu_backtrace(bool include_self) +{ + struct nmi_seq_buf *s; + int len; + int cpu; + int i; + int this_cpu = get_cpu(); + + if (0 != prepare_nmi_printk(to_cpumask(backtrace_mask))) { + /* + * If there is already an nmi printk sequence in + * progress then just give up... + */ + put_cpu(); + return; + } + + if (!include_self) + cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask)); + cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask)); + + if (!cpumask_empty(to_cpumask(backtrace_mask))) { + pr_info("Sending FIQ to %s CPUs:\n", + (include_self ? "all" : "other")); + smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE); + } + + /* Wait for up to 10 seconds for all CPUs to do the backtrace */ + for (i = 0; i < 10 * 1000; i++) { + if (cpumask_empty(to_cpumask(backtrace_mask))) + break; + mdelay(1); + touch_softlockup_watchdog(); + } + + complete_nmi_printk(&printtrace_mask); + put_cpu(); +} + +void ipi_cpu_backtrace(struct pt_regs *regs) +{ + int cpu; + printk_func_t orig; + + cpu = smp_processor_id(); + + if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) { + orig = this_cpu_begin_nmi_printk(); + pr_warn("FIQ backtrace for cpu %d\n", cpu); + show_regs(regs); + this_cpu_end_nmi_printk(orig); + + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); + } +} diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index b35e220ae1b1..1836415b8a5c 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -483,6 +483,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs) #ifdef CONFIG_ARM_GIC gic_handle_fiq_ipi(); #endif +#ifdef CONFIG_SMP + ipi_cpu_backtrace(regs); +#endif
nmi_exit();
Currently, if arch_trigger_all_cpu_backtrace() is called with interrupts disabled and on a platform that delivers IPI_CPU_BACKTRACE using regular IRQ requests, the system will wedge for ten seconds waiting for the current CPU to react to a masked interrupt.
This patch resolves this issue by calling directly into the backtrace dump code instead of generating an IPI.
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Cc: Steven Rostedt rostedt@goodmis.org --- arch/arm/kernel/smp.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index afb094a1e6d4..cf3b738568b8 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -742,6 +742,16 @@ void arch_trigger_all_cpu_backtrace(bool include_self) cpumask_clear_cpu(this_cpu, to_cpumask(backtrace_mask)); cpumask_copy(&printtrace_mask, to_cpumask(backtrace_mask));
+ /* + * If irqs are disabled on the current processor then, if + * IPI_CPU_BACKTRACE is delivered using IRQ, we won't be able to + * react to IPI_CPU_BACKTRACE until we leave this function. We avoid + * the potential timeout (not to mention the failure to print useful + * information) by calling the backtrace directly. + */ + if (include_self && irqs_disabled()) + ipi_cpu_backtrace(in_interrupt() ? get_irq_regs() : NULL); + if (!cpumask_empty(to_cpumask(backtrace_mask))) { pr_info("Sending FIQ to %s CPUs:\n", (include_self ? "all" : "other")); @@ -770,7 +780,10 @@ void ipi_cpu_backtrace(struct pt_regs *regs) if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) { orig = this_cpu_begin_nmi_printk(); pr_warn("FIQ backtrace for cpu %d\n", cpu); - show_regs(regs); + if (regs != NULL) + show_regs(regs); + else + dump_stack(); this_cpu_end_nmi_printk(orig);
cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
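For context, the usual way to exercise this feature end-to-end is the SysRq-l handler, which reaches arch_trigger_all_cpu_backtrace() through the trigger_all_cpu_backtrace() wrapper in include/linux/nmi.h. A minimal sketch of such a caller (the function name is invented for illustration):

#include <linux/nmi.h>
#include <linux/printk.h>

/*
 * Ask every online CPU (including this one) to dump its stack. On
 * arm, with this series applied, the request arrives at FIQ level
 * when the GIC supports grouping and at IRQ level otherwise.
 */
static void dump_all_cpus(void)
{
	if (!trigger_all_cpu_backtrace())
		pr_warn("all-cpu backtrace not supported on this arch\n");
}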
On 13/01/15 10:26, Daniel Thompson wrote:
Hi Thomas, Hi Jason: Patches 1 to 3 are for you (and should be separable from the rest of the series). The patches haven't changed since the last time I posted them. The changes in v14 tidy up the later part of the patch set in order to share more code between x86 and arm.
No review comments! Have I finally got this right?
If so, is it possible and/or sensible to get patches 1-3 into a tree that feeds linux-next? I'd really like the gic changes to meet the various ARM build and boot bots.
Daniel.
This patchset modifies the GIC driver to allow it, on supported platforms, to route IPI interrupts to FIQ and uses this feature to implement arch_trigger_all_cpu_backtrace for arm.
On platforms not capable of supporting FIQ we fall back to using IRQ to propagate the signal to generate a backtrace (relying on a timeout to avoid wedging the CPU requesting the backtrace if other CPUs are not responsive).
It has been tested on two systems capable of supporting grouping (Freescale i.MX6 and STiH416) and two that do not (vexpress-a9 and Qualcomm Snapdragon 600).
v14:
- Moved nmi_vprintk() and friends from arch/x86/kernel/apic/hw_nmi.c to printk.c (Steven Rostedt)
v13:
- Updated the backtrace printing code to replicate Steven Rostedt's x86 work that makes SysRq-l safe. This is pretty much a total rewrite of patches 4 and 5.
v12:
- Squash first two patches into a single one and re-describe (Thomas Gleixner).
- Improve description of "irqchip: gic: Make gic_raise_softirq FIQ-safe" (Thomas Gleixner).
v11:
- Optimized gic_raise_softirq() by replacing a register read with a memory read (Jason Cooper).
v10:
- Add a further patch to optimize away some of the locking on systems where CONFIG_BL_SWITCHER is not set (Marc Zyngier). Compiles OK with exynos_defconfig (which is the only defconfig to set this option).
- Whitespace fixes in patch 4. That patch previously used spaces for alignment of new constants but the rest of the file used tabs.
v9:
- Improved documentation and structure of initial patch (now initial two patches) to make gic_raise_softirq() safe to call from FIQ (Thomas Gleixner).
- Avoid masking interrupts during gic_raise_softirq(). The use of the read lock makes this redundant (because we can safely re-enter the function).
v8:
- Fixed build on arm64 caused by a spurious include file in irq-gic.c.
v7-2 (accidentally released twice with same number):
- Fixed boot regression on vexpress-a9 (reported by Russell King).
- Rebased on v3.18-rc3; removed one patch from set that is already included in mainline.
- Dropped arm64/fiq.h patch from the set (still useful but not related to issuing backtraces).
v7:
- Re-arranged code within the patch series to fix a regression introduced midway through the series and corrected by a later patch (testing by Olof's autobuilder). Tested offending patch in isolation using defconfig identified by the autobuilder.
v6:
- Renamed svc_entry's call_trace argument to just trace (example code from Russell King).
- Fixed mismatched ENDPROC() in __fiq_abt (example code from Russell King).
- Modified usr_entry to optionally avoid calling into the trace code and used this in FIQ entry from usr path. Modified corresponding exit code to avoid calling into trace code and the scheduler (example code from Russell King).
- Ensured the default FIQ register state is restored when the default FIQ handler is reinstalled (example code from Russell King).
- Renamed no_fiq_insn to dfl_fiq_insn to reflect the effect of adopting a default FIQ handler.
- Re-instated fiq_safe_migration_lock and associated logic in gic_raise_softirq(). gic_raise_softirq() is called by wake_up_klogd() in the console unlock logic.
v5:
- Rebased on 3.17-rc4.
- Removed a spurious line from the final "glue it together" patch that broke the build.
v4:
- Replaced push/pop with stmfd/ldmfd respectively (review of Nicolas Pitre).
- Really fix bad pt_regs pointer generation in __fiq_abt.
- Remove fiq_safe_migration_lock and associated logic in gic_raise_softirq() (review of Russell King).
- Restructured to introduce the default FIQ handler first, before the new features (review of Russell King).
v3:
- Removed redundant header guards from arch/arm64/include/asm/fiq.h (review of Catalin Marinas).
- Moved svc_exit_via_fiq macro to entry-header.S (review of Nicolas Pitre).
v2:
- Restructured to sit nicely on a similar FYI patchset from Russell King. It now effectively replaces the work in progress final patch with something much more complete.
- Implemented (and tested) a Thumb-2 implementation of svc_exit_via_fiq (review of Nicolas Pitre).
- Dropped the GIC group 0 workaround patch. The issue of FIQ interrupts being acknowledged by the IRQ handler does still exist but should be harmless because the IRQ handler will still wind up calling ipi_cpu_backtrace().
- Removed any dependency on CONFIG_FIQ; all cpu backtrace effectively becomes a platform feature (although the use of non-maskable interrupts to implement it is best effort rather than guaranteed).
- Better comments highlighting usage of RAZ/WI registers (and parts of registers) in the GIC code.
Changes *before* v1:
This patchset is a hugely cut-down successor to "[PATCH v11 00/19] arm: KGDB NMI/FIQ support". Thanks to Thomas Gleixner for suggesting the new structure. For historic details see: https://lkml.org/lkml/2014/9/2/227
- Fix bug in __fiq_abt (no longer passes a bad struct pt_regs value). In fixing this we also remove the useless indirection previously found in the fiq_handler macro.
- Make default fiq handler "always on" by migrating from fiq.c to traps.c and replace do_unexp_fiq with the new handler (review of Russell King).
- Add arm64 version of fiq.h (review of Russell King).
- Removed conditional branching and code from irq-gic.c; this is replaced by much simpler code that relies on the GIC specification's heavy use of read-as-zero/write-ignored (review of Russell King).
Daniel Thompson (7): irqchip: gic: Optimize locking in gic_raise_softirq irqchip: gic: Make gic_raise_softirq FIQ-safe irqchip: gic: Introduce plumbing for IPI FIQ printk: Simple implementation for NMI backtracing x86/nmi: Use common printk functions ARM: Add support for on-demand backtrace of other CPUs ARM: Fix on-demand backtrace triggered by IRQ
arch/Kconfig | 3 + arch/arm/Kconfig | 1 + arch/arm/include/asm/hardirq.h | 2 +- arch/arm/include/asm/irq.h | 5 + arch/arm/include/asm/smp.h | 3 + arch/arm/kernel/smp.c | 84 +++++++++++++++++ arch/arm/kernel/traps.c | 8 +- arch/x86/Kconfig | 1 + arch/x86/kernel/apic/hw_nmi.c | 94 ++----------------- drivers/irqchip/irq-gic.c | 203 +++++++++++++++++++++++++++++++++++++--- include/linux/irqchip/arm-gic.h | 8 ++ include/linux/printk.h | 22 +++++ kernel/printk/printk.c | 122 ++++++++++++++++++++++++ 13 files changed, 452 insertions(+), 104 deletions(-)
-- 1.9.3
On 01/20/2015 02:25 AM, Daniel Thompson wrote:
On 13/01/15 10:26, Daniel Thompson wrote:
Hi Thomas, Hi Jason: Patches 1 to 3 are for you (and should be separable from the rest of the series). The patches haven't changed since the last time I posted them. The changes in v14 tidy up the later part of the patch set in order to share more code between x86 and arm.
No review comments! Have I finally got this right?
If so, is it possible and/or sensible to get patches 1-3 into a tree that feeds linux-next? I'd really like the gic changes to meet the various ARM build and boot bots.
With this patchset, is it possible to call sched_clock() from within NMI context? I ask because the generic sched_clock() code is not NMI safe today. We were planning on making it NMI safe by doing something similar to what was done for ktime_get_mono_fast_ns() but we haven't gotten around to it. Mostly because no architecture that uses generic sched_clock() has support for NMIs right now.
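For context, the ktime_get_mono_fast_ns() technique Stephen refers to is a latch: the writer keeps two copies of the clock parameters and flips a sequence count around each update, so a reader that interrupts the writer uses the copy that is not being modified instead of spinning. A rough sketch of the pattern (structure and names invented for illustration; the real implementation lives in kernel/time/timekeeping.c):

struct clock_read_data {
	u64 epoch_ns;		/* ns value at the last update */
	u64 epoch_cyc;		/* counter value at the last update */
	u64 mask;		/* wrap mask for the raw counter */
	u64 (*read_cyc)(void);	/* raw counter read */
};

struct banked_clock {
	unsigned int seq;	/* odd => bank[0] being updated */
	struct clock_read_data bank[2];
};

/* Writer: only ever runs in one (task/timer) context. */
static void clock_update(struct banked_clock *cd,
			 const struct clock_read_data *new)
{
	WRITE_ONCE(cd->seq, cd->seq + 1);	/* readers -> bank[1] */
	smp_wmb();
	cd->bank[0] = *new;
	smp_wmb();
	WRITE_ONCE(cd->seq, cd->seq + 1);	/* readers -> bank[0] */
	smp_wmb();
	cd->bank[1] = *new;
}

/*
 * Reader: NMI-safe. If it interrupts the writer it reads the bank
 * the writer is not touching, and it only retries when a complete
 * update happened underneath it, so it cannot spin against a
 * writer that is frozen on the same CPU.
 */
static u64 clock_read_nmi_safe(struct banked_clock *cd)
{
	struct clock_read_data *rd;
	unsigned int seq;
	u64 ns;

	do {
		seq = READ_ONCE(cd->seq);
		smp_rmb();
		rd = &cd->bank[seq & 1];
		ns = rd->epoch_ns +
		     ((rd->read_cyc() - rd->epoch_cyc) & rd->mask);
		smp_rmb();
	} while (READ_ONCE(cd->seq) != seq);

	return ns;
}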
On 20/01/15 20:53, Stephen Boyd wrote:
On 01/20/2015 02:25 AM, Daniel Thompson wrote:
On 13/01/15 10:26, Daniel Thompson wrote:
Hi Thomas, Hi Jason: Patches 1 to 3 are for you (and should be separable from the rest of the series). The patches haven't changed since the last time I posted them. The changes in v14 tidy up the later part of the patch set in order to share more code between x86 and arm.
No review comments! Have I finally got this right?
If so, is it possible and/or sensible to get patches 1-3 into a tree that feeds linux-next? I'd really like the gic changes to meet the various ARM build and boot bots.
With this patchset, is it possible to call sched_clock() from within NMI context? I ask because the generic sched_clock() code is not NMI safe today. We were planning on making it NMI safe by doing something similar to what was done for ktime_get_mono_fast_ns() but we haven't gotten around to it. Mostly because no architecture that uses generic sched_clock() has support for NMIs right now.
I've not done any work to make sched_clock() safe to call from NMI. However since my patchset does not introduce any calls to sched_clock() from NMI I think this is OK!
I ported Steven Rostedt's work to make arch_trigger_all_cpu_backtrace() safe from NMI from x86 to ARM. One result of Steven's approach is that printk() timestamping is deferred until we return to normal context. Thus even with CONFIG_PRINTK_TIME we do not call local_clock() during NMI processing.
To confirm the above I added the code below to my kernel and ran it with a fairly paranoid set of debugging options. The check does not fire.
Daniel.
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 630dd2372238..fea0deeb524b 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -111,8 +111,10 @@ extern void warn_slowpath_null(const char *file, const int line);
 	int __ret_warn_once = !!(condition);		\
 							\
 	if (unlikely(__ret_warn_once))			\
-		if (WARN_ON(!__warned))			\
+		if (unlikely(!__warned)) {		\
 			__warned = true;		\
+			__WARN();			\
+		}					\
 	unlikely(__ret_warn_once);			\
 })
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 01d2d15aa662..81ea469b7e68 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -63,6 +63,8 @@ unsigned long long notrace sched_clock(void)
 	u64 cyc;
 	unsigned long seq;
 
+	WARN_ON_ONCE(in_nmi());
+
 	if (cd.suspended)
 		return cd.epoch_ns;
On Wed, 21 Jan 2015 10:47:37 +0000 Daniel Thompson daniel.thompson@linaro.org wrote:
With this patchset, is it possible to call sched_clock() from within NMI context? I ask because the generic sched_clock() code is not NMI safe
That's not good. Better not run function tracing, as that could trace functions in NMI context (I depend on it doing so), and it uses sched_clock() as the default clock.
-- Steve
today. We were planning on making it NMI safe by doing something similar to what was done for ktime_get_mono_fast_ns() but we haven't gotten around to it. Mostly because no architecture that uses generic sched_clock() has support for NMIs right now.
On 21/01/15 13:06, Steven Rostedt wrote:
On Wed, 21 Jan 2015 10:47:37 +0000 Daniel Thompson daniel.thompson@linaro.org wrote:
With this patchset, is it possible to call sched_clock() from within NMI context? I ask because the generic sched_clock() code is not NMI safe
That's not good. Better not run function tracing, as that could trace functions in NMI context (I depend on that it does), and it uses sched_clock() as the default clock.
I think sched_clock is unsafe as in "may sometimes give the wrong value" rather than "can lock up arbitrarily". Thus the impact is unlikely to be harmful enough to want to avoid tracing altogether.
It would require special care be taken when interpreting the timestamps, however. Also, since update_sched_clock() is a notrace function, it's very hard to figure out when timestamps are at risk.
Anyhow, the fix doesn't seem that hard. I can take a look.
Daniel.
On 21/01/15 13:48, Daniel Thompson wrote:
On 21/01/15 13:06, Steven Rostedt wrote:
On Wed, 21 Jan 2015 10:47:37 +0000 Daniel Thompson daniel.thompson@linaro.org wrote:
With this patchset, is it possible to call sched_clock() from within NMI context? I ask because the generic sched_clock() code is not NMI safe
That's not good. Better not run function tracing, as that could trace functions in NMI context (I depend on that it does), and it uses sched_clock() as the default clock.
I think sched_clock is unsafe as in "may sometimes give the wrong value" rather than "can lock up arbitrarily". Thus the impact is unlikely to be harmful enough to want to avoid tracing altogether.
Just to update the record...
The above paragraph is wrong in every possible way. It is a livelock (and I'm working on it).
It would require special care be taken when interpreting the timestamps however. Also since update_sched_clock() is a notrace function its very hard to figure out when timestamps are at risk.
Anyhow, the fix doesn't seem that hard. I can take a look.
Daniel.
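To make the livelock concrete: the generic sched_clock() reader of this era spins until it sees an even, stable sequence count, roughly as below (simplified from kernel/time/sched_clock.c; cd, read_sched_clock, sched_clock_mask and cyc_to_ns are file-local to that file). If an NMI/FIQ handler calls sched_clock() on the CPU that is midway through update_sched_clock(), the count stays odd for as long as the handler runs, so the loop never exits:

/*
 * Simplified v3.18-era reader from kernel/time/sched_clock.c.
 * raw_read_seqcount_begin() spins while the count is odd, i.e.
 * while update_sched_clock() is mid-update; an NMI-level caller
 * that interrupted that update therefore spins forever.
 */
unsigned long long notrace sched_clock(void)
{
	u64 epoch_cyc, epoch_ns, cyc;
	unsigned long seq;

	do {
		seq = raw_read_seqcount_begin(&cd.seq);
		epoch_cyc = cd.epoch_cyc;
		epoch_ns = cd.epoch_ns;
	} while (read_seqcount_retry(&cd.seq, seq));

	cyc = read_sched_clock();
	cyc = (cyc - epoch_cyc) & sched_clock_mask;
	return epoch_ns + cyc_to_ns(cyc, cd.mult, cd.shift);
}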