ACPI spec is x86 oriented thus it use NMI for error reporting as low latency error signalling way. Fast response to error occurrence is what ARM trying to satisfy using interrupt prioritization since there is no NMI direct equivalent for ARM architecture.
Patch set are divided into three step: 1. Create generic code responsible for setting interrupt priority. 2. Use it in interrupt setup. 3. Example of interrupt controller priority mapping for GIC platform dependent code.
Patch set tries to meet requirements like: - not breaking existing code - easy to expand to new priority levels - easy to map generic priority levels to platform dependent priority values
See commit logs for more detailed explanation.
Some of the interrupt controller allow to prioritize interrupts. All of interrupts have equal priority so far. This commit add infrastructure to manipulate interrupt signaling order which means that some of them can take precedence over the others.
Initial implementation assume three priority level: HIGH - interrupt which need to be served ASAP DEFAULT - default interrupt priority LOW - interrupt which doesn't care about response latency
Such generic priority levels approach require mapping to the architecture specific value. It could be done using static allocated table like:
static unsigned int priority_map [SIZE] = { [IRQP_HIGH] = 0x00, [IRQP_DEFAULT] = 0xa0, [IRQP_LOW] = 0xe0, }; It allow us to be compatible in case of irqpriority_t (see include/linux/irqpriority.h) further modification.
Signed-off-by: Tomasz Nowicki tomasz.nowicki@linaro.org --- include/linux/irq.h | 3 +++ include/linux/irqpriority.h | 24 ++++++++++++++++++++++++ kernel/irq/chip.c | 9 +++++++++ kernel/irq/internals.h | 1 + 4 files changed, 37 insertions(+) create mode 100644 include/linux/irqpriority.h
diff --git a/include/linux/irq.h b/include/linux/irq.h index f04d3ba..21d2776 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -17,6 +17,7 @@ #include <linux/gfp.h> #include <linux/irqreturn.h> #include <linux/irqnr.h> +#include <linux/irqpriority.h> #include <linux/errno.h> #include <linux/topology.h> #include <linux/wait.h> @@ -289,6 +290,7 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d) * @irq_retrigger: resend an IRQ to the CPU * @irq_set_type: set the flow type (IRQ_TYPE_LEVEL/etc.) of an IRQ * @irq_set_wake: enable/disable power-management wake-on of an IRQ + * @irq_set_priority: set IRQ priority * @irq_bus_lock: function to lock access to slow bus (i2c) chips * @irq_bus_sync_unlock:function to sync and unlock slow bus (i2c) chips * @irq_cpu_online: configure an interrupt source for a secondary CPU @@ -317,6 +319,7 @@ struct irq_chip { int (*irq_retrigger)(struct irq_data *data); int (*irq_set_type)(struct irq_data *data, unsigned int flow_type); int (*irq_set_wake)(struct irq_data *data, unsigned int on); + int (*irq_set_priority)(struct irq_data *data, irqpriority_t priority);
void (*irq_bus_lock)(struct irq_data *data); void (*irq_bus_sync_unlock)(struct irq_data *data); diff --git a/include/linux/irqpriority.h b/include/linux/irqpriority.h new file mode 100644 index 0000000..cf6bf8d --- /dev/null +++ b/include/linux/irqpriority.h @@ -0,0 +1,24 @@ +#ifndef _LINUX_IRQPRIORITY_H +#define _LINUX_IRQPRIORITY_H + +/** + * enum irqpriority + * @IRQP_HIGH address to low response latency interrupt e.g. error + * signaling + * @IRQP_DEFAULT default priority and set for all interrupt sources + * during interrupt controller initialization + * @IRQP_LOW interrupt which doesn't really care about response + * latency + * @... place for priority extension + */ +enum irqpriority { + IRQP_HIGH = 0, + IRQP_DEFAULT, + IRQP_LOW, + + IRQP_LEVELS_NR +}; + +typedef enum irqpriority irqpriority_t; + +#endif /* _LINUX_IRQPRIORITY_H */ diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index a3bb14f..74a3af8 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -14,6 +14,7 @@ #include <linux/msi.h> #include <linux/module.h> #include <linux/interrupt.h> +#include <linux/irqpriority.h> #include <linux/kernel_stat.h>
#include <trace/events/irq.h> @@ -281,6 +282,14 @@ void unmask_irq(struct irq_desc *desc) } }
+int irq_set_priority(struct irq_desc *desc, irqpriority_t priority) +{ + if (!desc->irq_data.chip->irq_set_priority) + return -ENOSYS; + + return desc->irq_data.chip->irq_set_priority(&desc->irq_data, priority); +} + /* * handle_nested_irq - Handle a nested irq from a irq thread * @irq: the interrupt number diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h index 001fa5b..c264f5f 100644 --- a/kernel/irq/internals.h +++ b/kernel/irq/internals.h @@ -73,6 +73,7 @@ extern void irq_percpu_enable(struct irq_desc *desc, unsigned int cpu); extern void irq_percpu_disable(struct irq_desc *desc, unsigned int cpu); extern void mask_irq(struct irq_desc *desc); extern void unmask_irq(struct irq_desc *desc); +extern int irq_set_priority(struct irq_desc *desc, irqpriority_t priority);
extern void init_kstat_irqs(struct irq_desc *desc, int node, int nr);
Once interrupt is shared all actions co-related to given interrupt need to have IRQF_HIGH_PRIORITY flag which means that actions have the same priority.
Signed-off-by: Tomasz Nowicki tomasz.nowicki@linaro.org --- include/linux/interrupt.h | 1 + kernel/irq/manage.c | 16 ++++++++++++++-- 2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h index 5fa5afe..a566e54 100644 --- a/include/linux/interrupt.h +++ b/include/linux/interrupt.h @@ -71,6 +71,7 @@ #define IRQF_FORCE_RESUME 0x00008000 #define IRQF_NO_THREAD 0x00010000 #define IRQF_EARLY_RESUME 0x00020000 +#define IRQF_HIGH_PRIORITY 0x00040000
#define IRQF_TIMER (__IRQF_TIMER | IRQF_NO_SUSPEND | IRQF_NO_THREAD)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index 514bcfd..7c80295 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -14,6 +14,7 @@ #include <linux/module.h> #include <linux/random.h> #include <linux/interrupt.h> +#include <linux/irqpriority.h> #include <linux/slab.h> #include <linux/sched.h> #include <linux/sched/rt.h> @@ -1006,11 +1007,12 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) * the same type (level, edge, polarity). So both flag * fields must have IRQF_SHARED set and the bits which * set the trigger type must match. Also all must - * agree on ONESHOT. + * agree on ONESHOT and IRQF_HIGH_PRIORITY. */ if (!((old->flags & new->flags) & IRQF_SHARED) || ((old->flags ^ new->flags) & IRQF_TRIGGER_MASK) || - ((old->flags ^ new->flags) & IRQF_ONESHOT)) + ((old->flags ^ new->flags) & IRQF_ONESHOT) || + ((old->flags ^ new->flags) & IRQF_HIGH_PRIORITY)) goto mismatch;
/* All handlers must agree on per-cpuness */ @@ -1115,6 +1117,16 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) if (new->flags & IRQF_ONESHOT) desc->istate |= IRQS_ONESHOT;
+ if (new->flags & IRQF_HIGH_PRIORITY) { + ret = irq_set_priority(desc, IRQP_HIGH); + if (ret) + goto out_mask; + } else { + ret = irq_set_priority(desc, IRQP_DEFAULT); + if (ret && ret != ENOSYS) + goto out_mask; + } + if (irq_settings_can_autoenable(desc)) irq_startup(desc, true); else
Couple of notes: o new handler has been added irq_set_priority o this could be an example of how to map generic priority level to architecture priority value (see comments in code)
Signed-off-by: Tomasz Nowicki tomasz.nowicki@linaro.org --- drivers/irqchip/irq-gic.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c index 13b2849..97e919f 100644 --- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -37,6 +37,7 @@ #include <linux/of_irq.h> #include <linux/irqdomain.h> #include <linux/interrupt.h> +#include <linux/irqpriority.h> #include <linux/percpu.h> #include <linux/slab.h> #include <linux/irqchip/chained_irq.h> @@ -93,6 +94,19 @@ struct irq_chip gic_arch_extn = { .irq_set_wake = NULL, };
+/* + * Map generic interrupt priority levels (irqpriority_t) to GIC_DIST_PRI + * register value. Value should be mapped using table index assignment: + * [priority level] = <vale for GIC_DIST_PRI> which allow us to be compatible + * in case of irqpriority_t (see include/linux/irqpriority.h) further + * modification. + */ +static unsigned int priority_map [IRQP_LEVELS_NR] = { + [IRQP_HIGH] = 0x00, + [IRQP_DEFAULT] = 0xa0, + [IRQP_LOW] = 0xe0, +}; + #ifndef MAX_GIC_NR #define MAX_GIC_NR 1 #endif @@ -332,12 +346,30 @@ static void gic_handle_cascade_irq(unsigned int irq, struct irq_desc *desc) chained_irq_exit(chip, desc); }
+int gic_set_priority(struct irq_data *data, irqpriority_t priority) +{ + unsigned int hw_irq = gic_irq(data); + u32 cur_priority; + + if (hw_irq < 32) + return -EINVAL; + + raw_spin_lock(&irq_controller_lock); + cur_priority = readl_relaxed(gic_dist_base(data) + GIC_DIST_PRI + (hw_irq / 4) * 4); + cur_priority &= ~(0xff << (hw_irq % 4)); + cur_priority |= priority_map[priority] << (hw_irq % 4); + writel_relaxed(cur_priority, gic_dist_base(data) + GIC_DIST_PRI + (hw_irq / 4) * 4); + raw_spin_unlock(&irq_controller_lock); + return 0; +} + static struct irq_chip gic_chip = { .name = "GIC", .irq_mask = gic_mask_irq, .irq_unmask = gic_unmask_irq, .irq_eoi = gic_eoi_irq, .irq_set_type = gic_set_type, + .irq_set_priority = gic_set_priority, .irq_retrigger = gic_retrigger, #ifdef CONFIG_SMP .irq_set_affinity = gic_set_affinity,
Sorry I sent it twice, just added linux-arm-kernel@lists.infradead.org.
Tomasz
W dniu 26.08.2013 17:36, Tomasz Nowicki pisze:
ACPI spec is x86 oriented thus it use NMI for error reporting as low latency error signalling way. Fast response to error occurrence is what ARM trying to satisfy using interrupt prioritization since there is no NMI direct equivalent for ARM architecture.
Patch set are divided into three step:
- Create generic code responsible for setting interrupt priority.
- Use it in interrupt setup.
- Example of interrupt controller priority mapping for GIC platform dependent code.
Patch set tries to meet requirements like:
- not breaking existing code
- easy to expand to new priority levels
- easy to map generic priority levels to platform dependent priority values
See commit logs for more detailed explanation.
On Mon, Aug 26, 2013 at 05:36:22PM +0200, Tomasz Nowicki wrote:
ACPI spec is x86 oriented thus it use NMI for error reporting as low latency error signalling way. Fast response to error occurrence is what ARM trying to satisfy using interrupt prioritization since there is no NMI direct equivalent for ARM architecture.
Patch set are divided into three step:
- Create generic code responsible for setting interrupt priority.
- Use it in interrupt setup.
- Example of interrupt controller priority mapping for GIC platform dependent code.
Patch set tries to meet requirements like:
- not breaking existing code
- easy to expand to new priority levels
- easy to map generic priority levels to platform dependent priority values
Using the GIC interrupt priority mapping is rather sub-optimal for this - on the face of it, it sounds like a sensible solution, and would be with an OS which didn't insist on running all interrupt handlers with IRQs disabled.
Unfortunately, Linux decided that this would be the case for various reasons - not only the issue of kernel stack overflow, but also because of lockdep requirements. What this means is that the best you can do is to control which interrupt of many pending at the same time gets serviced first.
So... the general principle here is that IRQ priority levels are pretty useless in Linux.
In other words, the error reporting won't work while any interrupt handler is running irrespective of how you program the GIC or indeed any interrupt disabled region.
The closest we have to NMI is the FIQ, but having the FIQ call kernel C code and take locks is very problematic: you end up needing the locks and such like to disable FIQs in addition to IRQs, which reduces FIQs down to the same level as IRQs. The other stumbling block here is that the FIQ may only be available to the secure world, and not the non-secure world.
On Mon, Aug 26, 2013 at 06:15:01PM +0100, Russell King - ARM Linux wrote:
On Mon, Aug 26, 2013 at 05:36:22PM +0200, Tomasz Nowicki wrote:
ACPI spec is x86 oriented thus it use NMI for error reporting as low latency error signalling way. Fast response to error occurrence is what ARM trying to satisfy using interrupt prioritization since there is no NMI direct equivalent
On Mon, Aug 26, 2013 at 06:15:01PM +0100, Russell King - ARM Linux wrote:
On Mon, Aug 26, 2013 at 05:36:22PM +0200, Tomasz Nowicki wrote:
ACPI spec is x86 oriented thus it use NMI for error reporting as low latency error signalling way. Fast response to error occurrence is what ARM trying to satisfy using interrupt prioritization since there is no NMI direct equivalent
[ elided reasons why interrupt prioritization is a no-go for linux]
So... the general principle here is that IRQ priority levels are pretty useless in Linux.
In other words, the error reporting won't work while any interrupt handler is running irrespective of how you program the GIC or indeed any interrupt disabled region.
The closest we have to NMI is the FIQ, but having the FIQ call kernel C code and take locks is very problematic: you end up needing the
I think FIQ is a non-starter (at least on arm64 systems) as it will almost certainly be restricted to group0 interrupts used by the EL3 firmware on many implementations.
The System Error Int/Exception (known on armv7 as ABORT) is a bit closer to the semantics that NMI provides on x86 (with the exception that the SEI can be masked by the PSTATE.A flag). I don't believe that linux currently uses SEI's for any generic purpose.
Many of us have been requesting ARM to add a real non-maskable interrupt, but it is claimed that prioritization is sufficient for most needs (a claim with which I disagree).
scott
On Mon, Aug 26, 2013 at 05:29:10PM +0000, Lurndal, Scott wrote:
The System Error Int/Exception (known on armv7 as ABORT) is a bit closer to the semantics that NMI provides on x86 (with the exception that the SEI can be masked by the PSTATE.A flag). I don't believe that linux currently uses SEI's for any generic purpose.
I can only talk about Aarch32.
An abort (which, if you're referring to the PSR A flag is what you are referring to) is much better than an interrupt, because it can occur at any time from any context - except as you note when the A flag masks it. It gets completely out of the "irq priority" and "irq masked so can't receive it" problems.
Many of us have been requesting ARM to add a real non-maskable interrupt, but it is claimed that prioritization is sufficient for most needs (a claim with which I disagree).
So I disagree with wanting a "non-maskable interrupt" - what you want is for the system to raise an _exception_ when something bad happens. As I understand it, this is exactly what happens if you have systems with ECC or parity. There's error codes in the fault status register for prefetch (instruction) and data exceptions to indicate parity errors.
There's support in the architecture for external peripherals and memory subsystems to raise "external aborts" when things go wrong too, which can be precise (happen at the point when the instruction is executed) or imprecise (happen sometime later.)
So, I believe ARM have already done what's necessary: what is more hit and miss is whether any particular vendor implements any of this stuff, and that's not ARM's problem - that's the vendors problem.
OMAP does - OMAP raises aborts if you (for example) access a peripheral when all its clocks are turned off. On the other hand, Marvell Dove wedges the CPU solid when that happens.
Yes, it may not be called an "interrupt" and it may not be a pin on the interrupt controller, but just because x86 does that doesn't mean it's the best solution everywhere.
On Mon, Aug 26, 2013 at 05:29:10PM +0000, Lurndal, Scott wrote:
The System Error Int/Exception (known on armv7 as ABORT) is a bit closer to the semantics that NMI provides on x86 (with the exception that the SEI can be masked by the PSTATE.A flag). I don't believe that linux currently uses SEI's for any generic purpose.
I can only talk about Aarch32.
An abort (which, if you're referring to the PSR A flag is what you are referring to) is much better than an interrupt, because it can occur at any time from any context - except as you note when the A flag masks it. It gets completely out of the "irq priority" and "irq masked so can't receive it" problems.
Yes, the ARMv7 abort condition has been renamed to System Error in ARMv8.
Many of us have been requesting ARM to add a real non-maskable interrupt, but it is claimed that prioritization is sufficient for most needs (a claim with which I disagree).
So I disagree with wanting a "non-maskable interrupt" - what you want is for the system to raise an _exception_ when something bad happens. As I understand it, this is exactly what happens if you have systems with ECC or parity. There's error codes in the fault status register for prefetch (instruction) and data exceptions to indicate parity errors.
Certainly, a synchronous exception is required for things that can be detected as part of the instruction execution (e.g. attempting to read poisoned data from somewhere in the cache hierarchy on a LD instruction).
A bit like an x86 Machine Check Exception.
There's also need for corrected error notification, which is less time critical (more for logging and preventive maintenance) (e.g. the CMCI threshhold stuff in x86, or errors detected by cache or DRAM scrubbers). This should be asynchronous (imprecise).
There's support in the architecture for external peripherals and memory subsystems to raise "external aborts" when things go wrong too, which can be precise (happen at the point when the instruction is executed) or imprecise (happen sometime later.)
Yes, this is my understanding. The syndrome data is a bit loosely defined, in this case, however.
OMAP does - OMAP raises aborts if you (for example) access a peripheral when all its clocks are turned off. On the other hand, Marvell Dove wedges the CPU solid when that happens.
Fortunately (or hopefully), the ARMv8 base server specification has prohibited the latter behavior.
My primary concern about NMI vs. SEI is that SEI has the ability to mask the condition when using SEI to implement a kernel/hypervisor in-band debugging capability (i.e. the ability to interrupt & debug exception handlers, when the corresponding processor state bits mask SEI).
scott
On Mon, Aug 26, 2013 at 06:31:50PM +0000, Lurndal, Scott wrote:
My primary concern about NMI vs. SEI is that SEI has the ability to mask the condition when using SEI to implement a kernel/hypervisor in-band debugging capability (i.e. the ability to interrupt & debug exception handlers, when the corresponding processor state bits mask SEI).
The reason why there's the A bit is to prevent two exceptions in succession causing state to be irrecoverably lost.
Consider the case where you hit a page fault, which raises a data abort. The CPU has just switched to abort mode, and vectored to the data abort handler.
An imprecise abort has been raised by an external peripheral at this point due to a writeback.
At this point, the imprecise aborts are masked by the A bit. This prevents the abort from being raised while unsaved state which would be lost if this was to cause a re-entry into the data abort handler.
So, what happens is that the first data abort is allowed to save its state, and once the handler has saved that state, it can clear the A bit, allowing the imprecise abort to then be safely received.
The failure to have this method of masking means that state is silently and unknowingly corrupted; if you analyse what happens if the A bit didn't exist, when you return from handling the imprecise abort, you return to the beginning of the data abort handler. That much is fine, but the state you're about to save will be the same state as the imprecise abort, which will lead you restart the data abort handling when you finish that first data abort.
So, the A bit is very necessary.
Now, if you were to say that we weren't clearing the PSR A bit in Aarch32 after we'd saved the necessary state, I'd agree with you, and I'd wonder how that's been missed for soo long. :)