Hi Marc,
On Fri, Sep 10, 2021 at 12:23 PM Marc Zyngier maz@kernel.org wrote:
On Thu, 09 Sep 2021 16:22:01 +0100, Geert Uytterhoeven geert@linux-m68k.org wrote:
On Wed, Jun 24, 2020 at 9:59 PM Marc Zyngier maz@kernel.org wrote:
The GIC driver uses a RMW sequence to update the affinity, and relies on the gic_lock_irqsave/gic_unlock_irqrestore sequences to update it atomically.
But these sequences only expend into anything meaningful if the BL_SWITCHER option is selected, which almost never happens.
It also turns out that using a RMW and locks is just as silly, as the GIC distributor supports byte accesses for the GICD_TARGETRn registers, which when used make the update atomic by definition.
Drop the terminally broken code and replace it by a byte write.
Fixes: 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier maz@kernel.org
Thanks for your patch, which is now commit 005c34ae4b44f085 ("irqchip/gic: Atomically update affinity"), to which I bisected a hard lock-up during boot on the Renesas EMMA Mobile EV2-based KZM-A9-Dual board, which has a dual Cortex-A9 with PL390.
Despite the ARM Generic Interrupt Controller Architecture Specification (both version 1.0 and 2.0) stating that the Interrupt Processor Targets Registers are byte-accessible, the EMMA Mobile EV2 User's Manual states that the interrupt registers can be accessed via the APB bus, in 32-bit units. Using byte accesses locks up the system.
Urgh. That is definitely a pretty poor integration. How about the priority registers? I guess they suffer from the same issue...
Yes, they do.
Unfortunately I only have remote access to the board showing the issue. I did check that adding the writeb_relaxed() before the writel_relaxed() that was used before also causes a lock-up, so the issue is not an endian mismatch. Looking at the driver history, these registers have always been accessed using 32-bit accesses before. As byte accesses lead indeed to simpler code, I'm wondering if they had been tried before, and caused issues before?
Not that I know. A lock was probably fine on a two CPU system. Less so on a busy 8 CPU machine where interrupts are often migrated. The GIC architecture makes a point in not requiring locking for most of the registers that can be accessed concurrently.
Since you said the locking was bogus before, due to the reliance on the BL_SWITCHER option, I'm not suggesting a plain revert, but I'm wondering what kind of locking you suggest to use instead?
There isn't much we can do aside from reintroducing the RMW+spinlock approach, and for real this time. It would have to be handled as a quirk though, as I'm not keen on reintroducing this for all systems.
I wrote the patchlet below, which is totally untested. Please give it a go and let me know if it helps.
Thanks for your quick response! Your solution works, after making a few small modifications.
--- a/drivers/irqchip/irq-gic.c +++ b/drivers/irqchip/irq-gic.c @@ -774,6 +776,25 @@ static int gic_pm_init(struct gic_chip_data *gic) #endif
#ifdef CONFIG_SMP +static void rmw_writeb(u8 bval, void __iomem *addr) +{
static DEFINE_RAW_SPINLOCK(rmw_lock);
unsigned long offset = (unsigned long)addr & ~3UL;
Please drop the tilde.
unsigned long shift = offset * 8;
"unsigned int" is sufficient for offset and size.
unsigned long flags;
u32 val;
raw_spin_lock_irqsave(&rmw_lock, flags);
addr -= offset;
val = readl_relaxed(addr);
val &= ~(0xffUL << shift);
No need for the UL suffix.
val |= (u32)bval << shift;
No need for the cast.
writel_relaxed(val, addr);
raw_spin_unlock_irqrestore(&rmw_lock, flags);
+}
static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val, bool force) {
@@ -1375,6 +1399,29 @@ static bool gic_check_eoimode(struct device_node *node, void __iomem **base) return true; }
+static bool gic_enable_rmw_access(void *data) +{
/*
* The EMEV2 class of machines has a broken interconnect, and
* locks up on accesses that are less than 32bit. So far, only
* the affinity setting requires it.
*/
if (of_machine_is_compatible("renesas,emev2")) {
static_branch_enable(&needs_rmw_access);
return true;
}
return false;
+}
+static const struct gic_quirk gic_quirks[] = {
{
.desc = "Implementation with broken byte access",
The output would look better without capitalizing the first word. I think you can drop the first two words, saving some space:
GIC: enabling workaround for broken byte access
.compatible = "arm,pl390",
.init = gic_enable_rmw_access,
},
Missing "{ /* sentinel */ }".
+};
static int gic_of_setup(struct gic_chip_data *gic, struct device_node *node) { if (!gic || !node)
Gr{oetje,eeting}s,
Geert