Cédric Le Goater <clg@kaod.org> writes:
The PCI INTx interrupts and other LSI interrupts are handled differently under a sPAPR platform. When the interrupt source characteristics are queried, the hypervisor returns an H_INT_ESB flag to inform the OS that it should be using the H_INT_ESB hcall for interrupt management and not loads and stores on the interrupt ESB pages.
A default value of -1 is returned for the addresses of the ESB pages. The driver ignores this condition today and performs a bogus IO mapping. Recent changes and the DEBUG_VM configuration option make the bug visible with:
  [ 0.015518] kernel BUG at arch/powerpc/include/asm/book3s/64/pgtable.h:612!
  [ 0.015578] Oops: Exception in kernel mode, sig: 5 [#1]
  [ 0.015627] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=1024 NUMA pSeries
  [ 0.015697] Modules linked in:
  [ 0.015739] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-0.rc6.git0.1.fc32.ppc64le #1
  [ 0.015812] NIP: c000000000f63294 LR: c000000000f62e44 CTR: 0000000000000000
  [ 0.015889] REGS: c0000000fa45f0d0 TRAP: 0700 Not tainted (5.4.0-0.rc6.git0.1.fc32.ppc64le)
  [ 0.015971] MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 44000424 XER: 00000000
  [ 0.016050] CFAR: c000000000f63128 IRQMASK: 0
  [ 0.016050] GPR00: c000000000f62e44 c0000000fa45f360 c000000001be5400 0000000000000000
  [ 0.016050] GPR04: c0000000019c7d38 c0000000fa340030 00000000fa330009 c000000001c15e18
  [ 0.016050] GPR08: 0000000000000040 ffe0000000000000 0000000000000000 8418dd352dbd190f
  [ 0.016050] GPR12: 0000000000000000 c000000001e00000 c00a000080060000 c00a000080060000
  [ 0.016050] GPR16: 0000ffffffffffff 80000000000001ae c000000001c24d98 ffffffffffff0000
  [ 0.016050] GPR20: c00a00008007ffff c000000001cafca0 c00a00008007ffff ffffffffffff0000
  [ 0.016050] GPR24: c00a000080080000 c00a000080080000 c000000001cafca8 c00a000080080000
  [ 0.016050] GPR28: c0000000fa32e010 c00a000080060000 ffffffffffff0000 c0000000fa330000
  [ 0.016711] NIP [c000000000f63294] ioremap_page_range+0x4c4/0x6e0
  [ 0.016778] LR [c000000000f62e44] ioremap_page_range+0x74/0x6e0
  [ 0.016846] Call Trace:
  [ 0.016876] [c0000000fa45f360] [c000000000f62e44] ioremap_page_range+0x74/0x6e0 (unreliable)
  [ 0.016969] [c0000000fa45f460] [c0000000000934bc] do_ioremap+0x8c/0x120
  [ 0.017037] [c0000000fa45f4b0] [c0000000000938e8] __ioremap_caller+0x128/0x140
  [ 0.017116] [c0000000fa45f500] [c0000000000931a0] ioremap+0x30/0x50
  [ 0.017184] [c0000000fa45f520] [c0000000000d1380] xive_spapr_populate_irq_data+0x170/0x260
  [ 0.017263] [c0000000fa45f5c0] [c0000000000cc90c] xive_irq_domain_map+0x8c/0x170
  [ 0.017344] [c0000000fa45f600] [c000000000219124] irq_domain_associate+0xb4/0x2d0
  [ 0.017424] [c0000000fa45f690] [c000000000219fe0] irq_create_mapping+0x1e0/0x3b0
  [ 0.017506] [c0000000fa45f730] [c00000000021ad6c] irq_create_fwspec_mapping+0x27c/0x3e0
  [ 0.017586] [c0000000fa45f7c0] [c00000000021af68] irq_create_of_mapping+0x98/0xb0
  [ 0.017666] [c0000000fa45f830] [c0000000008d4e48] of_irq_parse_and_map_pci+0x168/0x230
  [ 0.017746] [c0000000fa45f910] [c000000000075428] pcibios_setup_device+0x88/0x250
  [ 0.017826] [c0000000fa45f9a0] [c000000000077b84] pcibios_setup_bus_devices+0x54/0x100
  [ 0.017906] [c0000000fa45fa10] [c0000000000793f0] __of_scan_bus+0x160/0x310
  [ 0.017973] [c0000000fa45faf0] [c000000000075fc0] pcibios_scan_phb+0x330/0x390
  [ 0.018054] [c0000000fa45fba0] [c00000000139217c] pcibios_init+0x8c/0x128
  [ 0.018121] [c0000000fa45fc20] [c0000000000107b0] do_one_initcall+0x60/0x2c0
  [ 0.018201] [c0000000fa45fcf0] [c000000001384624] kernel_init_freeable+0x290/0x378
  [ 0.018280] [c0000000fa45fdb0] [c000000000010d24] kernel_init+0x2c/0x148
  [ 0.018348] [c0000000fa45fe20] [c00000000000bdbc] ret_from_kernel_thread+0x5c/0x80
  [ 0.018427] Instruction dump:
  [ 0.018468] 41820014 3920fe7f 7d494838 7d290074 7929d182 f8e10038 69290001 0b090000
  [ 0.018552] 7a098420 0b090000 7bc95960 7929a802 <0b090000> 7fc68b78 e8610048 7dc47378
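For readers who want the control flow in isolation, the condition described above can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: every `model_*` / `MODEL_*` name is hypothetical, the flag value is illustrative rather than the kernel's, and `model_map()` fails cleanly on the -1 address, whereas the real ioremap() path ended in the BUG shown in the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions (illustrative values). */
#define MODEL_FLAG_H_INT_ESB 0x20u       /* "use the H_INT_ESB hcall" */
#define MODEL_INVALID_PAGE   UINT64_MAX  /* the -1 reported for ESB pages */

struct model_irq_data {
	uint64_t flags;
	uint64_t eoi_page;   /* address of the ESB page, or -1 */
	uint32_t esb_shift;
	uint32_t hw_irq;
	int mapped;          /* 1 once the ESB page has been "mapped" */
};

/* Assumed stand-in for ioremap(): refuses the invalid address. */
static int model_map(uint64_t page)
{
	return page != MODEL_INVALID_PAGE;
}

/* Returns 0 on success, -1 if the ESB page could not be mapped. */
static int model_populate(struct model_irq_data *d, uint32_t hw_irq)
{
	assert(d != NULL);

	/* Record the interrupt number before any early return. */
	d->hw_irq = hw_irq;
	d->mapped = 0;

	/* hcall-managed source: there are no ESB pages to map. */
	if (d->flags & MODEL_FLAG_H_INT_ESB)
		return 0;

	if (!model_map(d->eoi_page))
		return -1;

	d->mapped = 1;
	return 0;
}
```

In this model, a source with the flag set and `eoi_page` equal to -1 returns 0 with `mapped == 0` and `hw_irq` filled in, while a source without the flag goes through the mapping step as before.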
I hit this too, and your patch works for me. I can't claim to understand it, but I can verify it! :)
Tested-by: Daniel Axtens <dja@axtens.net>

Regards,
Daniel
Cc: stable@vger.kernel.org # v4.14+
Fixes: bed81ee181dd ("powerpc/xive: introduce H_INT_ESB hcall")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/spapr.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index 33c10749edec..55dc61cb4867 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -392,20 +392,28 @@ static int xive_spapr_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)
 	data->esb_shift = esb_shift;
 	data->trig_page = trig_page;
 
+	data->hw_irq = hw_irq;
+
 	/*
 	 * No chip-id for the sPAPR backend. This has an impact how we
 	 * pick a target. See xive_pick_irq_target().
 	 */
 	data->src_chip = XIVE_INVALID_CHIP_ID;
 
+	/*
+	 * When the H_INT_ESB flag is set, the H_INT_ESB hcall should
+	 * be used for interrupt management. Skip the remapping of the
+	 * ESB pages which are not available.
+	 */
+	if (data->flags & XIVE_IRQ_FLAG_H_INT_ESB)
+		return 0;
+
 	data->eoi_mmio = ioremap(data->eoi_page, 1u << data->esb_shift);
 	if (!data->eoi_mmio) {
 		pr_err("Failed to map EOI page for irq 0x%x\n", hw_irq);
 		return -ENOMEM;
 	}
 
-	data->hw_irq = hw_irq;
-
 	/* Full function page supports trigger */
 	if (flags & XIVE_SRC_TRIGGER) {
 		data->trig_mmio = data->eoi_mmio;
-- 
2.21.0