Failed boards:
cubie2 sunxi_defconfig : FAILED 1:20.72
hummingboard multi_v7_defconfig : FAILED 1:04.66
panda multi_v7_defconfig : FAILED 1:36.13
snowball multi_v7_defconfig : FAILED 1:13.33
wandboard multi_v7_defconfig : FAILED 1:03.73
Successful boards:
apc vt8500_v6_v7_defconfig : passed 1:19.42
arndale exynos_defconfig : passed 1:11.87
bbb omap2plus_defconfig : passed 1:42.18
bbb multi_v7_defconfig : passed 1:19.92
beaver tegra_defconfig : passed 0:55.90
beaver multi_v7_defconfig : passed 0:47.47
capri bcm_defconfig : passed 0:46.07
capri multi_v7_defconfig : passed 0:44.10
cubie sunxi_defconfig : passed 0:52.92
cubie multi_v7_defconfig : passed 0:59.36
cubie2 multi_v7_defconfig : passed 1:02.06
cubie2 multi_lpae_defconfig : passed 0:51.29
dalmore tegra_defconfig : passed 1:19.83
dalmore multi_v7_defconfig : passed 1:15.50
dalmore multi_lpae_defconfig : passed 1:16.91
hummingboard imx_v6_v7_defconfig : passed 1:23.67
omap5uevm omap2plus_defconfig : warnings 1:03.83
omap5uevm multi_v7_defconfig : warnings 1:41.24
omap5uevm multi_lpae_defconfig : warnings 1:38.53
panda omap2plus_defconfig : warnings 1:14.19
sama5 sama5_defconfig : passed 1:59.50
seaboard tegra_defconfig : passed 1:04.55
seaboard multi_v7_defconfig : passed 1:07.14
snow exynos_defconfig : passed 1:32.08
snowball u8500_defconfig : passed 1:40.75
trimslice tegra_defconfig : passed 1:02.08
trimslice multi_v7_defconfig : passed 1:04.64
wandboard imx_v6_v7_defconfig : passed 0:59.17
Offline boards:
Board legend is available at http://arm-soc.lixom.net/boards.html
Last entries of failed logs below:
========================================================================
Board cubie2-sunxi_defconfig failure log:
-------------------------------------------------
[ 0.017830] platform ahci-5v.2: Driver reg-fixed-voltage requests probe deferral
[ 0.017853] reg-fixed-voltage usb1-vbus.3: could not find pctldev for node /soc@01c00000/pinctrl@01c20800/usb1_vbus_pin@0, deferring probe
[ 0.017867] platform usb1-vbus.3: Driver reg-fixed-voltage requests probe deferral
[ 0.017889] reg-fixed-voltage usb2-vbus.4: could not find pctldev for node /soc@01c00000/pinctrl@01c20800/usb2_vbus_pin@0, deferring probe
[ 0.017902] platform usb2-vbus.4: Driver reg-fixed-voltage requests probe deferral
[ 0.018844] Switched to clocksource arch_sys_counter
[ 0.025950] NET: Registered protocol family 2
[ 0.026420] TCP established hash table entries: 8192 (order: 3, 32768 bytes)
[ 0.026508] TCP bind hash table entries: 8192 (order: 4, 65536 bytes)
[ 0.026629] TCP: Hash tables configured (established 8192 bind 8192)
[ 0.026714] TCP: reno registered
[ 0.026728] UDP hash table entries: 512 (order: 2, 16384 bytes)
[ 0.026784] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
[ 0.026986] NET: Registered protocol family 1
[ 0.027385] RPC: Registered named UNIX socket transport module.
[ 0.027399] RPC: Registered udp transport module.
[ 0.027405] RPC: Registered tcp transport module.
[ 0.027411] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 0.028380] futex hash table entries: 512 (order: 3, 32768 bytes)
[ 0.028946] bounce pool size: 64 pages
[ 0.036731] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
[ 0.036748] io scheduler noop registered
[ 0.036755] io scheduler deadline registered
[ 0.036940] io scheduler cfq registered (default)
[ 0.038581] sunxi-pinctrl 1c20800.pinctrl: initialized sunXi PIO driver
[ 0.080430] Serial: 8250/16550 driver, 8 ports, IRQ sharing disabled
[ 0.082866] console [ttyS0] disabled
[ 0.103071] 1c28000.serial: ttyS0 at MMIO 0x1c28000 (irq = 33, base_baud = 1500000) is a U6_16550A
[ 0.618073] console [ttyS0] enabled
[ 0.622789] mousedev: PS/2 mouse device common for all mice
[ 0.628614] i2c /dev entries driver
[ 0.633491] sunxi-wdt 1c20c90.watchdog: Watchdog enabled (timeout=16 sec, nowayout=0)
[ 0.642490] TCP: cubic registered
[ 0.645813] NET: Registered protocol family 17
[ 0.650374] Registering SWP/SWPB emulation handler
[ 0.656082] ahci-5v: 5000 mV
[ 0.659375] usb1-vbus: 5000 mV
[ 0.662737] usb2-vbus: 5000 mV
~$off # PYBOOT: Exception: timeout
========================================================================
Board hummingboard-multi_v7_defconfig failure log:
-------------------------------------------------
C1 U-Boot > tftp 0x10800000 next/next-20140409/multi_v7_defconfig/zImage
tftp 0x10800000 next/next-20140409/multi_v7_defconfig/zImage
Using FEC device
TFTP from server 172.16.1.3; our IP address is 172.16.1.111
Filename 'next/next-20140409/multi_v7_defconfig/zImage'.
Load address: 0x10800000
Loading: *#################################################################
#################################################################
#################################################################
#################################################################
########################################################
2 MiB/s
done
Bytes transferred = 4631744 (46acc0 hex)
C1 U-Boot >setenv serverip 172.16.1.3
setenv serverip 172.16.1.3
C1 U-Boot > tftp 0x11000000 next/next-20140409/multi_v7_defconfig/dtbs/imx6dl-hummingboard.dtb
tftp 0x11000000 next/next-20140409/multi_v7_defconfig/dtbs/imx6dl-hummingboard.dtb
Using FEC device
TFTP from server 172.16.1.3; our IP address is 172.16.1.111
Filename 'next/next-20140409/multi_v7_defconfig/dtbs/imx6dl-hummingboard.dtb'.
Load address: 0x11000000
Loading: *##
745.1 KiB/s
done
Bytes transferred = 25956 (6564 hex)
C1 U-Boot > printenv bootargs
printenv bootargs
bootargs=console=ttymxc0,115200 root=/dev/mmcblk0p2 rootwait debug earlyprintk
C1 U-Boot > bootz 0x10800000 - 0x11000000
bootz 0x10800000 - 0x11000000
Kernel image @ 0x10800000 [ 0x000000 - 0x46acc0 ]
## Flattened Device Tree blob at 11000000
Booting using the fdt blob at 0x11000000
Using Device Tree in place at 11000000, end 11009563
Starting kernel ...
~$off # PYBOOT: Exception: timeout
========================================================================
Board panda-multi_v7_defconfig failure log:
-------------------------------------------------
[ 1.617340] omap_hsmmc 4809c000.mmc: unable to get vmmc regulator -517
[ 1.624237] platform 4809c000.mmc: Driver omap_hsmmc requests probe deferral
[ 1.631927] omap_hsmmc 480d5000.mmc: unable to get vmmc regulator -517
[ 1.638824] platform 480d5000.mmc: Driver omap_hsmmc requests probe deferral
[ 1.646728] sdhci-pltfm: SDHCI platform and OF driver helper
[ 1.653594] usbcore: registered new interface driver usbhid
[ 1.659454] usbhid: USB HID core driver
[ 1.667358] TCP: cubic registered
[ 1.670867] NET: Registered protocol family 17
[ 1.675720] Key type dns_resolver registered
[ 1.680664] Power Management for TI OMAP4+ devices.
[ 1.685821] Power Management for TI OMAP4.
[ 1.690124] OMAP4 PM: u-boot >= v2012.07 is required for full PM support
[ 1.697265] Registering SWP/SWPB emulation handler
[ 1.703277] vwl1271: 1800 mV
[ 1.707733] Skipping twl internal clock init and using bootloader value (unknown osc rate)
[ 1.717651] twl 0-0048: PIH (irq 39) nested IRQs
[ 1.723327] twl_rtc rtc.15: Power up reset detected.
[ 1.729187] twl_rtc rtc.15: Enabling TWL-RTC
[ 1.736145] twl_rtc rtc.15: rtc core: registered rtc.15 as rtc0
[ 1.743103] VAUX1_6030: 1000 <--> 3000 mV at 2800 mV
[ 1.749176] VAUX2_6030: 1200 <--> 2800 mV at 1800 mV
[ 1.755279] VAUX3_6030: 1000 <--> 3000 mV at 1200 mV
[ 1.761444] VMMC: 1200 <--> 3000 mV at 3000 mV
[ 1.766937] VPP: 1800 <--> 2500 mV at 1900 mV
[ 1.772399] VUSIM: 1200 <--> 2900 mV at 1800 mV
[ 1.777404] VDAC: 1800 mV
[ 1.781158] VANA: 2100 mV
[ 1.784698] VCXIO: 1800 mV
[ 1.784698] VUSB: 3300 mV
[ 1.791656] V1V8: 1800 mV
[ 1.795196] V2V1: 2100 mV
[ 1.879302] usb 1-1: new high-speed USB device number 2 using ehci-omap
[ 2.049896] hub 1-1:1.0: USB hub found
[ 2.053985] hub 1-1:1.0: 5 ports detected
[ 2.356109] usb 1-1.1: new high-speed USB device number 3 using ehci-omap
[ 2.671478] smsc95xx v1.0.4
[ 2.744995] smsc95xx 1-1.1:1.0 eth0: register 'smsc95xx' at usb-4a064c00.ehci-1.1, smsc95xx USB 2.0 Ethernet, 86:25:f9:08:f1:7b
~$off # PYBOOT: Exception: timeout
========================================================================
Board snowball-multi_v7_defconfig failure log:
-------------------------------------------------
BOOTP broadcast 1
*** Unhandled DHCP Option in OFFER/ACK: 28
*** Unhandled DHCP Option in OFFER/ACK: 28
DHCP client bound to address 172.16.1.157
U8500 $ setenv serverip 172.16.1.3
setenv serverip 172.16.1.3
U8500 $ tftp 0x00100000 tmp/snowball-kbfD_h/tmp0N8RpD-uImage
tftp 0x00100000 tmp/snowball-kbfD_h/tmp0N8RpD-uImage
smc911x: detected LAN9221 controller
smc911x: phy initialized
smc911x: MAC 08:00:08:1e:0f:44
Using smc911x-0 device
TFTP from server 172.16.1.3; our IP address is 172.16.1.157
Filename 'tmp/snowball-kbfD_h/tmp0N8RpD-uImage'.
Load address: 0x100000
Loading: *#################################################################
#################################################################
#################################################################
#################################################################
###########################################################
done
Bytes transferred = 4673057 (474e21 hex)
U8500 $ printenv bootargs
printenv bootargs
bootargs=console=ttyAMA2,115200 root=/dev/mmcblk0p4 rootwait rw debug earlyprintk
U8500 $ bootm 0x00100000
bootm 0x00100000
## Booting kernel from Legacy Image at 00100000 ...
Image Name: Linux
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 4672993 Bytes = 4.5 MB
Load Address: 00008000
Entry Point: 00008000
Loading Kernel Image ... OK
OK
Starting kernel ...
~$off # PYBOOT: Exception: timeout
========================================================================
Board wandboard-multi_v7_defconfig failure log:
-------------------------------------------------
=> tftp 0x10800000 next/next-20140409/multi_v7_defconfig/zImage
tftp 0x10800000 next/next-20140409/multi_v7_defconfig/zImage
Using FEC device
TFTP from server 172.16.1.3; our IP address is 172.16.1.110
Filename 'next/next-20140409/multi_v7_defconfig/zImage'.
Load address: 0x10800000
Loading: *#################################################################
#################################################################
#################################################################
#################################################################
########################################################
6.1 MiB/s
done
Bytes transferred = 4631744 (46acc0 hex)
=> setenv serverip 172.16.1.3
setenv serverip 172.16.1.3
=> tftp 0x11000000 next/next-20140409/multi_v7_defconfig/dtbs/imx6q-wandboard.dtb
tftp 0x11000000 next/next-20140409/multi_v7_defconfig/dtbs/imx6q-wandboard.dtb
Using FEC device
TFTP from server 172.16.1.3; our IP address is 172.16.1.110
Filename 'next/next-20140409/multi_v7_defconfig/dtbs/imx6q-wandboard.dtb'.
Load address: 0x11000000
Loading: *###
1.1 MiB/s
done
Bytes transferred = 29394 (72d2 hex)
=> printenv bootargs
printenv bootargs
bootargs=console=ttymxc0,115200 root=/dev/mmcblk0p2 rootwait debug
=> bootz 0x10800000 - 0x11000000
bootz 0x10800000 - 0x11000000
Kernel image @ 0x10800000 [ 0x000000 - 0x46acc0 ]
## Flattened Device Tree blob at 11000000
Booting using the fdt blob at 0x11000000
Using Device Tree in place at 11000000, end 1100a2d1
Starting kernel ...
~$off # PYBOOT: Exception: timeout
On Wed, Apr 09, 2014 at 01:14:51AM -0700, Olof's autobooter wrote:
Failed boards:
cubie2 sunxi_defconfig : FAILED 1:20.72
hummingboard multi_v7_defconfig : FAILED 1:04.66
panda multi_v7_defconfig : FAILED 1:36.13
snowball multi_v7_defconfig : FAILED 1:13.33
wandboard multi_v7_defconfig : FAILED 1:03.73
Well, it looks like the Dove fix isn't the full story, and boards are still broken even with the PJ4 fix in place.
On Wed, Apr 9, 2014 at 5:31 AM, Russell King - ARM Linux linux@arm.linux.org.uk wrote:
Well, it looks like the Dove fix isn't the full story, and boards are still broken even with the PJ4 fix in place.
mx6 is booting fine with multi_v7_defconfig here (and also in Kevin's boot system).
As Kevin pointed out, the issue Olof is seeing is probably due to dtb and zImage overlap, so he needs to properly adjust the loadaddr/fdt_addr.
I am using:
loadaddr=0x12000000
fdt_addr=0x18000000
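For anyone checking their own U-Boot environment, the address arithmetic can be sketched in a few lines of Python. This is only an illustration, not part of pyboot or U-Boot: the helper names are made up, the sizes come from the hummingboard log above, and reusing that zImage size with the loadaddr/fdt_addr pair quoted here is an assumption, since the size of that image isn't given. It looks only at the images as tftp placed them in memory; whatever the decompressor does afterwards is not modelled.

```python
def overlaps(start_a, size_a, start_b, size_b):
    """True if [start_a, start_a+size_a) and [start_b, start_b+size_b) intersect."""
    return start_a < start_b + size_b and start_b < start_a + size_a


def headroom(image_addr, image_size, dtb_addr):
    """Bytes between the end of the loaded image and the start of the dtb.

    A negative result means the image runs into the dtb.
    """
    return dtb_addr - (image_addr + image_size)


# Sizes taken from the hummingboard failure log.
ZIMAGE_SIZE = 0x46ACC0   # "Bytes transferred = 4631744 (46acc0 hex)"
DTB_SIZE = 0x6564        # "Bytes transferred = 25956 (6564 hex)"

# Layout used by the autobooter in the failing run.
FAILING = {"loadaddr": 0x10800000, "fdt_addr": 0x11000000}
# Layout quoted in this mail. ASSUMPTION: same zImage size as above.
SUGGESTED = {"loadaddr": 0x12000000, "fdt_addr": 0x18000000}

if __name__ == "__main__":
    for name, layout in (("failing", FAILING), ("suggested", SUGGESTED)):
        clash = overlaps(layout["loadaddr"], ZIMAGE_SIZE,
                         layout["fdt_addr"], DTB_SIZE)
        gap = headroom(layout["loadaddr"], ZIMAGE_SIZE, layout["fdt_addr"])
        print(f"{name}: overlap as loaded={clash}, headroom={gap:#x} bytes")
```

The two files do not collide as loaded in either layout; the difference is how much room is left between the end of the zImage and the dtb (a few MiB in the failing layout against tens of MiB in the other).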
On Wed, Apr 9, 2014 at 6:33 AM, Fabio Estevam festevam@gmail.com wrote:
mx6 is booting fine with multi_v7_defconfig here (and also in Kevin's boot system).
As Kevin pointed out, the issue Olof is seeing is probably due to dtb and zImage overlap, so he needs to properly adjust the loadaddr/fdt_addr.
Yes. I suspect it's load addr related since I ran into similar failures.
Also note that next/master doesn't have MACH_DOVE enabled (and thus doesn't have CONFIG_CPU_PJ4 enabled) because arm-soc/for-next still has a revert of the MACH_DOVE change while we were waiting for the PJ4 fixes to be merged. Now that you've applied them, we'll drop this revert from arm-soc/for-next.
Kevin
On Wed, Apr 9, 2014 at 8:33 AM, Kevin Hilman khilman@linaro.org wrote:
Yes. I suspect it's load addr related since I ran into similar failures.
I've moved loadaddrs around, and the i.MX platforms and panda all boot now (1 out of 1 try).
Snowball still has an issue with multi_v7, and cubie2 still fails with sunxi_defconfig due to an out-of-date u-boot (don't get me started on that....). I'll bisect snowball shortly.
-Olof
On Wed, Apr 9, 2014 at 4:37 PM, Olof Johansson olof@lixom.net wrote:
I've moved loadaddrs around, and the i.MX platforms and panda all boot now (1 out of 1 try).
Snowball still has an issue with multi_v7 and cubie2 with sunxi_defconfig due to out-of-date u-boot (don't get me started on that....). I'll bisect snowball shortly.
The snowball failure with multi_v7_defconfig is probably down to the uImage load addr and entry point.
_text has had a 2M offset since the qcom platforms were added to multi_v7, so the snowball uImage has to be created accordingly.
I added a feature to pyboot to get the uImage loadaddr/entrypoint from _text in System.map.
Kevin
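A rough idea of what deriving the uImage addresses from System.map can look like is sketched below in Python. This is not pyboot's actual code: the function names are hypothetical, PAGE_OFFSET = 0xC0000000 is an assumed default, the System.map excerpt is invented for the example, and a RAM base of 0 for snowball is inferred from the "Load Address: 00008000" line in the log above. The mkimage options used (-A, -O, -T, -C, -a, -e, -n, -d) are the standard ones; the command is only built here, not run.

```python
PAGE_OFFSET = 0xC0000000  # ASSUMPTION: default 3G/1G kernel/user split


def text_vaddr(system_map):
    """Return the virtual address of the _text symbol from System.map text."""
    for line in system_map.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[2] == "_text":
            return int(fields[0], 16)
    raise ValueError("_text not found in System.map")


def uimage_addr(system_map, phys_base, page_offset=PAGE_OFFSET):
    """Physical load/entry address: RAM base plus the offset of _text."""
    return phys_base + (text_vaddr(system_map) - page_offset)


def mkimage_args(addr, zimage="zImage", uimage="uImage"):
    """Build (but do not run) a mkimage command line for an ARM Linux uImage."""
    hexaddr = f"0x{addr:08x}"
    return ["mkimage", "-A", "arm", "-O", "linux", "-T", "kernel",
            "-C", "none", "-a", hexaddr, "-e", hexaddr,
            "-n", "Linux", "-d", zimage, uimage]


if __name__ == "__main__":
    # Invented System.map excerpt showing a 2M text offset.
    sample = "c0208000 T _text\nc0208000 T stext\n"
    addr = uimage_addr(sample, phys_base=0x00000000)
    print(" ".join(mkimage_args(addr)))
```

With a 2M offset the address comes out as 0x00208000 rather than the 0x00008000 that the failing uImage was built with.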
On Wed, Apr 09, 2014 at 04:37:43PM -0700, Olof Johansson wrote:
Snowball still has an issue with multi_v7 and cubie2 with sunxi_defconfig due to out-of-date u-boot (don't get me started on that....). I'll bisect snowball shortly.
I can start you on that if you want.
Seriously, it seems like we have only *wrong* options here.
- Either we do it the old-fashioned way, with the smp_ops. But according to the Xen/KVM people, it's a no-go: because HYP is non-secure, you have to implement PSCI, and set up the proper arch timer frequency in the bootloader.
- Or, we do push everything into the bootloader, but then, *you* rant because you have to update the bootloader.
- Or, we do nothing, but then, everyone is displeased because there's no mainline support / <vendor> does his kernel completely out of tree.
Please advise on the best course of action.
I'm sorry, but this is what you get when you want to push more logic into the bootloader. It doesn't matter whether it's for a good or a bad reason, but it's a fact that you become dependent on the bootloader (and that's the whole point).
On Wed, Apr 09, 2014 at 08:33:56AM -0700, Kevin Hilman wrote:
Yes. I suspect it's load addr related since I ran into similar failures.
Also note that next/master doesn't have MACH_DOVE enabled (and thus doesn't have CONFIG_CPU_PJ4 enabled) because arm-soc/for-next still has a revert of the MACH_DOVE change while we were waiting for the PJ4 fixes to be merged. Now that you've applied them, we'll drop this revert from arm-soc/for-next.
So now that the remainder of arm-soc has merged into mainline, I'm seeing a combined Versatile Express + OMAP4 kernel failing to boot on OMAP4, while it boots fine on Versatile Express.
So no, this has nothing to do with PJ4 (neither iwmmxt nor pj4-cp0 appear in the build log), and it has nothing to do with load addresses. It looks like some investigation is required and that there is some breakage with what was remaining in arm-soc after all.
On Thu, Apr 10, 2014 at 10:10:32AM +0100, Russell King - ARM Linux wrote:
So now that the remainder of arm-soc has merged into mainline, I'm seeing a combined Versatile Express + OMAP4 kernel failing to boot on OMAP4, while it boots fine on Versatile Express.
So no, this has nothing to do with PJ4 (neither iwmmxt nor pj4-cp0 appear in the build log), and it has nothing to do with load addresses. It looks like some investigation is required and that there is some breakage with what was remaining in arm-soc after all.
It looks like this is caused by something in the depths of OMAPDSS. I see a failure to allocate an order-9 page (2MB), which then provokes the OMAP DSS cleanup paths.
We get to dss_dispc_uninitialize_irq(), which calls devm_free_irq(). Because CONFIG_DEBUG_SHIRQ is enabled, __free_irq ends up calling the handler, and we hang somewhere in the handler.
One for Tomi I think.
My previous boots have succeeded in spite of the allocation failure, so something has changed between 18a1a7a1d862 and a7963eb7f4c4 (Linus' commits) which has caused this to hang.
The last few messages from my debugging are:
omapfb omapfb: failed to allocate framebuffer
omapfb omapfb: failed to allocate fbmem
omapdss_compat_uninit()
APPLY: dss_dispc_uninitialize_irq()
omapdss_dispc 58001000.dispc: find_dr: dr=c1363dc0
devm_irq_match: irq 57,57 data c0c0a350,c0c0a350
devm_free_irq: 57 c0c0a350
genirq: __free_irq: calling omap_dispc_irq_handler+0x0/0x11c
Hi,
On 10/04/14 14:33, Russell King - ARM Linux wrote:
It looks like this is caused by something in the depths of OMAPDSS. I see a failure to allocate an order-9 page (2MB), which then provokes the OMAP DSS cleanup paths.
Is CONFIG_CMA disabled? I don't think we're able to allocate the framebuffer without CMA, except for very very small framebuffers.
We get to dss_dispc_uninitialize_irq(), which calls devm_free_irq(). Because CONFIG_DEBUG_SHIRQ is enabled, __free_irq ends up calling the handler, and we hang somewhere in the handler.
One for Tomi I think.
My previous boots have succeeded in spite of the allocation failure, so something has changed between 18a1a7a1d862 and a7963eb7f4c4 (Linus' commits) which has caused this to hang.
The last few messages from my debugging are:
omapfb omapfb: failed to allocate framebuffer
omapfb omapfb: failed to allocate fbmem
omapdss_compat_uninit()
APPLY: dss_dispc_uninitialize_irq()
omapdss_dispc 58001000.dispc: find_dr: dr=c1363dc0
devm_irq_match: irq 57,57 data c0c0a350,c0c0a350
devm_free_irq: 57 c0c0a350
genirq: __free_irq: calling omap_dispc_irq_handler+0x0/0x11c
The dss irq handler presumes that the DSS hardware is enabled, which is not the case at uninitialize time. So it crashes when the handler tries to access DISPC registers.
This doesn't happen normally, as the DSS IRQ is only shared between two DSS submodules, DISPC and DSI, and when one of them is enabled, effectively both are enabled. But even so, the code is not correct.
That said, I don't understand why it breaks now but not earlier, nothing has changed around that. Hmm, except now we use proper DT bindings, so the IRQ comes from DT. But I don't see why that would affect this.
I wonder what's the correct way to handle shared interrupts... Should I always keep the HW enabled when an irq handler is registered, or should I check whether the HW is enabled or not in the irq handler?
The latter sounds like an easy source of race issues, so I guess the former is better. It means doing request/free_irq at runtime instead of at probe time, based on whether a display is enabled or not, though.
Tomi
On Thu, Apr 10, 2014 at 03:36:56PM +0300, Tomi Valkeinen wrote:
The dss irq handler presumes that the DSS hardware is enabled, which is not the case at uninitialize time. So it crashes when the handler tries to access DISPC registers.
This doesn't happen normally, as the DSS IRQ is only shared between two DSS submodules, DISPC and DSI, and when one of them is enabled, effectively both are enabled. But even so, the code is not correct.
That said, I don't understand why it breaks now but not earlier, nothing has changed around that. Hmm, except now we use proper DT bindings, so the IRQ comes from DT. But I don't see why that would affect this.
It looks like the updates stopped DSI from initialising.
Before:
backlight.9 supply power not found, using dummy regulator
pwm-backlight backlight.9: unable to request PWM, trying legacy API
pwm-backlight backlight.9: unable to request legacy PWM
platform backlight.9: Driver pwm-backlight requests probe deferral
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1 at drivers/video/omap2/dss/dss.c:481 dss_set_fck_rate+0x88/0xb4()
clk rate mismatch: 153600000 != 170666666
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 3.14.0+ #1
...
[<c02221b4>] (dss_set_fck_rate) from [<c0606bbc>] (omap_dsshw_probe+0x1ec/0x304)
r6:c16afc10 r5:5b8d8000 r4:00000001
[<c06069d0>] (omap_dsshw_probe) from [<c028b7c8>] (platform_drv_probe+0x24/0x54)
r7:00000000 r6:00000000 r5:c0654e24 r4:c16afc10
[<c028b7a4>] (platform_drv_probe) from [<c0289e68>] (really_probe+0xf8/0x2e8)
...
---[ end trace 3406ff24bd97382f ]---
OMAP DSS rev 4.0
omapdss_dsi.0 supply vdds_dsi not found, using dummy regulator
omapdss_hdmi supply vdda_hdmi_dac not found, using dummy regulator
omapdss_dsi.1 supply vdds_dsi not found, using dummy regulator
swapper/0: page allocation failure: order:9, mode:0xd0
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 3.14.0+ #1
After:
platform backlight.9: Driver pwm-backlight requests probe deferral
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1 at drivers/video/omap2/dss/dss.c:483 dss_set_fck_rate+0x88/0xb4()
clk rate mismatch: 153600000 != 170666666
...
---[ end trace 3406ff24bd97382f ]---
OMAP DSS rev 4.0
panel-dsi-cm display.14: Failed to connect to video source
omapfb omapfb: failed to connect default display
omapfb omapfb: failed to init overlay connections
omapfb omapfb: failed to setup omapfb
platform omapfb: Driver omapfb requests probe deferral
...
backlight.9 supply power not found, using dummy regulator
kworker/u4:0: page allocation failure: order:9, mode:0xd0
CPU: 1 PID: 6 Comm: kworker/u4:0 Tainted: G W 3.14.0+ #1
Workqueue: deferwq deferred_probe_work_func
...
omapfb omapfb: failed to allocate framebuffer
omapfb omapfb: failed to allocate fbmem
I wonder what's the correct way to handle shared interrupts... Should I always keep the HW enabled when an irq handler is registered, or should I check whether the HW is enabled or not in the irq handler?
If the interrupt is registered as a shared interrupt, it means that the interrupt may be shared with another device, which may trigger that interrupt at any moment, whether or not your hardware is accessible.
That means you must:
- have some way to make it accessible in the interrupt handler
- have some way to know that the interrupt could never have come from the hardware (and thus return IRQ_NONE)
- not register it as a shared interrupt, thus ensuring that no other device driver could register and raise an interrupt when you're not ready to deal with it
PS, as you can see from the above, there's another bug in OMAPDSS with dss_set_fck_rate().
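The failure mode and the second option in the list above can be illustrated with a toy model. The Python below is not kernel code and not the OMAPDSS driver: ToyDispc, both handlers and free_irq_with_debug_shirq are hypothetical names, and the only behaviour borrowed from the thread is that, with CONFIG_DEBUG_SHIRQ, freeing a shared IRQ calls the handler once more. It also ignores locking entirely, which is where the race concern about checking an "enabled" flag comes from.

```python
IRQ_NONE, IRQ_HANDLED = 0, 1


class ToyDispc:
    """Toy device whose registers can only be read while it is enabled."""

    def __init__(self):
        self.enabled = False
        self.irqstatus = 0

    def read_irqstatus(self):
        if not self.enabled:
            # Stands in for the hang on touching registers of disabled hardware.
            raise RuntimeError("register access while hardware is disabled")
        return self.irqstatus


def unsafe_handler(dev):
    """Assumes the hardware is always enabled when the handler runs."""
    return IRQ_HANDLED if dev.read_irqstatus() else IRQ_NONE


def guarded_handler(dev):
    """Returns IRQ_NONE without touching registers if the hardware is off."""
    if not dev.enabled:
        return IRQ_NONE
    return IRQ_HANDLED if dev.read_irqstatus() else IRQ_NONE


def free_irq_with_debug_shirq(handler, dev):
    """Mimics the extra handler call made while a shared IRQ is being freed."""
    return handler(dev)


if __name__ == "__main__":
    dev = ToyDispc()  # disabled, as on the cleanup path
    print("guarded:", free_irq_with_debug_shirq(guarded_handler, dev))
    try:
        free_irq_with_debug_shirq(unsafe_handler, dev)
    except RuntimeError as exc:
        print("unsafe:", exc)
```

In a real driver the "enabled" test would have to be made safe against the hardware being powered down concurrently, so this shows the shape of the check rather than a fix.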
On 10/04/14 15:49, Russell King - ARM Linux wrote:
That said, I don't understand why it breaks now but not earlier, nothing has changed around that. Hmm, except now we use proper DT bindings, so the IRQ comes from DT. But I don't see why that would affect this.
It looks like the updates stopped DSI from initialising.
Hm, so are we talking about Linus' master, or linux-next? I'm testing on Linus' tree as of today, with multi_v7_defconfig + DSS enabled, and the display on 4430 SDP comes up fine for me (something causes the boot to halt right after mounting my rootfs, but that's probably a different thing). And the display seems to work with CMA and without CMA.
Can you share the .config?
If the interrupt is registered as a shared interrupt, it means that the interrupt may be shared with another device, which may trigger that interrupt at any moment, whether or not your hardware is accessible.
That means you must:
- have some way to make it accessible in the interrupt handler
- have some way to know that the interrupt could never have come from the hardware (and thus return IRQ_NONE)
- not register it as a shared interrupt, thus ensuring that no other device driver could register and raise an interrupt when you're not ready to deal with it
I'll try to come up with a proper fix. For the time being, the attached hack patch makes the problem go away with CONFIG_DEBUG_SHIRQ, which should allow booting even if the omapfb fails to start.
PS, as you can see from the above, there's another bug in OMAPDSS with dss_set_fck_rate().
I have a fix for that, but it depends on drivers/clk changes so I could not send it with the main pull request. The warning should be relatively harmless. I've attached the patch in any case.
Tomi
On Thu, Apr 10, 2014 at 04:31:40PM +0300, Tomi Valkeinen wrote:
On 10/04/14 15:49, Russell King - ARM Linux wrote:
That said, I don't understand why it breaks now but not earlier, nothing has changed around that. Hmm, except now we use proper DT bindings, so the IRQ comes from DT. But I don't see why that would affect this.
It looks like the updates stopped DSI from initialising.
Hm, so are we talking about Linus' master, or linux-next? I'm testing on Linus' tree for today, with multi_v7_defconfig + DSS enabled, the display on 4430 SDP comes up fine for me (but something causes the boot to halt right after mounting my rootfs, but that's probably different thing).
As always, it's from my autobuilder, which is Linus' master plus almost everything in my tree, plus maybe arm-soc if the conflicts aren't too great. Neither of the last two builds had arm-soc's for-next merged in though - the previous working one because there were too many (non-DT) conflicts, and the non-working build because arm-soc's for-next is now merged into Linus' tree.
I mentioned the two commits in Linus' tree which were the bases of the working tree (18a1a7a1d862) and the non-working tree (a7963eb7f4c4).
While there could be something in my tree which could have affected it, I doubt that - the delta between the working and non-working for my tree is:
.../DocBook/media/v4l/pixfmt-packed-rgb.xml | 39 +++++++++
.../bindings/staging/imx-drm/fsl-imx-drm.txt | 3 +-
arch/arm/include/asm/assembler.h | 42 ++++++++++
arch/arm/include/asm/cputype.h | 19 +++++
arch/arm/kernel/entry-armv.S | 11 ++-
arch/arm/kernel/entry-header.S | 11 ---
arch/arm/kernel/iwmmxt.S | 15 +++-
arch/arm/kernel/pj4-cp0.c | 4 +
arch/arm/kernel/traps.c | 1 +
arch/arm/mach-ep93xx/crunch-bits.S | 13 ++-
arch/arm/vfp/entry.S | 28 ++-----
arch/arm/vfp/vfphw.S | 19 ++---
drivers/staging/imx-drm/dw-hdmi-audio.c | 94 ++++++++++++++--------
drivers/staging/imx-drm/dw-hdmi-audio.h | 3 +
drivers/staging/imx-drm/imx-hdmi.c | 12 +--
drivers/staging/imx-drm/imx-hdmi.h | 5 --
drivers/staging/imx-drm/ipu-v3/ipu-dc.c | 9 +++
drivers/staging/imx-drm/ipu-v3/ipu-di.c | 2 +-
drivers/staging/imx-drm/ipuv3-crtc.c | 2 +-
drivers/staging/imx-drm/parallel-display.c | 2 +
include/uapi/linux/videodev2.h | 1 +
21 files changed, 230 insertions(+), 105 deletions(-)
which is basically just imx-drm changes, and the undefined instruction entry changes. Nothing there which would affect OMAPDSS.
And the display seems to work with CMA and without CMA.
Can you share the .config?
What's probably more relevant (which probably is a contributory factor to the allocation failure) is the command line arguments:
mem=512M vmalloc=1G
which I always supply to OMAP kernels - this forces almost all of the memory into highmem.
The config file nevertheless is:
http://www.arm.linux.org.uk/developer/build/file.php?lid=5490
On 10/04/14 17:26, Russell King - ARM Linux wrote:
> What's probably more relevant (which probably is a contributory factor to the allocation failure) is the command line arguments:
> mem=512M vmalloc=1G
> which I always supply to OMAP kernels - this forces almost all of the memory into highmem.
Is everything (well, at least display) supposed to work with that? If I try vmalloc=1G (on my config), I can't even boot with nfsroot on panda, as the USB seems to fail to initialize. And I see "cma: CMA: failed to reserve 48 MiB" at the very start of the kernel log.
Ah, I see. If I disable CMA in the kernel config, I am able to boot, although omapfb still fails to allocate the fb. However, I see spam from the oom-killer when the rootfs starts to boot.
What's the DMA-able memory when CMA is off? Is it the low mem? With vmalloc=1G, lowmem seems to be 32MB. I guess that's big enough in theory to be able to allocate a framebuffer, presuming it's not fragmented, and there's not much there, and, of course, depending on the resolution.
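To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes the 32MB lowmem figure observed above, a packed 32bpp framebuffer with no line padding, and a few example resolutions that are purely illustrative (they are not taken from the boards discussed here). The helper name framebuffer_bytes is hypothetical.

```python
MIB = 1024 * 1024


def framebuffer_bytes(width, height, bits_per_pixel=32, buffers=1):
    """Bytes needed for `buffers` packed framebuffers (no line padding assumed)."""
    bytes_per_pixel = (bits_per_pixel + 7) // 8
    return width * height * bytes_per_pixel * buffers


total_ram = 512 * MIB  # from mem=512M on the command line
lowmem = 32 * MIB      # lowmem figure observed with vmalloc=1G (from the thread)
highmem = total_ram - lowmem

print(f"highmem share of RAM: {highmem / total_ram:.1%}")

# Illustrative resolutions only; the real panel mode may differ.
for width, height in [(800, 480), (1280, 720), (1920, 1080)]:
    size = framebuffer_bytes(width, height)
    print(f"{width}x{height} @ 32bpp: {size / MIB:.2f} MiB "
          f"({size / lowmem:.1%} of lowmem)")
```

Even when a single buffer is a modest fraction of lowmem, it has to be found as one contiguous region, so fragmentation and other lowmem users matter more than the raw total.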
Tomi
On Fri, Apr 11, 2014 at 03:20:35PM +0300, Tomi Valkeinen wrote:
> On 10/04/14 17:26, Russell King - ARM Linux wrote:
>> What's probably more relevant (which probably is a contributory factor to the allocation failure) is the command line arguments:
>> mem=512M vmalloc=1G
>> which I always supply to OMAP kernels - this forces almost all of the memory into highmem.
> Is everything (well, at least display) supposed to work with that? If I try vmalloc=1G (on my config), I can't even boot with nfsroot on panda, as the USB seems to fail to initialize. And I see "cma: CMA: failed to reserve 48 MiB" at the very start of the kernel log.
> Ah, I see. If I disable CMA from the kernel config, I am able to boot, although omapfb still fails to allocate the fb. However, I see a spam from oom-killer when the rootfs starts to boot.
> What's the DMA-able memory when CMA is off? Is it the low mem? With vmalloc=1G, lowmem seems to be 32MB. I guess that's big enough in theory to be able to allocate a framebuffer, presuming it's not fragmented, and there's not much there, and, of course, depending on the resolution.
Yes, it is supposed to boot, and does boot. Here are the results from last night's build, with the runtime get/put added:
http://www.arm.linux.org.uk/developer/build/result.php?type=boot&idx=173...
All the previous boot instances are here:
http://www.arm.linux.org.uk/developer/build/index.php?id=2009
and for those which just cover OMAP4:
http://www.arm.linux.org.uk/developer/build/index.php?id=2001
As ever, full details are there - configuration files, full build logs and full boot logs.
On 11/04/14 15:37, Russell King - ARM Linux wrote:
> On Fri, Apr 11, 2014 at 03:20:35PM +0300, Tomi Valkeinen wrote:
>> On 10/04/14 17:26, Russell King - ARM Linux wrote:
>>> What's probably more relevant (which probably is a contributory factor to the allocation failure) is the command line arguments:
>>> mem=512M vmalloc=1G
>>> which I always supply to OMAP kernels - this forces almost all of the memory into highmem.
>> Is everything (well, at least display) supposed to work with that? If I try vmalloc=1G (on my config), I can't even boot with nfsroot on panda, as the USB seems to fail to initialize. And I see "cma: CMA: failed to reserve 48 MiB" at the very start of the kernel log.
>> Ah, I see. If I disable CMA from the kernel config, I am able to boot, although omapfb still fails to allocate the fb. However, I see a spam from oom-killer when the rootfs starts to boot.
>> What's the DMA-able memory when CMA is off? Is it the low mem? With vmalloc=1G, lowmem seems to be 32MB. I guess that's big enough in theory to be able to allocate a framebuffer, presuming it's not fragmented, and there's not much there, and, of course, depending on the resolution.
> Yes it is supposed to boot, and does boot. Here's the results - last nights with the runtime get/put added:
Well, I meant generally, not just for your config. At least with CMA enabled, the kernel doesn't seem to work. I see this in the log when booting with CMA, which probably affects drivers:
DMA: failed to allocate 256 KiB pool for atomic coherent allocation
And, as I mentioned, even without CMA, the OOM killer is triggered for me almost instantly after the rootfs is mounted (it seems to come from __alloc_pages_nodemask, with the first call from alloc_inode()).
But I see that enabling HIGHPTE helps, in the CMA case too. I guess lowmem is just so tight with vmalloc=1G that it's easy to run into problems.
> As ever, full details are there - configuration files, full build logs and full boot logs.
Not sure why, but the config links return a blank page for me. For example:
http://www.arm.linux.org.uk/developer/build/file.php?lid=5514
Tomi
On Fri, Apr 11, 2014 at 04:59:54PM +0300, Tomi Valkeinen wrote:
> Not sure why, but the config links return a blank page for me. For example:
> http://www.arm.linux.org.uk/developer/build/file.php?lid=5514
Should be fixed now.