Tony Lindgren tony@atomide.com writes:
Hi,
- Kevin Hilman khilman@baylibre.com [180817 15:47]:
Hmm, I think I spoke to soon, and now I don't think it's the MMC card.
I'm still seeing periodic failures on this board soon after the MMC init, but only in mainline and next: https://kernelci.org/boot/omap3-beagle-xm
Also, looking at that URL, you'll see that the failures are only for multi_v7 but not omap2plus_defconfig.
I was finally able to reproduce this here this morning with v4.18 after about 20 boot attempts. Looks like the system boots up, it just has a long pause. See the timestamps below where there is about 185 second pause:
[ 2.307800] mmc0: host does not support reading read-only switch, assuming write-enable [ 2.318237] mmc0: new high speed SDHC card at address 59b4 [ 2.325592] mmcblk0: mmc0:59b4 SD 14.7 GiB [ 2.333221] mmcblk0: p1 p2 [ 2.384490] ehci-omap 48064800.ehci: EHCI Host Controller [ 2.390045] ehci-omap 48064800.ehci: new USB bus registered, assigned bus number 2 [ 2.399261] ehci-omap 48064800.ehci: irq 93, io mem 0x48064800 [ 2.434387] ehci-omap 48064800.ehci: USB 2.0 started, EHCI 1.00 [ 2.441436] hub 2-0:1.0: USB hub found [ 2.445404] hub 2-0:1.0: 3 ports detected [ 2.451751] input: gpio_keys as /devices/platform/gpio_keys/input/input0 [ 2.460144] twl_rtc 48070000.i2c:twl@48:rtc: setting system clock to 2000-01-01 00:05:24 UTC (946685124) [ 2.814422] usb 2-2: new high-speed USB device number 2 using ehci-omap [ 3.016632] hub 2-2:1.0: USB hub found [ 3.020751] hub 2-2:1.0: 5 ports detected [ 3.344421] usb 2-2.1: new high-speed USB device number 3 using ehci-omap [ 3.498474] smsc95xx v1.0.6 [ 3.595184] smsc95xx 2-2.1:1.0 eth0: register 'smsc95xx' at usb-48064800.ehci-2.1, smsc95 xx USB 2.0 Ethernet, 11:22:33:44:55:66 [ 188.843536] random: crng init done [ 188.951354] smsc95xx 2-2.1:1.0 eth0: hardware isn't capable of remote wakeup [ 188.959014] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready [ 190.469421] smsc95xx 2-2.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xC1E1 [ 190.494506] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 190.524627] Sending DHCP requests .., OK [ 192.948425] IP-Config: Got DHCP answer from 192.168.111.1, my address is 192.168.111.32 [ 192.956604] IP-Config: Complete: [ 192.959869] device=eth0, hwaddr=02:02:00:a0:1f:e0, ipaddr=192.168.111.32, mask=255.2 55.255.0, gw=192.168.111.1 [ 192.970428] host=beagleboard-xm, domain=muru.com, nis-domain=(none) [ 192.977233] bootserver=192.168.111.64, rootserver=192.168.111.64, rootpath=/srv/nfs3 /alpine-armhf,v3,rsize=32768,wsize=32768 [ 192.977264] nameserver0=208.69.40.3, nameserver1=208.69.40.4 [ 192.997680] VAUX3: disabling [ 193.002410] VDAC: disabling [ 193.007171] VUSB3V1: disabling [ 193.010711] VPLL2: disabling [ 193.030761] VFS: Mounted root (nfs filesystem) readonly on device 0:14. [ 193.038635] devtmpfs: mounted [ 193.046600] Freeing unused kernel memory: 2048K
The long pause happens already before disabling unused regulators. So it seems more like some regression with timers or interrupts.
Ah, that would explain the fails in kernelCI. I think we have a default wait of 200 sec for a full kernel boot.
The step that seems to be happening right after MMC init is unused regulators being disabled. Is it possible that multi_v7 is missing some regulator setup?
Also, the last line in the failure case:
leds_pwm pwmleds: unable to request PWM for beagleboard::pmu_stat: -517
doesn't happen on the successful omap2plus_defconfig boots either.
Looks like we're missing these in multi_v7_defconfig:
CONFIG_PWM_TWL=y CONFIG_PWM_TWL_LED=y
Does the absence of these explain the super-long boot delay? Based on your boot, it doesn't seem like it.
Any other ideas for where the delay is coming from?
Kevin