On 10/25/23 10:05, Geert Uytterhoeven wrote:
On Wed, Oct 25, 2023 at 2:35 PM Geert Uytterhoeven geert@linux-m68k.org wrote:
On Wed, Oct 25, 2023 at 12:53 PM Geert Uytterhoeven geert@linux-m68k.org wrote:
On Wed, Oct 25, 2023 at 12:47 PM Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Oct 24, 2023 at 9:22 PM Pavel Machek pavel@denx.de wrote:
But we still have failures on Renesas with 5.10.199-rc2:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/10...
And they still happen during MMC init:
[ 2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
[ 2.638846] INFO: trying to register non-static key.
[ 2.644192] ledtrig-cpu: registered to indicate activity on CPUs
[ 2.649066] The code is fine but needs lockdep annotation, or maybe
[ 2.649069] you didn't initialize this object before use?
[ 2.649071] turning off the locking correctness validator.
[ 2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
[ 2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
[ 2.649086] Call trace:
[ 2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
[ 2.661354] dump_backtrace+0x0/0x194
[ 2.661361] show_stack+0x14/0x20
[ 2.667430] usbcore: registered new interface driver usbhid
[ 2.672230] dump_stack+0xe8/0x130
[ 2.672238] register_lock_class+0x480/0x514
[ 2.672244] __lock_acquire+0x74/0x20ec
[ 2.681113] usbhid: USB HID core driver
[ 2.687450] lock_acquire+0x218/0x350
[ 2.687456] _raw_spin_lock+0x58/0x80
[ 2.687464] tmio_mmc_irq+0x410/0x9ac
[ 2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
[ 2.744936] __handle_irq_event_percpu+0xbc/0x340
[ 2.749635] handle_irq_event+0x60/0x100
[ 2.753553] handle_fasteoi_irq+0xa0/0x1ec
[ 2.757644] __handle_domain_irq+0x7c/0xdc
[ 2.761736] efi_header_end+0x4c/0xd0
[ 2.765393] el1_irq+0xcc/0x180
[ 2.768530] arch_cpu_idle+0x14/0x2c
[ 2.772100] default_idle_call+0x58/0xe4
[ 2.776019] do_idle+0x244/0x2c0
[ 2.779242] cpu_startup_entry+0x20/0x6c
[ 2.783160] rest_init+0x164/0x28c
[ 2.786561] arch_call_rest_init+0xc/0x14
[ 2.790565] start_kernel+0x4c4/0x4f8
[ 2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
[ 2.803011] Mem abort info:
from https://lava.ciplatform.org/scheduler/job/1025535 from https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973... .
Is there something else missing?
I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199 seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.
Sorry, I looked at the wrong log on R-Car M3-W. I do see the issue with v5.10.198, but not with v5.10.199.
It seems to be an intermittent issue. Investigating...
After spending too much time on bisecting, the bad guy turns out to be commit 6d3745bbc3341d3b ("mmc: renesas_sdhi: register irqs before registering controller") in v5.10.198.
Adding debug information shows the lock is mmc_host.lock.
It is definitely initialized:
    renesas_sdhi_probe()
    {
        ...
        tmio_mmc_host_alloc()
            mmc_alloc_host
                spin_lock_init(&host->lock);
        ...
        devm_request_irq()
        -> tmio_mmc_irq
            tmio_mmc_cmd_irq()
                spin_lock(&host->lock);
        ...
    }
That leaves us with a missing lockdep annotation?
Is it possible that the lock initialization is overwritten? I seem to recall a recent case where this happened.
Also, there is spin_lock_init(&_host->lock); in tmio_mmc_host_probe(), and tmio_mmc_host_probe() is called after devm_request_irq().
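To make the ordering concern concrete, here is a minimal userspace sketch, not driver code. Every toy_* name is hypothetical, and toy_request_irq() simply assumes an interrupt is already pending, so the handler runs as soon as it is registered. The "initialized" flag only loosely stands in for what the lock validator complains about.

```c
#include <stdbool.h>

/* Toy stand-in for a spinlock; 'initialized' loosely models the state
 * that the "non-static key" warning is about. Hypothetical type. */
struct toy_lock {
    bool initialized;
};

/* Hypothetical host structure, not struct tmio_mmc_host. */
struct toy_host {
    struct toy_lock lock;
    int irq_warnings;   /* handler ran before the lock was initialized */
    int irq_handled;    /* handler ran with an initialized lock */
};

typedef void (*toy_irq_handler)(struct toy_host *);

static void toy_lock_init(struct toy_lock *l)
{
    l->initialized = true;
}

/* Models an interrupt handler that takes the host lock. */
static void toy_irq(struct toy_host *h)
{
    if (!h->lock.initialized) {
        h->irq_warnings++;      /* where the splat would be reported */
        return;
    }
    h->irq_handled++;
}

/* Assumption: an interrupt is pending, so the handler fires immediately. */
static void toy_request_irq(toy_irq_handler fn, struct toy_host *h)
{
    fn(h);
}

/* Order described above: IRQ registered first, lock initialized later. */
struct toy_host probe_irq_first(void)
{
    struct toy_host h = {0};

    toy_request_irq(toy_irq, &h);
    toy_lock_init(&h.lock);
    return h;
}

/* Alternative order: lock initialized before the IRQ can fire. */
struct toy_host probe_init_first(void)
{
    struct toy_host h = {0};

    toy_lock_init(&h.lock);
    toy_request_irq(toy_irq, &h);
    return h;
}
```

In this model probe_irq_first() records one warning and no handled interrupt, while probe_init_first() records the opposite. Whether the real driver hits this window depends on an interrupt actually arriving between the two calls, which would fit the issue being intermittent.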
Also, how would lockdep annotation help with "Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014" in the log above?
Guenter
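
On the NULL pointer dereference at address 0x14 in the log: one common reading of such a small fault address is that a structure member at that offset was accessed through a NULL base pointer. The sketch below only illustrates that arithmetic. The struct layout is hypothetical and is not claimed to match any structure in the tmio or MMC code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT a real kernel structure. Fixed-width members
 * keep the offsets predictable. */
struct example_host {
    uint64_t a;      /* offset 0x00 */
    uint64_t b;      /* offset 0x08 */
    uint32_t c;      /* offset 0x10 */
    uint32_t flags;  /* offset 0x14 */
};

/* Address that an access to ->flags would target if the base pointer
 * were NULL: 0 + offsetof(struct example_host, flags). */
size_t fault_address_for_null_base(void)
{
    return offsetof(struct example_host, flags);
}
```

If that reading applies here, the fault points at a pointer that was not yet set up when the handler ran, which a lockdep annotation alone would not address.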