Re: [qemu] boot failed: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000

6 Jul 2020

On 7/6/2020 5:53 AM, Arnd Bergmann wrote:
...
On Mon, Jul 6, 2020 at 1:03 PM Naresh Kamboju naresh.kamboju@linaro.org wrote:
...
While booting qemu_arm64 and qemu_arm with Linux version 5.8.0-rc3-next-20200706,
the kernel panicked due to a NULL pointer dereference.
metadata:
   git branch: master
   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
   git commit: 5680d14d59bddc8bcbc5badf00dbbd4374858497
   git describe: next-20200706
   make_kernelversion: 5.8.0-rc3
   kernel-config:
https://builds.tuxbuild.com/Glr-Ql1wbp3qN3cnHogyNA/kernel.config
qemu arm64 boot crash log:
[    0.972053] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[    0.975301] Mem abort info:
[    0.976316]   ESR = 0x96000004
[    0.977378]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.979363]   SET = 0, FnV = 0
[    0.980458]   EA = 0, S1PTW = 0
[    0.981583] Data abort info:
[    0.982634]   ISV = 0, ISS = 0x00000004
[    0.984213]   CM = 0, WnR = 0
[    0.985260] [0000000000000000] user address but active_mm is swapper
[    0.987600] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[    0.989557] Modules linked in:
[    0.990671] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-next-20200706 #1
[    0.993711] Hardware name: linux,dummy-virt (DT)
[    0.995708] pstate: 00000005 (nzcv daif -PAN -UAO BTYPE=--)
[    0.998168] pc : pl011_dma_probe+0x90/0x360
This is the code from your vmlinux file:
ffff8000107233e4:       b90087e2        str     w2, [sp, #132]
ffff8000107233e8:       97fcf14c        bl      ffff80001065f918 <dma_request_chan>
ffff8000107233ec:       aa0003f4        mov     x20, x0
ffff8000107233f0:       b140041f        cmn     x0, #0x1, lsl #12
ffff8000107233f4:       54000488        b.hi    ffff800010723484 <pl011_dma_probe+0x11c>  // b.pmore
ffff8000107233f8:       f9400280        ldr     x0, [x20]
ffff8000107233fc:       f9409c02        ldr     x2, [x0, #312]
ffff800010723400:       b4000082        cbz     x2, ffff800010723410 <pl011_dma_probe+0xa8>
It's the "ldr     x0, [x20]" dereferencing 'chan' in pl011_dma_probe() after
checking it for an error value. However it's a NULL pointer, not an
error pointer, indicating that there is a bug in the dmaengine driver
that you use here, or in the dmaengine core code.
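For anyone following along, here is a rough userspace sketch of why the "cmn x0, #0x1, lsl #12" / "b.hi" pair above lets a NULL return through. It assumes the check is the usual error-pointer range test with MAX_ERRNO = 4095; the function names (is_err_value, cmn_bhi_taken, check_examples) are made up for this illustration and are not kernel symbols, and 19 is used as the numeric value of ENODEV on Linux.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095 /* assumed to match include/linux/err.h */

/* C-level form of the range test: true only for the top 4095 values. */
static bool is_err_value(uint64_t v)
{
	return v >= (uint64_t)-MAX_ERRNO;
}

/*
 * Model of "cmn x0, #0x1, lsl #12" followed by "b.hi".
 * cmn adds 4096 to x0 and sets the flags; b.hi is taken when the
 * addition carried out (C set) and the result is non-zero (Z clear).
 */
static bool cmn_bhi_taken(uint64_t x0)
{
	uint64_t sum = x0 + 4096u;   /* wraps modulo 2^64 */
	bool carry = sum < x0;       /* unsigned overflow, i.e. C flag */
	bool zero = (sum == 0);      /* Z flag */

	return carry && !zero;
}

/* Illustrative checks only; not part of any kernel code. */
static void check_examples(void)
{
	/* NULL (0) is not in the error range, so execution falls through
	 * to the "ldr x0, [x20]" and faults at address 0. */
	assert(!cmn_bhi_taken(0));
	/* An encoded -ENODEV (19 on Linux) takes the error branch. */
	assert(cmn_bhi_taken((uint64_t)-19));
	/* Both formulations agree at the edges of the range. */
	assert(cmn_bhi_taken((uint64_t)-4095) == is_err_value((uint64_t)-4095));
	assert(cmn_bhi_taken((uint64_t)-4096) == is_err_value((uint64_t)-4096));
}
```

So under these assumptions the branch only catches values in the last 4095 addresses, and 0 is treated as a valid pointer.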
Arnd,
I'm looking at pl011_dma_probe(), and I think we could make it more robust by
using IS_ERR_OR_NULL(chan) instead of IS_ERR(). Should I send a patch for that?
That said, the comment header for dma_request_chan() does say it returns a
channel pointer or an error pointer. Sorry I missed that.
Vinod,
After auditing the patch, as far as I can tell the only dmaengine fix needed is
at the spot Arnd pointed out. Let me know how you want to handle this.
Thanks!

diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 0d6529eff66f..48e159e83cf5 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -852,7 +852,7 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
         mutex_lock(&dma_list_mutex);
         if (list_empty(&dma_device_list)) {
                 mutex_unlock(&dma_list_mutex);
-               return NULL;
+               return ERR_PTR(-ENODEV);
         }
         list_for_each_entry_safe(d, _d, &dma_device_list, global_node) {
...
I don't see anything suspicious in dmaengine drivers, but there is a
recent series from Dave Jiang that might explain it. Could you try
reverting commit deb9541f5052 ("dmaengine: check device and channel
list for empty")?
I think the broken change is this one:
@@ -819,6 +850,11 @@ struct dma_chan *dma_request_chan(struct device *dev, const char *name)
        /* Try to find the channel via the DMA filter map(s) */
        mutex_lock(&dma_list_mutex);
+       if (list_empty(&dma_device_list)) {
+               mutex_unlock(&dma_list_mutex);
+               return NULL;
+       }
+
        list_for_each_entry_safe(d, _d, &dma_device_list, global_node) {
                dma_cap_mask_t mask;
                const struct dma_slave_map *map = dma_filter_match(d, name, dev);
which needs to return an error code like -ENODEV instead of NULL. There
may be other changes in the same patch that introduce the same bug
elsewhere.
  Arnd
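To make the difference between the two return values concrete, here is a small userspace model of the producer and the caller. It is only a sketch: err_ptr(), ptr_err() and is_err() are simplified stand-ins for the kernel helpers, struct chan_model, request_chan_model(), probe_model() and check_model() are hypothetical names invented for this example, and ENODEV_MODEL assumes the Linux value of 19.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO    4095 /* assumed to match include/linux/err.h */
#define ENODEV_MODEL 19   /* numeric value of ENODEV on Linux */

/* Simplified stand-ins for ERR_PTR(), PTR_ERR() and IS_ERR(). */
static void *err_ptr(long error) { return (void *)(intptr_t)error; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }
static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct chan_model { int id; }; /* hypothetical stand-in for struct dma_chan */

/* Producer: with_fix selects the corrected return over the NULL return. */
static struct chan_model *request_chan_model(int list_is_empty, int with_fix)
{
	static struct chan_model chan = { .id = 1 };

	if (list_is_empty)
		return with_fix ? err_ptr(-ENODEV_MODEL) : NULL;
	return &chan;
}

/*
 * Caller that checks only is_err(), like the probe path discussed above.
 * Returns 0 on success, a negative errno on a reported error, and 1 where
 * the real code would have gone on to dereference NULL.
 */
static long probe_model(int list_is_empty, int with_fix)
{
	struct chan_model *chan = request_chan_model(list_is_empty, with_fix);

	if (is_err(chan))
		return ptr_err(chan);
	if (chan == NULL)
		return 1; /* the oops case */
	return 0;
}

/* Illustrative checks only. */
static void check_model(void)
{
	assert(probe_model(1, 0) == 1);             /* NULL slips past is_err() */
	assert(probe_model(1, 1) == -ENODEV_MODEL); /* error pointer is caught */
	assert(probe_model(0, 1) == 0);             /* normal path unaffected */
}
```

In this model the caller needs no change once the producer returns an error pointer, which is why fixing the return value looks like the smaller change compared with adding NULL checks to every caller.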



    
