On Tue, Apr 20, 2021 at 10:12 AM Alexandre TORGUE alexandre.torgue@foss.st.com wrote:
On 4/20/21 4:45 PM, Rob Herring wrote:
On Tue, Apr 20, 2021 at 9:03 AM Alexandre TORGUE alexandre.torgue@foss.st.com wrote:
Hi,
Greg or Sasha won't know what to do with this. Not sure who follows the stable list either. Quentin sent the patch, but is not the author. Given the patch in question is about consistency between EFI memory map boot and DT memory map boot, copying EFI knowledgeable folks would help (Ard B for starters).
Ok thanks for the tips. I add Ard in the loop.
Sigh. If it was only Ard I was suggesting I would have done that myself. Now everyone on the patch in question and relevant lists are Cc'ed.
Ard, let me know if other people have to be directly added or if I have to resend to another mailing list.
thanks alex
Since v5.4.102 I observe a regression on stm32mp1 platform: "no-map" reserved-memory regions are no more "reserved" and make part of the kernel System RAM. This causes allocation failure for devices which try to take a reserved-memory region.
It has been introduced by the following path:
"fdt: Properly handle "no-map" field in the memory region [ Upstream commit 86588296acbfb1591e92ba60221e95677ecadb43 ]" which replace memblock_remove by memblock_mark_nomap in no-map case.
Reverting this patch it's fine.
I add part of my DT (something is maybe wrong inside):
memory@c0000000 { reg = <0xc0000000 0x20000000>; };
reserved-memory { #address-cells = <1>; #size-cells = <1>; ranges;
gpu_reserved: gpu@d4000000 { reg = <0xd4000000 0x4000000>; no-map; };
};
Sorry if this issue has already been raised and discussed.
Thanks alex
On Tue, 20 Apr 2021 at 17:54, Rob Herring robh+dt@kernel.org wrote:
On Tue, Apr 20, 2021 at 10:12 AM Alexandre TORGUE alexandre.torgue@foss.st.com wrote:
On 4/20/21 4:45 PM, Rob Herring wrote:
On Tue, Apr 20, 2021 at 9:03 AM Alexandre TORGUE alexandre.torgue@foss.st.com wrote:
Hi,
Greg or Sasha won't know what to do with this. Not sure who follows the stable list either. Quentin sent the patch, but is not the author. Given the patch in question is about consistency between EFI memory map boot and DT memory map boot, copying EFI knowledgeable folks would help (Ard B for starters).
Ok thanks for the tips. I add Ard in the loop.
Sigh. If it was only Ard I was suggesting I would have done that myself. Now everyone on the patch in question and relevant lists are Cc'ed.
Thanks for the cc.
Ard, let me know if other people have to be directly added or if I have to resend to another mailing list.
thanks alex
Since v5.4.102 I observe a regression on stm32mp1 platform: "no-map" reserved-memory regions are no more "reserved" and make part of the kernel System RAM. This causes allocation failure for devices which try to take a reserved-memory region.
It has been introduced by the following path:
"fdt: Properly handle "no-map" field in the memory region [ Upstream commit 86588296acbfb1591e92ba60221e95677ecadb43 ]" which replace memblock_remove by memblock_mark_nomap in no-map case.
Why was this backported? It doesn't look like a bugfix to me.
Reverting this patch it's fine.
I add part of my DT (something is maybe wrong inside):
memory@c0000000 { reg = <0xc0000000 0x20000000>; };
reserved-memory { #address-cells = <1>; #size-cells = <1>; ranges;
gpu_reserved: gpu@d4000000 { reg = <0xd4000000 0x4000000>; no-map; };
};
Sorry if this issue has already been raised and discussed.
Could you explain why it fails? The region is clearly part of system memory, and tagged as no-map, so the patch in itself is not unreasonable. However, we obviously have code that relies on how the region is represented in /proc/iomem, so it would be helpful to get some insight into why this is the case.
In any case, the mere fact that this causes a regression should be sufficient justification to revert/withdraw it from v5.4, as I don't see a reason why it was merged there in the first place. (It has no fixes tag or cc:stable)
On 4/20/2021 9:10 AM, Ard Biesheuvel wrote:
On Tue, 20 Apr 2021 at 17:54, Rob Herring robh+dt@kernel.org wrote:
On Tue, Apr 20, 2021 at 10:12 AM Alexandre TORGUE alexandre.torgue@foss.st.com wrote:
On 4/20/21 4:45 PM, Rob Herring wrote:
On Tue, Apr 20, 2021 at 9:03 AM Alexandre TORGUE alexandre.torgue@foss.st.com wrote:
Hi,
Greg or Sasha won't know what to do with this. Not sure who follows the stable list either. Quentin sent the patch, but is not the author. Given the patch in question is about consistency between EFI memory map boot and DT memory map boot, copying EFI knowledgeable folks would help (Ard B for starters).
Ok thanks for the tips. I add Ard in the loop.
Sigh. If it was only Ard I was suggesting I would have done that myself. Now everyone on the patch in question and relevant lists are Cc'ed.
Thanks for the cc.
Ard, let me know if other people have to be directly added or if I have to resend to another mailing list.
thanks alex
Since v5.4.102 I observe a regression on stm32mp1 platform: "no-map" reserved-memory regions are no more "reserved" and make part of the kernel System RAM. This causes allocation failure for devices which try to take a reserved-memory region.
It has been introduced by the following path:
"fdt: Properly handle "no-map" field in the memory region [ Upstream commit 86588296acbfb1591e92ba60221e95677ecadb43 ]" which replace memblock_remove by memblock_mark_nomap in no-map case.
Why was this backported? It doesn't look like a bugfix to me.
Reverting this patch it's fine.
I add part of my DT (something is maybe wrong inside):
memory@c0000000 { reg = <0xc0000000 0x20000000>; };
reserved-memory { #address-cells = <1>; #size-cells = <1>; ranges;
gpu_reserved: gpu@d4000000 { reg = <0xd4000000 0x4000000>; no-map; };
};
Sorry if this issue has already been raised and discussed.
Could you explain why it fails? The region is clearly part of system memory, and tagged as no-map, so the patch in itself is not unreasonable. However, we obviously have code that relies on how the region is represented in /proc/iomem, so it would be helpful to get some insight into why this is the case.
I do wonder as well, we have a 32MB "no-map" reserved memory region on our platforms located at 0xfe000000. Without the offending commit, /proc/iomem looks like this:
40000000-fdffefff : System RAM 40008000-40ffffff : Kernel code 41e00000-41ef1d77 : Kernel data 100000000-13fffffff : System RAM
and with the patch applied, we have this:
40000000-fdffefff : System RAM 40008000-40ffffff : Kernel code 41e00000-41ef3db7 : Kernel data fdfff000-ffffffff : System RAM 100000000-13fffffff : System RAM
so we can now see that the region 0xfe000000 - 0xfffffff is also cobbled up with the preceding region which is a mailbox between Linux and the secure monitor at 0xfdfff000 and of size 4KB. It seems like there is
The memblock=debug outputs is also different:
[ 0.000000] MEMBLOCK configuration: [ 0.000000] memory size = 0xfdfff000 reserved size = 0x7ce4d20d [ 0.000000] memory.cnt = 0x2 [ 0.000000] memory[0x0] [0x00000040000000-0x000000fdffefff], 0xbdfff000 bytes flags: 0x0 [ 0.000000] memory[0x1] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0 [ 0.000000] reserved.cnt = 0x6 [ 0.000000] reserved[0x0] [0x00000040003000-0x0000004000e494], 0xb495 bytes flags: 0x0 [ 0.000000] reserved[0x1] [0x00000040200000-0x00000041ef1d77], 0x1cf1d78 bytes flags: 0x0 [ 0.000000] reserved[0x2] [0x00000045000000-0x000000450fffff], 0x100000 bytes flags: 0x0 [ 0.000000] reserved[0x3] [0x00000047000000-0x0000004704ffff], 0x50000 bytes flags: 0x0 [ 0.000000] reserved[0x4] [0x000000c2c00000-0x000000fdbfffff], 0x3b000000 bytes flags: 0x0 [ 0.000000] reserved[0x5] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0
[ 0.000000] MEMBLOCK configuration: [ 0.000000] memory size = 0x100000000 reserved size = 0x7ca4f24d [ 0.000000] memory.cnt = 0x3 [ 0.000000] memory[0x0] [0x00000040000000-0x000000fdffefff], 0xbdfff000 bytes flags: 0x0 [ 0.000000] memory[0x1] [0x000000fdfff000-0x000000ffffffff], 0x2001000 bytes flags: 0x4 [ 0.000000] memory[0x2] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0 [ 0.000000] reserved.cnt = 0x6 [ 0.000000] reserved[0x0] [0x00000040003000-0x0000004000e494], 0xb495 bytes flags: 0x0 [ 0.000000] reserved[0x1] [0x00000040200000-0x00000041ef3db7], 0x1cf3db8 bytes flags: 0x0 [ 0.000000] reserved[0x2] [0x00000045000000-0x000000450fffff], 0x100000 bytes flags: 0x0 [ 0.000000] reserved[0x3] [0x00000047000000-0x0000004704ffff], 0x50000 bytes flags: 0x0 [ 0.000000] reserved[0x4] [0x000000c3000000-0x000000fdbfffff], 0x3ac00000 bytes flags: 0x0 [ 0.000000] reserved[0x5] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0
in the second case we can clearly see that the 32MB no-map region is now considered as usable RAM.
Hope this helps.
In any case, the mere fact that this causes a regression should be sufficient justification to revert/withdraw it from v5.4, as I don't see a reason why it was merged there in the first place. (It has no fixes tag or cc:stable)
Agreed, however that means we still need to find out whether a more recent kernel is also broken, I should be able to tell you that a little later.
On Tuesday 20 Apr 2021 at 09:33:56 (-0700), Florian Fainelli wrote:
I do wonder as well, we have a 32MB "no-map" reserved memory region on our platforms located at 0xfe000000. Without the offending commit, /proc/iomem looks like this:
40000000-fdffefff : System RAM 40008000-40ffffff : Kernel code 41e00000-41ef1d77 : Kernel data 100000000-13fffffff : System RAM
and with the patch applied, we have this:
40000000-fdffefff : System RAM 40008000-40ffffff : Kernel code 41e00000-41ef3db7 : Kernel data fdfff000-ffffffff : System RAM 100000000-13fffffff : System RAM
so we can now see that the region 0xfe000000 - 0xfffffff is also cobbled up with the preceding region which is a mailbox between Linux and the secure monitor at 0xfdfff000 and of size 4KB. It seems like there is
The memblock=debug outputs is also different:
[ 0.000000] MEMBLOCK configuration: [ 0.000000] memory size = 0xfdfff000 reserved size = 0x7ce4d20d [ 0.000000] memory.cnt = 0x2 [ 0.000000] memory[0x0] [0x00000040000000-0x000000fdffefff], 0xbdfff000 bytes flags: 0x0 [ 0.000000] memory[0x1] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0 [ 0.000000] reserved.cnt = 0x6 [ 0.000000] reserved[0x0] [0x00000040003000-0x0000004000e494], 0xb495 bytes flags: 0x0 [ 0.000000] reserved[0x1] [0x00000040200000-0x00000041ef1d77], 0x1cf1d78 bytes flags: 0x0 [ 0.000000] reserved[0x2] [0x00000045000000-0x000000450fffff], 0x100000 bytes flags: 0x0 [ 0.000000] reserved[0x3] [0x00000047000000-0x0000004704ffff], 0x50000 bytes flags: 0x0 [ 0.000000] reserved[0x4] [0x000000c2c00000-0x000000fdbfffff], 0x3b000000 bytes flags: 0x0 [ 0.000000] reserved[0x5] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0
[ 0.000000] MEMBLOCK configuration: [ 0.000000] memory size = 0x100000000 reserved size = 0x7ca4f24d [ 0.000000] memory.cnt = 0x3 [ 0.000000] memory[0x0] [0x00000040000000-0x000000fdffefff], 0xbdfff000 bytes flags: 0x0 [ 0.000000] memory[0x1] [0x000000fdfff000-0x000000ffffffff], 0x2001000 bytes flags: 0x4 [ 0.000000] memory[0x2] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0 [ 0.000000] reserved.cnt = 0x6 [ 0.000000] reserved[0x0] [0x00000040003000-0x0000004000e494], 0xb495 bytes flags: 0x0 [ 0.000000] reserved[0x1] [0x00000040200000-0x00000041ef3db7], 0x1cf3db8 bytes flags: 0x0 [ 0.000000] reserved[0x2] [0x00000045000000-0x000000450fffff], 0x100000 bytes flags: 0x0 [ 0.000000] reserved[0x3] [0x00000047000000-0x0000004704ffff], 0x50000 bytes flags: 0x0 [ 0.000000] reserved[0x4] [0x000000c3000000-0x000000fdbfffff], 0x3ac00000 bytes flags: 0x0 [ 0.000000] reserved[0x5] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0
in the second case we can clearly see that the 32MB no-map region is now considered as usable RAM.
Hope this helps.
In any case, the mere fact that this causes a regression should be sufficient justification to revert/withdraw it from v5.4, as I don't see a reason why it was merged there in the first place. (It has no fixes tag or cc:stable)
Agreed, however that means we still need to find out whether a more recent kernel is also broken, I should be able to tell you that a little later.
FWIW I did test this on Qemu before posting. With 5.12-rc8 and a 1MiB no-map region at 0x80000000, I have the following:
40000000-7fffffff : System RAM 40210000-417fffff : Kernel code 41800000-41daffff : reserved 41db0000-4210ffff : Kernel data 48000000-48008fff : reserved 80000000-800fffff : reserved 80100000-13fffffff : System RAM fa000000-ffffffff : reserved 13b000000-13f5fffff : reserved 13f6de000-13f77dfff : reserved 13f77e000-13f77efff : reserved 13f77f000-13f7dafff : reserved 13f7dd000-13f7defff : reserved 13f7df000-13f7dffff : reserved 13f7e0000-13f7f3fff : reserved 13f7f4000-13f7fdfff : reserved 13f7fe000-13fffffff : reserved
If I remove the 'no-map' qualifier from DT, I get this:
40000000-13fffffff : System RAM 40210000-417fffff : Kernel code 41800000-41daffff : reserved 41db0000-4210ffff : Kernel data 48000000-48008fff : reserved 80000000-800fffff : reserved fa000000-ffffffff : reserved 13b000000-13f5fffff : reserved 13f6de000-13f77dfff : reserved 13f77e000-13f77efff : reserved 13f77f000-13f7dafff : reserved 13f7dd000-13f7defff : reserved 13f7df000-13f7dffff : reserved 13f7e0000-13f7f3fff : reserved 13f7f4000-13f7fdfff : reserved 13f7fe000-13fffffff : reserved
So this does seem to be working fine on my setup. I'll try again with 5.4 to see if I can repro.
Also, 8a5a75e5e9e5 ("of/fdt: Make sure no-map does not remove already reserved regions") looks more likely to cause the issue observed here, but that shouldn't be silent. I get the following error message in dmesg if I if place the no-map region on top of the kernel image:
OF: fdt: Reserved memory: failed to reserve memory for node 'foobar@40210000': base 0x0000000040210000, size 1 MiB
Is that triggering on your end?
Thanks, Quentin
On Wednesday 21 Apr 2021 at 08:31:00 (+0000), Quentin Perret wrote:
FWIW I did test this on Qemu before posting. With 5.12-rc8 and a 1MiB no-map region at 0x80000000, I have the following:
40000000-7fffffff : System RAM 40210000-417fffff : Kernel code 41800000-41daffff : reserved 41db0000-4210ffff : Kernel data 48000000-48008fff : reserved 80000000-800fffff : reserved 80100000-13fffffff : System RAM fa000000-ffffffff : reserved 13b000000-13f5fffff : reserved 13f6de000-13f77dfff : reserved 13f77e000-13f77efff : reserved 13f77f000-13f7dafff : reserved 13f7dd000-13f7defff : reserved 13f7df000-13f7dffff : reserved 13f7e0000-13f7f3fff : reserved 13f7f4000-13f7fdfff : reserved 13f7fe000-13fffffff : reserved
If I remove the 'no-map' qualifier from DT, I get this:
40000000-13fffffff : System RAM 40210000-417fffff : Kernel code 41800000-41daffff : reserved 41db0000-4210ffff : Kernel data 48000000-48008fff : reserved 80000000-800fffff : reserved fa000000-ffffffff : reserved 13b000000-13f5fffff : reserved 13f6de000-13f77dfff : reserved 13f77e000-13f77efff : reserved 13f77f000-13f7dafff : reserved 13f7dd000-13f7defff : reserved 13f7df000-13f7dffff : reserved 13f7e0000-13f7f3fff : reserved 13f7f4000-13f7fdfff : reserved 13f7fe000-13fffffff : reserved
So this does seem to be working fine on my setup. I'll try again with 5.4 to see if I can repro.
I just ran the same experiment on v5.4.102 which is where the regression was reported, and I'm seeing the same correct result...
Also, 8a5a75e5e9e5 ("of/fdt: Make sure no-map does not remove already reserved regions") looks more likely to cause the issue observed here, but that shouldn't be silent. I get the following error message in dmesg if I if place the no-map region on top of the kernel image:
OF: fdt: Reserved memory: failed to reserve memory for node 'foobar@40210000': base 0x0000000040210000, size 1 MiB
Is that triggering on your end?
So that really sounds like the cause of the issue here, though arguably this should be indicative a something funny in the DT.
Thanks, Quentin
On 4/21/2021 1:31 AM, Quentin Perret wrote:
On Tuesday 20 Apr 2021 at 09:33:56 (-0700), Florian Fainelli wrote:
I do wonder as well, we have a 32MB "no-map" reserved memory region on our platforms located at 0xfe000000. Without the offending commit, /proc/iomem looks like this:
40000000-fdffefff : System RAM 40008000-40ffffff : Kernel code 41e00000-41ef1d77 : Kernel data 100000000-13fffffff : System RAM
and with the patch applied, we have this:
40000000-fdffefff : System RAM 40008000-40ffffff : Kernel code 41e00000-41ef3db7 : Kernel data fdfff000-ffffffff : System RAM 100000000-13fffffff : System RAM
so we can now see that the region 0xfe000000 - 0xfffffff is also cobbled up with the preceding region which is a mailbox between Linux and the secure monitor at 0xfdfff000 and of size 4KB. It seems like there is
The memblock=debug outputs is also different:
[ 0.000000] MEMBLOCK configuration: [ 0.000000] memory size = 0xfdfff000 reserved size = 0x7ce4d20d [ 0.000000] memory.cnt = 0x2 [ 0.000000] memory[0x0] [0x00000040000000-0x000000fdffefff], 0xbdfff000 bytes flags: 0x0 [ 0.000000] memory[0x1] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0 [ 0.000000] reserved.cnt = 0x6 [ 0.000000] reserved[0x0] [0x00000040003000-0x0000004000e494], 0xb495 bytes flags: 0x0 [ 0.000000] reserved[0x1] [0x00000040200000-0x00000041ef1d77], 0x1cf1d78 bytes flags: 0x0 [ 0.000000] reserved[0x2] [0x00000045000000-0x000000450fffff], 0x100000 bytes flags: 0x0 [ 0.000000] reserved[0x3] [0x00000047000000-0x0000004704ffff], 0x50000 bytes flags: 0x0 [ 0.000000] reserved[0x4] [0x000000c2c00000-0x000000fdbfffff], 0x3b000000 bytes flags: 0x0 [ 0.000000] reserved[0x5] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0
[ 0.000000] MEMBLOCK configuration: [ 0.000000] memory size = 0x100000000 reserved size = 0x7ca4f24d [ 0.000000] memory.cnt = 0x3 [ 0.000000] memory[0x0] [0x00000040000000-0x000000fdffefff], 0xbdfff000 bytes flags: 0x0 [ 0.000000] memory[0x1] [0x000000fdfff000-0x000000ffffffff], 0x2001000 bytes flags: 0x4 [ 0.000000] memory[0x2] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0 [ 0.000000] reserved.cnt = 0x6 [ 0.000000] reserved[0x0] [0x00000040003000-0x0000004000e494], 0xb495 bytes flags: 0x0 [ 0.000000] reserved[0x1] [0x00000040200000-0x00000041ef3db7], 0x1cf3db8 bytes flags: 0x0 [ 0.000000] reserved[0x2] [0x00000045000000-0x000000450fffff], 0x100000 bytes flags: 0x0 [ 0.000000] reserved[0x3] [0x00000047000000-0x0000004704ffff], 0x50000 bytes flags: 0x0 [ 0.000000] reserved[0x4] [0x000000c3000000-0x000000fdbfffff], 0x3ac00000 bytes flags: 0x0 [ 0.000000] reserved[0x5] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0
in the second case we can clearly see that the 32MB no-map region is now considered as usable RAM.
Hope this helps.
In any case, the mere fact that this causes a regression should be sufficient justification to revert/withdraw it from v5.4, as I don't see a reason why it was merged there in the first place. (It has no fixes tag or cc:stable)
Agreed, however that means we still need to find out whether a more recent kernel is also broken, I should be able to tell you that a little later.
FWIW I did test this on Qemu before posting. With 5.12-rc8 and a 1MiB no-map region at 0x80000000, I have the following:
40000000-7fffffff : System RAM 40210000-417fffff : Kernel code 41800000-41daffff : reserved 41db0000-4210ffff : Kernel data 48000000-48008fff : reserved 80000000-800fffff : reserved 80100000-13fffffff : System RAM fa000000-ffffffff : reserved 13b000000-13f5fffff : reserved 13f6de000-13f77dfff : reserved 13f77e000-13f77efff : reserved 13f77f000-13f7dafff : reserved 13f7dd000-13f7defff : reserved 13f7df000-13f7dffff : reserved 13f7e0000-13f7f3fff : reserved 13f7f4000-13f7fdfff : reserved 13f7fe000-13fffffff : reserved
If I remove the 'no-map' qualifier from DT, I get this:
40000000-13fffffff : System RAM 40210000-417fffff : Kernel code 41800000-41daffff : reserved 41db0000-4210ffff : Kernel data 48000000-48008fff : reserved 80000000-800fffff : reserved fa000000-ffffffff : reserved 13b000000-13f5fffff : reserved 13f6de000-13f77dfff : reserved 13f77e000-13f77efff : reserved 13f77f000-13f7dafff : reserved 13f7dd000-13f7defff : reserved 13f7df000-13f7dffff : reserved 13f7e0000-13f7f3fff : reserved 13f7f4000-13f7fdfff : reserved 13f7fe000-13fffffff : reserved
So this does seem to be working fine on my setup. I'll try again with 5.4 to see if I can repro.
Also, 8a5a75e5e9e5 ("of/fdt: Make sure no-map does not remove already reserved regions") looks more likely to cause the issue observed here, but that shouldn't be silent. I get the following error message in dmesg if I if place the no-map region on top of the kernel image:
OF: fdt: Reserved memory: failed to reserve memory for node 'foobar@40210000': base 0x0000000040210000, size 1 MiB
Is that triggering on your end?
It is not, otherwise I would have noticed earlier, can you try the same thing that happens on my platform with a reserved region (without no-map) adjacent to a reserved region with 'no-map'? I will test different and newer kernels than 5.4 today to find out if this is still a problem with upstream. I could confirm that v4.9.259 also have this problem now.
On 4/21/2021 7:33 AM, Florian Fainelli wrote:
On 4/21/2021 1:31 AM, Quentin Perret wrote:
On Tuesday 20 Apr 2021 at 09:33:56 (-0700), Florian Fainelli wrote:
I do wonder as well, we have a 32MB "no-map" reserved memory region on our platforms located at 0xfe000000. Without the offending commit, /proc/iomem looks like this:
40000000-fdffefff : System RAM 40008000-40ffffff : Kernel code 41e00000-41ef1d77 : Kernel data 100000000-13fffffff : System RAM
and with the patch applied, we have this:
40000000-fdffefff : System RAM 40008000-40ffffff : Kernel code 41e00000-41ef3db7 : Kernel data fdfff000-ffffffff : System RAM 100000000-13fffffff : System RAM
so we can now see that the region 0xfe000000 - 0xfffffff is also cobbled up with the preceding region which is a mailbox between Linux and the secure monitor at 0xfdfff000 and of size 4KB. It seems like there is
The memblock=debug outputs is also different:
[ 0.000000] MEMBLOCK configuration: [ 0.000000] memory size = 0xfdfff000 reserved size = 0x7ce4d20d [ 0.000000] memory.cnt = 0x2 [ 0.000000] memory[0x0] [0x00000040000000-0x000000fdffefff], 0xbdfff000 bytes flags: 0x0 [ 0.000000] memory[0x1] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0 [ 0.000000] reserved.cnt = 0x6 [ 0.000000] reserved[0x0] [0x00000040003000-0x0000004000e494], 0xb495 bytes flags: 0x0 [ 0.000000] reserved[0x1] [0x00000040200000-0x00000041ef1d77], 0x1cf1d78 bytes flags: 0x0 [ 0.000000] reserved[0x2] [0x00000045000000-0x000000450fffff], 0x100000 bytes flags: 0x0 [ 0.000000] reserved[0x3] [0x00000047000000-0x0000004704ffff], 0x50000 bytes flags: 0x0 [ 0.000000] reserved[0x4] [0x000000c2c00000-0x000000fdbfffff], 0x3b000000 bytes flags: 0x0 [ 0.000000] reserved[0x5] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0
[ 0.000000] MEMBLOCK configuration: [ 0.000000] memory size = 0x100000000 reserved size = 0x7ca4f24d [ 0.000000] memory.cnt = 0x3 [ 0.000000] memory[0x0] [0x00000040000000-0x000000fdffefff], 0xbdfff000 bytes flags: 0x0 [ 0.000000] memory[0x1] [0x000000fdfff000-0x000000ffffffff], 0x2001000 bytes flags: 0x4 [ 0.000000] memory[0x2] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0 [ 0.000000] reserved.cnt = 0x6 [ 0.000000] reserved[0x0] [0x00000040003000-0x0000004000e494], 0xb495 bytes flags: 0x0 [ 0.000000] reserved[0x1] [0x00000040200000-0x00000041ef3db7], 0x1cf3db8 bytes flags: 0x0 [ 0.000000] reserved[0x2] [0x00000045000000-0x000000450fffff], 0x100000 bytes flags: 0x0 [ 0.000000] reserved[0x3] [0x00000047000000-0x0000004704ffff], 0x50000 bytes flags: 0x0 [ 0.000000] reserved[0x4] [0x000000c3000000-0x000000fdbfffff], 0x3ac00000 bytes flags: 0x0 [ 0.000000] reserved[0x5] [0x00000100000000-0x0000013fffffff], 0x40000000 bytes flags: 0x0
in the second case we can clearly see that the 32MB no-map region is now considered as usable RAM.
Hope this helps.
In any case, the mere fact that this causes a regression should be sufficient justification to revert/withdraw it from v5.4, as I don't see a reason why it was merged there in the first place. (It has no fixes tag or cc:stable)
Agreed, however that means we still need to find out whether a more recent kernel is also broken, I should be able to tell you that a little later.
FWIW I did test this on Qemu before posting. With 5.12-rc8 and a 1MiB no-map region at 0x80000000, I have the following:
40000000-7fffffff : System RAM 40210000-417fffff : Kernel code 41800000-41daffff : reserved 41db0000-4210ffff : Kernel data 48000000-48008fff : reserved 80000000-800fffff : reserved 80100000-13fffffff : System RAM fa000000-ffffffff : reserved 13b000000-13f5fffff : reserved 13f6de000-13f77dfff : reserved 13f77e000-13f77efff : reserved 13f77f000-13f7dafff : reserved 13f7dd000-13f7defff : reserved 13f7df000-13f7dffff : reserved 13f7e0000-13f7f3fff : reserved 13f7f4000-13f7fdfff : reserved 13f7fe000-13fffffff : reserved
If I remove the 'no-map' qualifier from DT, I get this:
40000000-13fffffff : System RAM 40210000-417fffff : Kernel code 41800000-41daffff : reserved 41db0000-4210ffff : Kernel data 48000000-48008fff : reserved 80000000-800fffff : reserved fa000000-ffffffff : reserved 13b000000-13f5fffff : reserved 13f6de000-13f77dfff : reserved 13f77e000-13f77efff : reserved 13f77f000-13f7dafff : reserved 13f7dd000-13f7defff : reserved 13f7df000-13f7dffff : reserved 13f7e0000-13f7f3fff : reserved 13f7f4000-13f7fdfff : reserved 13f7fe000-13fffffff : reserved
So this does seem to be working fine on my setup. I'll try again with 5.4 to see if I can repro.
Also, 8a5a75e5e9e5 ("of/fdt: Make sure no-map does not remove already reserved regions") looks more likely to cause the issue observed here, but that shouldn't be silent. I get the following error message in dmesg if I if place the no-map region on top of the kernel image:
OF: fdt: Reserved memory: failed to reserve memory for node 'foobar@40210000': base 0x0000000040210000, size 1 MiB
Is that triggering on your end?
It is not, otherwise I would have noticed earlier, can you try the same thing that happens on my platform with a reserved region (without no-map) adjacent to a reserved region with 'no-map'? I will test different and newer kernels than 5.4 today to find out if this is still a problem with upstream. I could confirm that v4.9.259 also have this problem now.
5.10.31 works correctly and shows the following for my platform:
40000000-fdffefff : System RAM 40200000-40eaffff : Kernel code 40eb0000-4237ffff : reserved 42380000-425affff : Kernel data 45000000-450fffff : reserved 47000000-4704ffff : reserved 4761e000-47624fff : reserved f8c00000-fdbfffff : reserved fdfff000-ffffffff : reserved 100000000-13fffffff : System RAM 13b000000-13effffff : reserved 13f114000-13f173fff : reserved 13f174000-13f774fff : reserved 13f775000-13f7e8fff : reserved 13f7eb000-13f7ecfff : reserved 13f7ed000-13f7effff : reserved 13f7f0000-13fffffff : reserved
On Wednesday 21 Apr 2021 at 08:17:28 (-0700), Florian Fainelli wrote:
5.10.31 works correctly and shows the following for my platform:
40000000-fdffefff : System RAM 40200000-40eaffff : Kernel code 40eb0000-4237ffff : reserved 42380000-425affff : Kernel data 45000000-450fffff : reserved 47000000-4704ffff : reserved 4761e000-47624fff : reserved f8c00000-fdbfffff : reserved fdfff000-ffffffff : reserved 100000000-13fffffff : System RAM 13b000000-13effffff : reserved 13f114000-13f173fff : reserved 13f174000-13f774fff : reserved 13f775000-13f7e8fff : reserved 13f7eb000-13f7ecfff : reserved 13f7ed000-13f7effff : reserved 13f7f0000-13fffffff : reserved
OK, so if we're confident this works from 5.10 onwards, I would suggest to follow Ard's original suggestion to revert this patch from older LTSes as we're clearly missing something there.
Thanks, Quentin
On Wednesday 21 Apr 2021 at 07:33:52 (-0700), Florian Fainelli wrote:
It is not, otherwise I would have noticed earlier, can you try the same thing that happens on my platform with a reserved region (without no-map) adjacent to a reserved region with 'no-map'?
I just tried, but still no luck. FTR, I tried to reproduce your setup with the following DT:
memory@40000000 { reg = <0x00 0x40000000 0x01 0x00>; device_type = "memory"; };
reserved-memory { #address-cells = <2>; #size-cells = <2>; ranges;
foo@fdfff000{ reg = <0x00 0xfdfff000 0x0 0x1000>; }; bar@fe000000{ reg = <0x00 0xfe000000 0x0 0x2000000>; no-map; }; };
And with 5.4.102 and 5.10.31 I get the following in /proc/iomem
<...> 40000000-fdffffff : System RAM 40080000-412cffff : Kernel code 412d0000-417affff : reserved 417b0000-419f8fff : Kernel data 48000000-48008fff : reserved f7c00000-fdbfffff : reserved fdfff000-fdffffff : reserved fe000000-ffffffff : reserved 100000000-13fffffff : System RAM <...>
which looks about right. I'll keep trying a few other things.
Thanks, Quentin
Hi Quentin
On 4/22/21 2:59 PM, Quentin Perret wrote:
On Wednesday 21 Apr 2021 at 07:33:52 (-0700), Florian Fainelli wrote:
It is not, otherwise I would have noticed earlier, can you try the same thing that happens on my platform with a reserved region (without no-map) adjacent to a reserved region with 'no-map'?
I just tried, but still no luck. FTR, I tried to reproduce your setup with the following DT:
memory@40000000 { reg = <0x00 0x40000000 0x01 0x00>; device_type = "memory"; }; reserved-memory { #address-cells = <2>; #size-cells = <2>; ranges; foo@fdfff000{ reg = <0x00 0xfdfff000 0x0 0x1000>; }; bar@fe000000{ reg = <0x00 0xfe000000 0x0 0x2000000>; no-map; }; };
And with 5.4.102 and 5.10.31 I get the following in /proc/iomem
<...> 40000000-fdffffff : System RAM 40080000-412cffff : Kernel code 412d0000-417affff : reserved 417b0000-419f8fff : Kernel data 48000000-48008fff : reserved f7c00000-fdbfffff : reserved fdfff000-fdffffff : reserved fe000000-ffffffff : reserved 100000000-13fffffff : System RAM <...>
which looks about right. I'll keep trying a few other things.
Did you get time to continue some tests on this issue ?
On my side this DT is not working:
memory@c0000000 { reg = <0xc0000000 0x20000000>; };
reserved-memory { #address-cells = <1>; #size-cells = <1>; ranges;
gpu_reserved: gpu@d4000000 { reg = <0xd4000000 0x4000000>; no-map; }; };
Let me know if I can help.
regards Alex
Thanks, Quentin
Hi Alexandre,
On Friday 07 May 2021 at 17:15:20 (+0200), Alexandre TORGUE wrote:
Did you get time to continue some tests on this issue ?
I did try a few things, but still fail to reproduced :/
On my side this DT is not working:
memory@c0000000 { reg = <0xc0000000 0x20000000>; };
reserved-memory { #address-cells = <1>; #size-cells = <1>; ranges;
gpu_reserved: gpu@d4000000 { reg = <0xd4000000 0x4000000>; no-map; };
};
So this does change how memory appears in /proc/iomem for me switching from 5.4.101 to v5.4.102 -- for the former d4000000-d7ffffff doesn't appear at all, and for the latter it appears as 'reserved'.
But still, it never gets accounted as System RAM for me ...
Let me know if I can help.
Could you please confirm you get a correct behaviour with 5.10.31 like Florian? If so, then bisecting to figure out what we're missing in older LTSes would help, but again it feels like we should just revert -- this wasn't really a fix in the first place.
Thanks, Quentin
Hi Quentin,
On 5/10/21 12:09 PM, Quentin Perret wrote:
Hi Alexandre,
On Friday 07 May 2021 at 17:15:20 (+0200), Alexandre TORGUE wrote:
Did you get time to continue some tests on this issue ?
I did try a few things, but still fail to reproduced :/
On my side this DT is not working:
memory@c0000000 { reg = <0xc0000000 0x20000000>; };
reserved-memory { #address-cells = <1>; #size-cells = <1>; ranges;
gpu_reserved: gpu@d4000000 { reg = <0xd4000000 0x4000000>; no-map; };
};
So this does change how memory appears in /proc/iomem for me switching from 5.4.101 to v5.4.102 -- for the former d4000000-d7ffffff doesn't appear at all, and for the latter it appears as 'reserved'.
But still, it never gets accounted as System RAM for me ...
Let me know if I can help.
Could you please confirm you get a correct behaviour with 5.10.31 like Florian? If so, then bisecting to figure out what we're missing in older LTSes would help, but again it feels like we should just revert -- this wasn't really a fix in the first place.
We saw that patches [1] and [2] cause issue on stable version (at least for 5.4). As you said issue can be seen with above device tree and check in /proc/iomem than gpu_reserved region is taken by the kernel as "System RAM".
On v5.10 stream there are no issues seen taking patches [1]&[2] and the reason is linked to patches [3]&[4] which have been introduced in v5.10.0. Reverting them give me the same behavior than on stable version.
[1] of/fdt: Make sure no-map does not remove already reserved regions [2] fdt: Properly handle "no-map" field in the memory region [3] arch, drivers: replace for_each_membock() with for_each_mem_range() [4] memblock: use separate iterators for memory and reserved regions
regards Alex
Thanks, Quentin
On Wednesday 12 May 2021 at 12:55:53 (+0200), Alexandre TORGUE wrote:
We saw that patches [1] and [2] cause issue on stable version (at least for 5.4). As you said issue can be seen with above device tree and check in /proc/iomem than gpu_reserved region is taken by the kernel as "System RAM".
On v5.10 stream there are no issues seen taking patches [1]&[2] and the reason is linked to patches [3]&[4] which have been introduced in v5.10.0. Reverting them give me the same behavior than on stable version.
Thanks for confirming. Given that the patches were not really fixes, I think reverting is still the best option. I've sent reverts to -stable for 5.4 and prior:
https://lore.kernel.org/stable/20210512122853.3243417-1-qperret@google.com/
Cheers, Quentin
On 5/12/21 2:34 PM, Quentin Perret wrote:
On Wednesday 12 May 2021 at 12:55:53 (+0200), Alexandre TORGUE wrote:
We saw that patches [1] and [2] cause issue on stable version (at least for 5.4). As you said issue can be seen with above device tree and check in /proc/iomem than gpu_reserved region is taken by the kernel as "System RAM".
On v5.10 stream there are no issues seen taking patches [1]&[2] and the reason is linked to patches [3]&[4] which have been introduced in v5.10.0. Reverting them give me the same behavior than on stable version.
Thanks for confirming. Given that the patches were not really fixes, I think reverting is still the best option. I've sent reverts to -stable for 5.4 and prior:
https://lore.kernel.org/stable/20210512122853.3243417-1-qperret@google.com/
Thanks Quentin.
alex
Cheers, Quentin
On Tue, Apr 20, 2021 at 11:10 AM Ard Biesheuvel ardb@kernel.org wrote:
On Tue, 20 Apr 2021 at 17:54, Rob Herring robh+dt@kernel.org wrote:
On Tue, Apr 20, 2021 at 10:12 AM Alexandre TORGUE alexandre.torgue@foss.st.com wrote:
On 4/20/21 4:45 PM, Rob Herring wrote:
On Tue, Apr 20, 2021 at 9:03 AM Alexandre TORGUE alexandre.torgue@foss.st.com wrote:
Hi,
Greg or Sasha won't know what to do with this. Not sure who follows the stable list either. Quentin sent the patch, but is not the author. Given the patch in question is about consistency between EFI memory map boot and DT memory map boot, copying EFI knowledgeable folks would help (Ard B for starters).
Ok thanks for the tips. I add Ard in the loop.
Sigh. If it was only Ard I was suggesting I would have done that myself. Now everyone on the patch in question and relevant lists are Cc'ed.
Thanks for the cc.
Ard, let me know if other people have to be directly added or if I have to resend to another mailing list.
thanks alex
Since v5.4.102 I observe a regression on stm32mp1 platform: "no-map" reserved-memory regions are no more "reserved" and make part of the kernel System RAM. This causes allocation failure for devices which try to take a reserved-memory region.
It has been introduced by the following path:
"fdt: Properly handle "no-map" field in the memory region [ Upstream commit 86588296acbfb1591e92ba60221e95677ecadb43 ]" which replace memblock_remove by memblock_mark_nomap in no-map case.
Why was this backported? It doesn't look like a bugfix to me.
Probably because of commit 8a5a75e5e9e5 ("of/fdt: Make sure no-map does not remove already reserved regions") which was in the same series.
'Properly handle' implies before it was 'improperly handled', so sounds like a fix.
Rob
boot-architecture@lists.linaro.org