On Sun, Jan 31, 2021 at 01:49:27PM -0800, Linus Torvalds wrote:
On Sun, Jan 31, 2021 at 12:04 AM Mike Rapoport <rppt@kernel.org> wrote:
That's *particularly* true when the very line above it did a "memblock_reserve()" of the exact same range that the memblock_add() "adds".
The most correct thing to do would have been to
memblock_add(0, end_of_first_memory_bank);
somewhere in e820__memblock_setup().
You miss my complaint.
Why does the memblock code care about this magical "memblock_add()", when we just told it that the SAME REGION is reserved by doing a "memblock_reserve()"?
IOW, I'm not interested in "the correct thing to do would have been [another memblock_add()]". I'm saying that the memblock code itself is being confused, and no additional thing should have been required at all, because we already *did* that memblock_reserve().
See?
There is nothing magical about memblock_add().
Memblock presumes that arch code uses memblock_add() to register populated physical memory ranges and memblock_reserve() to protect memory ranges that should not be touched. These ranges do not necessarily overlap, so there may be reserved ranges that do not have corresponding registered memory.
This lets architectures say "here are the memory banks I have" and "this memory is in use" (or even "this memory _might_ be in use") independently of each other.
The downside is that if there is a reserved range there is no way to tell whether it is backed by populated memory.
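To illustrate the model described above, here is a toy userspace sketch of two independent range tables. It is not the kernel's memblock implementation: every toy_* name is hypothetical, regions are never merged, and the coverage check only looks for a single enclosing region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy userspace model, NOT the kernel's memblock code. All names are hypothetical. */
#define TOY_MAX_REGIONS 16

struct toy_region {
	uint64_t base;
	uint64_t size;
};

struct toy_type {
	struct toy_region r[TOY_MAX_REGIONS];
	int cnt;
};

/* Two tables that are tracked independently of each other. */
static struct toy_type toy_memory;	/* populated RAM, cf. memblock_add() */
static struct toy_type toy_reserved;	/* do-not-touch ranges, cf. memblock_reserve() */

/* Append a range to one table; no sorting or merging is attempted. */
static void toy_add_range(struct toy_type *t, uint64_t base, uint64_t size)
{
	assert(t->cnt < TOY_MAX_REGIONS);
	t->r[t->cnt].base = base;
	t->r[t->cnt].size = size;
	t->cnt++;
}

/* True if [base, base + size) lies entirely inside a single region of t. */
static bool toy_is_covered(const struct toy_type *t, uint64_t base, uint64_t size)
{
	for (int i = 0; i < t->cnt; i++) {
		if (base >= t->r[i].base &&
		    base + size <= t->r[i].base + t->r[i].size)
			return true;
	}
	return false;
}
```

In this sketch, reserving a range in toy_reserved says nothing about toy_memory, so asking whether a reserved range is backed by RAM requires a separate lookup in the memory table, and that lookup can legitimately come back empty.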
We could change these semantics and enforce the overlap, e.g. by implicitly adding all the reserved ranges to the registered memory. I've already tried that and found that there are systems that rely on memblock's ability to track reserved and available ranges independently. For example, the arm systems I mentioned in the previous mail always have a reserved chunk at 0xfe000000 in their DTS, but they may have only 2G of memory actually populated.
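A rough sketch of why enforcing the overlap misfires on such a system. The numbers are partly assumptions: RAM is assumed to start at physical address 0, and the size of the reserved chunk is a placeholder, since only its base (0xfe000000) is given above. The toy_* names are hypothetical and this is plain userspace C, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_range {
	uint64_t base;
	uint64_t size;
};

/* Assumption: RAM starts at 0. The reserved size is a made-up placeholder. */
static const struct toy_range toy_ram = { 0x0, 0x80000000ULL };	/* 2G populated */
static const struct toy_range toy_dts_reserved = { 0xfe000000ULL, 0x1000000ULL };

/* True if inner lies entirely within outer. */
static bool toy_range_within(struct toy_range inner, struct toy_range outer)
{
	return inner.base >= outer.base &&
	       inner.base + inner.size <= outer.base + outer.size;
}

/*
 * Bytes the "memory" view would claim. With reserve_implies_memory set,
 * a reserved range outside RAM is counted as memory too, even though
 * nothing is populated there.
 */
static uint64_t toy_memory_bytes(bool reserve_implies_memory)
{
	bool within = toy_range_within(toy_dts_reserved, toy_ram);
	bool disjoint = toy_dts_reserved.base >= toy_ram.base + toy_ram.size ||
			toy_dts_reserved.base + toy_dts_reserved.size <= toy_ram.base;
	uint64_t total = toy_ram.size;

	/* This sketch does not handle partially overlapping ranges. */
	assert(within || disjoint);

	if (reserve_implies_memory && !within)
		total += toy_dts_reserved.size;
	return total;
}
```

Under these assumptions the reserved chunk sits well above the end of the 2G bank, so implicitly registering it as memory would make the memory view larger than what is actually populated.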
Now, on x86 there has been a gap between e820 and memblock since the 2.6 days. As of now, only E820_TYPE_RAM is added to memblock as memory, some of the E820_*_RESERVED types are reserved, and on top of that there are reservations of the memory that's known to be used by the BIOS or the kernel.
I'm trying to close this gap in small steps, with changes that I believe will not break so many things at once that it becomes unmanageable.
Honestly, I'm not seeing it being a good thing to move further towards memblock code as the primary model for memory initialization, when the memblock code is so confused.
I'm not sure I follow you here. If I'm not mistaken, memblock has been used as the primary model for memmap and page allocator initialization for almost a decade now...
Linus