On Tue 07-12-21 12:08:57, David Hildenbrand wrote:
On 07.12.21 11:54, Michal Hocko wrote:
Hi, I didn't have much time to dive into this deeper and I have hit some problems handling this in an arch specific code so I have tried to play with this instead:
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c5952749ad40..4d71759d0d9b 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -8032,8 +8032,16 @@ void __init free_area_init(unsigned long *max_zone_pfn) /* Initialise every node */ mminit_verify_pageflags_layout(); setup_nr_node_ids();
- for_each_online_node(nid) {
- for_each_node(nid) { pg_data_t *pgdat = NODE_DATA(nid);
if (!node_online(nid)) {
pr_warn("Node %d uninitialized by the platform. Please report with memory map.\n");
alloc_node_data(nid);
That's x86 specific and not exposed to generic code -- at least in my code base. I think we'd want an arch_alloc_nodedata() variant that allocates via memblock -- and initializes all fields to 0. So essentially a generic alloc_node_data().
you are right
free_area_init_memoryless_node(nid);
That's really just free_area_init_node() below, I do wonder what value free_area_init_memoryless_node() has as of today.
I am not sure there is any real value in having this special name for this but I have kept is sync with what x86 does currently. If we want to remove the wrapper then just do it everywhere. I can do that on top.
continue;
}
- free_area_init_node(nid);
/* Any memory on that node */
Could you give it a try? I do not have any machine which would exhibit the problem so I cannot really test this out. I hope build_zone_info will not choke on this. I assume the node distance table is uninitialized for these nodes and IIUC this should lead to an assumption that all other nodes are close. But who knows that can blow up there.
Btw. does this make any sense at all to others?
__build_all_zonelists() has to update the zonelists of all nodes I think.
I am not sure what you mean. This should be achieved by this patch because the boot time build_all_zonelists will go over all online nodes (i.e. with pgdat). free_area_init happens before that. I am just worried that the arch specific node_distance() will generate a complete garbage or blow up for some reason.