On 9/7/2021 4:49 PM, Linus Torvalds wrote:
On Tue, Sep 7, 2021 at 4:35 PM Nathan Chancellor nathan@kernel.org wrote:
Won't your example only fix the issue with CONFIG_CPUMASK_OFFSTACK=y
Yes, but..
or am I misreading the gigantic comment in include/linux/cpumask.h?
you're not misreading the comment, but you are missing this important fact:
    config NR_CPUS_RANGE_END
            int
            depends on X86_64
            default 8192 if  SMP && CPUMASK_OFFSTACK
            default  512 if  SMP && !CPUMASK_OFFSTACK
            default    1 if !SMP
so basically you can't choose more than 512 CPU's unless CPUMASK_OFFSTACK is set.
Of course, we may have some bug in the Kconfig elsewhere, and I didn't check other architectures. So maybe there's some way to work around it.
Ah, okay, that is an x86-only limitation so I missed it. I do not think there is any bug with that Kconfig logic but it is only used on x86.
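The x86_64 defaults quoted above amount to a small decision rule. A minimal C sketch of it follows; nr_cpus_range_end() is a hypothetical name used only for illustration, and the values are simply copied from the quoted Kconfig entry:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: upper bound for NR_CPUS on x86_64, following the
 * three "default" lines of the quoted NR_CPUS_RANGE_END entry.
 */
static unsigned int nr_cpus_range_end(bool smp, bool cpumask_offstack)
{
	if (!smp)
		return 1;		/* default 1 if !SMP */
	if (cpumask_offstack)
		return 8192;		/* default 8192 if SMP && CPUMASK_OFFSTACK */
	return 512;			/* default 512 if SMP && !CPUMASK_OFFSTACK */
}
```

Read this way, asking for more than 512 CPUs on x86_64 only works once CPUMASK_OFFSTACK is set.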
But basically the rule is that CPUMASK_OFFSTACK and NR_CPUS are linked.
That linkage is admittedly a bit hidden and much too subtle. I think the only real reason why it's done that way is because people wanted to do test builds with CPUMASK_OFFSTACK even without having to have some ludicrous number of NR_CPUS.
You'll notice that the question "CPUMASK_OFFSTACK" is only enabled if DEBUG_PER_CPU_MAPS is true.
That whole "for debugging" reason made more sense a decade ago when this was all new and fancy.
It might make more sense to do that very explicitly, and make CPUMASK_OFFSTACK be just something like
    config NR_CPUS_RANGE_END
            def_bool NR_CPUS <= 512
and get rid of the subtlety and choice in the matter.
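For readers who have not looked at what the option changes, here is a rough userspace sketch of the on-stack versus off-stack split. It mirrors the cpumask_var_t typedef trick from include/linux/cpumask.h, but it is a mock: the kernel versions of alloc_cpumask_var() also take a gfp_t flags argument and use the kernel allocator, and NR_CPUS = 4096 is just an assumed value.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#ifndef NR_CPUS
#define NR_CPUS 4096		/* assumed value, for illustration only */
#endif

/* Number of unsigned longs needed to hold NR_CPUS bits. */
#define MASK_LONGS ((NR_CPUS + sizeof(unsigned long) * CHAR_BIT - 1) / \
		    (sizeof(unsigned long) * CHAR_BIT))

struct cpumask {
	unsigned long bits[MASK_LONGS];
};

#ifdef CPUMASK_OFFSTACK
/* Off stack: the local variable is only a pointer, storage is on the heap. */
typedef struct cpumask *cpumask_var_t;

/* Mock allocator: no gfp flags here, unlike the kernel function. */
static bool alloc_cpumask_var(cpumask_var_t *mask)
{
	*mask = calloc(1, sizeof(struct cpumask));
	return *mask != NULL;
}

static void free_cpumask_var(cpumask_var_t mask)
{
	free(mask);
}
#else
/* On stack: a one-element array, so the whole bitmap lives in the frame. */
typedef struct cpumask cpumask_var_t[1];

static bool alloc_cpumask_var(cpumask_var_t *mask)
{
	(void)mask;		/* nothing to allocate */
	return true;
}

static void free_cpumask_var(cpumask_var_t mask)
{
	(void)mask;		/* nothing to free */
}
#endif
```

Built as-is, sizeof(cpumask_var_t) is 512 bytes with NR_CPUS at 4096; built with -DCPUMASK_OFFSTACK it shrinks to the size of a pointer, at the cost of an allocation that can fail.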
Indeed. Grepping around the tree, I see that arc, arm64, ia64, powerpc, and sparc64 all support NR_CPUS up to 4096 (8192 for PPC) but none of them select CPUMASK_OFFSTACK, so it seems like they should test support for CPUMASK_OFFSTACK and adopt similar logic to x86 to limit how much stack space cpumask variables can use. Like you mentioned, it probably has not come up before because most of those are 64-bit platforms that have a higher default FRAME_WARN value (and the default NR_CPUS value on all of them is small). I only noticed because Fedora sets NR_CPUS to 4096 for arm64 and has a FRAME_WARN value of 1024, meaning two cpumask variables (512 bytes each) in the same frame put that frame right at the 1024 limit.
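The arithmetic behind that last point can be sketched in a few lines of userspace C. This is only an illustration: cpumask_bytes() and frame_usage() are hypothetical helper names, not kernel functions, and the calculation assumes an on-stack cpumask is a plain bitmap of NR_CPUS bits rounded up to whole unsigned longs.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: bytes occupied by one on-stack cpumask, i.e.
 * nr_cpus bits rounded up to a whole number of unsigned longs.
 */
static size_t cpumask_bytes(unsigned int nr_cpus)
{
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t longs = (nr_cpus + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(unsigned long);
}

/* Hypothetical helper: stack bytes used by 'count' cpumask locals. */
static size_t frame_usage(unsigned int nr_cpus, unsigned int count)
{
	return (size_t)count * cpumask_bytes(nr_cpus);
}
```

With NR_CPUS at 4096 this gives 512 bytes per mask, so two masks already account for 1024 bytes before any other locals in the frame are counted.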
Cheers, Nathan