On Fri, Sep 06, 2024 at 07:17:44AM GMT, Arnd Bergmann wrote:
On Thu, Sep 5, 2024, at 21:15, Charlie Jenkins wrote:
Create a personality flag ADDR_LIMIT_47BIT to support applications that wish to transition from running in environments that support at most 47-bit VAs to environments that support larger VAs. This personality can be set to cause all allocations to be below the 47-bit boundary. Using MAP_FIXED with mmap() will bypass this restriction.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
I think having an architecture-independent mechanism to limit the size of the 64-bit address space is useful in general, and we've discussed the same thing for arm64 in the past, though we have not actually reached an agreement on the ABI previously.
The thread on the original proposals attests to this being rather a fraught topic, and I think the weight of opinion was more so in favour of opt-in rather than opt-out.
@@ -22,6 +22,7 @@ enum {
 	WHOLE_SECONDS =		0x2000000,
 	STICKY_TIMEOUTS	=	0x4000000,
 	ADDR_LIMIT_3GB =	0x8000000,
+	ADDR_LIMIT_47BIT =	0x10000000,
 };
I'm a bit worried about having this done specifically in the personality flag bits, as they are rather limited. We obviously don't want to add many more such flags when there could be a way to just set the default limit.
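To put a rough number on "rather limited", the sketch below counts how many bit positions of a 32-bit persona word sit above a given flag. It assumes the persona is treated as a 32-bit value and looks only at positions above the highest existing flag; `bits_above()` is a hypothetical helper for illustration, and the two constants are the values shown in the hunk above.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_ADDR_LIMIT_3GB   0x08000000u	/* highest flag before the patch */
#define FLAG_ADDR_LIMIT_47BIT 0x10000000u	/* value proposed in the patch */

/* Count the bits of a 32-bit word that lie strictly above the single bit `flag`. */
unsigned int bits_above(uint32_t flag)
{
	unsigned int n = 0;
	uint32_t bit;

	/* Shift upward until the bit falls off the top of the 32-bit word. */
	for (bit = flag << 1; bit != 0; bit <<= 1)
		n++;
	return n;
}
```

Under those assumptions, `bits_above(FLAG_ADDR_LIMIT_3GB)` is 4 and `bits_above(FLAG_ADDR_LIMIT_47BIT)` is 3, so each further ADDR_LIMIT_xxBIT flag would consume a noticeable share of what is left at the top of the word.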
Since I'm the one who suggested it, I feel I should offer some kind of vague defence here :)
We shouldn't let perfect be the enemy of the good. This is a relatively straightforward means of achieving the aim (assuming your concern about arch_get_mmap_end() below isn't a blocker) which has the least impact on existing code.
Of course we can end up in absurdities where we start doing ADDR_LIMIT_xxBIT... but again - it's simple, shouldn't represent an egregious maintenance burden and is entirely opt-in so has things going for it.
It's also unclear to me how we want this flag to interact with the existing logic in arch_get_mmap_end(), which attempts to limit the default mapping to a 47-bit address space already.
How does ADDR_LIMIT_3GB presently interact with that?
For some reason, the arch_get_mmap_end() logic on RISC-V appears to default to the maximum address space for the 'addr==0' case, which is inconsistent with the other architectures, so we should probably fix that part first, possibly moving more of that logic into a shared implementation.
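The difference being described can be modelled in a few lines. This is a simplified userspace model, not kernel code: the function names and both constants are illustrative assumptions, and the real arch_get_mmap_end() implementations take more arguments and handle more cases.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only, not taken from any kernel header. */
#define MODEL_DEFAULT_WINDOW (1ULL << 47)	/* default 47-bit limit */
#define MODEL_TASK_SIZE      (1ULL << 56)	/* some larger address space */

/*
 * Hint-based opt-in: mappings stay inside the default window unless the
 * caller passes a hint above it.
 */
uint64_t mmap_end_hint_opt_in(uint64_t hint)
{
	return hint > MODEL_DEFAULT_WINDOW ? MODEL_TASK_SIZE
					   : MODEL_DEFAULT_WINDOW;
}

/*
 * Behaviour attributed to RISC-V in this message: a missing hint
 * (addr == 0) selects the maximum address space instead of the default.
 */
uint64_t mmap_end_zero_means_max(uint64_t hint)
{
	if (hint == 0)
		return MODEL_TASK_SIZE;
	return mmap_end_hint_opt_in(hint);
}
```

In this model an unhinted mmap() is limited to 47 bits under the first policy but not under the second, which is the inconsistency in question and the reason a new flag would need a defined interaction with whichever policy is kept.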
Arnd