On Sun, Jan 14, 2024 at 01:26:57AM +0800, Yangyu Chen wrote:
> Hi, Charlie
> Although this patchset has been merged, I still have some questions about it, because it breaks regular mmap if the address is >= 38 bits on sv48 / sv57 capable systems like qemu. For example, if a userspace program wants to mmap an anonymous page to addr=(1<<45) on an sv48 capable system, it will fail and the kernel will map it to another sv39 address since it does
Thank you for raising this concern. To make sure I understand correctly: you are passing a hint address of (1<<45) and expecting mmap to return 1<<45, and you describe mmap as failing if it returns a different address? If you want an address that is in the sv48 space, you can pass in an address that is greater than or equal to 1<<47.
> not meet the requirement to use sv48 as you wrote:
>	else if ((((_addr) >= VA_USER_SV48)) && (VA_BITS >= VA_BITS_SV48)) \
>		mmap_end = VA_USER_SV48; \
>	else \
>		mmap_end = VA_USER_SV39; \
> Then, how can a userspace program create a mmap with a hint if the address is >= (1<<38) after your patch without MAP_FIXED? The only way to do this is to pass a hint >= (1<<47) on the mmap syscall; the kernel will then return a random address in the sv48 address space, but the hint address gets lost. I think this
In order to force mmap to return the address provided, you must use MAP_FIXED. Otherwise, the address is a "hint" and has no guarantees. The hint address on riscv is used to mean "don't give me an address that uses more bits than this". This behavior is not unique to riscv; arm64 and powerpc use a similar scheme. In arch/arm64/include/asm/processor.h there is the following code:
#define arch_get_mmap_base(addr, base) ((addr > DEFAULT_MAP_WINDOW) ? \
					base + TASK_SIZE - DEFAULT_MAP_WINDOW :\
					base)
arm64/powerpc are only concerned with a single boundary so the code is simpler.
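As a rough illustration of the riscv selection quoted above, the two branches can be modeled in plain userspace C. This is only a sketch, not kernel code: the function name model_mmap_end and the MODEL_* constants are hypothetical, the boundary values (1<<38 and 1<<47) are assumed from the addresses discussed in this thread, and any branches that precede the quoted "else if" are not modeled.

```c
#include <stdint.h>

/* Assumed boundaries, taken from the addresses discussed in this thread:
 * the sv39 user range ends at 1<<38 and the sv48 user range at 1<<47. */
#define MODEL_VA_USER_SV39 (UINT64_C(1) << 38)
#define MODEL_VA_USER_SV48 (UINT64_C(1) << 47)

/*
 * Hypothetical model of the two quoted branches only.
 * hint:    the address passed to mmap without MAP_FIXED
 * va_bits: virtual address bits the system supports (39, 48, ...)
 * Returns the upper bound below which the mapping would be placed.
 */
static uint64_t model_mmap_end(uint64_t hint, unsigned int va_bits)
{
	if (hint >= MODEL_VA_USER_SV48 && va_bits >= 48)
		return MODEL_VA_USER_SV48;
	return MODEL_VA_USER_SV39;
}
```

Under this model, a hint of 1<<45 on an sv48 system yields the sv39 bound, while a hint of 1<<47 yields the sv48 bound, which is the behavior being discussed.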
> violates the principle of the mmap syscall, as the kernel should take the hint and attempt to create the mapping there.
Although the man page for mmap does say "on Linux, the kernel will pick a nearby page boundary" it is still a hint address so there is no strict requirement (and the precedent has already been set by arm64/powerpc).
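A userspace program that cares whether its hint was honored can check the returned address itself. The helper below is a minimal sketch under that assumption; map_with_hint is a hypothetical name, and it passes the address as a plain hint without MAP_FIXED.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS on strict -std= builds */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory, passing hint
 * as a plain hint (no MAP_FIXED). Sets *honored to 1 if the kernel placed
 * the mapping exactly at a non-NULL hint, and to 0 otherwise.
 * Returns the mapped address, or MAP_FAILED on error.
 */
static void *map_with_hint(void *hint, size_t len, int *honored)
{
	void *p = mmap(hint, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	*honored = (p != MAP_FAILED && hint != NULL && p == hint);
	return p;
}
```

A caller that sees honored == 0 still holds a valid mapping and can either use it as-is or munmap it and retry with different arguments.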
> I don't think patching in this way is right. However, if we only revert this patch, programs that rely on mmap returning an address with effective bits <= 48 will still have an issue, and the problem might spread to other ISAs if they implement a larger virtual address space like RISC-V sv57. A better way to solve this might be to add a MAP_48BIT flag to mmap, like MAP_32BIT, which was introduced decades ago.
> Thanks,
> Yangyu Chen
- Charlie