On Wed, Aug 28, 2024 at 02:04:29AM +0800, Yangyu Chen wrote:
On Aug 28, 2024, at 00:40, Charlie Jenkins charlie@rivosinc.com wrote:
On Tue, Aug 27, 2024 at 09:33:11AM -0700, Palmer Dabbelt wrote:
On Tue, 27 Aug 2024 01:05:15 PDT (-0700), cyy@cyyself.name wrote:
Previous patch series[1][2] changes a mmap behavior that treats the hint address as the upper bound of the mmap address range. The motivation of the previous patch series is that some user space software may assume 48-bit address space and use higher bits to encode some information, which may collide with large virtual address space mmap may return. However, to make sv48 by default, we don't need to change the meaning of the hint address on mmap as the upper bound of the mmap address range. This behavior breaks some user space software like Chromium that gets ENOMEM error when the hint address + size is not big enough, as specified in [3].
Other ISAs with larger than 48-bit virtual address space like x86, arm64, and powerpc do not have this special mmap behavior on hint address. They all just make 48-bit / 47-bit virtual address space by default, and if a user space software wants to large virtual address space, it only need to specify a hint address larger than 48-bit / 47-bit.
Thus, this patch series change mmap to use sv48 by default but does not treat the hint address as the upper bound of the mmap address range. After this patch, the behavior of mmap will align with existing behavior on other ISAs with larger than 48-bit virtual address space like x86, arm64, and powerpc. The user space software will no longer need to rewrite their code to fit with this special mmap behavior only on RISC-V.
So it actually looks like we just screwed up the original version of this: the reason we went with the more complicated address splits were than we actually started with a defacto 39-bit page table uABI (ie 38-bit user VAs), and moving to even 48-bit page tables (ie, 47-bit user VAs) broke users (here's an ASAN bug, for example: https://github.com/google/android-riscv64/issues/64).
Unless I'm missing something, though, the code doesn't actually do that. I remember having that discussion at some point, but I must have forgotten to make sure it worked. As far as I can tell we've just moved to the 48-bit VAs by default, which breaks the whole point of doing the compatibilty stuff. Probably a good sign I need to pay more attention to this stuff.
So I'm not really sure what to do here: we can just copy the arm64 behavior at tell the other users that's just how things work, but then we're just pushing around breakages. At a certain point all we can really do with this hint stuff is push around problems, though, and at least if we copy arm64 then most of those problems get reported as bugs for us.
Relying on the hint address in any capacity will push around breakages is my perspective as well. I messed this up from the start. I believe the only way to have consistent behavior is to mark mmap relying on the hint address as a bug, and only rely on the hint address if a flag defines the behavior.
I agree with this. However, since we already have this behavior on x86 and aarch64 for quite a long time, to prevent breaking userspace, I think we can use this patch and then add a flag like MAP_VA_FULL to enable full va address in the future.
Since riscv is not x86 or aarch64, we should be able to make decisions that are best for riscv regardless of if it is identical to how it is implemented in x86 or aarch64.
- Charlie
Thanks, Yangyu Chen
There is an awkward window of releases that will have this "buggy" behavior. However, since the mmap changes introduced a variety of userspace bugs it seems acceptable to revert to the previous behavior and to create a consistent path forward.
- Charlie
Note: Charlie also created another series [4] to completely remove the arch_get_mmap_end and arch_get_mmap_base behavior based on the hint address and size. However, this will cause programs like Go and Java, which need to store information in the higher bits of the pointer, to fail on Sv57 machines.
Changes in v3:
- Rebase to newest master
- Changes some information in cover letter after patchset [2]
- Use patch [5] to patch selftests
- Link to v2: https://lore.kernel.org/linux-riscv/tencent_B2D0435BC011135736262764B511994F...
Changes in v2:
- correct arch_get_mmap_end and arch_get_mmap_base
- Add description in documentation about mmap behavior on kernel v6.6-6.7.
- Improve commit message and cover letter
- Rebase to newest riscv/for-next branch
- Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8BE...
[1] https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@rivosinc... [2] https://lore.kernel.org/linux-riscv/20240130-use_mmap_hint_address-v3-0-8a65... [3] https://lore.kernel.org/linux-riscv/MEYP282MB2312A08FF95D44014AB78411C68D2@M... [4] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-0-cd8962afe47f@ri... [5] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-2-cd8962afe47f@ri...
Charlie Jenkins (1): riscv: selftests: Remove mmap hint address checks
Yangyu Chen (2): RISC-V: mm: not use hint addr as upper bound Documentation: riscv: correct sv57 kernel behavior
Documentation/arch/riscv/vm-layout.rst | 43 ++++++++---- arch/riscv/include/asm/processor.h | 20 ++---- .../selftests/riscv/mm/mmap_bottomup.c | 2 - .../testing/selftests/riscv/mm/mmap_default.c | 2 - tools/testing/selftests/riscv/mm/mmap_test.h | 67 ------------------- 5 files changed, 36 insertions(+), 98 deletions(-)