Previous patch series[1] changes a mmap behavior that treats the hint
address as the upper bound of the mmap address range. The motivation of the
previous patch series is that some user space software may assume 48-bit
address space and use higher bits to encode some information, which may
collide with large virtual address space mmap may return. However, to make
sv48 by default, we don't need to change the meaning of the hint address on
mmap as the upper bound of the mmap address range, especially when this
behavior only shows up on the RISC-V. This behavior also breaks some user
space software which assumes mmap should try to create mapping on the hint
address if possible. As the mmap manpage said:
> If addr is not NULL, then the kernel takes it as a hint about where to
> place the mapping; on Linux, the kernel will pick a nearby page boundary
> (but always above or equal to the value specified by
> /proc/sys/vm/mmap_min_addr) and attempt to create the mapping there.
Unfortunately, what mmap said is not true on RISC-V since kernel v6.6.
Other ISAs with larger than 48-bit virtual address space like x86, arm64,
and powerpc do not have this special mmap behavior on hint address. They
all just make 48-bit / 47-bit virtual address space by default, and if a
user space software wants to large virtual address space, it only need to
specify a hint address larger than 48-bit / 47-bit.
Thus, this patch series keeps the change of mmap to use sv48 by default but
does not treat the hint address as the upper bound of the mmap address
range. After this patch, the behavior of mmap will align with existing
behavior on other ISAs with larger than 48-bit virtual address space like
x86, arm64, and powerpc. The user space software will no longer need to
rewrite their code to fit with this special mmap behavior only on RISC-V.
My concern is that the change of mmap behavior on the hint address is
already in the upstream kernel since v6.6, and it might be hard to revert
it although it already brings some regression on some user space software.
And it will be harder than adding it since v6.6 because mmap not creating
mapping on the hint address is very common, especially when running on a
machine without sv57 / sv48. However, if some user space software already
adopted this special mmap behavior on RISC-V, we should not return a mmap
address larger than the hint if the address is larger than BIT(38). My
opinion is that revert this change on the next kernel release might be a
good choice as only a few of hardware support sv57 / sv48 now, these
changes will have no impact on sv39 systems.
Moreover, previous patch series said it make sv48 by default, which is
in the cover letter, kernel documentation and MMAP_VA_BITS defination.
However, the code on arch_get_mmap_end and arch_get_mmap_base marco still
use sv39 by default, which makes me confused, and I still use sv48 by
default in this patch series including arch_get_mmap_end and
arch_get_mmap_base.
Changes in v2:
- correct arch_get_mmap_end and arch_get_mmap_base
- Add description in documentation about mmap behavior on kernel v6.6-6.7.
- Improve commit message and cover letter
- Rebase to newest riscv/for-next branch
- Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8B…
[1]. https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@rivosin…
Yangyu Chen (3):
RISC-V: mm: do not treat hint addr on mmap as the upper bound to
search
RISC-V: mm: only test mmap without hint
Documentation: riscv: correct sv57 kernel behavior
Documentation/arch/riscv/vm-layout.rst | 54 ++++++++++++-------
arch/riscv/include/asm/processor.h | 38 +++----------
.../selftests/riscv/mm/mmap_bottomup.c | 12 -----
.../testing/selftests/riscv/mm/mmap_default.c | 12 -----
tools/testing/selftests/riscv/mm/mmap_test.h | 30 -----------
5 files changed, 41 insertions(+), 105 deletions(-)
--
2.43.0
Powerpc specific selftests (specifically powerpc/primitives) included in linux-next
tree fails to build with following error
gcc -std=gnu99 -O2 -Wall -Werror -DGIT_VERSION='"next-20240229-0-gf303a3e2bcfb-dirty"' -I/home/sachin/linux-next/tools/testing/selftests/powerpc/include -I/home/sachin/linux-next/tools/testing/selftests/powerpc/primitives load_unaligned_zeropad.c ../harness.c -o /home/sachin/linux-next/tools/testing/selftests/powerpc/primitives/load_unaligned_zeropad
In file included from load_unaligned_zeropad.c:26:
word-at-a-time.h:7:10: fatal error: linux/bitops.h: No such file or directory
7 | #include <linux/bitops.h>
| ^~~~~~~~~~~~~~~~
compilation terminated.
The header file in question was last changed by following commit
commit 66a5c40f60f5d88ad8d47ba6a4ba05892853fa1f
kernel.h: removed REPEAT_BYTE from kernel.h
Thanks
— Sachin
Hi,
when running the dev_addr_lists unit test with lock debugging enabled,
I always get the following lockdep warning.
[ 7.031327] ====================================
[ 7.031393] WARNING: kunit_try_catch/1886 still has locks held!
[ 7.031478] 6.8.0-rc6-00053-g0fec7343edb5-dirty #1 Tainted: G W N
[ 7.031728] ------------------------------------
[ 7.031816] 1 lock held by kunit_try_catch/1886:
[ 7.031896] #0: ffffffff8ed35008 (rtnl_mutex){+.+.}-{3:3}, at: dev_addr_test_init+0x6a/0x100
Instrumentation shows that dev_addr_test_exit() is called, but only
after the warning fires.
Is this a problem with kunit tests or a problem with this specific test ?
Thanks,
Guenter
CC testing
On Wed, Feb 28, 2024 at 8:59 AM Guenter Roeck <linux(a)roeck-us.net> wrote:
> On 2/27/24 23:25, Christophe Leroy wrote:
> [ ... ]
> >>
> >> This test case is supposed to be as true to the "general case" as
> >> possible, so I have aligned the data along 14 + NET_IP_ALIGN. On ARM
> >> this will be a 16-byte boundary since NET_IP_ALIGN is 2. A driver that
> >> does not follow this may not be appropriately tested by this test case,
> >> but anyone is welcome to submit additional test cases that address this
> >> additional alignment concern.
> >
> > But then this test case is becoming less and less true to the "general
> > case" with this patch, whereas your initial implementation was almost
> > perfect as it was covering most cases, a lot more than what we get with
> > that patch applied.
> >
> NP with me if that is where people want to go. I'll simply disable checksum
> tests on all architectures which don't support unaligned accesses (so far
> it looks like that is only arm with thumb instructions, and possibly nios2).
> I personally find that less desirable and would have preferred a second
> configurable set of tests for unaligned accesses, but I have no problem
> with it.
IMHO the tests should validate the expected functionality. If a test
fails, either functionality is missing or behaves wrong, or the test
is wrong.
What is the point of writing tests for a core functionality like network
checksumming that do not match the expected functionality?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert(a)linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds