These are very much up for discussion, as it's a pretty big new user interface and it's quite a bit different from how we've historically done things: this isn't just providing an ISA string to userspace, this has its own format for providing information to userspace.
There's been a bunch of off-list discussions about this, including at Plumbers. The original plan was to do something involving providing an ISA string to userspace, but ISA strings just aren't sufficient for a stable ABI any more: in order to parse an ISA string users need the version of the specifications that the string is written to, the version of each extension (sometimes at a finer granularity than the RISC-V releases/versions encode), and the expected use case for the ISA string (ie, is it a U-mode or M-mode string). That's a lot of complexity to try and keep ABI compatible and it's probably going to continue to grow, as even if there's no more complexity in the specifications we'll have to deal with the various ISA string parsing oddities that end up all over userspace.
Instead this patch set takes a very different approach and provides a set of key/value pairs that encode various bits about the system. The big advantage here is that we can clearly define what these mean so we can ensure ABI stability, but it also allows us to encode information that's unlikely to ever appear in an ISA string (see the misaligned access performance, for example). The resulting interface looks a lot like what arm64 and x86 do, and will hopefully fit well into something like ACPI in the future.
The actual user interface is a syscall. I'm not really sure that's the right way to go about this, but it makes for flexible prototying. Various other approaches have been talked about like making HWCAP2 a pointer, having a VDSO routine, or exposing this via sysfs. Those seem like generally reasonable approaches, but I've yet to figure out a way to get the general case working without a syscall as that's the only way I've come up with to deal with the heterogenous CPU case. Happy to hear if someone has a better idea, though, as I don't really want to add a syscall if we can avoid it.
An example series in glibc exposing this syscall and using it in an ifunc selector for memcpy can be found at [1].
[1] https://public-inbox.org/libc-alpha/20230206194819.1679472-1-evan@rivosinc.c...
Changes in v2: - Changed the interface to look more like poll(). Rather than supplying key_offset and getting back an array of values with numerically contiguous keys, have the user pre-fill the key members of the array, and the kernel will fill in the corresponding values. For any key it doesn't recognize, it will set the key of that element to -1. This allows usermode to quickly ask for exactly the elements it cares about, and not get bogged down in a back and forth about newer keys that older kernels might not recognize. In other words, the kernel can communicate that it doesn't recognize some of the keys while still providing the data for the keys it does know. - Added a shortcut to the cpuset parameters that if a size of 0 and NULL is provided for the CPU set, the kernel will use a cpu mask of all online CPUs. This is convenient because I suspect most callers will only want to act on a feature if it's supported on all CPUs, and it's a headache to dynamically allocate an array of all 1s, not to mention a waste to have the kernel loop over all of the offline bits. - Fixed logic error in if(of_property_read_string...) that caused crash - Include cpufeature.h in cpufeature.h to avoid undeclared variable warning. - Added a _MASK define - Fix random checkpatch complaints - Updated the selftests to the new API and added some more. - Fixed indentation, comments in .S, and general checkpatch complaints.
Evan Green (4): RISC-V: Move struct riscv_cpuinfo to new header RISC-V: Add a syscall for HW probing RISC-V: hwprobe: Support probing of misaligned access performance selftests: Test the new RISC-V hwprobe interface
Palmer Dabbelt (2): RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA dt-bindings: Add RISC-V misaligned access performance
.../devicetree/bindings/riscv/cpus.yaml | 15 ++ Documentation/riscv/hwprobe.rst | 66 ++++++ Documentation/riscv/index.rst | 1 + arch/riscv/include/asm/cpufeature.h | 23 +++ arch/riscv/include/asm/hwprobe.h | 13 ++ arch/riscv/include/asm/smp.h | 9 + arch/riscv/include/asm/syscall.h | 3 + arch/riscv/include/uapi/asm/hwprobe.h | 35 ++++ arch/riscv/include/uapi/asm/unistd.h | 8 + arch/riscv/kernel/cpu.c | 11 +- arch/riscv/kernel/cpufeature.c | 31 ++- arch/riscv/kernel/sys_riscv.c | 192 +++++++++++++++++- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/riscv/Makefile | 58 ++++++ .../testing/selftests/riscv/hwprobe/Makefile | 10 + .../testing/selftests/riscv/hwprobe/hwprobe.c | 89 ++++++++ .../selftests/riscv/hwprobe/sys_hwprobe.S | 12 ++ tools/testing/selftests/riscv/libc.S | 46 +++++ 18 files changed, 613 insertions(+), 10 deletions(-) create mode 100644 Documentation/riscv/hwprobe.rst create mode 100644 arch/riscv/include/asm/cpufeature.h create mode 100644 arch/riscv/include/asm/hwprobe.h create mode 100644 arch/riscv/include/uapi/asm/hwprobe.h create mode 100644 tools/testing/selftests/riscv/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/hwprobe.c create mode 100644 tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S create mode 100644 tools/testing/selftests/riscv/libc.S
This adds a test for the recently added RISC-V interface for probing hardware capabilities. It happens to be the first selftest we have for RISC-V, so I've added some infrastructure for those as well. The build stuff looks pretty straight-forward, but there's also a tiny C library to avoid coupling this to any userspace implementation.
Co-developed-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Evan Green evan@rivosinc.com
---
Changes in v2: - Updated the selftests to the new API and added some more. - Fixed indentation, comments in .S, and general checkpatch complaints.
--- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/riscv/Makefile | 58 ++++++++++++ .../testing/selftests/riscv/hwprobe/Makefile | 10 +++ .../testing/selftests/riscv/hwprobe/hwprobe.c | 89 +++++++++++++++++++ .../selftests/riscv/hwprobe/sys_hwprobe.S | 12 +++ tools/testing/selftests/riscv/libc.S | 46 ++++++++++ 6 files changed, 216 insertions(+) create mode 100644 tools/testing/selftests/riscv/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/hwprobe.c create mode 100644 tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S create mode 100644 tools/testing/selftests/riscv/libc.S
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 41b649452560..a599ef726310 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -62,6 +62,7 @@ TARGETS += pstore TARGETS += ptrace TARGETS += openat2 TARGETS += resctrl +TARGETS += riscv TARGETS += rlimits TARGETS += rseq TARGETS += rtc diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile new file mode 100644 index 000000000000..32a72902d045 --- /dev/null +++ b/tools/testing/selftests/riscv/Makefile @@ -0,0 +1,58 @@ +# SPDX-License-Identifier: GPL-2.0 +# Originally tools/testing/arm64/Makefile + +# When ARCH not overridden for crosscompiling, lookup machine +ARCH ?= $(shell uname -m 2>/dev/null || echo not) + +ifneq (,$(filter $(ARCH),riscv)) +RISCV_SUBTARGETS ?= hwprobe +else +RISCV_SUBTARGETS := +endif + +CFLAGS := -Wall -O2 -g + +# A proper top_srcdir is needed by KSFT(lib.mk) +top_srcdir = $(realpath ../../../../) + +# Additional include paths needed by kselftest.h and local headers +CFLAGS += -I$(top_srcdir)/tools/testing/selftests/ + +CFLAGS += $(KHDR_INCLUDES) + +export CFLAGS +export top_srcdir + +all: + @for DIR in $(RISCV_SUBTARGETS); do \ + BUILD_TARGET=$(OUTPUT)/$$DIR; \ + mkdir -p $$BUILD_TARGET; \ + $(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@; \ + done + +install: all + @for DIR in $(RISCV_SUBTARGETS); do \ + BUILD_TARGET=$(OUTPUT)/$$DIR; \ + $(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@; \ + done + +run_tests: all + @for DIR in $(RISCV_SUBTARGETS); do \ + BUILD_TARGET=$(OUTPUT)/$$DIR; \ + $(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@; \ + done + +# Avoid any output on non riscv on emit_tests +emit_tests: all + @for DIR in $(RISCV_SUBTARGETS); do \ + BUILD_TARGET=$(OUTPUT)/$$DIR; \ + $(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@; \ + done + +clean: + @for DIR in $(RISCV_SUBTARGETS); do \ + BUILD_TARGET=$(OUTPUT)/$$DIR; \ + $(MAKE) OUTPUT=$$BUILD_TARGET -C $$DIR $@; \ + done + +.PHONY: all clean install run_tests emit_tests diff --git a/tools/testing/selftests/riscv/hwprobe/Makefile b/tools/testing/selftests/riscv/hwprobe/Makefile new file mode 100644 index 000000000000..614501584803 --- /dev/null +++ b/tools/testing/selftests/riscv/hwprobe/Makefile @@ -0,0 +1,10 @@ +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2021 ARM Limited +# Originally tools/testing/arm64/abi/Makefile + +TEST_GEN_PROGS := hwprobe + +include ../../lib.mk + +$(OUTPUT)/hwprobe: hwprobe.c ../libc.S sys_hwprobe.S + $(CC) -o$@ $(CFLAGS) $(LDFLAGS) -nostdlib $^ diff --git a/tools/testing/selftests/riscv/hwprobe/hwprobe.c b/tools/testing/selftests/riscv/hwprobe/hwprobe.c new file mode 100644 index 000000000000..ddfb61de2938 --- /dev/null +++ b/tools/testing/selftests/riscv/hwprobe/hwprobe.c @@ -0,0 +1,89 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <asm/hwprobe.h> + +/* + * Rather than relying on having a new enough libc to define this, just do it + * ourselves. This way we don't need to be coupled to a new-enough libc to + * contain the call. + */ +long riscv_hwprobe(struct riscv_hwprobe *pairs, long pair_count, + long cpu_count, unsigned long *cpus, unsigned long flags); + +int main(int argc, char **argv) +{ + struct riscv_hwprobe pairs[8]; + unsigned long cpus; + long out; + + /* Fake the CPU_SET ops. */ + cpus = -1; + + /* + * Just run a basic test: pass enough pairs to get up to the base + * behavior, and then check to make sure it's sane. + */ + for (long i = 0; i < 8; i++) + pairs[i].key = i; + out = riscv_hwprobe(pairs, 8, 1, &cpus, 0); + if (out != 0) + return -1; + for (long i = 0; i < 4; ++i) { + /* Fail if the kernel claims not to recognize a base key. */ + if ((i < 4) && (pairs[i].key != i)) + return -2; + + if (pairs[i].key != RISCV_HWPROBE_KEY_BASE_BEHAVIOR) + continue; + + if (pairs[i].value & RISCV_HWPROBE_BASE_BEHAVIOR_IMA) + continue; + + return -3; + } + + /* + * This should also work with a NULL CPU set, but should not work + * with an improperly supplied CPU set. + */ + out = riscv_hwprobe(pairs, 8, 0, 0, 0); + if (out != 0) + return -4; + + out = riscv_hwprobe(pairs, 8, 0, &cpus, 0); + if (out == 0) + return -5; + + out = riscv_hwprobe(pairs, 8, 1, 0, 0); + if (out == 0) + return -6; + + /* + * Check that keys work by providing one that we know exists, and + * checking to make sure the resultig pair is what we asked for. + */ + pairs[0].key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR; + out = riscv_hwprobe(pairs, 1, 1, &cpus, 0); + if (out != 0) + return -7; + if (pairs[0].key != RISCV_HWPROBE_KEY_BASE_BEHAVIOR) + return -8; + + /* + * Check that an unknown key gets overwritten with -1, + * but doesn't block elements after it. + */ + pairs[0].key = 0x5555; + pairs[1].key = 1; + pairs[1].value = 0xAAAA; + out = riscv_hwprobe(pairs, 2, 0, 0, 0); + if (out != 0) + return -9; + + if (pairs[0].key != -1) + return -10; + + if ((pairs[1].key != 1) || (pairs[1].value == 0xAAAA)) + return -11; + + return 0; +} diff --git a/tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S b/tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S new file mode 100644 index 000000000000..ed8d28863b27 --- /dev/null +++ b/tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (C) 2022 Rivos, Inc */ + +.text +.global riscv_hwprobe +riscv_hwprobe: + # Put __NR_riscv_hwprobe in the syscall number register, then just shim + # back the kernel's return. This doesn't do any sort of errno + # handling, the caller can deal with it. + li a7, 258 + ecall + ret diff --git a/tools/testing/selftests/riscv/libc.S b/tools/testing/selftests/riscv/libc.S new file mode 100644 index 000000000000..1041bbea9b6b --- /dev/null +++ b/tools/testing/selftests/riscv/libc.S @@ -0,0 +1,46 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (C) 2022 Rivos, Inc */ +/* A C library */ + +#if __riscv_xlen == 64 +# define REG_S sd +#else +# define REG_S sw +#endif + +.text +.global _start +_start: +.option push +.option norelax + la gp, __global_pointer$ +.option pop + + la sp, stack + + la t0, heap + la t1, brk + REG_S t0, 0(t1) + + li a0, 0 + li a1, 0 + + call main + + li a7, 93 + ecall + +1: + j 1b + +.data +brk: + .long 0 + +.global heap +heap: +.rep 65536 +.byte 0 +.endr +.global stack +stack:
On Mon, Feb 06, 2023 at 12:14:55PM -0800, Evan Green wrote:
+int main(int argc, char **argv) +{
--- /dev/null +++ b/tools/testing/selftests/riscv/libc.S
+.global _start +_start: +.option push +.option norelax
- la gp, __global_pointer$
+.option pop
- la sp, stack
- la t0, heap
- la t1, brk
- REG_S t0, 0(t1)
- li a0, 0
- li a1, 0
- call main
This looks like it's just a standard program entry but I don't speak RISC-V asm so I might be missing something. If that's the case might it make sense to use nolibc here?
On Mon, Feb 6, 2023 at 1:27 PM Mark Brown broonie@kernel.org wrote:
On Mon, Feb 06, 2023 at 12:14:55PM -0800, Evan Green wrote:
+int main(int argc, char **argv) +{
--- /dev/null +++ b/tools/testing/selftests/riscv/libc.S
+.global _start +_start: +.option push +.option norelax
la gp, __global_pointer$
+.option pop
la sp, stack
la t0, heap
la t1, brk
REG_S t0, 0(t1)
li a0, 0
li a1, 0
call main
This looks like it's just a standard program entry but I don't speak RISC-V asm so I might be missing something. If that's the case might it make sense to use nolibc here?
I think I can just remove this file entirely along with -nostdlib, and just let the compiler add in this glue. -Evan
On 6 Feb 2023, at 20:14, Evan Green evan@rivosinc.com wrote:
These are very much up for discussion, as it's a pretty big new user interface and it's quite a bit different from how we've historically done things: this isn't just providing an ISA string to userspace, this has its own format for providing information to userspace.
There's been a bunch of off-list discussions about this, including at Plumbers. The original plan was to do something involving providing an ISA string to userspace, but ISA strings just aren't sufficient for a stable ABI any more: in order to parse an ISA string users need the version of the specifications that the string is written to, the version of each extension (sometimes at a finer granularity than the RISC-V releases/versions encode), and the expected use case for the ISA string (ie, is it a U-mode or M-mode string). That's a lot of complexity to try and keep ABI compatible and it's probably going to continue to grow, as even if there's no more complexity in the specifications we'll have to deal with the various ISA string parsing oddities that end up all over userspace.
Instead this patch set takes a very different approach and provides a set of key/value pairs that encode various bits about the system. The big advantage here is that we can clearly define what these mean so we can ensure ABI stability, but it also allows us to encode information that's unlikely to ever appear in an ISA string (see the misaligned access performance, for example). The resulting interface looks a lot like what arm64 and x86 do, and will hopefully fit well into something like ACPI in the future.
The actual user interface is a syscall. I'm not really sure that's the right way to go about this, but it makes for flexible prototying. Various other approaches have been talked about like making HWCAP2 a pointer, having a VDSO routine, or exposing this via sysfs. Those seem like generally reasonable approaches, but I've yet to figure out a way to get the general case working without a syscall as that's the only way I've come up with to deal with the heterogenous CPU case. Happy to hear if someone has a better idea, though, as I don't really want to add a syscall if we can avoid it.
Please work with https://github.com/riscv-non-isa/riscv-c-api-doc as it’s crucial we have a portable standard interface for applications to query this information that works on OSes other than Linux. This can be backed by whatever you want, whether a syscall, magic VDSO thing, sysfs, etc, but it’s key that the exposed interface outside of libc is not Linux-specific otherwise we’re going to get fragmentation in this space.
I would encourage figuring out the right shape for the exposed interface first before continuing to refine details of how that information gets communicated between the kernel and libc.
Jess
An example series in glibc exposing this syscall and using it in an ifunc selector for memcpy can be found at [1].
[1] https://public-inbox.org/libc-alpha/20230206194819.1679472-1-evan@rivosinc.c...
Changes in v2:
- Changed the interface to look more like poll(). Rather than supplying key_offset and getting back an array of values with numerically contiguous keys, have the user pre-fill the key members of the array, and the kernel will fill in the corresponding values. For any key it doesn't recognize, it will set the key of that element to -1. This allows usermode to quickly ask for exactly the elements it cares about, and not get bogged down in a back and forth about newer keys that older kernels might not recognize. In other words, the kernel can communicate that it doesn't recognize some of the keys while still providing the data for the keys it does know.
- Added a shortcut to the cpuset parameters that if a size of 0 and NULL is provided for the CPU set, the kernel will use a cpu mask of all online CPUs. This is convenient because I suspect most callers will only want to act on a feature if it's supported on all CPUs, and it's a headache to dynamically allocate an array of all 1s, not to mention a waste to have the kernel loop over all of the offline bits.
- Fixed logic error in if(of_property_read_string...) that caused crash
- Include cpufeature.h in cpufeature.h to avoid undeclared variable warning.
- Added a _MASK define
- Fix random checkpatch complaints
- Updated the selftests to the new API and added some more.
- Fixed indentation, comments in .S, and general checkpatch complaints.
Evan Green (4): RISC-V: Move struct riscv_cpuinfo to new header RISC-V: Add a syscall for HW probing RISC-V: hwprobe: Support probing of misaligned access performance selftests: Test the new RISC-V hwprobe interface
Palmer Dabbelt (2): RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA dt-bindings: Add RISC-V misaligned access performance
.../devicetree/bindings/riscv/cpus.yaml | 15 ++ Documentation/riscv/hwprobe.rst | 66 ++++++ Documentation/riscv/index.rst | 1 + arch/riscv/include/asm/cpufeature.h | 23 +++ arch/riscv/include/asm/hwprobe.h | 13 ++ arch/riscv/include/asm/smp.h | 9 + arch/riscv/include/asm/syscall.h | 3 + arch/riscv/include/uapi/asm/hwprobe.h | 35 ++++ arch/riscv/include/uapi/asm/unistd.h | 8 + arch/riscv/kernel/cpu.c | 11 +- arch/riscv/kernel/cpufeature.c | 31 ++- arch/riscv/kernel/sys_riscv.c | 192 +++++++++++++++++- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/riscv/Makefile | 58 ++++++ .../testing/selftests/riscv/hwprobe/Makefile | 10 + .../testing/selftests/riscv/hwprobe/hwprobe.c | 89 ++++++++ .../selftests/riscv/hwprobe/sys_hwprobe.S | 12 ++ tools/testing/selftests/riscv/libc.S | 46 +++++ 18 files changed, 613 insertions(+), 10 deletions(-) create mode 100644 Documentation/riscv/hwprobe.rst create mode 100644 arch/riscv/include/asm/cpufeature.h create mode 100644 arch/riscv/include/asm/hwprobe.h create mode 100644 arch/riscv/include/uapi/asm/hwprobe.h create mode 100644 tools/testing/selftests/riscv/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/hwprobe.c create mode 100644 tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S create mode 100644 tools/testing/selftests/riscv/libc.S
-- 2.25.1
linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv
On 2/6/23 22:11, Jessica Clarke wrote:
On 6 Feb 2023, at 20:14, Evan Green evan@rivosinc.com wrote:
These are very much up for discussion, as it's a pretty big new user interface and it's quite a bit different from how we've historically done things: this isn't just providing an ISA string to userspace, this has its own format for providing information to userspace.
There's been a bunch of off-list discussions about this, including at Plumbers. The original plan was to do something involving providing an ISA string to userspace, but ISA strings just aren't sufficient for a stable ABI any more: in order to parse an ISA string users need the version of the specifications that the string is written to, the version of each extension (sometimes at a finer granularity than the RISC-V releases/versions encode), and the expected use case for the ISA string (ie, is it a U-mode or M-mode string). That's a lot of complexity to try and keep ABI compatible and it's probably going to continue to grow, as even if there's no more complexity in the specifications we'll have to deal with the various ISA string parsing oddities that end up all over userspace.
Instead this patch set takes a very different approach and provides a set of key/value pairs that encode various bits about the system. The big advantage here is that we can clearly define what these mean so we can ensure ABI stability, but it also allows us to encode information that's unlikely to ever appear in an ISA string (see the misaligned access performance, for example). The resulting interface looks a lot like what arm64 and x86 do, and will hopefully fit well into something like ACPI in the future.
The actual user interface is a syscall. I'm not really sure that's the right way to go about this, but it makes for flexible prototying. Various other approaches have been talked about like making HWCAP2 a pointer, having a VDSO routine, or exposing this via sysfs. Those seem like generally reasonable approaches, but I've yet to figure out a way to get the general case working without a syscall as that's the only way I've come up with to deal with the heterogenous CPU case. Happy to hear if someone has a better idea, though, as I don't really want to add a syscall if we can avoid it.
Operating systems tend to reschedule threads moving them between harts. New threads may be created by processes at any time.
It is not clear to me what information the syscall shall convey in the heterogeneous case. I see the following alternatives:
* The syscall describes the current hart. * The syscall provides individual properties of all harts. * The syscall provides a set of properties that is valid for any hart on which the thread might be scheduled. * The syscall provides a set of properties that is valid for any hart that any thread of the current process might be scheduled to.
Describing only the current hart would not be helpful as the thread might be rescheduled to a hart with a smaller set of available extensions.
Describing the properties of all harts would not be helpful if the thread has no control to which hart it is scheduled.
Processes that don't control scheduling would most benefit from a guaranteed set of properties valid for all threads of the process.
Processes that take control of scheduling would probably want information about all harts.
Best regards
Heinrich
Please work with https://github.com/riscv-non-isa/riscv-c-api-doc as it’s crucial we have a portable standard interface for applications to query this information that works on OSes other than Linux. This can be backed by whatever you want, whether a syscall, magic VDSO thing, sysfs, etc, but it’s key that the exposed interface outside of libc is not Linux-specific otherwise we’re going to get fragmentation in this space.
I would encourage figuring out the right shape for the exposed interface first before continuing to refine details of how that information gets communicated between the kernel and libc.
Jess
An example series in glibc exposing this syscall and using it in an ifunc selector for memcpy can be found at [1].
[1] https://public-inbox.org/libc-alpha/20230206194819.1679472-1-evan@rivosinc.c...
Changes in v2:
- Changed the interface to look more like poll(). Rather than supplying key_offset and getting back an array of values with numerically contiguous keys, have the user pre-fill the key members of the array, and the kernel will fill in the corresponding values. For any key it doesn't recognize, it will set the key of that element to -1. This allows usermode to quickly ask for exactly the elements it cares about, and not get bogged down in a back and forth about newer keys that older kernels might not recognize. In other words, the kernel can communicate that it doesn't recognize some of the keys while still providing the data for the keys it does know.
- Added a shortcut to the cpuset parameters that if a size of 0 and NULL is provided for the CPU set, the kernel will use a cpu mask of all online CPUs. This is convenient because I suspect most callers will only want to act on a feature if it's supported on all CPUs, and it's a headache to dynamically allocate an array of all 1s, not to mention a waste to have the kernel loop over all of the offline bits.
- Fixed logic error in if(of_property_read_string...) that caused crash
- Include cpufeature.h in cpufeature.h to avoid undeclared variable warning.
- Added a _MASK define
- Fix random checkpatch complaints
- Updated the selftests to the new API and added some more.
- Fixed indentation, comments in .S, and general checkpatch complaints.
Evan Green (4): RISC-V: Move struct riscv_cpuinfo to new header RISC-V: Add a syscall for HW probing RISC-V: hwprobe: Support probing of misaligned access performance selftests: Test the new RISC-V hwprobe interface
Palmer Dabbelt (2): RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA dt-bindings: Add RISC-V misaligned access performance
.../devicetree/bindings/riscv/cpus.yaml | 15 ++ Documentation/riscv/hwprobe.rst | 66 ++++++ Documentation/riscv/index.rst | 1 + arch/riscv/include/asm/cpufeature.h | 23 +++ arch/riscv/include/asm/hwprobe.h | 13 ++ arch/riscv/include/asm/smp.h | 9 + arch/riscv/include/asm/syscall.h | 3 + arch/riscv/include/uapi/asm/hwprobe.h | 35 ++++ arch/riscv/include/uapi/asm/unistd.h | 8 + arch/riscv/kernel/cpu.c | 11 +- arch/riscv/kernel/cpufeature.c | 31 ++- arch/riscv/kernel/sys_riscv.c | 192 +++++++++++++++++- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/riscv/Makefile | 58 ++++++ .../testing/selftests/riscv/hwprobe/Makefile | 10 + .../testing/selftests/riscv/hwprobe/hwprobe.c | 89 ++++++++ .../selftests/riscv/hwprobe/sys_hwprobe.S | 12 ++ tools/testing/selftests/riscv/libc.S | 46 +++++ 18 files changed, 613 insertions(+), 10 deletions(-) create mode 100644 Documentation/riscv/hwprobe.rst create mode 100644 arch/riscv/include/asm/cpufeature.h create mode 100644 arch/riscv/include/asm/hwprobe.h create mode 100644 arch/riscv/include/uapi/asm/hwprobe.h create mode 100644 tools/testing/selftests/riscv/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/hwprobe.c create mode 100644 tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S create mode 100644 tools/testing/selftests/riscv/libc.S
-- 2.25.1
linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv
On Mon, 06 Feb 2023 14:47:35 PST (-0800), heinrich.schuchardt@canonical.com wrote:
On 2/6/23 22:11, Jessica Clarke wrote:
On 6 Feb 2023, at 20:14, Evan Green evan@rivosinc.com wrote:
These are very much up for discussion, as it's a pretty big new user interface and it's quite a bit different from how we've historically done things: this isn't just providing an ISA string to userspace, this has its own format for providing information to userspace.
There's been a bunch of off-list discussions about this, including at Plumbers. The original plan was to do something involving providing an ISA string to userspace, but ISA strings just aren't sufficient for a stable ABI any more: in order to parse an ISA string users need the version of the specifications that the string is written to, the version of each extension (sometimes at a finer granularity than the RISC-V releases/versions encode), and the expected use case for the ISA string (ie, is it a U-mode or M-mode string). That's a lot of complexity to try and keep ABI compatible and it's probably going to continue to grow, as even if there's no more complexity in the specifications we'll have to deal with the various ISA string parsing oddities that end up all over userspace.
Instead this patch set takes a very different approach and provides a set of key/value pairs that encode various bits about the system. The big advantage here is that we can clearly define what these mean so we can ensure ABI stability, but it also allows us to encode information that's unlikely to ever appear in an ISA string (see the misaligned access performance, for example). The resulting interface looks a lot like what arm64 and x86 do, and will hopefully fit well into something like ACPI in the future.
The actual user interface is a syscall. I'm not really sure that's the right way to go about this, but it makes for flexible prototying. Various other approaches have been talked about like making HWCAP2 a pointer, having a VDSO routine, or exposing this via sysfs. Those seem like generally reasonable approaches, but I've yet to figure out a way to get the general case working without a syscall as that's the only way I've come up with to deal with the heterogenous CPU case. Happy to hear if someone has a better idea, though, as I don't really want to add a syscall if we can avoid it.
Operating systems tend to reschedule threads moving them between harts. New threads may be created by processes at any time.
It is not clear to me what information the syscall shall convey in the heterogeneous case. I see the following alternatives:
- The syscall describes the current hart.
- The syscall provides individual properties of all harts.
- The syscall provides a set of properties that is valid for any hart on
which the thread might be scheduled.
- The syscall provides a set of properties that is valid for any hart
that any thread of the current process might be scheduled to.
Describing only the current hart would not be helpful as the thread might be rescheduled to a hart with a smaller set of available extensions.
Describing the properties of all harts would not be helpful if the thread has no control to which hart it is scheduled.
Processes that don't control scheduling would most benefit from a guaranteed set of properties valid for all threads of the process.
Processes that take control of scheduling would probably want information about all harts.
There's a cpu_set_t argument. We tried to answer this via the Documentation patch. It's just the single sentence
The CPU set is defined by CPU_SET(3), the indicated features will be supported on all CPUs in the set.
so maybe it needs beefing up... Do you mind commenting on the doc diff, if you've got any ideas as to how to word it better? That way anyone else reviewing the docs will see too.
Best regards
Heinrich
Please work with https://github.com/riscv-non-isa/riscv-c-api-doc as it’s crucial we have a portable standard interface for applications to query this information that works on OSes other than Linux. This can be backed by whatever you want, whether a syscall, magic VDSO thing, sysfs, etc, but it’s key that the exposed interface outside of libc is not Linux-specific otherwise we’re going to get fragmentation in this space.
I would encourage figuring out the right shape for the exposed interface first before continuing to refine details of how that information gets communicated between the kernel and libc.
Jess
An example series in glibc exposing this syscall and using it in an ifunc selector for memcpy can be found at [1].
[1] https://public-inbox.org/libc-alpha/20230206194819.1679472-1-evan@rivosinc.c...
Changes in v2:
- Changed the interface to look more like poll(). Rather than supplying key_offset and getting back an array of values with numerically contiguous keys, have the user pre-fill the key members of the array, and the kernel will fill in the corresponding values. For any key it doesn't recognize, it will set the key of that element to -1. This allows usermode to quickly ask for exactly the elements it cares about, and not get bogged down in a back and forth about newer keys that older kernels might not recognize. In other words, the kernel can communicate that it doesn't recognize some of the keys while still providing the data for the keys it does know.
- Added a shortcut to the cpuset parameters that if a size of 0 and NULL is provided for the CPU set, the kernel will use a cpu mask of all online CPUs. This is convenient because I suspect most callers will only want to act on a feature if it's supported on all CPUs, and it's a headache to dynamically allocate an array of all 1s, not to mention a waste to have the kernel loop over all of the offline bits.
- Fixed logic error in if(of_property_read_string...) that caused crash
- Include cpufeature.h in cpufeature.h to avoid undeclared variable warning.
- Added a _MASK define
- Fix random checkpatch complaints
- Updated the selftests to the new API and added some more.
- Fixed indentation, comments in .S, and general checkpatch complaints.
Evan Green (4): RISC-V: Move struct riscv_cpuinfo to new header RISC-V: Add a syscall for HW probing RISC-V: hwprobe: Support probing of misaligned access performance selftests: Test the new RISC-V hwprobe interface
Palmer Dabbelt (2): RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA dt-bindings: Add RISC-V misaligned access performance
.../devicetree/bindings/riscv/cpus.yaml | 15 ++ Documentation/riscv/hwprobe.rst | 66 ++++++ Documentation/riscv/index.rst | 1 + arch/riscv/include/asm/cpufeature.h | 23 +++ arch/riscv/include/asm/hwprobe.h | 13 ++ arch/riscv/include/asm/smp.h | 9 + arch/riscv/include/asm/syscall.h | 3 + arch/riscv/include/uapi/asm/hwprobe.h | 35 ++++ arch/riscv/include/uapi/asm/unistd.h | 8 + arch/riscv/kernel/cpu.c | 11 +- arch/riscv/kernel/cpufeature.c | 31 ++- arch/riscv/kernel/sys_riscv.c | 192 +++++++++++++++++- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/riscv/Makefile | 58 ++++++ .../testing/selftests/riscv/hwprobe/Makefile | 10 + .../testing/selftests/riscv/hwprobe/hwprobe.c | 89 ++++++++ .../selftests/riscv/hwprobe/sys_hwprobe.S | 12 ++ tools/testing/selftests/riscv/libc.S | 46 +++++ 18 files changed, 613 insertions(+), 10 deletions(-) create mode 100644 Documentation/riscv/hwprobe.rst create mode 100644 arch/riscv/include/asm/cpufeature.h create mode 100644 arch/riscv/include/asm/hwprobe.h create mode 100644 arch/riscv/include/uapi/asm/hwprobe.h create mode 100644 tools/testing/selftests/riscv/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/hwprobe.c create mode 100644 tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S create mode 100644 tools/testing/selftests/riscv/libc.S
-- 2.25.1
linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv
Hey Evan, Having been talking to Palmer about this series at FOSDEM, it was a very pleasant surprise to see this when I saw this in my inbox when I landed back home. I do very much intend reviewing this, but...
On 6 February 2023 20:14:49 GMT, Evan Green evan@rivosinc.com wrote:
These are very much up for discussion, as it's a pretty big new user interface and it's quite a bit different from how we've historically done things: this isn't just providing an ISA string to userspace, this has its own format for providing information to userspace.
There's been a bunch of off-list discussions about this, including at Plumbers. The original plan was to do something involving providing an ISA string to userspace, but ISA strings just aren't sufficient for a stable ABI any more: in order to parse an ISA string users need the version of the specifications that the string is written to, the version of each extension (sometimes at a finer granularity than the RISC-V releases/versions encode), and the expected use case for the ISA string (ie, is it a U-mode or M-mode string). That's a lot of complexity to try and keep ABI compatible and it's probably going to continue to grow, as even if there's no more complexity in the specifications we'll have to deal with the various ISA string parsing oddities that end up all over userspace.
Instead this patch set takes a very different approach and provides a set of key/value pairs that encode various bits about the system. The big advantage here is that we can clearly define what these mean so we can ensure ABI stability, but it also allows us to encode information that's unlikely to ever appear in an ISA string (see the misaligned access performance, for example). The resulting interface looks a lot like what arm64 and x86 do, and will hopefully fit well into something like ACPI in the future.
The actual user interface is a syscall. I'm not really sure that's the right way to go about this, but it makes for flexible prototying. Various other approaches have been talked about like making HWCAP2 a pointer, having a VDSO routine, or exposing this via sysfs. Those seem like generally reasonable approaches, but I've yet to figure out a way
This all looks to be the same cover message as Palmer submitted with the v1, so, as I'd mentioned to him the other day, I'd like to do a bit of an investigation into the sysfs approach drew suggested on the v1. So, if it's a little bit before you hear - I've certainly not forgotten about the series!
Thanks, Conor.
to get the general case working without a syscall as that's the only way I've come up with to deal with the heterogenous CPU case. Happy to hear if someone has a better idea, though, as I don't really want to add a syscall if we can avoid it.
An example series in glibc exposing this syscall and using it in an ifunc selector for memcpy can be found at [1].
[1] https://public-inbox.org/libc-alpha/20230206194819.1679472-1-evan@rivosinc.c...
Changes in v2:
- Changed the interface to look more like poll(). Rather than supplying key_offset and getting back an array of values with numerically contiguous keys, have the user pre-fill the key members of the array, and the kernel will fill in the corresponding values. For any key it doesn't recognize, it will set the key of that element to -1. This allows usermode to quickly ask for exactly the elements it cares about, and not get bogged down in a back and forth about newer keys that older kernels might not recognize. In other words, the kernel can communicate that it doesn't recognize some of the keys while still providing the data for the keys it does know.
- Added a shortcut to the cpuset parameters that if a size of 0 and NULL is provided for the CPU set, the kernel will use a cpu mask of all online CPUs. This is convenient because I suspect most callers will only want to act on a feature if it's supported on all CPUs, and it's a headache to dynamically allocate an array of all 1s, not to mention a waste to have the kernel loop over all of the offline bits.
- Fixed logic error in if(of_property_read_string...) that caused crash
- Include cpufeature.h in cpufeature.h to avoid undeclared variable warning.
- Added a _MASK define
- Fix random checkpatch complaints
- Updated the selftests to the new API and added some more.
- Fixed indentation, comments in .S, and general checkpatch complaints.
Evan Green (4): RISC-V: Move struct riscv_cpuinfo to new header RISC-V: Add a syscall for HW probing RISC-V: hwprobe: Support probing of misaligned access performance selftests: Test the new RISC-V hwprobe interface
Palmer Dabbelt (2): RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA dt-bindings: Add RISC-V misaligned access performance
.../devicetree/bindings/riscv/cpus.yaml | 15 ++ Documentation/riscv/hwprobe.rst | 66 ++++++ Documentation/riscv/index.rst | 1 + arch/riscv/include/asm/cpufeature.h | 23 +++ arch/riscv/include/asm/hwprobe.h | 13 ++ arch/riscv/include/asm/smp.h | 9 + arch/riscv/include/asm/syscall.h | 3 + arch/riscv/include/uapi/asm/hwprobe.h | 35 ++++ arch/riscv/include/uapi/asm/unistd.h | 8 + arch/riscv/kernel/cpu.c | 11 +- arch/riscv/kernel/cpufeature.c | 31 ++- arch/riscv/kernel/sys_riscv.c | 192 +++++++++++++++++- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/riscv/Makefile | 58 ++++++ .../testing/selftests/riscv/hwprobe/Makefile | 10 + .../testing/selftests/riscv/hwprobe/hwprobe.c | 89 ++++++++ .../selftests/riscv/hwprobe/sys_hwprobe.S | 12 ++ tools/testing/selftests/riscv/libc.S | 46 +++++ 18 files changed, 613 insertions(+), 10 deletions(-) create mode 100644 Documentation/riscv/hwprobe.rst create mode 100644 arch/riscv/include/asm/cpufeature.h create mode 100644 arch/riscv/include/asm/hwprobe.h create mode 100644 arch/riscv/include/uapi/asm/hwprobe.h create mode 100644 tools/testing/selftests/riscv/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/Makefile create mode 100644 tools/testing/selftests/riscv/hwprobe/hwprobe.c create mode 100644 tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S create mode 100644 tools/testing/selftests/riscv/libc.S
linux-kselftest-mirror@lists.linaro.org