This series wires up getrandom() vDSO implementation on powerpc.
Tested on PPC32.
Performance on powerpc 885 (using kernel selftest):
~# ./vdso_test_getrandom bench-single
   vdso: 2500000 times in 7.897495392 seconds
   libc: 2500000 times in 56.091632232 seconds
syscall: 2500000 times in 55.704851989 seconds
Performance on powerpc 8321 (using kernel selftest):
~# ./vdso_test_getrandom bench-single
   vdso: 2500000 times in 2.017183250 seconds
   libc: 2500000 times in 13.088533630 seconds
syscall: 2500000 times in 12.952458068 seconds
PPC64 is only build-tested. There is a problem with the vdso_test_getrandom selftest: it doesn't find the vDSO symbol __kernel_getrandom. The same problem exists with vdso_test_gettimeofday, so it is not specific to getrandom.
One odd thing remains to be clarified: the format of the key passed to __arch_chacha20_blocks_nostack(). In struct vgetrandom_state it is declared as an array of u32, but in reality it seems to be flat storage that needs to be loaded in reversed byte order, so it should be defined either as an array of bytes or as an array of __le32, but not as an array of u32. This has no functional impact and can be clarified and fixed in a follow-up patch.
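To make the ambiguity concrete (an illustration only, not code from the series; the helper name is invented): if the storage really holds little-endian 32-bit words, a big-endian C implementation would have to swap each word while loading the key into ChaCha state words 4..11, whereas a plain u32 array could be used as-is.

#include <linux/types.h>
#include <asm/byteorder.h>

/*
 * Hypothetical helper: load a key stored as little-endian 32-bit words
 * into the ChaCha state on any endianness. Assumes 'key' is at least
 * 4-byte aligned.
 */
static inline void chacha_load_key_le(u32 state[16], const u8 *key)
{
	int i;

	for (i = 0; i < 8; i++)
		state[4 + i] = le32_to_cpup((const __le32 *)(key + 4 * i));
}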
Changes in v2:
- Define VM_DROPPABLE for powerpc/32
- Fixes generic vDSO getrandom headers to enable CONFIG_COMPAT build.
- Fixed size of generation counter
- Fixed selftests to work on non x86 architectures
Christophe Leroy (17):
  asm-generic/unaligned.h: Extract common header for vDSO
  vdso: Clean header inclusion in getrandom
  vdso: Add __arch_get_k_vdso_rng_data()
  vdso: Add missing c-getrandom-y in Makefile
  vdso: Avoid call to memset() by getrandom
  vdso: Change getrandom's generation to unsigned long
  mm: Define VM_DROPPABLE for powerpc/32
  powerpc: Add little endian variants of LWZX_BE and STWX_BE
  powerpc/vdso32: Add crtsavres
  powerpc/vdso: Refactor CFLAGS for CVDSO build
  powerpc/vdso: Wire up getrandom() vDSO implementation
  selftests: vdso: Fix powerpc64 vdso_config
  selftests: vdso: Don't hard-code location of vDSO sources
  selftests: vdso: Make test_vdso_getrandom look for the right vDSO function
  selftests: vdso: Fix build of test_vdso_chacha
  selftests: vdso: Make VDSO function call more generic
  selftests: vdso: Add support for vdso_test_random for powerpc
 arch/powerpc/Kconfig                          |   1 +
 arch/powerpc/include/asm/asm-compat.h         |   8 +
 arch/powerpc/include/asm/mman.h               |   2 +-
 arch/powerpc/include/asm/vdso/getrandom.h     |  67 ++++
 arch/powerpc/include/asm/vdso/vsyscall.h      |   6 +
 arch/powerpc/include/asm/vdso_datapage.h      |   2 +
 arch/powerpc/kernel/asm-offsets.c             |   1 +
 arch/powerpc/kernel/vdso/Makefile             |  45 ++-
 arch/powerpc/kernel/vdso/getrandom.S          |  58 ++++
 arch/powerpc/kernel/vdso/gettimeofday.S       |  13 -
 arch/powerpc/kernel/vdso/vdso32.lds.S         |   1 +
 arch/powerpc/kernel/vdso/vdso64.lds.S         |   1 +
 arch/powerpc/kernel/vdso/vgetrandom-chacha.S  | 297 ++++++++++++++++++
 arch/powerpc/kernel/vdso/vgetrandom.c         |  14 +
 arch/x86/entry/vdso/vma.c                     |   3 +
 arch/x86/include/asm/pvclock.h                |   1 +
 arch/x86/include/asm/vdso/vsyscall.h          |  10 +-
 drivers/char/random.c                         |   5 +-
 fs/proc/task_mmu.c                            |   4 +-
 include/asm-generic/unaligned.h               |  11 +-
 include/linux/mm.h                            |   4 +-
 include/trace/events/mmflags.h                |   4 +-
 include/vdso/datapage.h                       |   2 +-
 include/vdso/getrandom.h                      |   2 +-
 include/vdso/helpers.h                        |   1 +
 include/vdso/unaligned.h                      |  15 +
 lib/vdso/Makefile                             |   1 +
 lib/vdso/getrandom.c                          |  30 +-
 tools/arch/powerpc/vdso                       |   1 +
 tools/arch/x86/vdso                           |   1 +
 tools/include/linux/linkage.h                 |   4 +
 tools/testing/selftests/vDSO/Makefile         |  12 +-
 tools/testing/selftests/vDSO/vdso_call.h      |  52 +++
 tools/testing/selftests/vDSO/vdso_config.h    |  14 +-
 .../selftests/vDSO/vdso_test_getrandom.c      |  11 +-
 35 files changed, 628 insertions(+), 76 deletions(-)
 create mode 100644 arch/powerpc/include/asm/vdso/getrandom.h
 create mode 100644 arch/powerpc/kernel/vdso/getrandom.S
 create mode 100644 arch/powerpc/kernel/vdso/vgetrandom-chacha.S
 create mode 100644 arch/powerpc/kernel/vdso/vgetrandom.c
 create mode 100644 include/vdso/unaligned.h
 create mode 120000 tools/arch/powerpc/vdso
 create mode 120000 tools/arch/x86/vdso
 create mode 100644 tools/testing/selftests/vDSO/vdso_call.h
The getrandom vDSO implementation requires __get_unaligned_t() and __put_unaligned_t(), but including asm-generic/unaligned.h pulls in too many other headers.
Follow the same approach as for most things in include/vdso/, see for instance commit 8165b57bca21 ("linux/const.h: Extract common header for vDSO"): Move __get_unaligned_t and __put_unaligned_t into a new unaligned.h living in the vdso/ include directory.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
---
 include/asm-generic/unaligned.h | 11 +----------
 include/vdso/unaligned.h        | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 10 deletions(-)
 create mode 100644 include/vdso/unaligned.h
diff --git a/include/asm-generic/unaligned.h b/include/asm-generic/unaligned.h
index a84c64e5f11e..95acdd70b3b2 100644
--- a/include/asm-generic/unaligned.h
+++ b/include/asm-generic/unaligned.h
@@ -8,16 +8,7 @@
  */
 #include <linux/unaligned/packed_struct.h>
 #include <asm/byteorder.h>
-
-#define __get_unaligned_t(type, ptr) ({						\
-	const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr);	\
-	__pptr->x;								\
-})
-
-#define __put_unaligned_t(type, val, ptr) do {					\
-	struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr);		\
-	__pptr->x = (val);							\
-} while (0)
+#include <vdso/unaligned.h>
 #define get_unaligned(ptr)	__get_unaligned_t(typeof(*(ptr)), (ptr))
 #define put_unaligned(val, ptr) __put_unaligned_t(typeof(*(ptr)), (val), (ptr))
diff --git a/include/vdso/unaligned.h b/include/vdso/unaligned.h
new file mode 100644
index 000000000000..eee3d2a4dbe4
--- /dev/null
+++ b/include/vdso/unaligned.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __VDSO_UNALIGNED_H
+#define __VDSO_UNALIGNED_H
+
+#define __get_unaligned_t(type, ptr) ({						\
+	const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr);	\
+	__pptr->x;								\
+})
+
+#define __put_unaligned_t(type, val, ptr) do {					\
+	struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr);		\
+	__pptr->x = (val);							\
+} while (0)
+
+#endif /* __VDSO_UNALIGNED_H */
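For context, a minimal usage sketch of the two helpers (illustration only, not part of the patch): they go through a packed struct, so neither pointer needs any particular alignment, which is what the getrandom vDSO code relies on when copying batched random bytes.

#include <linux/types.h>
#include <vdso/unaligned.h>

/* Copy one 32-bit word between buffers that may be unaligned. */
static inline void copy_u32_unaligned(void *dst, const void *src)
{
	__put_unaligned_t(u32, __get_unaligned_t(u32, src), dst);
}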
On Thu, Aug 22, 2024 at 09:13:09AM +0200, Christophe Leroy wrote:
 include/asm-generic/unaligned.h | 11 +----------
 include/vdso/unaligned.h        | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 10 deletions(-)
 create mode 100644 include/vdso/unaligned.h
Do you need to also adjust the `#include <asm/unaligned.h>` inside of lib/vdso/getrandom.c to instead read `#include <vdso/unaligned.h>`?
Jason
On 26/08/2024 at 09:20, Jason A. Donenfeld wrote:
On Thu, Aug 22, 2024 at 09:13:09AM +0200, Christophe Leroy wrote:
 include/asm-generic/unaligned.h | 11 +----------
 include/vdso/unaligned.h        | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 10 deletions(-)
 create mode 100644 include/vdso/unaligned.h
Do you need to also adjust the `#include <asm/unaligned.h>` inside of lib/vdso/getrandom.c to instead read `#include <vdso/unaligned.h>`?
Yes, all adjustments to lib/vdso/getrandom.c are in patch 2.
Christophe
Building a VDSO32 on a 64-bit kernel is problematic when some system headers are included. See commit 8c59ab839f52 ("lib/vdso: Enable common headers") for more details.

Minimise the number of included headers by moving the needed items into dedicated common headers.

For PAGE_SIZE and PAGE_MASK, redefine them from CONFIG_PAGE_SHIFT in the same way as commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h").
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
---
 arch/x86/include/asm/pvclock.h |  1 +
 include/vdso/helpers.h         |  1 +
 lib/vdso/getrandom.c           | 15 ++++++++-------
 3 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index 0c92db84469d..6e4f8fae3ce9 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -5,6 +5,7 @@
 #include <asm/clocksource.h>
 #include <asm/pvclock-abi.h>
 
+struct timespec64;
 /* some helper functions for xen and kvm pv clock sources */
 u64 pvclock_clocksource_read(struct pvclock_vcpu_time_info *src);
 u64 pvclock_clocksource_read_nowd(struct pvclock_vcpu_time_info *src);
diff --git a/include/vdso/helpers.h b/include/vdso/helpers.h
index 73501149439d..3ddb03bb05cb 100644
--- a/include/vdso/helpers.h
+++ b/include/vdso/helpers.h
@@ -4,6 +4,7 @@
 
 #ifndef __ASSEMBLY__
 
+#include <asm/barrier.h>
 #include <vdso/datapage.h>
 
 static __always_inline u32 vdso_read_begin(const struct vdso_data *vd)
diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
index b230f0b10832..cab153c5f9be 100644
--- a/lib/vdso/getrandom.c
+++ b/lib/vdso/getrandom.c
@@ -3,15 +3,13 @@
  * Copyright (C) 2022-2024 Jason A. Donenfeld Jason@zx2c4.com. All Rights Reserved.
  */
 
-#include <linux/cache.h>
-#include <linux/kernel.h>
-#include <linux/time64.h>
+#include <linux/minmax.h>
 #include <vdso/datapage.h>
 #include <vdso/getrandom.h>
+#include <vdso/unaligned.h>
 #include <asm/vdso/getrandom.h>
-#include <asm/vdso/vsyscall.h>
-#include <asm/unaligned.h>
 #include <uapi/linux/mman.h>
+#include <uapi/linux/random.h>
 
 #define MEMCPY_AND_ZERO_SRC(type, dst, src, len) do {				\
 	while (len >= sizeof(type)) {						\
@@ -23,6 +21,9 @@
 	}									\
 } while (0)
 
+#define _PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)
+#define _PAGE_MASK (~(_PAGE_SIZE - 1))
+
 static void memcpy_and_zero_src(void *dst, void *src, size_t len)
 {
 	if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
@@ -64,7 +65,7 @@ static __always_inline ssize_t
 __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_t len,
 		       unsigned int flags, void *opaque_state, size_t opaque_len)
 {
-	ssize_t ret = min_t(size_t, INT_MAX & PAGE_MASK /* = MAX_RW_COUNT */, len);
+	ssize_t ret = min_t(size_t, INT_MAX & _PAGE_MASK /* = MAX_RW_COUNT */, len);
 	struct vgetrandom_state *state = opaque_state;
 	size_t batch_len, nblocks, orig_len = len;
 	bool in_use, have_retried = false;
@@ -82,7 +83,7 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
 	}
 
 	/* The state must not straddle a page, since pages can be zeroed at any time. */
-	if (unlikely(((unsigned long)opaque_state & ~PAGE_MASK) + sizeof(*state) > PAGE_SIZE))
+	if (unlikely(((unsigned long)opaque_state & ~_PAGE_MASK) + sizeof(*state) > _PAGE_SIZE))
 		return -EFAULT;
 
 	/* If the caller passes the wrong size, which might happen due to CRIU, fallback. */
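As a worked example of the straddle check above (numbers are illustrative, assuming 4 KiB pages and a hypothetical 256-byte state): if opaque_state sits at page offset 0xF80, then 0xF80 + 0x100 = 0x1080 > 0x1000, so the state would cross a page boundary and the function returns -EFAULT rather than risk the two halves being dropped independently.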
On Thu, Aug 22, 2024 at 09:13:10AM +0200, Christophe Leroy wrote:
diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index 0c92db84469d..6e4f8fae3ce9 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -5,6 +5,7 @@
 #include <asm/clocksource.h>
 #include <asm/pvclock-abi.h>
 
+struct timespec64;
 /* some helper functions for xen and kvm pv clock sources */
 u64 pvclock_clocksource_read(struct pvclock_vcpu_time_info *src);
 u64 pvclock_clocksource_read_nowd(struct pvclock_vcpu_time_info *src);
This change isn't mentioned in the commit message and could probably benefit from doing so.
diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
index b230f0b10832..cab153c5f9be 100644
--- a/lib/vdso/getrandom.c
+++ b/lib/vdso/getrandom.c
@@ -3,15 +3,13 @@
  * Copyright (C) 2022-2024 Jason A. Donenfeld Jason@zx2c4.com. All Rights Reserved.
  */
 
-#include <linux/cache.h>
-#include <linux/kernel.h>
-#include <linux/time64.h>
+#include <linux/minmax.h>
 #include <vdso/datapage.h>
 #include <vdso/getrandom.h>
+#include <vdso/unaligned.h>
Ah, that's where you do it. Ignore my comment on the previous commit, then.
 #include <asm/vdso/getrandom.h>
-#include <asm/vdso/vsyscall.h>
-#include <asm/unaligned.h>
 #include <uapi/linux/mman.h>
+#include <uapi/linux/random.h>
 
 #define MEMCPY_AND_ZERO_SRC(type, dst, src, len) do {				\
 	while (len >= sizeof(type)) {						\
@@ -23,6 +21,9 @@
 	}									\
 } while (0)
 
+#define _PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)
+#define _PAGE_MASK (~(_PAGE_SIZE - 1))
If PAGE_SIZE isn't defined at this point, why not just call it PAGE_SIZE instead of _PAGE_SIZE? But if that's the case, why not put the vdso definition of PAGE_SIZE into some vdso header included by this file?
On 26/08/2024 at 10:07, Jason A. Donenfeld wrote:
On Thu, Aug 22, 2024 at 09:13:10AM +0200, Christophe Leroy wrote:
+#define _PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)
+#define _PAGE_MASK (~(_PAGE_SIZE - 1))
If PAGE_SIZE isn't defined at this point, why not just call it PAGE_SIZE instead of _PAGE_SIZE? But if that's the case, why not put the vdso definition of PAGE_SIZE into some vdso header included by this file?
It was working ok on powerpc but on x86 I got:
  CC      arch/x86/entry/vdso/vgetrandom.o
In file included from arch/x86/entry/vdso/vgetrandom.c:7:
arch/x86/entry/vdso/../../../../lib/vdso/getrandom.c:24: error: "PAGE_SIZE" redefined [-Werror]
   24 | #define PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)
      |
In file included from ./arch/x86/include/asm/page.h:9,
                 from ./arch/x86/include/asm/thread_info.h:12,
                 from ./include/linux/thread_info.h:60,
                 from ./include/linux/smp.h:118,
                 from ./include/linux/alloc_tag.h:14,
                 from ./include/linux/percpu.h:5,
                 from ./arch/x86/include/asm/msr.h:15,
                 from ./arch/x86/include/asm/vdso/gettimeofday.h:19,
                 from ./include/vdso/datapage.h:164,
                 from arch/x86/entry/vdso/../../../../lib/vdso/getrandom.c:7,
                 from arch/x86/entry/vdso/vgetrandom.c:7:
./arch/x86/include/asm/page_types.h:11: note: this is the location of the previous definition
   11 | #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
      |
In file included from arch/x86/entry/vdso/vgetrandom.c:7:
arch/x86/entry/vdso/../../../../lib/vdso/getrandom.c:25: error: "PAGE_MASK" redefined [-Werror]
   25 | #define PAGE_MASK (~(PAGE_SIZE - 1))
      |
In file included from ./arch/x86/include/asm/page.h:9,
                 from ./arch/x86/include/asm/thread_info.h:12,
                 from ./include/linux/thread_info.h:60,
                 from ./include/linux/smp.h:118,
                 from ./include/linux/alloc_tag.h:14,
                 from ./include/linux/percpu.h:5,
                 from ./arch/x86/include/asm/msr.h:15,
                 from ./arch/x86/include/asm/vdso/gettimeofday.h:19,
                 from ./include/vdso/datapage.h:164,
                 from arch/x86/entry/vdso/../../../../lib/vdso/getrandom.c:7,
                 from arch/x86/entry/vdso/vgetrandom.c:7:
./arch/x86/include/asm/page_types.h:12: note: this is the location of the previous definition
   12 | #define PAGE_MASK (~(PAGE_SIZE-1))
      |
cc1: all warnings being treated as errors
Christophe
On Mon, Aug 26, 2024 at 10:37:49AM +0200, Christophe Leroy wrote:
On 26/08/2024 at 10:07, Jason A. Donenfeld wrote:
On Thu, Aug 22, 2024 at 09:13:10AM +0200, Christophe Leroy wrote:
+#define _PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)
+#define _PAGE_MASK (~(_PAGE_SIZE - 1))
If PAGE_SIZE isn't defined at this point, why not just call it PAGE_SIZE instead of _PAGE_SIZE? But if that's the case, why not put the vdso definition of PAGE_SIZE into some vdso header included by this file?
It was working ok on powerpc but on x86 I got:
Seems like there might be some more fiddling to do, then? Or did you conclude it's impossible?
On 26/08/2024 at 10:58, Jason A. Donenfeld wrote:
On Mon, Aug 26, 2024 at 10:37:49AM +0200, Christophe Leroy wrote:
On 26/08/2024 at 10:07, Jason A. Donenfeld wrote:
On Thu, Aug 22, 2024 at 09:13:10AM +0200, Christophe Leroy wrote:
+#define _PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)
+#define _PAGE_MASK (~(_PAGE_SIZE - 1))
If PAGE_SIZE isn't defined at this point, why not just call it PAGE_SIZE instead of _PAGE_SIZE? But if that's the case, why not put the vdso definition of PAGE_SIZE into some vdso header included by this file?
It was working ok on powerpc but on x86 I got:
Seems like there might be some more fiddling to do, then? Or did you conclude it's impossible?
Maybe someone who knows x86 in detail could help, but after a first look I gave up because it looks very x86-specific: it is x86's asm/vdso/gettimeofday.h that pulls in several asm/ headers, and the same type of issue might arise for any new architecture coming in.
For me it looked cleaner to just do as commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h") and not use PAGE_SIZE at all. But I didn't want to use (1UL << CONFIG_PAGE_SHIFT) and (~((1UL << CONFIG_PAGE_SHIFT) - 1)) directly in the code, hence the new macros with a leading underscore to avoid any conflict with existing macros.
Christophe
On Mon, Aug 26, 2024 at 12:45:40PM +0200, Christophe Leroy wrote:
On 26/08/2024 at 10:58, Jason A. Donenfeld wrote:
On Mon, Aug 26, 2024 at 10:37:49AM +0200, Christophe Leroy wrote:
On 26/08/2024 at 10:07, Jason A. Donenfeld wrote:
On Thu, Aug 22, 2024 at 09:13:10AM +0200, Christophe Leroy wrote:
+#define _PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)
+#define _PAGE_MASK (~(_PAGE_SIZE - 1))
If PAGE_SIZE isn't defined at this point, why not just call it PAGE_SIZE instead of _PAGE_SIZE? But if that's the case, why not put the vdso definition of PAGE_SIZE into some vdso header included by this file?
It was working ok on powerpc but on x86 I got:
Seems like there might be some more fiddling to do, then? Or did you conclude it's impossible?
Maybe someone who knows x86 in detail could help, but after a first look I gave up because it looks very x86-specific: it is x86's asm/vdso/gettimeofday.h that pulls in several asm/ headers, and the same type of issue might arise for any new architecture coming in.
For me it looked cleaner to just do as commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h") and not use PAGE_SIZE at all. But I didn't want to use (1UL << CONFIG_PAGE_SHIFT) and (~((1UL << CONFIG_PAGE_SHIFT) - 1)) directly in the code, hence the new macros with a leading underscore to avoid any conflict with existing macros.
If that's the case and you can't sort it out or it's impossible, then it seems like adding #ifndef PAGE_SIZE, #define... inside of some vdso header (not inside getrandom.c) would make sense.
Jason
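Such a fallback could look roughly like this (a sketch of the suggestion only; the header name include/vdso/page.h is invented here, and some architectures' vDSO builds may already provide these macros through their own headers):

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __VDSO_PAGE_H
#define __VDSO_PAGE_H

/* Fallback definitions only, derived from the Kconfig page shift. */
#ifndef PAGE_SIZE
#define PAGE_SIZE	(1UL << CONFIG_PAGE_SHIFT)
#endif

#ifndef PAGE_MASK
#define PAGE_MASK	(~(PAGE_SIZE - 1))
#endif

#endif /* __VDSO_PAGE_H */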
On Mon, Aug 26 2024 at 12:45, Christophe Leroy wrote:
On 26/08/2024 at 10:58, Jason A. Donenfeld wrote:
On Mon, Aug 26, 2024 at 10:37:49AM +0200, Christophe Leroy wrote:
On 26/08/2024 at 10:07, Jason A. Donenfeld wrote:
On Thu, Aug 22, 2024 at 09:13:10AM +0200, Christophe Leroy wrote:
+#define _PAGE_SIZE (1UL << CONFIG_PAGE_SHIFT)
+#define _PAGE_MASK (~(_PAGE_SIZE - 1))
If PAGE_SIZE isn't defined at this point, why not just call it PAGE_SIZE instead of _PAGE_SIZE? But if that's the case, why not put the vdso definition of PAGE_SIZE into some vdso header included by this file?
It was working ok on powerpc but on x86 I got:
Seems like there might be some more fiddling to do, then? Or did you conclude it's impossible?
Maybe someone who knows x86 in detail could help, but after a first look I gave up because it looks very x86-specific: it is x86's asm/vdso/gettimeofday.h that pulls in several asm/ headers, and the same type of issue might arise for any new architecture coming in.
Of course :)
For me it looked cleaner to just do as commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h") and not use PAGE_SIZE at all. But I didn't want to use (1UL << CONFIG_PAGE_SHIFT) and (~((1UL << CONFIG_PAGE_SHIFT) - 1)) directly in the code, hence the new macros with a leading underscore to avoid any conflict with existing macros.
#ifndef PAGE_SIZE
# define
#endif
Perhaps?
Thanks,
tglx
_vdso_data is specific to x86 and __arch_get_k_vdso_data() is provided so that all architectures can provide the requested pointer.
Do the same with _vdso_rng_data, provide __arch_get_k_vdso_rng_data() and don't use x86 _vdso_rng_data directly.
Until now vdso/vsyscall.h was only included by time/vsyscall.c, but now it will also be included by char/random.c, leading to duplicate definitions of _vdso_data and _vdso_rng_data.

To fix this issue, move the definitions into a C file; vma.c looks like the most appropriate candidate. There is no need to replace the definitions in vsyscall.h with declarations, as declarations already exist in asm/vvar.h.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
---
v2: Move x86 DEFINE_VVAR(_vdso_data) and DEFINE_VVAR(_vdso_rng_data) in vma.c
---
 arch/x86/entry/vdso/vma.c            |  3 +++
 arch/x86/include/asm/vdso/vsyscall.h | 10 +++++++---
 drivers/char/random.c                |  5 +++--
 3 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 6d83ceb7f1ba..b8fed8b8b9cc 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -38,6 +38,9 @@ struct vdso_data *arch_get_vdso_data(void *vvar_page)
 }
 #undef EMIT_VVAR
 
+DEFINE_VVAR(struct vdso_data, _vdso_data);
+DEFINE_VVAR_SINGLE(struct vdso_rng_data, _vdso_rng_data);
+
 unsigned int vclocks_used __read_mostly;
 
 #if defined(CONFIG_X86_64)
diff --git a/arch/x86/include/asm/vdso/vsyscall.h b/arch/x86/include/asm/vdso/vsyscall.h
index 972415a8be31..67fedf1698b5 100644
--- a/arch/x86/include/asm/vdso/vsyscall.h
+++ b/arch/x86/include/asm/vdso/vsyscall.h
@@ -9,9 +9,6 @@
 #include <asm/vgtod.h>
 #include <asm/vvar.h>
 
-DEFINE_VVAR(struct vdso_data, _vdso_data);
-DEFINE_VVAR_SINGLE(struct vdso_rng_data, _vdso_rng_data);
-
 /*
  * Update the vDSO data page to keep in sync with kernel timekeeping.
  */
@@ -22,6 +19,13 @@ struct vdso_data *__x86_get_k_vdso_data(void)
 }
 #define __arch_get_k_vdso_data __x86_get_k_vdso_data
 
+static __always_inline
+struct vdso_rng_data *__x86_get_k_vdso_rng_data(void)
+{
+	return &_vdso_rng_data;
+}
+#define __arch_get_k_vdso_rng_data __x86_get_k_vdso_rng_data
+
 /* The asm-generic header needs to be included after the definitions above */
 #include <asm-generic/vdso/vsyscall.h>
 
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 87fe61295ea1..77968309e2c2 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -59,6 +59,7 @@
 #ifdef CONFIG_VDSO_GETRANDOM
 #include <vdso/getrandom.h>
 #include <vdso/datapage.h>
+#include <vdso/vsyscall.h>
 #endif
 #include <asm/archrandom.h>
 #include <asm/processor.h>
@@ -282,7 +283,7 @@ static void crng_reseed(struct work_struct *work)
 	 * is ordered with the write above to base_crng.generation. Pairs with
 	 * the smp_rmb() before the syscall in the vDSO code.
 	 */
-	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
+	smp_store_release(&__arch_get_k_vdso_rng_data()->generation, next_gen + 1);
 #endif
 	if (!static_branch_likely(&crng_is_ready))
 		crng_init = CRNG_READY;
@@ -735,7 +736,7 @@ static void __cold _credit_init_bits(size_t bits)
 		queue_work(system_unbound_wq, &set_ready);
 	atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
 #ifdef CONFIG_VDSO_GETRANDOM
-	WRITE_ONCE(_vdso_rng_data.is_ready, true);
+	WRITE_ONCE(__arch_get_k_vdso_rng_data()->is_ready, true);
 #endif
 	wake_up_interruptible(&crng_init_wait);
 	kill_fasync(&fasync, SIGIO, POLL_IN);
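With the hook in place, any other architecture only has to return the address of its kernel-side instance of the structure, along these lines (a sketch with invented names, not the actual powerpc implementation from this series):

/* In the architecture's asm/vdso/vsyscall.h -- names are hypothetical. */
static __always_inline
struct vdso_rng_data *__myarch_get_k_vdso_rng_data(void)
{
	return &my_vdso_rng_data;	/* the arch's kernel-side copy */
}
#define __arch_get_k_vdso_rng_data __myarch_get_k_vdso_rng_data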
On Thu, Aug 22, 2024 at 09:13:11AM +0200, Christophe Leroy wrote:
_vdso_data is specific to x86 and __arch_get_k_vdso_data() is provided so that all architectures can provide the requested pointer.
Do the same with _vdso_rng_data, provide __arch_get_k_vdso_rng_data() and don't use x86 _vdso_rng_data directly.
Until now vdso/vsyscall.h was only included by time/vsyscall.c, but now it will also be included by char/random.c, leading to duplicate definitions of _vdso_data and _vdso_rng_data.

To fix this issue, move the definitions into a C file; vma.c looks like the most appropriate candidate. There is no need to replace the definitions in vsyscall.h with declarations, as declarations already exist in asm/vvar.h.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
v2: Move x86 DEFINE_VVAR(_vdso_data) and DEFINE_VVAR(_vdso_rng_data) in vma.c
Thanks, this seems sane to me. I'll apply this now to random.git as a fix; it should have been done this way before and is going to be a headache to coordinate later.
Jason
Same as for the gettimeofday CVDSO implementation, add c-getrandom-y to ease the inclusion of lib/vdso/getrandom.c in architectures' VDSO builds.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
---
 lib/vdso/Makefile | 1 +
 1 file changed, 1 insertion(+)
diff --git a/lib/vdso/Makefile b/lib/vdso/Makefile
index 9f031eafc465..cedbf15f8087 100644
--- a/lib/vdso/Makefile
+++ b/lib/vdso/Makefile
@@ -4,6 +4,7 @@ GENERIC_VDSO_MK_PATH := $(abspath $(lastword $(MAKEFILE_LIST)))
 GENERIC_VDSO_DIR := $(dir $(GENERIC_VDSO_MK_PATH))
 
 c-gettimeofday-$(CONFIG_GENERIC_GETTIMEOFDAY) := $(addprefix $(GENERIC_VDSO_DIR), gettimeofday.c)
+c-getrandom-$(CONFIG_VDSO_GETRANDOM) := $(addprefix $(GENERIC_VDSO_DIR), getrandom.c)
 
 # This cmd checks that the vdso library does not contain dynamic relocations.
 # It has to be called after the linking of the vdso library and requires it
On Thu, Aug 22, 2024 at 09:13:12AM +0200, Christophe Leroy wrote:
Same as for the gettimeofday CVDSO implementation, add c-getrandom-y to ease the inclusion of lib/vdso/getrandom.c in architectures' VDSO builds.
Thanks, seems straight forward. I'll apply this now as a fix.
With the current implementation, __cvdso_getrandom_data() calls memset(), which is unexpected in the VDSO.
Rewrite opaque data initialisation to avoid memset().
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
---
 lib/vdso/getrandom.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
index cab153c5f9be..4a56f45141b4 100644
--- a/lib/vdso/getrandom.c
+++ b/lib/vdso/getrandom.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/minmax.h>
+#include <linux/array_size.h>
 #include <vdso/datapage.h>
 #include <vdso/getrandom.h>
 #include <vdso/unaligned.h>
@@ -74,11 +75,15 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
 	u32 counter[2] = { 0 };
 
 	if (unlikely(opaque_len == ~0UL && !buffer && !len && !flags)) {
-		*(struct vgetrandom_opaque_params *)opaque_state = (struct vgetrandom_opaque_params) {
-			.size_of_opaque_state = sizeof(*state),
-			.mmap_prot = PROT_READ | PROT_WRITE,
-			.mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS
-		};
+		struct vgetrandom_opaque_params *params = opaque_state;
+		int i;
+
+		params->size_of_opaque_state = sizeof(*state);
+		params->mmap_prot = PROT_READ | PROT_WRITE;
+		params->mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS;
+		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
+			params->reserved[i] = 0;
+
 		return 0;
 	}
On Thu, Aug 22, 2024 at 09:13:13AM +0200, Christophe Leroy wrote:
With the current implementation, __cvdso_getrandom_data() calls memset(), which is unexpected in the VDSO.
Rewrite opaque data initialisation to avoid memset().
I think of the various ways I've seen of fixing this -- e.g. adding a memset() arch-by-arch -- this is the cleanest and simplest. I'll add this as a fix to random.git now.
Jason
On Thu, Aug 22, 2024 at 09:13:13AM +0200, Christophe Leroy wrote:
With the current implementation, __cvdso_getrandom_data() calls memset(), which is unexpected in the VDSO.
Rewrite opaque data initialisation to avoid memset().
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
 lib/vdso/getrandom.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
index cab153c5f9be..4a56f45141b4 100644
--- a/lib/vdso/getrandom.c
+++ b/lib/vdso/getrandom.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/minmax.h>
+#include <linux/array_size.h>
 #include <vdso/datapage.h>
 #include <vdso/getrandom.h>
 #include <vdso/unaligned.h>
@@ -74,11 +75,15 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
 	u32 counter[2] = { 0 };
 
 	if (unlikely(opaque_len == ~0UL && !buffer && !len && !flags)) {
-		*(struct vgetrandom_opaque_params *)opaque_state = (struct vgetrandom_opaque_params) {
-			.size_of_opaque_state = sizeof(*state),
-			.mmap_prot = PROT_READ | PROT_WRITE,
-			.mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS
-		};
+		struct vgetrandom_opaque_params *params = opaque_state;
+		int i;
+
+		params->size_of_opaque_state = sizeof(*state);
+		params->mmap_prot = PROT_READ | PROT_WRITE;
+		params->mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS;
+		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
+			params->reserved[i] = 0;
+
 		return 0;
 	}
Is there a compiler flag that could be used to disable the generation of calls to memset?
- Eric
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
On Thu, Aug 22, 2024 at 09:13:13AM +0200, Christophe Leroy wrote:
With the current implementation, __cvdso_getrandom_data() calls memset(), which is unexpected in the VDSO.
Rewrite opaque data initialisation to avoid memset().
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
 lib/vdso/getrandom.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
index cab153c5f9be..4a56f45141b4 100644
--- a/lib/vdso/getrandom.c
+++ b/lib/vdso/getrandom.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/minmax.h>
+#include <linux/array_size.h>
 #include <vdso/datapage.h>
 #include <vdso/getrandom.h>
 #include <vdso/unaligned.h>
@@ -74,11 +75,15 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
 	u32 counter[2] = { 0 };
 
 	if (unlikely(opaque_len == ~0UL && !buffer && !len && !flags)) {
-		*(struct vgetrandom_opaque_params *)opaque_state = (struct vgetrandom_opaque_params) {
-			.size_of_opaque_state = sizeof(*state),
-			.mmap_prot = PROT_READ | PROT_WRITE,
-			.mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS
-		};
+		struct vgetrandom_opaque_params *params = opaque_state;
+		int i;
+
+		params->size_of_opaque_state = sizeof(*state);
+		params->mmap_prot = PROT_READ | PROT_WRITE;
+		params->mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS;
+		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
+			params->reserved[i] = 0;
+
 		return 0;
 	}
Is there a compiler flag that could be used to disable the generation of calls to memset?
Apparently not: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90701
-ffreestanding disables most of it but can still generate calls to memcpy and memset, and the bug was closed as "RESOLVED INVALID". Surprising, but maybe it's one of those "functions are part of the language" things.
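The pattern that triggers it is easy to reproduce (illustration only; the struct below is merely modelled on vgetrandom_opaque_params): zero-initialising a large aggregate may be lowered to a memset() call at -O2, even with -ffreestanding, which is exactly what the vDSO must avoid.

struct opaque_params {
	unsigned int size_of_state;
	unsigned int mmap_prot;
	unsigned int mmap_flags;
	unsigned int reserved[13];
};

void init_params(struct opaque_params *p)
{
	/* The compound literal zeroes every unnamed member; depending on
	 * the target, the compiler may emit a memset() call for the
	 * 52-byte reserved[] region instead of open-coded stores. */
	*p = (struct opaque_params){ .size_of_state = 64 };
}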
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
On Thu, Aug 22, 2024 at 09:13:13AM +0200, Christophe Leroy wrote:
With the current implementation, __cvdso_getrandom_data() calls memset(), which is unexpected in the VDSO.
Rewrite opaque data initialisation to avoid memset().
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
 lib/vdso/getrandom.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
index cab153c5f9be..4a56f45141b4 100644
--- a/lib/vdso/getrandom.c
+++ b/lib/vdso/getrandom.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/minmax.h>
+#include <linux/array_size.h>
 #include <vdso/datapage.h>
 #include <vdso/getrandom.h>
 #include <vdso/unaligned.h>
@@ -74,11 +75,15 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
 	u32 counter[2] = { 0 };
 
 	if (unlikely(opaque_len == ~0UL && !buffer && !len && !flags)) {
-		*(struct vgetrandom_opaque_params *)opaque_state = (struct vgetrandom_opaque_params) {
-			.size_of_opaque_state = sizeof(*state),
-			.mmap_prot = PROT_READ | PROT_WRITE,
-			.mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS
-		};
+		struct vgetrandom_opaque_params *params = opaque_state;
+		int i;
+
+		params->size_of_opaque_state = sizeof(*state);
+		params->mmap_prot = PROT_READ | PROT_WRITE;
+		params->mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS;
+		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
+			params->reserved[i] = 0;
+
 		return 0;
 	}
Is there a compiler flag that could be used to disable the generation of calls to memset?
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
Have fun,
Segher
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
On Thu, Aug 22, 2024 at 09:13:13AM +0200, Christophe Leroy wrote:
With the current implementation, __cvdso_getrandom_data() calls memset(), which is unexpected in the VDSO.
Rewrite opaque data initialisation to avoid memset().
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
 lib/vdso/getrandom.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
index cab153c5f9be..4a56f45141b4 100644
--- a/lib/vdso/getrandom.c
+++ b/lib/vdso/getrandom.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/minmax.h>
+#include <linux/array_size.h>
 #include <vdso/datapage.h>
 #include <vdso/getrandom.h>
 #include <vdso/unaligned.h>
@@ -74,11 +75,15 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
 	u32 counter[2] = { 0 };
 
 	if (unlikely(opaque_len == ~0UL && !buffer && !len && !flags)) {
-		*(struct vgetrandom_opaque_params *)opaque_state = (struct vgetrandom_opaque_params) {
-			.size_of_opaque_state = sizeof(*state),
-			.mmap_prot = PROT_READ | PROT_WRITE,
-			.mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS
-		};
+		struct vgetrandom_opaque_params *params = opaque_state;
+		int i;
+
+		params->size_of_opaque_state = sizeof(*state);
+		params->mmap_prot = PROT_READ | PROT_WRITE;
+		params->mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS;
+		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
+			params->reserved[i] = 0;
+
 		return 0;
 	}
Is there a compiler flag that could be used to disable the generation of calls to memset?
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
This might help with various loops, but it doesn't help with the matter that this patch fixes, which is struct initialization. I just tried it with the arm64 patch to no avail.
On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
Is there a compiler flag that could be used to disable the generation of calls to memset?
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
This might help with various loops, but it doesn't help with the matter that this patch fixes, which is struct initialization. I just tried it with the arm64 patch to no avail.
Maybe -ffreestanding can help here? That should cause the vdso to be built with the assumption that there is no libc, so it would neither add nor remove standard library calls. Not sure if that causes other problems, e.g. if the calling conventions are different.
Arnd
On Wed, Aug 28, 2024 at 2:24 PM Arnd Bergmann arnd@arndb.de wrote:
On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
Is there a compiler flag that could be used to disable the generation of calls to memset?
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
This might help with various loops, but it doesn't help with the matter that this patch fixes, which is struct initialization. I just tried it with the arm64 patch to no avail.
Maybe -ffreestanding can help here? That should cause the vdso to be built with the assumption that there is no libc, so it would neither add nor remove standard library calls. Not sure if that causes other problems, e.g. if the calling conventions are different.
From https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90701:
| You need -ffreestanding but that is documented to emit memset and memcpy still.
On Wed, Aug 28, 2024 at 02:26:08PM +0200, Jason A. Donenfeld wrote:
On Wed, Aug 28, 2024 at 2:24 PM Arnd Bergmann arnd@arndb.de wrote:
On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
Is there a compiler flag that could be used to disable the generation of calls to memset?
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
This might help with various loops, but it doesn't help with the matter that this patch fixes, which is struct initialization. I just tried it with the arm64 patch to no avail.
Maybe -ffreestanding can help here? That should cause the vdso to be built with the assumption that there is no libc, so it would neither add nor remove standard library calls. Not sure if that causes other problems, e.g. if the calling conventions are different.
| You need -ffreestanding but that is documented to emit memset and memcpy still.
Yeah.
'-nostdlib'
    Do not use the standard system startup files or libraries when linking.
This won't help a bit, the compiler will still optimise your initialisation loop to a memset() call, it just won't link to libgcc.a and crt*.o and its ilk (which is not where mem* are implemented in the first place!)
Segher
On Wed, Aug 28, 2024 at 12:24:12PM +0000, Arnd Bergmann wrote:
On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
Is there a compiler flag that could be used to disable the generation of calls to memset?
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
This might help with various loops, but it doesn't help with the matter that this patch fixes, which is struct initialization. I just tried it with the arm64 patch to no avail.
Maybe -ffreestanding can help here? That should cause the vdso to be built with the assumption that there is no libc, so it would neither add nor remove standard library calls. Not sure if that causes other problems, e.g. if the calling conventions are different.
"GCC requires the freestanding environment provide 'memcpy', 'memmove', 'memset' and 'memcmp'."
This is precisely to implement things like struct initialisation. Maybe we should have a "-ffreeerstanding" or "-ffreefloating" or some funnier-still environment as well; this problem has been there since the -ffreestanding flag has existed, but the problem is as old as the night.
-fno-builtin might help a bit more, but just attack the problem at its root, like I suggested?
(This isn't a new problem, originally it showed up as "GCC replaces (part of) my memcpy() implementation by a (recursive) call to memcpy()" and, well, that doesn't quite work!)
Segher
On Wed, 28 Aug 2024 at 14:57, Segher Boessenkool segher@kernel.crashing.org wrote:
On Wed, Aug 28, 2024 at 12:24:12PM +0000, Arnd Bergmann wrote:
On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
Is there a compiler flag that could be used to disable the generation of calls to memset?
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
This might help with various loops, but it doesn't help with the matter that this patch fixes, which is struct initialization. I just tried it with the arm64 patch to no avail.
Maybe -ffreestanding can help here? That should cause the vdso to be built with the assumption that there is no libc, so it would neither add nor remove standard library calls. Not sure if that causes other problems, e.g. if the calling conventions are different.
"GCC requires the freestanding environment provide 'memcpy', 'memmove', 'memset' and 'memcmp'."
This is precisely to implement things like struct initialisation. Maybe we should have a "-ffreeerstanding" or "-ffreefloating" or think of something funnier still environment as well, this problem has been there since the -ffreestanding flag has existed, but the problem is as old as the night.
-fno-builtin might help a bit more, but just attack the problem at its root, like I suggested?
In my experience, this is likely to do the opposite: it causes the compiler to 'forget' the semantics of memcpy() and memset(), so that explicit trivial calls will no longer be elided and replaced with plain loads and stores (as it can no longer guarantee the equivalence)
(This isn't a new problem, originally it showed up as "GCC replaces (part of) my memcpy() implementation by a (recursive) call to memcpy()" and, well, that doesn't quite work!)
This needs to be fixed for Clang as well, so throwing GCC specific flags at it will at best be a partial solution.
Omitting the struct assignment is a reasonable way to reduce the likelihood that a memset() will be emitted, so for this patch
Acked-by: Ard Biesheuvel ardb@kernel.org
It is not a complete solution, unfortunately, and I guess there may be other situations (compiler/arch combinations) where this might pop up again.
Hi!
On Wed, Aug 28, 2024 at 05:40:23PM +0200, Ard Biesheuvel wrote:
On Wed, 28 Aug 2024 at 14:57, Segher Boessenkool segher@kernel.crashing.org wrote:
On Wed, Aug 28, 2024 at 12:24:12PM +0000, Arnd Bergmann wrote:
On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
Is there a compiler flag that could be used to disable the generation of calls to memset?
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
This might help with various loops, but it doesn't help with the matter that this patch fixes, which is struct initialization. I just tried it with the arm64 patch to no avail.
Maybe -ffreestanding can help here? That should cause the vdso to be built with the assumption that there is no libc, so it would neither add nor remove standard library calls. Not sure if that causes other problems, e.g. if the calling conventions are different.
"GCC requires the freestanding environment provide 'memcpy', 'memmove', 'memset' and 'memcmp'."
This is precisely to implement things like struct initialisation. Maybe we should have a "-ffreeerstanding" or "-ffreefloating" or think of something funnier still environment as well, this problem has been there since the -ffreestanding flag has existed, but the problem is as old as the night.
-fno-builtin might help a bit more, but just attack the problem at its root, like I suggested?
In my experience, this is likely to do the opposite: it causes the compiler to 'forget' the semantics of memcpy() and memset(), so that explicit trivial calls will no longer be elided and replaced with plain loads and stores (as it can no longer guarantee the equivalence)
No, the compiler will never forget those semantics. But if you tell it your function named memset() is not the actual standard memset -- via -fno-builtin-memset for example -- the compiler won't optimise things involving it quite as much. You told it so eh?
You can also tell it not to have a __builtin_memset function, but in this particular case that won't quite work, since the compiler does need to have that builtin available to do struct and array initialisations and the like.
(This isn't a new problem, originally it showed up as "GCC replaces (part of) my memcpy() implementation by a (recursive) call to memcpy()" and, well, that doesn't quite work!)
This needs to be fixed for Clang as well, so throwing GCC specific flags at it will at best be a partial solution.
clang says it is a 100% plug-in replacement for GCC, so they will have to accept all GCC flags. And in many cases they do. Cases where they don't are bugs.
It is not a complete solution, unfortunately, and I guess there may be other situations (compiler/arch combinations) where this might pop up again.
Why do mem* not work in VDSOs? Fix that, and all these problems disappear, and you do not need workarounds :-)
Segher
On Wed, 28 Aug 2024 at 18:24, Segher Boessenkool segher@kernel.crashing.org wrote:
Hi!
On Wed, Aug 28, 2024 at 05:40:23PM +0200, Ard Biesheuvel wrote:
On Wed, 28 Aug 2024 at 14:57, Segher Boessenkool segher@kernel.crashing.org wrote:
On Wed, Aug 28, 2024 at 12:24:12PM +0000, Arnd Bergmann wrote:
On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:

Is there a compiler flag that could be used to disable the generation of calls to memset?
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
This might help with various loops, but it doesn't help with the matter that this patch fixes, which is struct initialization. I just tried it with the arm64 patch to no avail.
Maybe -ffreestanding can help here? That should cause the vdso to be built with the assumption that there is no libc, so it would neither add nor remove standard library calls. Not sure if that causes other problems, e.g. if the calling conventions are different.
"GCC requires the freestanding environment provide 'memcpy', 'memmove', 'memset' and 'memcmp'."
This is precisely to implement things like struct initialisation. Maybe we should have a "-ffreeerstanding" or "-ffreefloating" or think of something funnier still environment as well, this problem has been there since the -ffreestanding flag has existed, but the problem is as old as the night.
-fno-builtin might help a bit more, but just attack the problem at its root, like I suggested?
In my experience, this is likely to do the opposite: it causes the compiler to 'forget' the semantics of memcpy() and memset(), so that explicit trivial calls will no longer be elided and replaced with plain loads and stores (as it can no longer guarantee the equivalence)
No, the compiler will never forget those semantics. But if you tell it your function named memset() is not the actual standard memset -- via -fno-builtin-memset for example -- the compiler won't optimise things involving it quite as much. You told it so eh?
That is exactly the point I am making. So how would this help in this case?
You can also tell it not to have a __builtin_memset function, but in this particular case that won;t quite work, since the compiler does need to have that builtin available to do struct and array initialisations and the like.
(This isn't a new problem, originally it showed up as "GCC replaces (part of) my memcpy() implementation by a (recursive) call to memcpy()" and, well, that doesn't quite work!)
This needs to be fixed for Clang as well, so throwing GCC specific flags at it will at best be a partial solution.
clang says it is a 100% plug-in replacement for GCC, so they will have to accept all GCC flags. And in many cases they do. Cases where they don't are bugs.
Even if this were true, we support Clang versions today that do not support -fno-tree-loop-distribute-patterns so my point stands.
It is not a complete solution, unfortunately, and I guess there may be other situations (compiler/arch combinations) where this might pop up again.
Why do mem* not work in VDSOs? Fix that, and all these problems disappear, and you do not need workarounds :-)
Good point. We should just import memcpy and memset in the VDSO ELF metadata.
Not sure about static binaries, though: do those even use the VDSO?
On Wed, Aug 28, 2024 at 07:12:55PM +0200, Ard Biesheuvel wrote:
On Wed, 28 Aug 2024 at 18:24, Segher Boessenkool segher@kernel.crashing.org wrote:
In my experience, this is likely to do the opposite: it causes the compiler to 'forget' the semantics of memcpy() and memset(), so that explicit trivial calls will no longer be elided and replaced with plain loads and stores (as it can no longer guarantee the equivalence)
No, the compiler will never forget those semantics. But if you tell it your function named memset() is not the actual standard memset -- via -fno-builtin-memset for example -- the compiler won't optimise things involving it quite as much. You told it so eh?
That is exactly the point I am making. So how would this help in this case?
I think we agree? :-)
This needs to be fixed for Clang as well, so throwing GCC specific flags at it will at best be a partial solution.
clang says it is a 100% plug-in replacement for GCC, so they will have to accept all GCC flags. And in many cases they do. Cases where they don't are bugs.
Even if this were true, we support Clang versions today that do not support -fno-tree-loop-distribute-patterns so my point stands.
It is true. Yes, this cause problems for their users.
It is not a complete solution, unfortunately, and I guess there may be other situations (compiler/arch combinations) where this might pop up again.
Why do mem* not work in VDSOs? Fix that, and all these problems disappear, and you do not need workarounds :-)
Good point. We should just import memcpy and memset in the VDSO ELF metadata.
Yeah. In many cases GCC will replace such calls by (faster and/or smaller) inline code anyway, but when it does leave a call, there needs to be an external function implementing it :-)
Not sure about static binaries, though: do those even use the VDSO?
With "static binary" people usually mean "a binary not using any DSOs", I think the VDSO is a DSO, also in this respect? As always, -static builds are *way* less problematic (and faster and smaller :-) )
Segher
On 28/08/2024 at 19:25, Segher Boessenkool wrote:
Not sure about static binaries, though: do those even use the VDSO?
With "static binary" people usually mean "a binary not using any DSOs", I think the VDSO is a DSO, also in this respect? As always, -static builds are *way* less problematic (and faster and smaller :-) )
AFAIK on powerpc even static binaries use the vDSO, otherwise signals don't work.
Christophe
On Thu, Aug 29, 2024 at 07:36:38PM +0200, Christophe Leroy wrote:
On 28/08/2024 at 19:25, Segher Boessenkool wrote:
Not sure about static binaries, though: do those even use the VDSO?
With "static binary" people usually mean "a binary not using any DSOs", I think the VDSO is a DSO, also in this respect? As always, -static builds are *way* less problematic (and faster and smaller :-) )
AFAIK on powerpc even static binaries use the vDSO, otherwise signals don't work.
How can that work? Non-dynamic binaries do not use ld.so (that is the definition of a dynamic binary, even). So they cannot link (at runtime) to any DSO (unless that is done manually?!)
Maybe there is something at a fixed offset in the vDSO, or something like that? Is this documented somewhere?
Segher
On 29/08/2024 at 20:02, Segher Boessenkool wrote:
On Thu, Aug 29, 2024 at 07:36:38PM +0200, Christophe Leroy wrote:
On 28/08/2024 at 19:25, Segher Boessenkool wrote:
Not sure about static binaries, though: do those even use the VDSO?
With "static binary" people usually mean "a binary not using any DSOs", I think the VDSO is a DSO, also in this respect? As always, -static builds are *way* less problematic (and faster and smaller :-) )
AFAIK on powerpc even static binaries use the vDSO, otherwise signals don't work.
How can that work? Non-dynamic binaries do not use ld.so (that is the definition of a dynamic binary, even). So they cannot link (at runtime) to any DSO (unless that is done manually?!)
Maybe there is something at a fixed offset in the vDSO, or something like that? Is this documented somewhere?
You've got some explanation here: https://github.com/torvalds/linux/blob/master/Documentation/ABI/stable/vdso
Segher Boessenkool segher@kernel.crashing.org writes:
On Thu, Aug 29, 2024 at 07:36:38PM +0200, Christophe Leroy wrote:
On 28/08/2024 at 19:25, Segher Boessenkool wrote:
Not sure about static binaries, though: do those even use the VDSO?
With "static binary" people usually mean "a binary not using any DSOs", I think the VDSO is a DSO, also in this respect? As always, -static builds are *way* less problematic (and faster and smaller :-) )
AFAIK on powerpc even static binaries use the vDSO, otherwise signals don't work.
How can that work? Non-dynamic binaries do not use ld.so (that is the definition of a dynamic binary, even). So they cannot link (at runtime) to any DSO (unless that is done manually?!)
At least for signals I don't think the application needs to know anything about the VDSO. The kernel sets up the return to the signal trampoline (in the VDSO), and as long as userspace returns from its signal handler with blr it will land back on the trampoline.
cheers
On Wed, Aug 28, 2024 at 01:18:34PM +0200, Jason A. Donenfeld wrote:
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
params->reserved[i] = 0;
This is a loop. With -ftree-loop-distribute-patterns, the default at -O2, this is optimised to
memset(params->reserved, 0, ...);
(which is perfectly fine, since memset is required to be there even for freestanding environments, this is documented!)
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
This might help with various loops, but it doesn't help with the matter that this patch fixes, which is struct initialization. I just tried it with the arm64 patch to no avail.
It very much *does* help. Try harder? Maybe you typoed?
Segher
On Wed, Aug 28, 2024 at 07:33:13AM -0500, Segher Boessenkool wrote:
On Wed, Aug 28, 2024 at 01:18:34PM +0200, Jason A. Donenfeld wrote:
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
params->reserved[i] = 0;
This is a loop. With -ftree-loop-distribute-patterns, the default at -O2, this is optimised to
memset(params->reserved, 0, ...);
(which is perfectly fine, since memset is required to be there even for freestanding environments, this is documented!)
-fno-tree-loop-distribute-patterns . But, as always, read up on it, see what it actually does (and how it avoids your problem, and mostly: learn what the actual problem *was*!)
This might help with various loops, but it doesn't help with the matter that this patch fixes, which is struct initialization. I just tried it with the arm64 patch to no avail.
It very much *does* help. Try harder? Maybe you typoed?
Scroll up and reread the original patch. You are confused. The loop is what fixes the matter. It's struct initialization that generates the memset.
Performing SMP atomic operations on u64 fails on powerpc32.
Random driver generation is handled as unsigned long not u64, see for instance base_crng or struct crng.
Use the same type for vDSO's getrandom as it gets copied from the above. This is also in line with the local current_generation which is already an unsigned long.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu
---
 include/vdso/datapage.h  | 2 +-
 include/vdso/getrandom.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/vdso/datapage.h b/include/vdso/datapage.h
index b85f24cac3f5..00b66a9b5778 100644
--- a/include/vdso/datapage.h
+++ b/include/vdso/datapage.h
@@ -123,7 +123,7 @@ struct vdso_data {
  * @is_ready:	boolean signaling whether the RNG is initialized
  */
 struct vdso_rng_data {
-	u64	generation;
+	unsigned long	generation;
 	u8	is_ready;
 };
 
diff --git a/include/vdso/getrandom.h b/include/vdso/getrandom.h
index a8b7c14b0ae0..36fd166bf330 100644
--- a/include/vdso/getrandom.h
+++ b/include/vdso/getrandom.h
@@ -38,7 +38,7 @@ struct vgetrandom_state {
 		};
 		u8	batch_key[CHACHA_BLOCK_SIZE * 2];
 	};
-	u64		generation;
+	unsigned long	generation;
 	u8		pos;
 	bool		in_use;
 };
On Thu, Aug 22, 2024 at 09:13:14AM +0200, Christophe Leroy wrote:
Performing SMP atomic operations on u64 fails on powerpc32.
Random driver generation is handled as unsigned long not u64, see for instance base_crng or struct crng.
Use the same type for vDSO's getrandom as it gets copied from the above. This is also in line with the local current_generation which is already an unsigned long.
This isn't going to work when 32-bit userspace tries to access a 64-bit kernel.
I had "fixed" this with a vdso_kernel_ulong type way back in an earlier version: https://lore.kernel.org/lkml/20240528122352.2485958-5-Jason@zx2c4.com/#Z31in...
But tglx pointed out in that thread that this actually isn't necessary:
| All of this is pointless because if a 32-bit application runs on a
| 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
| we need magic here for a 32-bit kernel?
|
| Just use u64 for both and spare all this voodoo. We're seriously not
| "optimizing" for 32-bit kernels.
|
| All what happens on a 32-bit kernel is that the RNG will store the
| unsigned long (32bit) generation into a 64bit variable:
|
|     smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
|
| As the upper 32bit are always zero, there is no issue vs. load store
| tearing at all. So there is zero benefit for this aside of slightly
| "better" user space code when running on a 32-bit kernel. Who cares?
So I just got rid of it and used a u64 as he suggested.
However, there's also an additional reason why it's not worth churning further over this - because VM_DROPPABLE is 64-bit only (due to flags in vma bits), likely so is vDSO getrandom() for the time being. So I think it makes more sense to retool this series to be ppc64, and then if you really really want 32-bit and can convince folks it matters, then all of these parts (for example, here, the fact that the smp helper doesn't want to tear) can be fixed up in a separate series.
Jason
On 26/08/2024 at 09:50, Jason A. Donenfeld wrote:
On Thu, Aug 22, 2024 at 09:13:14AM +0200, Christophe Leroy wrote:
Performing SMP atomic operations on u64 fails on powerpc32.
Random driver generation is handled as unsigned long, not u64; see for instance base_crng or struct crng.
Use the same type for vDSO's getrandom as it gets copied from the above. This is also in line with the local current_generation which is already an unsigned long.
This isn't going to work when 32-bit userspace tries to access a 64-bit kernel.
I had "fixed" this with a vdso_kernel_ulong type way back in an earlier version: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kerne...
But tglx pointed out in that thread that this actually isn't necessary:
| All of this is pointless because if a 32-bit application runs on a
| 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
| we need magic here for a 32-bit kernel?
|
| Just use u64 for both and spare all this voodoo. We're seriously not
| "optimizing" for 32-bit kernels.
|
| All what happens on a 32-bit kernel is that the RNG will store the
| unsigned long (32bit) generation into a 64bit variable:
|
|     smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
|
| As the upper 32bit are always zero, there is no issue vs. load store
| tearing at all. So there is zero benefit for this aside of slightly
| "better" user space code when running on a 32-bit kernel. Who cares?
So I just got rid of it and used a u64 as he suggested.
However, there's also an additional reason why it's not worth churning further over this - because VM_DROPPABLE is 64-bit only (due to flags in vma bits), likely so is vDSO getrandom() for the time being. So I think it makes more sense to retool this series to be ppc64, and then if you really really want 32-bit and can convince folks it matters, then all of these parts (for example, here, the fact that the smp helper doesn't want to tear) can be fixed up in a separate series.
So yes, I really really want it on ppc32 because this is the only type of board I have and this is really where we need getrandom() to be optimised; indeed, ppc64 was the cherry on the cake in my series, I just added it because it was easy to do after doing ppc32.
Christophe
On Mon, Aug 26, 2024 at 10:01:17AM +0200, Christophe Leroy wrote:
On 26/08/2024 at 09:50, Jason A. Donenfeld wrote:
On Thu, Aug 22, 2024 at 09:13:14AM +0200, Christophe Leroy wrote:
Performing SMP atomic operations on u64 fails on powerpc32.
Random driver generation is handled as unsigned long, not u64; see for instance base_crng or struct crng.
Use the same type for vDSO's getrandom as it gets copied from the above. This is also in line with the local current_generation which is already an unsigned long.
This isn't going to work when 32-bit userspace tries to access a 64-bit kernel.
I had "fixed" this with a vdso_kernel_ulong type way back in an earlier version: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kerne...
But tglx pointed out in that thread that this actually isn't necessary:
| All of this is pointless because if a 32-bit application runs on a
| 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
| we need magic here for a 32-bit kernel?
|
| Just use u64 for both and spare all this voodoo. We're seriously not
| "optimizing" for 32-bit kernels.
|
| All what happens on a 32-bit kernel is that the RNG will store the
| unsigned long (32bit) generation into a 64bit variable:
|
|     smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
|
| As the upper 32bit are always zero, there is no issue vs. load store
| tearing at all. So there is zero benefit for this aside of slightly
| "better" user space code when running on a 32-bit kernel. Who cares?
So I just got rid of it and used a u64 as he suggested.
However, there's also an additional reason why it's not worth churning further over this - because VM_DROPPABLE is 64-bit only (due to flags in vma bits), likely so is vDSO getrandom() for the time being. So I think it makes more sense to retool this series to be ppc64, and then if you really really want 32-bit and can convince folks it matters, then all of these parts (for example, here, the fact that the smp helper doesn't want to tear) can be fixed up in a separate series.
So yes, I really really want it on ppc32 because this is the only type of board I have and this is really where we need getrandom() to be optimised; indeed, ppc64 was the cherry on the cake in my series, I just added it because it was easy to do after doing ppc32.
I saw that you did in fact find a bit on ppc32 for VM_DROPPABLE. So it looks at least possible. Because of this generation counter business, I still think it might make sense to do in two steps, though, first doing 64-bit, and then doing 32-bit after.
As for the generation counter error you're seeing, I guess what we want is smp_store_release memory ordering semantics, but letting tearing happen (since the upper 32 bits will be zero anyway). I'm not sure of the best way to do this, whether it's a new helper, or doing a WRITE_ONCE together with an smp barrier, or what. But I imagine it's something like that.
Jason
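A minimal sketch of the WRITE_ONCE-plus-barrier option floated above, purely for illustration: the helper name is made up, and whether smp_wmb() is a sufficient barrier here, or whether a dedicated helper would be preferable, is exactly what is left open in the message above.

/* Sketch only; not the solution that was eventually retained in the thread. */
static inline void vdso_rng_publish_generation(u64 *generation, u64 next_gen)
{
        /*
         * Order the store after the preceding writes to the crng state,
         * approximating the release semantics of smp_store_release().
         */
        smp_wmb();
        /*
         * Plain store; on a 32-bit kernel the upper half stays zero, so any
         * tearing of the u64 is harmless for readers that only compare the
         * value.
         */
        WRITE_ONCE(*generation, next_gen);
}

The proposal that follows sidesteps the question by keeping smp_store_release() and simply narrowing the store to a native word.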
On Mon, Aug 26 2024 at 10:01, Christophe Leroy wrote:
On 26/08/2024 at 09:50, Jason A. Donenfeld wrote:
But tglx pointed out in that thread that this actually isn't necessary:
| All of this is pointless because if a 32-bit application runs on a
| 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
| we need magic here for a 32-bit kernel?
|
| Just use u64 for both and spare all this voodoo. We're seriously not
| "optimizing" for 32-bit kernels.
|
| All what happens on a 32-bit kernel is that the RNG will store the
| unsigned long (32bit) generation into a 64bit variable:
|
|     smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
|
| As the upper 32bit are always zero, there is no issue vs. load store
| tearing at all. So there is zero benefit for this aside of slightly
| "better" user space code when running on a 32-bit kernel. Who cares?
So I just got rid of it and used a u64 as he suggested.
However, there's also an additional reason why it's not worth churning further over this - because VM_DROPPABLE is 64-bit only (due to flags in vma bits), likely so is vDSO getrandom() for the time being. So I think it makes more sense to retool this series to be ppc64, and then if you really really want 32-bit and can convince folks it matters, then all of these parts (for example, here, the fact that the smp helper doesn't want to tear) can be fixed up in a separate series.
So yes, I really really want it on ppc32 because this is the only type of board I have and this is really where we need getrandom() to be optimised,
For nostalgic reasons?
indeed, ppc64 was the cherry on the cake in my series, I just added it because it was easy to do after doing ppc32.
The rng problem for ppc32 seems to be:
smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
right?
Your proposed type change creates inconsistency for 32-bit userspace running on 64-bit kernels because the data struct layout changes.
As explained before, there is no problem with store or load tearing on 32bit systems because the generation counter is only 32bit wide. So the obvious solution is to only update 32 bits on a 32bit kernel:
--- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -282,7 +282,7 @@ static void crng_reseed(struct work_stru * is ordered with the write above to base_crng.generation. Pairs with * the smp_rmb() before the syscall in the vDSO code. */ - smp_store_release(&_vdso_rng_data.generation, next_gen + 1); + smp_store_release((unsigned long *)&_vdso_rng_data.generation, next_gen + 1); #endif if (!static_branch_likely(&crng_is_ready)) crng_init = CRNG_READY;
Which is perfectly fine on 32-bit, independent of endianness, because the user space side does READ_ONCE(data->generation) and the read value is solely used for comparison, so it does not matter at all whether the generation information is in the upper or the lower 32 bits of the u64.
No?
But that's a trivial fix compared to making VM_DROPPABLE work on 32-bit correctly. :)
Thanks,
tglx
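For reference, the reader side referred to above boils down to something like the following, paraphrased from lib/vdso/getrandom.c; the helper name and exact structure are illustrative, not verbatim kernel code:

static bool vdso_rng_generation_changed(const struct vdso_rng_data *d, u64 *cached)
{
        /*
         * The whole u64 is read, but only compared for (in)equality, so it
         * does not matter which 32-bit half actually carries the counter
         * written by the kernel.
         */
        u64 current_generation = READ_ONCE(d->generation);

        if (current_generation == *cached)
                return false;

        *cached = current_generation;
        return true;
}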
On Mon, Aug 26, 2024 at 11:43:39AM +0200, Thomas Gleixner wrote:
As explained before, there is no problem with store or load tearing on 32bit systems because the generation counter is only 32bit wide. So the obvious solution is to only update 32 bits on a 32bit kernel:
--- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -282,7 +282,7 @@ static void crng_reseed(struct work_stru * is ordered with the write above to base_crng.generation. Pairs with * the smp_rmb() before the syscall in the vDSO code. */
- smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
+ smp_store_release((unsigned long *)&_vdso_rng_data.generation, next_gen + 1);
#endif if (!static_branch_likely(&crng_is_ready)) crng_init = CRNG_READY;
That seems like a pretty clean fix.
But that's a trivial fix compared to making VM_DROPPABLE work on 32-bit correctly. :)
That was my initial response too, but then I noticed he posted this: https://lore.kernel.org/all/315e3a268b165b6edad7dcb723b0d8a506a56c4e.1724309... If that's correct, maybe it's not so bad, at least here. I haven't yet looked into the details of it.
Jason
On 26/08/2024 at 11:43, Thomas Gleixner wrote:
On Mon, Aug 26 2024 at 10:01, Christophe Leroy wrote:
On 26/08/2024 at 09:50, Jason A. Donenfeld wrote:
But tglx pointed out in that thread that this actually isn't necessary:
| All of this is pointless because if a 32-bit application runs on a
| 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
| we need magic here for a 32-bit kernel?
|
| Just use u64 for both and spare all this voodoo. We're seriously not
| "optimizing" for 32-bit kernels.
|
| All what happens on a 32-bit kernel is that the RNG will store the
| unsigned long (32bit) generation into a 64bit variable:
|
|     smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
|
| As the upper 32bit are always zero, there is no issue vs. load store
| tearing at all. So there is zero benefit for this aside of slightly
| "better" user space code when running on a 32-bit kernel. Who cares?
So I just got rid of it and used a u64 as he suggested.
However, there's also an additional reason why it's not worth churning further over this - because VM_DROPPABLE is 64-bit only (due to flags in vma bits), likely so is vDSO getrandom() for the time being. So I think it makes more sense to retool this series to be ppc64, and then if you really really want 32-bit and can convince folks it matters, then all of these parts (for example, here, the fact that the smp helper doesn't want to tear) can be fixed up in a separate series.
So yes, I really really want it on ppc32 because this is the only type of board I have and this is really where we need getrandom() to be optimised,
For nostalgic reasons?
No nostalgia here. We deliver voice communication systems for air traffic control that we are committed to maintaining for at least the next 15 years.
The MPC 885 was manufactured by NXP until recently and we have several hundred in stock in case one of our customers needs to extend an existing system.
The MPC 8321 is still manufactured by NXP and the boards on new systems we deliver use that CPU.
Both are 32-bit powerpc.
indeed, ppc64 was the cherry on the cake in my series, I just added it because it was easy to do after doing ppc32.
The rng problem for ppc32 seems to be:
smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
right?
right.
Your proposed type change creates inconsistency for 32-bit userspace running on 64-bit kernels because the data struct layout changes.
As explained before, there is no problem with store or load tearing on 32bit systems because the generation counter is only 32bit wide. So the obvious solution is to only update 32 bits on a 32bit kernel:
--- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -282,7 +282,7 @@ static void crng_reseed(struct work_stru * is ordered with the write above to base_crng.generation. Pairs with * the smp_rmb() before the syscall in the vDSO code. */
- smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
+ smp_store_release((unsigned long *)&_vdso_rng_data.generation, next_gen + 1); #endif if (!static_branch_likely(&crng_is_ready)) crng_init = CRNG_READY;
Which is perfectly fine on 32-bit, independent of endianness, because the user space side does READ_ONCE(data->generation) and the read value is solely used for comparison, so it does not matter at all whether the generation information is in the upper or the lower 32 bits of the u64.

No?
Yes, that looks OK; it will only require a big fat comment to explain that. I'll give it a try.
But that's a trivial fix compared to making VM_DROPPABLE work on 32-bit correctly. :)
Hmm... OK. What's the problem here? I have used VM_ARCH_1, which is available. Is there more to do?
Thanks Christophe
Commit 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings") only adds VM_DROPPABLE for 64-bit architectures.
In order to also use the getrandom vDSO implementation on powerpc/32, use VM_ARCH_1 for VM_DROPPABLE on powerpc/32. This is possible because VM_ARCH_1 is used for VM_SAO on powerpc and VM_SAO is only for powerpc/64.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- fs/proc/task_mmu.c | 4 +++- include/linux/mm.h | 4 +++- include/trace/events/mmflags.h | 4 ++-- 3 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 5f171ad7b436..4a3fe961cbf6 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -987,8 +987,10 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) #ifdef CONFIG_X86_USER_SHADOW_STACK [ilog2(VM_SHADOW_STACK)] = "ss", #endif -#ifdef CONFIG_64BIT +#if VM_DROPPABLE != VM_NONE [ilog2(VM_DROPPABLE)] = "dp", +#endif +#ifdef CONFIG_64BIT [ilog2(VM_SEALED)] = "sl", #endif }; diff --git a/include/linux/mm.h b/include/linux/mm.h index c4b238a20b76..865d3e21ee5e 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -359,7 +359,7 @@ extern unsigned int kobjsize(const void *objp);
#if defined(CONFIG_X86) # define VM_PAT VM_ARCH_1 /* PAT reserves whole VMA at once (x86) */ -#elif defined(CONFIG_PPC) +#elif defined(CONFIG_PPC64) # define VM_SAO VM_ARCH_1 /* Strong Access Ordering (powerpc) */ #elif defined(CONFIG_PARISC) # define VM_GROWSUP VM_ARCH_1 @@ -409,6 +409,8 @@ extern unsigned int kobjsize(const void *objp); #ifdef CONFIG_64BIT #define VM_DROPPABLE_BIT 40 #define VM_DROPPABLE BIT(VM_DROPPABLE_BIT) +#elif defined(CONFIG_PPC32) +#define VM_DROPPABLE VM_ARCH_1 #else #define VM_DROPPABLE VM_NONE #endif diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index b63d211bd141..9820cbfbcb14 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -143,7 +143,7 @@ IF_HAVE_PG_ARCH_X(arch_3)
#if defined(CONFIG_X86) #define __VM_ARCH_SPECIFIC_1 {VM_PAT, "pat" } -#elif defined(CONFIG_PPC) +#elif defined(CONFIG_PPC64) #define __VM_ARCH_SPECIFIC_1 {VM_SAO, "sao" } #elif defined(CONFIG_PARISC) #define __VM_ARCH_SPECIFIC_1 {VM_GROWSUP, "growsup" } @@ -165,7 +165,7 @@ IF_HAVE_PG_ARCH_X(arch_3) # define IF_HAVE_UFFD_MINOR(flag, name) #endif
-#ifdef CONFIG_64BIT +#if VM_DROPPABLE != VM_NONE # define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name}, #else # define IF_HAVE_VM_DROPPABLE(flag, name)
Hi Christophe,
kernel test robot noticed the following build warnings:
[auto build test WARNING on powerpc/next] [also build test WARNING on powerpc/fixes shuah-kselftest/next shuah-kselftest/fixes crng-random/master linus/master v6.11-rc5 next-20240823] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Christophe-Leroy/asm-generic-... base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next patch link: https://lore.kernel.org/r/315e3a268b165b6edad7dcb723b0d8a506a56c4e.172430919... patch subject: [PATCH v2 07/17] mm: Define VM_DROPPABLE for powerpc/32 config: x86_64-buildonly-randconfig-002-20240826 (https://download.01.org/0day-ci/archive/20240826/202408261757.D4gOewE9-lkp@i...) compiler: gcc-12 (Debian 12.2.0-14) 12.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240826/202408261757.D4gOewE9-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot lkp@intel.com | Closes: https://lore.kernel.org/oe-kbuild-all/202408261757.D4gOewE9-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from include/trace/events/page_ref.h:11, from mm/debug_page_ref.c:6:
include/trace/events/mmflags.h:168:5: warning: "VM_DROPPABLE" is not defined, evaluates to 0 [-Wundef]
168 | #if VM_DROPPABLE != VM_NONE | ^~~~~~~~~~~~
include/trace/events/mmflags.h:168:21: warning: "VM_NONE" is not defined, evaluates to 0 [-Wundef]
168 | #if VM_DROPPABLE != VM_NONE | ^~~~~~~ In file included from include/trace/events/page_ref.h:11, from include/trace/define_trace.h:95, from include/trace/events/page_ref.h:135:
include/trace/events/mmflags.h:168:5: warning: "VM_DROPPABLE" is not defined, evaluates to 0 [-Wundef]
168 | #if VM_DROPPABLE != VM_NONE | ^~~~~~~~~~~~
include/trace/events/mmflags.h:168:21: warning: "VM_NONE" is not defined, evaluates to 0 [-Wundef]
168 | #if VM_DROPPABLE != VM_NONE | ^~~~~~~ In file included from include/trace/events/page_ref.h:11, from include/trace/trace_events.h:94, from include/trace/define_trace.h:102:
include/trace/events/mmflags.h:169: warning: "IF_HAVE_VM_DROPPABLE" redefined
169 | # define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name}, | include/trace/events/mmflags.h:171: note: this is the location of the previous definition 171 | # define IF_HAVE_VM_DROPPABLE(flag, name) |
vim +/VM_DROPPABLE +168 include/trace/events/mmflags.h
167
168 #if VM_DROPPABLE != VM_NONE 169 # define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name},
170 #else 171 # define IF_HAVE_VM_DROPPABLE(flag, name) 172 #endif 173
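For context on why the robot complains: include/trace/events/mmflags.h gets pulled in from places that have not included linux/mm.h, so VM_DROPPABLE and VM_NONE are simply not defined there, and in a preprocessor #if an undefined identifier silently evaluates to 0, which is exactly what -Wundef flags. One way to avoid evaluating undefined macros is a guard along these lines; this is illustrative only, not necessarily the fix that ended up being applied, and it does not by itself address the separate redefinition warning, which comes from the trace headers being expanded more than once:

/* Illustrative guard; not necessarily the fix applied upstream. */
#if defined(VM_DROPPABLE) && defined(VM_NONE) && (VM_DROPPABLE != VM_NONE)
# define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name},
#else
# define IF_HAVE_VM_DROPPABLE(flag, name)
#endif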
Hi Christophe,
kernel test robot noticed the following build warnings:
[auto build test WARNING on powerpc/next] [also build test WARNING on powerpc/fixes shuah-kselftest/next shuah-kselftest/fixes linus/master v6.11-rc5 next-20240823] [cannot apply to crng-random/master] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Christophe-Leroy/asm-generic-... base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next patch link: https://lore.kernel.org/r/315e3a268b165b6edad7dcb723b0d8a506a56c4e.172430919... patch subject: [PATCH v2 07/17] mm: Define VM_DROPPABLE for powerpc/32 config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20240826/202408261734.UAvnH7Mv-lkp@i...) compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 617a15a9eac96088ae5e9134248d8236e34b91b1) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240826/202408261734.UAvnH7Mv-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot lkp@intel.com | Closes: https://lore.kernel.org/oe-kbuild-all/202408261734.UAvnH7Mv-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from mm/debug_page_ref.c:6: In file included from include/trace/events/page_ref.h:11: include/trace/events/mmflags.h:168:5: warning: 'VM_DROPPABLE' is not defined, evaluates to 0 [-Wundef] 168 | #if VM_DROPPABLE != VM_NONE | ^ include/trace/events/mmflags.h:168:21: warning: 'VM_NONE' is not defined, evaluates to 0 [-Wundef] 168 | #if VM_DROPPABLE != VM_NONE | ^ In file included from mm/debug_page_ref.c:6: In file included from include/trace/events/page_ref.h:135: In file included from include/trace/define_trace.h:95: In file included from include/trace/events/page_ref.h:11: include/trace/events/mmflags.h:168:5: warning: 'VM_DROPPABLE' is not defined, evaluates to 0 [-Wundef] 168 | #if VM_DROPPABLE != VM_NONE | ^ include/trace/events/mmflags.h:168:21: warning: 'VM_NONE' is not defined, evaluates to 0 [-Wundef] 168 | #if VM_DROPPABLE != VM_NONE | ^ In file included from mm/debug_page_ref.c:6: In file included from include/trace/events/page_ref.h:135: In file included from include/trace/define_trace.h:102: In file included from include/trace/trace_events.h:94: In file included from include/trace/events/page_ref.h:11:
include/trace/events/mmflags.h:169:10: warning: 'IF_HAVE_VM_DROPPABLE' macro redefined [-Wmacro-redefined]
169 | # define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name}, | ^ include/trace/events/mmflags.h:171:10: note: previous definition is here 171 | # define IF_HAVE_VM_DROPPABLE(flag, name) | ^ 5 warnings generated.
vim +/IF_HAVE_VM_DROPPABLE +169 include/trace/events/mmflags.h
7677f7fd8be766 Axel Rasmussen 2021-05-04 167 41e2c674b334ed Christophe Leroy 2024-08-22 @168 #if VM_DROPPABLE != VM_NONE 9651fcedf7b92d Jason A. Donenfeld 2022-12-08 @169 # define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name}, 9651fcedf7b92d Jason A. Donenfeld 2022-12-08 170 #else 9651fcedf7b92d Jason A. Donenfeld 2022-12-08 171 # define IF_HAVE_VM_DROPPABLE(flag, name) 9651fcedf7b92d Jason A. Donenfeld 2022-12-08 172 #endif 9651fcedf7b92d Jason A. Donenfeld 2022-12-08 173
Hi Christophe,
kernel test robot noticed the following build warnings:
[auto build test WARNING on powerpc/next] [also build test WARNING on powerpc/fixes shuah-kselftest/next shuah-kselftest/fixes linus/master v6.11-rc5 next-20240826] [cannot apply to crng-random/master] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Christophe-Leroy/asm-generic-... base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next patch link: https://lore.kernel.org/r/315e3a268b165b6edad7dcb723b0d8a506a56c4e.172430919... patch subject: [PATCH v2 07/17] mm: Define VM_DROPPABLE for powerpc/32 config: um-randconfig-r122-20240826 (https://download.01.org/0day-ci/archive/20240827/202408270553.2S5d14Ar-lkp@i...) compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240827/202408270553.2S5d14Ar-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot lkp@intel.com | Closes: https://lore.kernel.org/oe-kbuild-all/202408270553.2S5d14Ar-lkp@intel.com/
sparse warnings: (new ones prefixed by >>) mm/debug_page_ref.c: note: in included file (through include/trace/events/page_ref.h):
include/trace/events/mmflags.h:168:5: sparse: sparse: undefined preprocessor identifier 'VM_DROPPABLE' include/trace/events/mmflags.h:168:21: sparse: sparse: undefined preprocessor identifier 'VM_NONE'
mm/debug_page_ref.c: note: in included file (through include/trace/events/page_ref.h, include/trace/define_trace.h, include/trace/events/page_ref.h):
include/trace/events/mmflags.h:168:5: sparse: sparse: undefined preprocessor identifier 'VM_DROPPABLE' include/trace/events/mmflags.h:168:21: sparse: sparse: undefined preprocessor identifier 'VM_NONE'
mm/debug_page_ref.c: note: in included file (through include/trace/events/page_ref.h, include/trace/trace_events.h, include/trace/define_trace.h, ...): include/trace/events/mmflags.h:169:10: sparse: sparse: preprocessor token IF_HAVE_VM_DROPPABLE redefined mm/debug_page_ref.c: note: in included file (through include/trace/events/page_ref.h): include/trace/events/mmflags.h:171:10: sparse: this was the original definition
vim +/VM_DROPPABLE +168 include/trace/events/mmflags.h
167
168 #if VM_DROPPABLE != VM_NONE
169 # define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name}, 170 #else 171 # define IF_HAVE_VM_DROPPABLE(flag, name) 172 #endif 173
To support getrandom in the VDSO, which is based on little-endian storage, add macros equivalent to LWZX_BE and STWX_BE for little-endian accesses.
Put them outside of the __powerpc64__ #ifdef so that they can also be used on PPC32.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- arch/powerpc/include/asm/asm-compat.h | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/arch/powerpc/include/asm/asm-compat.h b/arch/powerpc/include/asm/asm-compat.h index 2bc53c646ccd..6e53a80917ef 100644 --- a/arch/powerpc/include/asm/asm-compat.h +++ b/arch/powerpc/include/asm/asm-compat.h @@ -61,4 +61,12 @@
#endif
+#ifdef __BIG_ENDIAN__ +#define LWZX_LE stringify_in_c(lwbrx) +#define STWX_LE stringify_in_c(stwbrx) +#else +#define LWZX_LE stringify_in_c(lwzx) +#define STWX_LE stringify_in_c(stwx) +#endif + #endif /* _ASM_POWERPC_ASM_COMPAT_H */
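For readers not fluent in PowerPC assembly, a rough C analogue of what these macros give the assembly code, i.e. a 32-bit load that reads little-endian data regardless of CPU endianness (lwbrx being the byte-reversed load used on big-endian), could look like the sketch below; the function name is made up for illustration:

#include <stdint.h>
#include <string.h>

/* Rough C analogue of LWZX_LE: load a 32-bit little-endian value. */
static inline uint32_t load_le32(const void *base, uintptr_t offset)
{
        uint32_t v;

        memcpy(&v, (const uint8_t *)base + offset, sizeof(v));
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        v = __builtin_bswap32(v);       /* what lwbrx does in one instruction */
#endif
        return v;
}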
Commit 08c18b63d965 ("powerpc/vdso32: Add missing _restgpr_31_x to fix build failure") added _restgpr_31_x to the vdso for gettimeofday, but the work on getrandom shows that we will need more of those functions.
Remove _restgpr_31_x and link in crtsavres.o so that we get all save/restore functions when optimising the kernel for size.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- arch/powerpc/kernel/vdso/Makefile | 5 ++++- arch/powerpc/kernel/vdso/gettimeofday.S | 13 ------------- 2 files changed, 4 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile index 1425b6edc66b..c07a425b8f78 100644 --- a/arch/powerpc/kernel/vdso/Makefile +++ b/arch/powerpc/kernel/vdso/Makefile @@ -43,6 +43,7 @@ else endif
targets := $(obj-vdso32) vdso32.so.dbg vgettimeofday-32.o +targets += crtsavres-32.o obj-vdso32 := $(addprefix $(obj)/, $(obj-vdso32)) targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64)) @@ -68,7 +69,7 @@ targets += vdso64.lds CPPFLAGS_vdso64.lds += -P -C
# link rule for the .so file, .lds has to be first -$(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o FORCE +$(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o $(obj)/crtsavres-32.o FORCE $(call if_changed,vdso32ld_and_check) $(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o FORCE $(call if_changed,vdso64ld_and_check) @@ -76,6 +77,8 @@ $(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o # assembly rules for the .S files $(obj-vdso32): %-32.o: %.S FORCE $(call if_changed_dep,vdso32as) +$(obj)/crtsavres-32.o: %-32.o: $(srctree)/arch/powerpc/lib/crtsavres.S FORCE + $(call if_changed_dep,vdso32as) $(obj)/vgettimeofday-32.o: %-32.o: %.c FORCE $(call if_changed_dep,vdso32cc) $(obj-vdso64): %-64.o: %.S FORCE diff --git a/arch/powerpc/kernel/vdso/gettimeofday.S b/arch/powerpc/kernel/vdso/gettimeofday.S index 48fc6658053a..67254ac9c8bb 100644 --- a/arch/powerpc/kernel/vdso/gettimeofday.S +++ b/arch/powerpc/kernel/vdso/gettimeofday.S @@ -118,16 +118,3 @@ V_FUNCTION_END(__kernel_clock_getres) V_FUNCTION_BEGIN(__kernel_time) cvdso_call __c_kernel_time call_time=1 V_FUNCTION_END(__kernel_time) - -/* Routines for restoring integer registers, called by the compiler. */ -/* Called with r11 pointing to the stack header word of the caller of the */ -/* function, just beyond the end of the integer restore area. */ -#ifndef __powerpc64__ -_GLOBAL(_restgpr_31_x) -_GLOBAL(_rest32gpr_31_x) - lwz r0,4(r11) - lwz r31,-4(r11) - mtlr r0 - mr r1,r11 - blr -#endif
In order to avoid duplication when we add new VDSO functionality in C, like getrandom, refactor the common CFLAGS.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- arch/powerpc/kernel/vdso/Makefile | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile index c07a425b8f78..cd4ed09864f0 100644 --- a/arch/powerpc/kernel/vdso/Makefile +++ b/arch/powerpc/kernel/vdso/Makefile @@ -10,11 +10,6 @@ obj-vdso64 = sigtramp64-64.o gettimeofday-64.o datapage-64.o cacheflush-64.o not
ifneq ($(c-gettimeofday-y),) CFLAGS_vgettimeofday-32.o += -include $(c-gettimeofday-y) - CFLAGS_vgettimeofday-32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN) - CFLAGS_vgettimeofday-32.o += $(call cc-option, -fno-stack-protector) - CFLAGS_vgettimeofday-32.o += -DDISABLE_BRANCH_PROFILING - CFLAGS_vgettimeofday-32.o += -ffreestanding -fasynchronous-unwind-tables - CFLAGS_REMOVE_vgettimeofday-32.o = $(CC_FLAGS_FTRACE) CFLAGS_REMOVE_vgettimeofday-32.o += -mcmodel=medium -mabi=elfv1 -mabi=elfv2 -mcall-aixdesc # This flag is supported by clang for 64-bit but not 32-bit so it will cause # an unused command line flag warning for this file. @@ -22,11 +17,6 @@ ifneq ($(c-gettimeofday-y),) CFLAGS_REMOVE_vgettimeofday-32.o += -fno-stack-clash-protection endif CFLAGS_vgettimeofday-64.o += -include $(c-gettimeofday-y) - CFLAGS_vgettimeofday-64.o += $(DISABLE_LATENT_ENTROPY_PLUGIN) - CFLAGS_vgettimeofday-64.o += $(call cc-option, -fno-stack-protector) - CFLAGS_vgettimeofday-64.o += -DDISABLE_BRANCH_PROFILING - CFLAGS_vgettimeofday-64.o += -ffreestanding -fasynchronous-unwind-tables - CFLAGS_REMOVE_vgettimeofday-64.o = $(CC_FLAGS_FTRACE) # Go prior to 1.16.x assumes r30 is not clobbered by any VDSO code. That used to be true # by accident when the VDSO was hand-written asm code, but may not be now that the VDSO is # compiler generated. To avoid breaking Go tell GCC not to use r30. Impact on code @@ -49,7 +39,12 @@ targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64))
ccflags-y := -fno-common -fno-builtin +ccflags-y += $(DISABLE_LATENT_ENTROPY_PLUGIN) +ccflags-y += $(call cc-option, -fno-stack-protector) +ccflags-y += -DDISABLE_BRANCH_PROFILING +ccflags-y += -ffreestanding -fasynchronous-unwind-tables ldflags-y := -Wl,--hash-style=both -nostdlib -shared -z noexecstack $(CLANG_FLAGS) +ccflags-remove-y := $(CC_FLAGS_FTRACE) ldflags-$(CONFIG_LD_IS_LLD) += $(call cc-option,--ld-path=$(LD),-fuse-ld=lld) ldflags-$(CONFIG_LD_ORPHAN_WARN) += -Wl,--orphan-handling=$(CONFIG_LD_ORPHAN_WARN_LEVEL)
To be consistent with other VDSO functions, the function is called __kernel_getrandom().
The __arch_chacha20_blocks_nostack() function is implemented with basic 32-bit operations. It performs 4 QUARTERROUND operations in parallel. There are enough registers to avoid using the stack:
On input:
    r3: output bytes
    r4: 32-byte key input
    r5: 8-byte counter input/output
    r6: number of 64-byte blocks to write to output

During operation:
    r0: counter of blocks (initialised with r6)
    r4: Value '4' after key has been read, used for indexing
    r6-r13: key
    r14-r15: block counter
    r16-r31: chacha state

At the end:
    r0, r6-r13: Zeroised
    r14-r31: Restored
Performance on powerpc 885 (using kernel selftest):
~# ./vdso_test_getrandom bench-single
   vdso: 2500000 times in 7.897495392 seconds
   libc: 2500000 times in 56.091632232 seconds
syscall: 2500000 times in 55.704851989 seconds

Performance on powerpc 8321 (using kernel selftest):
~# ./vdso_test_getrandom bench-single
   vdso: 2500000 times in 2.017183250 seconds
   libc: 2500000 times in 13.088533630 seconds
syscall: 2500000 times in 12.952458068 seconds
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/mman.h | 2 +- arch/powerpc/include/asm/vdso/getrandom.h | 67 +++++ arch/powerpc/include/asm/vdso/vsyscall.h | 6 + arch/powerpc/include/asm/vdso_datapage.h | 2 + arch/powerpc/kernel/asm-offsets.c | 1 + arch/powerpc/kernel/vdso/Makefile | 27 +- arch/powerpc/kernel/vdso/getrandom.S | 58 ++++ arch/powerpc/kernel/vdso/vdso32.lds.S | 1 + arch/powerpc/kernel/vdso/vdso64.lds.S | 1 + arch/powerpc/kernel/vdso/vgetrandom-chacha.S | 297 +++++++++++++++++++ arch/powerpc/kernel/vdso/vgetrandom.c | 14 + 12 files changed, 471 insertions(+), 6 deletions(-) create mode 100644 arch/powerpc/include/asm/vdso/getrandom.h create mode 100644 arch/powerpc/kernel/vdso/getrandom.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom-chacha.S create mode 100644 arch/powerpc/kernel/vdso/vgetrandom.c
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index d7b09b064a8a..b45452ac4a73 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -311,6 +311,7 @@ config PPC select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK select TRACE_IRQFLAGS_SUPPORT + select VDSO_GETRANDOM # # Please keep this list sorted alphabetically. # diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h index 17a77d47ed6d..42a51a993d94 100644 --- a/arch/powerpc/include/asm/mman.h +++ b/arch/powerpc/include/asm/mman.h @@ -6,7 +6,7 @@
#include <uapi/asm/mman.h>
-#ifdef CONFIG_PPC64 +#if defined(CONFIG_PPC64) && !defined(BUILD_VDSO)
#include <asm/cputable.h> #include <linux/mm.h> diff --git a/arch/powerpc/include/asm/vdso/getrandom.h b/arch/powerpc/include/asm/vdso/getrandom.h new file mode 100644 index 000000000000..b37e581b02cb --- /dev/null +++ b/arch/powerpc/include/asm/vdso/getrandom.h @@ -0,0 +1,67 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Christophe Leroy christophe.leroy@csgroup.eu, CS GROUP France + */ +#ifndef _ASM_POWERPC_VDSO_GETRANDOM_H +#define _ASM_POWERPC_VDSO_GETRANDOM_H + +#ifndef __ASSEMBLY__ + +static __always_inline int do_syscall_3(const unsigned long _r0, const unsigned long _r3, + const unsigned long _r4, const unsigned long _r5) +{ + register long r0 asm("r0") = _r0; + register unsigned long r3 asm("r3") = _r3; + register unsigned long r4 asm("r4") = _r4; + register unsigned long r5 asm("r5") = _r5; + register int ret asm ("r3"); + + asm volatile( + " sc\n" + " bns+ 1f\n" + " neg %0, %0\n" + "1:\n" + : "=r" (ret), "+r" (r4), "+r" (r5), "+r" (r0) + : "r" (r3) + : "memory", "r6", "r7", "r8", "r9", "r10", "r11", "r12", "cr0", "ctr"); + + return ret; +} + +/** + * getrandom_syscall - Invoke the getrandom() syscall. + * @buffer: Destination buffer to fill with random bytes. + * @len: Size of @buffer in bytes. + * @flags: Zero or more GRND_* flags. + * Returns: The number of bytes written to @buffer, or a negative value indicating an error. + */ +static __always_inline ssize_t getrandom_syscall(void *buffer, size_t len, unsigned int flags) +{ + return do_syscall_3(__NR_getrandom, (unsigned long)buffer, + (unsigned long)len, (unsigned long)flags); +} + +static __always_inline struct vdso_rng_data *__arch_get_vdso_rng_data(void) +{ + BUILD_BUG(); +} + +ssize_t __c_kernel_getrandom(void *buffer, size_t len, unsigned int flags, void *opaque_state, + size_t opaque_len, const struct vdso_rng_data *vd); + +/** + * __arch_chacha20_blocks_nostack - Generate ChaCha20 stream without using the stack. + * @dst_bytes: Destination buffer to hold @nblocks * 64 bytes of output. + * @key: 32-byte input key. + * @counter: 8-byte counter, read on input and updated on return. + * @nblocks: Number of blocks to generate. + * + * Generates a given positive number of blocks of ChaCha20 output with nonce=0, and does not write + * to any stack or memory outside of the parameters passed to it, in order to mitigate stack data + * leaking into forked child processes. + */ +void __arch_chacha20_blocks_nostack(u8 *dst_bytes, const u32 *key, u32 *counter, size_t nblocks); + +#endif /* !__ASSEMBLY__ */ + +#endif /* _ASM_POWERPC_VDSO_GETRANDOM_H */ diff --git a/arch/powerpc/include/asm/vdso/vsyscall.h b/arch/powerpc/include/asm/vdso/vsyscall.h index 48cf23f1e273..92f480d8cc6d 100644 --- a/arch/powerpc/include/asm/vdso/vsyscall.h +++ b/arch/powerpc/include/asm/vdso/vsyscall.h @@ -17,6 +17,12 @@ struct vdso_data *__arch_get_k_vdso_data(void) } #define __arch_get_k_vdso_data __arch_get_k_vdso_data
+static __always_inline +struct vdso_rng_data *__arch_get_k_vdso_rng_data(void) +{ + return &vdso_data->rng_data; +} + /* The asm-generic header needs to be included after the definitions above */ #include <asm-generic/vdso/vsyscall.h>
diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h index a585c8e538ff..e17500c5237e 100644 --- a/arch/powerpc/include/asm/vdso_datapage.h +++ b/arch/powerpc/include/asm/vdso_datapage.h @@ -83,6 +83,7 @@ struct vdso_arch_data { __u32 compat_syscall_map[SYSCALL_MAP_SIZE]; /* Map of compat syscalls */
struct vdso_data data[CS_BASES]; + struct vdso_rng_data rng_data; };
#else /* CONFIG_PPC64 */ @@ -95,6 +96,7 @@ struct vdso_arch_data { __u32 syscall_map[SYSCALL_MAP_SIZE]; /* Map of syscalls */ __u32 compat_syscall_map[0]; /* No compat syscalls on PPC32 */ struct vdso_data data[CS_BASES]; + struct vdso_rng_data rng_data; };
#endif /* CONFIG_PPC64 */ diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 23733282de4d..eedb2e04c785 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -335,6 +335,7 @@ int main(void)
/* datapage offsets for use by vdso */ OFFSET(VDSO_DATA_OFFSET, vdso_arch_data, data); + OFFSET(VDSO_RNG_DATA_OFFSET, vdso_arch_data, rng_data); OFFSET(CFG_TB_TICKS_PER_SEC, vdso_arch_data, tb_ticks_per_sec); #ifdef CONFIG_PPC64 OFFSET(CFG_ICACHE_BLOCKSZ, vdso_arch_data, icache_block_size); diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile index cd4ed09864f0..9109677b2f39 100644 --- a/arch/powerpc/kernel/vdso/Makefile +++ b/arch/powerpc/kernel/vdso/Makefile @@ -8,6 +8,9 @@ include $(srctree)/lib/vdso/Makefile obj-vdso32 = sigtramp32-32.o gettimeofday-32.o datapage-32.o cacheflush-32.o note-32.o getcpu-32.o obj-vdso64 = sigtramp64-64.o gettimeofday-64.o datapage-64.o cacheflush-64.o note-64.o getcpu-64.o
+obj-vdso32 += getrandom-32.o vgetrandom-chacha-32.o +obj-vdso64 += getrandom-64.o vgetrandom-chacha-64.o + ifneq ($(c-gettimeofday-y),) CFLAGS_vgettimeofday-32.o += -include $(c-gettimeofday-y) CFLAGS_REMOVE_vgettimeofday-32.o += -mcmodel=medium -mabi=elfv1 -mabi=elfv2 -mcall-aixdesc @@ -24,6 +27,16 @@ ifneq ($(c-gettimeofday-y),) CFLAGS_vgettimeofday-64.o += $(call cc-option, -ffixed-r30) endif
+ifneq ($(c-getrandom-y),) + CFLAGS_vgetrandom-32.o += -include $(c-getrandom-y) + CFLAGS_REMOVE_vgetrandom-32.o += -mcmodel=medium -mabi=elfv1 -mabi=elfv2 -mcall-aixdesc + ifdef CONFIG_CC_IS_CLANG + CFLAGS_REMOVE_vgetrandom-32.o += -fno-stack-clash-protection + endif + CFLAGS_vgetrandom-64.o += -include $(c-getrandom-y) + CFLAGS_vgetrandom-64.o += $(call cc-option, -ffixed-r30) +endif + # Build rules
ifdef CROSS32_COMPILE @@ -32,13 +45,13 @@ else VDSOCC := $(CC) endif
-targets := $(obj-vdso32) vdso32.so.dbg vgettimeofday-32.o +targets := $(obj-vdso32) vdso32.so.dbg vgettimeofday-32.o vgetrandom-32.o targets += crtsavres-32.o obj-vdso32 := $(addprefix $(obj)/, $(obj-vdso32)) -targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o +targets += $(obj-vdso64) vdso64.so.dbg vgettimeofday-64.o vgetrandom-64.o obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64))
-ccflags-y := -fno-common -fno-builtin +ccflags-y := -fno-common -fno-builtin -DBUILD_VDSO ccflags-y += $(DISABLE_LATENT_ENTROPY_PLUGIN) ccflags-y += $(call cc-option, -fno-stack-protector) ccflags-y += -DDISABLE_BRANCH_PROFILING @@ -64,9 +77,9 @@ targets += vdso64.lds CPPFLAGS_vdso64.lds += -P -C
# link rule for the .so file, .lds has to be first -$(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o $(obj)/crtsavres-32.o FORCE +$(obj)/vdso32.so.dbg: $(obj)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday-32.o $(obj)/vgetrandom-32.o $(obj)/crtsavres-32.o FORCE $(call if_changed,vdso32ld_and_check) -$(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o FORCE +$(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj)/vgettimeofday-64.o $(obj)/vgetrandom-64.o FORCE $(call if_changed,vdso64ld_and_check)
# assembly rules for the .S files @@ -76,10 +89,14 @@ $(obj)/crtsavres-32.o: %-32.o: $(srctree)/arch/powerpc/lib/crtsavres.S FORCE $(call if_changed_dep,vdso32as) $(obj)/vgettimeofday-32.o: %-32.o: %.c FORCE $(call if_changed_dep,vdso32cc) +$(obj)/vgetrandom-32.o: %-32.o: %.c FORCE + $(call if_changed_dep,vdso32cc) $(obj-vdso64): %-64.o: %.S FORCE $(call if_changed_dep,vdso64as) $(obj)/vgettimeofday-64.o: %-64.o: %.c FORCE $(call if_changed_dep,cc_o_c) +$(obj)/vgetrandom-64.o: %-64.o: %.c FORCE + $(call if_changed_dep,cc_o_c)
# Generate VDSO offsets using helper script gen-vdso32sym := $(src)/gen_vdso32_offsets.sh diff --git a/arch/powerpc/kernel/vdso/getrandom.S b/arch/powerpc/kernel/vdso/getrandom.S new file mode 100644 index 000000000000..a957cd2b2b03 --- /dev/null +++ b/arch/powerpc/kernel/vdso/getrandom.S @@ -0,0 +1,58 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Userland implementation of getrandom() for processes + * for use in the vDSO + * + * Copyright (C) 2024 Christophe Leroy christophe.leroy@csgroup.eu, CS GROUP France + */ +#include <asm/processor.h> +#include <asm/ppc_asm.h> +#include <asm/vdso.h> +#include <asm/vdso_datapage.h> +#include <asm/asm-offsets.h> +#include <asm/unistd.h> + +/* + * The macro sets two stack frames, one for the caller and one for the callee + * because there are no requirement for the caller to set a stack frame when + * calling VDSO so it may have omitted to set one, especially on PPC64 + */ + +.macro cvdso_call funct + .cfi_startproc + PPC_STLU r1, -PPC_MIN_STKFRM(r1) + .cfi_adjust_cfa_offset PPC_MIN_STKFRM + mflr r0 + PPC_STLU r1, -PPC_MIN_STKFRM(r1) + .cfi_adjust_cfa_offset PPC_MIN_STKFRM + PPC_STL r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1) + .cfi_rel_offset lr, PPC_MIN_STKFRM + PPC_LR_STKOFF +#ifdef __powerpc64__ + PPC_STL r2, PPC_MIN_STKFRM + STK_GOT(r1) + .cfi_rel_offset r2, PPC_MIN_STKFRM + STK_GOT +#endif + get_datapage r8 + addi r8, r8, VDSO_RNG_DATA_OFFSET + bl CFUNC(DOTSYM(\funct)) + PPC_LL r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1) +#ifdef __powerpc64__ + PPC_LL r2, PPC_MIN_STKFRM + STK_GOT(r1) + .cfi_restore r2 +#endif + cmpwi r3, 0 + mtlr r0 + addi r1, r1, 2 * PPC_MIN_STKFRM + .cfi_restore lr + .cfi_def_cfa_offset 0 + crclr so + bgelr+ + crset so + neg r3, r3 + blr + .cfi_endproc +.endm + + .text +V_FUNCTION_BEGIN(__kernel_getrandom) + cvdso_call __c_kernel_getrandom +V_FUNCTION_END(__kernel_getrandom) diff --git a/arch/powerpc/kernel/vdso/vdso32.lds.S b/arch/powerpc/kernel/vdso/vdso32.lds.S index 8f57107000a2..7b41d5d256e8 100644 --- a/arch/powerpc/kernel/vdso/vdso32.lds.S +++ b/arch/powerpc/kernel/vdso/vdso32.lds.S @@ -130,6 +130,7 @@ VERSION #if defined(CONFIG_PPC64) || !defined(CONFIG_SMP) __kernel_getcpu; #endif + __kernel_getrandom;
local: *; }; diff --git a/arch/powerpc/kernel/vdso/vdso64.lds.S b/arch/powerpc/kernel/vdso/vdso64.lds.S index 400819258c06..9481e4b892ed 100644 --- a/arch/powerpc/kernel/vdso/vdso64.lds.S +++ b/arch/powerpc/kernel/vdso/vdso64.lds.S @@ -123,6 +123,7 @@ VERSION __kernel_sigtramp_rt64; __kernel_getcpu; __kernel_time; + __kernel_getrandom;
local: *; }; diff --git a/arch/powerpc/kernel/vdso/vgetrandom-chacha.S b/arch/powerpc/kernel/vdso/vgetrandom-chacha.S new file mode 100644 index 000000000000..5375bc5807e1 --- /dev/null +++ b/arch/powerpc/kernel/vdso/vgetrandom-chacha.S @@ -0,0 +1,297 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Christophe Leroy christophe.leroy@csgroup.eu, CS GROUP France + */ + +#include <linux/linkage.h> + +#include <asm/ppc_asm.h> + +.macro quarterround4 a1 b1 c1 d1 a2 b2 c2 d2 a3 b3 c3 d3 a4 b4 c4 d4 + add \a1, \a1, \b1 + add \a2, \a2, \b2 + add \a3, \a3, \b3 + add \a4, \a4, \b4 + xor \d1, \d1, \a1 + xor \d2, \d2, \a2 + xor \d3, \d3, \a3 + xor \d4, \d4, \a4 + rotlwi \d1, \d1, 16 + rotlwi \d2, \d2, 16 + rotlwi \d3, \d3, 16 + rotlwi \d4, \d4, 16 + add \c1, \c1, \d1 + add \c2, \c2, \d2 + add \c3, \c3, \d3 + add \c4, \c4, \d4 + xor \b1, \b1, \c1 + xor \b2, \b2, \c2 + xor \b3, \b3, \c3 + xor \b4, \b4, \c4 + rotlwi \b1, \b1, 12 + rotlwi \b2, \b2, 12 + rotlwi \b3, \b3, 12 + rotlwi \b4, \b4, 12 + add \a1, \a1, \b1 + add \a2, \a2, \b2 + add \a3, \a3, \b3 + add \a4, \a4, \b4 + xor \d1, \d1, \a1 + xor \d2, \d2, \a2 + xor \d3, \d3, \a3 + xor \d4, \d4, \a4 + rotlwi \d1, \d1, 8 + rotlwi \d2, \d2, 8 + rotlwi \d3, \d3, 8 + rotlwi \d4, \d4, 8 + add \c1, \c1, \d1 + add \c2, \c2, \d2 + add \c3, \c3, \d3 + add \c4, \c4, \d4 + xor \b1, \b1, \c1 + xor \b2, \b2, \c2 + xor \b3, \b3, \c3 + xor \b4, \b4, \c4 + rotlwi \b1, \b1, 7 + rotlwi \b2, \b2, 7 + rotlwi \b3, \b3, 7 + rotlwi \b4, \b4, 7 +.endm + +#define QUARTERROUND4(a1,b1,c1,d1,a2,b2,c2,d2,a3,b3,c3,d3,a4,b4,c4,d4) \ + quarterround4 16+a1 16+b1 16+c1 16+d1 16+a2 16+b2 16+c2 16+d2 \ + 16+a3 16+b3 16+c3 16+d3 16+a4 16+b4 16+c4 16+d4 + +/* + * Very basic 32 bits implementation of ChaCha20. Produces a given positive number + * of blocks of output with a nonce of 0, taking an input key and 8-byte + * counter. Importantly does not spill to the stack. Its arguments are: + * + * r3: output bytes + * r4: 32-byte key input + * r5: 8-byte counter input/output + * r6: number of 64-byte blocks to write to output + * + * r0: counter of blocks (initialised with r6) + * r4: Value '4' after key has been read. 
+ * r6-r13: key + * r14-r15 : counter + * r16-r31 : state + */ +SYM_FUNC_START(__arch_chacha20_blocks_nostack) +#ifdef __powerpc64__ + std r14,-144(r1) + std r15,-136(r1) + std r16,-128(r1) + std r17,-120(r1) + std r18,-112(r1) + std r19,-104(r1) + std r20,-96(r1) + std r21,-88(r1) + std r22,-80(r1) + std r23,-72(r1) + std r24,-64(r1) + std r25,-56(r1) + std r26,-48(r1) + std r27,-40(r1) + std r28,-32(r1) + std r29,-24(r1) + std r30,-16(r1) + std r31,-8(r1) +#else + stwu r1, -96(r1) +#if defined(CONFIG_CPU_LITTLE_ENDIAN) + stw r14,24(r1) + stw r15,28(r1) + stw r16,32(r1) + stw r17,36(r1) + stw r18,40(r1) + stw r19,44(r1) + stw r20,48(r1) + stw r21,52(r1) + stw r22,56(r1) + stw r23,60(r1) + stw r24,64(r1) + stw r25,68(r1) + stw r26,72(r1) + stw r27,76(r1) + stw r28,80(r1) + stw r29,84(r1) + stw r30,88(r1) + stw r31,92(r1) +#else + stmw r14, 24(r1) +#endif +#endif + mr r0, r6 + + li r31, 4 + + LWZX_LE r6, 0, r4 + LWZX_LE r7, r31, r4 + addi r4, r4, 8 + LWZX_LE r8, 0, r4 + LWZX_LE r9, r31, r4 + addi r4, r4, 8 + LWZX_LE r10, 0, r4 + LWZX_LE r11, r31, r4 + addi r4, r4, 8 + LWZX_LE r12, 0, r4 + LWZX_LE r13, r31, r4 + + li r4, 4 + + LWZX_LE r14, 0, r5 + LWZX_LE r15, r4, r5 +#ifdef __powerpc64__ + rldimi r14, r15, 32, 0 +#endif +.Lblock: + li r31, 10 + + lis r16, 0x6170 + lis r17, 0x3320 + lis r18, 0x7962 + lis r19, 0x6b20 + addi r16, r16, 0x7865 + addi r17, r17, 0x646e + addi r18, r18, 0x2d32 + addi r19, r19, 0x6574 + + mtctr r31 + + mr r20, r6 + mr r21, r7 + mr r22, r8 + mr r23, r9 + mr r24, r10 + mr r25, r11 + mr r26, r12 + mr r27, r13 + + mr r28, r14 + mr r29, r15 + li r30, 0 + li r31, 0 + +.Lpermute: + QUARTERROUND4( 0, 4, 8,12, 1, 5, 9,13, 2, 6,10,14, 3, 7,11,15) + QUARTERROUND4( 0, 5,10,15, 1, 6,11,12, 2, 7, 8,13, 3, 4, 9,14) + + bdnz .Lpermute + + addis r16, r16, 0x6170 + addis r17, r17, 0x3320 + addis r18, r18, 0x7962 + addis r19, r19, 0x6b20 + addi r16, r16, 0x7865 + addi r17, r17, 0x646e + addi r18, r18, 0x2d32 + addi r19, r19, 0x6574 + + add r20, r20, r6 + add r21, r21, r7 + add r22, r22, r8 + add r23, r23, r9 + add r24, r24, r10 + add r25, r25, r11 + add r26, r26, r12 + add r27, r27, r13 + + add r28, r28, r14 + add r29, r29, r15 + + STWX_LE r16, 0, r3 + STWX_LE r17, r4, r3 + addi r3, r3, 8 + STWX_LE r18, 0, r3 + STWX_LE r19, r4, r3 + addi r3, r3, 8 + STWX_LE r20, 0, r3 + STWX_LE r21, r4, r3 + addi r3, r3, 8 + STWX_LE r22, 0, r3 + STWX_LE r23, r4, r3 + addi r3, r3, 8 + STWX_LE r24, 0, r3 + STWX_LE r25, r4, r3 + addi r3, r3, 8 + STWX_LE r26, 0, r3 + STWX_LE r27, r4, r3 + addi r3, r3, 8 + STWX_LE r28, 0, r3 + STWX_LE r29, r4, r3 + addi r3, r3, 8 + STWX_LE r30, 0, r3 + STWX_LE r31, r4, r3 + addi r3, r3, 8 + +#ifdef __powerpc64__ + addi r14, r14, 1 + srdi r15, r14, 32 +#else + addic r14, r14, 1 + addze r15, r15 +#endif + + subic. r0, r0, 1 /* subi. 
can't use r0 as source */ + bne .Lblock + + STWX_LE r14, 0, r5 + STWX_LE r15, r4, r5 + + li r6, 0 + li r7, 0 + li r8, 0 + li r9, 0 + li r10, 0 + li r11, 0 + li r12, 0 + li r13, 0 + +#ifdef __powerpc64__ + ld r14,-144(r1) + ld r15,-136(r1) + ld r16,-128(r1) + ld r17,-120(r1) + ld r18,-112(r1) + ld r19,-104(r1) + ld r20,-96(r1) + ld r21,-88(r1) + ld r22,-80(r1) + ld r23,-72(r1) + ld r24,-64(r1) + ld r25,-56(r1) + ld r26,-48(r1) + ld r27,-40(r1) + ld r28,-32(r1) + ld r29,-24(r1) + ld r30,-16(r1) + ld r31,-8(r1) +#else +#if defined(CONFIG_CPU_LITTLE_ENDIAN) + lwz r14,24(r1) + lwz r15,28(r1) + lwz r16,32(r1) + lwz r17,36(r1) + lwz r18,40(r1) + lwz r19,44(r1) + lwz r20,48(r1) + lwz r21,52(r1) + lwz r22,56(r1) + lwz r23,60(r1) + lwz r24,64(r1) + lwz r25,68(r1) + lwz r26,72(r1) + lwz r27,76(r1) + lwz r28,80(r1) + lwz r29,84(r1) + lwz r30,88(r1) + lwz r31,92(r1) +#else + lmw r14, 24(r1) +#endif + addi r1, r1, 96 +#endif + blr +SYM_FUNC_END(__arch_chacha20_blocks_nostack) diff --git a/arch/powerpc/kernel/vdso/vgetrandom.c b/arch/powerpc/kernel/vdso/vgetrandom.c new file mode 100644 index 000000000000..5f855d45fb7b --- /dev/null +++ b/arch/powerpc/kernel/vdso/vgetrandom.c @@ -0,0 +1,14 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Powerpc userspace implementation of getrandom() + * + * Copyright (C) 2024 Christophe Leroy christophe.leroy@csgroup.eu, CS GROUP France + */ +#include <linux/time.h> +#include <linux/types.h> + +ssize_t __c_kernel_getrandom(void *buffer, size_t len, unsigned int flags, void *opaque_state, + size_t opaque_len, const struct vdso_rng_data *vd) +{ + return __cvdso_getrandom_data(vd, buffer, len, flags, opaque_state, opaque_len); +}
__powerpc__ is also defined on powerpc64 so __powerpc64__ needs to be checked first.
Fixes: 693f5ca08ca0 ("kselftest: Extend vDSO selftest") Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- tools/testing/selftests/vDSO/vdso_config.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/vDSO/vdso_config.h b/tools/testing/selftests/vDSO/vdso_config.h index 7b543e7f04d7..00bfed6e4922 100644 --- a/tools/testing/selftests/vDSO/vdso_config.h +++ b/tools/testing/selftests/vDSO/vdso_config.h @@ -18,13 +18,13 @@ #elif defined(__aarch64__) #define VDSO_VERSION 3 #define VDSO_NAMES 0 -#elif defined(__powerpc__) +#elif defined(__powerpc64__) #define VDSO_VERSION 1 #define VDSO_NAMES 0 -#define VDSO_32BIT 1 -#elif defined(__powerpc64__) +#elif defined(__powerpc__) #define VDSO_VERSION 1 #define VDSO_NAMES 0 +#define VDSO_32BIT 1 #elif defined (__s390__) #define VDSO_VERSION 2 #define VDSO_NAMES 0
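A minimal illustration of the pitfall fixed above (not taken from the patch itself): on powerpc64 the compiler predefines both __powerpc__ and __powerpc64__, so a chain that tests the generic macro first shadows the 64-bit branch.

#if defined(__powerpc__)        /* true on BOTH ppc32 and ppc64 */
/* 32-bit settings would be picked here... */
#elif defined(__powerpc64__)    /* ...and this branch is never reached on ppc64 */
/* 64-bit settings */
#endif
/* Testing __powerpc64__ first, as the patch does, restores the intended split. */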
Architectures use different locations for vDSO sources: arch/mips/vdso arch/sparc/vdso arch/arm64/kernel/vdso arch/riscv/kernel/vdso arch/csky/kernel/vdso arch/x86/um/vdso arch/x86/entry/vdso arch/powerpc/kernel/vdso arch/arm/vdso arch/loongarch/vdso
Don't hard-code vdso sources location in selftest Makefile, instead create a vdso/ symbolic link in tools/arch/$arch/ and update Makefile accordingly.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- tools/arch/x86/vdso | 1 + tools/testing/selftests/vDSO/Makefile | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) create mode 120000 tools/arch/x86/vdso
diff --git a/tools/arch/x86/vdso b/tools/arch/x86/vdso new file mode 120000 index 000000000000..7eb962fd3454 --- /dev/null +++ b/tools/arch/x86/vdso @@ -0,0 +1 @@ +../../../arch/x86/entry/vdso/ \ No newline at end of file diff --git a/tools/testing/selftests/vDSO/Makefile b/tools/testing/selftests/vDSO/Makefile index 3de8e7e052ae..c9a819cacbf2 100644 --- a/tools/testing/selftests/vDSO/Makefile +++ b/tools/testing/selftests/vDSO/Makefile @@ -40,7 +40,7 @@ $(OUTPUT)/vdso_test_getrandom: parse_vdso.c $(OUTPUT)/vdso_test_getrandom: CFLAGS += -isystem $(top_srcdir)/tools/include \ -isystem $(top_srcdir)/include/uapi
-$(OUTPUT)/vdso_test_chacha: $(top_srcdir)/arch/$(ARCH)/entry/vdso/vgetrandom-chacha.S +$(OUTPUT)/vdso_test_chacha: $(top_srcdir)/tools/arch/$(ARCH)/vdso/vgetrandom-chacha.S $(OUTPUT)/vdso_test_chacha: CFLAGS += -idirafter $(top_srcdir)/tools/include \ -isystem $(top_srcdir)/arch/$(ARCH)/include \ -isystem $(top_srcdir)/include \
On Thu, Aug 22, 2024 at 09:13:21AM +0200, Christophe Leroy wrote:
Architectures use different locations for vDSO sources: arch/mips/vdso arch/sparc/vdso arch/arm64/kernel/vdso arch/riscv/kernel/vdso arch/csky/kernel/vdso arch/x86/um/vdso arch/x86/entry/vdso arch/powerpc/kernel/vdso arch/arm/vdso arch/loongarch/vdso
Don't hard-code vdso sources location in selftest Makefile, instead create a vdso/ symbolic link in tools/arch/$arch/ and update Makefile accordingly.
That seems like a good way of handling it. As archs add their implementations, they can just add the symlink and do no more work. It will also make review slightly easier: if I don't see the symlink added, I'll know the submitter didn't actually test their chacha :-P.
I'll take this into my random.git tree now as a fix.
Jason
Don't hard-code x86-specific names; use the vdso_config definitions to find the correct function matching the architecture.
Add the getrandom VDSO function names to names[][]. Remove the #if defined(VDSO_32BIT) guard: having the name there all the time is harmless and guarantees a steady index for the following strings.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- tools/testing/selftests/vDSO/vdso_config.h | 8 +++----- tools/testing/selftests/vDSO/vdso_test_getrandom.c | 8 ++++++-- 2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/vDSO/vdso_config.h b/tools/testing/selftests/vDSO/vdso_config.h index 00bfed6e4922..740ce8c98d2e 100644 --- a/tools/testing/selftests/vDSO/vdso_config.h +++ b/tools/testing/selftests/vDSO/vdso_config.h @@ -68,16 +68,15 @@ static const char *versions[7] = { "LINUX_5.10" };
-static const char *names[2][6] = { +static const char *names[2][7] = { { "__kernel_gettimeofday", "__kernel_clock_gettime", "__kernel_time", "__kernel_clock_getres", "__kernel_getcpu", -#if defined(VDSO_32BIT) "__kernel_clock_gettime64", -#endif + "__kernel_getrandom", }, { "__vdso_gettimeofday", @@ -85,9 +84,8 @@ static const char *names[2][6] = { "__vdso_time", "__vdso_clock_getres", "__vdso_getcpu", -#if defined(VDSO_32BIT) "__vdso_clock_gettime64", -#endif + "__vdso_getrandom", }, };
diff --git a/tools/testing/selftests/vDSO/vdso_test_getrandom.c b/tools/testing/selftests/vDSO/vdso_test_getrandom.c index 05122425a873..02bcffc23e0c 100644 --- a/tools/testing/selftests/vDSO/vdso_test_getrandom.c +++ b/tools/testing/selftests/vDSO/vdso_test_getrandom.c @@ -21,6 +21,7 @@
#include "../kselftest.h" #include "parse_vdso.h" +#include "vdso_config.h"
#ifndef timespecsub #define timespecsub(tsp, usp, vsp) \ @@ -107,6 +108,9 @@ static void vgetrandom_put_state(void *state)
static void vgetrandom_init(void) { + const char *version = versions[VDSO_VERSION]; + const char **name = (const char **)&names[VDSO_NAMES]; + if (pthread_key_create(&grnd_ctx.key, vgetrandom_put_state) != 0) return; unsigned long sysinfo_ehdr = getauxval(AT_SYSINFO_EHDR); @@ -115,9 +119,9 @@ static void vgetrandom_init(void) exit(KSFT_SKIP); } vdso_init_from_sysinfo_ehdr(sysinfo_ehdr); - grnd_ctx.fn = (__typeof__(grnd_ctx.fn))vdso_sym("LINUX_2.6", "__vdso_getrandom"); + grnd_ctx.fn = (__typeof__(grnd_ctx.fn))vdso_sym(version, name[6]); if (!grnd_ctx.fn) { - printf("__vdso_getrandom is missing!\n"); + printf("%s is missing!\n", name[6]); exit(KSFT_FAIL); } if (grnd_ctx.fn(NULL, 0, 0, &grnd_ctx.params, ~0UL) != 0) {
On Thu, Aug 22, 2024 at 09:13:22AM +0200, Christophe Leroy wrote:
Don't hard-code x86-specific names; use the vdso_config definitions to find the correct function matching the architecture.
Add the getrandom VDSO function names to names[][]. Remove the #if defined(VDSO_32BIT) guard: having the name there all the time is harmless and guarantees a steady index for the following strings.
This is indeed the right way of doing it. Thanks. I'll take this now, though with one small fixup:
- const char *version = versions[VDSO_VERSION];
- const char **name = (const char **)&names[VDSO_NAMES];
I'll just do:
const char *name = names[VDSO_NAMES][6];
Instead of referring to name[6] everywhere after. Seems more straightforward.
Jason
Le 26/08/2024 à 09:28, Jason A. Donenfeld a écrit :
On Thu, Aug 22, 2024 at 09:13:22AM +0200, Christophe Leroy wrote:
Don't hard-code x86-specific names; use the vdso_config definitions to find the correct function for the architecture.
Add the getrandom VDSO function names to names[][]. Remove the #if defined(VDSO_32BIT) guard: having the names there all the time is harmless and guarantees a stable index for the following strings.
This is indeed the right way of doing it. Thanks. I'll take this now, though with one small fixup:
- const char *version = versions[VDSO_VERSION];
- const char **name = (const char **)&names[VDSO_NAMES];
I'll just do:
const char *name = names[VDSO_NAMES][6];
Instead of referring to name[6] everywhere after. Seems more straightforward.
Yes, you are right, that's more straightforward when using only one of the names. I copy-pasted it from the gettimeofday test, which uses several names from the table.
Christophe
Building test_vdso_chacha on powerpc leads to the following issue:
In file included from /home/chleroy/linux-powerpc/include/linux/limits.h:7,
                 from /opt/powerpc64-e5500--glibc--stable-2024.02-1/powerpc64-buildroot-linux-gnu/sysroot/usr/include/bits/local_lim.h:38,
                 from /opt/powerpc64-e5500--glibc--stable-2024.02-1/powerpc64-buildroot-linux-gnu/sysroot/usr/include/bits/posix1_lim.h:161,
                 from /opt/powerpc64-e5500--glibc--stable-2024.02-1/powerpc64-buildroot-linux-gnu/sysroot/usr/include/limits.h:195,
                 from /opt/powerpc64-e5500--glibc--stable-2024.02-1/lib/gcc/powerpc64-buildroot-linux-gnu/12.3.0/include-fixed/limits.h:203,
                 from /opt/powerpc64-e5500--glibc--stable-2024.02-1/lib/gcc/powerpc64-buildroot-linux-gnu/12.3.0/include-fixed/syslimits.h:7,
                 from /opt/powerpc64-e5500--glibc--stable-2024.02-1/lib/gcc/powerpc64-buildroot-linux-gnu/12.3.0/include-fixed/limits.h:34,
                 from /tmp/sodium/usr/local/include/sodium/export.h:7,
                 from /tmp/sodium/usr/local/include/sodium/crypto_stream_chacha20.h:14,
                 from vdso_test_chacha.c:6:
/opt/powerpc64-e5500--glibc--stable-2024.02-1/powerpc64-buildroot-linux-gnu/sysroot/usr/include/bits/xopen_lim.h:99:6: error: missing binary operator before token "("
   99 | # if INT_MAX == 32767
      |      ^~~~~~~
/opt/powerpc64-e5500--glibc--stable-2024.02-1/powerpc64-buildroot-linux-gnu/sysroot/usr/include/bits/xopen_lim.h:102:7: error: missing binary operator before token "("
  102 | # if INT_MAX == 2147483647
      |       ^~~~~~~
/opt/powerpc64-e5500--glibc--stable-2024.02-1/powerpc64-buildroot-linux-gnu/sysroot/usr/include/bits/xopen_lim.h:126:6: error: missing binary operator before token "("
  126 | # if LONG_MAX == 2147483647
      |      ^~~~~~~~
This is because the kernel's include/linux/limits.h gets included instead of libc's limits.h: the include/ directory is added with the -isystem option, so it is searched before glibc's include directory.
Replace -isystem by -idirafter
But this implies that now tools/include/linux/linkage.h is included instead of include/linux/linkage.h, so define a stub for SYM_FUNC_START() and SYM_FUNC_END().
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- tools/include/linux/linkage.h | 4 ++++ tools/testing/selftests/vDSO/Makefile | 4 ++-- 2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/tools/include/linux/linkage.h b/tools/include/linux/linkage.h
index bc763d500262..a48ff086899c 100644
--- a/tools/include/linux/linkage.h
+++ b/tools/include/linux/linkage.h
@@ -1,4 +1,8 @@
 #ifndef _TOOLS_INCLUDE_LINUX_LINKAGE_H
 #define _TOOLS_INCLUDE_LINUX_LINKAGE_H
 
+#define SYM_FUNC_START(x) .globl x; x:
+
+#define SYM_FUNC_END(x)
+
 #endif /* _TOOLS_INCLUDE_LINUX_LINKAGE_H */
diff --git a/tools/testing/selftests/vDSO/Makefile b/tools/testing/selftests/vDSO/Makefile
index c9a819cacbf2..10ffdda3f2fa 100644
--- a/tools/testing/selftests/vDSO/Makefile
+++ b/tools/testing/selftests/vDSO/Makefile
@@ -42,7 +42,7 @@ $(OUTPUT)/vdso_test_getrandom: CFLAGS += -isystem $(top_srcdir)/tools/include \
 
 $(OUTPUT)/vdso_test_chacha: $(top_srcdir)/tools/arch/$(ARCH)/vdso/vgetrandom-chacha.S
 $(OUTPUT)/vdso_test_chacha: CFLAGS += -idirafter $(top_srcdir)/tools/include \
-				      -isystem $(top_srcdir)/arch/$(ARCH)/include \
-				      -isystem $(top_srcdir)/include \
+				      -idirafter $(top_srcdir)/arch/$(ARCH)/include \
+				      -idirafter $(top_srcdir)/include \
				      -D__ASSEMBLY__ -DBULID_VDSO -DCONFIG_FUNCTION_ALIGNMENT=0 \
				      -Wa,--noexecstack $(SODIUM)
On Thu, Aug 22, 2024 at 09:13:23AM +0200, Christophe Leroy wrote:
Replace -isystem by -idirafter
But this implies that now tools/include/linux/linkage.h is included instead of include/linux/linkage.h, so define a stub for SYM_FUNC_START() and SYM_FUNC_END().
Thanks! I got step 1, but didn't figure out step 2, and tried to do things the lazy way instead:
-#include <sodium/crypto_stream_chacha20.h>
 #include <sys/random.h>
 #include <string.h>
 #include <stdint.h>
 #include "../kselftest.h"
 
 extern void __arch_chacha20_blocks_nostack(uint8_t *dst_bytes, const uint8_t *key, uint32_t *counter, size_t nblocks);
+extern int crypto_stream_chacha20(uint8_t *dst_bytes, uint64_t dst_len, const uint8_t *nonce, const uint8_t *key);
Yours is obviously much better, so I'll queue that up instead.
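For context, the check that the chacha selftest performs once it builds boils down to comparing the kernel's assembly block function against libsodium's reference ChaCha20. A minimal standalone sketch of that idea (an assumption, not the actual selftest source; it presumes vgetrandom-chacha.S is linked in as the Makefile rule above arranges, and uses a zero key and nonce purely for illustration):

	#include <sodium/crypto_stream_chacha20.h>
	#include <stdint.h>
	#include <string.h>
	#include <assert.h>

	extern void __arch_chacha20_blocks_nostack(uint8_t *dst_bytes, const uint8_t *key,
						   uint32_t *counter, size_t nblocks);

	int main(void)
	{
		uint8_t key[32] = { 0 }, nonce[8] = { 0 };
		uint32_t counter[2] = { 0 };	/* 64-bit block counter, starts at 0 */
		uint8_t out_vdso[64], out_ref[64];

		/* One 64-byte block from the kernel implementation... */
		__arch_chacha20_blocks_nostack(out_vdso, key, counter, 1);
		/* ...must match libsodium's keystream for the same key. */
		crypto_stream_chacha20(out_ref, sizeof(out_ref), nonce, key);
		assert(!memcmp(out_vdso, out_ref, sizeof(out_vdso)));
		return 0;
	}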
On powerpc, a call to a VDSO function is not a standard C function call. Unlike x86, which returns a negated error code on failure, powerpc sets CR[SO] and returns the error code as a positive value.
So use a macro called VDSO_CALL() which takes a pointer to the function to call, the number of arguments, and the arguments themselves.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- tools/testing/selftests/vDSO/vdso_call.h | 12 ++++++++++++ tools/testing/selftests/vDSO/vdso_test_getrandom.c | 3 ++- 2 files changed, 14 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/vDSO/vdso_call.h
diff --git a/tools/testing/selftests/vDSO/vdso_call.h b/tools/testing/selftests/vDSO/vdso_call.h
new file mode 100644
index 000000000000..ca5db2220925
--- /dev/null
+++ b/tools/testing/selftests/vDSO/vdso_call.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Macro to call vDSO functions
+ *
+ * Copyright (C) 2024 Christophe Leroy christophe.leroy@csgroup.eu, CS GROUP France
+ */
+#ifndef __VDSO_CALL_H__
+#define __VDSO_CALL_H__
+
+#define VDSO_CALL(fn, nr, args...) fn(args)
+
+#endif
diff --git a/tools/testing/selftests/vDSO/vdso_test_getrandom.c b/tools/testing/selftests/vDSO/vdso_test_getrandom.c
index 02bcffc23e0c..16ad400721c3 100644
--- a/tools/testing/selftests/vDSO/vdso_test_getrandom.c
+++ b/tools/testing/selftests/vDSO/vdso_test_getrandom.c
@@ -22,6 +22,7 @@
 #include "../kselftest.h"
 #include "parse_vdso.h"
 #include "vdso_config.h"
+#include "vdso_call.h"
 
 #ifndef timespecsub
 #define timespecsub(tsp, usp, vsp)				\
@@ -147,7 +148,7 @@ static ssize_t vgetrandom(void *buf, size_t len, unsigned long flags)
 			exit(KSFT_FAIL);
 		}
 	}
-	return grnd_ctx.fn(buf, len, flags, state, grnd_ctx.params.size_of_opaque_state);
+	return VDSO_CALL(grnd_ctx.fn, 5, buf, len, flags, state, grnd_ctx.params.size_of_opaque_state);
 }
 
 enum { TRIALS = 25000000, THREADS = 256 };
On Thu, Aug 22, 2024 at 09:13:24AM +0200, Christophe Leroy wrote:
On powerpc, a call to a VDSO function is not a standard C function call. Unlike x86, which returns a negated error code on failure, powerpc sets CR[SO] and returns the error code as a positive value.
So use a macro called VDSO_CALL() which takes a pointer to the function to call, the number of arguments, and the arguments themselves.
You'll probably want to move to VDSO_CALL() for the whole test suite, not just the getrandom one, right?
Le 26/08/2024 à 09:37, Jason A. Donenfeld a écrit :
On Thu, Aug 22, 2024 at 09:13:24AM +0200, Christophe Leroy wrote:
On powerpc, a call to a VDSO function is not a standard C function call. Unlike x86, which returns a negated error code on failure, powerpc sets CR[SO] and returns the error code as a positive value.
So use a macro called VDSO_CALL() which takes a pointer to the function to call, the number of arguments, and the arguments themselves.
You'll probably want to move to VDSO_CALL() for the whole test suite, not just the getrandom one, right?
Yes indeed, the following need it as well:

	vdso_test_abi.c
	vdso_test_getcpu.c
	vdso_test_gettimeofday.c
	vdso_test_correctness.c
Christophe
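A minimal sketch of what such a conversion could look like for one of the tests listed above, using vdso_test_gettimeofday.c as the example (this assumes a gtod function pointer resolved via vdso_sym() as in that test, and the powerpc side would also need a LOADARGS_2 helper alongside LOADARGS_5, since only the five-argument form exists in this version of vdso_call.h):

	#include <sys/time.h>
	#include "vdso_call.h"

	typedef long (*gtod_t)(struct timeval *tv, struct timezone *tz);
	static gtod_t gtod;	/* resolved elsewhere via vdso_sym() */

	static long sample_gettimeofday(struct timeval *tv)
	{
		/* was: return gtod(tv, NULL); */
		return VDSO_CALL(gtod, 2, tv, NULL);
	}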
Add the necessary symbolic link and tell the Makefile to build vdso_test_getrandom for powerpc.
In the Makefile, don't use $(uname_M), which is wrong when cross-building for powerpc on an x86_64 host.
Implement the required VDSO_CALL macro to correctly handle errors.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu --- tools/arch/powerpc/vdso | 1 + tools/testing/selftests/vDSO/Makefile | 6 ++++ tools/testing/selftests/vDSO/vdso_call.h | 40 ++++++++++++++++++++++++ 3 files changed, 47 insertions(+) create mode 120000 tools/arch/powerpc/vdso
diff --git a/tools/arch/powerpc/vdso b/tools/arch/powerpc/vdso
new file mode 120000
index 000000000000..d31004bf8f55
--- /dev/null
+++ b/tools/arch/powerpc/vdso
@@ -0,0 +1 @@
+../../../arch/powerpc/kernel/vdso/
\ No newline at end of file
diff --git a/tools/testing/selftests/vDSO/Makefile b/tools/testing/selftests/vDSO/Makefile
index 10ffdda3f2fa..7e7c9fd200d3 100644
--- a/tools/testing/selftests/vDSO/Makefile
+++ b/tools/testing/selftests/vDSO/Makefile
@@ -17,6 +17,12 @@ ifneq ($(SODIUM),)
 TEST_GEN_PROGS += vdso_test_chacha
 endif
 endif
+ifeq ($(ARCH),powerpc)
+TEST_GEN_PROGS += vdso_test_getrandom
+ifneq ($(SODIUM),)
+TEST_GEN_PROGS += vdso_test_chacha
+endif
+endif
 
 CFLAGS := -std=gnu99
 
diff --git a/tools/testing/selftests/vDSO/vdso_call.h b/tools/testing/selftests/vDSO/vdso_call.h
index ca5db2220925..2a33c25756dc 100644
--- a/tools/testing/selftests/vDSO/vdso_call.h
+++ b/tools/testing/selftests/vDSO/vdso_call.h
@@ -7,6 +7,46 @@
 #ifndef __VDSO_CALL_H__
 #define __VDSO_CALL_H__
 
+#ifdef __powerpc__
+
+#define LOADARGS_5(fn, __arg1, __arg2, __arg3, __arg4, __arg5) do {	\
+	_r0 = fn;							\
+	_r3 = (long)__arg1;						\
+	_r4 = (long)__arg2;						\
+	_r5 = (long)__arg3;						\
+	_r6 = (long)__arg4;						\
+	_r7 = (long)__arg5;						\
+} while (0)
+
+#define VDSO_CALL(fn, nr, args...) ({					\
+	register void *_r0 asm ("r0");					\
+	register long _r3 asm ("r3");					\
+	register long _r4 asm ("r4");					\
+	register long _r5 asm ("r5");					\
+	register long _r6 asm ("r6");					\
+	register long _r7 asm ("r7");					\
+	register long _r8 asm ("r8");					\
+	register long _rval asm ("r3");					\
+									\
+	LOADARGS_##nr(fn, args);					\
+									\
+	asm volatile(							\
+		"	mtctr %0\n"					\
+		"	bctrl\n"					\
+		"	bns+ 1f\n"					\
+		"	neg 3, 3\n"					\
+		"1:"							\
+		: "+r" (_r0), "=r" (_r3), "+r" (_r4), "+r" (_r5),	\
+		  "+r" (_r6), "+r" (_r7), "+r" (_r8)			\
+		: "r" (_rval)						\
+		: "r9", "r10", "r11", "r12", "cr0", "cr1", "cr5",	\
+		  "cr6", "cr7", "xer", "lr", "ctr", "memory"		\
+	);								\
+	_rval;								\
+})
+
+#else
 #define VDSO_CALL(fn, nr, args...) fn(args)
+#endif
 
 #endif
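For readability, here is an illustrative C equivalent (an assumption for explanation only, not code from the patch) of what the trailing "bns+ 1f; neg 3, 3" sequence does with the value the vDSO function leaves in r3:

	#include <stdbool.h>

	/* If CR0[SO] was set by the vDSO function, the positive error code
	 * in r3 is negated, so callers see the usual negative-errno values,
	 * matching the x86 convention. */
	static inline long ppc_vdso_fixup_return(long r3, bool so_set)
	{
		return so_set ? -r3 : r3;
	}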
Hi Christophe,
Thanks for this series. There are quite a few preliminary patches in it, before you get to the PPC part, which fix up general build system or test harness correctness issues. Since some of those affect all architectures that are adding vDSO getrandom() support for 6.12, I'm going to take those into my random.git tree as a fix for 6.11 now, in hopes that the new archs can mostly go into arch trees without too many tree interdependencies.
So I'll reply to individual patches for that, mentioning which ones I extract.
Jason
On Mon, Aug 26, 2024 at 09:19:22AM +0200, Jason A. Donenfeld wrote:
Hi Christophe,
Thanks for this series. There are quite a few preliminary patches in it, before you get to the PPC part, which fix up general build system or test harness correctness issues. Since some of those affect all architectures that are adding vDSO getrandom() support for 6.12, I'm going to take those into my random.git tree as a fix for 6.11 now, in hopes that the new archs can mostly go into arch trees without too many tree interdependencies.
So I'll reply to individual patches for that, mentioning which ones I extract.
I've committed a bunch of these to:
https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git/log/
For a v3, if you rebase on that tree, it will be easier for me to keep picking the patches I intend to send out for 6.11-rc6 later this week. And then hopefully your 6.12 ppc implementation can just go in via the ppc tree with my eventual ack on the crypto part, without needing these interdependencies.
Jason