From: Yonghong Song <yonghong.song(a)linux.dev>
[ Upstream commit 100888fb6d8a185866b1520031ee7e3182b173de ]
With latest clang18 (main branch of llvm-project repo), when building bpf selftests,
[~/work/bpf-next (master)]$ make -C tools/testing/selftests/bpf LLVM=1 -j
The following compilation error happens:
fatal error: error in backend: Branch target out of insn range
...
Stack dump:
0. Program arguments: clang -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
-I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include
-I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi
-I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter
/home/yhs/work/llvm-project/llvm/build.18/install/lib/clang/18/include -idirafter /usr/local/include
-idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 --target=bpf
-c progs/pyperf180.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/pyperf180.bpf.o
1. <eof> parser at end of file
2. Code generation
...
The compilation failure only happens with cpu=v2 and cpu=v3. cpu=v4 is okay
since cpu=v4 supports 32-bit branch target offsets.
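To make the range limit concrete, here is a minimal sketch in Python (not part of the patch). It assumes the branch offset is a signed field of the stated width; the helper names and the sample offsets are hypothetical and chosen only for illustration.

```python
def fits_in_signed(offset: int, bits: int) -> bool:
    """Return True if `offset` is representable as a signed `bits`-wide integer."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= offset <= hi


def branch_encodable(offset: int, cpu_version: int) -> bool:
    """Check a branch offset against the width available for a BPF cpu version.

    Per the commit message, cpu=v4 supports 32-bit branch target offsets,
    while earlier versions are limited to a 16-bit representation.
    """
    bits = 32 if cpu_version >= 4 else 16
    return fits_in_signed(offset, bits)


# Hypothetical offset produced by a heavily unrolled loop.
long_jump = 40000
print(branch_encodable(long_jump, 3))  # does not fit in 16 bits
print(branch_encodable(long_jump, 4))  # fits in 32 bits
```

A shorter unrolled body keeps the distances between a branch and its target inside the 16-bit range, which is what the reduced unroll count aims for.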
The above failure is due to upstream llvm patch [1], which changed some inlining
behavior in clang18.
Previously, all 180 loop iterations were fully unrolled. To work around the issue,
the unroll count is now reduced. The bpf macro __BPF_CPU_VERSION__ (implemented in
clang18 recently) is used to avoid unrolling changes if cpu=v4. If
__BPF_CPU_VERSION__ is not available and the compiler is clang18, the unrolling
amount is unconditionally reduced.
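The selection rule can be modeled as a small decision function. This is only an illustrative Python sketch of the preprocessor logic in the diff below, not code from the patch: the function name is hypothetical, `None` stands for "macro not defined", and a `None` result means UNROLL_COUNT is left undefined.

```python
from typing import Optional

REDUCED_UNROLL_COUNT = 90  # the value the patch assigns to UNROLL_COUNT


def select_unroll_count(bpf_cpu_version: Optional[int],
                        clang_major: int) -> Optional[int]:
    """Model the #ifdef/#elif chain added to pyperf180.c.

    bpf_cpu_version is None when __BPF_CPU_VERSION__ is not defined.
    A None result means UNROLL_COUNT stays undefined.
    """
    if bpf_cpu_version is not None:      # #ifdef __BPF_CPU_VERSION__
        if bpf_cpu_version < 4:          # #if __BPF_CPU_VERSION__ < 4
            return REDUCED_UNROLL_COUNT
        return None
    if clang_major == 18:                # #elif __clang_major__ == 18
        return REDUCED_UNROLL_COUNT
    return None


print(select_unroll_count(3, 18))     # macro available, cpu=v3
print(select_unroll_count(4, 18))     # macro available, cpu=v4
print(select_unroll_count(None, 18))  # clang18 without the macro
```

Note that the fallback branch cannot tell cpu=v4 apart from older versions, so a clang18 build without the macro gets the reduced count even when it would not strictly be needed.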
[1] https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a5…
Signed-off-by: Yonghong Song <yonghong.song(a)linux.dev>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Tested-by: Alan Maguire <alan.maguire(a)oracle.com>
Link: https://lore.kernel.org/bpf/20231110193644.3130906-1-yonghong.song@linux.dev
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/bpf/progs/pyperf180.c | 22 +++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/pyperf180.c b/tools/testing/selftests/bpf/progs/pyperf180.c
index c39f559d3100..42c4a8b62e36 100644
--- a/tools/testing/selftests/bpf/progs/pyperf180.c
+++ b/tools/testing/selftests/bpf/progs/pyperf180.c
@@ -1,4 +1,26 @@
// SPDX-License-Identifier: GPL-2.0
// Copyright (c) 2019 Facebook
#define STACK_MAX_LEN 180
+
+/* llvm upstream commit at clang18
+ * https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a5…
+ * changed inlining behavior and caused compilation failure as some branch
+ * target distance exceeded 16bit representation which is the maximum for
+ * cpu v1/v2/v3. Macro __BPF_CPU_VERSION__ is later implemented in clang18
+ * to specify which cpu version is used for compilation. So a smaller
+ * unroll_count can be set if __BPF_CPU_VERSION__ is less than 4, which
+ * reduced some branch target distances and resolved the compilation failure.
+ *
+ * To capture the case where a developer/ci uses clang18 but the corresponding
+ * repo checkpoint does not have __BPF_CPU_VERSION__, a smaller unroll_count
+ * will be set as well to prevent potential compilation failures.
+ */
+#ifdef __BPF_CPU_VERSION__
+#if __BPF_CPU_VERSION__ < 4
+#define UNROLL_COUNT 90
+#endif
+#elif __clang_major__ == 18
+#define UNROLL_COUNT 90
+#endif
+
#include "pyperf.h"
--
2.43.0
Rae has been shouldering a lot of the KUnit review burden for the last
year, and will continue to do so in the future. Thanks!
Signed-off-by: David Gow <davidgow(a)google.com>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index f8efcb72ad4b..2316d89806dd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11599,6 +11599,7 @@ F: fs/smb/server/
KERNEL UNIT TESTING FRAMEWORK (KUnit)
M: Brendan Higgins <brendanhiggins(a)google.com>
M: David Gow <davidgow(a)google.com>
+R: Rae Moar <rmoar(a)google.com>
L: linux-kselftest(a)vger.kernel.org
L: kunit-dev(a)googlegroups.com
S: Maintained
--
2.43.0.275.g3460e3d667-goog
A number of tests fail when netdev renaming is active
on the system. Add udevadm settle to the logic that determines
the names.
Fixes: 242aaf03dc9b ("selftests: add a test for ethtool pause stats")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
CC: shuah(a)kernel.org
CC: saeedm(a)nvidia.com
CC: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/drivers/net/netdevsim/ethtool-common.sh | 1 +
tools/testing/selftests/drivers/net/netdevsim/udp_tunnel_nic.sh | 1 +
2 files changed, 2 insertions(+)
diff --git a/tools/testing/selftests/drivers/net/netdevsim/ethtool-common.sh b/tools/testing/selftests/drivers/net/netdevsim/ethtool-common.sh
index 922744059aaa..80160579e0cc 100644
--- a/tools/testing/selftests/drivers/net/netdevsim/ethtool-common.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/ethtool-common.sh
@@ -51,6 +51,7 @@ function make_netdev {
fi
echo $NSIM_ID $@ > /sys/bus/netdevsim/new_device
+ udevadm settle
# get new device name
ls /sys/bus/netdevsim/devices/netdevsim${NSIM_ID}/net/
}
diff --git a/tools/testing/selftests/drivers/net/netdevsim/udp_tunnel_nic.sh b/tools/testing/selftests/drivers/net/netdevsim/udp_tunnel_nic.sh
index 1b08e042cf94..4855ef597a15 100755
--- a/tools/testing/selftests/drivers/net/netdevsim/udp_tunnel_nic.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/udp_tunnel_nic.sh
@@ -233,6 +233,7 @@ function print_tables {
function get_netdev_name {
local -n old=$1
+ udevadm settle
new=$(ls /sys/class/net)
for netdev in $new; do
--
2.43.0
Nested translation is a hardware feature that is supported by many modern
IOMMUs. It performs two stages of address translation (stage-1, stage-2)
to reach the physical address. The stage-1 translation table is owned
by userspace (e.g. by a guest OS), while stage-2 is owned by the kernel. Changes
to the stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example: the stage-1 translation table is the I/O page
table. As the diagram below shows, the guest I/O page table pointer in GPA
(guest physical address) is passed to the host and used to perform the stage-1
address translation. Along with it, modifications to present mappings in the
guest I/O page table should be followed by an IOTLB invalidation.
    .-------------.  .---------------------------.
    |   vIOMMU    |  | Guest I/O page table      |
    |             |  '---------------------------'
    .----------------/
    | PASID Entry |--- PASID cache flush --+
    '-------------'                        |
    |             |                        V
    |             |           I/O page table pointer in GPA
    '-------------'
    Guest
    ------| Shadow |---------------------------|--------
          v        v                           v
    Host
    .-------------.  .------------------------.
    |   pIOMMU    |  |  FS for GIOVA->GPA     |
    |             |  '------------------------'
    .----------------/  |
    | PASID Entry |     V (Nested xlate)
    '----------------\.----------------------------------.
    |             |   | SS for GPA->HPA, unmanaged domain|
    |             |   '----------------------------------'
    '-------------'
    Where:
     - FS = First stage page tables
     - SS = Second stage page tables
    <Intel VT-d Nested translation>
This series is based on the first part, which was merged [1]. It adds
the cache invalidation interface for userspace to invalidate the cache after
modifying the stage-1 page table. This includes both the iommufd changes and the
VT-d driver changes.
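Why an invalidation must follow a stage-1 change can be seen in a toy model. The Python sketch below is purely conceptual and is not the iommufd or VT-d interface: the class, its methods and the addresses are all hypothetical, and the cache is keyed per address only to keep the example short.

```python
class NestedTranslationModel:
    """Toy model of nested translation with a translation cache.

    stage1 maps GIOVA -> GPA (owned by the guest),
    stage2 maps GPA -> HPA (owned by the host kernel),
    iotlb caches the combined GIOVA -> HPA result.
    """

    def __init__(self, stage1, stage2):
        self.stage1 = dict(stage1)
        self.stage2 = dict(stage2)
        self.iotlb = {}

    def translate(self, giova):
        # Serve from the cache when possible, as hardware would.
        if giova in self.iotlb:
            return self.iotlb[giova]
        hpa = self.stage2[self.stage1[giova]]
        self.iotlb[giova] = hpa
        return hpa

    def guest_update_stage1(self, giova, gpa):
        # The guest edits its own table; the cache is NOT touched.
        self.stage1[giova] = gpa

    def invalidate(self, giova=None):
        # Drop one cached entry, or everything when no address is given.
        if giova is None:
            self.iotlb.clear()
        else:
            self.iotlb.pop(giova, None)


model = NestedTranslationModel({0x1000: 0xA000},
                               {0xA000: 0x50000, 0xB000: 0x60000})
first = model.translate(0x1000)
model.guest_update_stage1(0x1000, 0xB000)
stale = model.translate(0x1000)   # still the old mapping
model.invalidate(0x1000)
fresh = model.translate(0x1000)   # reflects the updated stage-1 entry
print(hex(first), hex(stale), hex(fresh))
```

Until the invalidation runs, the lookup keeps returning the old host address, which is the situation the new interface lets userspace resolve.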
Complete code can be found in [2]; the QEMU code can be found in [3].
Finally, this is teamwork together with Nicolin Chen and Lu Baolu. Thanks to
them for the help. ^_^. Looking forward to your feedback.
[1] https://lore.kernel.org/linux-iommu/20231026044216.64964-1-yi.l.liu@intel.c… - merged
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v11:
- Drop hw_error field in vtd cache invalidation uapi. devTLB invalidation
error is a serious security emergency requiring the host kernel to handle.
No need to expose it to userspace (especially given existing VMs don't
issue devTLB invalidation at all).
- The vtd qi_submit_sync() and related callers are reverted back to the
original state due to the above drop.
- To align with the vtd path, drop the hw_error reporting in the mock driver and
selftest as well, since the selftest is a demo of the real driver.
- Drop iommu_respond_struct_to_user_array() since no driver wants to
respond to a single entry in the user_array any more.
- Two typos from Wubinbin
v10: https://lore.kernel.org/all/20240102143834.146165-1-yi.l.liu@intel.com/
- Minor tweak to patch 07 (Kevin)
- Rebase on top of 6.7-rc8
v9: https://lore.kernel.org/linux-iommu/20231228150629.13149-1-yi.l.liu@intel.c…
- Add a test case which sets both IOMMU_TEST_INVALIDATE_FLAG_ALL and
IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR in flags, and expect to succeed
and see an 'error'. (Kevin)
- Return -ETIMEDOUT in qi_check_fault() if the caller is interested in the
fault when a timeout happens. If not, qi_submit_sync() will keep retrying and
hence be unable to report the error back to the user. For now, only the user cache
invalidation path is interested in the timeout error, so this change only
affects the user cache invalidation path. Other paths will still hang in
qi_submit_sync() when a timeout happens. (Kevin)
v8: https://lore.kernel.org/linux-iommu/20231227161354.67701-1-yi.l.liu@intel.c…
- Pass invalidation hint to the cache invalidation helper in the cache_invalidate_user
op path (Kevin)
- Move the devTLB invalidation out of info->iommu loop (Kevin, Weijiang)
- Clear *fault per restart in qi_submit_sync() to avoid error accumulation
across submissions. (Kevin)
- Define the vtd cache invalidation uapi structure in separate patch (Kevin)
- Rename inv_error to be hw_error (Kevin)
- Rename 'reqs_uptr', 'req_type', 'req_len' and 'req_num' to be 'data_uptr',
'data_type', 'entry_len' and 'entry_num' (Kevin)
- Allow user to set IOMMU_TEST_INVALIDATE_FLAG_ALL and IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR
at the same time (Kevin)
v7: https://lore.kernel.org/linux-iommu/20231221153948.119007-1-yi.l.liu@intel.…
- Remove domain->ops->cache_invalidate_user check in hwpt alloc path due
to failure in bisect (Baolu)
- Remove out_driver_error_code from struct iommu_hwpt_invalidate after
discussion in v6. Should expect per-entry error code.
- Rework the selftest cache invalidation part to report a per-entry error
- Allow user to pass in an empty array to have a try-and-fail mechanism for
user to check if a given req_type is supported by the kernel (Jason)
- Define a separate enum type for cache invalidation data (Jason)
- Fix the IOMMU_HWPT_INVALIDATE to always update the req_num field before
returning (Nicolin)
- Merge the VT-d nesting part 2/2
https://lore.kernel.org/linux-iommu/20231117131816.24359-1-yi.l.liu@intel.c…
into this series to avoid defining empty enum in the middle of the series.
The major difference is adding the VT-d related invalidation uapi structures
together with the generic data structures in patch 02 of this series.
- VT-d driver was refined to report ICE/ITE errors from the bottom cache
invalidation submit helpers, hence the cache_invalidate_user op could
report such errors via the per-entry error field to the user. The VT-d driver
will not stop walking the invalidation array on ICE/ITE errors:
such errors are defined by the VT-d spec, so userspace should be able to
handle them and let the real user (say a virtual machine) know about them.
Other errors, like invalid uapi data structure configuration or
memory copy failure, should stop the array walk, as continuing
may cause more issues.
- Minor fixes per Jason and Kevin's review comments
v6: https://lore.kernel.org/linux-iommu/20231117130717.19875-1-yi.l.liu@intel.c…
- Not much change, just rebase on top of 6.7-rc1 as part 1/2 is merged
v5: https://lore.kernel.org/linux-iommu/20231020092426.13907-1-yi.l.liu@intel.c…
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation,
this does not change the rule that resv regions should only be added to the
kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series
as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (2):
iommu: Add cache_invalidate_user op
iommu/vt-d: Add iotlb flush for nested domain
Nicolin Chen (4):
iommu: Add iommu_copy_struct_from_user_array helper
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (2):
iommufd: Add IOMMU_HWPT_INVALIDATE
iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
drivers/iommu/intel/nested.c | 88 ++++++++++
drivers/iommu/iommufd/hw_pagetable.c | 41 +++++
drivers/iommu/iommufd/iommufd_private.h | 10 ++
drivers/iommu/iommufd/iommufd_test.h | 23 +++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 76 +++++++++
include/linux/iommu.h | 79 +++++++++
include/uapi/linux/iommufd.h | 79 +++++++++
tools/testing/selftests/iommu/iommufd.c | 152 ++++++++++++++++++
tools/testing/selftests/iommu/iommufd_utils.h | 57 +++++++
10 files changed, 608 insertions(+)
--
2.34.1
From: Rae Moar <rmoar(a)google.com>
[ Upstream commit 8ae27bc7fff4ef467a7964821a6cedb34a05d3b2 ]
Add parsing of attributes as diagnostic data. Fixes issue with test plan
being parsed incorrectly as diagnostic data when located after
suite-level attributes.
Note that if there does not exist a test plan line, the diagnostic lines
between the suite header and the first result will be saved in the suite
log rather than the first test case log.
Signed-off-by: Rae Moar <rmoar(a)google.com>
Reviewed-by: David Gow <davidgow(a)google.com>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/kunit/kunit_parser.py | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
index 79d8832c862a..ce34be15c929 100644
--- a/tools/testing/kunit/kunit_parser.py
+++ b/tools/testing/kunit/kunit_parser.py
@@ -450,7 +450,7 @@ def parse_diagnostic(lines: LineStream) -> List[str]:
Log of diagnostic lines
"""
log = [] # type: List[str]
- non_diagnostic_lines = [TEST_RESULT, TEST_HEADER, KTAP_START, TAP_START]
+ non_diagnostic_lines = [TEST_RESULT, TEST_HEADER, KTAP_START, TAP_START, TEST_PLAN]
while lines and not any(re.match(lines.peek())
for re in non_diagnostic_lines):
log.append(lines.pop())
@@ -726,6 +726,7 @@ def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest:
# test plan
test.name = "main"
ktap_line = parse_ktap_header(lines, test)
+ test.log.extend(parse_diagnostic(lines))
parse_test_plan(lines, test)
parent_test = True
else:
@@ -737,6 +738,7 @@ def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest:
if parent_test:
# If KTAP version line and/or subtest header is found, attempt
# to parse test plan and print test header
+ test.log.extend(parse_diagnostic(lines))
parse_test_plan(lines, test)
print_test_header(test)
expected_count = test.expected_count
--
2.43.0
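For readers who want to see the effect of adding TEST_PLAN to the stop
list, here is a minimal sketch in Python. It is not the kunit_parser.py
code: the regular expressions are simplified stand-ins, a plain list
replaces LineStream, and the stop_at_plan switch and the
remaining_after_diagnostics() helper are hypothetical names introduced
only for this illustration.

```python
import re
from typing import List, Tuple

# Simplified stand-ins for the patterns in kunit_parser.py (assumed, not copied).
TEST_RESULT = re.compile(r'^\s*(ok|not ok) [0-9]+')
TEST_HEADER = re.compile(r'^\s*# Subtest: ')
KTAP_START = re.compile(r'^\s*KTAP version [0-9]+$')
TAP_START = re.compile(r'^\s*TAP version [0-9]+$')
TEST_PLAN = re.compile(r'^\s*1\.\.[0-9]+')


def parse_diagnostic(lines: List[str], stop_at_plan: bool = True) -> List[str]:
    """Pop leading lines until one matches a non-diagnostic pattern."""
    stops = [TEST_RESULT, TEST_HEADER, KTAP_START, TAP_START]
    if stop_at_plan:
        # The behaviour after the patch: a test plan ends the diagnostics.
        stops.append(TEST_PLAN)
    log = []
    while lines and not any(p.match(lines[0]) for p in stops):
        log.append(lines.pop(0))
    return log


def remaining_after_diagnostics(stop_at_plan: bool) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: a suite-level attribute followed by a test plan."""
    lines = ["# module: example", "1..2", "ok 1 test_a", "ok 2 test_b"]
    log = parse_diagnostic(lines, stop_at_plan)
    return log, lines
```

With stop_at_plan=False the "1..2" line is swallowed into the
diagnostic log, so a later parse_test_plan() call would no longer see
it. With stop_at_plan=True the attribute line goes to the log and the
plan stays at the head of the stream.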
From: Thomas Weißschuh <linux(a)weissschuh.net>
[ Upstream commit bdeeeaba83682225a7bf5f100fe8652a59590d33 ]
qemu for LoongArch does not work properly with direct kernel boot.
The kernel will panic during initialization and hang without any output.
When booting in EFI mode everything works correctly.
While users most likely don't have the LoongArch EFI binary installed,
at least an explicit error about 'file not found' is better than a
hanging test without output that can never succeed.
Link: https://lore.kernel.org/loongarch/1738d60a-df3a-4102-b1da-d16a29b6e06a@t-8c…
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Acked-by: Willy Tarreau <w(a)1wt.eu>
Link: https://lore.kernel.org/r/20231031-nolibc-out-of-tree-v1-1-47c92f73590a@wei…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/nolibc/Makefile | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
index dfe66776a331..be7711014ade 100644
--- a/tools/testing/selftests/nolibc/Makefile
+++ b/tools/testing/selftests/nolibc/Makefile
@@ -88,6 +88,13 @@ QEMU_ARCH_s390 = s390x
QEMU_ARCH_loongarch = loongarch64
QEMU_ARCH = $(QEMU_ARCH_$(XARCH))
+QEMU_BIOS_DIR = /usr/share/edk2/
+QEMU_BIOS_loongarch = $(QEMU_BIOS_DIR)/loongarch64/OVMF_CODE.fd
+
+ifneq ($(QEMU_BIOS_$(XARCH)),)
+QEMU_ARGS_BIOS = -bios $(QEMU_BIOS_$(XARCH))
+endif
+
# QEMU_ARGS : some arch-specific args to pass to qemu
QEMU_ARGS_i386 = -M pc -append "console=ttyS0,9600 i8042.noaux panic=-1 $(TEST:%=NOLIBC_TEST=%)"
QEMU_ARGS_x86_64 = -M pc -append "console=ttyS0,9600 i8042.noaux panic=-1 $(TEST:%=NOLIBC_TEST=%)"
@@ -101,7 +108,7 @@ QEMU_ARGS_ppc64le = -M powernv -append "console=hvc0 panic=-1 $(TEST:%=NOLIBC
QEMU_ARGS_riscv = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
QEMU_ARGS_s390 = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
QEMU_ARGS_loongarch = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
-QEMU_ARGS = $(QEMU_ARGS_$(XARCH)) $(QEMU_ARGS_EXTRA)
+QEMU_ARGS = $(QEMU_ARGS_$(XARCH)) $(QEMU_ARGS_BIOS) $(QEMU_ARGS_EXTRA)
# OUTPUT is only set when run from the main makefile, otherwise
# it defaults to this nolibc directory.
--
2.43.0
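The conditional in the Makefile hunk can be modelled in a few lines of
Python. This is only a sketch of the variable logic, not part of the
patch: the qemu_args() function and the QEMU_BIOS dictionary are
hypothetical names, and the string joining is tidier than what make
does (make would leave an extra space where an empty variable expands).

```python
# Mirrors QEMU_BIOS_DIR and QEMU_BIOS_loongarch from the Makefile hunk.
QEMU_BIOS_DIR = "/usr/share/edk2/"
QEMU_BIOS = {"loongarch": QEMU_BIOS_DIR + "/loongarch64/OVMF_CODE.fd"}


def qemu_args(xarch: str, arch_args: str, extra: str = "") -> str:
    """Build the QEMU argument string for one architecture."""
    bios = QEMU_BIOS.get(xarch, "")
    # Equivalent of: ifneq ($(QEMU_BIOS_$(XARCH)),)
    bios_args = f"-bios {bios}" if bios else ""
    return " ".join(part for part in (arch_args, bios_args, extra) if part)
```

Only architectures with a QEMU_BIOS entry get a -bios argument; all
others are unchanged. The trailing slash in QEMU_BIOS_DIR produces a
double slash in the resulting path, which is harmless on Linux.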
From: Michal Wajdeczko <michal.wajdeczko(a)intel.com>
[ Upstream commit 342fb9789267ee3908959bfa136b82e88e2ce918 ]
If we run a parameterized test that uses test->priv to prepare some
custom data, then the value of test->priv will leak into the next param
iteration and may be unexpected. This can easily be seen if
we promote example_priv_test to a parameterized test, as then only
the first test iteration will be successful:
$ ./tools/testing/kunit/kunit.py run \
--kunitconfig ./lib/kunit/.kunitconfig *.example_priv*
[ ] Starting KUnit Kernel (1/1)...
[ ] ============================================================
[ ] =================== example (1 subtest) ====================
[ ] ==================== example_priv_test ====================
[ ] [PASSED] example value 3
[ ] # example_priv_test: initializing
[ ] # example_priv_test: ASSERTION FAILED at lib/kunit/kunit-example-test.c:230
[ ] Expected test->priv == ((void *)0), but
[ ] test->priv == 0000000060dfe290
[ ] ((void *)0) == 0000000000000000
[ ] # example_priv_test: cleaning up
[ ] [FAILED] example value 2
[ ] # example_priv_test: initializing
[ ] # example_priv_test: ASSERTION FAILED at lib/kunit/kunit-example-test.c:230
[ ] Expected test->priv == ((void *)0), but
[ ] test->priv == 0000000060dfe290
[ ] ((void *)0) == 0000000000000000
[ ] # example_priv_test: cleaning up
[ ] [FAILED] example value 1
[ ] # example_priv_test: initializing
[ ] # example_priv_test: ASSERTION FAILED at lib/kunit/kunit-example-test.c:230
[ ] Expected test->priv == ((void *)0), but
[ ] test->priv == 0000000060dfe290
[ ] ((void *)0) == 0000000000000000
[ ] # example_priv_test: cleaning up
[ ] [FAILED] example value 0
[ ] # example_priv_test: initializing
[ ] # example_priv_test: cleaning up
[ ] # example_priv_test: pass:1 fail:3 skip:0 total:4
[ ] ================ [FAILED] example_priv_test ================
[ ] # example: initializing suite
[ ] # module: kunit_example_test
[ ] # example: exiting suite
[ ] # Totals: pass:1 fail:3 skip:0 total:4
[ ] ===================== [FAILED] example =====================
Fix that by resetting test->priv after each param iteration, similar
to what is already done for test->status.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko(a)intel.com>
Cc: David Gow <davidgow(a)google.com>
Cc: Rae Moar <rmoar(a)google.com>
Reviewed-by: David Gow <davidgow(a)google.com>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
lib/kunit/test.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 7aceb07a1af9..1cdc405daa30 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -660,6 +660,7 @@ int kunit_run_tests(struct kunit_suite *suite)
test.param_index++;
test.status = KUNIT_SUCCESS;
test.status_comment[0] = '\0';
+ test.priv = NULL;
}
}
--
2.43.0
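The leak described above can be reproduced with a toy model of the
param loop. This is a sketch in Python under stated assumptions, not
KUnit code: the Test class and run_params() are hypothetical, and the
test case is reduced to "expect priv to be empty, then stash something
in it".

```python
class Test:
    """Toy stand-in for struct kunit; only the fields used here."""

    def __init__(self):
        self.priv = None
        self.status = "SUCCESS"


def run_params(params, reset_priv):
    """Run one fake test case per parameter, sharing a single Test."""
    test = Test()
    results = []
    for p in params:
        # The test case expects a clean priv pointer on entry.
        ok = test.priv is None
        test.priv = object()  # the test case stashes its custom data
        results.append((p, "PASSED" if ok else "FAILED"))
        # Per-iteration reset, mirroring the end of the param loop.
        test.status = "SUCCESS"
        if reset_priv:
            test.priv = None  # the one-line fix
    return results
```

Without the reset only the first parameter passes, which matches the
pass:1 fail:3 totals in the log above. With the reset all four
iterations start from a clean state.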
From: Thomas Weißschuh <linux(a)weissschuh.net>
[ Upstream commit bdeeeaba83682225a7bf5f100fe8652a59590d33 ]
qemu for LoongArch does not work properly with direct kernel boot.
The kernel will panic during initialization and hang without any output.
When booting in EFI mode everything works correctly.
While users most likely don't have the LoongArch EFI binary installed,
at least an explicit error about 'file not found' is better than a
hanging test without output that can never succeed.
Link: https://lore.kernel.org/loongarch/1738d60a-df3a-4102-b1da-d16a29b6e06a@t-8c…
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
Acked-by: Willy Tarreau <w(a)1wt.eu>
Link: https://lore.kernel.org/r/20231031-nolibc-out-of-tree-v1-1-47c92f73590a@wei…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/nolibc/Makefile | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
index a0fc07253baf..eb258ae1d948 100644
--- a/tools/testing/selftests/nolibc/Makefile
+++ b/tools/testing/selftests/nolibc/Makefile
@@ -88,6 +88,13 @@ QEMU_ARCH_s390 = s390x
QEMU_ARCH_loongarch = loongarch64
QEMU_ARCH = $(QEMU_ARCH_$(XARCH))
+QEMU_BIOS_DIR = /usr/share/edk2/
+QEMU_BIOS_loongarch = $(QEMU_BIOS_DIR)/loongarch64/OVMF_CODE.fd
+
+ifneq ($(QEMU_BIOS_$(XARCH)),)
+QEMU_ARGS_BIOS = -bios $(QEMU_BIOS_$(XARCH))
+endif
+
# QEMU_ARGS : some arch-specific args to pass to qemu
QEMU_ARGS_i386 = -M pc -append "console=ttyS0,9600 i8042.noaux panic=-1 $(TEST:%=NOLIBC_TEST=%)"
QEMU_ARGS_x86_64 = -M pc -append "console=ttyS0,9600 i8042.noaux panic=-1 $(TEST:%=NOLIBC_TEST=%)"
@@ -101,7 +108,7 @@ QEMU_ARGS_ppc64le = -M powernv -append "console=hvc0 panic=-1 $(TEST:%=NOLIBC
QEMU_ARGS_riscv = -M virt -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
QEMU_ARGS_s390 = -M s390-ccw-virtio -m 1G -append "console=ttyS0 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
QEMU_ARGS_loongarch = -M virt -append "console=ttyS0,115200 panic=-1 $(TEST:%=NOLIBC_TEST=%)"
-QEMU_ARGS = $(QEMU_ARGS_$(XARCH)) $(QEMU_ARGS_EXTRA)
+QEMU_ARGS = $(QEMU_ARGS_$(XARCH)) $(QEMU_ARGS_BIOS) $(QEMU_ARGS_EXTRA)
# OUTPUT is only set when run from the main makefile, otherwise
# it defaults to this nolibc directory.
--
2.43.0
Hi,
An essential part of any big kernel submission is selftests.
At the beginning of the TCP-AO project, I made patches to fcnal-test.sh
and nettest.c to get the benefits of easy refactoring, noticing
breakages early, putting a moat around the code, and documenting
and designing the uAPI.
While tests based on fcnal-test.sh/nettest.c provided initial testing*
and were very easy to add, the pile of TCP-AO tests quickly outgrew
one-binary + shell-script testing.
The design of the TCP-AO testing is a bit different from the
one-big-selftest-binary approach I took previously in net/ipsec.c.
I found it beneficial to avoid implementing a test runner/scheduler and
to delegate that to the user or the Makefile. The approach is heavily
influenced by CRIU/ZDTM testing[1]: it provides a static library with
helper functions and selftest binaries that create specific scenarios.
I also tried to utilize kselftest.h.
The test_init() function does all the needed preparations. To not leave
any traces after a selftest exits, it creates a network namespace
and, if the test wants to establish a TCP connection, a child netns.
The parent and child netns have a veth pair with proper IP addresses
and routes set up. Both peers, the client and the server, are different
pthreads. The threading model was chosen over forking mostly for ease
of cleanup on failure: no need to search for children, handle SIGCHLD,
make sure not to wait for a dead peer to perform anything, etc.
Any thread that calls exit() naturally kills the test, sweet!
The selftests are currently compiled in two variants: ipv4 and ipv6.
IPv4-mapped-IPv6 addresses might be a third variant to add, but it's not
there in this version. As pretty much all tests are shared between the
two address families, most of the code can be shared, too. To
distinguish in code which kind of test is running, the Makefile supplies
-DIPV6_TEST to the compiler, and ifdeffery in the tests can do the
things that have to differ between address families. This is similar to
TARGETS_C_BOTHBITS in x86 selftests and also to test code sharing in
CRIU/ZDTM.
The total number of tests is 832.
Of them, rst_ipv{4,6} currently has one flaky subtest that may fail:
> not ok 9 client connection was not reset: 0
I'll investigate what happens there. Also, unsigned-md5_ipv{4,6}
are flaky because of netns counter checks: they don't expect that
there may be retransmitted TCP segments from a previous sub-selftest.
That will be fixed. Besides, key-management_ipv{4,6} has 3 sub-tests
passing with XFAIL:
> ok 15 # XFAIL listen() after current/rnext keys set: the socket has current/rnext keys: 100:200
> ok 16 # XFAIL listen socket, delete current key from before listen(): failed to delete the key 100:100 -16
> ok 17 # XFAIL listen socket, delete rnext key from before listen(): failed to delete the key 200:200 -16
...
> # Totals: pass:117 fail:0 xfail:3 xpass:0 skip:0 error:0
Those need some more kernel work to pass instead of xfail.
The overview of selftests (see the diffstat at the bottom):
├── lib
│ ├── aolib.h
│ │ The header for all selftests to include.
│ ├── kconfig.c
│ │ Kernel kconfig detector to SKIP tests that depend on something.
│ ├── netlink.c
│ │ Netlink helper to add/modify/delete VETH/IPs/routes/VRFs
│ │ I considered just using libmnl, but this is around 400 lines
│ │ and avoids selftests dependency on out-of-tree sources/packages.
│ ├── proc.c
│ │ SNMP/netstat procfs parser and the counters comparator.
│ ├── repair.c
│ │ Heavily influenced by libsoccr and reduced to minimum TCP
│ │ socket checkpoint/repair. Shouldn't be used out of selftests,
│ │ though.
│ ├── setup.c
│ │ All the needed netns/veth/ips/etc preparations for test init.
│ ├── sock.c
│ │ Socket helpers: {s,g}etsockopt()s/connect()/listen()/etc.
│ └── utils.c
│ Random stuff (a pun intended).
├── bench-lookups.c
│ The only benchmark in selftests currently: checks how well TCP-AO
│ setsockopt()s perform, depending on the amount of keys on a socket.
├── connect.c
│ Trivial sample, can be used as a boilerplate to write a new test.
├── connect-deny.c
│ More-or-less what could be expected for TCP-AO in fcnal-test.sh
├── icmps-accept.c -> icmps-discard.c
├── icmps-discard.c
│ Verifies RFC5925 (7.8) by checking that TCP-AO connection can be
│ broken if ICMPs are accepted and survives when ::accept_icmps = 0
├── key-management.c
│ Key manipulations, rotations between randomized hashing algorithms
│ and counter checks for those scenarios.
├── restore.c
│ TCP_AO_REPAIR: verifies that a socket can be re-created without
│ TCP-AO connection being interrupted.
├── rst.c
│ As RST segments are signed on a separate code-path in kernel,
│ verifies passive/active TCP send_reset().
├── self-connect.c
│ Verifies that TCP self-connect and also simultaneous open work.
├── seq-ext.c
│ Utilizes TCP_AO_REPAIR to check that on SEQ roll-over SNE
│ increment is performed and segments with different SNEs fail to
│ pass verification.
├── setsockopt-closed.c
│ Checks that {s,g}etsockopt()s are extendable syscalls and common
│ error-paths for them.
└── unsigned-md5.c
Checks listen() socket for (non-)matching peers with: AO/MD5/none
keys. As well as their interaction with VRFs and AO_REQUIRED flag.
There are certainly more test scenarios that can be added, but even so,
I'm pretty happy that this much of TCP-AO functionality and uAPIs got
covered. These selftests were iteratively developed by me during TCP-AO
kernel upstreaming and the resulting kernel patches would have been
worse without having these tests. They provided the user-side
perspective but also allowed safer refactoring with less possibility
of introducing a regression. Now it's time to use them to dig
a moat around the TCP-AO code!
There are also people from other network companies that work on TCP-AO
(+testing), so sharing these selftests will allow them to contribute,
and the selftests may benefit from their efforts.
The following changes since commit c7402612e2e61b76177f22e6e7f705adcbecc6fe:
Merge tag 'net-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (2023-12-14 13:11:49 -0800)
are available in the Git repository at:
git@github.com:0x7f454c46/linux.git tcp-ao-selftests-v1
for you to fetch changes up to 85dc9bc676985d81f9043fd9c3a506f30851597b:
selftests/net: Add TCP-AO key-management test (2023-12-15 00:44:49 +0000)
----------------------------------------------------------------
* Planning to submit basic TCP-AO tests to fcnal-test.sh/nettest.c
separately.
[1]: https://github.com/checkpoint-restore/criu/tree/criu-dev/test/zdtm/static
Signed-off-by: Dmitry Safonov <dima(a)arista.com>
---
Dmitry Safonov (12):
selftests/net: Add TCP-AO library
selftests/net: Verify that TCP-AO complies with ignoring ICMPs
selftests/net: Add TCP-AO ICMPs accept test
selftests/net: Add a test for TCP-AO keys matching
selftests/net: Add test for TCP-AO add setsockopt() command
selftests/net: Add TCP-AO + TCP-MD5 + no sign listen socket tests
selftests/net: Add test/benchmark for removing MKTs
selftests/net: Add TCP_REPAIR TCP-AO tests
selftests/net: Add SEQ number extension test
selftests/net: Add TCP-AO RST test
selftests/net: Add TCP-AO selfconnect/simultaneous connect test
selftests/net: Add TCP-AO key-management test
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/net/tcp_ao/.gitignore | 2 +
tools/testing/selftests/net/tcp_ao/Makefile | 59 +
tools/testing/selftests/net/tcp_ao/bench-lookups.c | 358 ++++++
tools/testing/selftests/net/tcp_ao/connect-deny.c | 264 +++++
tools/testing/selftests/net/tcp_ao/connect.c | 90 ++
tools/testing/selftests/net/tcp_ao/icmps-accept.c | 1 +
tools/testing/selftests/net/tcp_ao/icmps-discard.c | 449 ++++++++
.../testing/selftests/net/tcp_ao/key-management.c | 1180 ++++++++++++++++++++
tools/testing/selftests/net/tcp_ao/lib/aolib.h | 605 ++++++++++
tools/testing/selftests/net/tcp_ao/lib/kconfig.c | 148 +++
tools/testing/selftests/net/tcp_ao/lib/netlink.c | 415 +++++++
tools/testing/selftests/net/tcp_ao/lib/proc.c | 273 +++++
tools/testing/selftests/net/tcp_ao/lib/repair.c | 254 +++++
tools/testing/selftests/net/tcp_ao/lib/setup.c | 342 ++++++
tools/testing/selftests/net/tcp_ao/lib/sock.c | 592 ++++++++++
tools/testing/selftests/net/tcp_ao/lib/utils.c | 30 +
tools/testing/selftests/net/tcp_ao/restore.c | 236 ++++
tools/testing/selftests/net/tcp_ao/rst.c | 415 +++++++
tools/testing/selftests/net/tcp_ao/self-connect.c | 197 ++++
tools/testing/selftests/net/tcp_ao/seq-ext.c | 245 ++++
.../selftests/net/tcp_ao/setsockopt-closed.c | 835 ++++++++++++++
tools/testing/selftests/net/tcp_ao/unsigned-md5.c | 742 ++++++++++++
23 files changed, 7733 insertions(+)
---
base-commit: c7402612e2e61b76177f22e6e7f705adcbecc6fe
change-id: 20231213-tcp-ao-selftests-d0f323006667
Best regards,
--
Dmitry Safonov <dima(a)arista.com>
Hi folks,
This series implements the functionality of delivering IO page faults to
user space through the IOMMUFD framework for nested translation. Nested
translation is a hardware feature that supports two-stage translation
tables for IOMMU. The second-stage translation table is managed by the
host VMM, while the first-stage translation table is owned by user
space. This allows user space to control the IOMMU mappings for its
devices.
When an IO page fault occurs on the first-stage translation table, the
IOMMU hardware can deliver the page fault to user space through the
IOMMUFD framework. User space can then handle the page fault and respond
to the device top-down through the IOMMUFD. This allows user space to
implement its own IO page fault handling policies.
User space indicates its capability of handling IO page faults by
setting the IOMMU_HWPT_ALLOC_IOPF_CAPABLE flag when allocating a
hardware page table (HWPT). IOMMUFD will then set up its infrastructure
for page fault delivery. On a successful return of HWPT allocation, the
user can retrieve and respond to page faults by reading and writing to
the file descriptor (FD) returned in out_fault_fd.
The iommu selftest framework has been updated to test the IO page fault
delivery and response functionality.
This series is based on the latest implementation of nested translation
under discussion [1] and the page fault handling framework refactoring in
the IOMMU core [2].
The series and related patches are available on GitHub: [3]
[1] https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
[2] https://lore.kernel.org/linux-iommu/20230928042734.16134-1-baolu.lu@linux.i…
[3] https://github.com/LuBaolu/intel-iommu/commits/iommufd-io-pgfault-delivery-…
Best regards,
baolu
Change log:
v2:
- Move all iommu refactoring patches into a separate series and discuss
them in a different thread. The latest patch series [v6] is available at
https://lore.kernel.org/linux-iommu/20230928042734.16134-1-baolu.lu@linux.i…
- We discussed the timeout of the pending page fault messages. We
agreed that we shouldn't apply any timeout policy for the page fault
handling in user space.
https://lore.kernel.org/linux-iommu/20230616113232.GA84678@myrica/
- Jason suggested that we adopt a simple file descriptor interface for
reading and responding to I/O page requests, so that user space
applications can improve performance using io_uring.
https://lore.kernel.org/linux-iommu/ZJWjD1ajeem6pK3I@ziepe.ca/
v1: https://lore.kernel.org/linux-iommu/20230530053724.232765-1-baolu.lu@linux.…
Lu Baolu (6):
iommu: Add iommu page fault cookie helpers
iommufd: Add iommu page fault uapi data
iommufd: Initializing and releasing IO page fault data
iommufd: Deliver fault messages to user space
iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_IOPF test support
iommufd/selftest: Add coverage for IOMMU_TEST_OP_TRIGGER_IOPF
include/linux/iommu.h | 9 +
drivers/iommu/iommu-priv.h | 15 +
drivers/iommu/iommufd/iommufd_private.h | 12 +
drivers/iommu/iommufd/iommufd_test.h | 8 +
include/uapi/linux/iommufd.h | 65 +++++
tools/testing/selftests/iommu/iommufd_utils.h | 66 ++++-
drivers/iommu/io-pgfault.c | 50 ++++
drivers/iommu/iommufd/device.c | 69 ++++-
drivers/iommu/iommufd/hw_pagetable.c | 260 +++++++++++++++++-
drivers/iommu/iommufd/selftest.c | 56 ++++
tools/testing/selftests/iommu/iommufd.c | 24 +-
.../selftests/iommu/iommufd_fail_nth.c | 2 +-
12 files changed, 620 insertions(+), 16 deletions(-)
--
2.34.1
This adds the pasid attach/detach uAPIs for userspace to attach/detach
a PASID of a device to/from a given ioas/hwpt. Only vfio-pci driver is
enabled in this series. After this series, PASID-capable devices bound
with vfio-pci can report PASID capability to userspace and VM to enable
PASID usages like Shared Virtual Addressing (SVA).
This series first adds the helpers for pasid attach in vfio core, then
adds the device cdev ioctls for pasid attach/detach, and finally exposes
the device PASID capability to the user. It depends on the iommufd pasid
attach/detach series [1].
Complete code can be found at [2], tested with a draft QEMU branch [3].
[1] https://lore.kernel.org/linux-iommu/20231127063428.127436-1-yi.l.liu@intel.…
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_pasid
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1%…
Change log:
v1:
- Report PASID capability via VFIO_DEVICE_FEATURE (Alex)
rfc: https://lore.kernel.org/linux-iommu/20230926093121.18676-1-yi.l.liu@intel.c…
Regards,
Yi Liu
Kevin Tian (1):
vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices
Yi Liu (2):
vfio: Add VFIO_DEVICE_PASID_[AT|DE]TACH_IOMMUFD_PT
vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl
drivers/vfio/device_cdev.c | 45 +++++++++++++++++++++
drivers/vfio/iommufd.c | 48 ++++++++++++++++++++++
drivers/vfio/pci/vfio_pci.c | 2 +
drivers/vfio/pci/vfio_pci_core.c | 47 ++++++++++++++++++++++
drivers/vfio/vfio.h | 4 ++
drivers/vfio/vfio_main.c | 8 ++++
include/linux/vfio.h | 11 ++++++
include/uapi/linux/vfio.h | 68 ++++++++++++++++++++++++++++++++
8 files changed, 233 insertions(+)
--
2.34.1
From: Jeff Xu <jeffxu(a)google.com>
This patchset proposes a new mseal() syscall for the Linux kernel.
In a nutshell, mseal() protects the VMAs of a given virtual memory
range against modifications, such as changes to their permission bits.
Modern CPUs support memory permissions, such as the read/write (RW)
and no-execute (NX) bits. Linux has supported NX since the release of
kernel version 2.6.8 in August 2004 [1]. The memory permission feature
improves the security stance on memory corruption bugs, as an attacker
cannot simply write to arbitrary memory and point the code to it. The
memory must be marked with the X bit, or else an exception will occur.
Internally, the kernel maintains the memory permissions in a data
structure called VMA (vm_area_struct). mseal() additionally protects
the VMA itself against modifications of the selected seal type.
Memory sealing is useful to mitigate memory corruption issues where a
corrupted pointer is passed to a memory management system. For
example, such an attacker primitive can break control-flow integrity
guarantees since read-only memory that is supposed to be trusted can
become writable or .text pages can get remapped. Memory sealing can
automatically be applied by the runtime loader to seal .text and
.rodata pages and applications can additionally seal security critical
data at runtime. A similar feature already exists in the XNU kernel
with the VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the
mimmutable syscall [4]. Also, Chrome wants to adopt this feature for
their CFI work [2] and this patchset has been designed to be
compatible with the Chrome use case.
Two system calls are involved in sealing the map: mmap() and mseal().
The new mseal() is a syscall on 64-bit CPUs, with the following
signature:
int mseal(void *addr, size_t len, unsigned long flags)
addr/len: memory range.
flags: reserved.
mseal() blocks the following operations for the given memory range.
1> Unmapping, moving to another location, and shrinking the size,
via munmap() and mremap(). These can leave an empty space that can
then be replaced with a VMA with a new set of attributes.
2> Moving or expanding a different VMA into the current location,
via mremap().
3> Modifying a VMA via mmap(MAP_FIXED).
4> Size expansion, via mremap(), does not appear to pose any specific
risks to sealed VMAs. It is included anyway because the use case is
unclear. In any case, users can rely on merging to expand a sealed VMA.
5> mprotect() and pkey_mprotect().
6> Some destructive madvise() behaviors (e.g. MADV_DONTNEED) for anonymous
memory, when users don't have write permission to the memory. Those
behaviors can alter region contents by discarding pages, effectively a
memset(0) for anonymous memory.
In addition, mmap() has two related changes.
The PROT_SEAL bit in the prot field of mmap(). When present, it marks
the map as sealed from creation.
The MAP_SEALABLE bit in the flags field of mmap(). When present, it marks
the map as sealable. A map created without MAP_SEALABLE will not support
sealing, i.e. mseal() will fail.
Applications that don't care about sealing can expect their behavior to
be unchanged. Those that need sealing support opt in by adding
MAP_SEALABLE in mmap().
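The rules above can be summarised as a small userspace model. The
sketch below is Python, not kernel code and not a wrapper around the
real syscall: the Mapping class, the operation names and the allowed()
helper are hypothetical, and error codes are deliberately left out
because the text above does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    """Toy model of one VMA; field names are illustrative only."""
    start: int
    length: int
    sealable: bool = False  # models MAP_SEALABLE in the mmap() flags
    sealed: bool = False    # models PROT_SEAL or a successful mseal()


# Operations listed above as blocked on a sealed range.
BLOCKED_WHEN_SEALED = {
    "munmap", "mremap", "mmap_fixed", "mprotect", "pkey_mprotect",
}


def mseal(m: Mapping) -> bool:
    """Seal the mapping; fails if it was not created as sealable."""
    if not m.sealable:
        return False
    m.sealed = True
    return True


def allowed(m: Mapping, op: str, anonymous: bool = False,
            can_write: bool = True) -> bool:
    """Return True if the operation may proceed on this mapping."""
    if not m.sealed:
        return True
    if op in BLOCKED_WHEN_SEALED:
        return False
    if op == "madvise_dontneed":
        # Destructive discard is refused only for anonymous memory
        # that the caller has no write permission to.
        return not (anonymous and not can_write)
    return True
```

The model makes the opt-in nature visible: a mapping created without
the sealable flag cannot be sealed, and an unsealed mapping behaves
exactly as before.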
The idea that inspired this patch comes from Stephen Röttger’s work in
V8 CFI [5]. Chrome browser in ChromeOS will be the first user of this
API.
Indeed, the Chrome browser has very specific requirements for sealing,
which are distinct from those of most applications. For example, in
the case of libc, sealing is only applied to read-only (RO) or
read-execute (RX) memory segments (such as .text and .RELRO) to
prevent them from becoming writable; the lifetime of those mappings
is tied to the lifetime of the process.
Chrome wants to seal two large address space reservations that are
managed by different allocators. The memory is mapped RW- and RWX
respectively but write access to it is restricted using pkeys (or in
the future ARM permission overlay extensions). The lifetime of those
mappings is not tied to the lifetime of the process. Therefore, while
the memory is sealed, the allocators still need to free or discard the
unused memory, for example with madvise(DONTNEED).
However, always allowing madvise(DONTNEED) on this range poses a
security risk. For example if a jump instruction crosses a page
boundary and the second page gets discarded, it will overwrite the
target bytes with zeros and change the control flow. Checking
write-permission before the discard operation allows us to control
when the operation is valid. In this case, the madvise will only
succeed if the executing thread has PKEY write permissions and PKRU
changes are protected in software by control-flow integrity.
Although the initial version of this patch series is targeting the
Chrome browser as its first user, it became evident during upstream
discussions that we would also want to ensure that the patch set
eventually is a complete solution for memory sealing and compatible
with other use cases. The specific scenario currently in mind is
glibc's use case of loading and sealing ELF executables. To this end,
Stephen is working on a change to glibc to add sealing support to the
dynamic linker, which will seal all non-writable segments at startup.
Once this work is completed, all applications will be able to
automatically benefit from these new protections.
Change history:
===============
V6:
- Drop RFC from subject, given Linus's general approval.
- Adjust syscall number for mseal (main Jan.11/2024)
- Code style fix (Matthew Wilcox)
- selftest: use ksft macros (Muhammad Usama Anjum)
- Document fix. (Randy Dunlap)
V5:
- fix build issue in mseal-Wire-up-mseal-syscall
(Suggested by Linus Torvalds, and Greg KH)
- updates on selftest.
https://lore.kernel.org/lkml/20240109154547.1839886-1-jeffxu@chromium.org/#r
V4:
(Suggested by Linus Torvalds)
- new signature: mseal(start,len,flags)
- 32 bit is not supported. vm_seal is removed, use vm_flags instead.
- single bit in vm_flags for sealed state.
- CONFIG_MSEAL kernel config is removed.
- single bit of PROT_SEAL in the "Prot" field of mmap().
Other changes:
- update selftest (Suggested by Muhammad Usama Anjum)
- update documentation.
https://lore.kernel.org/all/20240104185138.169307-1-jeffxu@chromium.org/
V3:
- Abandon per-syscall approach, (Suggested by Linus Torvalds).
- Organize sealing types around their functionality, such as
MM_SEAL_BASE, MM_SEAL_PROT_PKEY.
- Extend the scope of sealing from calls originated in userspace to
both kernel and userspace. (Suggested by Linus Torvalds)
- Add seal type support in mmap(). (Suggested by Pedro Falcato)
- Add a new sealing type: MM_SEAL_DISCARD_RO_ANON to prevent
destructive operations of madvise. (Suggested by Jann Horn and
Stephen Röttger)
- Make sealed VMAs mergeable. (Suggested by Jann Horn)
- Add MAP_SEALABLE to mmap()
- Add documentation - mseal.rst
https://lore.kernel.org/linux-mm/20231212231706.2680890-2-jeffxu@chromium.o…
v2:
- Use _BITUL to define MM_SEAL_XX type.
- Use unsigned long for seal type in sys_mseal() and other functions.
- Remove internal VM_SEAL_XX type and convert_user_seal_type().
- Remove MM_ACTION_XX type.
- Remove caller_origin(ON_BEHALF_OF_XX) and replace with sealing bitmask.
- Add more comments in code.
- Add a detailed commit message.
https://lore.kernel.org/lkml/20231017090815.1067790-1-jeffxu@chromium.org/
v1:
https://lore.kernel.org/lkml/20231016143828.647848-1-jeffxu@chromium.org/
----------------------------------------------------------------
[1] https://kernelnewbies.org/Linux_2_6_8
[2] https://v8.dev/blog/control-flow-integrity
[3] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b…
[4] https://man.openbsd.org/mimmutable.2
[5] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXge…
[6] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426Fkcgnf…
[7] https://lore.kernel.org/lkml/20230515130553.2311248-1-jeffxu@chromium.org/
Jeff Xu (4):
mseal: Wire up mseal syscall
mseal: add mseal syscall
selftest mm/mseal memory sealing
mseal:add documentation
Documentation/userspace-api/mseal.rst | 181 ++
arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 2 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
include/linux/mm.h | 60 +
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/mman-common.h | 8 +
include/uapi/asm-generic/unistd.h | 5 +-
kernel/sys_ni.c | 1 +
mm/Makefile | 4 +
mm/madvise.c | 12 +
mm/mmap.c | 27 +
mm/mprotect.c | 10 +
mm/mremap.c | 31 +
mm/mseal.c | 330 +++
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/mseal_test.c | 1997 +++++++++++++++++++
32 files changed, 2686 insertions(+), 2 deletions(-)
create mode 100644 Documentation/userspace-api/mseal.rst
create mode 100644 mm/mseal.c
create mode 100644 tools/testing/selftests/mm/mseal_test.c
--
2.43.0.275.g3460e3d667-goog
=== Description ===
This is a bpf-treewide change that annotates all kfuncs as such inside
.BTF_ids. This annotation eventually allows us to automatically generate
kfunc prototypes from bpftool.
We store this metadata inside a yet-unused flags field inside struct
btf_id_set8 (thanks Kumar!). pahole will be taught where to look.
More details about the full chain of events are available in commit 3's
description.
The accompanying pahole changes (still need some cleanup) can be viewed
here on this "frozen" branch [0].
[0]: https://github.com/danobi/pahole/tree/kfunc_btf-mailed
=== Changelog ===
Changes from v2:
* Only WARN() for vmlinux kfuncs
Changes from v1:
* Move WARN_ON() up a call level
* Also return error when kfunc set is not properly tagged
* Use BTF_KFUNCS_START/END instead of flags
* Rename BTF_SET8_KFUNC to BTF_SET8_KFUNCS
Daniel Xu (3):
bpf: btf: Support flags for BTF_SET8 sets
bpf: btf: Add BTF_KFUNCS_START/END macro pair
bpf: treewide: Annotate BPF kfuncs in BTF
drivers/hid/bpf/hid_bpf_dispatch.c | 8 +++----
fs/verity/measure.c | 4 ++--
include/linux/btf_ids.h | 21 +++++++++++++++----
kernel/bpf/btf.c | 8 +++++++
kernel/bpf/cpumask.c | 4 ++--
kernel/bpf/helpers.c | 8 +++----
kernel/bpf/map_iter.c | 4 ++--
kernel/cgroup/rstat.c | 4 ++--
kernel/trace/bpf_trace.c | 8 +++----
net/bpf/test_run.c | 8 +++----
net/core/filter.c | 16 +++++++-------
net/core/xdp.c | 4 ++--
net/ipv4/bpf_tcp_ca.c | 4 ++--
net/ipv4/fou_bpf.c | 4 ++--
net/ipv4/tcp_bbr.c | 4 ++--
net/ipv4/tcp_cubic.c | 4 ++--
net/ipv4/tcp_dctcp.c | 4 ++--
net/netfilter/nf_conntrack_bpf.c | 4 ++--
net/netfilter/nf_nat_bpf.c | 4 ++--
net/xfrm/xfrm_interface_bpf.c | 4 ++--
net/xfrm/xfrm_state_bpf.c | 4 ++--
.../selftests/bpf/bpf_testmod/bpf_testmod.c | 8 +++----
22 files changed, 81 insertions(+), 60 deletions(-)
--
2.42.1
From: Maxim Mikityanskiy <maxim(a)isovalent.com>
The goal of this series is to extend the verifier's capabilities of
tracking scalars when they are spilled to stack, especially when the
spill or fill is narrowing. It also contains a fix by Eduard for
infinite loop detection and a state pruning optimization by Eduard that
compensates for a verification complexity regression introduced by
tracking unbounded scalars. These improvements reduce the surface of
false rejections that I saw while working on the Cilium codebase.
Patch 1 (Maxim): Fix for an existing test, it will matter later in the
series.
Patches 2-3 (Eduard): Fixes for false rejections in infinite loop
detection that happen in the selftests when my patches are applied.
Patches 4-5 (Maxim): Fix the inconsistency of find_equal_scalars that
was possible if 32-bit spills were made.
Patches 6-11 (Maxim): Support the case when boundary checks are first
performed after the register was spilled to the stack.
Patches 12-13 (Maxim): Support narrowing fills.
Patches 14-15 (Eduard): Optimization for state pruning in stacksafe() to
mitigate the verification complexity regression.
veristat -e file,prog,states -f '!states_diff<50' -f '!states_pct<10' -f '!states_a<10' -f '!states_b<10' -C ...
* Without patch 14:
File                  Program       States (A)  States (B)  States (DIFF)
--------------------  ------------  ----------  ----------  ----------------
bpf_xdp.o             tail_lb_ipv6        3877        2936    -941 (-24.27%)
pyperf180.bpf.o       on_event            8422       10456   +2034 (+24.15%)
pyperf600.bpf.o       on_event           22259       37319  +15060 (+67.66%)
pyperf600_iter.bpf.o  on_event             400         540    +140 (+35.00%)
strobemeta.bpf.o      on_event            4702       13435  +8733 (+185.73%)
* With patch 14:
File                  Program       States (A)  States (B)  States (DIFF)
--------------------  ------------  ----------  ----------  --------------
bpf_xdp.o             tail_lb_ipv6        3877        2937  -940 (-24.25%)
pyperf600_iter.bpf.o  on_event             400         500  +100 (+25.00%)
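For readers checking the numbers, the percentage in the States (DIFF) column is the change from A to B relative to A. A small sketch of that arithmetic follows; the function name states_pct() is hypothetical and this is not veristat's own code, and returning 0 when A is 0 is an arbitrary choice made here.

```c
/*
 * Relative change in verifier states from baseline `a` to comparison `b`,
 * in percent. Hypothetical helper for checking the table by hand.
 */
double states_pct(long a, long b)
{
	if (a == 0)
		return 0.0; /* arbitrary: no baseline to compare against */
	return 100.0 * (double)(b - a) / (double)a;
}
```

For example, the pyperf180.bpf.o row without patch 14 is 100 * (10456 - 8422) / 8422, which rounds to +24.15%.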
v2 changes:
Fixed comments in patch 1, moved endianness checks to header files in
patch 12 where possible, added Eduard's ACKs.
Eduard Zingerman (4):
bpf: make infinite loop detection in is_state_visited() exact
selftests/bpf: check if imprecise stack spills confuse infinite loop
detection
bpf: Optimize state pruning for spilled scalars
selftests/bpf: states pruning checks for scalar vs STACK_{MISC,ZERO}
Maxim Mikityanskiy (11):
selftests/bpf: Fix the u64_offset_to_skb_data test
bpf: Make bpf_for_each_spilled_reg consider narrow spills
selftests/bpf: Add a test case for 32-bit spill tracking
bpf: Add the assign_scalar_id_before_mov function
bpf: Add the get_reg_width function
bpf: Assign ID to scalars on spill
selftests/bpf: Test assigning ID to scalars on spill
bpf: Track spilled unbounded scalars
selftests/bpf: Test tracking spilled unbounded scalars
bpf: Preserve boundaries and track scalars on narrowing fill
selftests/bpf: Add test cases for narrowing fill
include/linux/bpf_verifier.h | 4 +-
include/linux/filter.h | 12 +
kernel/bpf/verifier.c | 155 ++++-
.../bpf/progs/verifier_direct_packet_access.c | 2 +-
.../selftests/bpf/progs/verifier_loops1.c | 24 +
.../selftests/bpf/progs/verifier_spill_fill.c | 533 +++++++++++++++++-
.../testing/selftests/bpf/verifier/precise.c | 6 +-
7 files changed, 685 insertions(+), 51 deletions(-)
--
2.43.0
The livepatching kselftests rely on comparing expected vs. observed
dmesg output. After each test, new dmesg entries are determined by the
'comm' utility comparing a saved, pre-test copy of dmesg to post-test
dmesg output.
Alexander reports that the 'comm --nocheck-order -13' invocation used by
the tests can be confused when dmesg entry timestamps vary in magnitude
(i.e., "[ 98.820331]" vs. "[ 100.031067]"), in which case additional
messages are reported as new. The unexpected entries then spoil the
test results.
Instead of relying on 'comm' or 'diff' to determine new testing dmesg
entries, refactor the code:
- pre-test : log a unique canary dmesg entry
- test : run tests, log messages
- post-test : filter dmesg starting from pre-test message
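The post-test filtering step can be sketched in C as follows. This is only an illustration of the idea: the patch itself does this in shell with awk ('p; $0 == last_dmesg { p=1 }'), and the function name first_new_line() is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first log line after the canary entry, or n if
 * the canary is not found (in which case no line can be attributed to the
 * test). Only an exact, whole-line match counts as the canary.
 */
size_t first_new_line(const char *const *lines, size_t n, const char *canary)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(lines[i], canary) == 0)
			return i + 1;
	}
	return n;
}
```

Because the canary is located by exact match rather than by comparing two copies of the log, differences in timestamp width no longer matter. A missing canary yields an empty result, which corresponds to the "buffer overrun?" case the patch reports.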
Reported-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
Closes: https://lore.kernel.org/live-patching/ZYAimyPYhxVA9wKg@li-008a6a4c-3549-11b…
Signed-off-by: Joe Lawrence <joe.lawrence(a)redhat.com>
---
.../testing/selftests/livepatch/functions.sh | 37 +++++++++----------
1 file changed, 17 insertions(+), 20 deletions(-)
diff --git a/tools/testing/selftests/livepatch/functions.sh b/tools/testing/selftests/livepatch/functions.sh
index c8416c54b463..b1fd7362c2fe 100644
--- a/tools/testing/selftests/livepatch/functions.sh
+++ b/tools/testing/selftests/livepatch/functions.sh
@@ -42,17 +42,6 @@ function die() {
exit 1
}
-# save existing dmesg so we can detect new content
-function save_dmesg() {
- SAVED_DMESG=$(mktemp --tmpdir -t klp-dmesg-XXXXXX)
- dmesg > "$SAVED_DMESG"
-}
-
-# cleanup temporary dmesg file from save_dmesg()
-function cleanup_dmesg_file() {
- rm -f "$SAVED_DMESG"
-}
-
function push_config() {
DYNAMIC_DEBUG=$(grep '^kernel/livepatch' /sys/kernel/debug/dynamic_debug/control | \
awk -F'[: ]' '{print "file " $1 " line " $2 " " $4}')
@@ -99,7 +88,6 @@ function set_ftrace_enabled() {
function cleanup() {
pop_config
- cleanup_dmesg_file
}
# setup_config - save the current config and set a script exit trap that
@@ -280,7 +268,15 @@ function set_pre_patch_ret {
function start_test {
local test="$1"
- save_dmesg
+ # Dump something unique into the dmesg log, then stash the entry
+ # in LAST_DMESG. The check_result() function will use it to
+ # find new kernel messages since the test started.
+ local last_dmesg_msg="livepatch kselftest timestamp: $(date --rfc-3339=ns)"
+ log "$last_dmesg_msg"
+ loop_until 'dmesg | grep -q "$last_dmesg_msg"' ||
+ die "buffer busy? can't find canary dmesg message: $last_dmesg_msg"
+ LAST_DMESG=$(dmesg | grep "$last_dmesg_msg")
+
echo -n "TEST: $test ... "
log "===== TEST: $test ====="
}
@@ -291,23 +287,24 @@ function check_result {
local expect="$*"
local result
- # Note: when comparing dmesg output, the kernel log timestamps
- # help differentiate repeated testing runs. Remove them with a
- # post-comparison sed filter.
-
- result=$(dmesg | comm --nocheck-order -13 "$SAVED_DMESG" - | \
+ # Test results include any new dmesg entry since LAST_DMESG, then:
+ # - include lines matching keywords
+ # - exclude lines matching keywords
+ # - filter out dmesg timestamp prefixes
+ result=$(dmesg | awk -v last_dmesg="$LAST_DMESG" 'p; $0 == last_dmesg { p=1 }' | \
grep -e 'livepatch:' -e 'test_klp' | \
grep -v '\(tainting\|taints\) kernel' | \
sed 's/^\[[ 0-9.]*\] //')
if [[ "$expect" == "$result" ]] ; then
echo "ok"
+ elif [[ "$result" == "" ]] ; then
+ echo -e "not ok\n\nbuffer overrun? can't find canary dmesg entry: $LAST_DMESG\n"
+ die "livepatch kselftest(s) failed"
else
echo -e "not ok\n\n$(diff -upr --label expected --label result <(echo "$expect") <(echo "$result"))\n"
die "livepatch kselftest(s) failed"
fi
-
- cleanup_dmesg_file
}
# check_sysfs_rights(modname, rel_path, expected_rights) - check sysfs
--
2.41.0
This series updates all instances of LLVM Phabricator and Bugzilla links
to point to GitHub commits directly and LLVM's Bugzilla to GitHub issue
shortlinks respectively.
I split up the Phabricator patch into BPF selftests and the rest of the
kernel in case the BPF folks want to take it separately from the rest of
the series; there are no dependency issues in that case. The
Bugzilla change was mechanical enough and should have no conflicts.
I am aiming this at Andrew and CC'ing other lists, in case maintainers
want to chime in, but I think this is pretty uncontroversial (famous
last words...).
---
Nathan Chancellor (3):
selftests/bpf: Update LLVM Phabricator links
arch and include: Update LLVM Phabricator links
treewide: Update LLVM Bugzilla links
arch/arm64/Kconfig | 4 +--
arch/powerpc/Makefile | 4 +--
arch/powerpc/kvm/book3s_hv_nested.c | 2 +-
arch/riscv/Kconfig | 2 +-
arch/riscv/include/asm/ftrace.h | 2 +-
arch/s390/include/asm/ftrace.h | 2 +-
arch/x86/power/Makefile | 2 +-
crypto/blake2b_generic.c | 2 +-
drivers/firmware/efi/libstub/Makefile | 2 +-
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 2 +-
drivers/media/test-drivers/vicodec/codec-fwht.c | 2 +-
drivers/regulator/Kconfig | 2 +-
include/asm-generic/vmlinux.lds.h | 2 +-
include/linux/compiler-clang.h | 2 +-
lib/Kconfig.kasan | 2 +-
lib/raid6/Makefile | 2 +-
lib/stackinit_kunit.c | 2 +-
mm/slab_common.c | 2 +-
net/bridge/br_multicast.c | 2 +-
security/Kconfig | 2 +-
tools/testing/selftests/bpf/README.rst | 32 +++++++++++-----------
tools/testing/selftests/bpf/prog_tests/xdpwall.c | 2 +-
.../selftests/bpf/progs/test_core_reloc_type_id.c | 2 +-
23 files changed, 40 insertions(+), 40 deletions(-)
---
base-commit: 0dd3ee31125508cd67f7e7172247f05b7fd1753a
change-id: 20240109-update-llvm-links-d03f9d649e1e
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
Hi all:
Core frequency is subject to process variation in semiconductors.
Not all cores are able to reach the maximum frequency while respecting the
infrastructure limits. Consequently, AMD has redefined the concept of
maximum frequency of a part: only a fraction of the cores can reach
the maximum frequency. To find the best process scheduling policy for a given
scenario, the OS needs to know the core ordering, which the platform reports
through the highest performance capability register of the CPPC interface.
Earlier implementations of amd-pstate preferred core only supported a static
core ranking and targeted performance. Now the driver can dynamically
change the preferred core based on workload and platform conditions,
accounting for thermals and aging.
The amd-pstate driver uses the functions and data structures provided by
the ITMT architecture to let the scheduler favor scheduling on cores
which can reach a higher frequency at a lower voltage.
We call this amd-pstate preferred core.
Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable ITMT feature.
Amd-pstate driver uses the highest performance value to indicate
the priority of CPU. The higher value has a higher priority.
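As a rough userspace illustration of "higher value means higher priority", the sketch below orders a set of cores by their highest performance value. It is not how the scheduler or the driver is implemented (the driver only passes a per-CPU priority to sched_set_itmt_core_prio()); struct core_rank, rank_cores() and the tie-break by CPU number are assumptions made for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: one CPU and its CPPC highest performance value. */
struct core_rank {
	int cpu;
	unsigned int highest_perf;
};

/* Higher highest_perf sorts first; ties fall back to the lower CPU number. */
static int by_perf_desc(const void *a, const void *b)
{
	const struct core_rank *x = a;
	const struct core_rank *y = b;

	if (x->highest_perf != y->highest_perf)
		return x->highest_perf < y->highest_perf ? 1 : -1;
	return (x->cpu > y->cpu) - (x->cpu < y->cpu);
}

/* Sort cores in place from most preferred to least preferred. */
void rank_cores(struct core_rank *cores, size_t n)
{
	qsort(cores, n, sizeof(*cores), by_perf_desc);
}
```

With made-up values such as 166, 196, 166 and 201 for CPUs 0 to 3, the resulting order is CPU 3, CPU 1, CPU 0, CPU 2.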
The amd-pstate driver provides an initial core ordering at boot time.
It relies on the CPPC interface to communicate the core ranking to the
operating system and scheduler, so that the OS chooses the cores
with the highest performance first when scheduling a process. When the
amd-pstate driver receives a message that the highest performance has
changed, it updates the core ranking.
Changes from V12->V13:
- ACPI: CPPC:
- - modify commit message.
- - modify handle function of the notify(0x85).
- cpufreq: amd-pstate:
- - implement update_limits() callback function.
- x86:
- - pick up Acked-By flag added by Petkov.
Changes from V11->V12:
- all:
- - pick up Reviewed-By flag added by Perry.
- cpufreq: amd-pstate:
- - rebase the latest linux-next and fixed conflicts.
- - fixed the issue about cpudata without init in amd_pstate_update_highest_perf().
Changes from V10->V11:
- cpufreq: amd-pstate:
- - according to Perry's comments, replace the string with str_enabled_disable().
Changes from V9->V10:
- cpufreq: amd-pstate:
- - add a check for highest_perf. When it is less than 255, the
preferred core feature is enabled and the priority is set.
- - delete "static u32 max_highest_perf" etc., because amd-pstate
preferred core does not require special processing for hotplug.
Changes from V8->V9:
- all:
- - pick up Tested-By flag added by Oleksandr.
- cpufreq: amd-pstate:
- - pick up Review-By flag added by Wyes.
- - ignore modification of bug.
- - add a attribute of prefcore_ranking.
- - modify data type conversion from u32 to int.
- Documentation: amd-pstate:
- - pick up Review-By flag added by Wyes.
Changes from V7->V8:
- all:
- - pick up Review-By flag added by Mario and Ray.
- cpufreq: amd-pstate:
- - use hw_prefcore embeds into cpudata structure.
- - delete preferred core init from cpu online/off.
Changes from V6->V7:
- x86:
- - Modify kconfig about X86_AMD_PSTATE.
- cpufreq: amd-pstate:
- - modify incorrect comments about scheduler_work().
- - convert highest_perf data type.
- - modify preferred core init when cpu init and online.
- ACPI: CPPC:
- - modify link of CPPC highest performance.
- cpufreq:
- - modify link of CPPC highest performance changed.
Changes from V5->V6:
- cpufreq: amd-pstate:
- - modify the wrong tag order.
- - modify warning about hw_prefcore sysfs attribute.
- - delete duplicate comments.
- - modify the variable name cppc_highest_perf to prefcore_ranking.
- - modify judgment conditions for setting highest_perf.
- - modify sysfs attribute for CPPC highest perf to pr_debug message.
- Documentation: amd-pstate:
- - modify warning: title underline too short.
Changes from V4->V5:
- cpufreq: amd-pstate:
- - modify sysfs attribute for CPPC highest perf.
- - modify warning about comments
- - rebase linux-next
- cpufreq:
- - Modify warning about function declarations.
- Documentation: amd-pstate:
- - align with ``amd-pstat``
Changes from V3->V4:
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes from V2->V3:
- x86:
- - Modify kconfig and description.
- cpufreq: amd-pstate:
- - Add Co-developed-by tag in commit message.
- cpufreq:
- - Modify commit message.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes from V1->V2:
- ACPI: CPPC:
- - Add reference link.
- cpufreq:
- - Modify link error.
- cpufreq: amd-pstate:
- - Init the priorities of all online CPUs
- - Use a single variable to represent the status of preferred core.
- Documentation:
- - Default enabled preferred core.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
- - Default enabled preferred core.
- - Use a single variable to represent the status of preferred core.
Meng Li (7):
x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
ACPI: CPPC: Add get the highest performance cppc control
cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
cpufreq: Add a notification message that the highest perf has changed
cpufreq: amd-pstate: Update amd-pstate preferred core ranking
dynamically
Documentation: amd-pstate: introduce amd-pstate preferred core
Documentation: introduce amd-pstate preferrd core mode kernel command
line options
.../admin-guide/kernel-parameters.txt | 5 +
Documentation/admin-guide/pm/amd-pstate.rst | 59 +++++-
arch/x86/Kconfig | 5 +-
drivers/acpi/cppc_acpi.c | 13 ++
drivers/acpi/processor_driver.c | 6 +
drivers/cpufreq/amd-pstate.c | 183 +++++++++++++++++-
include/acpi/cppc_acpi.h | 5 +
include/linux/amd-pstate.h | 10 +
8 files changed, 274 insertions(+), 12 deletions(-)
--
2.34.1
Add a test to exercise cpu hotplug with the function tracer active to
ensure that sensitive functions in the idle path are excluded from being
traced. This helps catch issues such as the one fixed by commit
4b3338aaa74d ("powerpc/ftrace: Fix stack teardown in ftrace_no_trace").
Signed-off-by: Naveen N Rao <naveen(a)kernel.org>
---
v2: Add a check for next available online cpu, as suggested by Masami.
.../ftrace/test.d/ftrace/func_hotplug.tc | 42 +++++++++++++++++++
1 file changed, 42 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc
diff --git a/tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc b/tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc
new file mode 100644
index 000000000000..ccfbfde3d942
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc
@@ -0,0 +1,42 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-or-later
+# description: ftrace - function trace across cpu hotplug
+# requires: function:tracer
+
+if ! which nproc ; then
+ nproc() {
+ ls -d /sys/devices/system/cpu/cpu[0-9]* | wc -l
+ }
+fi
+
+NP=`nproc`
+
+if [ $NP -eq 1 ] ;then
+ echo "We cannot test cpu hotplug in UP environment"
+ exit_unresolved
+fi
+
+# Find online cpu
+for i in /sys/devices/system/cpu/cpu[1-9]*; do
+ if [ -f $i/online ] && [ "$(cat $i/online)" = "1" ]; then
+ cpu=$i
+ break
+ fi
+done
+
+if [ -z "$cpu" ]; then
+ echo "We cannot test cpu hotplug with a single cpu online"
+ exit_unresolved
+fi
+
+echo 0 > tracing_on
+echo > trace
+
+: "Set $(basename $cpu) offline/online with function tracer enabled"
+echo function > current_tracer
+echo 1 > tracing_on
+(echo 0 > $cpu/online)
+(echo "forked"; sleep 1)
+(echo 1 > $cpu/online)
+echo 0 > tracing_on
+echo nop > current_tracer
base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
--
2.43.0
Use 2 separate variables of types int and unsigned long long instead of
reusing a single variable for both. This gives each of them the correct
print format and removes the build warning:
warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘long long unsigned int’
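A minimal standalone sketch of the rule behind this warning: the conversion specifier must match the argument type, so an int takes %d and an unsigned long long takes %llu. The function name format_offsets() and the message text are illustrative only and do not appear in mremap_test.c.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format one int and one unsigned long long, each with its matching
 * conversion specifier. Returns what snprintf() returns.
 */
int format_offsets(char *buf, size_t len, int d, unsigned long long t)
{
	return snprintf(buf, len, "preamble offset %d, data offset %llu", d, t);
}
```

Passing the unsigned long long to %d instead is what produces the -Wformat diagnostic quoted above, and it is undefined behavior at run time.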
Fixes: a4cb3b243343 ("selftests: mm: add a test for remapping to area immediately after existing mapping")
Signed-off-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
---
Changes since v1:
- Don't just fix the print format, instead use different variables
---
tools/testing/selftests/mm/mremap_test.c | 27 ++++++++++++------------
1 file changed, 14 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/mm/mremap_test.c b/tools/testing/selftests/mm/mremap_test.c
index 1d4c1589c305..2f8b991f78cb 100644
--- a/tools/testing/selftests/mm/mremap_test.c
+++ b/tools/testing/selftests/mm/mremap_test.c
@@ -360,7 +360,8 @@ static long long remap_region(struct config c, unsigned int threshold_mb,
char pattern_seed)
{
void *addr, *src_addr, *dest_addr, *dest_preamble_addr;
- unsigned long long i;
+ int d;
+ unsigned long long t;
struct timespec t_start = {0, 0}, t_end = {0, 0};
long long start_ns, end_ns, align_mask, ret, offset;
unsigned long long threshold;
@@ -378,8 +379,8 @@ static long long remap_region(struct config c, unsigned int threshold_mb,
/* Set byte pattern for source block. */
srand(pattern_seed);
- for (i = 0; i < threshold; i++)
- memset((char *) src_addr + i, (char) rand(), 1);
+ for (t = 0; t < threshold; t++)
+ memset((char *) src_addr + t, (char) rand(), 1);
/* Mask to zero out lower bits of address for alignment */
align_mask = ~(c.dest_alignment - 1);
@@ -420,8 +421,8 @@ static long long remap_region(struct config c, unsigned int threshold_mb,
/* Set byte pattern for the dest preamble block. */
srand(pattern_seed);
- for (i = 0; i < c.dest_preamble_size; i++)
- memset((char *) dest_preamble_addr + i, (char) rand(), 1);
+ for (d = 0; d < c.dest_preamble_size; d++)
+ memset((char *) dest_preamble_addr + d, (char) rand(), 1);
}
clock_gettime(CLOCK_MONOTONIC, &t_start);
@@ -437,14 +438,14 @@ static long long remap_region(struct config c, unsigned int threshold_mb,
/* Verify byte pattern after remapping */
srand(pattern_seed);
- for (i = 0; i < threshold; i++) {
+ for (t = 0; t < threshold; t++) {
char c = (char) rand();
- if (((char *) dest_addr)[i] != c) {
+ if (((char *) dest_addr)[t] != c) {
ksft_print_msg("Data after remap doesn't match at offset %llu\n",
- i);
+ t);
ksft_print_msg("Expected: %#x\t Got: %#x\n", c & 0xff,
- ((char *) dest_addr)[i] & 0xff);
+ ((char *) dest_addr)[t] & 0xff);
ret = -1;
goto clean_up_dest;
}
@@ -453,14 +454,14 @@ static long long remap_region(struct config c, unsigned int threshold_mb,
/* Verify the dest preamble byte pattern after remapping */
if (c.dest_preamble_size) {
srand(pattern_seed);
- for (i = 0; i < c.dest_preamble_size; i++) {
+ for (d = 0; d < c.dest_preamble_size; d++) {
char c = (char) rand();
- if (((char *) dest_preamble_addr)[i] != c) {
+ if (((char *) dest_preamble_addr)[d] != c) {
ksft_print_msg("Preamble data after remap doesn't match at offset %d\n",
- i);
+ d);
ksft_print_msg("Expected: %#x\t Got: %#x\n", c & 0xff,
- ((char *) dest_preamble_addr)[i] & 0xff);
+ ((char *) dest_preamble_addr)[d] & 0xff);
ret = -1;
goto clean_up_dest;
}
--
2.42.0
Fix the following build warning:
warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘long long unsigned int’
Fixes: a4cb3b243343 ("selftests: mm: add a test for remapping to area immediately after existing mapping")
Signed-off-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
---
tools/testing/selftests/mm/mremap_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/mremap_test.c b/tools/testing/selftests/mm/mremap_test.c
index 1d4c1589c305..dd1cbb068982 100644
--- a/tools/testing/selftests/mm/mremap_test.c
+++ b/tools/testing/selftests/mm/mremap_test.c
@@ -457,7 +457,7 @@ static long long remap_region(struct config c, unsigned int threshold_mb,
char c = (char) rand();
if (((char *) dest_preamble_addr)[i] != c) {
- ksft_print_msg("Preamble data after remap doesn't match at offset %d\n",
+ ksft_print_msg("Preamble data after remap doesn't match at offset %llu\n",
i);
ksft_print_msg("Expected: %#x\t Got: %#x\n", c & 0xff,
((char *) dest_preamble_addr)[i] & 0xff);
--
2.42.0
From: Christoph Müllner <christoph.muellner(a)vrull.eu>
When building the RISC-V selftests with a riscv32 compiler I ran into
a couple of compiler warnings. While riscv32 support for these tests is
questionable, the fixes are so trivial that it is probably best to simply
apply them.
Note that the missing-include patch and some format string warnings
are also relevant for riscv64.
Christoph Müllner (5):
tools: selftests: riscv: Fix compile warnings in hwprobe
tools: selftests: riscv: Fix compile warnings in cbo
tools: selftests: riscv: Add missing include for vector test
tools: selftests: riscv: Fix compile warnings in vector tests
tools: selftests: riscv: Fix compile warnings in mm tests
tools/testing/selftests/riscv/hwprobe/cbo.c | 6 +++---
tools/testing/selftests/riscv/hwprobe/hwprobe.c | 4 ++--
tools/testing/selftests/riscv/mm/mmap_test.h | 3 +++
tools/testing/selftests/riscv/vector/v_initval_nolibc.c | 2 +-
tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c | 3 +++
tools/testing/selftests/riscv/vector/vstate_prctl.c | 4 ++--
6 files changed, 14 insertions(+), 8 deletions(-)
--
2.41.0
When executing the dirty_log_test on some aarch64 machines, it sometimes
triggers the ASSERT:
==== Test Assertion Failure ====
dirty_log_test.c:384: dirty_ring_vcpu_ring_full
pid=14854 tid=14854 errno=22 - Invalid argument
1 0x00000000004033eb: dirty_ring_collect_dirty_pages at dirty_log_test.c:384
2 0x0000000000402d27: log_mode_collect_dirty_pages at dirty_log_test.c:505
3 (inlined by) run_test at dirty_log_test.c:802
4 0x0000000000403dc7: for_each_guest_mode at guest_modes.c:100
5 0x0000000000401dff: main at dirty_log_test.c:941 (discriminator 3)
6 0x0000ffff9be173c7: ?? ??:0
7 0x0000ffff9be1749f: ?? ??:0
8 0x000000000040206f: _start at ??:?
Didn't continue vcpu even without ring full
The dirty_log_test fails when executing the dirty-ring test because
sem_vcpu_cont and sem_vcpu_stop have non-zero values when
dirty_ring_collect_dirty_pages() runs. When those two
sem_t variables are non-zero, the dirty_ring_wait_vcpu() call at the
beginning of dirty_ring_collect_dirty_pages() does not wait for the
vcpu to stop, but continues to execute the following code. In this case,
suppose dirty_ring_vcpu_ring_full is true before the vcpu stops, and
dirty_ring_collect_dirty_pages() has passed the check for
dirty_ring_vcpu_ring_full but has not yet executed the check for
continued_vcpu. If the vcpu then stops and sets dirty_ring_vcpu_ring_full
to false, dirty_ring_collect_dirty_pages() triggers the ASSERT.
Why can sem_vcpu_cont and sem_vcpu_stop be non-zero? Because
dirty_ring_before_vcpu_join() executes sem_post(&sem_vcpu_cont)
at the end of each dirty-ring test. This can lead to two cases:
1. sem_vcpu_cont is non-zero. When we set host_quit to true and
the vcpu_worker sees it immediately, the worker quits. The
log_mode_before_vcpu_join() function then sets sem_vcpu_cont
to 1, and since the vcpu_worker has quit, nothing consumes it.
2. sem_vcpu_stop is non-zero. When we set host_quit to true while
the vcpu_worker is in the guest state, the next time it exits
from the guest state it sets sem_vcpu_stop to 1 and then sees
host_quit; nothing consumes sem_vcpu_stop.
As more and more dirty-ring tests execute, sem_vcpu_cont and
sem_vcpu_stop can grow larger and larger, so many code paths
no longer wait on the sem_t. This eventually causes the problem.
To fix this problem, we can wait a while before setting host_quit to
true, which gives the vcpu time to enter the guest state, so it will
exit again. Then we can wait for the vcpu to exit and let it continue
again, so that the vcpu sees host_quit. Thus sem_vcpu_cont and
sem_vcpu_stop will both be zero when the test finishes.
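The accumulation described above can be reproduced in isolation with a plain POSIX semaphore. The sketch below is not taken from dirty_log_test.c; leftover_posts() is a hypothetical helper that posts and waits a given number of times in a single thread and reports what is left, which is the value a later sem_wait() would consume without blocking.

```c
#include <semaphore.h>

/*
 * Post `posts` times and wait at most `waits` times on a fresh semaphore,
 * then return its remaining value (or -1 if sem_init() fails). Waits are
 * capped at `posts` so this single-threaded sketch never blocks.
 */
int leftover_posts(unsigned int posts, unsigned int waits)
{
	sem_t sem;
	int value = -1;
	unsigned int i;

	if (sem_init(&sem, 0, 0) != 0)
		return -1;

	for (i = 0; i < posts; i++)
		sem_post(&sem);
	for (i = 0; i < waits && i < posts; i++)
		sem_wait(&sem);

	sem_getvalue(&sem, &value);
	sem_destroy(&sem);
	return value;
}
```

One post with no matching wait leaves the value at 1, which mirrors the unconsumed sem_post(&sem_vcpu_cont) at the end of a test; balanced posts and waits leave 0, which is the state the fix aims for. On older glibc versions this may need to be linked with -pthread.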
Signed-off-by: Shaoqin Huang <shahuang(a)redhat.com>
---
v1->v2:
- Fix the real logic bug, not just refresh the context.
v1: https://lore.kernel.org/all/20231116093536.22256-1-shahuang@redhat.com/
---
tools/testing/selftests/kvm/dirty_log_test.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c
index 936f3a8d1b83..a6e0ff46a07c 100644
--- a/tools/testing/selftests/kvm/dirty_log_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_test.c
@@ -417,7 +417,8 @@ static void dirty_ring_after_vcpu_run(struct kvm_vcpu *vcpu, int ret, int err)
static void dirty_ring_before_vcpu_join(void)
{
- /* Kick another round of vcpu just to make sure it will quit */
+ /* Wait vcpu exit, and let it continue to see the host_quit. */
+ dirty_ring_wait_vcpu();
sem_post(&sem_vcpu_cont);
}
@@ -719,6 +720,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
struct kvm_vm *vm;
unsigned long *bmap;
uint32_t ring_buf_idx = 0;
+ int sem_val;
if (!log_mode_supported()) {
print_skip("Log mode '%s' not supported",
@@ -726,6 +728,11 @@ static void run_test(enum vm_guest_mode mode, void *arg)
return;
}
+ sem_getvalue(&sem_vcpu_stop, &sem_val);
+ assert(sem_val == 0);
+ sem_getvalue(&sem_vcpu_cont, &sem_val);
+ assert(sem_val == 0);
+
/*
* We reserve page table for 2 times of extra dirty mem which
* will definitely cover the original (1G+) test range. Here
@@ -825,6 +832,13 @@ static void run_test(enum vm_guest_mode mode, void *arg)
sync_global_to_guest(vm, iteration);
}
+ /*
+ *
+ * Before we set the host_quit, let the vcpu has time to run, to make
+ * sure we consume the sem_vcpu_stop and the vcpu consume the
+ * sem_vcpu_cont, to keep the semaphore balance.
+ */
+ usleep(p->interval * 1000);
/* Tell the vcpu thread to quit */
host_quit = true;
log_mode_before_vcpu_join();
--
2.40.1
Now that we have the VISIBLE_IF_KUNIT and EXPORT_SYMBOL_IF_KUNIT macros,
update the instructions to recommend this way of testing static
functions.
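For readers unfamiliar with the macros, here is a userspace sketch of the idea. The DEMO_KUNIT_ENABLED switch and the macro bodies are stand-ins written for this illustration (the kernel's own definitions depend on CONFIG_KUNIT and on EXPORT_SYMBOL machinery that is not available outside the kernel); do_interesting_thing() is the name used in the documentation example.

```c
/* Stand-in for CONFIG_KUNIT; build with -DDEMO_KUNIT_ENABLED=0 to compare. */
#ifndef DEMO_KUNIT_ENABLED
#define DEMO_KUNIT_ENABLED 1
#endif

#if DEMO_KUNIT_ENABLED
/* Test build: the function gets external linkage so a test can call it. */
#define VISIBLE_IF_KUNIT
#else
/* Normal build: the function stays static, as it was before. */
#define VISIBLE_IF_KUNIT static
#endif

/* Userspace placeholder: expands to a harmless forward declaration. */
#define EXPORT_SYMBOL_IF_KUNIT(sym) struct demo_kunit_export_##sym

/* What my_file.h would carry. */
#if DEMO_KUNIT_ENABLED
int do_interesting_thing(void);
#endif

/* What my_file.c would carry. */
VISIBLE_IF_KUNIT int do_interesting_thing(void)
{
	return 42;	/* arbitrary value for the illustration */
}
EXPORT_SYMBOL_IF_KUNIT(do_interesting_thing);
```

The point of the pattern is that the source file is written once and the linkage of the function follows the configuration, so nothing extra is exposed when KUnit is disabled.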
Signed-off-by: Arthur Grillo <arthurgrillo(a)riseup.net>
---
Changes in v3:
- Maintain the old '#include' way
- Link to v2: https://lore.kernel.org/r/20240108-kunit-doc-export-v2-1-8f2dd3395fed@riseu…
Changes in v2:
- Fix #if condition
- Link to v1: https://lore.kernel.org/r/20240108-kunit-doc-export-v1-1-119368df0d96@riseu…
---
Documentation/dev-tools/kunit/usage.rst | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst
index c27e1646ecd9..8e35b94a17ec 100644
--- a/Documentation/dev-tools/kunit/usage.rst
+++ b/Documentation/dev-tools/kunit/usage.rst
@@ -671,8 +671,23 @@ Testing Static Functions
------------------------
If we do not want to expose functions or variables for testing, one option is to
-conditionally ``#include`` the test file at the end of your .c file. For
-example:
+conditionally export the used symbol. For example:
+
+.. code-block:: c
+
+ /* In my_file.c */
+
+ VISIBLE_IF_KUNIT int do_interesting_thing();
+ EXPORT_SYMBOL_IF_KUNIT(do_interesting_thing);
+
+ /* In my_file.h */
+
+ #if IS_ENABLED(CONFIG_KUNIT)
+ int do_interesting_thing(void);
+ #endif
+
+Alternatively, you could conditionally ``#include`` the test file at the end of
+your .c file. For example:
.. code-block:: c
---
base-commit: eeb8e8d9f124f279e80ae679f4ba6e822ce4f95f
change-id: 20240108-kunit-doc-export-eec1f910ab67
Best regards,
--
Arthur Grillo <arthurgrillo(a)riseup.net>
The rules to link selftests are:
> $(OUTPUT)/%_ipv4: %.c
> $(LINK.c) $^ $(LDLIBS) -o $@
>
> $(OUTPUT)/%_ipv6: %.c
> $(LINK.c) -DIPV6_TEST $^ $(LDLIBS) -o $@
The Intel test robot uses only the selftests Makefile, not the top-level Linux
Makefile:
> make W=1 O=/tmp/kselftest -C tools/testing/selftests
So, $(LINK.c) is determined by the environment rather than by the kernel
Makefiles. On my machine (as well as for other people who ran the tcp-ao
selftests), the GNU Make implicit definition does use $(LDFLAGS):
> [dima@Mindolluin ~]$ make -p -f/dev/null | grep '^LINK.c\>'
> make: *** No targets. Stop.
> LINK.c = $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH)
But according to the build robot report, that is not the case for them.
While I could just avoid using the pre-defined $(LINK.c), it's also used by
selftests/lib.mk by default.
Anyway, according to the GNU Make documentation [1], I should have used
$(LDLIBS) instead of $(LDFLAGS) in the first place, so let's just do it:
> LDFLAGS
> Extra flags to give to compilers when they are supposed to invoke
> the linker, ‘ld’, such as -L. Libraries (-lfoo) should be added
> to the LDLIBS variable instead.
> LDLIBS
> Library flags or names given to compilers when they are supposed
> to invoke the linker, ‘ld’. LOADLIBES is a deprecated (but still
> supported) alternative to LDLIBS. Non-library linker flags, such
> as -L, should go in the LDFLAGS variable.
[1]: https://www.gnu.org/software/make/manual/html_node/Implicit-Variables.html
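To make the difference concrete, the rule expansion can be modelled in a few lines of C. This is a toy model, not Make itself: expand_link_rule() and links_libm() are hypothetical helpers, "cc" stands in for $(CC), and the file names are simply the ones from the Makefile above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Model of the rule "$(LINK.c) $^ $(LDLIBS) -o $@".
 * link_c_has_ldflags selects between the two environments discussed above:
 * one where LINK.c includes $(LDFLAGS) and one where it does not.
 */
static int expand_link_rule(char *buf, size_t len, int link_c_has_ldflags,
			    const char *ldflags, const char *ldlibs)
{
	return snprintf(buf, len,
			"cc %s bench-lookups.c %s -o bench-lookups_ipv4",
			link_c_has_ldflags ? ldflags : "", ldlibs);
}

/* Returns nonzero when the expanded command line asks for libm. */
static int links_libm(const char *cmd)
{
	return strstr(cmd, "-lm") != NULL;
}
```

With "-lm" in LDFLAGS the library is only linked when LINK.c happens to expand $(LDFLAGS); with "-lm" in LDLIBS it is always present, because the rule names $(LDLIBS) explicitly.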
Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401011151.veyYTJzq-lkp@intel.com/
Signed-off-by: Dmitry Safonov <dima(a)arista.com>
---
tools/testing/selftests/net/tcp_ao/Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/tcp_ao/Makefile b/tools/testing/selftests/net/tcp_ao/Makefile
index 8e60bae67aa9..522d991e310e 100644
--- a/tools/testing/selftests/net/tcp_ao/Makefile
+++ b/tools/testing/selftests/net/tcp_ao/Makefile
@@ -52,5 +52,5 @@ $(OUTPUT)/%_ipv6: %.c
$(OUTPUT)/icmps-accept_ipv4: CFLAGS+= -DTEST_ICMPS_ACCEPT
$(OUTPUT)/icmps-accept_ipv6: CFLAGS+= -DTEST_ICMPS_ACCEPT
-$(OUTPUT)/bench-lookups_ipv4: LDFLAGS+= -lm
-$(OUTPUT)/bench-lookups_ipv6: LDFLAGS+= -lm
+$(OUTPUT)/bench-lookups_ipv4: LDLIBS+= -lm
+$(OUTPUT)/bench-lookups_ipv6: LDLIBS+= -lm
---
base-commit: 8cb47d7cd090a690c1785385b2f3d407d4a53ad0
change-id: 20240110-tcp_ao-selftests-makefile-3dafb1e96df8
Best regards,
--
Dmitry Safonov <dima(a)arista.com>
Changes in v5:
* Fixed an issue found by Joe that copied Kbuild files along with the
test modules to the installation directory.
* Added Joe Lawrence's review tags.
Changes in v4:
* Documented how to compile the livepatch selftests without running the
tests (Joe)
* Removed the mention to lib/livepatch on MAINTAINERS file, reported by
checkpatch.
Changes in v3:
* Rebased on top of v6.6-rc5
* The commits messages were improved (Thanks Petr!)
* Created TEST_GEN_MODS_DIR variable to point to a directory that contains kernel
modules, and adapted selftests to build it before running the test.
* Moved test_klp-call_getpid out of test_programs, since the gen_tar
would just copy the generated test programs to the livepatches dir,
and so scripts relying on test_programs/test_klp-call_getpid will fail.
* Added a module_param for klp_pids, describing its usage.
* Simplified the call_getpid program to ignore the return of the getpid syscall,
since we only want to make sure the process transitions correctly to the
patched state.
* The test-syscall.sh now prints a log message showing the number of remaining
processes to transition into the livepatched state, and check_output expects it
to be 0.
* Added MODULE_AUTHOR and MODULE_DESCRIPTION to test_klp_syscall.c
- Link to v3: https://lore.kernel.org/r/20231031-send-lp-kselftests-v3-0-2b1655c2605f@sus…
- Link to v2: https://lore.kernel.org/linux-kselftest/20220630141226.2802-1-mpdesouza@sus…
This patchset moves the current kernel testing livepatch modules from
lib/livepatch to tools/testing/selftests/livepatch/test_modules, and compiles
them as out-of-tree modules before testing.
There is also a new test being added. This new test exercises multiple processes
calling a syscall while a livepatch patches the syscall.
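As a rough illustration of the userspace side of such a test, the sketch below hammers getpid() until it is told to stop. It is an assumption-laden approximation of what a program like test_klp-call_getpid does: the helper names and the max_calls bound are additions for this sketch so that the loop always terminates.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <sys/syscall.h>
#include <unistd.h>

static volatile sig_atomic_t stop;

/* SIGHUP asks the loop to stop. */
static void hup_handler(int signum)
{
	(void)signum;
	stop = 1;
}

static void install_stop_handler(void)
{
	stop = 0;
	signal(SIGHUP, hup_handler);
}

/*
 * Call getpid() through syscall() so every iteration enters the kernel.
 * The return value is ignored: only the kernel entry matters here.
 * Returns the number of calls made.
 */
static long call_getpid_loop(long max_calls)
{
	long count = 0;

	while (!stop && count < max_calls) {
		(void)syscall(SYS_getpid);
		count++;
	}
	return count;
}
```

Running several such processes while the livepatch is applied gives each of them a chance to enter the kernel, which is what a per-task transition to the patched state needs.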
Why this move is an improvement:
* The modules are now compiled as out-of-tree modules against the current
running kernel, making them capable of being tested on different systems with
newer or older kernels.
* This approach now requires the kernel-devel package to be installed, since the
modules are built out of tree. The package can be generated by running
"make rpm-pkg" in the kernel source.
What needs to be solved:
* Currently gen_tar only packages the resulting binaries of the tests, and not
the sources. For the current approach, the newly added modules would be
compiled and then packaged. It works when testing on a system with the same
kernel version. But it will fail when running on a machine with a different kernel
version, since the modules were compiled against the currently running kernel.
This is not a new problem, just aligning the expectations. For the current
approach to be truly system agnostic, gen_tar would need to include the module
and program sources to be compiled on the target systems.
Thanks in advance!
Marcos
Signed-off-by: Marcos Paulo de Souza <mpdesouza(a)suse.com>
---
Marcos Paulo de Souza (3):
kselftests: lib.mk: Add TEST_GEN_MODS_DIR variable
livepatch: Move tests from lib/livepatch to selftests/livepatch
selftests: livepatch: Test livepatching a heavily called syscall
Documentation/dev-tools/kselftest.rst | 4 +
MAINTAINERS | 1 -
arch/s390/configs/debug_defconfig | 1 -
arch/s390/configs/defconfig | 1 -
lib/Kconfig.debug | 22 ----
lib/Makefile | 2 -
lib/livepatch/Makefile | 14 ---
tools/testing/selftests/lib.mk | 25 ++++-
tools/testing/selftests/livepatch/Makefile | 5 +-
tools/testing/selftests/livepatch/README | 25 +++--
tools/testing/selftests/livepatch/config | 1 -
tools/testing/selftests/livepatch/functions.sh | 34 +++---
.../testing/selftests/livepatch/test-callbacks.sh | 50 ++++-----
tools/testing/selftests/livepatch/test-ftrace.sh | 6 +-
.../testing/selftests/livepatch/test-livepatch.sh | 10 +-
.../selftests/livepatch/test-shadow-vars.sh | 2 +-
tools/testing/selftests/livepatch/test-state.sh | 18 ++--
tools/testing/selftests/livepatch/test-syscall.sh | 53 ++++++++++
tools/testing/selftests/livepatch/test-sysfs.sh | 6 +-
.../selftests/livepatch/test_klp-call_getpid.c | 44 ++++++++
.../selftests/livepatch/test_modules/Makefile | 20 ++++
.../test_modules}/test_klp_atomic_replace.c | 0
.../test_modules}/test_klp_callbacks_busy.c | 0
.../test_modules}/test_klp_callbacks_demo.c | 0
.../test_modules}/test_klp_callbacks_demo2.c | 0
.../test_modules}/test_klp_callbacks_mod.c | 0
.../livepatch/test_modules}/test_klp_livepatch.c | 0
.../livepatch/test_modules}/test_klp_shadow_vars.c | 0
.../livepatch/test_modules}/test_klp_state.c | 0
.../livepatch/test_modules}/test_klp_state2.c | 0
.../livepatch/test_modules}/test_klp_state3.c | 0
.../livepatch/test_modules/test_klp_syscall.c | 116 +++++++++++++++++++++
32 files changed, 339 insertions(+), 121 deletions(-)
---
base-commit: 89ecef4cb0ac442d5ad48c1aae1e2e1e7744d46f
change-id: 20231031-send-lp-kselftests-4c917dcd4565
Best regards,
--
Marcos Paulo de Souza <mpdesouza(a)suse.com>
Hi all:
The core frequency is subject to process variation in semiconductors.
Not all cores are able to reach the maximum frequency while respecting the
infrastructure limits. Consequently, AMD has redefined the concept of
maximum frequency of a part. This means that only a fraction of cores can reach
maximum frequency. To find the best process scheduling policy for a given
scenario, the OS needs to know the core ordering reported by the platform through
the highest performance capability register of the CPPC interface.
Earlier implementations of amd-pstate preferred core only supported a static
core ranking and targeted performance. Now it has the ability to dynamically
change the preferred core based on the workload and platform conditions,
accounting for thermals and aging.
The amd-pstate driver utilizes the functions and data structures provided by
the ITMT architecture to enable the scheduler to favor scheduling on cores
which can reach a higher frequency at a lower voltage.
We call it amd-pstate preferred core.
Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable the ITMT feature.
The amd-pstate driver uses the highest performance value to indicate
the priority of a CPU. A higher value means a higher priority.
The amd-pstate driver provides an initial core ordering at boot time.
It relies on the CPPC interface to communicate the core ranking to the
operating system and scheduler, so that the OS chooses the cores with the
highest performance first when scheduling a process. When the amd-pstate
driver receives a message that the highest performance has changed, it
updates the core ranking.
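The ordering rule can be pictured with a small userspace model. This is only a sketch: struct demo_cpu_perf and demo_rank_cores() are hypothetical, the highest_perf numbers used with it are made up, and the real driver hands priorities to the scheduler through sched_set_itmt_core_prio() rather than sorting an array.

```c
#include <stdlib.h>

/* Hypothetical per-CPU record: CPU id plus its CPPC highest_perf value. */
struct demo_cpu_perf {
	int cpu;
	unsigned int highest_perf;
};

/* Higher highest_perf sorts first; ties are broken by lower CPU id. */
static int cmp_prio_desc(const void *a, const void *b)
{
	const struct demo_cpu_perf *x = a, *y = b;

	if (x->highest_perf != y->highest_perf)
		return x->highest_perf > y->highest_perf ? -1 : 1;
	return (x->cpu > y->cpu) - (x->cpu < y->cpu);
}

/* Reorder the array so the preferred cores come first. */
static void demo_rank_cores(struct demo_cpu_perf *cpus, size_t n)
{
	qsort(cpus, n, sizeof(*cpus), cmp_prio_desc);
}
```

When the platform reports a changed highest performance value, re-running the ranking with the new value corresponds to the dynamic update described above.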
Changes from V11->V12:
- all:
- - pick up Reviewed-By flag added by Perry.
- cpufreq: amd-pstate:
- - rebase onto the latest linux-next and fix conflicts.
- - fix the issue of cpudata being used without init in amd_pstate_update_highest_perf().
Changes from V10->V11:
- cpufreq: amd-pstate:
- - according to Perry's comments, replace the string with str_enabled_disable().
Changes from V9->V10:
- cpufreq: amd-pstate:
- - add a check for highest_perf: when it is less than 255, the
preferred core feature is enabled and the priority is set.
- - delete "static u32 max_highest_perf" etc., because amd-pstate
preferred core does not require special processing for hotplug.
Changes from V8->V9:
- all:
- - pick up Tested-By flag added by Oleksandr.
- cpufreq: amd-pstate:
- - pick up Review-By flag added by Wyes.
- - ignore modification of bug.
- - add a attribute of prefcore_ranking.
- - modify data type conversion from u32 to int.
- Documentation: amd-pstate:
- - pick up Review-By flag added by Wyes.
Changes from V7->V8:
- all:
- - pick up Review-By flag added by Mario and Ray.
- cpufreq: amd-pstate:
- - use hw_prefcore embeds into cpudata structure.
- - delete preferred core init from cpu online/off.
Changes from V6->V7:
- x86:
- - Modify kconfig about X86_AMD_PSTATE.
- cpufreq: amd-pstate:
- - modify incorrect comments about scheduler_work().
- - convert highest_perf data type.
- - modify preferred core init when cpu init and online.
- acpi: cppc:
- - modify link of CPPC highest performance.
- cpufreq:
- - modify link of CPPC highest performance changed.
Changes from V5->V6:
- cpufreq: amd-pstate:
- - modify the wrong tag order.
- - modify warning about hw_prefcore sysfs attribute.
- - delete duplicate comments.
- - modify the variable name cppc_highest_perf to prefcore_ranking.
- - modify judgment conditions for setting highest_perf.
- - modify sysfs attribute for CPPC highest perf to pr_debug message.
- Documentation: amd-pstate:
- - modify warning: title underline too short.
Changes from V4->V5:
- cpufreq: amd-pstate:
- - modify sysfs attribute for CPPC highest perf.
- - modify warning about comments
- - rebase linux-next
- cpufreq:
- - Modify warning about function declarations.
- Documentation: amd-pstate:
- - align with ``amd-pstate``
Changes from V3->V4:
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes from V2->V3:
- x86:
- - Modify kconfig and description.
- cpufreq: amd-pstate:
- - Add Co-developed-by tag in commit message.
- cpufreq:
- - Modify commit message.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes from V1->V2:
- acpi: cppc:
- - Add reference link.
- cpufreq:
- - Modify link error.
- cpufreq: amd-pstate:
- - Init the priorities of all online CPUs
- - Use a single variable to represent the status of preferred core.
- Documentation:
- - Default enabled preferred core.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
- - Default enabled preferred core.
- - Use a single variable to represent the status of preferred core.
Meng Li (7):
x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
acpi: cppc: Add get the highest performance cppc control
cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
cpufreq: Add a notification message that the highest perf has changed
cpufreq: amd-pstate: Update amd-pstate preferred core ranking
dynamically
Documentation: amd-pstate: introduce amd-pstate preferred core
Documentation: introduce amd-pstate preferrd core mode kernel command
line options
.../admin-guide/kernel-parameters.txt | 5 +
Documentation/admin-guide/pm/amd-pstate.rst | 59 +++++-
arch/x86/Kconfig | 5 +-
drivers/acpi/cppc_acpi.c | 13 ++
drivers/acpi/processor_driver.c | 6 +
drivers/cpufreq/amd-pstate.c | 175 +++++++++++++++++-
drivers/cpufreq/cpufreq.c | 13 ++
include/acpi/cppc_acpi.h | 5 +
include/linux/amd-pstate.h | 10 +
include/linux/cpufreq.h | 5 +
10 files changed, 284 insertions(+), 12 deletions(-)
--
2.34.1
Nested translation is a hardware feature that is supported by many modern
IOMMUs. It uses two stages of address translation (stage-1, stage-2)
to reach the physical address. The stage-1 translation table is owned
by userspace (e.g. by a guest OS), while stage-2 is owned by the kernel. Changes
to the stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example: the stage-1 translation table is the I/O page
table. As the diagram below shows, the guest I/O page table pointer in GPA
(guest physical address) is passed to the host and used to perform the stage-1
address translation. Along with it, modifications to present mappings in the
guest I/O page table should be followed by an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
This series builds on the first part, which was merged [1]. It adds
the cache invalidation interface for userspace to invalidate the cache after
modifying the stage-1 page table. This includes both the iommufd changes and the
VT-d driver changes.
Complete code can be found in [2]; the QEMU code can be found in [3].
Finally, this is team work together with Nicolin Chen and Lu Baolu. Thanks to
them for the help. ^_^ Looking forward to your feedback.
[1] https://lore.kernel.org/linux-iommu/20231026044216.64964-1-yi.l.liu@intel.c… - merged
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v8:
- Pass invalidation hint to the cache invalidation helper in the cache_invalidate_user
op path (Kevin)
- Move the devTLB invalidation out of info->iommu loop (Kevin, Weijiang)
- Clear *fault per restart in qi_submit_sync() to avoid error accumulation
across submissions. (Kevin)
- Define the vtd cache invalidation uapi structure in separate patch (Kevin)
- Rename inv_error to be hw_error (Kevin)
- Rename 'reqs_uptr', 'req_type', 'req_len' and 'req_num' to be 'data_uptr',
'data_type', 'entry_len' and 'entry_num' (Kevin)
- Allow user to set IOMMU_TEST_INVALIDATE_FLAG_ALL and IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR
at the same time (Kevin)
v7: https://lore.kernel.org/linux-iommu/20231221153948.119007-1-yi.l.liu@intel.…
- Remove domain->ops->cache_invalidate_user check in hwpt alloc path due
to failure in bisect (Baolu)
- Remove out_driver_error_code from struct iommu_hwpt_invalidate after
discussion in v6. Should expect per-entry error code.
- Rework the selftest cache invalidation part to report a per-entry error
- Allow user to pass in an empty array to have a try-and-fail mechanism for
user to check if a given req_type is supported by the kernel (Jason)
- Define a separate enum type for cache invalidation data (Jason)
- Fix the IOMMU_HWPT_INVALIDATE to always update the req_num field before
returning (Nicolin)
- Merge the VT-d nesting part 2/2
https://lore.kernel.org/linux-iommu/20231117131816.24359-1-yi.l.liu@intel.c…
into this series to avoid defining empty enum in the middle of the series.
The major difference is adding the VT-d related invalidation uapi structures
together with the generic data structures in patch 02 of this series.
- VT-d driver was refined to report ICE/ITE error from the bottom cache
invalidation submit helpers, hence the cache_invalidate_user op could
report such errors via the per-entry error field to user. VT-d driver
will not stop the invalidation array walking due to the ICE/ITE errors
as such errors are defined by VT-d spec, userspace should be able to
handle it and let the real user (say Virtual Machine) know about it.
But other errors, such as an invalid uapi data structure configuration or a
memory copy failure, should stop the array walk, as continuing may cause
more issues.
- Minor fixes per Jason and Kevin's review comments
v6: https://lore.kernel.org/linux-iommu/20231117130717.19875-1-yi.l.liu@intel.c…
- Not much change, just rebase on top of 6.7-rc1 as part 1/2 is merged
v5: https://lore.kernel.org/linux-iommu/20231020092426.13907-1-yi.l.liu@intel.c…
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation,
this does not change the rule that resv regions should only be added to the
kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series
as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
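The array-walking behaviour described in the v7 and v8 notes (per-entry hardware errors do not stop the walk, malformed entries do, and the entry count is always written back) can be modelled in userspace. Everything in this sketch is hypothetical: the struct layout, the DEMO_ names and the rule used to fake a hardware fault are illustrations, not the uapi added by this series.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_HW_ERR		0x1	/* stand-in for a hardware-reported fault */
#define DEMO_FAULTY_ADDR	0xdead000ULL	/* address that "faults" in this model */

/* Hypothetical invalidation request entry. */
struct demo_inv_entry {
	uint64_t addr;
	uint64_t npages;
	uint32_t flags;		/* must be 0 in this sketch */
	uint32_t hw_error;	/* written back per entry */
};

/* Pretend hardware: one magic address reports a fault, all others succeed. */
static uint32_t demo_hw_invalidate(const struct demo_inv_entry *e)
{
	return e->addr == DEMO_FAULTY_ADDR ? DEMO_HW_ERR : 0;
}

/*
 * Walk the array. A hardware fault is recorded in the entry and the walk
 * continues; a malformed entry stops the walk. *entry_num is always updated
 * to the number of entries handled before returning.
 */
static int demo_invalidate_array(struct demo_inv_entry *entries,
				 uint32_t *entry_num)
{
	uint32_t done;
	int ret = 0;

	for (done = 0; done < *entry_num; done++) {
		struct demo_inv_entry *e = &entries[done];

		if (e->flags != 0) {
			ret = -EINVAL;
			break;
		}
		e->hw_error = demo_hw_invalidate(e);
	}
	*entry_num = done;
	return ret;
}
```

A caller can therefore tell from the returned count how many entries were consumed, and from each entry's hw_error field which of those hit a hardware fault.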
Thanks,
Yi Liu
Lu Baolu (4):
iommu: Add cache_invalidate_user op
iommu/vt-d: Allow qi_submit_sync() to return the QI faults
iommu/vt-d: Convert stage-1 cache invalidation to return QI fault
iommu/vt-d: Add iotlb flush for nested domain
Nicolin Chen (4):
iommu: Add iommu_copy_struct_from_user_array helper
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (2):
iommufd: Add IOMMU_HWPT_INVALIDATE
iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
drivers/iommu/intel/dmar.c | 38 ++--
drivers/iommu/intel/iommu.c | 12 +-
drivers/iommu/intel/iommu.h | 8 +-
drivers/iommu/intel/irq_remapping.c | 2 +-
drivers/iommu/intel/nested.c | 118 ++++++++++++
drivers/iommu/intel/pasid.c | 14 +-
drivers/iommu/intel/svm.c | 14 +-
drivers/iommu/iommufd/hw_pagetable.c | 41 ++++
drivers/iommu/iommufd/iommufd_private.h | 10 +
drivers/iommu/iommufd/iommufd_test.h | 39 ++++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 86 +++++++++
include/linux/iommu.h | 100 ++++++++++
include/uapi/linux/iommufd.h | 98 ++++++++++
tools/testing/selftests/iommu/iommufd.c | 179 ++++++++++++++++++
tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++
16 files changed, 781 insertions(+), 38 deletions(-)
--
2.34.1
From: Jeff Xu <jeffxu(a)chromium.org>
This patchset proposes a new mseal() syscall for the Linux kernel.
In a nutshell, mseal() protects the VMAs of a given virtual memory
range against modifications, such as changes to their permission bits.
Modern CPUs support memory permissions, such as the read/write (RW)
and no-execute (NX) bits. Linux has supported NX since the release of
kernel version 2.6.8 in August 2004 [1]. The memory permission feature
improves the security stance on memory corruption bugs, as an attacker
cannot simply write to arbitrary memory and point the code to it. The
memory must be marked with the X bit, or else an exception will occur.
Internally, the kernel maintains the memory permissions in a data
structure called VMA (vm_area_struct). mseal() additionally protects
the VMA itself against modifications of the selected seal type.
Memory sealing is useful to mitigate memory corruption issues where a
corrupted pointer is passed to a memory management system. For
example, such an attacker primitive can break control-flow integrity
guarantees since read-only memory that is supposed to be trusted can
become writable or .text pages can get remapped. Memory sealing can
automatically be applied by the runtime loader to seal .text and
.rodata pages and applications can additionally seal security critical
data at runtime. A similar feature already exists in the XNU kernel
with the VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the
mimmutable syscall [4]. Also, Chrome wants to adopt this feature for
their CFI work [2] and this patchset has been designed to be
compatible with the Chrome use case.
Two system calls are involved in sealing the map: mmap() and mseal().
The new mseal() is a syscall on 64-bit CPUs, with the
following signature:
int mseal(void *addr, size_t len, unsigned long flags)
addr/len: memory range.
flags: reserved.
mseal() blocks the following operations for the given memory range.
1> Unmapping, moving to another location, and shrinking the size,
via munmap() and mremap(), can leave an empty space, therefore can
be replaced with a VMA with a new set of attributes.
2> Moving or expanding a different VMA into the current location,
via mremap().
3> Modifying a VMA via mmap(MAP_FIXED).
4> Size expansion, via mremap(), does not appear to pose any specific
risks to sealed VMAs. It is included anyway because the use case is
unclear. In any case, users can rely on merging to expand a sealed VMA.
5> mprotect() and pkey_mprotect().
6> Some destructive madvise() behaviors (e.g. MADV_DONTNEED) for anonymous
memory, when users don't have write permission to the memory. Those
behaviors can alter region contents by discarding pages, effectively a
memset(0) for anonymous memory.
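A userspace probe for item 5 might look like the sketch below. It rests on assumptions: it only issues the syscall when the toolchain headers define __NR_mseal, it uses a plain anonymous mapping without the MAP_SEALABLE or PROT_SEAL bits proposed in this series (so a kernel that requires MAP_SEALABLE will simply refuse the seal), and demo_seal_and_probe() is a hypothetical helper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Returns 1 if the range was sealed and a later mprotect() was refused
 * with EPERM, 0 if sealing is unavailable or was refused, and -1 on an
 * unexpected result.
 */
static int demo_seal_and_probe(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len;
	void *p;

	if (page <= 0)
		return -1;
	len = (size_t)page;

	p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

#ifdef __NR_mseal
	if (syscall(__NR_mseal, p, len, 0UL) == 0) {
		/* A sealed mapping cannot be unmapped; it lives until exit. */
		if (mprotect(p, len, PROT_READ | PROT_WRITE) == -1 &&
		    errno == EPERM)
			return 1;
		return -1;
	}
#endif
	/* Not sealed: clean up normally. */
	munmap(p, len);
	return 0;
}
```

On a kernel without the syscall the probe reports 0 rather than failing, which keeps it usable on systems that predate mseal().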
In addition: mmap() has two related changes.
The PROT_SEAL bit in prot field of mmap(). When present, it marks
the map sealed since creation.
The MAP_SEALABLE bit in the flags field of mmap(). When present, it marks
the map as sealable. A map created without MAP_SEALABLE will not support
sealing, i.e. mseal() will fail.
Applications that don't care about sealing will expect their behavior
unchanged. For those that need sealing support, opt-in by adding
MAP_SEALABLE in mmap().
The idea that inspired this patch comes from Stephen Röttger’s work in
V8 CFI [5]. Chrome browser in ChromeOS will be the first user of this
API.
Indeed, the Chrome browser has very specific requirements for sealing,
which are distinct from those of most applications. For example, in
the case of libc, sealing is only applied to read-only (RO) or
read-execute (RX) memory segments (such as .text and .RELRO) to
prevent them from becoming writable; the lifetime of those mappings
is tied to the lifetime of the process.
Chrome wants to seal two large address space reservations that are
managed by different allocators. The memory is mapped RW- and RWX
respectively but write access to it is restricted using pkeys (or in
the future ARM permission overlay extensions). The lifetime of those
mappings is not tied to the lifetime of the process; therefore, while
the memory is sealed, the allocators still need to free or discard the
unused memory, for example with madvise(DONTNEED).
However, always allowing madvise(DONTNEED) on this range poses a
security risk. For example if a jump instruction crosses a page
boundary and the second page gets discarded, it will overwrite the
target bytes with zeros and change the control flow. Checking
write-permission before the discard operation allows us to control
when the operation is valid. In this case, the madvise will only
succeed if the executing thread has PKEY write permissions and PKRU
changes are protected in software by control-flow integrity.
Although the initial version of this patch series is targeting the
Chrome browser as its first user, it became evident during upstream
discussions that we would also want to ensure that the patch set
eventually is a complete solution for memory sealing and compatible
with other use cases. The specific scenario currently in mind is
glibc's use case of loading and sealing ELF executables. To this end,
Stephen is working on a change to glibc to add sealing support to the
dynamic linker, which will seal all non-writable segments at startup.
Once this work is completed, all applications will be able to
automatically benefit from these new protections.
Change history:
===============
V5:
- fix build issue in mseal-Wire-up-mseal-syscall
(Suggested by Linus Torvalds, and Greg KH)
- updates on selftest.
V4:
(Suggested by Linus Torvalds)
- new signature: mseal(start,len,flags)
- 32 bit is not supported. vm_seal is removed, use vm_flags instead.
- single bit in vm_flags for sealed state.
- CONFIG_MSEAL kernel config is removed.
- single bit of PROT_SEAL in the "Prot" field of mmap().
Other changes:
- update selftest (Suggested by Muhammad Usama Anjum)
- update documentation.
https://lore.kernel.org/all/20240104185138.169307-1-jeffxu@chromium.org/
V3:
- Abandon per-syscall approach, (Suggested by Linus Torvalds).
- Organize sealing types around their functionality, such as
MM_SEAL_BASE, MM_SEAL_PROT_PKEY.
- Extend the scope of sealing from calls originated in userspace to
both kernel and userspace. (Suggested by Linus Torvalds)
- Add seal type support in mmap(). (Suggested by Pedro Falcato)
- Add a new sealing type: MM_SEAL_DISCARD_RO_ANON to prevent
destructive operations of madvise. (Suggested by Jann Horn and
Stephen Röttger)
- Make sealed VMAs mergeable. (Suggested by Jann Horn)
- Add MAP_SEALABLE to mmap()
- Add documentation - mseal.rst
https://lore.kernel.org/linux-mm/20231212231706.2680890-2-jeffxu@chromium.o…
v2:
Use _BITUL to define MM_SEAL_XX type.
Use unsigned long for seal type in sys_mseal() and other functions.
Remove internal VM_SEAL_XX type and convert_user_seal_type().
Remove MM_ACTION_XX type.
Remove caller_origin(ON_BEHALF_OF_XX) and replace with sealing bitmask.
Add more comments in code.
Add a detailed commit message.
https://lore.kernel.org/lkml/20231017090815.1067790-1-jeffxu@chromium.org/
v1:
https://lore.kernel.org/lkml/20231016143828.647848-1-jeffxu@chromium.org/
----------------------------------------------------------------
[1] https://kernelnewbies.org/Linux_2_6_8
[2] https://v8.dev/blog/control-flow-integrity
[3] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b…
[4] https://man.openbsd.org/mimmutable.2
[5] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXge…
[6] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426Fkcgnf…
[7] https://lore.kernel.org/lkml/20230515130553.2311248-1-jeffxu@chromium.org/
Jeff Xu (4):
mseal: Wire up mseal syscall
mseal: add mseal syscall
selftest mm/mseal memory sealing
mseal:add documentation
Documentation/userspace-api/mseal.rst | 181 ++
arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 2 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
include/linux/mm.h | 60 +
include/linux/syscalls.h | 1 +
include/uapi/asm-generic/mman-common.h | 7 +
include/uapi/asm-generic/unistd.h | 5 +-
kernel/sys_ni.c | 1 +
mm/Makefile | 4 +
mm/madvise.c | 12 +
mm/mmap.c | 27 +
mm/mprotect.c | 10 +
mm/mremap.c | 31 +
mm/mseal.c | 330 +++
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/mseal_test.c | 1989 +++++++++++++++++++
32 files changed, 2677 insertions(+), 2 deletions(-)
create mode 100644 Documentation/userspace-api/mseal.rst
create mode 100644 mm/mseal.c
create mode 100644 tools/testing/selftests/mm/mseal_test.c
--
2.43.0.195.gebba966016-goog
The kernel selftest mm/hugepage-vmemmap fails on architectures
whose page size is not 4K. hugepage-vmemmap hardcodes a 4K page
size, so the pfn calculation goes wrong on systems with a different
page size. The length of MAP_HUGETLB memory must also be hugepage
aligned, but hugepage-vmemmap hardcodes a 2M map length, which is not
aligned if the system has a different hugepage size.
Use psize() to get the page size and default_huge_page_size() to
get the default hugepage size at run time. With this change the
hugepage-vmemmap test passes on powerpc with 64K page size and on x86
with 4K page size.
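For readers unfamiliar with the helpers, the run-time lookup can be sketched
roughly as below. This is a minimal user-space sketch, not the vm_util.h
implementation: the names runtime_page_size(), parse_hugepagesize_line() and
runtime_default_huge_page_size() are hypothetical, and it assumes the default
hugepage size is taken from the "Hugepagesize:" line of /proc/meminfo.

```c
#include <stdio.h>
#include <unistd.h>

/* Base page size in bytes as reported by the C library, or 0 on error. */
static size_t runtime_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	return sz > 0 ? (size_t)sz : 0;
}

/*
 * Parse one /proc/meminfo line of the form "Hugepagesize:    2048 kB".
 * Returns the size in bytes, or 0 if the line is not that entry.
 */
static size_t parse_hugepagesize_line(const char *line)
{
	unsigned long kb;

	if (sscanf(line, "Hugepagesize: %lu kB", &kb) != 1)
		return 0;
	return (size_t)kb * 1024;
}

/* Default hugepage size in bytes, or 0 if it cannot be determined. */
static size_t runtime_default_huge_page_size(void)
{
	char line[256];
	size_t hps = 0;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return 0;
	while (fgets(line, sizeof(line), f)) {
		hps = parse_hugepagesize_line(line);
		if (hps)
			break;
	}
	fclose(f);
	return hps;
}
```

With these two values the pagemap offset becomes addr / page_size *
sizeof(entry) and the number of tail pages becomes huge_page_size / page_size,
instead of the fixed 4K and 2M constants.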
Result on powerpc without patch (page size 64K)
*# ./hugepage-vmemmap
Returned address is 0x7effff000000 whose pfn is 0
Head page flags (100000000) is invalid
check_page_flags: Invalid argument
*#
Result on powerpc with patch (page size 64K)
*# ./hugepage-vmemmap
Returned address is 0x7effff000000 whose pfn is 600
*#
Result on x86 with patch (page size 4K)
*# ./hugepage-vmemmap
Returned address is 0x7fc7c2c00000 whose pfn is 1dac00
*#
Signed-off-by: Donet Tom <donettom(a)linux.vnet.ibm.com>
Reported-by : Geetika Moolchandani (geetika(a)linux.ibm.com)
Tested-by : Geetika Moolchandani (geetika(a)linux.ibm.com)
---
tools/testing/selftests/mm/hugepage-vmemmap.c | 29 ++++++++++++-------
1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/mm/hugepage-vmemmap.c b/tools/testing/selftests/mm/hugepage-vmemmap.c
index 5b354c209e93..894d28c3dd47 100644
--- a/tools/testing/selftests/mm/hugepage-vmemmap.c
+++ b/tools/testing/selftests/mm/hugepage-vmemmap.c
@@ -10,10 +10,7 @@
#include <unistd.h>
#include <sys/mman.h>
#include <fcntl.h>
-
-#define MAP_LENGTH (2UL * 1024 * 1024)
-
-#define PAGE_SIZE 4096
+#include "vm_util.h"
#define PAGE_COMPOUND_HEAD (1UL << 15)
#define PAGE_COMPOUND_TAIL (1UL << 16)
@@ -39,6 +36,9 @@
#define MAP_FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)
#endif
+static size_t pagesize;
+static size_t maplength;
+
static void write_bytes(char *addr, size_t length)
{
unsigned long i;
@@ -56,7 +56,7 @@ static unsigned long virt_to_pfn(void *addr)
if (fd < 0)
return -1UL;
- lseek(fd, (unsigned long)addr / PAGE_SIZE * sizeof(pagemap), SEEK_SET);
+ lseek(fd, (unsigned long)addr / pagesize * sizeof(pagemap), SEEK_SET);
read(fd, &pagemap, sizeof(pagemap));
close(fd);
@@ -86,7 +86,7 @@ static int check_page_flags(unsigned long pfn)
* this also verifies kernel has correctly set the fake page_head to tail
* while hugetlb_free_vmemmap is enabled.
*/
- for (i = 1; i < MAP_LENGTH / PAGE_SIZE; i++) {
+ for (i = 1; i < maplength / pagesize; i++) {
read(fd, &pageflags, sizeof(pageflags));
if ((pageflags & TAIL_PAGE_FLAGS) != TAIL_PAGE_FLAGS ||
(pageflags & HEAD_PAGE_FLAGS) == HEAD_PAGE_FLAGS) {
@@ -106,18 +106,25 @@ int main(int argc, char **argv)
void *addr;
unsigned long pfn;
- addr = mmap(MAP_ADDR, MAP_LENGTH, PROT_READ | PROT_WRITE, MAP_FLAGS, -1, 0);
+ pagesize = psize();
+ maplength = default_huge_page_size();
+ if (!maplength) {
+ printf("Unable to determine huge page size\n");
+ exit(1);
+ }
+
+ addr = mmap(MAP_ADDR, maplength, PROT_READ | PROT_WRITE, MAP_FLAGS, -1, 0);
if (addr == MAP_FAILED) {
perror("mmap");
exit(1);
}
/* Trigger allocation of HugeTLB page. */
- write_bytes(addr, MAP_LENGTH);
+ write_bytes(addr, maplength);
pfn = virt_to_pfn(addr);
if (pfn == -1UL) {
- munmap(addr, MAP_LENGTH);
+ munmap(addr, maplength);
perror("virt_to_pfn");
exit(1);
}
@@ -125,13 +132,13 @@ int main(int argc, char **argv)
printf("Returned address is %p whose pfn is %lx\n", addr, pfn);
if (check_page_flags(pfn) < 0) {
- munmap(addr, MAP_LENGTH);
+ munmap(addr, maplength);
perror("check_page_flags");
exit(1);
}
/* munmap() length of MAP_HUGETLB memory must be hugepage aligned */
- if (munmap(addr, MAP_LENGTH)) {
+ if (munmap(addr, maplength)) {
perror("munmap");
exit(1);
}
--
2.43.0
This test case triggers a race between madvise(MADV_DONTNEED) and
mmap() on a single huge page, which gets stolen while it is reserved.
Once the only page is stolen, the memory that was previously mmap()ed
(and passed to madvise(MADV_DONTNEED)) gets a SIGBUS when accessed.
I am not adding this test to the run_vmtests.sh script, since this test
currently fails upstream.
Breno Leitao (1):
selftests/mm: add a new test for madv and hugetlb mmap
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 1 +
.../selftests/mm/hugetlb_madv_vs_map.c | 124 ++++++++++++++++++
3 files changed, 126 insertions(+)
create mode 100644 tools/testing/selftests/mm/hugetlb_madv_vs_map.c
--
2.34.1
Hi Linus,
Please pull the nolibc update for Linux 6.8-rc1.
This nolibc update for Linux 6.8-rc1 consists of:
* Support for PIC mode on MIPS.
* Support for getrlimit()/setrlimit().
* Replace some custom declarations with UAPI includes.
* A new script "run-tests.sh" to run the testsuite over different architectures
and configurations.
* A few non-functional code cleanups.
* Minor improvements to nolibc-test, primarily to support the test script.
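As an illustration of what the getrlimit()/setrlimit() support makes possible,
a test program can turn off core dumps for itself before provoking a crash.
This is only a sketch against the standard POSIX interface; the helper name
disable_coredumps() is hypothetical and the actual nolibc-test code may differ.

```c
#include <sys/resource.h>

/* Set both the soft and the hard core-file size limit to zero. */
static int disable_coredumps(void)
{
	const struct rlimit no_core = { .rlim_cur = 0, .rlim_max = 0 };

	return setrlimit(RLIMIT_CORE, &no_core);
}
```

Lowering the hard limit does not need privileges, but it cannot be raised
again afterwards by an unprivileged process.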
There are no urgent fixes available at this time.
diff is attached. Build and nolibc tests were run on next.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit b85ea95d086471afb4ad062012a4d73cd328fa86:
Linux 6.7-rc1 (2023-11-12 16:19:07 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-nolibc-6.8-rc1
for you to fetch changes up to d543d9ddf593b1f4cb1d57d9ac0ad279fe18adaf:
selftests/nolibc: disable coredump via setrlimit (2023-12-11 22:38:37 +0100)
----------------------------------------------------------------
linux_kselftest-nolibc-6.8-rc1
This nolibc update for Linux 6.8-rc1 consists of:
* Support for PIC mode on MIPS.
* Support for getrlimit()/setrlimit().
* Replace some custom declarations with UAPI includes.
* A new script "run-tests.sh" to run the testsuite over different architectures
and configurations.
* A few non-functional code cleanups.
* Minor improvements to nolibc-test, primarily to support the test script.
There are no urgent fixes available at this time.
----------------------------------------------------------------
Mark Brown (1):
tools/nolibc: Use linux/wait.h rather than duplicating it
Thomas Weißschuh (21):
selftests/nolibc: don't hang on config input
selftests/nolibc: use EFI -bios for LoongArch qemu
selftests/nolibc: anchor paths in $(srcdir) if possible
selftests/nolibc: support out-of-tree builds
selftests/nolibc: add script to run testsuite
tools/nolibc: error out on unsupported architecture
tools/nolibc: move MIPS ABI validation into arch-mips.h
selftests/nolibc: use XARCH for MIPS
selftests/nolibc: explicitly specify ABI for MIPS
selftests/nolibc: extraconfig support
selftests/nolibc: add configuration for mipso32be
selftests/nolibc: fix testcase status alignment
selftests/nolibc: introduce QEMU_ARCH_USER
selftests/nolibc: run-tests.sh: enable testing via qemu-user
tools/nolibc: mips: add support for PIC
selftests/nolibc: make result alignment more robust
tools/nolibc: annotate va_list printf formats
tools/nolibc: drop duplicated testcase ioctl_tiocinq
tools/nolibc: drop custom definition of struct rusage
tools/nolibc: add support for getrlimit/setrlimit
selftests/nolibc: disable coredump via setrlimit
tools/include/nolibc/arch-mips.h | 11 +-
tools/include/nolibc/arch.h | 4 +-
tools/include/nolibc/stdio.h | 4 +-
tools/include/nolibc/sys.h | 38 ++++++
tools/include/nolibc/types.h | 25 +---
tools/testing/selftests/nolibc/.gitignore | 1 +
tools/testing/selftests/nolibc/Makefile | 65 ++++++++---
tools/testing/selftests/nolibc/nolibc-test.c | 51 ++++++--
tools/testing/selftests/nolibc/run-tests.sh | 169 +++++++++++++++++++++++++++
9 files changed, 318 insertions(+), 50 deletions(-)
create mode 100755 tools/testing/selftests/nolibc/run-tests.sh
----------------------------------------------------------------
Hi Linus,
Please pull the following Kselftest update for Linux 6.8-rc1.
This kselftest update for Linux 6.8-rc1 consists of enhancements
to reporting test results, fixes to root and user run behavior
and fixing ksft_print_msg() calls.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit b85ea95d086471afb4ad062012a4d73cd328fa86:
Linux 6.7-rc1 (2023-11-12 16:19:07 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-next-6.8-rc1
for you to fetch changes up to ee9793be08b1a1c29308a099c01790a3befb390a:
tracing/selftests: Add ownership modification tests for eventfs (2023-12-22 10:01:41 -0700)
----------------------------------------------------------------
linux_kselftest-next-6.8-rc1
This kselftest update for Linux 6.8-rc1 consists of enhancements
to reporting test results, fixes to root and user run behavior
and fixing ksft_print_msg() calls.
----------------------------------------------------------------
Atul Kumar Pant (1):
selftests: sched: Remove initialization to 0 for a static variable
Mark Brown (3):
kselftest/vDSO: Make test name reporting for vdso_abi_test tooling friendly
kselftest/vDSO: Fix message formatting for clock_id logging
kselftest/vDSO: Use ksft_print_msg() rather than printf in vdso_test_abi
Osama Muhammad (1):
selftests: prctl: Add prctl test for PR_GET_NAME
Steven Rostedt (Google) (1):
tracing/selftests: Add ownership modification tests for eventfs
Swarup Laxman Kotiaklapudi (1):
selftests: capabilities: namespace create varies for root and normal user
angquan yu (3):
selftests:breakpoints: Fix Format String Warning in breakpoint_test
selftests/breakpoints: Fix format specifier in ksft_print_msg in step_after_suspend_test.c
selftests:x86: Fix Format String Warnings in lam.c
.../selftests/breakpoints/breakpoint_test.c | 4 +-
.../breakpoints/step_after_suspend_test.c | 2 +-
tools/testing/selftests/capabilities/test_execve.c | 6 +-
.../ftrace/test.d/00basic/test_ownership.tc | 114 +++++++++++++++++++++
tools/testing/selftests/prctl/set-process-name.c | 32 ++++++
tools/testing/selftests/sched/cs_prctl_test.c | 2 +-
tools/testing/selftests/vDSO/vdso_test_abi.c | 72 +++++++------
tools/testing/selftests/x86/lam.c | 4 +-
8 files changed, 192 insertions(+), 44 deletions(-)
create mode 100644 tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
----------------------------------------------------------------
Hi Linus,
Please pull the following KUnit next update for Linux 6.8-rc1.
This KUnit update for Linux 6.8-rc1 consists of:
- a new feature that adds APIs for managing devices introducing
a set of helper functions which allow devices (internally a
struct kunit_device) to be created and managed by KUnit.
These devices will be automatically unregistered on
test exit. These helpers can either use a user-provided
struct device_driver, or have one automatically created and
managed by KUnit. In both cases, the device lives on a new
kunit_bus.
- changes to switch drm/tests to use kunit devices
- several fixes and enhancements to attribute feature
- changes to reorganize deferred action function introducing
KUNIT_DEFINE_ACTION_WRAPPER
- new feature adds ability to run tests after boot using debugfs
- fixes and enhancements to string-stream-test:
- parse ERR_PTR in string_stream_destroy()
- unchecked dereference bug fix in debugfs_print_results()
- handling errors from alloc_string_stream()
- NULL-dereference bug fix in kunit_init_suite()
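For context on the KUNIT_DEFINE_ACTION_WRAPPER item in the list above: a
deferred action takes a void * argument, while most cleanup functions take a
typed pointer, and calling one through a cast function pointer is a type
mismatch. The following user-space mock shows the general shape of such a
wrapper macro. It is a sketch under assumptions: DEFINE_ACTION_WRAPPER,
struct buffer, buffer_release() and run_deferred() are hypothetical stand-ins,
not the KUnit definitions.

```c
/* Hypothetical stand-in: generate a void (*)(void *) wrapper around orig. */
#define DEFINE_ACTION_WRAPPER(wrapper, orig, arg_type)	\
	static void wrapper(void *in)			\
	{						\
		arg_type arg = (arg_type)in;		\
		orig(arg);				\
	}

struct buffer {
	int released;
};

/* A typed cleanup function, similar in spirit to a driver teardown helper. */
static void buffer_release(struct buffer *buf)
{
	buf->released = 1;
}

DEFINE_ACTION_WRAPPER(buffer_release_action, buffer_release, struct buffer *)

/* A deferred-action runner only knows about void (*)(void *). */
typedef void (*action_fn)(void *);

static void run_deferred(action_fn fn, void *ctx)
{
	fn(ctx);
}
```

The wrapper has exactly the type the runner expects, so no function pointer
cast is needed at the call site.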
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit ceb6a6f023fd3e8b07761ed900352ef574010bcb:
Linux 6.7-rc6 (2023-12-17 15:19:28 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-kunit-6.8-rc1
for you to fetch changes up to 539e582a375dedee95a4fa9ca3f37cdb25c441ec:
kunit: Fix some comments which were mistakenly kerneldoc (2024-01-03 09:10:37 -0700)
----------------------------------------------------------------
linux_kselftest-kunit-6.8-rc1
This KUnit update for Linux 6.8-rc1 consists of:
- a new feature that adds APIs for managing devices introducing
a set of helper functions which allow devices (internally a
struct kunit_device) to be created and managed by KUnit.
These devices will be automatically unregistered on
test exit. These helpers can either use a user-provided
struct device_driver, or have one automatically created and
managed by KUnit. In both cases, the device lives on a new
kunit_bus.
- changes to switch drm/tests to use kunit devices
- several fixes and enhancements to attribute feature
- changes to reorganize deferred action function introducing
KUNIT_DEFINE_ACTION_WRAPPER
- new feature adds ability to run tests after boot using debugfs
- fixes and enhancements to string-stream-test:
- parse ERR_PTR in string_stream_destroy()
- unchecked dereference bug fix in debugfs_print_results()
- handling errors from alloc_string_stream()
- NULL-dereference bug fix in kunit_init_suite()
----------------------------------------------------------------
David Gow (4):
kunit: Add a macro to wrap a deferred action function
drm/tests: Use KUNIT_DEFINE_ACTION_WRAPPER()
drm/vc4: tests: Use KUNIT_DEFINE_ACTION_WRAPPER
kunit: Fix some comments which were mistakenly kerneldoc
Maxime Ripard (1):
drm/tests: Switch to kunit devices
Michal Wajdeczko (2):
kunit: Add example for using test->priv
kunit: Reset test->priv after each param iteration
Rae Moar (8):
kunit: tool: fix parsing of test attributes
kunit: tool: add test for parsing attributes
kunit: move KUNIT_TABLE out of INIT_DATA
kunit: add KUNIT_INIT_TABLE to init linker section
kunit: add example suite to test init suites
kunit: add is_init test attribute
kunit: add ability to run tests after boot using debugfs
Documentation: Add debugfs docs with run after boot
Richard Fitzgerald (8):
kunit: string-stream-test: Avoid cast warning when testing gfp_t flags
kunit: string-stream: Allow ERR_PTR to be passed to string_stream_destroy()
kunit: debugfs: Fix unchecked dereference in debugfs_print_results()
kunit: debugfs: Handle errors from alloc_string_stream()
kunit: Fix NULL-dereference in kunit_init_suite() if suite->log is NULL
kunit: Allow passing function pointer to kunit_activate_static_stub()
kunit: Add example of kunit_activate_static_stub() with pointer-to-function
kunit: Protect string comparisons against NULL
davidgow(a)google.com (4):
kunit: Add APIs for managing devices
fortify: test: Use kunit_device
overflow: Replace fake root_device with kunit_device
ASoC: topology: Replace fake root_device with kunit_device in tests
Documentation/dev-tools/kunit/api/resource.rst | 9 +
Documentation/dev-tools/kunit/run_manual.rst | 51 +++++-
Documentation/dev-tools/kunit/running_tips.rst | 7 +
Documentation/dev-tools/kunit/usage.rst | 60 ++++++-
drivers/gpu/drm/tests/drm_kunit_helpers.c | 78 +--------
drivers/gpu/drm/vc4/tests/vc4_mock.c | 9 +-
include/asm-generic/vmlinux.lds.h | 11 +-
include/kunit/device.h | 80 +++++++++
include/kunit/resource.h | 21 +++
include/kunit/static_stub.h | 2 +-
include/kunit/test.h | 33 ++--
include/linux/module.h | 2 +
kernel/module/main.c | 3 +
lib/fortify_kunit.c | 5 +-
lib/kunit/Makefile | 3 +-
lib/kunit/attributes.c | 60 +++++++
lib/kunit/debugfs.c | 102 +++++++++++-
lib/kunit/device-impl.h | 17 ++
lib/kunit/device.c | 181 +++++++++++++++++++++
lib/kunit/executor.c | 68 +++++++-
lib/kunit/kunit-example-test.c | 87 ++++++++++
lib/kunit/kunit-test.c | 139 +++++++++++++++-
lib/kunit/string-stream-test.c | 2 +-
lib/kunit/string-stream.c | 2 +-
lib/kunit/test.c | 48 +++++-
lib/overflow_kunit.c | 5 +-
sound/soc/soc-topology-test.c | 10 +-
tools/testing/kunit/kunit_parser.py | 4 +-
tools/testing/kunit/kunit_tool_test.py | 16 ++
.../kunit/test_data/test_parse_attributes.log | 9 +
30 files changed, 978 insertions(+), 146 deletions(-)
create mode 100644 include/kunit/device.h
create mode 100644 lib/kunit/device-impl.h
create mode 100644 lib/kunit/device.c
create mode 100644 tools/testing/kunit/test_data/test_parse_attributes.log
----------------------------------------------------------------
Changes in v4:
* Documented how to compile the livepatch selftests without running the
tests (Joe)
* Removed the mention of lib/livepatch from the MAINTAINERS file, as reported
by checkpatch.
Changes in v3:
* Rebased on top of v6.6-rc5
* The commits messages were improved (Thanks Petr!)
* Created TEST_GEN_MODS_DIR variable to point to a directory that contains kernel
modules, and adapted selftests to build it before running the test.
* Moved test_klp-call_getpid out of test_programs, since the gen_tar
would just copy the generated test programs to the livepatches dir,
and so scripts relying on test_programs/test_klp-call_getpid will fail.
* Added a module_param for klp_pids, describing its usage.
* Simplified the call_getpid program to ignore the return of getpid syscall,
since we only want to make sure the process transitions correctly to the
patched state.
* test-syscall.sh now prints a log message showing the number of remaining
processes to transition into the livepatched state, and check_output expects it
to be 0.
* Added MODULE_AUTHOR and MODULE_DESCRIPTION to test_klp_syscall.c
- Link to v3: https://lore.kernel.org/r/20231031-send-lp-kselftests-v3-0-2b1655c2605f@sus…
- Link to v2: https://lore.kernel.org/linux-kselftest/20220630141226.2802-1-mpdesouza@sus…
This patchset moves the current kernel testing livepatch modules from
lib/livepatch to tools/testing/selftests/livepatch/test_modules, and compiles
them as out-of-tree modules before testing.
There is also a new test being added. This new test exercises multiple processes
calling a syscall while a livepatch patches the syscall.
Why this move is an improvement:
* The modules are now compiled as out-of-tree modules against the current
running kernel, making them capable of being tested on different systems with
newer or older kernels.
* This approach now needs the kernel-devel package to be installed, since the
modules are built out-of-tree. The package can be generated by running
"make rpm-pkg" in the kernel source.
What needs to be solved:
* Currently gen_tar only packages the resulting binaries of the tests, and not
the sources. For the current approach, the newly added modules would be
compiled and then packaged. This works when testing on a system with the same
kernel version, but it will fail when running on a machine with a different
kernel version, since the modules were compiled against the kernel running on
the build machine.
This is not a new problem, just aligning the expectations. For the current
approach to be truly system agnostic gen_tar would need to include the module
and program sources to be compiled in the target systems.
Thanks in advance!
Marcos
Signed-off-by: Marcos Paulo de Souza <mpdesouza(a)suse.com>
---
Marcos Paulo de Souza (3):
kselftests: lib.mk: Add TEST_GEN_MODS_DIR variable
livepatch: Move tests from lib/livepatch to selftests/livepatch
selftests: livepatch: Test livepatching a heavily called syscall
Documentation/dev-tools/kselftest.rst | 4 +
MAINTAINERS | 1 -
arch/s390/configs/debug_defconfig | 1 -
arch/s390/configs/defconfig | 1 -
lib/Kconfig.debug | 22 ----
lib/Makefile | 2 -
lib/livepatch/Makefile | 14 ---
tools/testing/selftests/lib.mk | 20 +++-
tools/testing/selftests/livepatch/Makefile | 5 +-
tools/testing/selftests/livepatch/README | 25 +++--
tools/testing/selftests/livepatch/config | 1 -
tools/testing/selftests/livepatch/functions.sh | 34 +++---
.../testing/selftests/livepatch/test-callbacks.sh | 50 ++++-----
tools/testing/selftests/livepatch/test-ftrace.sh | 6 +-
.../testing/selftests/livepatch/test-livepatch.sh | 10 +-
.../selftests/livepatch/test-shadow-vars.sh | 2 +-
tools/testing/selftests/livepatch/test-state.sh | 18 ++--
tools/testing/selftests/livepatch/test-syscall.sh | 53 ++++++++++
tools/testing/selftests/livepatch/test-sysfs.sh | 6 +-
.../selftests/livepatch/test_klp-call_getpid.c | 44 ++++++++
.../selftests/livepatch/test_modules/Makefile | 20 ++++
.../test_modules}/test_klp_atomic_replace.c | 0
.../test_modules}/test_klp_callbacks_busy.c | 0
.../test_modules}/test_klp_callbacks_demo.c | 0
.../test_modules}/test_klp_callbacks_demo2.c | 0
.../test_modules}/test_klp_callbacks_mod.c | 0
.../livepatch/test_modules}/test_klp_livepatch.c | 0
.../livepatch/test_modules}/test_klp_shadow_vars.c | 0
.../livepatch/test_modules}/test_klp_state.c | 0
.../livepatch/test_modules}/test_klp_state2.c | 0
.../livepatch/test_modules}/test_klp_state3.c | 0
.../livepatch/test_modules/test_klp_syscall.c | 116 +++++++++++++++++++++
32 files changed, 334 insertions(+), 121 deletions(-)
---
base-commit: 206ed72d6b33f53b2a8bf043f54ed6734121d26b
change-id: 20231031-send-lp-kselftests-4c917dcd4565
Best regards,
--
Marcos Paulo de Souza <mpdesouza(a)suse.com>
Minor consistency fixes.
They clear a couple of compiler format string warnings.
[1/2] is the fix of an obvious typo in the format specifier
[2/2] secures the print function against spurious format specifiers
in the passed parameter string
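The class of problem addressed by [2/2] is the usual one where a string that
is not a compile-time constant is passed as the format argument. A minimal
sketch of the safe pattern, using plain snprintf() rather than the kselftest
print helpers (format_message() is a hypothetical name used only here):

```c
#include <stdio.h>

/*
 * Copy msg into buf. Because msg is passed as an argument to "%s" and not
 * as the format itself, any '%' sequences inside msg are printed literally.
 */
static int format_message(char *buf, size_t len, const char *msg)
{
	return snprintf(buf, len, "%s", msg);
}
```

Passing msg directly as the format would make the compiler warn with
-Wformat-security and could read nonexistent arguments at run time.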
Mirsad Todorovac (2):
selftest: breakpoints: fix a minor typo in the format string
selftest: breakpoints: clear the format string warning and secure the
output
tools/testing/selftests/breakpoints/breakpoint_test.c | 4 ++--
tools/testing/selftests/breakpoints/step_after_suspend_test.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
--
2.40.1
Minor fixes of compiler warnings and one bug in the number of parameters, which
would not crash the test but is better fixed for correctness' sake.
As the general climate in the Linux kernel community is to fix all compiler
warnings, this could be on the right track, even if only in the testing suite.
Changelog:
v1 -> v2:
- Compared to v1, commit subject lines have been adjusted to reflect the style
of the subsystem, as suggested by Mark.
- 1/4 was already acked and unchanged (adjusted the subject line as suggested)
(code unchanged)
- 2/4 was acked with suggestion to adjust the subject line (done).
(code unchanged)
- 3/4 The format specifier was changed from %d to %u as suggested.
- The 4/4 submitted for review (in the v1 it was delayed by an omission).
(code unchanged)
Mirsad Todorovac (4):
kselftest/alsa - mixer-test: fix the number of parameters to
ksft_exit_fail_msg()
kselftest/alsa - mixer-test: Fix the print format specifier warning
kselftest/alsa - mixer-test: Fix the print format specifier warning
kselftest/alsa - conf: Stringify the printed errno in sysfs_get()
tools/testing/selftests/alsa/conf.c | 2 +-
tools/testing/selftests/alsa/mixer-test.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
--
2.40.1
Some aarch64 systems running a PREEMPT_RT patched kernel need
more time to complete the test.
This change mirrors:
commit ba83af059153 ("Improve stability of find_vma BPF test")
addressing similar requirements and allowing the QTI SA8775P based
systems, and others, to complete the test when running an RT kernel.
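The waiting scheme is a bounded busy loop that stops early once a condition
flips. A standalone sketch of that pattern follows. Note that 2500000000 is
larger than INT_MAX (2147483647) wherever int is 32 bits wide, so this sketch
uses a 64-bit counter; that choice is an assumption on the editor's side and
not something the patch does. bounded_spin(), countdown() and always_true()
are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Spin while cond(ctx) holds, for at most `limit` iterations.
 * Returns the number of iterations that were executed.
 */
static uint64_t bounded_spin(uint64_t limit, bool (*cond)(void *), void *ctx)
{
	volatile uint64_t sink = 0;	/* keeps the loop from being optimized out */
	uint64_t i;

	for (i = 0; i < limit && cond(ctx); ++i)
		++sink;
	return i;
}

/* Example condition: true for the first *ctx calls, false afterwards. */
static bool countdown(void *ctx)
{
	int *left = ctx;

	return (*left)-- > 0;
}

/* Example condition that never flips, so the limit is what ends the loop. */
static bool always_true(void *ctx)
{
	(void)ctx;
	return true;
}
```

If the return value equals the limit, the condition never changed within the
budget, which is the timeout case the test checks for.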
Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati(a)gmail.com>
---
tools/testing/selftests/bpf/prog_tests/find_vma.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/find_vma.c b/tools/testing/selftests/bpf/prog_tests/find_vma.c
index 5165b38f0e59..43d62db8d57b 100644
--- a/tools/testing/selftests/bpf/prog_tests/find_vma.c
+++ b/tools/testing/selftests/bpf/prog_tests/find_vma.c
@@ -51,7 +51,7 @@ static void test_find_vma_pe(struct find_vma *skel)
struct bpf_link *link = NULL;
volatile int j = 0;
int pfd, i;
- const int one_bn = 1000000000;
+ const int dummy_wait = 2500000000;
pfd = open_pe();
if (pfd < 0) {
@@ -68,10 +68,10 @@ static void test_find_vma_pe(struct find_vma *skel)
if (!ASSERT_OK_PTR(link, "attach_perf_event"))
goto cleanup;
- for (i = 0; i < one_bn && find_vma_pe_condition(skel); ++i)
+ for (i = 0; i < dummy_wait && find_vma_pe_condition(skel); ++i)
++j;
- test_and_reset_skel(skel, -EBUSY /* in nmi, irq_work is busy */, i == one_bn);
+ test_and_reset_skel(skel, -EBUSY /* in nmi, irq_work is busy */, i == dummy_wait);
cleanup:
bpf_link__destroy(link);
close(pfd);
--
2.34.1
Now that we have the VISIBLE_IF_KUNIT and EXPORT_SYMBOL_IF_KUNIT macros,
update the instructions to stop recommending including .c files.
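To show what the two macros buy, here is a user-space mock of the idea: the
function is file-local in normal builds and gets external linkage only when
testing is enabled. This is a sketch, not the kernel header.
MOCK_CONFIG_KUNIT is a hypothetical switch, and the export macro is reduced to
a harmless typedef because there is no symbol export in user space.

```c
/* Hypothetical switch: set to 0 to make the function file-local again. */
#define MOCK_CONFIG_KUNIT 1

#if MOCK_CONFIG_KUNIT
#define VISIBLE_IF_KUNIT		/* external linkage for tests */
#else
#define VISIBLE_IF_KUNIT static		/* file-local in normal builds */
#endif

/* Mock only: the real macro exports the symbol, this one does nothing useful. */
#define EXPORT_SYMBOL_IF_KUNIT(sym) typedef int mock_export_##sym##_t

/* Would live in "my_file.h": the prototype is only offered to test builds. */
#if MOCK_CONFIG_KUNIT
int do_interesting_thing(void);
#endif

/* Would live in "my_file.c". */
VISIBLE_IF_KUNIT int do_interesting_thing(void)
{
	return 42;
}
EXPORT_SYMBOL_IF_KUNIT(do_interesting_thing);
```

A test in a separate file can then include the header and call
do_interesting_thing() directly, without pulling in the .c file.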
Signed-off-by: Arthur Grillo <arthurgrillo(a)riseup.net>
---
Changes in v2:
- Fix #if condition
- Link to v1: https://lore.kernel.org/r/20240108-kunit-doc-export-v1-1-119368df0d96@riseu…
---
Documentation/dev-tools/kunit/usage.rst | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst
index c27e1646ecd9..f095c6bb76ff 100644
--- a/Documentation/dev-tools/kunit/usage.rst
+++ b/Documentation/dev-tools/kunit/usage.rst
@@ -671,19 +671,22 @@ Testing Static Functions
------------------------
If we do not want to expose functions or variables for testing, one option is to
-conditionally ``#include`` the test file at the end of your .c file. For
-example:
+conditionally export the used symbol.
.. code-block:: c
/* In my_file.c */
- static int do_interesting_thing();
+ VISIBLE_IF_KUNIT int do_interesting_thing();
+ EXPORT_SYMBOL_IF_KUNIT(do_interesting_thing);
- #ifdef CONFIG_MY_KUNIT_TEST
- #include "my_kunit_test.c"
+ /* In my_file.h */
+
+ #if IS_ENABLED(CONFIG_KUNIT)
+ int do_interesting_thing(void);
#endif
+
Injecting Test-Only Code
------------------------
---
base-commit: eeb8e8d9f124f279e80ae679f4ba6e822ce4f95f
change-id: 20240108-kunit-doc-export-eec1f910ab67
Best regards,
--
Arthur Grillo <arthurgrillo(a)riseup.net>
Commit 2810c1e99867 ("kunit: Fix wild-memory-access bug in
kunit_free_suite_set()") fixed a wild-memory-access bug that could have
happened during the loading phase of test suites built and executed as
loadable modules. However, it also introduced a problematic side effect
that causes test suite modules to crash when they attempt to register
fake devices.
When a module is loaded, it traverses the MODULE_STATE_UNFORMED and
MODULE_STATE_COMING states before reaching the normal operating state
MODULE_STATE_LIVE. Finally, when the module is removed, it moves to
MODULE_STATE_GOING before being released. However, if the loading
function load_module() fails between complete_formation() and
do_init_module(), the module goes directly from MODULE_STATE_COMING to
MODULE_STATE_GOING without passing through MODULE_STATE_LIVE.
This behavior was causing kunit_module_exit() to be called without
having first executed kunit_module_init(). Since kunit_module_exit() is
responsible for freeing the memory allocated by kunit_module_init()
through kunit_filter_suites(), this behavior was resulting in a
wild-memory-access bug.
Commit 2810c1e99867 ("kunit: Fix wild-memory-access bug in
kunit_free_suite_set()") fixed this issue by running the tests when the
module is still in MODULE_STATE_COMING. However, modules in that state
are not fully initialized, lacking sysfs kobjects. Therefore, if a test
module attempts to register a fake device, it will inevitably crash.
This patch proposes a different approach to fix the original
wild-memory-access bug while restoring the normal module execution flow
by making kunit_module_exit() able to detect if kunit_module_init() has
previously initialized the test suite set. In this way, test modules
can once again register fake devices without crashing.
This behavior is achieved by checking whether mod->kunit_suites is a
virtual or direct mapping address. If it is a virtual address, then
kunit_module_init() has allocated the suite_set in kunit_filter_suites()
using kmalloc_array(). On the contrary, if mod->kunit_suites is still
pointing to the original address that was set when looking up the
.kunit_test_suites section of the module, then the loading phase has
failed and there's no memory to be freed.
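The control flow described above can be modelled in user space as follows.
This is a simplified sketch and not the kernel code: it swaps the address
check (virt_addr_valid()) for an explicit suites_allocated flag, and all
mock_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

enum mock_state { MOCK_COMING, MOCK_LIVE, MOCK_GOING };

struct mock_module {
	int *suites;		/* stands in for mod->kunit_suites */
	bool suites_allocated;	/* explicit flag; the patch infers this from the address */
	int frees;		/* counts how often the suite set was released */
};

/* Runs only once the module is LIVE: replaces the section pointer by a copy. */
static void mock_module_init(struct mock_module *mod)
{
	mod->suites = calloc(4, sizeof(*mod->suites));
	mod->suites_allocated = mod->suites != NULL;
}

static void mock_module_exit(struct mock_module *mod)
{
	/* Load failed before init ran: nothing was allocated, so free nothing. */
	if (!mod->suites_allocated)
		return;
	free(mod->suites);
	mod->suites = NULL;
	mod->suites_allocated = false;
	mod->frees++;
}

static void mock_notify(struct mock_module *mod, enum mock_state state)
{
	switch (state) {
	case MOCK_LIVE:
		mock_module_init(mod);
		break;
	case MOCK_GOING:
		mock_module_exit(mod);
		break;
	case MOCK_COMING:
		break;
	}
}
```

A module that goes COMING then GOING never has its section data freed, while
one that reaches LIVE has its allocated suite set released exactly once.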
v3:
- add a comment to clarify why the start address is checked
v2:
- add include <linux/mm.h>
Fixes: 2810c1e99867 ("kunit: Fix wild-memory-access bug in kunit_free_suite_set()")
Tested-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Reviewed-by: Javier Martinez Canillas <javierm(a)redhat.com>
Signed-off-by: Marco Pagani <marpagan(a)redhat.com>
---
lib/kunit/test.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 7aceb07a1af9..3263e0d5e0f6 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -16,6 +16,7 @@
#include <linux/panic.h>
#include <linux/sched/debug.h>
#include <linux/sched.h>
+#include <linux/mm.h>
#include "debugfs.h"
#include "hooks-impl.h"
@@ -775,12 +776,19 @@ static void kunit_module_exit(struct module *mod)
};
const char *action = kunit_action();
+ /*
+ * Check if the start address is a valid virtual address to detect
+ * if the module load sequence has failed and the suite set has not
+ * been initialized and filtered.
+ */
+ if (!suite_set.start || !virt_addr_valid(suite_set.start))
+ return;
+
if (!action)
__kunit_test_suites_exit(mod->kunit_suites,
mod->num_kunit_suites);
- if (suite_set.start)
- kunit_free_suite_set(suite_set);
+ kunit_free_suite_set(suite_set);
}
static int kunit_module_notify(struct notifier_block *nb, unsigned long val,
@@ -790,12 +798,12 @@ static int kunit_module_notify(struct notifier_block *nb, unsigned long val,
switch (val) {
case MODULE_STATE_LIVE:
+ kunit_module_init(mod);
break;
case MODULE_STATE_GOING:
kunit_module_exit(mod);
break;
case MODULE_STATE_COMING:
- kunit_module_init(mod);
break;
case MODULE_STATE_UNFORMED:
break;
base-commit: 33cc938e65a98f1d29d0a18403dbbee050dcad9a
--
2.43.0
This is the second part of the work to add Intel VT-d nested translation based
on the IOMMUFD nesting infrastructure. With the iommufd nesting infrastructure
series [1], the iommu core supports new ops to invalidate the cache after
modifications to the stage-1 page table. So far, the cache invalidation data is
vendor specific, so the data_type (IOMMU_HWPT_DATA_VTD_S1) defined for the
vendor specific HWPT allocation is reused in the cache invalidation path. Users
should provide the data_type that matches the type used in HWPT allocation.
The IOMMU_HWPT_INVALIDATE ioctl returns an error in @out_driver_error_code.
However, Intel VT-d does not define error codes so far, so it is not easy to
pre-define them in iommufd either. As a result, this field should just be
ignored on VT-d platforms.
Complete code can be found in [2]; the corresponding QEMU code can be found in [3].
[1] https://lore.kernel.org/linux-iommu/20231117130717.19875-1-yi.l.liu@intel.c…
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v7:
- Not much change, just rebased on top of 6.7-rc1
v6: https://lore.kernel.org/linux-iommu/20231020093719.18725-1-yi.l.liu@intel.c…
- Address comments from Kevin
- Split the VT-d nesting series into two parts (Jason)
v5: https://lore.kernel.org/linux-iommu/20230921075431.125239-1-yi.l.liu@intel.…
- Add Kevin's r-b for patches 2, 3, 5, 8, 10
- Drop enforce_cache_coherency callback from the nested type domain ops (Kevin)
- Remove duplicate agaw check in patch 04 (Kevin)
- Remove duplicate domain_update_iommu_cap() in patch 06 (Kevin)
- Check parent's force_snooping to set pgsnp in the pasid entry (Kevin)
- uapi data structure check (Kevin)
- Simplify the errata handling as user can allocate nested parent domain
v4: https://lore.kernel.org/linux-iommu/20230724111335.107427-1-yi.l.liu@intel.…
- Remove ascii art tables (Jason)
- Drop EMT (Tina, Jason)
- Drop MTS and related definitions (Kevin)
- Rename macro IOMMU_VTD_PGTBL_ to IOMMU_VTD_S1_ (Kevin)
- Rename struct iommu_hwpt_intel_vtd_ to iommu_hwpt_vtd_ (Kevin)
- Rename struct iommu_hwpt_intel_vtd to iommu_hwpt_vtd_s1 (Kevin)
- Put the vendor specific hwpt alloc data structure before enum iommu_hwpt_type (Kevin)
- Do not trim the higher page levels of S2 domain in nested domain attachment as the
S2 domain may have been used independently. (Kevin)
- Remove the first-stage pgd check against the maximum address of s2_domain as hw
can check it anyhow. It would make sense to check every pfn used in the stage-1 page
table, but that is not feasible, so just leave it to hw. (Kevin)
- Split the iotlb flush part into an order of uapi, helper and callback implementation (Kevin)
- Change the policy of VT-d nesting errata, disallow RO mapping once a domain is used
as parent domain of a nested domain. This removes the nested_users counting. (Kevin)
- Minor fix for "make htmldocs"
v3: https://lore.kernel.org/linux-iommu/20230511145110.27707-1-yi.l.liu@intel.c…
- Further split the patches into an order of adding helpers for nested
domain, iotlb flush, nested domain attachment and nested domain allocation
callback, then report the hw_info to userspace.
- Add batch support in cache invalidation from userspace
- Disallow nested translation usage if RO mappings exist in stage-2 domain
due to errata on readonly mappings on Sapphire Rapids platform.
v2: https://lore.kernel.org/linux-iommu/20230309082207.612346-1-yi.l.liu@intel.…
- The iommufd infrastructure is split to be separate series.
v1: https://lore.kernel.org/linux-iommu/20230209043153.14964-1-yi.l.liu@intel.c…
Regards,
Yi Liu
Yi Liu (3):
iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
iommu/vt-d: Make iotlb flush helpers to be extern
iommu/vt-d: Add iotlb flush for nested domain
drivers/iommu/intel/iommu.c | 10 +++----
drivers/iommu/intel/iommu.h | 6 ++++
drivers/iommu/intel/nested.c | 54 ++++++++++++++++++++++++++++++++++++
include/uapi/linux/iommufd.h | 36 ++++++++++++++++++++++++
4 files changed, 101 insertions(+), 5 deletions(-)
--
2.34.1
Now that we have the VISIBLE_IF_KUNIT and EXPORT_SYMBOL_IF_KUNIT macros,
update the instructions to stop recommending including .c files.
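
As a rough userspace illustration of the idea behind conditional-visibility
macros (a sketch only, not the kernel's definitions): MY_TESTING and
VISIBLE_IF_TESTING below are made-up names, and the function body is a
placeholder. The macro expands to nothing when testing is enabled, so the
function gets external linkage and a separate test file can call it; otherwise
it expands to "static".

```c
/* Userspace sketch. MY_TESTING and VISIBLE_IF_TESTING are hypothetical
 * stand-ins chosen for this example; they are not kernel macros. */
#define MY_TESTING 1

#if MY_TESTING
#define VISIBLE_IF_TESTING		/* external linkage: tests can call it */
#else
#define VISIBLE_IF_TESTING static	/* internal linkage in normal builds */
#endif

/* Prototype that a test-only header would carry. */
VISIBLE_IF_TESTING int do_interesting_thing(int x);

/* Placeholder implementation, only here to make the sketch complete. */
VISIBLE_IF_TESTING int do_interesting_thing(int x)
{
	return 2 * x;
}
```

With MY_TESTING set to 0 the same source compiles with the function hidden
inside its translation unit, so no test file has to be #included into the
.c file.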
Signed-off-by: Arthur Grillo <arthurgrillo(a)riseup.net>
---
Documentation/dev-tools/kunit/usage.rst | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst
index c27e1646ecd9..7410b39ec5b7 100644
--- a/Documentation/dev-tools/kunit/usage.rst
+++ b/Documentation/dev-tools/kunit/usage.rst
@@ -671,19 +671,22 @@ Testing Static Functions
------------------------
If we do not want to expose functions or variables for testing, one option is to
-conditionally ``#include`` the test file at the end of your .c file. For
-example:
+conditionally export the used symbol.
.. code-block:: c
/* In my_file.c */
- static int do_interesting_thing();
+ VISIBLE_IF_KUNIT int do_interesting_thing();
+ EXPORT_SYMBOL_IF_KUNIT(do_interesting_thing);
+
+ /* In my_file.h */
#ifdef CONFIG_MY_KUNIT_TEST
- #include "my_kunit_test.c"
+ int do_interesting_thing(void);
#endif
+
Injecting Test-Only Code
------------------------
---
base-commit: eeb8e8d9f124f279e80ae679f4ba6e822ce4f95f
change-id: 20240108-kunit-doc-export-eec1f910ab67
Best regards,
--
Arthur Grillo <arthurgrillo(a)riseup.net>
Nested translation is a hardware feature that is supported by many modern
IOMMUs. It uses two stages of address translation (stage-1, stage-2)
to reach the physical address. The stage-1 translation table is owned
by userspace (e.g. by a guest OS), while stage-2 is owned by the kernel. Changes
to the stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example: the stage-1 translation table is the I/O page
table. As the diagram below shows, the guest I/O page table pointer in GPA
(guest physical address) is passed to the host and is used to perform the stage-1
address translation. Along with it, modifications to present mappings in the
guest I/O page table should be followed by an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
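
To make the need for invalidation concrete, here is a toy userspace model
(purely illustrative; it does not reflect the VT-d hardware or any kernel
API). All names (toy_iommu, toy_translate, toy_invalidate, TOY_PAGES) are
invented for this sketch, and small arrays of page numbers stand in for real
page tables. A nested walk feeds the stage-1 result into stage-2 and caches
the final answer; after a stage-1 change, the cached entry stays stale until
it is invalidated.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGES 4	/* hypothetical size; callers must pass pages < TOY_PAGES */

/* Toy model only: page-number arrays stand in for real page tables. */
struct toy_iommu {
	uint32_t stage1[TOY_PAGES];	/* GIOVA page -> GPA page (guest-owned) */
	uint32_t stage2[TOY_PAGES];	/* GPA page -> HPA page (kernel-owned) */
	uint32_t iotlb[TOY_PAGES];	/* cached GIOVA page -> HPA page */
	bool iotlb_valid[TOY_PAGES];
};

/* Nested walk: the stage-1 result is the input of stage-2; cache the result. */
uint32_t toy_translate(struct toy_iommu *mmu, uint32_t giova_page)
{
	if (!mmu->iotlb_valid[giova_page]) {
		uint32_t gpa_page = mmu->stage1[giova_page];

		mmu->iotlb[giova_page] = mmu->stage2[gpa_page];
		mmu->iotlb_valid[giova_page] = true;
	}
	return mmu->iotlb[giova_page];
}

/* Drop the cached translation so the next access walks the tables again. */
void toy_invalidate(struct toy_iommu *mmu, uint32_t giova_page)
{
	mmu->iotlb_valid[giova_page] = false;
}
```

In this model, changing stage1[] after a translation has been cached has no
visible effect until toy_invalidate() is called for that page, which mirrors
why stage-1 modifications have to be followed by an invalidation request.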
This series adds the cache invalidation path for userspace to invalidate the
cache after modifying the stage-1 page table. This is based on the first part
of nesting [1].
Complete code can be found in [2]; the QEMU code can be found in [3].
Finally, this is joint work with Nicolin Chen and Lu Baolu. Thanks to
them for the help. ^_^ Looking forward to your feedback.
[1] https://lore.kernel.org/linux-iommu/20231026044216.64964-1-yi.l.liu@intel.c… - merged
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v6:
- Not much change, just rebase on top of 6.7-rc1 as part 1/2 is merged
v5: https://lore.kernel.org/linux-iommu/20231020092426.13907-1-yi.l.liu@intel.c…
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation,
this does not change the rule that resv regions should only be added to the
kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series
as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (1):
iommu: Add cache_invalidate_user op
Nicolin Chen (4):
iommu: Add iommu_copy_struct_from_user_array helper
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (1):
iommufd: Add IOMMU_HWPT_INVALIDATE
drivers/iommu/iommufd/hw_pagetable.c | 35 ++++++++
drivers/iommu/iommufd/iommufd_private.h | 9 ++
drivers/iommu/iommufd/iommufd_test.h | 22 +++++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 69 +++++++++++++++
include/linux/iommu.h | 84 +++++++++++++++++++
include/uapi/linux/iommufd.h | 35 ++++++++
tools/testing/selftests/iommu/iommufd.c | 75 +++++++++++++++++
tools/testing/selftests/iommu/iommufd_utils.h | 63 ++++++++++++++
9 files changed, 395 insertions(+)
--
2.34.1
Minor fixes for compiler warnings and one bug in the number of parameters, which
would not crash the test but is better fixed for correctness' sake.
As the general climate in the Linux kernel community is to fix all compiler
warnings, this could be on the right track, even if only in the testing suite.
Mirsad Todorovac (4):
kselftest: alsa: fix the number of parameters to ksft_exit_fail_msg()
kselftest: alsa: Fix the printf format specifier in call to
ksft_print_msg()
ksellftest: alsa: Fix the printf format specifier to unsigned int
selftests: alsa: Fix the exit error message parameter in sysfs_get()
tools/testing/selftests/alsa/conf.c | 2 +-
tools/testing/selftests/alsa/mixer-test.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
--
2.40.1
In particular, fcnal-test.sh timed out on slower hardware after
some new permutations of tests were added.
This single test ran for almost an hour instead of the expected
25 min (1500s). 75 minutes should suffice for most systems.
Cc: David Ahern <dsahern(a)kernel.org>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: netdev(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
---
tools/testing/selftests/net/settings | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/settings b/tools/testing/selftests/net/settings
index dfc27cdc6c05..ed8418e8217a 100644
--- a/tools/testing/selftests/net/settings
+++ b/tools/testing/selftests/net/settings
@@ -1 +1 @@
-timeout=1500
+timeout=4500
--
2.40.1
Hi, all,
The default timeout for the tools/testing/selftests/net group of tests is 1500s (25m).
This is less than half of what is required to run the full fcnal-test.sh on my hardware
(53m48s).
With the timeout adjusted, tests passed 914 of 914 OK.
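
The arithmetic behind these numbers can be checked with a small sketch. The
helper names below are invented for this example, and the fractional part of
the measured run time is dropped.

```c
/* Hypothetical helpers for checking the timeout arithmetic. */

/* Convert a wall-clock duration to seconds, the unit used by "timeout=". */
int duration_to_seconds(int minutes, int seconds)
{
	return minutes * 60 + seconds;
}

/* Seconds left before a run of run_s seconds would hit timeout_s;
 * a negative value means the run is killed before it finishes. */
int timeout_headroom(int timeout_s, int run_s)
{
	return timeout_s - run_s;
}
```

A 53m48s run is 3228 seconds, which exceeds the 1500-second default by 1728
seconds and fits under a 3600-second limit with 372 seconds to spare.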
Best regards,
Mirsad Todorovac
diff --git a/tools/testing/selftests/net/settings b/tools/testing/selftests/net/settings
index dfc27cdc6c05..ed8418e8217a 100644
--- a/tools/testing/selftests/net/settings
+++ b/tools/testing/selftests/net/settings
@@ -1 +1 @@
-timeout=1500
+timeout=3600
-----------------------------------------------------------------
[snip]
#################################################################
Ping LLA with multiple interfaces
TEST: Pre cycle, ping out ns-B - multicast IP [ OK ]
TEST: Pre cycle, ping out ns-C - multicast IP [ OK ]
TEST: Post cycle ns-A eth1, ping out ns-B - multicast IP [ OK ]
TEST: Post cycle ns-A eth1, ping out ns-C - multicast IP [ OK ]
TEST: Post cycle ns-A eth2, ping out ns-B - multicast IP [ OK ]
TEST: Post cycle ns-A eth2, ping out ns-C - multicast IP [ OK ]
#################################################################
SNAT on VRF
TEST: IPv4 TCP connection over VRF with SNAT [ OK ]
TEST: IPv6 TCP connection over VRF with SNAT [ OK ]
Tests passed: 914
Tests failed: 0
real 53m48.460s
user 0m32.885s
sys 2m41.509s
root@hostname:/
--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union
"I see something approaching fast ... Will it be friends with me?"
Hi, all,
There is a minor omission in selftests/alsa/conf.c: errno is passed to
ksft_exit_fail_msg() for a %s format specifier, where the sibling calls pass
strerror(errno).
The bug was apparently introduced with the commit aba51cd0949ae
("selftests: alsa - add PCM test").
As a diff speaks like a thousand words, the fix is simple:
Regards,
Mirsad
----- cut -----
diff --git a/tools/testing/selftests/alsa/conf.c b/tools/testing/selftests/alsa/conf.c
index 00925eb8d9f4..89e3656a042d 100644
--- a/tools/testing/selftests/alsa/conf.c
+++ b/tools/testing/selftests/alsa/conf.c
@@ -179,7 +179,7 @@ static char *sysfs_get(const char *sysfs_root, const char *id)
close(fd);
if (len < 0)
ksft_exit_fail_msg("sysfs: unable to read value '%s': %s\n",
- path, errno);
+ path, strerror(errno));
while (len > 0 && path[len-1] == '\n')
len--;
path[len] = '\0';
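
The format-specifier mismatch can be shown outside the selftest with a minimal
userspace sketch. format_read_error() is a hypothetical helper written for
this example (it is not part of kselftest); it uses snprintf() in place of
ksft_exit_fail_msg() so the resulting text can be inspected.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the error text into buf.
 * "%s" expects a char *, so the integer errno value has to be converted
 * with strerror(); passing the int directly is undefined behavior. */
int format_read_error(char *buf, size_t size, const char *path, int err)
{
	return snprintf(buf, size, "sysfs: unable to read value '%s': %s\n",
			path, strerror(err));
}
```

Compilers that check printf-style formats typically report the original
mistake as a warning when the callee is annotated for format checking.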
--
Mirsad Goran Todorovac
=== Description ===
This is a bpf-treewide change that annotates all kfuncs as such inside
.BTF_ids. This annotation eventually allows us to automatically generate
kfunc prototypes from bpftool.
We store this metadata inside a yet-unused flags field inside struct
btf_id_set8 (thanks Kumar!). pahole will be taught where to look.
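
As a rough sketch of what storing metadata in a set-level flags field can look
like, consider the userspace mirror below. The struct layout is an assumption
modeled on an ID set with per-entry (id, flags) pairs, and id_set8,
SET8_KFUNCS_FLAG and set_is_kfunc_set() are names invented for this example,
not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit marking a set as containing only kfuncs. */
#define SET8_KFUNCS_FLAG (1u << 0)

/* Assumed userspace mirror of an ID set with a set-level flags word. */
struct id_set8 {
	uint32_t cnt;		/* number of entries in pairs[] */
	uint32_t flags;		/* set-level metadata (previously unused) */
	struct {
		uint32_t id;
		uint32_t flags;	/* per-entry flags */
	} pairs[];
};

/* A consumer can tell annotated sets apart by testing the flag bit. */
bool set_is_kfunc_set(const struct id_set8 *set)
{
	return (set->flags & SET8_KFUNCS_FLAG) != 0;
}
```

Because the flag lives in the set header rather than in each entry, a tool
scanning the section only has to inspect one word per set.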
More details about the full chain of events are available in commit 3's
description.
The accompanying pahole changes (still need some cleanup) can be viewed
here on this "frozen" branch [0].
[0]: https://github.com/danobi/pahole/tree/kfunc_btf-mailed
=== Changelog ===
Changes from v1:
* Move WARN_ON() up a call level
* Also return error when kfunc set is not properly tagged
* Use BTF_KFUNCS_START/END instead of flags
* Rename BTF_SET8_KFUNC to BTF_SET8_KFUNCS
Daniel Xu (3):
bpf: btf: Support flags for BTF_SET8 sets
bpf: btf: Add BTF_KFUNCS_START/END macro pair
bpf: treewide: Annotate BPF kfuncs in BTF
drivers/hid/bpf/hid_bpf_dispatch.c | 8 +++----
fs/verity/measure.c | 4 ++--
include/linux/btf_ids.h | 21 +++++++++++++++----
kernel/bpf/btf.c | 4 ++++
kernel/bpf/cpumask.c | 4 ++--
kernel/bpf/helpers.c | 8 +++----
kernel/bpf/map_iter.c | 4 ++--
kernel/cgroup/rstat.c | 4 ++--
kernel/trace/bpf_trace.c | 8 +++----
net/bpf/test_run.c | 8 +++----
net/core/filter.c | 16 +++++++-------
net/core/xdp.c | 4 ++--
net/ipv4/bpf_tcp_ca.c | 4 ++--
net/ipv4/fou_bpf.c | 4 ++--
net/ipv4/tcp_bbr.c | 4 ++--
net/ipv4/tcp_cubic.c | 4 ++--
net/ipv4/tcp_dctcp.c | 4 ++--
net/netfilter/nf_conntrack_bpf.c | 4 ++--
net/netfilter/nf_nat_bpf.c | 4 ++--
net/xfrm/xfrm_interface_bpf.c | 4 ++--
net/xfrm/xfrm_state_bpf.c | 4 ++--
.../selftests/bpf/bpf_testmod/bpf_testmod.c | 8 +++----
22 files changed, 77 insertions(+), 60 deletions(-)
--
2.42.1
The expression "source ../lib.sh" added to net/forwarding/lib.sh in commit
25ae948b4478 ("selftests/net: add lib.sh") does not work for tests outside
net/forwarding which source net/forwarding/lib.sh (1). It also does not
work in some cases where only a subset of tests are exported (2).
Avoid the problems mentioned above by replacing the faulty expression with
a copy of the content from net/lib.sh which is used by files under
net/forwarding.
A more thorough solution which avoids duplicating content between
net/lib.sh and net/forwarding/lib.sh has been posted here:
https://lore.kernel.org/netdev/20231222135836.992841-1-bpoirier@nvidia.com/
The approach in the current patch is a stopgap solution to avoid submitting
large changes at the eleventh hour of this development cycle.
Example of problem 1)
tools/testing/selftests/drivers/net/bonding$ ./dev_addr_lists.sh
./net_forwarding_lib.sh: line 41: ../lib.sh: No such file or directory
TEST: bonding cleanup mode active-backup [ OK ]
TEST: bonding cleanup mode 802.3ad [ OK ]
TEST: bonding LACPDU multicast address to slave (from bond down) [ OK ]
TEST: bonding LACPDU multicast address to slave (from bond up) [ OK ]
An error message is printed but since the test does not use functions from
net/lib.sh, the test results are not affected.
Example of problem 2)
tools/testing/selftests$ make install TARGETS="net/forwarding"
tools/testing/selftests$ cd kselftest_install/net/forwarding/
tools/testing/selftests/kselftest_install/net/forwarding$ ./pedit_ip.sh veth{0..3}
lib.sh: line 41: ../lib.sh: No such file or directory
TEST: ping [ OK ]
TEST: ping6 [ OK ]
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth1 ingress pedit ip src set 198.51.100.1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth2 egress pedit ip src set 198.51.100.1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth1 ingress pedit ip dst set 198.51.100.1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth2 egress pedit ip dst set 198.51.100.1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth1 ingress pedit ip6 src set 2001:db8:2::1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth2 egress pedit ip6 src set 2001:db8:2::1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth1 ingress pedit ip6 dst set 2001:db8:2::1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth2 egress pedit ip6 dst set 2001:db8:2::1 [FAIL]
Expected to get 10 packets, but got .
In this case, the test results are affected.
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Suggested-by: Ido Schimmel <idosch(a)nvidia.com>
Suggested-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
Tested-by: Petr Machata <petrm(a)nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier(a)nvidia.com>
---
tools/testing/selftests/net/forwarding/lib.sh | 27 ++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 69ef2a40df21..8a61464ab6eb 100755
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -38,7 +38,32 @@ if [[ -f $relative_path/forwarding.config ]]; then
source "$relative_path/forwarding.config"
fi
-source ../lib.sh
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+busywait()
+{
+ local timeout=$1; shift
+
+ local start_time="$(date -u +%s%3N)"
+ while true
+ do
+ local out
+ out=$("$@")
+ local ret=$?
+ if ((!ret)); then
+ echo -n "$out"
+ return 0
+ fi
+
+ local current_time="$(date -u +%s%3N)"
+ if ((current_time - start_time > timeout)); then
+ echo -n "$out"
+ return 1
+ fi
+ done
+}
+
##############################################################################
# Sanity checks
--
2.43.0
This series attempts to reduce the parsing overhead of IPv6 extension
headers in GRO and GSO, by removing extension header specific code and
enabling the frag0 fast path.
The following changes were made:
- Removed some unnecessary HBH conditionals by adding HBH offload
to inet6_offloads
- Added a utility function to support frag0 fast path in ipv6_gro_receive
- Added selftests for IPv6 packets with extension headers in GRO
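
For readers unfamiliar with what parsing extension headers involves, here is a
simplified userspace sketch. It is not the kernel's GRO code;
skip_ext_headers() and the EXT_* names are invented for this example. It walks
hop-by-hop, routing and destination-options headers, each of which starts with
a next-header byte and a length byte counted in 8-octet units that excludes
the first 8 octets.

```c
#include <stddef.h>
#include <stdint.h>

/* Protocol numbers of the extension headers handled by this sketch. */
#define EXT_HOPOPTS 0
#define EXT_ROUTING 43
#define EXT_DSTOPTS 60

/*
 * Hypothetical helper: walk the extension header chain in buf.
 * Returns the total length of the extension headers in bytes and stores the
 * upper-layer protocol in *upper_proto, or returns -1 if buf is truncated.
 */
int skip_ext_headers(const uint8_t *buf, size_t len, uint8_t first_nexthdr,
		     uint8_t *upper_proto)
{
	size_t off = 0;
	uint8_t next = first_nexthdr;

	while (next == EXT_HOPOPTS || next == EXT_ROUTING || next == EXT_DSTOPTS) {
		size_t hlen;

		/* Need at least the next-header and length bytes. */
		if (len - off < 2)
			return -1;
		/* Length field is in 8-octet units, not counting the first 8. */
		hlen = ((size_t)buf[off + 1] + 1) * 8;
		if (len - off < hlen)
			return -1;
		next = buf[off];
		off += hlen;
	}
	*upper_proto = next;
	return (int)off;
}
```

Every header visited costs extra bounds checks and loads, which is the kind of
per-packet overhead a fast path tries to keep small.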
Richard
v2 -> v3:
- Removed previously added IPv6 extension header length constant and
using sizeof(*opth) instead.
- Removed unnecessary conditional in gro selftest framework
- v2:
https://lore.kernel.org/netdev/127b8199-1cd4-42d7-9b2b-875abaad93fe@gmail.c…
v1 -> v2:
- Added a minimum IPv6 extension header length constant to make code self
documenting.
- Added new selftest which checks that packets with different extension
header payloads do not coalesce.
- Added more info in the second commit message regarding the code changes.
- v1:
https://lore.kernel.org/netdev/f4eff69d-3917-4c42-8c6b-d09597ac4437@gmail.c…
Richard Gobert (3):
net: gso: add HBH extension header offload support
net: gro: parse ipv6 ext headers without frag0 invalidation
selftests/net: fix GRO coalesce test and add ext header coalesce tests
net/ipv6/exthdrs_offload.c | 11 ++++
net/ipv6/ip6_offload.c | 76 +++++++++++++++++--------
tools/testing/selftests/net/gro.c | 93 +++++++++++++++++++++++++++++--
3 files changed, 150 insertions(+), 30 deletions(-)
--
2.36.1
Hi,
for this v3 I changed the approach for identifying devices in a stable
way from the match fields back to the hardware topology (used in v1).
The match fields were proposed as a way to avoid the possible issue of
PCI topology being reconfigured, but that wasn't observed on any real
system so far. However using match fields does allow for a real issue if
an external device similar to an internal one is connected to the
system, which results in a change of the match count and therefore a
test failure. So using the HW topology was chosen as the most reliable
approach.
The per-platform device description file now uses YAML following a
suggestion from Chris Obbard, and the test script was re-written in
python to handle the new YAML format.
A second sample board file is also now included for an x86 platform,
which contains a USB controller behind a PCI controller, which wasn't
possible to describe in v1.
Thanks,
Nícolas
v2: https://lore.kernel.org/all/20231127233558.868365-1-nfraprado@collabora.com
v1: https://lore.kernel.org/all/20231024211818.365844-1-nfraprado@collabora.com
Original cover letter:
This is part of an effort to improve detection of regressions impacting
device probe on all platforms. The recently merged DT kselftest [3]
detects probe issues for all devices described statically in the DT.
That leaves out devices discovered at run-time from discoverable busses.
This is where this test comes in. All of the devices that are connected
through discoverable busses (ie USB and PCI), and which are internal and
therefore always present, can be described in a per-platform file so
they can be checked for. The test will check that the device has been
instantiated and bound to a driver.
Patch 1 introduces the test. Patch 2 and 3 add the device definitions
for the google,spherion machine (Acer Chromebook 514) and XPS 13 as
examples.
This is the output from the test running on Spherion:
TAP version 13
Using board file: boards/google,spherion.yaml
1..8
ok 1 /usb2-controller@11200000/1.4.1/camera.device
ok 2 /usb2-controller@11200000/1.4.1/camera.0.driver
ok 3 /usb2-controller@11200000/1.4.1/camera.1.driver
ok 4 /usb2-controller@11200000/1.4.2/bluetooth.device
ok 5 /usb2-controller@11200000/1.4.2/bluetooth.0.driver
ok 6 /usb2-controller@11200000/1.4.2/bluetooth.1.driver
ok 7 /pci-controller@11230000/0.0/0.0/wifi.device
ok 8 /pci-controller@11230000/0.0/0.0/wifi.driver
Totals: pass:8 fail:0 xfail:0 xpass:0 skip:0 error:0
[3] https://lore.kernel.org/all/20230828211424.2964562-1-nfraprado@collabora.co…
Changes in v3:
- Reverted approach of encoding stable device reference in test file
from device match fields (from modalias) back to HW topology (from v1)
- Changed board file description to YAML
- Rewrote test script in python to handle YAML and support x86 platforms
Changes in v2:
- Changed approach of encoding stable device reference in test file from
HW topology to device match fields (the ones from modalias)
- Better documented test format
Nícolas F. R. A. Prado (3):
kselftest: Add test to verify probe of devices from discoverable
busses
kselftest: devices: Add sample board file for google,spherion
kselftest: devices: Add sample board file for XPS 13 9300
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/devices/Makefile | 4 +
.../devices/boards/Dell Inc.,XPS 13 9300.yaml | 40 +++
.../devices/boards/google,spherion.yaml | 50 +++
tools/testing/selftests/devices/ksft.py | 90 +++++
.../devices/test_discoverable_devices.py | 318 ++++++++++++++++++
6 files changed, 503 insertions(+)
create mode 100644 tools/testing/selftests/devices/Makefile
create mode 100644 tools/testing/selftests/devices/boards/Dell Inc.,XPS 13 9300.yaml
create mode 100644 tools/testing/selftests/devices/boards/google,spherion.yaml
create mode 100644 tools/testing/selftests/devices/ksft.py
create mode 100755 tools/testing/selftests/devices/test_discoverable_devices.py
--
2.43.0
Add NULL checks to KUNIT_BINARY_STR_ASSERTION() so that it will fail
cleanly if either pointer is NULL, instead of causing a NULL pointer
dereference in the strcmp().
A test failure could be that a string is unexpectedly NULL. This could
be trapped by KUNIT_ASSERT_NOT_NULL() but that would terminate the test
at that point. It's preferable that the KUNIT_EXPECT_STR*() macros can
handle NULL pointers as a failure.
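
The guard can be illustrated in plain userspace C. str_eq_null_safe() is a
hypothetical helper written for this sketch, not a KUnit function; it relies
on short-circuit evaluation so that strcmp() is only reached when both
pointers are non-NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of KUnit: NULL-safe string equality.
 * A NULL on either side makes the check fail instead of crashing. */
bool str_eq_null_safe(const char *left, const char *right)
{
	/* && evaluates left to right and stops at the first false operand. */
	return left && right && strcmp(left, right) == 0;
}
```

Note that in this sketch two NULL pointers do not compare equal; a NULL input
is always treated as a failed comparison.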
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
---
include/kunit/test.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index b163b9984b33..c2ce379c329b 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -758,7 +758,7 @@ do { \
.right_text = #right, \
}; \
\
- if (likely(strcmp(__left, __right) op 0)) \
+ if (likely((__left) && (__right) && (strcmp(__left, __right) op 0))) \
break; \
\
\
--
2.30.2
The RISC-V arch_timer selftest is used to validate Sstc timer
functionality in a guest; it sets up periodic timer interrupts
and checks the basic interrupt status upon their receipt.
This KVM selftest was ported from the aarch64 arch_timer test and tested
with Linux v6.7-rc4 on a QEMU riscv64 virt machine.
---
Changed since v4:
* Rebased to Linux 6.7-rc4
* Included Paolo's patch(01/11) to fix issues with SPLIT_TESTS
* Dropped the patch (KVM: selftests: Unify the makefile rule for split targets)
since Paolo's patch already includes the fix
* Added new patch (05/11) to include header file vdso/processor.h from the Linux
source tree to leverage the cpu_relax() definition - Conor/Andrew
* Added new patch (11/11) to enable user configuration of the timer error margin
parameter, which alleviates the intermittent failure in the stress test - Andrew
* Other minor fixes per Andrew's comments
Haibo Xu (10):
KVM: arm64: selftests: Split arch_timer test code
KVM: selftests: Add CONFIG_64BIT definition for the build
tools: riscv: Add header file csr.h
tools: riscv: Add header file vdso/processor.h
KVM: riscv: selftests: Switch to use macro from csr.h
KVM: riscv: selftests: Add exception handling support
KVM: riscv: selftests: Add guest helper to get vcpu id
KVM: riscv: selftests: Change vcpu_has_ext to a common function
KVM: riscv: selftests: Add sstc timer test
KVM: selftests: Enable tunning of err_margin_us in arch timer test
Paolo Bonzini (1):
selftests/kvm: Fix issues with $(SPLIT_TESTS)
tools/arch/riscv/include/asm/csr.h | 521 ++++++++++++++++++
tools/arch/riscv/include/asm/vdso/processor.h | 32 ++
tools/testing/selftests/kvm/Makefile | 27 +-
.../selftests/kvm/aarch64/arch_timer.c | 295 +---------
tools/testing/selftests/kvm/arch_timer.c | 259 +++++++++
.../selftests/kvm/include/aarch64/processor.h | 4 -
.../selftests/kvm/include/kvm_util_base.h | 9 +
.../selftests/kvm/include/riscv/arch_timer.h | 71 +++
.../selftests/kvm/include/riscv/processor.h | 65 ++-
.../testing/selftests/kvm/include/test_util.h | 2 +
.../selftests/kvm/include/timer_test.h | 45 ++
.../selftests/kvm/lib/riscv/handlers.S | 101 ++++
.../selftests/kvm/lib/riscv/processor.c | 87 +++
.../testing/selftests/kvm/riscv/arch_timer.c | 111 ++++
.../selftests/kvm/riscv/get-reg-list.c | 11 +-
15 files changed, 1333 insertions(+), 307 deletions(-)
create mode 100644 tools/arch/riscv/include/asm/csr.h
create mode 100644 tools/arch/riscv/include/asm/vdso/processor.h
create mode 100644 tools/testing/selftests/kvm/arch_timer.c
create mode 100644 tools/testing/selftests/kvm/include/riscv/arch_timer.h
create mode 100644 tools/testing/selftests/kvm/include/timer_test.h
create mode 100644 tools/testing/selftests/kvm/lib/riscv/handlers.S
create mode 100644 tools/testing/selftests/kvm/riscv/arch_timer.c
--
2.34.1
The patch set [1] added a general lib.sh in net selftests, and converted
several test scripts to source the lib.sh.
unicast_extensions.sh (converted in [1]) and pmtu.sh (converted in [2])
have a /bin/sh shebang which may point to various shells in different
distributions, but "source" is only available in some of them. For
example, "source" is a built-in command in bash, but it cannot be
used in dash.
Following the other scripts that were converted at the same time, simply change the
shebang to bash to fix the following issues when the default /bin/sh
points to other shells.
# selftests: net: unicast_extensions.sh
# ./unicast_extensions.sh: 31: source: not found
# ###########################################################################
# Unicast address extensions tests (behavior of reserved IPv4 addresses)
# ###########################################################################
# TEST: assign and ping within 240/4 (1 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 240/4 (2 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 0/8 (1 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 0/8 (2 of 2) (is allowed) [FAIL]
# TEST: assign and ping inside 255.255/16 (is allowed) [FAIL]
# TEST: assign and ping inside 255.255.255/24 (is allowed) [FAIL]
# TEST: route between 240.5.6/24 and 255.1.2/24 (is allowed) [FAIL]
# TEST: route between 0.200/16 and 245.99/16 (is allowed) [FAIL]
# TEST: assign and ping lowest address (/24) [FAIL]
# TEST: assign and ping lowest address (/26) [FAIL]
# TEST: routing using lowest address [FAIL]
# TEST: assigning 0.0.0.0 (is forbidden) [ OK ]
# TEST: assigning 255.255.255.255 (is forbidden) [ OK ]
# TEST: assign and ping inside 127/8 (is forbidden) [ OK ]
# TEST: assign and ping class D address (is forbidden) [ OK ]
# TEST: routing using class D (is forbidden) [ OK ]
# TEST: routing using 127/8 (is forbidden) [ OK ]
not ok 51 selftests: net: unicast_extensions.sh # exit=1
v1 -> v2:
- Fix pmtu.sh which has the same issue as unicast_extensions.sh,
suggested by Hangbin
- Change the style of the "source" line to be consistent with other
tests, suggested by Hangbin
Link: https://lore.kernel.org/all/20231202020110.362433-1-liuhangbin@gmail.com/ [1]
Link: https://lore.kernel.org/all/20231219094856.1740079-1-liuhangbin@gmail.com/ [2]
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Signed-off-by: Yujie Liu <yujie.liu(a)intel.com>
---
tools/testing/selftests/net/pmtu.sh | 4 ++--
tools/testing/selftests/net/unicast_extensions.sh | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/net/pmtu.sh b/tools/testing/selftests/net/pmtu.sh
index 175d3d1d773b..f10879788f61 100755
--- a/tools/testing/selftests/net/pmtu.sh
+++ b/tools/testing/selftests/net/pmtu.sh
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
#
# Check that route PMTU values match expectations, and that initial device MTU
@@ -198,7 +198,7 @@
# - pmtu_ipv6_route_change
# Same as above but with IPv6
-source ./lib.sh
+source lib.sh
PAUSE_ON_FAIL=no
VERBOSE=0
diff --git a/tools/testing/selftests/net/unicast_extensions.sh b/tools/testing/selftests/net/unicast_extensions.sh
index b7a2cb9e7477..f52aa5f7da52 100755
--- a/tools/testing/selftests/net/unicast_extensions.sh
+++ b/tools/testing/selftests/net/unicast_extensions.sh
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
#
# By Seth Schoen (c) 2021, for the IPv4 Unicast Extensions Project
@@ -28,7 +28,7 @@
# These tests provide an easy way to flip the expected result of any
# of these behaviors for testing kernel patches that change them.
-source ./lib.sh
+source lib.sh
# nettest can be run from PATH or from same directory as this selftest
if ! which nettest >/dev/null; then
base-commit: cd4d7263d58ab98fd4dee876776e4da6c328faa3
--
2.34.1
This series attempts to reduce the parsing overhead of IPv6 extension
headers in GRO and GSO, by removing extension header specific code and
enabling the frag0 fast path.
The following changes were made:
- Removed some unnecessary HBH conditionals by adding HBH offload
to inet6_offloads
- Added a utility function to support frag0 fast path in ipv6_gro_receive
- Added selftests for IPv6 packets with extension headers in GRO
Richard
v1 -> v2:
- Added a minimum IPv6 extension header length constant to make the code
self-documenting.
- Added new selftest which checks that packets with different extension
header payloads do not coalesce.
- Added more info in the second commit message regarding the code changes.
- v1:
https://lore.kernel.org/netdev/f4eff69d-3917-4c42-8c6b-d09597ac4437@gmail.c…
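For readers less familiar with why extension headers add parsing cost, here is
a minimal userspace sketch of walking an extension header chain. It is not the
code from this series. It assumes the generic layout from RFC 8200 (first byte
is Next Header, second byte is Hdr Ext Len in 8-octet units not counting the
first 8 octets), and the names toy_is_generic_ext_hdr(), toy_ext_hdrs_len() and
TOY_EXT_HDR_MIN_LEN are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Next Header values from the IANA protocol numbers registry. */
#define TOY_NEXTHDR_HOP     0   /* Hop-by-Hop Options */
#define TOY_NEXTHDR_ROUTING 43  /* Routing header */
#define TOY_NEXTHDR_DEST    60  /* Destination Options */

/* Hypothetical constant: the smallest possible extension header, in bytes. */
#define TOY_EXT_HDR_MIN_LEN 8

/* Headers that share the generic "next header + length" layout. */
static int toy_is_generic_ext_hdr(uint8_t nexthdr)
{
	return nexthdr == TOY_NEXTHDR_HOP ||
	       nexthdr == TOY_NEXTHDR_ROUTING ||
	       nexthdr == TOY_NEXTHDR_DEST;
}

/*
 * Walk the chain that starts at buf, whose first header has type nexthdr.
 * Returns the total chain length in bytes, or -1 if the buffer is too
 * short. On success *proto holds the upper-layer protocol number.
 */
static int toy_ext_hdrs_len(const uint8_t *buf, size_t avail,
			    uint8_t nexthdr, uint8_t *proto)
{
	size_t off = 0;

	while (toy_is_generic_ext_hdr(nexthdr)) {
		size_t len;

		if (avail - off < TOY_EXT_HDR_MIN_LEN)
			return -1;
		/* Hdr Ext Len excludes the first 8 octets. */
		len = ((size_t)buf[off + 1] + 1) * 8;
		if (avail - off < len)
			return -1;
		nexthdr = buf[off];
		off += len;
	}
	*proto = nexthdr;
	return (int)off;
}
```

Every header in the chain has to be touched before the transport header offset
is known, which is why keeping this walk on a fast path matters.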
Richard Gobert (3):
net: gso: add HBH extension header offload support
net: gro: parse ipv6 ext headers without frag0 invalidation
selftests/net: fix GRO coalesce test and add ext header coalesce tests
include/net/ipv6.h | 1 +
net/ipv6/exthdrs_offload.c | 11 ++++
net/ipv6/ip6_offload.c | 76 +++++++++++++++++--------
tools/testing/selftests/net/gro.c | 94 +++++++++++++++++++++++++++++--
4 files changed, 152 insertions(+), 30 deletions(-)
--
2.36.1
Nested translation is a hardware feature supported by many modern IOMMUs.
It uses two stages of address translation (stage-1 and stage-2) to reach
the physical address. The stage-1 translation table is owned by userspace
(e.g. by a guest OS), while the stage-2 table is owned by the kernel. Changes
to the stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example: the stage-1 translation table is the I/O page
table. As the diagram below shows, the guest I/O page table pointer in GPA
(guest physical address) is passed to the host and used to perform the stage-1
address translation. Accordingly, modifications to present mappings in the
guest I/O page table should be followed by an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
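As a rough illustration of why a stage-1 change needs an IOTLB invalidation,
the following userspace toy model chains two lookup tables and caches the
combined result. It is only a sketch under simplifying assumptions (single
level tables, four pages, no PASID). struct toy_iommu, toy_translate() and
toy_iotlb_invalidate() are hypothetical names and not part of iommufd or the
VT-d driver.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_NPAGES     4
#define TOY_NO_ENTRY   UINT64_MAX

struct toy_iommu {
	uint64_t stage1[TOY_NPAGES]; /* GIOVA page -> GPA page, guest owned */
	uint64_t stage2[TOY_NPAGES]; /* GPA page -> HPA page, kernel owned */
	uint64_t iotlb[TOY_NPAGES];  /* cached GIOVA page -> HPA page */
};

/* Drop every cached translation. */
static void toy_iotlb_invalidate(struct toy_iommu *m)
{
	for (int i = 0; i < TOY_NPAGES; i++)
		m->iotlb[i] = TOY_NO_ENTRY;
}

/* Translate a GIOVA to an HPA, filling the toy IOTLB on a miss. */
static uint64_t toy_translate(struct toy_iommu *m, uint64_t giova)
{
	uint64_t pg = giova >> TOY_PAGE_SHIFT;
	uint64_t off = giova & ((1ULL << TOY_PAGE_SHIFT) - 1);

	if (pg >= TOY_NPAGES)
		return TOY_NO_ENTRY;
	if (m->iotlb[pg] == TOY_NO_ENTRY) {
		uint64_t gpa_pg = m->stage1[pg];

		if (gpa_pg >= TOY_NPAGES || m->stage2[gpa_pg] == TOY_NO_ENTRY)
			return TOY_NO_ENTRY;
		/* Nested walk: stage-1 result is fed through stage-2. */
		m->iotlb[pg] = m->stage2[gpa_pg];
	}
	return (m->iotlb[pg] << TOY_PAGE_SHIFT) | off;
}
```

In this model, rewriting stage1[] after a translation has been cached keeps
returning the old HPA until toy_iotlb_invalidate() is called, which mirrors the
requirement stated above.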
This series is based on the first part, which was merged [1]. It adds the
cache invalidation interface for userspace to invalidate the cache after
modifying the stage-1 page table. This includes both the iommufd changes and the
VT-d driver changes.
Complete code can be found in [2]; the QEMU code can be found in [3].
Finally, this is joint work with Nicolin Chen and Lu Baolu. Thanks to them
for the help. ^_^ Looking forward to your feedback.
[1] https://lore.kernel.org/linux-iommu/20231026044216.64964-1-yi.l.liu@intel.c… - merged
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v10:
- Minor tweak to patch 07 (Kevin)
- Rebase on top of 6.7-rc8
v9: https://lore.kernel.org/linux-iommu/20231228150629.13149-1-yi.l.liu@intel.c…
- Add a test case which sets both IOMMU_TEST_INVALIDATE_FLAG_ALL and
IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR in flags, and expect to succeed
and see an 'error'. (Kevin)
- Return -ETIMEDOUT in qi_check_fault() if the caller is interested in the
fault when a timeout happens. Otherwise qi_submit_sync() keeps retrying and
is unable to report the error back to the user. For now, only the user cache
invalidation path is interested in the timeout error, so this change only
affects that path. Other paths will still hang in qi_submit_sync() when a
timeout happens. (Kevin)
v8: https://lore.kernel.org/linux-iommu/20231227161354.67701-1-yi.l.liu@intel.c…
- Pass invalidation hint to the cache invalidation helper in the cache_invalidate_user
op path (Kevin)
- Move the devTLB invalidation out of info->iommu loop (Kevin, Weijiang)
- Clear *fault per restart in qi_submit_sync() to avoid error accumulation
across submissions. (Kevin)
- Define the vtd cache invalidation uapi structure in separate patch (Kevin)
- Rename inv_error to be hw_error (Kevin)
- Rename 'reqs_uptr', 'req_type', 'req_len' and 'req_num' to be 'data_uptr',
'data_type', 'entry_len' and 'entry_num' (Kevin)
- Allow user to set IOMMU_TEST_INVALIDATE_FLAG_ALL and IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR
at the same time (Kevin)
v7: https://lore.kernel.org/linux-iommu/20231221153948.119007-1-yi.l.liu@intel.…
- Remove domain->ops->cache_invalidate_user check in hwpt alloc path due
to failure in bisect (Baolu)
- Remove out_driver_error_code from struct iommu_hwpt_invalidate after
discussion in v6. Should expect per-entry error code.
- Rework the selftest cache invalidation part to report a per-entry error
- Allow user to pass in an empty array to have a try-and-fail mechanism for
user to check if a given req_type is supported by the kernel (Jason)
- Define a separate enum type for cache invalidation data (Jason)
- Fix the IOMMU_HWPT_INVALIDATE to always update the req_num field before
returning (Nicolin)
- Merge the VT-d nesting part 2/2
https://lore.kernel.org/linux-iommu/20231117131816.24359-1-yi.l.liu@intel.c…
into this series to avoid defining empty enum in the middle of the series.
The major difference is adding the VT-d related invalidation uapi structures
together with the generic data structures in patch 02 of this series.
- The VT-d driver was refined to report ICE/ITE errors from the bottom cache
invalidation submit helpers, so the cache_invalidate_user op can report
such errors to the user via the per-entry error field. The VT-d driver
will not stop walking the invalidation array on ICE/ITE errors, because
such errors are defined by the VT-d spec and userspace should be able to
handle them and let the real user (say, a virtual machine) know about them.
Other errors, such as an invalid uapi data structure configuration or a
memory copy failure, do stop the array walk, as continuing may cause
more issues.
- Minor fixes per Jason and Kevin's review comments
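The per-entry error behaviour described in the v7 notes can be pictured with
the following userspace sketch. It is not the uapi or the driver code: struct
toy_inv_entry, the flag mask, the error value and toy_fake_submit() are all
hypothetical stand-ins. The point is only the control flow, where hardware
errors are recorded per entry and the walk continues, while a malformed entry
stops the walk and the processed count is always written back.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical entry layout, loosely modelled on a per-entry error field. */
struct toy_inv_entry {
	uint64_t addr;
	uint32_t flags;
	uint32_t hw_error; /* filled in by the walker */
};

#define TOY_FLAG_VALID_MASK 0x1u /* hypothetical set of accepted flags */
#define TOY_HW_ERR_ITE      0x2u /* hypothetical hardware error code */

typedef uint32_t (*toy_submit_fn)(const struct toy_inv_entry *e);

/* Stand-in for hardware: pretend addresses with bit 12 set time out. */
static uint32_t toy_fake_submit(const struct toy_inv_entry *e)
{
	return (e->addr & 0x1000) ? TOY_HW_ERR_ITE : 0;
}

/*
 * Walk the array. *entry_num is the number of entries on input and the
 * number of entries actually handled on output, even on failure.
 */
static int toy_invalidate_array(struct toy_inv_entry *ents,
				uint32_t *entry_num, toy_submit_fn submit)
{
	uint32_t done = 0;
	int rc = 0;

	for (; done < *entry_num; done++) {
		struct toy_inv_entry *e = &ents[done];

		/* A malformed entry stops the walk. */
		if (e->flags & ~TOY_FLAG_VALID_MASK) {
			rc = -EOPNOTSUPP;
			break;
		}
		/* Hardware errors are recorded and the walk continues. */
		e->hw_error = submit(e);
	}
	*entry_num = done;
	return rc;
}
```

Calling the walker with *entry_num set to 0 returns 0 without touching any
entry, which is the shape of the empty-array probe mentioned above.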
v6: https://lore.kernel.org/linux-iommu/20231117130717.19875-1-yi.l.liu@intel.c…
- Not much change, just rebase on top of 6.7-rc1 as part 1/2 is merged
v5: https://lore.kernel.org/linux-iommu/20231020092426.13907-1-yi.l.liu@intel.c…
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation,
this does not change the rule that resv regions should only be added to the
kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series
as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (4):
iommu: Add cache_invalidate_user op
iommu/vt-d: Allow qi_submit_sync() to return the QI faults
iommu/vt-d: Convert stage-1 cache invalidation to return QI fault
iommu/vt-d: Add iotlb flush for nested domain
Nicolin Chen (4):
iommu: Add iommu_copy_struct_from_user_array helper
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (2):
iommufd: Add IOMMU_HWPT_INVALIDATE
iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
drivers/iommu/intel/dmar.c | 42 ++--
drivers/iommu/intel/iommu.c | 12 +-
drivers/iommu/intel/iommu.h | 8 +-
drivers/iommu/intel/irq_remapping.c | 2 +-
drivers/iommu/intel/nested.c | 107 ++++++++++
drivers/iommu/intel/pasid.c | 14 +-
drivers/iommu/intel/svm.c | 14 +-
drivers/iommu/iommufd/hw_pagetable.c | 41 ++++
drivers/iommu/iommufd/iommufd_private.h | 10 +
drivers/iommu/iommufd/iommufd_test.h | 39 ++++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 86 ++++++++
include/linux/iommu.h | 100 +++++++++
include/uapi/linux/iommufd.h | 101 ++++++++++
tools/testing/selftests/iommu/iommufd.c | 190 ++++++++++++++++++
tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++
16 files changed, 787 insertions(+), 39 deletions(-)
--
2.34.1
For now, we have to call some helpers when we need to update the csum,
such as bpf_l4_csum_replace, bpf_l3_csum_replace, etc. These helpers are
not inlined, which causes poor performance.
In fact, we can define our own csum update functions in BPF program
instead of bpf_l3_csum_replace, which is totally inlined and efficient.
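To make the idea of a self-defined, inlinable csum update concrete, here is a
minimal sketch of the incremental update from RFC 1624 (HC' = ~(~HC + ~m + m'))
written as plain C. It is not taken from this series or from libbpf. The
toy_* names are hypothetical, and the full-sum helper is included only so the
incremental result can be compared against a recomputation.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static inline uint16_t toy_csum_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full one's complement checksum over n 16-bit words (n assumed small). */
static inline uint16_t toy_csum(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += words[i];
	return (uint16_t)~toy_csum_fold(sum);
}

/* Incremental update when one 16-bit word changes from old to new_val. */
static inline uint16_t toy_csum_replace16(uint16_t check, uint16_t old,
					   uint16_t new_val)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old;
	sum += new_val;
	return (uint16_t)~toy_csum_fold(sum);
}
```

With no helper call involved, a compiler can inline all three functions into
the caller.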
However, we can't do this for bpf_l4_csum_replace for now, as we can't
update skb->csum, which can leave skb->csum invalid in the rx path with
CHECKSUM_COMPLETE mode.
What's more, in some cases we can't use direct data access and have to use
skb_store_bytes() with the BPF_F_RECOMPUTE_CSUM flag, such as when modifying
the vni in the vxlan header while the underlay udp header has no checksum.
In the first patch, we make skb->csum readable and writable, and we make
skb->ip_summed readable. For now, for tc only. With these 2 fields, we
don't need to call bpf helpers for csum update any more.
In the second patch, we add some testcases for the read/write testing for
skb->csum and skb->ip_summed.
If this series is acceptable, we can define the inlined functions for csum
update in libbpf in the next step.
Menglong Dong (2):
bpf: add csum/ip_summed fields to __sk_buff
testcases/bpf: add testcases for skb->csum to ctx_skb.c
include/linux/skbuff.h | 2 +
include/uapi/linux/bpf.h | 2 +
net/core/filter.c | 22 ++++++++++
tools/include/uapi/linux/bpf.h | 2 +
.../testing/selftests/bpf/verifier/ctx_skb.c | 43 +++++++++++++++++++
5 files changed, 71 insertions(+)
--
2.39.2
While testing the split PMD path with lockdep enabled I've got an
"Invalid wait context" error caused by split_huge_page_to_list() trying
to lock anon_vma->rwsem while inside RCU read section. The issue is due
to move_pages_pte() calling split_folio() under RCU read lock. Fix this
by unmapping the PTEs and exiting RCU read section before splitting the
folio and then retrying. The same retry pattern is used when locking the
folio or anon_vma in this function. After splitting the large folio we
unlock and release it because after the split the old folio might not be
the one that contains the src_addr.
Fixes: 94b01c885131 ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
---
Changes from v1 [1]:
1. Reset src_folio and src_folio_pte after folio is split, per Peter Xu
[1] https://lore.kernel.org/all/20231230025607.2476912-1-surenb@google.com/
mm/userfaultfd.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 5e718014e671..216ab4c8621f 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1078,9 +1078,18 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
/* at this point we have src_folio locked */
if (folio_test_large(src_folio)) {
+ /* split_folio() can block */
+ pte_unmap(&orig_src_pte);
+ pte_unmap(&orig_dst_pte);
+ src_pte = dst_pte = NULL;
err = split_folio(src_folio);
if (err)
goto out;
+ /* have to reacquire the folio after it got split */
+ folio_unlock(src_folio);
+ folio_put(src_folio);
+ src_folio = NULL;
+ goto retry;
}
if (!src_anon_vma) {
--
2.43.0.472.g3155946c3a-goog
While testing the split PMD path with lockdep enabled I've got an
"Invalid wait context" error caused by split_huge_page_to_list() trying
to lock anon_vma->rwsem while inside RCU read section. The issue is due
to move_pages_pte() calling split_folio() under RCU read lock. Fix this
by unmapping the PTEs and exiting RCU read section before splitting the
folio and then retrying. The same retry pattern is used when locking the
folio or anon_vma in this function.
Fixes: 94b01c885131 ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
---
Patch applies over mm-unstable.
Please note that the SHA in Fixes tag is unstable.
mm/userfaultfd.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 5e718014e671..71393410e028 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1078,9 +1078,14 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
/* at this point we have src_folio locked */
if (folio_test_large(src_folio)) {
+ /* split_folio() can block */
+ pte_unmap(&orig_src_pte);
+ pte_unmap(&orig_dst_pte);
+ src_pte = dst_pte = NULL;
err = split_folio(src_folio);
if (err)
goto out;
+ goto retry;
}
if (!src_anon_vma) {
--
2.43.0.472.g3155946c3a-goog
From: Roberto Sassu <roberto.sassu(a)huawei.com>
IMA and EVM are not effectively LSMs, especially due to the fact that in
the past they could not provide a security blob while another LSM was
active.
That has changed in recent years: the LSM stacking feature now makes it
possible to stack multiple LSMs together, and allows them to provide a
security blob for most kernel objects. While the LSM stacking feature still
has some limitations being worked out, it is already suitable for making IMA
and EVM LSMs.
The main purpose of this patch set is to remove IMA and EVM function calls,
hardcoded in the LSM infrastructure and other places in the kernel, and to
register them as LSM hook implementations, so that those functions are
called by the LSM infrastructure like other regular LSMs.
This patch set introduces two new LSMs 'ima' and 'evm', so that functions
can be registered to their respective LSM, and removes the 'integrity' LSM.
integrity_kernel_module_request() was moved to IMA, since it was related to
appraisal. integrity_inode_free() was replaced with
ima_inode_free_security() (EVM does not need to free memory).
In order to make 'ima' and 'evm' independent LSMs, it was necessary to
split integrity metadata used by both IMA and EVM, and to let them manage
their own. The special case of the IMA_NEW_FILE flag, managed by IMA and
used by EVM, was handled by introducing a new flag in EVM, EVM_NEW_FILE,
managed by two additional LSM hooks, evm_post_path_mknod() and
evm_file_free(), equivalent to their counterparts ima_post_path_mknod() and
ima_file_free().
In addition to splitting metadata, it was decided to embed the full
structure into the inode security blob, rather than using a cache of
objects and allocating them on demand. This opens up new possibilities,
such as improving locking in IMA.
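The difference between embedding the structures in a shared blob and
allocating them on demand can be sketched in userspace as follows. This is a
simplified model, not the LSM infrastructure code. All toy_* names are
hypothetical, and the 8-byte rounding is an assumption that happens to be
enough for these two structures.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-inode metadata of two independent modules. */
struct toy_ima_iint { unsigned long flags; int version; };
struct toy_evm_iint { unsigned long flags; int status; };

/* Running size of the shared blob; each reservation returns an offset. */
static size_t toy_blob_total;

static size_t toy_reserve(size_t size)
{
	size_t off = toy_blob_total;

	toy_blob_total += (size + 7) & ~(size_t)7; /* keep 8-byte alignment */
	return off;
}

struct toy_inode { void *security; };

/* One zeroed allocation per inode covers every module's metadata. */
static int toy_inode_alloc(struct toy_inode *inode)
{
	inode->security = calloc(1, toy_blob_total ? toy_blob_total : 1);
	return inode->security ? 0 : -1;
}

/* Each module finds its own structure at its reserved offset. */
static void *toy_blob(const struct toy_inode *inode, size_t off)
{
	return inode->security ? (char *)inode->security + off : NULL;
}
```

Once the blob exists, a lookup is pointer arithmetic only, so no per-module
cache or separate allocation is needed for an inode that has a blob.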
Another follow-up change was removing the iint parameter from
evm_verifyxattr(), that IMA used to pass integrity metadata to EVM. After
splitting metadata, and aligning EVM_NEW_FILE with IMA_NEW_FILE, this
parameter was not necessary anymore.
The last part was to ensure that the order of IMA and EVM functions is
respected after they become LSMs. Since the order of lsm_info structures in
the .lsm_info.init section depends on the order object files containing
those structures are passed to the linker of the kernel image, and since
IMA is before EVM in the Makefile, that is sufficient to assert that IMA
functions are executed before EVM ones.
The patch set is organized as follows.
Patches 1-9 make IMA and EVM functions suitable to be registered to the LSM
infrastructure, by aligning function parameters.
Patches 10-18 add new LSM hooks in the same places where IMA and EVM
functions are called, if there is no LSM hook already.
Patches 19-21 introduce the new standalone LSMs 'ima' and 'evm', and move
hardcoded calls to IMA, EVM and integrity functions to those LSMs.
Patches 22-23 remove the dependency on the 'integrity' LSM by splitting
integrity metadata, so that the 'ima' and 'evm' LSMs can use their own.
They also duplicate iint_lockdep_annotate() in ima_main.c, since the mutex
field was moved from integrity_iint_cache to ima_iint_cache.
Patch 24 finally removes the 'integrity' LSM, since 'ima' and 'evm' are now
self-contained and independent.
The patch set applies on top of lsm/dev, commit 80b4ff1d2c9b ("selftests:
remove the LSM_ID_IMA check in lsm/lsm_list_modules_test"). The
linux-integrity/next-integrity-testing at commit f17167bea279 ("ima: Remove
EXPERIMENTAL from Kconfig") was merged.
Changelog:
v7:
- Use return instead of goto in __vfs_removexattr_locked() (suggested by
Casey)
- Clarify in security/integrity/Makefile that the order of 'ima' and 'evm'
LSMs depends on the order in which IMA and EVM are compiled
- Move integrity_iint_cache flags to ima.h and evm.h in security/ and
duplicate IMA_NEW_FILE to EVM_NEW_FILE
- Rename evm_inode_get_iint() to evm_iint_inode() and ima_inode_get_iint()
to ima_iint_inode(), check if inode->i_security is NULL, and just return
the pointer from the inode security blob
- Restore the non-NULL checks after ima_iint_inode() and evm_iint_inode()
(suggested by Casey)
- Introduce evm_file_free() to clear EVM_NEW_FILE
- Remove comment about LSM_ORDER_LAST not guaranteeing the order of 'ima'
and 'evm' LSMs
- Lock iint->mutex before reading IMA_COLLECTED flag in __ima_inode_hash()
and restore the ima_policy_flag check
- Remove patch about the hardcoded ordering of 'ima' and 'evm' LSMs in
security.c
- Add missing ima_inode_free_security() to free iint->ima_hash
- Add the cases for LSM_ID_IMA and LSM_ID_EVM in lsm_list_modules_test.c
- Mention the change in IMA and EVM post functions for private
inodes
v6:
- See v7
v5:
- Rename security_file_pre_free() to security_file_release() and the LSM
hook file_pre_free_security to file_release (suggested by Paul)
- Move integrity_kernel_module_request() to ima_main.c (renamed to
ima_kernel_module_request())
- Split the integrity_iint_cache structure into ima_iint_cache and
evm_iint_cache, so that IMA and EVM can use disjoint metadata and
reserve space with the LSM infrastructure
- Reserve space for the entire ima_iint_cache and evm_iint_cache
structures, not just the pointer (suggested by Paul)
- Introduce ima_inode_get_iint() and evm_inode_get_iint() to retrieve
respectively the ima_iint_cache and evm_iint_cache structure from the
security blob
- Remove the various non-NULL checks for the ima_iint_cache and
evm_iint_cache structures, since the LSM infrastructure ensures that they
always exist
- Remove the iint parameter from evm_verifyxattr() since IMA and EVM
use disjoint integrity metadata
- Introduce the evm_post_path_mknod() to set the IMA_NEW_FILE flag
- Register the inode_alloc_security LSM hook in IMA and EVM to
initialize the respective integrity metadata structures
- Remove the 'integrity' LSM completely and instead make 'ima' and 'evm'
proper standalone LSMs
- Add the inode parameter to ima_get_verity_digest(), since the inode
field is not present in ima_iint_cache
- Move iint_lockdep_annotate() to ima_main.c (renamed to
ima_iint_lockdep_annotate())
- Remove ima_get_lsm_id() and evm_get_lsm_id(), since IMA and EVM directly
register the needed LSM hooks
- Enforce 'ima' and 'evm' LSM ordering at LSM infrastructure level
v4:
- Improve short and long description of
security_inode_post_create_tmpfile(), security_inode_post_set_acl(),
security_inode_post_remove_acl() and security_file_post_open()
(suggested by Mimi)
- Improve commit message of 'ima: Move to LSM infrastructure' (suggested
by Mimi)
v3:
- Drop 'ima: Align ima_post_path_mknod() definition with LSM
infrastructure' and 'ima: Align ima_post_create_tmpfile() definition
with LSM infrastructure', define the new LSM hooks with the same
IMA parameters instead (suggested by Mimi)
- Do IS_PRIVATE() check in security_path_post_mknod() and
security_inode_post_create_tmpfile() on the new inode rather than the
parent directory (in the post method it is available)
- Don't export ima_file_check() (suggested by Stefan)
- Remove redundant check of file mode in ima_post_path_mknod() (suggested
by Mimi)
- Mention that ima_post_path_mknod() is now conditionally invoked when
CONFIG_SECURITY_PATH=y (suggested by Mimi)
- Mention when a LSM hook will be introduced in the IMA/EVM alignment
patches (suggested by Mimi)
- Simplify the commit messages when introducing a new LSM hook
- Still keep the 'extern' in the function declaration, until the
declaration is removed (suggested by Mimi)
- Improve documentation of security_file_pre_free()
- Register 'ima' and 'evm' as standalone LSMs (suggested by Paul)
- Initialize the 'ima' and 'evm' LSMs from 'integrity', to keep the
original ordering of IMA and EVM functions as when they were hardcoded
- Return the IMA and EVM LSM IDs to 'integrity' for registration of the
integrity-specific hooks
- Reserve an xattr slot from the 'evm' LSM instead of 'integrity'
- Pass the LSM ID to init_ima_appraise_lsm()
v2:
- Add description for newly introduced LSM hooks (suggested by Casey)
- Clarify in the description of security_file_pre_free() that actions can
be performed while the file is still open
v1:
- Drop 'evm: Complete description of evm_inode_setattr()', 'fs: Fix
description of vfs_tmpfile()' and 'security: Introduce LSM_ORDER_LAST',
they were sent separately (suggested by Christian Brauner)
- Replace dentry with file descriptor parameter for
security_inode_post_create_tmpfile()
- Introduce mode_stripped and pass it as mode argument to
security_path_mknod() and security_path_post_mknod()
- Use goto in do_mknodat() and __vfs_removexattr_locked() (suggested by
Mimi)
- Replace __lsm_ro_after_init with __ro_after_init
- Modify short description of security_inode_post_create_tmpfile() and
security_inode_post_set_acl() (suggested by Stefan)
- Move security_inode_post_setattr() just after security_inode_setattr()
(suggested by Mimi)
- Modify short description of security_key_post_create_or_update()
(suggested by Mimi)
- Add back exported functions ima_file_check() and
evm_inode_init_security() respectively to ima.h and evm.h (reported by
kernel robot)
- Remove extern from prototype declarations and fix style issues
- Remove unnecessary include of linux/lsm_hooks.h in ima_main.c and
ima_appraise.c
Roberto Sassu (24):
ima: Align ima_inode_post_setattr() definition with LSM infrastructure
ima: Align ima_file_mprotect() definition with LSM infrastructure
ima: Align ima_inode_setxattr() definition with LSM infrastructure
ima: Align ima_inode_removexattr() definition with LSM infrastructure
ima: Align ima_post_read_file() definition with LSM infrastructure
evm: Align evm_inode_post_setattr() definition with LSM infrastructure
evm: Align evm_inode_setxattr() definition with LSM infrastructure
evm: Align evm_inode_post_setxattr() definition with LSM
infrastructure
security: Align inode_setattr hook definition with EVM
security: Introduce inode_post_setattr hook
security: Introduce inode_post_removexattr hook
security: Introduce file_post_open hook
security: Introduce file_release hook
security: Introduce path_post_mknod hook
security: Introduce inode_post_create_tmpfile hook
security: Introduce inode_post_set_acl hook
security: Introduce inode_post_remove_acl hook
security: Introduce key_post_create_or_update hook
ima: Move to LSM infrastructure
ima: Move IMA-Appraisal to LSM infrastructure
evm: Move to LSM infrastructure
evm: Make it independent from 'integrity' LSM
ima: Make it independent from 'integrity' LSM
integrity: Remove LSM
fs/attr.c | 5 +-
fs/file_table.c | 3 +-
fs/namei.c | 12 +-
fs/nfsd/vfs.c | 3 +-
fs/open.c | 1 -
fs/posix_acl.c | 5 +-
fs/xattr.c | 9 +-
include/linux/evm.h | 111 +-------
include/linux/fs.h | 2 -
include/linux/ima.h | 142 ----------
include/linux/integrity.h | 27 --
include/linux/lsm_hook_defs.h | 20 +-
include/linux/security.h | 59 ++++
include/uapi/linux/lsm.h | 2 +
security/integrity/Makefile | 1 +
security/integrity/digsig_asymmetric.c | 23 --
security/integrity/evm/evm.h | 19 ++
security/integrity/evm/evm_crypto.c | 4 +-
security/integrity/evm/evm_main.c | 195 ++++++++++---
security/integrity/iint.c | 197 +------------
security/integrity/ima/ima.h | 120 +++++++-
security/integrity/ima/ima_api.c | 15 +-
security/integrity/ima/ima_appraise.c | 64 +++--
security/integrity/ima/ima_init.c | 2 +-
security/integrity/ima/ima_main.c | 201 +++++++++++---
security/integrity/ima/ima_policy.c | 2 +-
security/integrity/integrity.h | 80 +-----
security/keys/key.c | 10 +-
security/security.c | 261 +++++++++++-------
security/selinux/hooks.c | 3 +-
security/smack/smack_lsm.c | 4 +-
.../selftests/lsm/lsm_list_modules_test.c | 6 +
32 files changed, 783 insertions(+), 825 deletions(-)
--
2.34.1
This MIB counter is similar to the one of TCP -- CurrEstab -- available
in /proc/net/snmp. This is useful to quickly list the number of MPTCP
connections without having to iterate over all of them.
Patch 1 prepares its support by adding new helper functions:
- MPTCP_DEC_STATS(): similar to MPTCP_INC_STATS(), but this time to
decrement a counter.
- mptcp_set_state(): similar to tcp_set_state(), to change the state of
an MPTCP socket, and to inc/decrement the new counter when needed.
Patch 2 uses mptcp_set_state() instead of directly calling
inet_sk_state_store() to change the state of MPTCP sockets.
Patches 3 and 4 validate the new feature in MPTCP "join" and "diag"
selftests.
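The counter bookkeeping done by a state setter of this kind can be sketched in
userspace as below. This is a simplified model and not the code from the
patches: the enum, struct toy_sock, toy_curr_estab and toy_set_state() are
hypothetical, and only transitions into and out of the established state are
considered.

```c
/* Hypothetical subset of connection states. */
enum toy_state {
	TOY_CLOSE,
	TOY_SYN_SENT,
	TOY_ESTABLISHED,
	TOY_FIN_WAIT1,
};

struct toy_sock {
	enum toy_state state;
};

/* Stand-in for the CurrEstab gauge. */
static long toy_curr_estab;

/* Change the socket state and keep the gauge in sync. */
static void toy_set_state(struct toy_sock *sk, enum toy_state state)
{
	enum toy_state old = sk->state;

	if (state == TOY_ESTABLISHED) {
		/* Count only a real transition into ESTABLISHED. */
		if (old != TOY_ESTABLISHED)
			toy_curr_estab++;
	} else if (old == TOY_ESTABLISHED) {
		/* Leaving ESTABLISHED for any other state. */
		toy_curr_estab--;
	}
	sk->state = state;
}
```

Routing every state change through one setter is what keeps the gauge
accurate; a direct store to the state field would bypass the accounting.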
Signed-off-by: Matthieu Baerts <matttbe(a)kernel.org>
---
Geliang Tang (4):
mptcp: add CurrEstab MIB counter support
mptcp: use mptcp_set_state
selftests: mptcp: join: check CURRESTAB counters
selftests: mptcp: diag: check CURRESTAB counters
net/mptcp/mib.c | 1 +
net/mptcp/mib.h | 8 ++++
net/mptcp/pm_netlink.c | 5 +++
net/mptcp/protocol.c | 56 ++++++++++++++++---------
net/mptcp/protocol.h | 1 +
net/mptcp/subflow.c | 2 +-
tools/testing/selftests/net/mptcp/diag.sh | 17 +++++++-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 46 +++++++++++++++++---
8 files changed, 110 insertions(+), 26 deletions(-)
---
base-commit: 56794e5358542b7c652f202946e53bfd2373b5e0
change-id: 20231221-upstream-net-next-20231221-mptcp-currestab-5a2867b4020b
Best regards,
--
Matthieu Baerts <matttbe(a)kernel.org>
suite->log must be checked for NULL before passing it to
string_stream_clear(). This was done in kunit_init_test() but was missing
from kunit_init_suite().
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Fixes: 6d696c4695c5 ("kunit: add ability to run tests after boot using debugfs")
---
lib/kunit/test.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index e803d998e855..ea7f0913e55a 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -658,7 +658,9 @@ static void kunit_init_suite(struct kunit_suite *suite)
kunit_debugfs_create_suite(suite);
suite->status_comment[0] = '\0';
suite->suite_init_err = 0;
- string_stream_clear(suite->log);
+
+ if (suite->log)
+ string_stream_clear(suite->log);
}
bool kunit_enabled(void)
--
2.30.2
This makes the uevent selftests build not write to the source tree
unconditionally, as that breaks out of tree builds when the source tree
is read-only. It also avoids leaving a git repository in a dirty state
after a build.
v2: drop spurious extra SPDX-License-Identifier
Signed-off-by: Antonio Terceiro <antonio.terceiro(a)linaro.org>
---
tools/testing/selftests/uevent/Makefile | 15 +++------------
1 file changed, 3 insertions(+), 12 deletions(-)
diff --git a/tools/testing/selftests/uevent/Makefile b/tools/testing/selftests/uevent/Makefile
index f7baa9aa2932..872969f42694 100644
--- a/tools/testing/selftests/uevent/Makefile
+++ b/tools/testing/selftests/uevent/Makefile
@@ -1,17 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-include ../lib.mk
-
-.PHONY: all clean
-
-BINARIES := uevent_filtering
-CFLAGS += -Wl,-no-as-needed -Wall
+CFLAGS += -Wl,-no-as-needed -Wall $(KHDR_INCLUDES)
-uevent_filtering: uevent_filtering.c ../kselftest.h ../kselftest_harness.h
- $(CC) $(CFLAGS) $< -o $@
+TEST_GEN_PROGS = uevent_filtering
-TEST_PROGS += $(BINARIES)
-EXTRA_CLEAN := $(BINARIES)
-
-all: $(BINARIES)
+include ../lib.mk
--
2.43.0
Nested translation is a hardware feature supported by many modern IOMMUs.
It uses two stages of address translation (stage-1 and stage-2) to reach
the physical address. The stage-1 translation table is owned by userspace
(e.g. by a guest OS), while the stage-2 table is owned by the kernel. Changes
to the stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example: the stage-1 translation table is the I/O page
table. As the diagram below shows, the guest I/O page table pointer in GPA
(guest physical address) is passed to the host and used to perform the stage-1
address translation. Accordingly, modifications to present mappings in the
guest I/O page table should be followed by an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
This series is based on the first part, which was merged [1]. It
adds the cache invalidation interface for userspace to invalidate the cache after
modifying the stage-1 page table. This includes both the iommufd changes and the
VT-d driver changes.
The complete code can be found in [2]; the QEMU code can be found in [3].
Finally, this is joint work with Nicolin Chen and Lu Baolu. Thanks to
them for the help. ^_^ Looking forward to your feedback.
[1] https://lore.kernel.org/linux-iommu/20231026044216.64964-1-yi.l.liu@intel.c… - merged
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v9:
- Add a test case which sets both IOMMU_TEST_INVALIDATE_FLAG_ALL and
IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR in flags, and expect to succeed
and see an 'error'. (Kevin)
- Return -ETIMEDOUT in qi_check_fault() if the caller is interested in the
fault when a timeout happens. Otherwise, qi_submit_sync() keeps retrying
and cannot report the error back to the user. For now, only the user cache
invalidation path is interested in the timeout error, so this change only
affects the user cache invalidation path. Other paths will still hang in
qi_submit_sync() when a timeout happens. (Kevin)
v8: https://lore.kernel.org/linux-iommu/20231227161354.67701-1-yi.l.liu@intel.c…
- Pass invalidation hint to the cache invalidation helper in the cache_invalidate_user
op path (Kevin)
- Move the devTLB invalidation out of info->iommu loop (Kevin, Weijiang)
- Clear *fault per restart in qi_submit_sync() to avoid error accumulation
across submissions. (Kevin)
- Define the vtd cache invalidation uapi structure in separate patch (Kevin)
- Rename inv_error to be hw_error (Kevin)
- Rename 'reqs_uptr', 'req_type', 'req_len' and 'req_num' to be 'data_uptr',
'data_type', 'entry_len' and 'entry_num' (Kevin)
- Allow user to set IOMMU_TEST_INVALIDATE_FLAG_ALL and IOMMU_TEST_INVALIDATE_FLAG_TRIGGER_ERROR
at the same time (Kevin)
v7: https://lore.kernel.org/linux-iommu/20231221153948.119007-1-yi.l.liu@intel.…
- Remove domain->ops->cache_invalidate_user check in hwpt alloc path due
to failure in bisect (Baolu)
- Remove out_driver_error_code from struct iommu_hwpt_invalidate after
discussion in v6. Should expect per-entry error code.
- Rework the selftest cache invalidation part to report a per-entry error
- Allow user to pass in an empty array to have a try-and-fail mechanism for
user to check if a given req_type is supported by the kernel (Jason)
- Define a separate enum type for cache invalidation data (Jason)
- Fix the IOMMU_HWPT_INVALIDATE to always update the req_num field before
returning (Nicolin)
- Merge the VT-d nesting part 2/2
https://lore.kernel.org/linux-iommu/20231117131816.24359-1-yi.l.liu@intel.c…
into this series to avoid defining empty enum in the middle of the series.
The major difference is adding the VT-d related invalidation uapi structures
together with the generic data structures in patch 02 of this series.
- VT-d driver was refined to report ICE/ITE error from the bottom cache
invalidation submit helpers, hence the cache_invalidate_user op could
report such errors via the per-entry error field to user. VT-d driver
will not stop the invalidation array walking due to the ICE/ITE errors
as such errors are defined by VT-d spec, userspace should be able to
handle it and let the real user (say Virtual Machine) know about it.
But other errors, such as an invalid uapi data structure configuration or a
memory copy failure, should stop the array walking, as
continuing may cause more issues.
- Minor fixes per Jason and Kevin's review comments
v6: https://lore.kernel.org/linux-iommu/20231117130717.19875-1-yi.l.liu@intel.c…
- Not much change, just rebase on top of 6.7-rc1 as part 1/2 is merged
v5: https://lore.kernel.org/linux-iommu/20231020092426.13907-1-yi.l.liu@intel.c…
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation,
this does not change the rule that resv regions should only be added to the
kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series
as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (4):
iommu: Add cache_invalidate_user op
iommu/vt-d: Allow qi_submit_sync() to return the QI faults
iommu/vt-d: Convert stage-1 cache invalidation to return QI fault
iommu/vt-d: Add iotlb flush for nested domain
Nicolin Chen (4):
iommu: Add iommu_copy_struct_from_user_array helper
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (2):
iommufd: Add IOMMU_HWPT_INVALIDATE
iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
drivers/iommu/intel/dmar.c | 49 +++--
drivers/iommu/intel/iommu.c | 12 +-
drivers/iommu/intel/iommu.h | 8 +-
drivers/iommu/intel/irq_remapping.c | 2 +-
drivers/iommu/intel/nested.c | 107 ++++++++++
drivers/iommu/intel/pasid.c | 14 +-
drivers/iommu/intel/svm.c | 14 +-
drivers/iommu/iommufd/hw_pagetable.c | 41 ++++
drivers/iommu/iommufd/iommufd_private.h | 10 +
drivers/iommu/iommufd/iommufd_test.h | 39 ++++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 86 ++++++++
include/linux/iommu.h | 100 +++++++++
include/uapi/linux/iommufd.h | 101 ++++++++++
tools/testing/selftests/iommu/iommufd.c | 190 ++++++++++++++++++
tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++
16 files changed, 794 insertions(+), 39 deletions(-)
--
2.34.1
The patch set [1] added a general lib.sh in net selftests, and converted
several test scripts to source the lib.sh.
The shebang of unicast_extensions.sh is /bin/sh, which may point to various
shells in different distributions, but "source" is only available in some
of them. For example, "source" is a builtin in bash, but it
cannot be used in dash.
Following the other scripts that were converted together, simply change the
shebang to bash to avoid the following errors when the default /bin/sh
points to other shells.
# selftests: net: unicast_extensions.sh
# ./unicast_extensions.sh: 31: source: not found
# ###########################################################################
# Unicast address extensions tests (behavior of reserved IPv4 addresses)
# ###########################################################################
# TEST: assign and ping within 240/4 (1 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 240/4 (2 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 0/8 (1 of 2) (is allowed) [FAIL]
# TEST: assign and ping within 0/8 (2 of 2) (is allowed) [FAIL]
# TEST: assign and ping inside 255.255/16 (is allowed) [FAIL]
# TEST: assign and ping inside 255.255.255/24 (is allowed) [FAIL]
# TEST: route between 240.5.6/24 and 255.1.2/24 (is allowed) [FAIL]
# TEST: route between 0.200/16 and 245.99/16 (is allowed) [FAIL]
# TEST: assign and ping lowest address (/24) [FAIL]
# TEST: assign and ping lowest address (/26) [FAIL]
# TEST: routing using lowest address [FAIL]
# TEST: assigning 0.0.0.0 (is forbidden) [ OK ]
# TEST: assigning 255.255.255.255 (is forbidden) [ OK ]
# TEST: assign and ping inside 127/8 (is forbidden) [ OK ]
# TEST: assign and ping class D address (is forbidden) [ OK ]
# TEST: routing using class D (is forbidden) [ OK ]
# TEST: routing using 127/8 (is forbidden) [ OK ]
not ok 51 selftests: net: unicast_extensions.sh # exit=1
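As a minimal sketch of the portability issue described above (not part of this patch): POSIX specifies only the "." builtin, so a script that has to stay on /bin/sh could use it instead of "source". The temporary file and the lib_loaded variable below are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper file, created only for this demonstration.
lib=$(mktemp)
echo 'lib_loaded=yes' > "$lib"

# "." is specified by POSIX and works in dash, bash, busybox sh, etc.
. "$lib"
echo "loaded via dot: $lib_loaded"

# "source" is an extension of bash and some other shells; probe for it
# instead of assuming it exists.
if command -v source >/dev/null 2>&1; then
    echo "this shell provides 'source'"
else
    echo "this shell lacks 'source'; use '.' or a bash shebang"
fi

rm -f "$lib"
```

The patch itself takes the simpler route of switching the shebang to bash, which matches the other converted scripts.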
Link: https://lore.kernel.org/all/20231202020110.362433-1-liuhangbin@gmail.com/ [1]
Fixes: 0f4765d0b48d ("selftests/net: convert unicast_extensions.sh to run it in unique namespace")
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Signed-off-by: Yujie Liu <yujie.liu(a)intel.com>
---
tools/testing/selftests/net/unicast_extensions.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/unicast_extensions.sh b/tools/testing/selftests/net/unicast_extensions.sh
index b7a2cb9e7477..2766990c2b78 100755
--- a/tools/testing/selftests/net/unicast_extensions.sh
+++ b/tools/testing/selftests/net/unicast_extensions.sh
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
#
# By Seth Schoen (c) 2021, for the IPv4 Unicast Extensions Project
--
2.34.1
This series attempts to reduce parsing overhead of IPv6 extension headers
in GRO and GSO, by removing extension header specific code and enabling
the frag0 fast path.
The following changes were made:
- Specific unnecessary HBH conditionals were removed by adding HBH offload
to inet6_offloads
- Added a utility function to support frag0 fast path in ipv6_gro_receive
- Added self-test for IPv6 packets with extension headers in GRO
Richard
Richard Gobert (3):
net: gso: add HBH extension header offload support
net: gro: parse ipv6 ext headers without frag0 invalidation
selftests/net: fix GRO coalesce test and add ext header coalesce test
net/ipv6/exthdrs_offload.c | 11 +++++
net/ipv6/ip6_offload.c | 76 ++++++++++++++++++++----------
tools/testing/selftests/net/gro.c | 78 ++++++++++++++++++++++++++++---
3 files changed, 134 insertions(+), 31 deletions(-)
--
2.36.1
Nested translation is a hardware feature that is supported by many modern
IOMMU implementations. It uses two stages (stage-1, stage-2) of address translation
to reach the physical address. The stage-1 translation table is owned
by userspace (e.g. by a guest OS), while stage-2 is owned by the kernel. Changes
to the stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example: the stage-1 translation table is the I/O page
table. As the diagram below shows, the guest I/O page table pointer in GPA
(guest physical address) is passed to the host and used to perform the stage-1
address translation. Along with it, modifications to present mappings in the
guest I/O page table should be followed by an IOTLB invalidation.
.-------------. .---------------------------.
| vIOMMU | | Guest I/O page table |
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush --+
'-------------' |
| | V
| | I/O page table pointer in GPA
'-------------'
Guest
------| Shadow |---------------------------|--------
v v v
Host
.-------------. .------------------------.
| pIOMMU | | FS for GIOVA->GPA |
| | '------------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.----------------------------------.
| | | SS for GPA->HPA, unmanaged domain|
| | '----------------------------------'
'-------------'
Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
This series is based on the first part, which was merged [1]. It
adds the cache invalidation interface for userspace to invalidate the cache after
modifying the stage-1 page table. This includes both the iommufd changes and the
VT-d driver changes.
The complete code can be found in [2]; the QEMU code can be found in [3].
Finally, this is joint work with Nicolin Chen and Lu Baolu. Thanks to
them for the help. ^_^ Looking forward to your feedback.
[1] https://lore.kernel.org/linux-iommu/20231026044216.64964-1-yi.l.liu@intel.c… - merged
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/zhenzhong/wip/iommufd_nesting_rfcv1
Change log:
v7:
- Remove domain->ops->cache_invalidate_user check in hwpt alloc path due
to failure in bisect (Baolu)
- Remove out_driver_error_code from struct iommu_hwpt_invalidate after
discussion in v6. Should expect per-entry error code.
- Rework the selftest cache invalidation part to report a per-entry error
- Allow user to pass in an empty array to have a try-and-fail mechanism for
user to check if a given req_type is supported by the kernel (Jason)
- Define a separate enum type for cache invalidation data (Jason)
- Fix the IOMMU_HWPT_INVALIDATE to always update the req_num field before
returning (Nicolin)
- Merge the VT-d nesting part 2/2
https://lore.kernel.org/linux-iommu/20231117131816.24359-1-yi.l.liu@intel.c…
into this series to avoid defining empty enum in the middle of the series.
The major difference is adding the VT-d related invalidation uapi structures
together with the generic data structures in patch 02 of this series.
- VT-d driver was refined to report ICE/ITE error from the bottom cache
invalidation submit helpers, hence the cache_invalidate_user op could
report such errors via the per-entry error field to user. VT-d driver
will not stop the invalidation array walking due to the ICE/ITE errors
as such errors are defined by VT-d spec, userspace should be able to
handle it and let the real user (say Virtual Machine) know about it.
But other errors, such as an invalid uapi data structure configuration or a
memory copy failure, should stop the array walking, as
continuing may cause more issues.
- Minor fixes per Jason and Kevin's review comments
v6: https://lore.kernel.org/linux-iommu/20231117130717.19875-1-yi.l.liu@intel.c…
- Not much change, just rebase on top of 6.7-rc1 as part 1/2 is merged
v5: https://lore.kernel.org/linux-iommu/20231020092426.13907-1-yi.l.liu@intel.c…
- Split the iommufd nesting series into two parts of alloc_user and
invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and
do the same with the structures/alloc()/abort()/destroy(). Reworked the
selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op
arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA,
_TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size
(Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.…
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs
and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize
and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation,
this does not change the rule that resv regions should only be added to the
kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series
as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.…
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence
userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (4):
iommu: Add cache_invalidate_user op
iommu/vt-d: Allow qi_submit_sync() to return the QI faults
iommu/vt-d: Convert pasid based cache invalidation to return QI fault
iommu/vt-d: Add iotlb flush for nested domain
Nicolin Chen (4):
iommu: Add iommu_copy_struct_from_user_array helper
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (1):
iommufd: Add IOMMU_HWPT_INVALIDATE
drivers/iommu/intel/dmar.c | 36 ++--
drivers/iommu/intel/iommu.c | 12 +-
drivers/iommu/intel/iommu.h | 8 +-
drivers/iommu/intel/irq_remapping.c | 2 +-
drivers/iommu/intel/nested.c | 116 ++++++++++++
drivers/iommu/intel/pasid.c | 14 +-
drivers/iommu/intel/svm.c | 14 +-
drivers/iommu/iommufd/hw_pagetable.c | 36 ++++
drivers/iommu/iommufd/iommufd_private.h | 10 ++
drivers/iommu/iommufd/iommufd_test.h | 39 ++++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 93 ++++++++++
include/linux/iommu.h | 101 +++++++++++
include/uapi/linux/iommufd.h | 100 +++++++++++
tools/testing/selftests/iommu/iommufd.c | 170 ++++++++++++++++++
tools/testing/selftests/iommu/iommufd_utils.h | 57 ++++++
16 files changed, 773 insertions(+), 38 deletions(-)
--
2.34.1
Patch 1 is a cleanup one: mptcp_is_tcpsk() helper was modifying sock_ops
in some cases which is unexpected with that name.
Patch 2 to 4 add support for two socket options: IP_LOCAL_PORT_RANGE and
IP_BIND_ADDRESS_NO_PORT. The first one is a preparation patch, the
second one adds the support while the last one modifies an existing
selftest to validate the new features.
Signed-off-by: Matthieu Baerts <matttbe(a)kernel.org>
---
Davide Caratti (1):
mptcp: don't overwrite sock_ops in mptcp_is_tcpsk()
Maxim Galaganov (3):
mptcp: rename mptcp_setsockopt_sol_ip_set_transparent()
mptcp: sockopt: support IP_LOCAL_PORT_RANGE and IP_BIND_ADDRESS_NO_PORT
selftests/net: add MPTCP coverage for IP_LOCAL_PORT_RANGE
net/mptcp/protocol.c | 108 +++++++++-------------
net/mptcp/sockopt.c | 27 +++++-
tools/testing/selftests/net/ip_local_port_range.c | 12 +++
3 files changed, 79 insertions(+), 68 deletions(-)
---
base-commit: 62ed78f3baff396bd928ee77077580c5aa940149
change-id: 20231219-upstream-net-next-20231219-mptcp-sockopts-ephemeral-ports-645522e83161
Best regards,
--
Matthieu Baerts <matttbe(a)kernel.org>
From: Maxim Mikityanskiy <maxim(a)isovalent.com>
The goal of this series is to extend the verifier's capabilities of
tracking scalars when they are spilled to stack, especially when the
spill or fill is narrowing. It also contains a fix by Eduard for
infinite loop detection and a state pruning optimization by Eduard that
compensates for a verification complexity regression introduced by
tracking unbounded scalars. These improvements reduce the number of
false rejections that I saw while working on the Cilium codebase.
Patch 1 (Maxim): Fix for an existing test, it will matter later in the
series.
Patches 2-3 (Eduard): Fixes for false rejections in infinite loop
detection that happen in the selftests when my patches are applied.
Patches 4-5 (Maxim): Fix the inconsistency of find_equal_scalars that
was possible if 32-bit spills were made.
Patches 6-11 (Maxim): Support the case when boundary checks are first
performed after the register was spilled to the stack.
Patches 12-13 (Maxim): Support narrowing fills.
Patches 14-15 (Eduard): Optimization for state pruning in stacksafe() to
mitigate the verification complexity regression.
veristat -e file,prog,states -f '!states_diff<50' -f '!states_pct<10' -f '!states_a<10' -f '!states_b<10' -C ...
* Without patch 14:
File Program States (A) States (B) States (DIFF)
-------------------- ------------ ---------- ---------- ----------------
bpf_xdp.o tail_lb_ipv6 3877 2936 -941 (-24.27%)
pyperf180.bpf.o on_event 8422 10456 +2034 (+24.15%)
pyperf600.bpf.o on_event 22259 37319 +15060 (+67.66%)
pyperf600_iter.bpf.o on_event 400 540 +140 (+35.00%)
strobemeta.bpf.o on_event 4702 13435 +8733 (+185.73%)
* With patch 14:
File Program States (A) States (B) States (DIFF)
-------------------- ------------ ---------- ---------- --------------
bpf_xdp.o tail_lb_ipv6 3877 2937 -940 (-24.25%)
pyperf600_iter.bpf.o on_event 400 500 +100 (+25.00%)
Eduard Zingerman (4):
bpf: make infinite loop detection in is_state_visited() exact
selftests/bpf: check if imprecise stack spills confuse infinite loop
detection
bpf: Optimize state pruning for spilled scalars
selftests/bpf: states pruning checks for scalar vs STACK_{MISC,ZERO}
Maxim Mikityanskiy (11):
selftests/bpf: Fix the u64_offset_to_skb_data test
bpf: Make bpf_for_each_spilled_reg consider narrow spills
selftests/bpf: Add a test case for 32-bit spill tracking
bpf: Add the assign_scalar_id_before_mov function
bpf: Add the get_reg_width function
bpf: Assign ID to scalars on spill
selftests/bpf: Test assigning ID to scalars on spill
bpf: Track spilled unbounded scalars
selftests/bpf: Test tracking spilled unbounded scalars
bpf: Preserve boundaries and track scalars on narrowing fill
selftests/bpf: Add test cases for narrowing fill
include/linux/bpf_verifier.h | 2 +-
kernel/bpf/verifier.c | 160 +++++-
.../bpf/progs/verifier_direct_packet_access.c | 2 +-
.../selftests/bpf/progs/verifier_loops1.c | 24 +
.../selftests/bpf/progs/verifier_spill_fill.c | 529 +++++++++++++++++-
.../testing/selftests/bpf/verifier/precise.c | 6 +-
6 files changed, 677 insertions(+), 46 deletions(-)
--
2.42.1
The KUnit device helpers are documented with kerneldoc in their header
file, but also have short comments over their implementation. These were
mistakenly formatted as kerneldoc comments, even though they're not
valid kerneldoc. It shouldn't cause any serious problems -- this file
isn't included in the docs -- but it could be confusing, and causes
warnings.
Remove the extra '*' so that these aren't treated as kerneldoc.
Fixes: d03c720e03bd ("kunit: Add APIs for managing devices")
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312181920.H4EPAH20-lkp@intel.com/
Signed-off-by: David Gow <davidgow(a)google.com>
---
lib/kunit/device.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/kunit/device.c b/lib/kunit/device.c
index 1db4305b615a..f5371287b375 100644
--- a/lib/kunit/device.c
+++ b/lib/kunit/device.c
@@ -60,7 +60,7 @@ static void kunit_device_release(struct device *d)
kfree(to_kunit_device(d));
}
-/**
+/*
* Create and register a KUnit-managed struct device_driver on the kunit_bus.
* Returns an error pointer on failure.
*/
@@ -124,7 +124,7 @@ static struct kunit_device *kunit_device_register_internal(struct kunit *test,
return kunit_dev;
}
-/**
+/*
* Create and register a new KUnit-managed device, using the user-supplied device_driver.
* On failure, returns an error pointer.
*/
@@ -141,7 +141,7 @@ struct device *kunit_device_register_with_driver(struct kunit *test,
}
EXPORT_SYMBOL_GPL(kunit_device_register_with_driver);
-/**
+/*
* Create and register a new KUnit-managed device, including a matching device_driver.
* On failure, returns an error pointer.
*/
--
2.43.0.472.g3155946c3a-goog
Here is the last part of converting net selftests to run in unique namespaces.
This part converts all remaining tests. After the conversion, we can run the net
selftests in parallel, e.g.
# ./run_kselftest.sh -n -t net:reuseport_bpf
TAP version 13
1..1
# selftests: net: reuseport_bpf
ok 1 selftests: net: reuseport_bpf
mod 10...
# Socket 0: 0
# Socket 1: 1
...
# Socket 4: 19
# Testing filter add without bind...
# SUCCESS
# ./run_kselftest.sh -p -n -t net:cmsg_so_mark.sh -t net:cmsg_time.sh -t net:cmsg_ipv6.sh
TAP version 13
1..3
# selftests: net: cmsg_so_mark.sh
ok 1 selftests: net: cmsg_so_mark.sh
# selftests: net: cmsg_time.sh
ok 2 selftests: net: cmsg_time.sh
# selftests: net: cmsg_ipv6.sh
ok 3 selftests: net: cmsg_ipv6.sh
# ./run_kselftest.sh -p -n -c net
TAP version 13
1..95
# selftests: net: reuseport_bpf_numa
ok 3 selftests: net: reuseport_bpf_numa
# selftests: net: reuseport_bpf_cpu
ok 2 selftests: net: reuseport_bpf_cpu
# selftests: net: sk_bind_sendto_listen
ok 9 selftests: net: sk_bind_sendto_listen
# selftests: net: reuseaddr_conflict
ok 5 selftests: net: reuseaddr_conflict
...
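To illustrate why unique namespace names make the parallel runs above safe, here is a rough sketch (not the code from this series) of deriving a per-run name with "mktemp -u". The helper name unique_ns_name and the base name "client" are hypothetical; only names are generated here, so no root privileges are needed.

```shell
#!/bin/sh
# Hypothetical helper: derive a per-run namespace name from a base name.
# "mktemp -u" only prints a unique name; it does not create a file.
unique_ns_name() {
    mktemp -u "$1-XXXXXX"
}

# Two tests that both want a namespace called "client" get different names,
# so they no longer collide when run at the same time.
ns_a=$(unique_ns_name client)
ns_b=$(unique_ns_name client)
echo "test A would use: $ns_a"
echo "test B would use: $ns_b"

# With root privileges each test could then run, for example:
#   ip netns add "$ns_a"
#   ip netns del "$ns_a"
```

Because the suffix is random, two concurrently running scripts are very unlikely to pick the same name, which is the property the conversion relies on.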
Here is the part 1 link:
https://lore.kernel.org/netdev/20231202020110.362433-1-liuhangbin@gmail.com
part 2 link:
https://lore.kernel.org/netdev/20231206070801.1691247-1-liuhangbin@gmail.com
part 3 link:
https://lore.kernel.org/netdev/20231213060856.4030084-1-liuhangbin@gmail.com
Hangbin Liu (8):
selftests/net: convert gre_gso.sh to run it in unique namespace
selftests/net: convert netns-name.sh to run it in unique namespace
selftests/net: convert rtnetlink.sh to run it in unique namespace
selftests/net: convert stress_reuseport_listen.sh to run it in unique
namespace
selftests/net: convert xfrm_policy.sh to run it in unique namespace
selftests/net: use unique netns name for setup_loopback.sh
setup_veth.sh
selftests/net: convert pmtu.sh to run it in unique namespace
kselftest/runner.sh: add netns support
tools/testing/selftests/kselftest/runner.sh | 38 ++++-
tools/testing/selftests/net/gre_gso.sh | 18 +--
tools/testing/selftests/net/gro.sh | 4 +-
tools/testing/selftests/net/netns-name.sh | 44 +++---
tools/testing/selftests/net/pmtu.sh | 27 ++--
tools/testing/selftests/net/rtnetlink.sh | 34 +++--
tools/testing/selftests/net/setup_loopback.sh | 8 +-
tools/testing/selftests/net/setup_veth.sh | 9 +-
.../selftests/net/stress_reuseport_listen.sh | 6 +-
tools/testing/selftests/net/toeplitz.sh | 14 +-
tools/testing/selftests/net/xfrm_policy.sh | 138 +++++++++---------
tools/testing/selftests/run_kselftest.sh | 10 +-
12 files changed, 193 insertions(+), 157 deletions(-)
--
2.43.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
As there were bugs found with the ownership of eventfs dynamic file
creation, add a test to cover it.
It will remount tracefs with a different gid and check the ownership of
the eventfs directory, as well as the system and event directories. It
will also check the event file directories.
It then does a chgrp on each of these as well to see if they all get
updated as expected.
Then it remounts the tracefs file system back to the original group and
makes sure that all the updated files and directories were reset back to
the original ownership.
It does the same for instances that change the ownership of the instance
directory.
Note, because the uid is not reset by a remount, it is tested for every
file by switching it to a new owner and then back again.
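The remount step described above depends on rewriting the gid= mount option. A minimal sketch of that string handling, assuming a hypothetical helper named with_gid and made-up option strings, could look like this (it only manipulates strings and does not mount anything):

```shell
#!/bin/sh
# Hypothetical helper: print mount options $1 with the group id set to $2.
# An existing gid=N is replaced; if none is present, gid=$2 is appended.
# (Plain sh has no local variables, so opts/gid/new are globals here.)
with_gid() {
    opts=$1
    gid=$2
    new=$(printf '%s\n' "$opts" | sed -e "s/gid=[0-9]*/gid=$gid/")
    if [ "$new" = "$opts" ]; then
        # sed changed nothing, so there was no different gid= option to replace
        new="$opts,gid=$gid"
    fi
    printf '%s\n' "$new"
}

with_gid "rw,nosuid,nodev,noexec,relatime" 1000
# prints: rw,nosuid,nodev,noexec,relatime,gid=1000
with_gid "rw,relatime,gid=0" 1000
# prints: rw,relatime,gid=1000
```

The resulting string is what would be passed to "mount -o remount,..." before the ownership checks run.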
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Tested-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v3: https://lore.kernel.org/linux-trace-kernel/20231221211229.13398ef3@gandalf.…
- Added missing SPDX and removed exec permission from file (Shuah Khan)
.../ftrace/test.d/00basic/test_ownership.tc | 114 ++++++++++++++++++
1 file changed, 114 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
new file mode 100644
index 000000000000..add7d5bf585d
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
@@ -0,0 +1,114 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test file and directory owership changes for eventfs
+
+original_group=`stat -c "%g" .`
+original_owner=`stat -c "%u" .`
+
+mount_point=`stat -c '%m' .`
+mount_options=`mount | grep "$mount_point" | sed -e 's/.*(\(.*\)).*/\1/'`
+
+# find another owner and group that is not the original
+other_group=`tac /etc/group | grep -v ":$original_group:" | head -1 | cut -d: -f3`
+other_owner=`tac /etc/passwd | grep -v ":$original_owner:" | head -1 | cut -d: -f3`
+
+# Remove any group ownership already
+new_options=`echo "$mount_options" | sed -e "s/gid=[0-9]*/gid=$other_group/"`
+
+if [ "$new_options" = "$mount_options" ]; then
+ new_options="$mount_options,gid=$other_group"
+ mount_options="$mount_options,gid=$original_group"
+fi
+
+canary="events/timer events/timer/timer_cancel events/timer/timer_cancel/format"
+
+test() {
+ file=$1
+ test_group=$2
+
+ owner=`stat -c "%u" $file`
+ group=`stat -c "%g" $file`
+
+ echo "testing $file $owner=$original_owner and $group=$test_group"
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+ if [ $group -ne $test_group ]; then
+ exit_fail
+ fi
+
+ # Note, the remount does not update ownership so test going to and from owner
+ echo "test owner $file to $other_owner"
+ chown $other_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $other_owner ]; then
+ exit_fail
+ fi
+
+ chown $original_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+
+}
+
+run_tests() {
+ for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events
+ test "events" $original_group
+ for d in "." "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched
+ test "events/sched" $original_group
+ for d in "." "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch
+ test "events/sched/sched_switch" $original_group
+ for d in "." "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch/enable
+ test "events/sched/sched_switch/enable" $original_group
+ for d in "." $canary; do
+ test "$d" $other_group
+ done
+}
+
+mount -o remount,"$new_options" .
+
+run_tests
+
+mount -o remount,"$mount_options" .
+
+for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $original_group
+done
+
+# check instances as well
+
+chgrp $other_group instances
+
+instance="$(mktemp -u test-XXXXXX)"
+
+mkdir instances/$instance
+
+cd instances/$instance
+
+run_tests
+
+cd ../..
+
+rmdir instances/$instance
+
+chgrp $original_group instances
+
+exit 0
--
2.42.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
As bugs were found with the ownership of eventfs dynamic file
creation, add a test to cover it.
It will remount tracefs with a different gid and check the ownership of
the eventfs directory, as well as the system and event directories. It
will also check the event file directories.
It then does a chgrp on each of these as well to see if they all get
updated as expected.
Then it remounts the tracefs file system back to the original group and
makes sure that all the updated files and directories were reset back to
the original ownership.
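The remount steps above depend on rewriting the gid= entry of the mount
options. A minimal sketch of that rewrite follows; the helper name
set_gid_option and the option strings are made up for illustration and are
not part of the patch:

```shell
#!/bin/sh
# Hypothetical helper (not from the patch): print the mount options in $1
# with the gid= option forced to the numeric group id in $2.
set_gid_option() {
    opts=$1
    gid=$2
    # Replace an existing gid=<number> option, if there is one.
    new=$(printf '%s\n' "$opts" | sed -e "s/gid=[0-9]*/gid=$gid/")
    # If the original options carried no gid= entry, append one instead.
    case ",$opts," in
        *,gid=*) ;;
        *) new="$opts,gid=$gid" ;;
    esac
    printf '%s\n' "$new"
}

# Illustrative option strings only.
set_gid_option "rw,nosuid,nodev,noexec,relatime" 1000
set_gid_option "rw,relatime,gid=0" 1000
```

The test script itself detects the "no gid= present" case by checking whether
sed changed anything; the case statement here is just one alternative way to
make the same decision.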
It does the same for instances that change the ownership of the instance
directory.
Note, because the uid is not reset by a remount, it is tested for every
file by switching it to a new owner and then back again.
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v2: https://lore.kernel.org/linux-trace-kernel/20231221194516.53e1ee43@gandalf.…
- Changed the instance test name from "foo-$(mktemp -u XXXXX)" to
"$(mktemp -u test-XXXXXX)" as Masami reported that busybox mktemp only
works with 6 Xs and not 5. Also changed "foo" to "test" and placed it
into the mktemp format.
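For reference, a small sketch of the name generation described in this
changelog entry. The helper name make_instance_name is hypothetical; the
template itself is the one quoted above:

```shell
#!/bin/sh
# Hypothetical wrapper: print a unique instance name without creating it.
# -u asks mktemp to only generate the name; six trailing Xs are used because,
# as reported above, busybox mktemp does not accept five.
make_instance_name() {
    mktemp -u test-XXXXXX
}

instance=$(make_instance_name)
echo "instance name: $instance"
```

Because -u only prints a name, the directory still has to be created
afterwards (the test does this with mkdir under instances/).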
.../ftrace/test.d/00basic/test_ownership.tc | 113 ++++++++++++++++++
1 file changed, 113 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
new file mode 100755
index 000000000000..4c20be3a714a
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
@@ -0,0 +1,113 @@
+#!/bin/sh
+# description: Test file and directory ownership changes for eventfs
+
+original_group=`stat -c "%g" .`
+original_owner=`stat -c "%u" .`
+
+mount_point=`stat -c '%m' .`
+mount_options=`mount | grep "$mount_point" | sed -e 's/.*(\(.*\)).*/\1/'`
+
+# find another owner and group that is not the original
+other_group=`tac /etc/group | grep -v ":$original_group:" | head -1 | cut -d: -f3`
+other_owner=`tac /etc/passwd | grep -v ":$original_owner:" | head -1 | cut -d: -f3`
+
+# Remove any group ownership already
+new_options=`echo "$mount_options" | sed -e "s/gid=[0-9]*/gid=$other_group/"`
+
+if [ "$new_options" = "$mount_options" ]; then
+ new_options="$mount_options,gid=$other_group"
+ mount_options="$mount_options,gid=$original_group"
+fi
+
+canary="events/timer events/timer/timer_cancel events/timer/timer_cancel/format"
+
+test() {
+ file=$1
+ test_group=$2
+
+ owner=`stat -c "%u" $file`
+ group=`stat -c "%g" $file`
+
+ echo "testing $file $owner=$original_owner and $group=$test_group"
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+ if [ $group -ne $test_group ]; then
+ exit_fail
+ fi
+
+ # Note, the remount does not update ownership so test going to and from owner
+ echo "test owner $file to $other_owner"
+ chown $other_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $other_owner ]; then
+ exit_fail
+ fi
+
+ chown $original_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+
+}
+
+run_tests() {
+ for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events
+ test "events" $original_group
+ for d in "." "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched
+ test "events/sched" $original_group
+ for d in "." "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch
+ test "events/sched/sched_switch" $original_group
+ for d in "." "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch/enable
+ test "events/sched/sched_switch/enable" $original_group
+ for d in "." $canary; do
+ test "$d" $other_group
+ done
+}
+
+mount -o remount,"$new_options" .
+
+run_tests
+
+mount -o remount,"$mount_options" .
+
+for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $original_group
+done
+
+# check instances as well
+
+chgrp $other_group instances
+
+instance="$(mktemp -u test-XXXXXX)"
+
+mkdir instances/$instance
+
+cd instances/$instance
+
+run_tests
+
+cd ../..
+
+rmdir instances/$instance
+
+chgrp $original_group instances
+
+exit 0
--
2.42.0
This series implements support for SME use in non-protected KVM guests.
Much of this is very similar to SVE. The main additional challenge that
SME presents is that it introduces two new controls which change the
registers seen by guests:
- PSTATE.ZA enables the ZA matrix register and, if SME2 is supported,
the ZT0 LUT register.
- PSTATE.SM enables streaming mode, a new floating point mode which
uses the SVE register set with a separately configured vector length.
In streaming mode, implementation of the FFR register is optional.
It is also permitted to build systems which support SME without SVE; in
this case, when not in streaming mode, no SVE registers or instructions
are available. Further, there is no requirement that there be any
overlap in the set of vector lengths supported by SVE and SME in a
system; this is expected to be a common situation in practical systems.
Since there is a new vector length to configure we introduce a new
feature parallel to the existing SVE one with a new pseudo register for
the streaming mode vector length. Due to the overlap with SVE caused by
streaming mode, rather than finalising SME as a separate feature we use
the existing SVE finalisation to also finalise SME; a new define
KVM_ARM_VCPU_VEC is provided to help make user code clearer. Finalising
SVE and SME separately would introduce complication with register access
since finalising SVE makes the SVE registers writeable by userspace and
doing multiple finalisations results in an error being reported.
Dealing with a state where the SVE registers are writeable due to one of
SVE or SME being finalised but may have their VL changed by the other
being finalised seems like needless complexity with minimal practical
utility; it seems clearer to just express directly in the ABI that only
one finalisation can be done.
We represent the streaming mode registers to userspace by always using
the existing SVE registers to access the floating point state, using the
larger of the SME and (if enabled for the guest) SVE vector lengths.
There are a large number of subfeatures for SME, most of which only
offer additional instructions but some of which (SME2 and FA64) add
architectural state. The expectation is that these will be configured
via the ID registers but since the mechanism for doing this is still
unclear the current code enables SME2 and FA64 for the guest if the host
supports them regardless of what the ID registers say.
Since we do not yet have support for SVE in protected guests and SME is
very reliant on SVE this series does not implement support for SME in
protected guests. This will be added separately once SVE support is
merged into mainline (or along with merging that); there is code for
protected guests using SVE in the Android tree.
The new KVM_ARM_VCPU_VEC feature and ZA and ZT0 registers have not been
added to the get-reg-list selftest; the idea of supporting additional
features there without restructuring the program to generate all
possible feature combinations has been rejected. I will post a separate
series which does that restructuring.
I am seeing some test failures currently which I've not got to the
bottom of; at this point I'm reasonably sure these are preexisting
issues in the kernel which are more apparent in a guest.
To: Marc Zyngier <maz(a)kernel.org>
To: Oliver Upton <oliver.upton(a)linux.dev>
To: James Morse <james.morse(a)arm.com>
To: Suzuki K Poulose <suzuki.poulose(a)arm.com>
To: Catalin Marinas <catalin.marinas(a)arm.com>
To: Will Deacon <will(a)kernel.org>
Cc: <linux-arm-kernel(a)lists.infradead.org>
Cc: <kvmarm(a)lists.linux.dev>
Cc: <linux-kernel(a)vger.kernel.org>
To: Paolo Bonzini <pbonzini(a)redhat.com>
To: Jonathan Corbet <corbet(a)lwn.net>
Cc: <kvm(a)vger.kernel.org>
Cc: <linux-doc(a)vger.kernel.org>
To: Shuah Khan <shuah(a)kernel.org>
Cc: <linux-kselftest(a)vger.kernel.org>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Changes in v2:
- Rebase onto v6.7-rc3.
- Configure subfeatures based on host system only.
- Complete nVHE support.
- There was some snafu with sending v1 out, it didn't make it to the
lists but in case it hit people's inboxes I'm sending as v2.
---
Mark Brown (22):
KVM: arm64: Document why we trap SVE access from the host
arm64/fpsimd: Make SVE<->FPSIMD rewriting available to KVM
KVM: arm64: Move SVE state access macros after feature test macros
KVM: arm64: Store vector lengths in an array
KVM: arm64: Document the KVM ABI for SME
KVM: arm64: Make FFR restore optional in __sve_restore_state()
KVM: arm64: Define guest flags for SME
KVM: arm64: Rename SVE finalization constants to be more general
KVM: arm64: Basic SME system register descriptions
KVM: arm64: Add support for TPIDR2_EL0
KVM: arm64: Make SMPRI_EL1 RES0 for SME guests
KVM: arm64: Make SVCR a normal system register
KVM: arm64: Context switch SME state for guest
KVM: arm64: Manage and handle SME traps
KVM: arm64: Implement SME vector length configuration
KVM: arm64: Rename sve_state_reg_region
KVM: arm64: Support userspace access to streaming mode SVE registers
KVM: arm64: Expose ZA to userspace
KVM: arm64: Provide userspace access to ZT0
KVM: arm64: Support SME version configuration via ID registers
KVM: arm64: Provide userspace ABI for enabling SME
KVM: arm64: selftests: Add SME system registers to get-reg-list
Documentation/virt/kvm/api.rst | 104 +++++---
arch/arm64/include/asm/fpsimd.h | 5 +
arch/arm64/include/asm/kvm_emulate.h | 13 +-
arch/arm64/include/asm/kvm_host.h | 99 +++++---
arch/arm64/include/asm/kvm_hyp.h | 3 +-
arch/arm64/include/uapi/asm/kvm.h | 33 +++
arch/arm64/kernel/fpsimd.c | 51 +++-
arch/arm64/kvm/arm.c | 16 +-
arch/arm64/kvm/fpsimd.c | 266 ++++++++++++++++++---
arch/arm64/kvm/guest.c | 230 +++++++++++++++---
arch/arm64/kvm/handle_exit.c | 11 +
arch/arm64/kvm/hyp/fpsimd.S | 11 +-
arch/arm64/kvm/hyp/include/hyp/switch.h | 86 ++++++-
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 16 ++
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 60 ++++-
arch/arm64/kvm/hyp/nvhe/switch.c | 13 +-
arch/arm64/kvm/hyp/vhe/switch.c | 3 +
arch/arm64/kvm/reset.c | 150 +++++++++---
arch/arm64/kvm/sys_regs.c | 67 +++++-
include/uapi/linux/kvm.h | 1 +
tools/testing/selftests/kvm/aarch64/get-reg-list.c | 32 ++-
21 files changed, 1063 insertions(+), 207 deletions(-)
---
base-commit: 4ae6e89253b387476c2ba0202c3a80f2e1284e91
change-id: 20230301-kvm-arm64-sme-06a1246d3636
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Swap the arguments to typecheck_fn() in kunit_activate_static_stub()
so that real_fn_addr can be either the function itself or a pointer
to that function.
This is useful to simplify redirecting static functions in a module.
Having to pass the actual function meant that it must be exported
from the module. That means either making the 'static' and
EXPORT_SYMBOL*() conditional (which makes the code messy), or always
exporting it (which increases the export namespace and prevents the
compiler inlining a trivial stub function in non-test builds).
With the original definition of kunit_activate_static_stub() the type
of the address of real_fn_addr was passed to typecheck_fn() as the type
to check against. This meant that if real_fn_addr was a pointer-to-function
it would resolve to a ** instead of a *, giving an error like this:
error: initialization of ‘int (**)(int)’ from incompatible pointer
type ‘int (*)(int)’ [-Werror=incompatible-pointer-types]
kunit_activate_static_stub(test, add_one_fn_ptr, subtract_one);
| ^~~~~~~~~~~~
./include/linux/typecheck.h:21:25: note: in definition of macro
‘typecheck_fn’
21 | ({ typeof(type) __tmp = function; \
Swapping the arguments to typecheck_fn makes it take the type of a
pointer to the replacement function. Either a function or a pointer
to function can be assigned to that. For example:
static int some_function(int x)
{
/* whatever */
}
int (* some_function_ptr)(int) = some_function;
static int replacement(int x)
{
/* whatever */
}
Then:
kunit_activate_static_stub(test, some_function, replacement);
yields:
typecheck_fn(typeof(&replacement), some_function);
and:
kunit_activate_static_stub(test, some_function_ptr, replacement);
yields:
typecheck_fn(typeof(&replacement), some_function_ptr);
The two typecheck_fn() then resolve to:
int (*__tmp)(int) = some_function;
and
int (*__tmp)(int) = some_function_ptr;
Both of these are valid. In the first case the compiler inserts
an implicit '&' to take the address of the supplied function, and
in the second case the RHS is already a pointer to the same type.
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
Reviewed-by: Rae Moar <rmoar(a)google.com>
---
No changes since V1.
---
include/kunit/static_stub.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/kunit/static_stub.h b/include/kunit/static_stub.h
index 85315c80b303..bf940322dfc0 100644
--- a/include/kunit/static_stub.h
+++ b/include/kunit/static_stub.h
@@ -93,7 +93,7 @@ void __kunit_activate_static_stub(struct kunit *test,
* The redirection can be disabled again with kunit_deactivate_static_stub().
*/
#define kunit_activate_static_stub(test, real_fn_addr, replacement_addr) do { \
- typecheck_fn(typeof(&real_fn_addr), replacement_addr); \
+ typecheck_fn(typeof(&replacement_addr), real_fn_addr); \
__kunit_activate_static_stub(test, real_fn_addr, replacement_addr); \
} while (0)
--
2.30.2
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
As bugs were found with the ownership of eventfs dynamic file
creation, add a test to cover it.
It will remount tracefs with a different gid and check the ownership of
the eventfs directory, as well as the system and event directories. It
will also check the event file directories.
It then does a chgrp on each of these as well to see if they all get
updated as expected.
Then it remounts the tracefs file system back to the original group and
makes sure that all the updated files and directories were reset back to
the original ownership.
It does the same for instances that change the ownership of the instance
directory.
Note, because the uid is not reset by a remount, it is tested for every
file by switching it to a new owner and then back again.
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v1: https://lore.kernel.org/linux-trace-kernel/20231221193551.13a0b7bd@gandalf.…
- Fixed a cut and paste error of using $original_group for finding another uid
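The lookup this fix touches (finding a uid or gid other than the original)
can also be expressed by comparing the id field directly. The following is
only a sketch under that assumption: pick_other_id is a hypothetical helper,
not what the test uses, and the passwd-style entries are made up:

```shell
#!/bin/sh
# Hypothetical helper: print the numeric id (field 3) of the last entry in a
# passwd- or group-format file ($1) whose id differs from $2.
pick_other_id() {
    awk -F: -v avoid="$2" '
        $3 != avoid { id = $3; found = 1 }
        END { if (found) print id }
    ' "$1"
}

# Demonstration with a throwaway passwd-style file (made-up entries).
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/false
nobody:x:65534:65534:nobody:/nonexistent:/bin/false
EOF
pick_other_id "$sample" 0
pick_other_id "$sample" 65534
rm -f "$sample"
```

Matching on the third field avoids one property of the grep -v ":$id:"
pattern, which also drops entries where the same number happens to appear
in another colon-delimited field (for example as the primary gid in
/etc/passwd).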
.../ftrace/test.d/00basic/test_ownership.tc | 113 ++++++++++++++++++
1 file changed, 113 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
new file mode 100755
index 000000000000..83cbd116d06b
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
@@ -0,0 +1,113 @@
+#!/bin/sh
+# description: Test file and directory ownership changes for eventfs
+
+original_group=`stat -c "%g" .`
+original_owner=`stat -c "%u" .`
+
+mount_point=`stat -c '%m' .`
+mount_options=`mount | grep "$mount_point" | sed -e 's/.*(\(.*\)).*/\1/'`
+
+# find another owner and group that is not the original
+other_group=`tac /etc/group | grep -v ":$original_group:" | head -1 | cut -d: -f3`
+other_owner=`tac /etc/passwd | grep -v ":$original_owner:" | head -1 | cut -d: -f3`
+
+# Remove any group ownership already
+new_options=`echo "$mount_options" | sed -e "s/gid=[0-9]*/gid=$other_group/"`
+
+if [ "$new_options" = "$mount_options" ]; then
+ new_options="$mount_options,gid=$other_group"
+ mount_options="$mount_options,gid=$original_group"
+fi
+
+canary="events/timer events/timer/timer_cancel events/timer/timer_cancel/format"
+
+test() {
+ file=$1
+ test_group=$2
+
+ owner=`stat -c "%u" $file`
+ group=`stat -c "%g" $file`
+
+ echo "testing $file $owner=$original_owner and $group=$test_group"
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+ if [ $group -ne $test_group ]; then
+ exit_fail
+ fi
+
+ # Note, the remount does not update ownership so test going to and from owner
+ echo "test owner $file to $other_owner"
+ chown $other_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $other_owner ]; then
+ exit_fail
+ fi
+
+ chown $original_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+
+}
+
+run_tests() {
+ for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events
+ test "events" $original_group
+ for d in "." "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched
+ test "events/sched" $original_group
+ for d in "." "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch
+ test "events/sched/sched_switch" $original_group
+ for d in "." "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch/enable
+ test "events/sched/sched_switch/enable" $original_group
+ for d in "." $canary; do
+ test "$d" $other_group
+ done
+}
+
+mount -o remount,"$new_options" .
+
+run_tests
+
+mount -o remount,"$mount_options" .
+
+for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $original_group
+done
+
+# check instances as well
+
+chgrp $other_group instances
+
+instance="foo-$(mktemp -u XXXXX)"
+
+mkdir instances/$instance
+
+cd instances/$instance
+
+run_tests
+
+cd ../..
+
+rmdir instances/$instance
+
+chgrp $original_group instances
+
+exit 0
--
2.42.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
As bugs were found with the ownership of eventfs dynamic file
creation, add a test to cover it.
It will remount tracefs with a different gid and check the ownership of
the eventfs directory, as well as the system and event directories. It
will also check the event file directories.
It then does a chgrp on each of these as well to see if they all get
updated as expected.
Then it remounts the tracefs file system back to the original group and
makes sure that all the updated files and directories were reset back to
the original ownership.
It does the same for instances that change the ownership of the instance
directory.
Note, because the uid is not reset by a remount, it is tested for every
file by switching it to a new owner and then back again.
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
.../ftrace/test.d/00basic/test_ownership.tc | 113 ++++++++++++++++++
1 file changed, 113 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
new file mode 100755
index 000000000000..de8cdf6f207b
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/test_ownership.tc
@@ -0,0 +1,113 @@
+#!/bin/sh
+# description: Test file and directory ownership changes for eventfs
+
+original_group=`stat -c "%g" .`
+original_owner=`stat -c "%u" .`
+
+mount_point=`stat -c '%m' .`
+mount_options=`mount | grep "$mount_point" | sed -e 's/.*(\(.*\)).*/\1/'`
+
+# find another owner and group that is not the original
+other_group=`tac /etc/group | grep -v ":$original_group:" | head -1 | cut -d: -f3`
+other_owner=`tac /etc/passwd | grep -v ":$original_group:" | head -1 | cut -d: -f3`
+
+# Remove any group ownership already
+new_options=`echo "$mount_options" | sed -e "s/gid=[0-9]*/gid=$other_group/"`
+
+if [ "$new_options" = "$mount_options" ]; then
+ new_options="$mount_options,gid=$other_group"
+ mount_options="$mount_options,gid=$original_group"
+fi
+
+canary="events/timer events/timer/timer_cancel events/timer/timer_cancel/format"
+
+test() {
+ file=$1
+ test_group=$2
+
+ owner=`stat -c "%u" $file`
+ group=`stat -c "%g" $file`
+
+ echo "testing $file $owner=$original_owner and $group=$test_group"
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+ if [ $group -ne $test_group ]; then
+ exit_fail
+ fi
+
+ # Note, the remount does not update ownership so test going to and from owner
+ echo "test owner $file to $other_owner"
+ chown $other_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $other_owner ]; then
+ exit_fail
+ fi
+
+ chown $original_owner $file
+ owner=`stat -c "%u" $file`
+ if [ $owner -ne $original_owner ]; then
+ exit_fail
+ fi
+
+}
+
+run_tests() {
+ for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events
+ test "events" $original_group
+ for d in "." "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched
+ test "events/sched" $original_group
+ for d in "." "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch
+ test "events/sched/sched_switch" $original_group
+ for d in "." "events/sched/sched_switch/enable" $canary; do
+ test "$d" $other_group
+ done
+
+ chgrp $original_group events/sched/sched_switch/enable
+ test "events/sched/sched_switch/enable" $original_group
+ for d in "." $canary; do
+ test "$d" $other_group
+ done
+}
+
+mount -o remount,"$new_options" .
+
+run_tests
+
+mount -o remount,"$mount_options" .
+
+for d in "." "events" "events/sched" "events/sched/sched_switch" "events/sched/sched_switch/enable" $canary; do
+ test "$d" $original_group
+done
+
+# check instances as well
+
+chgrp $other_group instances
+
+instance="foo-$(mktemp -u XXXXX)"
+
+mkdir instances/$instance
+
+cd instances/$instance
+
+run_tests
+
+cd ../..
+
+rmdir instances/$instance
+
+chgrp $original_group instances
+
+exit 0
--
2.42.0
This makes the uevent selftests build not write to the source tree
unconditionally, as that breaks out of tree builds when the source tree
is read-only. It also avoids leaving a git repository in a dirty state
after a build.
Signed-off-by: Antonio Terceiro <antonio.terceiro(a)linaro.org>
---
tools/testing/selftests/uevent/Makefile | 16 ++++------------
1 file changed, 4 insertions(+), 12 deletions(-)
diff --git a/tools/testing/selftests/uevent/Makefile b/tools/testing/selftests/uevent/Makefile
index f7baa9aa2932..9d1ba09baa90 100644
--- a/tools/testing/selftests/uevent/Makefile
+++ b/tools/testing/selftests/uevent/Makefile
@@ -1,17 +1,9 @@
# SPDX-License-Identifier: GPL-2.0
all:
-include ../lib.mk
-
-.PHONY: all clean
-
-BINARIES := uevent_filtering
-CFLAGS += -Wl,-no-as-needed -Wall
-
-uevent_filtering: uevent_filtering.c ../kselftest.h ../kselftest_harness.h
- $(CC) $(CFLAGS) $< -o $@
+# SPDX-License-Identifier: GPL-2.0
+CFLAGS += -Wl,-no-as-needed -Wall $(KHDR_INCLUDES)
-TEST_PROGS += $(BINARIES)
-EXTRA_CLEAN := $(BINARIES)
+TEST_GEN_PROGS = uevent_filtering
-all: $(BINARIES)
+include ../lib.mk
--
2.43.0
Currently the seccomp benchmark selftest produces non-standard output,
meaning that while it makes a number of checks of the performance it
observes this has to be parsed by humans. This means that automated
systems running this suite of tests are almost certainly ignoring the
results which isn't ideal for spotting problems. Let's rework things so
that each check that the program does is reported as a test result to
the framework.
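To illustrate what "each check reported as a test result" looks like on the
wire, here is a minimal KTAP-style sketch. The benchmark itself is a C
program using the kselftest helpers; the shell functions, test descriptions
and numbers below are all hypothetical:

```shell
#!/bin/sh
# Minimal KTAP-style reporter (hypothetical helper names, not kselftest API).
test_num=0

# Print the version line and the plan for $1 tests.
ktap_plan() {
    echo "KTAP version 1"
    echo "1..$1"
}

# ktap_result <description> <command...>: run the command and report
# "ok" or "not ok" with a running test number.
ktap_result() {
    desc=$1
    shift
    test_num=$((test_num + 1))
    if "$@"; then
        echo "ok $test_num $desc"
    else
        echo "not ok $test_num $desc"
    fi
}

# Made-up expectations standing in for the benchmark's timing checks.
ktap_plan 2
ktap_result "first timing within bound" [ 100 -le 120 ]
ktap_result "second timing within bound" [ 300 -le 250 ]
```

With one result line per expectation, a harness can count passes and
failures without anyone reading the timing output by hand.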
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Mark Brown (2):
kselftest/seccomp: Use kselftest output functions for benchmark
kselftest/seccomp: Report each expectation we assert as a KTAP test
.../testing/selftests/seccomp/seccomp_benchmark.c | 105 +++++++++++++--------
1 file changed, 65 insertions(+), 40 deletions(-)
---
base-commit: 2cc14f52aeb78ce3f29677c2de1f06c0e91471ab
change-id: 20231219-b4-kselftest-seccomp-benchmark-ktap-357603823708
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Hi all,
The livepatch selftest somehow fails in -next on s390 due to what
appears to me to be a 'comm' usage issue. E.g., removing the
timestamp-less line "with link type OSD_10GIG." from the output below makes
'comm' produce the correct result in the check_result() function of the
tools/testing/selftests/livepatch/functions.sh script:
[ 11.229256] qeth 0.0.bd02: qdio: OSA on SC 2624 using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W
[ 11.250189] systemd-journald[943]: Successfully sent stream file descriptor to service manager.
[ 11.258763] qeth 0.0.bd00: Device is a OSD Express card (level: 0165)
with link type OSD_10GIG.
[ 11.259261] qeth 0.0.bd00: The device represents a Bridge Capable Port
[ 11.262376] qeth 0.0.bd00: MAC address b2:96:9c:49:aa:e9 successfully registered
[ 11.269654] qeth 0.0.bd00: MAC address 06:c6:b5:7d:ee:63 successfully registered
By contrast, using 'diff' instead works like a charm. But it was
removed with commit 2f3f651f3756 ("selftests/livepatch: Use "comm"
instead of "diff" for dmesg").
I am attaching the contents of the "$expect" and "$result" script
variables and the output of 'dmesg' before and after the test run
(dmesg-saved.txt and dmesg.txt).
Another pair of 'dmesg' outputs, dmesg-saved1.txt and dmesg1.txt, also
shows the same problem, which seems to have something to do with
sorting.
The minimal reproducer attached is dmesg-saved1-rep.txt and
dmesg1-rep.txt, which could be described as:
--- dmesg-saved1-rep.txt 2023-12-17 21:08:14.171014218 +0100
+++ dmesg1-rep.txt 2023-12-17 21:06:52.221014218 +0100
@@ -1,3 +1,3 @@
-[ 98.820331] livepatch: 'test_klp_state2': starting patching transition
[ 100.031067] livepatch: 'test_klp_state2': completing patching transition
[ 284.224335] livepatch: kernel.ftrace_enabled = 1
+[ 284.232921] ===== TEST: basic shadow variable API =====
The culprit is the extra space in the [ 98.820331] timestamp, which from
the script's point of view produces output with two extra lines:
[ 100.031067] livepatch: 'test_klp_state2': completing patching transition
[ 284.224335] livepatch: kernel.ftrace_enabled = 1
[ 284.232921] ===== TEST: basic shadow variable API =====
If the line with the [ 98.820331] timestamp is removed or changed to e.g.
[ 100.031066] (i.e. 1 us less), then the resulting output is as expected:
[ 284.232921] ===== TEST: basic shadow variable API =====
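A sketch of the reproducer with the comparison forced into the C locale, so
that 'comm' orders lines bytewise and a space-padded timestamp sorts before
a longer one. This assumes locale collation is what confuses 'comm' here,
which the report only suspects; new_lines is a hypothetical helper, and the
timestamp padding is reconstructed on the assumption that dmesg right-aligns
the seconds field:

```shell
#!/bin/sh
# Hypothetical helper: print lines present only in the second file,
# comparing bytewise (C locale) rather than with locale collation.
new_lines() {
    LC_ALL=C comm -13 "$1" "$2"
}

saved=$(mktemp)
current=$(mktemp)

# dmesg before the test run (padding inside the brackets is an assumption).
printf '%s\n' \
    "[   98.820331] livepatch: 'test_klp_state2': starting patching transition" \
    "[  100.031067] livepatch: 'test_klp_state2': completing patching transition" \
    "[  284.224335] livepatch: kernel.ftrace_enabled = 1" > "$saved"

# dmesg after the test run: the oldest line is gone, one new line is added.
printf '%s\n' \
    "[  100.031067] livepatch: 'test_klp_state2': completing patching transition" \
    "[  284.224335] livepatch: kernel.ftrace_enabled = 1" \
    "[  284.232921] ===== TEST: basic shadow variable API =====" > "$current"

# Both inputs are sorted in the C locale, so only the TEST line is printed.
new_lines "$saved" "$current"

rm -f "$saved" "$current"
```

Whether forcing the locale is the right fix for functions.sh, as opposed to
going back to 'diff', is left open here.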
Thanks!
This patchset moves the current kernel testing livepatch modules from
lib/livepatch to tools/testing/selftests/livepatch/test_modules, and compiles
them as out-of-tree modules before testing.
There is also a new test being added. This new test exercises multiple processes
calling a syscall while a livepatch patches that syscall.
Why this move is an improvement:
* The modules are now compiled as out-of-tree modules against the current
running kernel, making them capable of being tested on different systems with
newer or older kernels.
* This approach now requires the kernel-devel package to be installed, since
the modules are built out-of-tree. The package can be generated by running
"make rpm-pkg" in the kernel source.
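For readers unfamiliar with out-of-tree builds, the usual Kbuild invocation
against the running kernel's build directory looks roughly like the sketch
below. The helper module_build_cmd is hypothetical and only prints the
command; the Makefile added by this series may drive the build differently:

```shell
#!/bin/sh
# Hypothetical helper: print (rather than run) the conventional Kbuild
# command for building external modules against an installed kernel.
module_build_cmd() {
    release=$1   # kernel release string, e.g. the output of `uname -r`
    srcdir=$2    # directory holding the module sources and their Makefile
    printf 'make -C /lib/modules/%s/build M=%s modules\n' "$release" "$srcdir"
}

# Show the command for the running kernel and a test_modules directory.
module_build_cmd "$(uname -r)" "$PWD/test_modules"
```

The /lib/modules/<release>/build path is what the kernel development
package normally provides, which is why it has to be installed for the
release being tested.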
What needs to be solved:
* Currently gen_tar only packages the resulting binaries of the tests, and not
the sources. For the current approach, the newly added modules would be
compiled and then packaged. It works when testing on a system with the same
kernel version. But it will fail when running on a machine with a different kernel
version, since the modules were compiled against the kernel running on the build machine.
This is not a new problem, just aligning the expectations. For the current
approach to be truly system agnostic gen_tar would need to include the module
and program sources to be compiled in the target systems.
I'm sending the patches now so it can be discussed before Plumbers.
Thanks in advance!
Marcos
To: Shuah Khan <shuah(a)kernel.org>
To: Jonathan Corbet <corbet(a)lwn.net>
To: Heiko Carstens <hca(a)linux.ibm.com>
To: Vasily Gorbik <gor(a)linux.ibm.com>
To: Alexander Gordeev <agordeev(a)linux.ibm.com>
To: Christian Borntraeger <borntraeger(a)linux.ibm.com>
To: Sven Schnelle <svens(a)linux.ibm.com>
To: Josh Poimboeuf <jpoimboe(a)kernel.org>
To: Jiri Kosina <jikos(a)kernel.org>
To: Miroslav Benes <mbenes(a)suse.cz>
To: Petr Mladek <pmladek(a)suse.com>
To: Joe Lawrence <joe.lawrence(a)redhat.com>
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-doc(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-s390(a)vger.kernel.org
Cc: live-patching(a)vger.kernel.org
Signed-off-by: Marcos Paulo de Souza <mpdesouza(a)suse.com>
Changes in v3:
* Rebased on top of v6.6-rc5
* The commits messages were improved (Thanks Petr!)
* Created TEST_GEN_MODS_DIR variable to point to a directory that contains kernel
modules, and adapted selftests to build it before running the test.
* Moved test_klp-call_getpid out of test_programs, since the gen_tar
would just copy the generated test programs to the livepatches dir,
and so scripts relying on test_programs/test_klp-call_getpid will fail.
* Added a module_param for klp_pids, describing its usage.
* Simplified the call_getpid program to ignore the return of getpid syscall,
since we only want to make sure the process transitions correctly to the
patched state
* The test-syscall.sh now prints a log message showing the number of remaining
processes to transition into the livepatched state, and check_output expects it
to be 0.
* Added MODULE_AUTHOR and MODULE_DESCRIPTION to test_klp_syscall.c
The v2 can be seen here:
https://lore.kernel.org/linux-kselftest/20220630141226.2802-1-mpdesouza@sus…
---
Marcos Paulo de Souza (3):
kselftests: lib.mk: Add TEST_GEN_MODS_DIR variable
livepatch: Move tests from lib/livepatch to selftests/livepatch
selftests: livepatch: Test livepatching a heavily called syscall
Documentation/dev-tools/kselftest.rst | 4 +
arch/s390/configs/debug_defconfig | 1 -
arch/s390/configs/defconfig | 1 -
lib/Kconfig.debug | 22 ----
lib/Makefile | 2 -
lib/livepatch/Makefile | 14 ---
tools/testing/selftests/lib.mk | 20 +++-
tools/testing/selftests/livepatch/Makefile | 5 +-
tools/testing/selftests/livepatch/README | 17 +--
tools/testing/selftests/livepatch/config | 1 -
tools/testing/selftests/livepatch/functions.sh | 34 +++---
.../testing/selftests/livepatch/test-callbacks.sh | 50 ++++-----
tools/testing/selftests/livepatch/test-ftrace.sh | 6 +-
.../testing/selftests/livepatch/test-livepatch.sh | 10 +-
.../selftests/livepatch/test-shadow-vars.sh | 2 +-
tools/testing/selftests/livepatch/test-state.sh | 18 ++--
tools/testing/selftests/livepatch/test-syscall.sh | 53 ++++++++++
tools/testing/selftests/livepatch/test-sysfs.sh | 6 +-
.../selftests/livepatch/test_klp-call_getpid.c | 44 ++++++++
.../selftests/livepatch/test_modules/Makefile | 20 ++++
.../test_modules}/test_klp_atomic_replace.c | 0
.../test_modules}/test_klp_callbacks_busy.c | 0
.../test_modules}/test_klp_callbacks_demo.c | 0
.../test_modules}/test_klp_callbacks_demo2.c | 0
.../test_modules}/test_klp_callbacks_mod.c | 0
.../livepatch/test_modules}/test_klp_livepatch.c | 0
.../livepatch/test_modules}/test_klp_shadow_vars.c | 0
.../livepatch/test_modules}/test_klp_state.c | 0
.../livepatch/test_modules}/test_klp_state2.c | 0
.../livepatch/test_modules}/test_klp_state3.c | 0
.../livepatch/test_modules/test_klp_syscall.c | 116 +++++++++++++++++++++
31 files changed, 325 insertions(+), 121 deletions(-)
---
base-commit: 6489bf2e1df1c84e9bcd4694029ff35b39fd3397
change-id: 20231031-send-lp-kselftests-4c917dcd4565
Best regards,
--
Marcos Paulo de Souza <mpdesouza(a)suse.com>
From: Paul Durrant <pdurrant(a)amazon.com>
This series has some small fixes from what was in version 10 [1]:
* KVM: pfncache: allow a cache to be activated with a fixed (userspace) HVA
This required a small fix to kvm_gpc_check() for an error that was
introduced in version 8.
* KVM: xen: separate initialization of shared_info cache and content
This accidentally regressed a fix in commit 5d6d6a7d7e66a ("KVM: x86:
Refine calculation of guest wall clock to use a single TSC read").
* KVM: xen: re-initialize shared_info if guest (32/64-bit) mode is set
This mistakenly removed the initialization of shared_info from the code
setting the KVM_XEN_ATTR_TYPE_SHARED_INFO attribute, which broke the self-
tests.
* KVM: xen: split up kvm_xen_set_evtchn_fast()
This had a /32 and a /64 swapped in set_vcpu_info_evtchn_pending().
[1] https://lore.kernel.org/kvm/20231204144334.910-1-paul@xen.org/
Paul Durrant (19):
KVM: pfncache: Add a map helper function
KVM: pfncache: remove unnecessary exports
KVM: xen: mark guest pages dirty with the pfncache lock held
KVM: pfncache: add a mark-dirty helper
KVM: pfncache: remove KVM_GUEST_USES_PFN usage
KVM: pfncache: stop open-coding offset_in_page()
KVM: pfncache: include page offset in uhva and use it consistently
KVM: pfncache: allow a cache to be activated with a fixed (userspace)
HVA
KVM: xen: separate initialization of shared_info cache and content
KVM: xen: re-initialize shared_info if guest (32/64-bit) mode is set
KVM: xen: allow shared_info to be mapped by fixed HVA
KVM: xen: allow vcpu_info to be mapped by fixed HVA
KVM: selftests / xen: map shared_info using HVA rather than GFN
KVM: selftests / xen: re-map vcpu_info using HVA rather than GPA
KVM: xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA capability
KVM: xen: split up kvm_xen_set_evtchn_fast()
KVM: xen: don't block on pfncache locks in kvm_xen_set_evtchn_fast()
KVM: pfncache: check the need for invalidation under read lock first
KVM: xen: allow vcpu_info content to be 'safely' copied
Documentation/virt/kvm/api.rst | 53 ++-
arch/x86/kvm/x86.c | 7 +-
arch/x86/kvm/xen.c | 360 +++++++++++-------
include/linux/kvm_host.h | 40 +-
include/linux/kvm_types.h | 8 -
include/uapi/linux/kvm.h | 9 +-
.../selftests/kvm/x86_64/xen_shinfo_test.c | 59 ++-
virt/kvm/pfncache.c | 188 ++++-----
8 files changed, 466 insertions(+), 258 deletions(-)
base-commit: f2a3fb7234e52f72ff4a38364dbf639cf4c7d6c6
--
2.39.2
For now, the reg bounds are not handled for the BPF_JNE case, which can cause
the following case to fail:
/* The type of "a" is u32 */
if (a > 0 && a < 100) {
/* the range of the register for a is [0, 99], not [1, 99],
* and will cause the following error:
*
* invalid zero-sized read
*
* as a can be 0.
*/
bpf_skb_store_bytes(skb, xx, xx, a, 0);
}
In the code above, "a > 0" will be compiled to "if a == 0 goto xxx". In
the TRUE branch, the dst_reg will be marked as known to be 0. However, in the
fallthrough (FALSE) branch, the dst_reg will not be handled, which leaves
the [min, max] for a as [0, 99], not [1, 99].
In the 1st patch, we reduce the range of the dst reg for BPF_JNE if the src
reg is a const and is exactly at the edge of the dst reg's range.
In the 2nd patch, we remove reduplicated s32 casting in "crafted_cases".
In the 3rd patch, we just activate the test case for this logic in
range_cond(), which is committed by Andrii in the
commit 8863238993e2 ("selftests/bpf: BPF register range bounds tester").
In the 4th patch, we convert the case above to a testcase and add it to
verifier_bounds.c.
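As a rough illustration of the narrowing described for the 1st patch, here is
a minimal userspace sketch. It is not the verifier code: struct u32_range and
range_not_equal_const() are names made up for this example, and only the
unsigned 32-bit case is modelled.

```c
#include <stdint.h>

/* Hypothetical stand-in for a register's unsigned 32-bit bounds. */
struct u32_range {
	uint32_t min;
	uint32_t max;
};

/*
 * Narrow [min, max] given that the value is known to differ from the
 * constant c. Only a constant sitting exactly on an edge of the range
 * lets us shrink it; a constant in the middle tells us nothing that a
 * single interval can express.
 */
static struct u32_range range_not_equal_const(struct u32_range r, uint32_t c)
{
	if (r.min == r.max)
		return r;	/* single value: shrinking would empty the range */
	if (c == r.min)
		r.min++;	/* safe, since min < max */
	else if (c == r.max)
		r.max--;	/* safe, since max > min */
	return r;
}
```

For the example above, a range of [0, 99] with the constant 0 becomes [1, 99]
on the fallthrough branch, which is what rules out the zero-sized read.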
Changes since v4:
- add the 2nd patch
- add "{U32, U32, {0, U32_MAX}, {U32_MAX, U32_MAX}}" that we missed in the
3rd patch
- add some comments to the function that we add in the 4th patch
- add reg_not_equal_const() in the 4th patch
Changes since v3:
- do some adjustment to the crafted cases that we added in the 2nd patch
- add the 3rd patch
Changes since v2:
- fix a typo in the subject of the 1st patch
- add some comments to the 1st patch, as Eduard advised
- add some cases to the "crafted_cases"
Changes since v1:
- simplify the code in the 1st patch
- introduce the 2nd patch for the testing
Menglong Dong (4):
bpf: make the verifier tracks the "not equal" for regs
selftests/bpf: remove reduplicated s32 casting in "crafted_cases"
selftests/bpf: activate the OP_NE logic in range_cond()
selftests/bpf: add testcase to verifier_bounds.c for BPF_JNE
kernel/bpf/verifier.c | 38 +++++++++++-
.../selftests/bpf/prog_tests/reg_bounds.c | 27 +++++---
.../selftests/bpf/progs/verifier_bounds.c | 62 +++++++++++++++++++
3 files changed, 116 insertions(+), 11 deletions(-)
--
2.39.2
Swap the arguments to typecheck_fn() in kunit_activate_static_stub()
so that real_fn_addr can be either the function itself or a pointer
to that function.
This is useful to simplify redirecting static functions in a module.
Having to pass the actual function meant that it must be exported
from the module, either by making the 'static' and EXPORT_SYMBOL*()
conditional (which makes the code messy), or by changing it to always be
exported (which increases the export namespace and prevents the
compiler inlining a trivial stub function in non-test builds).
With the original definition of kunit_activate_static_stub() the
address of real_fn_addr was passed to typecheck_fn() as the type to
be passed. This meant that if real_fn_addr was a pointer-to-function
it would resolve to a ** instead of a *, giving an error like this:
error: initialization of ‘int (**)(int)’ from incompatible pointer
type ‘int (*)(int)’ [-Werror=incompatible-pointer-types]
kunit_activate_static_stub(test, add_one_fn_ptr, subtract_one);
| ^~~~~~~~~~~~
./include/linux/typecheck.h:21:25: note: in definition of macro
‘typecheck_fn’
21 | ({ typeof(type) __tmp = function; \
Swapping the arguments to typecheck_fn makes it take the type of a
pointer to the replacement function. Either a function or a pointer
to function can be assigned to that. For example:
static int some_function(int x)
{
/* whatever */
}
int (* some_function_ptr)(int) = some_function;
static int replacement(int x)
{
/* whatever */
}
Then:
kunit_activate_static_stub(test, some_function, replacement);
yields:
typecheck_fn(typeof(&replacement), some_function);
and:
kunit_activate_static_stub(test, some_function_ptr, replacement);
yields:
typecheck_fn(typeof(&replacement), some_function_ptr);
The two typecheck_fn() then resolve to:
int (*__tmp)(int) = some_function;
and
int (*__tmp)(int) = some_function_ptr;
Both of these are valid. In the first case the compiler inserts
an implicit '&' to take the address of the supplied function, and
in the second case the RHS is already a pointer to the same type.
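The same behaviour can be tried outside the kernel. The following is only a
sketch: typecheck_fn() here is modelled on the kernel macro but spelled with
__typeof__, check_stub() is a made-up stand-in for
kunit_activate_static_stub() that does the swapped type check and then just
evaluates to the replacement's address, and demo() is a hypothetical driver.
It needs GNU C (gcc or clang) for statement expressions.

```c
/* Modelled on the kernel's typecheck_fn(); relies on GNU C extensions. */
#define typecheck_fn(type, function) \
	({ __typeof__(type) __tmp = function; (void)__tmp; })

static int add_one(int x) { return x + 1; }
static int subtract_one(int x) { return x - 1; }

/* A pointer to the real function, as a module might keep privately. */
static int (*add_one_fn_ptr)(int) = add_one;

/*
 * Hypothetical stand-in for kunit_activate_static_stub(): the type is
 * taken from the replacement, so real_fn_addr may be either the
 * function itself or a pointer to it.
 */
#define check_stub(real_fn_addr, replacement_addr) ({ \
	typecheck_fn(__typeof__(&replacement_addr), real_fn_addr); \
	&replacement_addr; })

static int demo(void)
{
	int (*a)(int) = check_stub(add_one, subtract_one);
	int (*b)(int) = check_stub(add_one_fn_ptr, subtract_one);

	return a(1) + b(1);	/* both call subtract_one(1) */
}
```

Both check_stub() uses compile without a diagnostic. Swapping the arguments
back, so that the type is taken from &add_one_fn_ptr, should reproduce the
incompatible-pointer-types error quoted above.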
Signed-off-by: Richard Fitzgerald <rf(a)opensource.cirrus.com>
---
include/kunit/static_stub.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/kunit/static_stub.h b/include/kunit/static_stub.h
index 85315c80b303..bf940322dfc0 100644
--- a/include/kunit/static_stub.h
+++ b/include/kunit/static_stub.h
@@ -93,7 +93,7 @@ void __kunit_activate_static_stub(struct kunit *test,
* The redirection can be disabled again with kunit_deactivate_static_stub().
*/
#define kunit_activate_static_stub(test, real_fn_addr, replacement_addr) do { \
- typecheck_fn(typeof(&real_fn_addr), replacement_addr); \
+ typecheck_fn(typeof(&replacement_addr), real_fn_addr); \
__kunit_activate_static_stub(test, real_fn_addr, replacement_addr); \
} while (0)
--
2.30.2
The seccomp benchmark runs five scenarios, one calibration run with no
seccomp filters enabled then four further runs each adding a filter. The
calibration run times itself for 15s and then each additional run executes
for the same number of times.
Currently the seccomp tests, including the benchmark, run with an extended
120s timeout but this is not sufficient to robustly run the tests on a lot
of platforms. Sample timings from some recent runs:
Platform            Run 1   Run 2   Run 3   Run 4
---------           -----   -----   -----   -----
PowerEdge R200      16.6s   16.6s   31.6s   37.4s
BBB (arm)           20.4s   20.4s   54.5s
Synquacer (arm64)   20.7s   23.7s   40.3s
The x86 runs from the PowerEdge are quite marginal and routinely fail. For
the successful run reported here the timed portions of the run are at
117.2s leaving less than 3s of margin which is frequently breached. The
added overhead of adding filters on the other platforms is such that there
is no prospect of their runs fitting into the 120s timeout, especially
on 32 bit arm where there is no BPF JIT.
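The 117.2s figure can be reproduced from the PowerEdge row, assuming the
timed portion is the 15s calibration plus the four filter runs. A small
sketch of that arithmetic follows; the helper name and the use of tenths of
a second (to keep the sums exact) are choices made for this example only.

```c
/* PowerEdge R200 filter runs from the table, in tenths of a second. */
static const int poweredge_runs_ds[] = { 166, 166, 316, 374 };

/* Sum the calibration run and the filter runs (tenths of a second). */
static int timed_total_ds(int calibration_ds, const int *runs_ds, int nruns)
{
	int total = calibration_ds;

	for (int i = 0; i < nruns; i++)
		total += runs_ds[i];
	return total;
}

/* Margin left under a timeout given in whole seconds. */
static int margin_ds(int timeout_s, int total_ds)
{
	return timeout_s * 10 - total_ds;
}
```

With a 15s calibration this gives 117.2s in total, so 2.8s of margin under
the 120s timeout and 62.8s under a 180s timeout.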
While we could lower the time we calibrate for I'm also already seeing the
currently completing runs reporting issues with the per filter overheads
not matching expectations:
Let's instead raise the timeout to 180s which is only a 50% increase on the
current timeout which is itself not *too* large given that there's only two
tests in this suite.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/seccomp/settings | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/seccomp/settings b/tools/testing/selftests/seccomp/settings
index 6091b45d226b..a953c96aa16e 100644
--- a/tools/testing/selftests/seccomp/settings
+++ b/tools/testing/selftests/seccomp/settings
@@ -1 +1 @@
-timeout=120
+timeout=180
---
base-commit: 2cc14f52aeb78ce3f29677c2de1f06c0e91471ab
change-id: 20231219-b4-kselftest-seccomp-benchmark-timeout-05b66e7d29d1
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Hi,
It is said eBPF is a safe way to extend kernels and that is very
attractive, but we need to use kfuncs to add new usage of eBPF and
kfuncs are said to be as unstable as EXPORT_SYMBOL_GPL. So now I'd like to ask
some questions:
1) Which should I choose, BPF kfuncs or ioctl, when adding a new feature
for userspace apps?
2) How should I use BPF kfuncs from userspace apps if I add them?
Here, a "userspace app" means something not like a system-wide daemon
like systemd (particularly, I have QEMU in mind). I'll describe the
context more below:
---
I'm working on a new feature that aids virtio-net implementations using
tuntap virtual network device. You can see [1] for details, but
basically it's to extend BPF_PROG_TYPE_SOCKET_FILTER to report four more
bytes.
However, after long discussions we have confirmed that extending
BPF_PROG_TYPE_SOCKET_FILTER is not going to happen, and adding kfuncs is
the way forward. So I have been considering how to add kfuncs to the kernel
and how to use them. There is rich documentation for the kernel side, but I
found little about the userspace side. The best I could find is a systemd
change proposal that is based on WIP kernel changes[2].
So now I'm wondering how I should use BPF kfuncs from userspace apps if
I add them. In the systemd discussion, it is reported that Linus said it's
fine to use BPF kfuncs in a private infrastructure big companies own, or
in systemd as those users know well about the system[3]. Indeed, those
users should be able to make more assumptions on the kernel than
"normal" userspace applications can.
Returning to my proposal, I'm proposing a new feature to be used by QEMU
or other VMM applications. QEMU is more like a normal userspace
application, and usually does not make many assumptions about the kernel it
runs on. For example, it's generally safe to run a Debian container
including QEMU installed with apt on Fedora. BPF kfuncs may work even in
such a situation thanks to CO-RE, but it sounds like *accidentally*
creating UAPIs.
Considering all of the above, how can I integrate BPF kfuncs into the application?
If BPF kfuncs are like EXPORT_SYMBOL_GPL, the natural way to handle them
is to think of BPF programs as some sort of kernel modules and
incorporate logic that behaves like modprobe. More concretely, I can put
eBPF binaries to a directory like:
/usr/local/share/qemu/ebpf/$KERNEL_RELEASE
Then, QEMU can uname() and get the path to the binary. It will give an
error if it can't find the binary for the current kernel so that it
won't create accidental UAPIs.
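To make the lookup concrete, here is a minimal sketch of how an application
could build that per-kernel directory from uname(). The base directory is the
one suggested above; the EBPF_BASE_DIR macro and the ebpf_object_dir() helper
are hypothetical names, not existing QEMU code.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Base directory from the proposal above; the macro name is made up. */
#define EBPF_BASE_DIR "/usr/local/share/qemu/ebpf"

/*
 * Write "<base>/<kernel release>" into buf.
 * Returns 0 on success, -1 if uname() fails or buf is too small.
 */
static int ebpf_object_dir(char *buf, size_t len)
{
	struct utsname u;
	int n;

	if (uname(&u) != 0)
		return -1;

	n = snprintf(buf, len, "%s/%s", EBPF_BASE_DIR, u.release);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output error or truncated path */
	return 0;
}
```

The caller would then try to open the eBPF object under that directory and
report an error if it does not exist, rather than falling back to a binary
built for a different kernel.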
The obvious downside of this is that it complicates packaging a lot; it
requires packaging QEMU eBPF binaries each time a new kernel comes out.
This complexity is centrally managed by modprobe for kernel modules, but
apparently each application needs to take care of it for BPF programs.
In conclusion, I see too much complexity in using BPF in a userspace
application, which we didn't have to care about with
BPF_PROG_TYPE_SOCKET_FILTER. Isn't there a better way? Or shouldn't I
use BPF in my case in the first place?
Thanks,
Akihiko Odaki
[1]
https://lore.kernel.org/all/20231015141644.260646-1-akihiko.odaki@daynix.co…
[2] https://github.com/systemd/systemd/pull/29797
[3] https://github.com/systemd/systemd/pull/29797#discussion_r1384637939
The vec-syscfg selftest verifies that setting the VL of the currently
tested vector type does not disrupt the VL of the other vector type. To do
this it records the current vector length for each type but neglects to
guard this with a check for that vector type actually being supported. Add
one, using a helper function, and also update all the other instances
of this pattern to use the helper.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/arm64/fp/vec-syscfg.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/arm64/fp/vec-syscfg.c b/tools/testing/selftests/arm64/fp/vec-syscfg.c
index 5f648b97a06f..ea9c7d47790f 100644
--- a/tools/testing/selftests/arm64/fp/vec-syscfg.c
+++ b/tools/testing/selftests/arm64/fp/vec-syscfg.c
@@ -66,6 +66,11 @@ static struct vec_data vec_data[] = {
},
};
+static bool vec_type_supported(struct vec_data *data)
+{
+ return getauxval(data->hwcap_type) & data->hwcap;
+}
+
static int stdio_read_integer(FILE *f, const char *what, int *val)
{
int n = 0;
@@ -564,8 +569,11 @@ static void prctl_set_all_vqs(struct vec_data *data)
return;
}
- for (i = 0; i < ARRAY_SIZE(vec_data); i++)
+ for (i = 0; i < ARRAY_SIZE(vec_data); i++) {
+ if (!vec_type_supported(&vec_data[i]))
+ continue;
orig_vls[i] = vec_data[i].rdvl();
+ }
for (vq = SVE_VQ_MIN; vq <= SVE_VQ_MAX; vq++) {
vl = sve_vl_from_vq(vq);
@@ -594,7 +602,7 @@ static void prctl_set_all_vqs(struct vec_data *data)
if (&vec_data[i] == data)
continue;
- if (!(getauxval(vec_data[i].hwcap_type) & vec_data[i].hwcap))
+ if (!vec_type_supported(&vec_data[i]))
continue;
if (vec_data[i].rdvl() != orig_vls[i]) {
@@ -765,7 +773,7 @@ int main(void)
struct vec_data *data = &vec_data[i];
unsigned long supported;
- supported = getauxval(data->hwcap_type) & data->hwcap;
+ supported = vec_type_supported(data);
if (!supported)
all_supported = false;
---
base-commit: 2cc14f52aeb78ce3f29677c2de1f06c0e91471ab
change-id: 20231215-kselftest-arm64-vec-syscfg-rdvl-7944e19ac64f
Best regards,
--
Mark Brown <broonie(a)kernel.org>
When running tests on a CI system (e.g. LAVA) it is useful to output
test results in TAP format so that the CI can parse the fine-grained
results to show regressions. Many of the mm selftest binaries already
output using the TAP format. And the kselftests runner
(run_kselftest.sh) also uses the format. CI systems such as LAVA can
already handle nested TAP reports. However, with the mm selftests we
have 3 levels of nesting (run_kselftest.sh -> run_vmtests.sh ->
individual test binaries) and the middle level did not previously
support TAP, which breaks the parser.
Let's fix that by teaching run_vmtests.sh to output using the TAP
format. Ideally this would be opt-in via a command line argument to
avoid the possibility of breaking anyone's existing scripts that might
scrape the output. However, it is not possible to pass arguments to
tests invoked via run_kselftest.sh. So I've implemented an opt-out
option (-n), which will revert to the existing output format.
Future changes to this file should be aware of 2 new conventions:
- output that is part of the TAP reporting is piped through tap_output
- general output is piped through tap_prefix
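For readers more at home in C than shell, the per-test result lines that the
script emits can be sketched as below. This only mirrors the three cases
(pass, skip, fail) of the shell logic; tap_result_line() and KSFT_SKIP are
names chosen for this example.

```c
#include <stdio.h>

/* Kselftest SKIP exit code, as used by run_vmtests.sh (ksft_skip=4). */
#define KSFT_SKIP 4

/*
 * Format one TAP result line for test number num, given the test's
 * exit status. Returns what snprintf() returns.
 */
static int tap_result_line(char *buf, size_t len, int num,
			   const char *name, int ret)
{
	if (ret == 0)
		return snprintf(buf, len, "ok %d %s", num, name);
	if (ret == KSFT_SKIP)
		return snprintf(buf, len, "ok %d %s # SKIP", num, name);
	return snprintf(buf, len, "not ok %d %s # exit=%d", num, name, ret);
}
```

Everything that is not one of these lines, the "TAP version 13" header or the
final "1..N" plan is treated as diagnostic output and gets the "# " prefix.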
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---
tools/testing/selftests/mm/run_vmtests.sh | 51 +++++++++++++++++------
1 file changed, 39 insertions(+), 12 deletions(-)
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index 87f513f5cf91..246d53a5d7f2 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -5,6 +5,7 @@
# Kselftest framework requirement - SKIP code is 4.
ksft_skip=4
+count_total=0
count_pass=0
count_fail=0
count_skip=0
@@ -17,6 +18,7 @@ usage: ${BASH_SOURCE[0]:-$0} [ options ]
-a: run all tests, including extra ones
-t: specify specific categories to tests to run
-h: display this message
+ -n: disable TAP output
The default behavior is to run required tests only. If -a is specified,
will run all tests.
@@ -77,12 +79,14 @@ EOF
}
RUN_ALL=false
+TAP_PREFIX="# "
-while getopts "aht:" OPT; do
+while getopts "aht:n" OPT; do
case ${OPT} in
"a") RUN_ALL=true ;;
"h") usage ;;
"t") VM_SELFTEST_ITEMS=${OPTARG} ;;
+ "n") TAP_PREFIX= ;;
esac
done
shift $((OPTIND -1))
@@ -184,30 +188,52 @@ fi
VADDR64=0
echo "$ARCH64STR" | grep "$ARCH" &>/dev/null && VADDR64=1
+tap_prefix() {
+ sed -e "s/^/${TAP_PREFIX}/"
+}
+
+tap_output() {
+ if [[ ! -z "$TAP_PREFIX" ]]; then
+ read str
+ echo $str
+ fi
+}
+
+pretty_name() {
+ echo "$*" | sed -e 's/^\(bash \)\?\.\///'
+}
+
# Usage: run_test [test binary] [arbitrary test arguments...]
run_test() {
if test_selected ${CATEGORY}; then
+ local test=$(pretty_name "$*")
local title="running $*"
local sep=$(echo -n "$title" | tr "[:graph:][:space:]" -)
- printf "%s\n%s\n%s\n" "$sep" "$title" "$sep"
+ printf "%s\n%s\n%s\n" "$sep" "$title" "$sep" | tap_prefix
- "$@"
- local ret=$?
+ ("$@" 2>&1) | tap_prefix
+ local ret=${PIPESTATUS[0]}
+ count_total=$(( count_total + 1 ))
if [ $ret -eq 0 ]; then
count_pass=$(( count_pass + 1 ))
- echo "[PASS]"
+ echo "[PASS]" | tap_prefix
+ echo "ok ${count_total} ${test}" | tap_output
elif [ $ret -eq $ksft_skip ]; then
count_skip=$(( count_skip + 1 ))
- echo "[SKIP]"
+ echo "[SKIP]" | tap_prefix
+ echo "ok ${count_total} ${test} # SKIP" | tap_output
exitcode=$ksft_skip
else
count_fail=$(( count_fail + 1 ))
- echo "[FAIL]"
+ echo "[FAIL]" | tap_prefix
+ echo "not ok ${count_total} ${test} # exit=$ret" | tap_output
exitcode=1
fi
fi # test_selected
}
+echo "TAP version 13" | tap_output
+
CATEGORY="hugetlb" run_test ./hugepage-mmap
shmmax=$(cat /proc/sys/kernel/shmmax)
@@ -231,9 +257,9 @@ CATEGORY="hugetlb" run_test ./hugetlb_fault_after_madv
echo "$nr_hugepages_tmp" > /proc/sys/vm/nr_hugepages
if test_selected "hugetlb"; then
- echo "NOTE: These hugetlb tests provide minimal coverage. Use"
- echo " https://github.com/libhugetlbfs/libhugetlbfs.git for"
- echo " hugetlb regression testing."
+ echo "NOTE: These hugetlb tests provide minimal coverage. Use" | tap_prefix
+ echo " https://github.com/libhugetlbfs/libhugetlbfs.git for" | tap_prefix
+ echo " hugetlb regression testing." | tap_prefix
fi
CATEGORY="mmap" run_test ./map_fixed_noreplace
@@ -312,7 +338,7 @@ CATEGORY="hmm" run_test bash ./test_hmm.sh smoke
# MADV_POPULATE_READ and MADV_POPULATE_WRITE tests
CATEGORY="madv_populate" run_test ./madv_populate
-echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
+(echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope 2>&1) | tap_prefix
CATEGORY="memfd_secret" run_test ./memfd_secret
# KSM KSM_MERGE_TIME_HUGE_PAGES test with size of 100
@@ -369,6 +395,7 @@ CATEGORY="mkdirty" run_test ./mkdirty
CATEGORY="mdwe" run_test ./mdwe_test
-echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}"
+echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}" | tap_prefix
+echo "1..${count_total}" | tap_output
exit $exitcode
--
2.25.1
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Add a test that writes long strings, some over the size of the sub buffer,
and makes sure that the entire content is there.
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v3: https://lore.kernel.org/linux-trace-kernel/20231212192317.0fb6b101@gandalf.…
- Removed / */ from regex, to catch more than one space added to the
beginning of the print. This would have caught the bug of using "%*s"
instead of "%.*s". Luckily, the trace_printk test caught that.
.../ftrace/test.d/00basic/trace_marker.tc | 82 +++++++++++++++++++
1 file changed, 82 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
new file mode 100755
index 000000000000..9aa0db2b84fc
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
@@ -0,0 +1,82 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Basic tests on writing to trace_marker
+# requires: trace_marker
+# flags: instance
+
+get_buffer_data_size() {
+ sed -ne 's/^.*data.*size:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_buffer_data_offset() {
+ sed -ne 's/^.*data.*offset:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_event_header_size() {
+ type_len=`sed -ne 's/^.*type_len.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ time_len=`sed -ne 's/^.*time_delta.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ array_len=`sed -ne 's/^.*array.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ total_bits=$((type_len+time_len+array_len))
+ total_bits=$((total_bits+7))
+ echo $((total_bits/8))
+}
+
+get_print_event_buf_offset() {
+ sed -ne 's/^.*buf.*offset:\([0-9][0-9]*\).*/\1/p' events/ftrace/print/format
+}
+
+event_header_size=`get_event_header_size`
+print_header_size=`get_print_event_buf_offset`
+
+data_offset=`get_buffer_data_offset`
+
+marker_meta=$((event_header_size+print_header_size))
+
+make_str() {
+ cnt=$1
+ # subtract two for \n\0 as marker adds these
+ cnt=$((cnt-2))
+ printf -- 'X%.0s' $(seq $cnt)
+}
+
+write_buffer() {
+ size=$1
+
+ str=`make_str $size`
+
+ # clear the buffer
+ echo > trace
+
+ # write the string into the marker
+ echo -n $str > trace_marker
+
+ echo $str
+}
+
+test_buffer() {
+
+ size=`get_buffer_data_size`
+ oneline_size=$((size-marker_meta))
+ echo size = $size
+ echo meta size = $marker_meta
+
+ # Now add a little more the meta data overhead will overflow
+
+ str=`write_buffer $size`
+
+ # Make sure the line was broken
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; exit}' trace`
+
+ if [ "$new_str" = "$str" ]; then
+ exit fail;
+ fi
+
+ # Make sure the entire line can be found
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; }' trace`
+
+ if [ "$new_str" != "$str" ]; then
+ exit fail;
+ fi
+}
+
+test_buffer
--
2.42.0
If an integer's type has x bits, shifting the integer left by x or more
is undefined behavior.
This can happen in the rotate function when attempting to do a rotation
of the whole value by 0.
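The same pitfall and fix can be shown with a plain 64-bit rotate. This is an
illustrative sketch only, not the memop.c code: rotr64() is a made-up helper
and it uses uint64_t rather than __uint128_t.

```c
#include <stdint.h>

/*
 * Rotate v right by amount bits. The naive form
 *     (v << (64 - amount)) | (v >> amount)
 * shifts left by 64 when amount is 0, which is undefined behaviour for
 * a 64-bit type. Reducing the left shift count modulo the width turns
 * that case into a shift by 0.
 */
static uint64_t rotr64(uint64_t v, unsigned int amount)
{
	unsigned int right = amount % 64;
	unsigned int left = (64 - right) % 64;	/* 0, not 64, when right == 0 */

	return (v << left) | (v >> right);
}
```

When right is 0 both shifts are by 0 and the result is simply v | v, that is,
v unchanged.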
Fixes: 0dd714bfd200 ("KVM: s390: selftest: memop: Add cmpxchg tests")
Signed-off-by: Nina Schoetterl-Glausch <nsg(a)linux.ibm.com>
---
tools/testing/selftests/kvm/s390x/memop.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c
index bb3ca9a5d731..2eba9575828e 100644
--- a/tools/testing/selftests/kvm/s390x/memop.c
+++ b/tools/testing/selftests/kvm/s390x/memop.c
@@ -485,11 +485,13 @@ static bool popcount_eq(__uint128_t a, __uint128_t b)
static __uint128_t rotate(int size, __uint128_t val, int amount)
{
- unsigned int bits = size * 8;
+ unsigned int left, right, bits = size * 8;
- amount = (amount + bits) % bits;
+ right = (amount + bits) % bits;
+ /* % 128 prevents left shift UB if size == 16 && right == 0 */
+ left = (bits - right) % 128;
val = cut_to_size(size, val);
- return (val << (bits - amount)) | (val >> amount);
+ return (val << left) | (val >> right);
}
const unsigned int max_block = 16;
base-commit: 305230142ae0637213bf6e04f6d9f10bbcb74af8
--
2.40.1
A statement used the %d print formatter where %s should have
been used. This commit fixes it.
Signed-off-by: Ghanshyam Agrawal <ghanshyam1898(a)gmail.com>
---
tools/testing/selftests/alsa/mixer-test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/alsa/mixer-test.c b/tools/testing/selftests/alsa/mixer-test.c
index 21e482b23f50..23df154fcdd7 100644
--- a/tools/testing/selftests/alsa/mixer-test.c
+++ b/tools/testing/selftests/alsa/mixer-test.c
@@ -138,7 +138,7 @@ static void find_controls(void)
err = snd_ctl_elem_info(card_data->handle,
ctl_data->info);
if (err < 0) {
- ksft_print_msg("%s getting info for %d\n",
+ ksft_print_msg("%s getting info for %s\n",
snd_strerror(err),
ctl_data->name);
}
--
2.25.1
Here are a few fixes related to MPTCP:
Patch 1 avoids skipping some subtests of the MPTCP Join selftest by
mistake when using older versions of GCC. This fixes a patch introduced
in v6.4, backported up to v6.1.
Patch 2 fixes an inconsistent state when using MPTCP + FastOpen. A fix
for v6.2.
Patch 3 adds a description for MPTCP Kunit test modules to avoid a
warning.
Patch 4 adds an entry to the mailmap file for Geliang's email addresses.
Signed-off-by: Matthieu Baerts <matttbe(a)kernel.org>
---
Geliang Tang (2):
selftests: mptcp: join: fix subflow_send_ack lookup
mailmap: add entries for Geliang Tang
Matthieu Baerts (1):
mptcp: fill in missing MODULE_DESCRIPTION()
Paolo Abeni (1):
mptcp: fix inconsistent state on fastopen race
.mailmap | 4 ++++
net/mptcp/crypto_test.c | 1 +
net/mptcp/protocol.c | 6 +++---
net/mptcp/protocol.h | 9 +++++---
net/mptcp/subflow.c | 28 +++++++++++++++----------
net/mptcp/token_test.c | 1 +
tools/testing/selftests/net/mptcp/mptcp_join.sh | 8 +++----
7 files changed, 36 insertions(+), 21 deletions(-)
---
base-commit: 64b8bc7d5f1434c636a40bdcfcd42b278d1714be
change-id: 20231215-upstream-net-20231215-mptcp-misc-fixes-33c4380c2f32
Best regards,
--
Matthieu Baerts <matttbe(a)kernel.org>
kvm_page_table_test's current default guest memory is set to 1GB;
however, on a system with 4GB of memory this setting causes an OOM event.
While it is possible to control the test program arguments using an
environment variable, KSELFTEST_KVM_PAGE_TABLE_TEST_ARGS, it is not
obvious to selftest users that this variable exists. Change
the default guest memory down to 128MB so that small systems can run
this test without seeing an OOM.
---
Signed-off-by: Itaru Kitayama <itaru.kitayama(a)linux.dev>
---
tools/testing/selftests/kvm/kvm_page_table_test.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/kvm_page_table_test.c b/tools/testing/selftests/kvm/kvm_page_table_test.c
index 69f26d80c821..3cef22642bcb 100644
--- a/tools/testing/selftests/kvm/kvm_page_table_test.c
+++ b/tools/testing/selftests/kvm/kvm_page_table_test.c
@@ -24,8 +24,8 @@
#define TEST_MEM_SLOT_INDEX 1
-/* Default size(1GB) of the memory for testing */
-#define DEFAULT_TEST_MEM_SIZE (1 << 30)
+/* Default size(128MB) of the memory for testing */
+#define DEFAULT_TEST_MEM_SIZE (1 << 27)
/* Default guest test virtual memory offset */
#define DEFAULT_GUEST_TEST_MEM 0xc0000000
---
base-commit: a39b6ac3781d46ba18193c9dbb2110f31e9bffe9
change-id: 20231217-selftest-dev-c769544c303d
Best regards,
--
Itaru Kitayama <itaru.kitayama(a)linux.dev>
Two small improvements to BPF exceptions in this patchset:
1. Allow throwing exceptions in XDP progs
2. Add some macros to help release references before throwing exceptions
Note the macros are intended to be temporary, at least until BPF
exception infra is able to automatically release acquired resources.
Daniel Xu (3):
bpf: xdp: Register generic_kfunc_set with XDP programs
bpf: selftests: Add bpf_assert_if() and bpf_assert_with_if() macros
bpf: selftests: Test bpf_assert_if() and bpf_assert_with_if()
kernel/bpf/helpers.c | 1 +
.../testing/selftests/bpf/bpf_experimental.h | 22 +++++++
.../selftests/bpf/prog_tests/exceptions.c | 5 ++
.../testing/selftests/bpf/progs/exceptions.c | 61 +++++++++++++++++++
4 files changed, 89 insertions(+)
--
2.42.1
If we run a parameterized test that uses test->priv to prepare some
custom data, then the value of test->priv will leak to the next param
iteration and may be unexpected.
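The leak is easy to model outside KUnit. The sketch below is not KUnit code:
struct mini_test, case_body() and run_params() are hypothetical names that
imitate a runner calling one test body once per parameter, with and without
clearing priv between iterations.

```c
#include <stddef.h>

/* Hypothetical stand-in for the part of struct kunit that matters here. */
struct mini_test {
	void *priv;
};

static int stale_seen;	/* how often a test body found leftover priv */

/* Test body: only the first parameter prepares custom data in priv. */
static void case_body(struct mini_test *t, int param)
{
	static int data;

	if (t->priv)
		stale_seen++;	/* priv left over from an earlier param */
	if (param == 0)
		t->priv = &data;
}

/* Run the body for each param; optionally reset priv after each run. */
static int run_params(int nparams, int reset_priv)
{
	struct mini_test t = { .priv = NULL };

	stale_seen = 0;
	for (int i = 0; i < nparams; i++) {
		case_body(&t, i);
		if (reset_priv)
			t.priv = NULL;
	}
	return stale_seen;
}
```

Without the reset, the second and third iterations both see the pointer set
by the first one; with the reset, every iteration starts from NULL.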
Cc: David Gow <davidgow(a)google.com>
Cc: Rae Moar <rmoar(a)google.com>
Michal Wajdeczko (2):
kunit: Add example for using test->priv
kunit: Reset test->priv after each param iteration
lib/kunit/kunit-example-test.c | 15 +++++++++++++++
lib/kunit/test.c | 1 +
2 files changed, 16 insertions(+)
--
2.25.1
Hi all,
Here's the v4 series to improve resctrl selftests with a generalized test
framework and a rewritten CAT test.
The series contains the following improvements:
- Excludes shareable bits from CAT test allocation to avoid interference
- Replaces file "sink" with a volatile variable
- Alters read pattern to defeat HW prefetcher optimizations
- Rewrites CAT test to make the CAT test reliable and truly measure
if CAT is working or not
- Introduces generalized test framework making it easier to add new tests
- Lots of other cleanups & refactoring
This series has been tested across a large number of systems from
different generations.
v4:
- Reworded a few error prints
- Changelog improvements
- fprintf()'s error handling changed ksft_perror() -> ksft_print_msg()
- Keep using ksft_*() instead of fprintf() in get_bit_mask()
- Check against div-by-zero
- Adjust one return type
v3:
- New patches to handle return errno, perror() and return value comments
- Tweak changelogs
- Moved error printout removal to other patch
- Zero bit CBM returns error
- Tweak comments
- Make get_shareable_mask() static
- Return directly without storing result into ret variable first
- llc -> LLC
- Altered changelog and removed "the whole time" wording because
llc occu results are still unsigned long
- Altered changelog's wording to not say "a volatile pointer"
- Make min_diff_percent and MIN_DIFF_PERCENT_PER_BIT unsigned long
- Add patch to restore CPU affinity after CAT test
- Move uparams clear into init function
- Add CPU vendor ID bitmask comment
- Use test_resource_feature_check(test) in CMT
- "feature" -> "resource" in function comment
v2:
- Postpone adding L2 CAT test as more investigations are necessary
- Add patch to remove ctrlc_handler() from wrong place
- Improvements to changelogs
- Function comments improvements & comment cleanups
- Move some parts of the changes into more logical patch
- If checks: buf == NULL -> !buf
- Variable naming:
- p -> buf
- cbm_mask_path -> cbm_path
- Function naming:
- get_cbm_mask() -> get_full_cbm()
- cache_size() -> cache_portion_size()
- Use PATH_MAX
- Improved cache_portion_size() parameter names
- int count -> unsigned int
- Pass filename to measurement taking functions instead of
resctrl_val_param
- !lines ? : reversal
- Removed bogus static from function local variable
- Open perf fd only once, reset & enable in the innermost test loop
- Add perf fd ioctl() error handling
- Add patch to change compiler optimization prevention "sink" from file
to volatile variable
- Remove cpu_no and resource (the latter was added in v1) members from
resctrl_val_param (pass uparams and test where those are needed)
- Removed ARRAY_SIZE() macro
- Add patch to rename "resource_id" to "domain_id"
Ilpo Järvinen (29):
selftests/resctrl: Convert perror() to ksft_perror() or
ksft_print_msg()
selftests/resctrl: Return -1 instead of errno on error
selftests/resctrl: Don't use ctrlc_handler() outside signal handling
selftests/resctrl: Change function comments to say < 0 on error
selftests/resctrl: Split fill_buf to allow tests finer-grained control
selftests/resctrl: Refactor fill_buf functions
selftests/resctrl: Refactor get_cbm_mask() and rename to
get_full_cbm()
selftests/resctrl: Mark get_cache_size() cache_type const
selftests/resctrl: Create cache_portion_size() helper
selftests/resctrl: Exclude shareable bits from schemata in CAT test
selftests/resctrl: Split measure_cache_vals()
selftests/resctrl: Split show_cache_info() to test specific and
generic parts
selftests/resctrl: Remove unnecessary __u64 -> unsigned long
conversion
selftests/resctrl: Remove nested calls in perf event handling
selftests/resctrl: Consolidate naming of perf event related things
selftests/resctrl: Improve perf init
selftests/resctrl: Convert perf related globals to locals
selftests/resctrl: Move cat_val() to cat_test.c and rename to
cat_test()
selftests/resctrl: Open perf fd before start & add error handling
selftests/resctrl: Replace file write with volatile variable
selftests/resctrl: Read in less obvious order to defeat prefetch
optimizations
selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test
selftests/resctrl: Restore the CPU affinity after CAT test
selftests/resctrl: Create struct for input parameters
selftests/resctrl: Introduce generalized test framework
selftests/resctrl: Pass write_schemata() resource instead of test name
selftests/resctrl: Add helper to convert L2/3 to integer
selftests/resctrl: Rename resource ID to domain ID
selftests/resctrl: Get domain id from cache id
tools/testing/selftests/resctrl/cache.c | 287 +++++----------
tools/testing/selftests/resctrl/cat_test.c | 337 +++++++++++-------
tools/testing/selftests/resctrl/cmt_test.c | 80 +++--
tools/testing/selftests/resctrl/fill_buf.c | 132 ++++---
tools/testing/selftests/resctrl/mba_test.c | 30 +-
tools/testing/selftests/resctrl/mbm_test.c | 32 +-
tools/testing/selftests/resctrl/resctrl.h | 135 +++++--
.../testing/selftests/resctrl/resctrl_tests.c | 197 ++++------
tools/testing/selftests/resctrl/resctrl_val.c | 138 +++----
tools/testing/selftests/resctrl/resctrlfs.c | 321 +++++++++++------
10 files changed, 945 insertions(+), 744 deletions(-)
--
2.30.2
KUnit tests often need to provide a struct device, and thus far have
mostly been using root_device_register() or platform devices to create
a 'fake device' for use with, e.g., code which uses device-managed
resources. This has several disadvantages, including not being designed
for test use, scattering files in sysfs, and requiring manual teardown
on test exit, which may not always be possible in case of failure.
Instead, introduce a set of helper functions which allow devices
(internally a struct kunit_device) to be created and managed by KUnit --
i.e., they will be automatically unregistered on test exit. These
helpers can either use a user-provided struct device_driver, or have one
automatically created and managed by KUnit. In both cases, the device
lives on a new kunit_bus.
This is a follow-up to a previous proposal here:
https://lore.kernel.org/linux-kselftest/20230325043104.3761770-1-davidgow@g…
(The kunit_defer() function in the first patch there has since been
merged as the 'deferred actions' feature.)
My intention is to take this whole series in via the kselftest/kunit
branch, but I'm equally okay with splitting up the later patches which
use this to go via the various subsystem trees in case there are merge
conflicts.
Cheers,
-- David
Signed-off-by: David Gow <davidgow(a)google.com>
---
Changes in v4:
- Update tags, fix a missing Signed-off-by.
- Link to v3: https://lore.kernel.org/r/20231214-kunit_bus-v3-0-7e9a287d3048@google.com
Changes in v3:
- Port the DRM tests to these new helpers (Thanks, Maxime!)
- Include the lib/kunit/device-impl.h file, which was missing from the
previous revision.
- Fix a use-after-free bug in kunit_device_driver_test, which resulted
in memory corruption on some clang-built UML builds.
- The 'test_state' is now allocated with kunit_kzalloc(), not on the
stack, as the stack will be gone when cleanup occurs.
- Link to v2: https://lore.kernel.org/r/20231208-kunit_bus-v2-0-e95905d9b325@google.com
Changes in v2:
- Simplify device/driver/bus matching, removing the no-longer-required
kunit_bus_match function. (Thanks, Greg)
- The return values are both more consistent (kunit_device_register now
returns an explicit error pointer, rather than failing the test), and
better documented.
- Add some basic documentation to the implementations as well as the
headers. The documentation in the headers is still more complete, and
is now properly compiled into the HTML documentation (under
dev-tools/kunit/api/resources.html). (Thanks, Matti)
- Moved the internal-only kunit_bus_init() function to a private header,
lib/kunit/device-impl.h to avoid polluting the public headers, and
match other internal-only headers. (Thanks, Greg)
- Alphabetise KUnit includes in other test modules. (Thanks, Amadeusz.)
- Several code cleanups, particularly around error handling and
allocation. (Thanks Greg, Maxime)
- Several const-correctness and casting improvements. (Thanks, Greg)
- Added a new test to verify KUnit cleanup triggers device cleanup.
(Thanks, Maxime).
- Improved the user-specified device test to verify that probe/remove
hooks are called correctly. (Thanks, Maxime).
- The overflow test no-longer needlessly calls
kunit_device_unregister().
- Several other minor cleanups and documentation improvements, which
hopefully make this a bit clearer and more robust.
- Link to v1: https://lore.kernel.org/r/20231205-kunit_bus-v1-0-635036d3bc13@google.com
---
David Gow (4):
kunit: Add APIs for managing devices
fortify: test: Use kunit_device
overflow: Replace fake root_device with kunit_device
ASoC: topology: Replace fake root_device with kunit_device in tests
Maxime Ripard (1):
drm/tests: Switch to kunit devices
Documentation/dev-tools/kunit/api/resource.rst | 9 ++
Documentation/dev-tools/kunit/usage.rst | 50 +++++++
drivers/gpu/drm/tests/drm_kunit_helpers.c | 66 +--------
include/kunit/device.h | 80 +++++++++++
lib/fortify_kunit.c | 5 +-
lib/kunit/Makefile | 3 +-
lib/kunit/device-impl.h | 17 +++
lib/kunit/device.c | 181 +++++++++++++++++++++++++
lib/kunit/kunit-test.c | 134 +++++++++++++++++-
lib/kunit/test.c | 3 +
lib/overflow_kunit.c | 5 +-
sound/soc/soc-topology-test.c | 10 +-
12 files changed, 485 insertions(+), 78 deletions(-)
---
base-commit: b285ba6f8cc1b2bfece0b4350fdb92c8780bc698
change-id: 20230718-kunit_bus-ab19c4ef48dc
Best regards,
--
David Gow <davidgow(a)google.com>
The per-process "locked-in-memory size" limit need not be a multiple of
page_size. If we try to mmap() locked memory with the same size as the
limit and the limit isn't a multiple of page_size, mmap() fails because
it rounds the requested size up to the next multiple of page_size,
which exceeds the limit.
Fix this by flooring the length passed to mmap() to the previous
multiple of page_size.
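A worked example of the flooring arithmetic, as a standalone sketch. The helper name floor_to_page() is hypothetical; the patch itself open-codes the same expression.

```c
#include <stddef.h>

/* Round len down to the previous multiple of page_size (page_size > 0). */
static size_t floor_to_page(size_t len, size_t page_size)
{
	return (len / page_size) * page_size;
}
```

For instance, with a 4096-byte page a limit of 65000 bytes floors to 15 * 4096 = 61440 bytes, while a limit of 8192 bytes is left unchanged.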
Fixes: 76fe17ef588a ("secretmem: test: add basic selftest for memfd_secret(2)")
Reported-by: "kernelci.org bot" <bot(a)kernelci.org>
Signed-off-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
---
tools/testing/selftests/mm/memfd_secret.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/mm/memfd_secret.c b/tools/testing/selftests/mm/memfd_secret.c
index 957b9e18c729..9b298f6a04b3 100644
--- a/tools/testing/selftests/mm/memfd_secret.c
+++ b/tools/testing/selftests/mm/memfd_secret.c
@@ -62,6 +62,9 @@ static void test_mlock_limit(int fd)
char *mem;
len = mlock_limit_cur;
+ if (len % page_size != 0)
+ len = (len/page_size) * page_size;
+
mem = mmap(NULL, len, prot, mode, fd, 0);
if (mem == MAP_FAILED) {
fail("unable to mmap secret memory\n");
--
2.42.0
Changes since v6:
* Remove inlines from arm_pmuv3.c
* Use format attribute mechanism from SPE
* Re-arrange attributes so that threshold comes last and can
potentially be extended
* Emit an error if the max threshold is exceeded rather than clamping
* Convert all register fields to GENMASK
Changes since v5:
* Restructure the docs and add some more explanations
* PMMIR.WIDTH -> PMMIR.THWIDTH in one comment
* Don't write EVTYPER.TC if TH is 0. Doesn't have any functional
effect but it might be a bit easier to understand the code.
* Expand the format field #define names
Changes since v4:
* Rebase onto v6.7-rc1, it no longer depends on kvmarm/next
* Remove change that moved ARMV8_PMU_EVTYPE_MASK to the asm files.
This actually depended on those files being included in a certain
order with arm_pmuv3.h to avoid circular includes. Now the
definition is done programmatically in arm_pmuv3.c instead.
Changes since v3:
* Drop #include changes to KVM source files because since
commit bc512d6a9b92 ("KVM: arm64: Make PMEVTYPER<n>_EL0.NSH RES0 if
EL2 isn't advertised"), KVM doesn't use ARMV8_PMU_EVTYPE_MASK
anymore
Changes since v2:
* Split threshold_control attribute into two, threshold_compare and
threshold_count so that it's easier to use
* Add some notes to the first commit message and the cover letter
about the behavior in KVM
* Update the docs commit with regards to the split attribute
Changes since v1:
* Fix build on aarch32 by disabling FEAT_PMUv3_TH and splitting event
type mask between the platforms
* Change armv8pmu_write_evtype() to take unsigned long instead of u64
so it isn't unnecessarily wide on aarch32
* Add UL suffix to aarch64 event type mask definition
----
FEAT_PMUv3_TH (Armv8.8) is a new feature that allows conditional
counting of PMU events depending on how much the event increments on
a single cycle. Two new config fields for perf_event_open have been
added, and a PMU cap file for reading the max_threshold. See the second
commit message and the docs in the last commit for more details.
The feature is not currently supported on KVM guests, and PMMIR is set
to read as zero, so it's not advertised as available. But it can be
added at a later time. Writes to PMEVTYPER.TC and TH from guests are
already RES0.
The change has been validated on the Arm FVP model:
# Zero values, works as expected (as before).
$ perf stat -e dtlb_walk/threshold=0,threshold_compare=0/ -- true
5962 dtlb_walk/threshold=0,threshold_compare=0/
# Threshold >= 255 causes count to be 0 because dtlb_walk doesn't
# increase by more than 1 per cycle.
$ perf stat -e dtlb_walk/threshold=255,threshold_compare=2/ -- true
0 dtlb_walk/threshold=255,threshold_compare=2/
# Keeping comparison as >= but lowering the threshold to 1 makes the
# count return.
$ perf stat -e dtlb_walk/threshold=1,threshold_compare=2/ -- true
6329 dtlb_walk/threshold=1,threshold_compare=2/
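The behaviour seen in the three runs above can be mimicked with a simplified software model. This is a sketch under stated assumptions: it models only the ">=" comparison used in the examples and treats a threshold of 0 as "thresholding disabled"; count_with_threshold() is a hypothetical helper, not driver code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model: threshold == 0 disables thresholding; otherwise a cycle
 * contributes to the count only when its increment is >= threshold.
 */
static uint64_t count_with_threshold(const unsigned int *inc_per_cycle,
				     size_t cycles, unsigned int threshold)
{
	uint64_t total = 0;

	for (size_t i = 0; i < cycles; i++) {
		if (threshold == 0 || inc_per_cycle[i] >= threshold)
			total += inc_per_cycle[i];
	}
	return total;
}
```

An event that never increments by more than 1 per cycle therefore counts nothing with a threshold of 255, and counts normally with a threshold of 1.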
James Clark (11):
arm: perf: Remove inlines from arm_pmuv3.c
arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_N
arm: perf: Use GENMASK for PMMIR fields
arm: perf: Convert remaining fields to use GENMASK
arm64: perf: Include threshold control fields in PMEVTYPER mask
arm: pmu: Share user ABI format mechanism with SPE
perf/arm_dmc620: Remove duplicate format attribute #defines
KVM: selftests: aarch64: Update tools copy of arm_pmuv3.h
arm: pmu: Move error message and -EOPNOTSUPP to individual PMUs
arm64: perf: Add support for event counting threshold
Documentation: arm64: Document the PMU event counting threshold
feature
Documentation/arch/arm64/perf.rst | 72 +++++++
arch/arm/kernel/perf_event_v7.c | 6 +-
arch/arm64/kvm/pmu-emul.c | 8 +-
arch/arm64/kvm/sys_regs.c | 4 +-
drivers/perf/apple_m1_cpu_pmu.c | 6 +-
drivers/perf/arm_dmc620_pmu.c | 22 +--
drivers/perf/arm_pmu.c | 11 +-
drivers/perf/arm_pmuv3.c | 175 ++++++++++++++----
drivers/perf/arm_spe_pmu.c | 22 ---
include/linux/perf/arm_pmu.h | 22 +++
include/linux/perf/arm_pmuv3.h | 34 ++--
tools/include/perf/arm_pmuv3.h | 43 +++--
.../kvm/aarch64/vpmu_counter_access.c | 5 +-
13 files changed, 296 insertions(+), 134 deletions(-)
--
2.34.1
Here is the 3rd part of converting net selftests to run in unique namespace.
This part converts all srv6 and fib tests.
Note that patch 06 is a fix for testing fib_nexthop_multiprefix.
Here is the part 1 link:
https://lore.kernel.org/netdev/20231202020110.362433-1-liuhangbin@gmail.com
And part 2 link:
https://lore.kernel.org/netdev/20231206070801.1691247-1-liuhangbin@gmail.com
v1 -> v2:
- indent the test result in case --- cut off the patch (Jakub Kicinski)
Hangbin Liu (13):
selftests/net: add variable NS_LIST for lib.sh
selftests/net: convert srv6_end_dt46_l3vpn_test.sh to run it in unique
namespace
selftests/net: convert srv6_end_dt4_l3vpn_test.sh to run it in unique
namespace
selftests/net: convert srv6_end_dt6_l3vpn_test.sh to run it in unique
namespace
selftests/net: convert fcnal-test.sh to run it in unique namespace
selftests/net: fix grep checking for fib_nexthop_multiprefix
selftests/net: convert fib_nexthop_multiprefix to run it in unique
namespace
selftests/net: convert fib_nexthop_nongw.sh to run it in unique
namespace
selftests/net: convert fib_nexthops.sh to run it in unique namespace
selftests/net: convert fib-onlink-tests.sh to run it in unique
namespace
selftests/net: convert fib_rule_tests.sh to run it in unique namespace
selftests/net: convert fib_tests.sh to run it in unique namespace
selftests/net: convert fdb_flush.sh to run it in unique namespace
tools/testing/selftests/net/fcnal-test.sh | 30 ++-
tools/testing/selftests/net/fdb_flush.sh | 11 +-
.../testing/selftests/net/fib-onlink-tests.sh | 9 +-
.../selftests/net/fib_nexthop_multiprefix.sh | 98 +++++-----
.../selftests/net/fib_nexthop_nongw.sh | 34 ++--
tools/testing/selftests/net/fib_nexthops.sh | 142 +++++++-------
tools/testing/selftests/net/fib_rule_tests.sh | 36 ++--
tools/testing/selftests/net/fib_tests.sh | 184 +++++++++---------
tools/testing/selftests/net/lib.sh | 8 +
tools/testing/selftests/net/settings | 2 +-
.../selftests/net/srv6_end_dt46_l3vpn_test.sh | 51 +++--
.../selftests/net/srv6_end_dt4_l3vpn_test.sh | 48 ++---
.../selftests/net/srv6_end_dt6_l3vpn_test.sh | 46 ++---
13 files changed, 332 insertions(+), 367 deletions(-)
--
2.43.0
This patchset adds two kfunc helpers, bpf_xdp_get_xfrm_state() and
bpf_xdp_xfrm_state_release() that wrap xfrm_state_lookup() and
xfrm_state_put(). The intent is to support software RSS (via XDP) for
the ongoing/upcoming ipsec pcpu work [0]. Recent experiments performed
on (hopefully) reproducible AWS testbeds indicate that single tunnel
pcpu ipsec can reach line rate on 100G ENA nics.
Note this patchset only tests/shows generic xfrm_state access. The
"secret sauce" (if you can really even call it that) involves accessing
a soon-to-be-upstreamed pcpu_num field in xfrm_state. Early example is
available here [1].
[0]: https://datatracker.ietf.org/doc/draft-ietf-ipsecme-multi-sa-performance/03/
[1]: https://github.com/danobi/xdp-tools/blob/e89a1c617aba3b50d990f779357d6ce286…
Changes from v5:
* Improve kfunc doc comments
* Remove extraneous replay-window setting on selftest reverse path
* Squash two kfunc commits into one
* Rebase to bpf-next to pick up bitfield write patches
* Remove testing of opts.error in selftest prog
Changes from v4:
* Fixup commit message for selftest
* Set opts->error -ENOENT for !x
* Revert single file xfrm + bpf
Changes from v3:
* Place all xfrm bpf integrations in xfrm_bpf.c
* Avoid using nval as a temporary
* Rebase to bpf-next
* Remove extraneous __failure_unpriv annotation for verifier tests
Changes from v2:
* Fix/simplify BPF_CORE_WRITE_BITFIELD() algorithm
* Added verifier tests for bitfield writes
* Fix state leakage across test_tunnel subtests
Changes from v1:
* Move xfrm tunnel tests to test_progs
* Fix writing to opts->error when opts is invalid
* Use __bpf_kfunc_start_defs()
* Remove unused vxlanhdr definition
* Add and use BPF_CORE_WRITE_BITFIELD() macro
* Make series bisect clean
Changes from RFCv2:
* Rebased to ipsec-next
* Fix netns leak
Changes from RFCv1:
* Add Antony's commit tags
* Add KF_ACQUIRE and KF_RELEASE semantics
Daniel Xu (5):
bpf: xfrm: Add bpf_xdp_get_xfrm_state() kfunc
bpf: selftests: test_tunnel: Setup fresh topology for each subtest
bpf: selftests: test_tunnel: Use vmlinux.h declarations
bpf: selftests: Move xfrm tunnel test to test_progs
bpf: xfrm: Add selftest for bpf_xdp_get_xfrm_state()
include/net/xfrm.h | 9 +
net/xfrm/Makefile | 1 +
net/xfrm/xfrm_policy.c | 2 +
net/xfrm/xfrm_state_bpf.c | 134 +++++++++++++++
.../selftests/bpf/prog_tests/test_tunnel.c | 162 +++++++++++++++++-
.../selftests/bpf/progs/bpf_tracing_net.h | 1 +
.../selftests/bpf/progs/test_tunnel_kern.c | 138 ++++++++-------
tools/testing/selftests/bpf/test_tunnel.sh | 92 ----------
8 files changed, 384 insertions(+), 155 deletions(-)
create mode 100644 net/xfrm/xfrm_state_bpf.c
--
2.42.1
This patchset adds two kfunc helpers, bpf_xdp_get_xfrm_state() and
bpf_xdp_xfrm_state_release() that wrap xfrm_state_lookup() and
xfrm_state_put(). The intent is to support software RSS (via XDP) for
the ongoing/upcoming ipsec pcpu work [0]. Recent experiments performed
on (hopefully) reproducible AWS testbeds indicate that single tunnel
pcpu ipsec can reach line rate on 100G ENA nics.
Note this patchset only tests/shows generic xfrm_state access. The
"secret sauce" (if you can really even call it that) involves accessing
a soon-to-be-upstreamed pcpu_num field in xfrm_state. Early example is
available here [1].
[0]: https://datatracker.ietf.org/doc/draft-ietf-ipsecme-multi-sa-performance/03/
[1]: https://github.com/danobi/xdp-tools/blob/e89a1c617aba3b50d990f779357d6ce286…
Changes from v4:
* Fixup commit message for selftest
* Set opts->error -ENOENT for !x
* Revert single file xfrm + bpf
Changes from v3:
* Place all xfrm bpf integrations in xfrm_bpf.c
* Avoid using nval as a temporary
* Rebase to bpf-next
* Remove extraneous __failure_unpriv annotation for verifier tests
Changes from v2:
* Fix/simplify BPF_CORE_WRITE_BITFIELD() algorithm
* Added verifier tests for bitfield writes
* Fix state leakage across test_tunnel subtests
Changes from v1:
* Move xfrm tunnel tests to test_progs
* Fix writing to opts->error when opts is invalid
* Use __bpf_kfunc_start_defs()
* Remove unused vxlanhdr definition
* Add and use BPF_CORE_WRITE_BITFIELD() macro
* Make series bisect clean
Changes from RFCv2:
* Rebased to ipsec-next
* Fix netns leak
Changes from RFCv1:
* Add Antony's commit tags
* Add KF_ACQUIRE and KF_RELEASE semantics
Daniel Xu (9):
bpf: xfrm: Add bpf_xdp_get_xfrm_state() kfunc
bpf: xfrm: Add bpf_xdp_xfrm_state_release() kfunc
libbpf: Add BPF_CORE_WRITE_BITFIELD() macro
bpf: selftests: test_loader: Support __btf_path() annotation
bpf: selftests: Add verifier tests for CO-RE bitfield writes
bpf: selftests: test_tunnel: Setup fresh topology for each subtest
bpf: selftests: test_tunnel: Use vmlinux.h declarations
bpf: selftests: Move xfrm tunnel test to test_progs
bpf: xfrm: Add selftest for bpf_xdp_get_xfrm_state()
include/net/xfrm.h | 9 +
net/xfrm/Makefile | 1 +
net/xfrm/xfrm_policy.c | 2 +
net/xfrm/xfrm_state_bpf.c | 130 ++++++++++++++
tools/lib/bpf/bpf_core_read.h | 32 ++++
.../selftests/bpf/prog_tests/test_tunnel.c | 162 +++++++++++++++++-
.../selftests/bpf/prog_tests/verifier.c | 2 +
tools/testing/selftests/bpf/progs/bpf_misc.h | 1 +
.../selftests/bpf/progs/bpf_tracing_net.h | 1 +
.../selftests/bpf/progs/test_tunnel_kern.c | 138 ++++++++-------
.../bpf/progs/verifier_bitfield_write.c | 100 +++++++++++
tools/testing/selftests/bpf/test_loader.c | 7 +
tools/testing/selftests/bpf/test_tunnel.sh | 92 ----------
13 files changed, 522 insertions(+), 155 deletions(-)
create mode 100644 net/xfrm/xfrm_state_bpf.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_bitfield_write.c
--
2.42.1
The test is inspired by the pmu_event_filter_test which is implemented for x86. On
the arm64 platform, there is the same ability to set the pmu_event_filter
through the KVM_ARM_VCPU_PMU_V3_FILTER attribute. So add the test for arm64.
The series first moves some common PMU code from vpmu_counter_access to
lib/aarch64/vpmu.c and include/aarch64/vpmu.h, so that it can be used by
pmu_event_filter_test. It then fixes a bug related to [enable|disable]_counter,
and finally implements the test itself.
Changelog:
----------
v1->v2:
- Improve the commit message. [Eric]
- Fix the bug in [enable|disable]_counter. [Raghavendra & Marc]
- Add a check that KVM has the KVM_ARM_VCPU_PMU_V3_FILTER attribute.
- Add a check that the host PMU supports the test event through pmceid0.
- Split the test_invalid_filter() to another patch. [Eric]
v1: https://lore.kernel.org/all/20231123063750.2176250-1-shahuang@redhat.com/
Shaoqin Huang (5):
KVM: selftests: aarch64: Make the [create|destroy]_vpmu_vm() public
KVM: selftests: aarch64: Move pmu helper functions into vpmu.h
KVM: selftests: aarch64: Fix the buggy [enable|disable]_counter
KVM: selftests: aarch64: Introduce pmu_event_filter_test
KVM: selftests: aarch64: Add invalid filter test in
pmu_event_filter_test
tools/testing/selftests/kvm/Makefile | 2 +
.../kvm/aarch64/pmu_event_filter_test.c | 267 ++++++++++++++++++
.../kvm/aarch64/vpmu_counter_access.c | 218 ++------------
.../selftests/kvm/include/aarch64/vpmu.h | 135 +++++++++
.../testing/selftests/kvm/lib/aarch64/vpmu.c | 74 +++++
5 files changed, 502 insertions(+), 194 deletions(-)
create mode 100644 tools/testing/selftests/kvm/aarch64/pmu_event_filter_test.c
create mode 100644 tools/testing/selftests/kvm/include/aarch64/vpmu.h
create mode 100644 tools/testing/selftests/kvm/lib/aarch64/vpmu.c
--
2.40.1
Hi all,
Here's the v3 series to improve resctrl selftests with a generalized test
framework and a rewritten CAT test. As agreed, v3 does not include the
group naming patch, which will become part of Maciej's non-contiguous
series. The error handling cleanups (return errno, perror() & return
value comment cleanups) and the CPU affinity restore for the CAT test add to
the patch count.
The series contains the following improvements:
- Excludes shareable bits from CAT test allocation to avoid interference
- Replaces file "sink" with a volatile variable
- Alters read pattern to defeat HW prefetcher optimizations
- Rewrites CAT test to make the CAT test reliable and truly measure
if CAT is working or not
- Introduces a generalized test framework, making it easier to add new tests
- Lots of other cleanups & refactoring
This series has been tested across a large number of systems from
different generations.
v3:
- New patches to handle return errno, perror() and return value comments
- Tweak changelogs
- Moved error printout removal to other patch
- Zero bit CBM returns error
- Tweak comments
- Make get_shareable_mask() static
- Return directly without storing result into ret variable first
- llc -> LLC
- Altered changelog and removed "the whole time" wording because
llc occu results are still unsigned long
- Altered changelog's wording to not say "a volatile pointer"
- Make min_diff_percent and MIN_DIFF_PERCENT_PER_BIT unsigned long
- Add patch to restore CPU affinity after CAT test
- Move uparams clear into init function
- Add CPU vendor ID bitmask comment
- Use test_resource_feature_check(test) in CMT
- "feature" -> "resource" in function comment
v2:
- Postpone adding L2 CAT test as more investigations are necessary
- Add patch to remove ctrlc_handler() from wrong place
- Improvements to changelogs
- Function comments improvements & comment cleanups
- Move some parts of the changes into more logical patch
- If checks: buf == NULL -> !buf
- Variable naming:
- p -> buf
- cbm_mask_path -> cbm_path
- Function naming:
- get_cbm_mask() -> get_full_cbm()
- cache_size() -> cache_portion_size()
- Use PATH_MAX
- Improved cache_portion_size() parameter names
- int count -> unsigned int
- Pass filename to measurement taking functions instead of
resctrl_val_param
- !lines ? : reversal
- Removed bogus static from function local variable
- Open perf fd only once, reset & enable in the innermost test loop
- Add perf fd ioctl() error handling
- Add patch to change compiler optimization prevention "sink" from file
to volatile variable
- Remove cpu_no and resource (the latter was added in v1) members from
resctrl_val_param (pass uparams and test where those are needed)
- Removed ARRAY_SIZE() macro
- Add patch to rename "resource_id" to "domain_id"
Ilpo Järvinen (29):
selftests/resctrl: Convert perror() to ksft_perror() or
ksft_print_msg()
selftests/resctrl: Return -1 instead of errno on error
selftests/resctrl: Don't use ctrlc_handler() outside signal handling
selftests/resctrl: Change function comments to say < 0 on error
selftests/resctrl: Split fill_buf to allow tests finer-grained control
selftests/resctrl: Refactor fill_buf functions
selftests/resctrl: Refactor get_cbm_mask() and rename to
get_full_cbm()
selftests/resctrl: Mark get_cache_size() cache_type const
selftests/resctrl: Create cache_portion_size() helper
selftests/resctrl: Exclude shareable bits from schemata in CAT test
selftests/resctrl: Split measure_cache_vals()
selftests/resctrl: Split show_cache_info() to test specific and
generic parts
selftests/resctrl: Remove unnecessary __u64 -> unsigned long
conversion
selftests/resctrl: Remove nested calls in perf event handling
selftests/resctrl: Consolidate naming of perf event related things
selftests/resctrl: Improve perf init
selftests/resctrl: Convert perf related globals to locals
selftests/resctrl: Move cat_val() to cat_test.c and rename to
cat_test()
selftests/resctrl: Open perf fd before start & add error handling
selftests/resctrl: Replace file write with volatile variable
selftests/resctrl: Read in less obvious order to defeat prefetch
optimizations
selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test
selftests/resctrl: Restore the CPU affinity after CAT test
selftests/resctrl: Create struct for input parameters
selftests/resctrl: Introduce generalized test framework
selftests/resctrl: Pass write_schemata() resource instead of test name
selftests/resctrl: Add helper to convert L2/3 to integer
selftests/resctrl: Rename resource ID to domain ID
selftests/resctrl: Get domain id from cache id
tools/testing/selftests/resctrl/cache.c | 287 +++++----------
tools/testing/selftests/resctrl/cat_test.c | 337 +++++++++++-------
tools/testing/selftests/resctrl/cmt_test.c | 80 +++--
tools/testing/selftests/resctrl/fill_buf.c | 132 ++++---
tools/testing/selftests/resctrl/mba_test.c | 30 +-
tools/testing/selftests/resctrl/mbm_test.c | 32 +-
tools/testing/selftests/resctrl/resctrl.h | 126 +++++--
.../testing/selftests/resctrl/resctrl_tests.c | 197 ++++------
tools/testing/selftests/resctrl/resctrl_val.c | 138 +++----
tools/testing/selftests/resctrl/resctrlfs.c | 321 +++++++++++------
10 files changed, 936 insertions(+), 744 deletions(-)
--
2.30.2
KUnit tests often need to provide a struct device, and thus far have
mostly been using root_device_register() or platform devices to create
a 'fake device' for use with, e.g., code which uses device-managed
resources. This has several disadvantages, including not being designed
for test use, scattering files in sysfs, and requiring manual teardown
on test exit, which may not always be possible in case of failure.
Instead, introduce a set of helper functions which allow devices
(internally a struct kunit_device) to be created and managed by KUnit --
i.e., they will be automatically unregistered on test exit. These
helpers can either use a user-provided struct device_driver, or have one
automatically created and managed by KUnit. In both cases, the device
lives on a new kunit_bus.
This is a follow-up to a previous proposal here:
https://lore.kernel.org/linux-kselftest/20230325043104.3761770-1-davidgow@g…
(The kunit_defer() function in the first patch there has since been
merged as the 'deferred actions' feature.)
My intention is to take this whole series in via the kselftest/kunit
branch, but I'm equally okay with splitting up the later patches which
use this to go via the various subsystem trees in case there are merge
conflicts.
Cheers,
-- David
Signed-off-by: David Gow <davidgow(a)google.com>
---
Changes in v3:
- Port the DRM tests to these new helpers (Thanks, Maxime!)
- Include the lib/kunit/device-impl.h file, which was missing from the
previous revision.
- Fix a use-after-free bug in kunit_device_driver_test, which resulted
in memory corruption on some clang-built UML builds.
- The 'test_state' is now allocated with kunit_kzalloc(), not on the
stack, as the stack will be gone when cleanup occurs.
- Link to v2: https://lore.kernel.org/r/20231208-kunit_bus-v2-0-e95905d9b325@google.com
Changes in v2:
- Simplify device/driver/bus matching, removing the no-longer-required
kunit_bus_match function. (Thanks, Greg)
- The return values are both more consistent (kunit_device_register now
returns an explicit error pointer, rather than failing the test), and
better documented.
- Add some basic documentation to the implementations as well as the
headers. The documentation in the headers is still more complete, and
is now properly compiled into the HTML documentation (under
dev-tools/kunit/api/resources.html). (Thanks, Matti)
- Moved the internal-only kunit_bus_init() function to a private header,
lib/kunit/device-impl.h to avoid polluting the public headers, and
match other internal-only headers. (Thanks, Greg)
- Alphabetise KUnit includes in other test modules. (Thanks, Amadeusz.)
- Several code cleanups, particularly around error handling and
allocation. (Thanks Greg, Maxime)
- Several const-correctness and casting improvements. (Thanks, Greg)
- Added a new test to verify KUnit cleanup triggers device cleanup.
(Thanks, Maxime).
- Improved the user-specified device test to verify that probe/remove
hooks are called correctly. (Thanks, Maxime).
- The overflow test no-longer needlessly calls
kunit_device_unregister().
- Several other minor cleanups and documentation improvements, which
hopefully make this a bit clearer and more robust.
- Link to v1: https://lore.kernel.org/r/20231205-kunit_bus-v1-0-635036d3bc13@google.com
---
David Gow (4):
kunit: Add APIs for managing devices
fortify: test: Use kunit_device
overflow: Replace fake root_device with kunit_device
ASoC: topology: Replace fake root_device with kunit_device in tests
Maxime Ripard (1):
drm/tests: Switch to kunit devices
Documentation/dev-tools/kunit/api/resource.rst | 9 ++
Documentation/dev-tools/kunit/usage.rst | 50 +++++++
drivers/gpu/drm/tests/drm_kunit_helpers.c | 66 +--------
include/kunit/device.h | 80 +++++++++++
lib/fortify_kunit.c | 5 +-
lib/kunit/Makefile | 3 +-
lib/kunit/device-impl.h | 17 +++
lib/kunit/device.c | 181 +++++++++++++++++++++++++
lib/kunit/kunit-test.c | 134 +++++++++++++++++-
lib/kunit/test.c | 3 +
lib/overflow_kunit.c | 5 +-
sound/soc/soc-topology-test.c | 10 +-
12 files changed, 485 insertions(+), 78 deletions(-)
---
base-commit: b285ba6f8cc1b2bfece0b4350fdb92c8780bc698
change-id: 20230718-kunit_bus-ab19c4ef48dc
Best regards,
--
David Gow <davidgow(a)google.com>
From: Paul Durrant <pdurrant(a)amazon.com>
There are four new patches in the series over what was in version 9 [1]:
* KVM: xen: separate initialization of shared_info cache and content
* KVM: xen: (re-)initialize shared_info if guest (32/64-bit) mode is set
These deal with a missing re-initialization of shared_info if either the
guest or VMM changes the 'long_mode' flag. This was discovered in testing
when the guest wallclock reverted to the Unix epoch because the pvclock
information in the shared_info page was not in the correct place, and so
the guest read zeroes instead.
* KVM: xen: don't block on pfncache locks in kvm_xen_set_evtchn_fast()
* KVM: pfncache: check the need for invalidation under read lock first
The first of these fixes a bug discovered when compiling the kernel with
CONFIG_PROVE_RAW_LOCK_NESTING: kvm_xen_set_evtchn_fast() can be called from
the callback of a HRTIMER_MODE_ABS_HARD timer and hence be executed in
IRQ context. It should therefore not block on any lock. Thus two
occurrences of read_lock() are converted to read_trylock(), which
kicks the code down a slow path if the lock cannot be taken.
The second patch removes a 'false' contention on the pfncache lock that
could result in taking that slow-path: the MMU notifier callback need only
take a pfncache read lock; it only need take a write lock if a match is
found.
Apart from these new patches...
* KVM: xen: split up kvm_xen_set_evtchn_fast()
... has been re-worked to (hopefully) improve readability and also validate
the 'correct' vcpu_info structure depending on whether the guest is in long
mode or not.
[1] https://lore.kernel.org/kvm/20231122121822.1042-1-paul@xen.org/
Paul Durrant (19):
KVM: pfncache: Add a map helper function
KVM: pfncache: remove unnecessary exports
KVM: xen: mark guest pages dirty with the pfncache lock held
KVM: pfncache: add a mark-dirty helper
KVM: pfncache: remove KVM_GUEST_USES_PFN usage
KVM: pfncache: stop open-coding offset_in_page()
KVM: pfncache: include page offset in uhva and use it consistently
KVM: pfncache: allow a cache to be activated with a fixed (userspace)
HVA
KVM: xen: separate initialization of shared_info cache and content
KVM: xen: (re-)initialize shared_info if guest (32/64-bit) mode is set
KVM: xen: allow shared_info to be mapped by fixed HVA
KVM: xen: allow vcpu_info to be mapped by fixed HVA
KVM: selftests / xen: map shared_info using HVA rather than GFN
KVM: selftests / xen: re-map vcpu_info using HVA rather than GPA
KVM: xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA capability
KVM: xen: split up kvm_xen_set_evtchn_fast()
KVM: xen: don't block on pfncache locks in kvm_xen_set_evtchn_fast()
KVM: pfncache: check the need for invalidation under read lock first
KVM: xen: allow vcpu_info content to be 'safely' copied
Documentation/virt/kvm/api.rst | 53 ++-
arch/x86/kvm/x86.c | 7 +-
arch/x86/kvm/xen.c | 358 +++++++++++-------
include/linux/kvm_host.h | 40 +-
include/linux/kvm_types.h | 8 -
include/uapi/linux/kvm.h | 9 +-
.../selftests/kvm/x86_64/xen_shinfo_test.c | 59 ++-
virt/kvm/pfncache.c | 185 ++++-----
8 files changed, 461 insertions(+), 258 deletions(-)
base-commit: 1ab097653e4dd8d23272d028a61352c23486fd4a
--
2.39.2
KUnit tests often need to provide a struct device, and thus far have
mostly been using root_device_register() or platform devices to create
a 'fake device' for use with, e.g., code which uses device-managed
resources. This has several disadvantages, including not being designed
for test use, scattering files in sysfs, and requiring manual teardown
on test exit, which may not always be possible in case of failure.
Instead, introduce a set of helper functions which allow devices
(internally a struct kunit_device) to be created and managed by KUnit --
i.e., they will be automatically unregistered on test exit. These
helpers can either use a user-provided struct device_driver, or have one
automatically created and managed by KUnit. In both cases, the device
lives on a new kunit_bus.
This is a follow-up to a previous proposal here:
https://lore.kernel.org/linux-kselftest/20230325043104.3761770-1-davidgow@g…
(The kunit_defer() function in the first patch there has since been
merged as the 'deferred actions' feature.)
My intention is to take this whole series in via the kselftest/kunit
branch, but I'm equally okay with splitting up the later patches which
use this to go via the various subsystem trees in case there are merge
conflicts.
I'd really appreciate any extra scrutiny that can be given to this;
particularly around the device refcounts and whether we can guarantee
that the device will be released at the correct point in the test
cleanup. I've seen a few crashes in kunit_cleanup, but only on some
already flaky/fragile UML/clang/alltests setups, which seem to go away
if I remove the devm_add_action() call (or if I enable any debugging
features / symbols, annoyingly).
Cheers,
-- David
Signed-off-by: David Gow <davidgow(a)google.com>
---
Changes in v2:
- Simplify device/driver/bus matching, removing the no-longer-required
kunit_bus_match function. (Thanks, Greg)
- The return values are both more consistent (kunit_device_register now
returns an explicit error pointer, rather than failing the test), and
better documented.
- Add some basic documentation to the implementations as well as the
headers. The documentation in the headers is still more complete, and
is now properly compiled into the HTML documentation (under
dev-tools/kunit/api/resources.html). (Thanks, Matti)
- Moved the internal-only kunit_bus_init() function to a private header,
lib/kunit/device-impl.h to avoid polluting the public headers, and
match other internal-only headers. (Thanks, Greg)
- Alphabetise KUnit includes in other test modules. (Thanks, Amadeusz.)
- Several code cleanups, particularly around error handling and
allocation. (Thanks Greg, Maxime)
- Several const-correctness and casting improvements. (Thanks, Greg)
- Added a new test to verify KUnit cleanup triggers device cleanup.
(Thanks, Maxime).
- Improved the user-specified device test to verify that probe/remove
hooks are called correctly. (Thanks, Maxime).
- The overflow test no-longer needlessly calls
kunit_device_unregister().
- Several other minor cleanups and documentation improvements, which
hopefully make this a bit clearer and more robust.
- Link to v1: https://lore.kernel.org/r/20231205-kunit_bus-v1-0-635036d3bc13@google.com
---
David Gow (4):
kunit: Add APIs for managing devices
fortify: test: Use kunit_device
overflow: Replace fake root_device with kunit_device
ASoC: topology: Replace fake root_device with kunit_device in tests
Documentation/dev-tools/kunit/api/resource.rst | 9 ++
Documentation/dev-tools/kunit/usage.rst | 50 +++++++
include/kunit/device.h | 80 +++++++++++
lib/fortify_kunit.c | 5 +-
lib/kunit/Makefile | 3 +-
lib/kunit/device.c | 181 +++++++++++++++++++++++++
lib/kunit/kunit-test.c | 134 +++++++++++++++++-
lib/kunit/test.c | 3 +
lib/overflow_kunit.c | 5 +-
sound/soc/soc-topology-test.c | 10 +-
10 files changed, 465 insertions(+), 15 deletions(-)
---
base-commit: c8613be119892ccceffbc550b9b9d7d68b995c9e
change-id: 20230718-kunit_bus-ab19c4ef48dc
Best regards,
--
David Gow <davidgow(a)google.com>
Alter the linker section of KUNIT_TABLE to move it out of INIT_DATA and
into DATA_DATA.
Data for KUnit tests does not need to be in the init section.
In order to run tests again after boot the KUnit data cannot be labeled as
init data as the kernel could write over it.
Add a KUNIT_INIT_TABLE in the next patch for KUnit tests that test init
data/functions.
Reviewed-by: David Gow <davidgow(a)google.com>
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
include/asm-generic/vmlinux.lds.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index bae0fe4d499b..1107905d37fc 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -370,7 +370,8 @@
BRANCH_PROFILE() \
TRACE_PRINTKS() \
BPF_RAW_TP() \
- TRACEPOINT_STR()
+ TRACEPOINT_STR() \
+ KUNIT_TABLE()
/*
* Data section helpers
@@ -699,8 +700,7 @@
THERMAL_TABLE(governor) \
EARLYCON_TABLE() \
LSM_TABLE() \
- EARLY_LSM_TABLE() \
- KUNIT_TABLE()
+ EARLY_LSM_TABLE()
#define INIT_TEXT \
*(.init.text .init.text.*) \
base-commit: b285ba6f8cc1b2bfece0b4350fdb92c8780bc698
--
2.43.0.472.g3155946c3a-goog
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Add a test that writes long strings, some over the size of the sub buffer,
and makes sure that the entire content is there.
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v2: https://lore.kernel.org/linux-trace-kernel/20231212151632.25c9b67d@gandalf.…
- Realized with the upcoming change of the dynamic subbuffer sizes, that
this test will fail if the subbuffer is bigger than what the trace_seq
can hold. Now the trace_marker does not always utilize the full subbuffer
but the size of the trace_seq instead. As that size isn't available to
user space, we can only just make sure all content is there.
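As a rough illustration of the "all content is there" check, here is a minimal sketch that joins the payload of every tracing_mark_write event back into one string, and separately reads only the first event. The helper names and the sample trace lines are made up for illustration; the awk expressions mirror the ones used in the test, but this runs on a scratch file rather than a live tracefs instance.

```shell
#!/bin/sh
# Hypothetical helper: join the payload of every tracing_mark_write event
# found in the trace-style text file $1 into a single string.
join_marker_events() {
	awk '/tracing_mark_write:/ {
		sub(/^.*tracing_mark_write: */, "")
		printf "%s", $0
	}' "$1"
}

# Hypothetical helper: return only the payload of the first event in $1.
first_marker_event() {
	awk '/tracing_mark_write:/ {
		sub(/^.*tracing_mark_write: */, "")
		printf "%s", $0
		exit
	}' "$1"
}

# Demo on a made-up sample instead of a live tracefs "trace" file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# tracer: nop
            bash-123     [000] .....   10.000001: tracing_mark_write: XXXX
            bash-123     [000] .....   10.000002: tracing_mark_write: XXXY
EOF

whole=$(join_marker_events "$sample")
first=$(first_marker_event "$sample")
echo "first=$first whole=$whole"
rm -f "$sample"
```

If the write was split, the first event alone differs from the written string, while the joined payload should still match it.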
.../ftrace/test.d/00basic/trace_marker.tc | 82 +++++++++++++++++++
1 file changed, 82 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
new file mode 100755
index 000000000000..b24aff5807df
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
@@ -0,0 +1,82 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Basic tests on writing to trace_marker
+# requires: trace_marker
+# flags: instance
+
+get_buffer_data_size() {
+ sed -ne 's/^.*data.*size:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_buffer_data_offset() {
+ sed -ne 's/^.*data.*offset:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_event_header_size() {
+ type_len=`sed -ne 's/^.*type_len.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ time_len=`sed -ne 's/^.*time_delta.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ array_len=`sed -ne 's/^.*array.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ total_bits=$((type_len+time_len+array_len))
+ total_bits=$((total_bits+7))
+ echo $((total_bits/8))
+}
+
+get_print_event_buf_offset() {
+ sed -ne 's/^.*buf.*offset:\([0-9][0-9]*\).*/\1/p' events/ftrace/print/format
+}
+
+event_header_size=`get_event_header_size`
+print_header_size=`get_print_event_buf_offset`
+
+data_offset=`get_buffer_data_offset`
+
+marker_meta=$((event_header_size+print_header_size))
+
+make_str() {
+ cnt=$1
+ # subtract two for \n\0 as marker adds these
+ cnt=$((cnt-2))
+ printf -- 'X%.0s' $(seq $cnt)
+}
+
+write_buffer() {
+ size=$1
+
+ str=`make_str $size`
+
+ # clear the buffer
+ echo > trace
+
+ # write the string into the marker
+ echo -n $str > trace_marker
+
+ echo $str
+}
+
+test_buffer() {
+
+ size=`get_buffer_data_size`
+ oneline_size=$((size-marker_meta))
+ echo size = $size
+ echo meta size = $marker_meta
+
+ # Now add a little more the meta data overhead will overflow
+
+ str=`write_buffer $size`
+
+ # Make sure the line was broken
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; exit}' trace`
+
+ if [ "$new_str" = "$str" ]; then
+ exit fail;
+ fi
+
+ # Make sure the entire line can be found
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: */,"");printf "%s", $0; }' trace`
+
+ if [ "$new_str" != "$str" ]; then
+ exit fail;
+ fi
+}
+
+test_buffer
--
2.42.0
When we dynamically generate a name for a configuration in get-reg-list
we use strcat() to append to a buffer allocated using malloc() but we
never initialise that buffer. Since malloc() offers no guarantees
regarding the contents of the memory it returns this can lead to us
corrupting, and likely overflowing, the buffer:
vregs: PASS
vregs+pmu: PASS
sve: PASS
sve+pmu: PASS
vregs+pauth_address+pauth_generic: PASS
X�vr+gspauth_addre+spauth_generi+pmu: PASS
Initialise the buffer to an empty string to avoid this.
Fixes: 2f9ace5d4557 ("KVM: arm64: selftests: get-reg-list: Introduce vcpu configs")
Reviewed-by: Andrew Jones <ajones(a)ventanamicro.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v3:
- Rebase this bugfix onto v6.7-rc1
- Link to v2: https://lore.kernel.org/r/20231017-kvm-get-reg-list-str-init-v2-1-ee30b1df3…
Changes in v2:
- Update Fixes: tag.
- Link to v1: https://lore.kernel.org/r/20231013-kvm-get-reg-list-str-init-v1-1-034f370ff…
---
tools/testing/selftests/kvm/get-reg-list.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kvm/get-reg-list.c b/tools/testing/selftests/kvm/get-reg-list.c
index be7bf5224434..dd62a6976c0d 100644
--- a/tools/testing/selftests/kvm/get-reg-list.c
+++ b/tools/testing/selftests/kvm/get-reg-list.c
@@ -67,6 +67,7 @@ static const char *config_name(struct vcpu_reg_list *c)
c->name = malloc(len);
+ c->name[0] = '\0';
len = 0;
for_each_sublist(c, s) {
if (!strcmp(s->name, "base"))
---
base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
change-id: 20231012-kvm-get-reg-list-str-init-76c8ed4e19d6
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Add a test to exercise cpu hotplug with the function tracer active to
ensure that sensitive functions in the idle path are excluded from being
traced. This helps catch issues such as the one fixed by commit
4b3338aaa74d ("powerpc/ftrace: Fix stack teardown in ftrace_no_trace").
Signed-off-by: Naveen N Rao <naveen(a)kernel.org>
---
.../ftrace/test.d/ftrace/func_hotplug.tc | 30 +++++++++++++++++++
1 file changed, 30 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc
diff --git a/tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc b/tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc
new file mode 100644
index 000000000000..49731a2b5c23
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/ftrace/func_hotplug.tc
@@ -0,0 +1,30 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: ftrace - function trace across cpu hotplug
+# requires: function:tracer
+
+if ! which nproc ; then
+ nproc() {
+ ls -d /sys/devices/system/cpu/cpu[0-9]* | wc -l
+ }
+fi
+
+NP=`nproc`
+
+if [ $NP -eq 1 ] ;then
+ echo "We can not test cpu hotplug in UP environment"
+ exit_unresolved
+fi
+
+echo 0 > tracing_on
+echo > trace
+: "Set CPU1 offline/online with function tracer enabled"
+echo function > current_tracer
+echo 1 > tracing_on
+(echo 0 > /sys/devices/system/cpu/cpu1/online)
+(echo "forked"; sleep 1)
+(echo 1 > /sys/devices/system/cpu/cpu1/online)
+echo 0 > tracing_on
+
+: "Check CPU1 events are recorded"
+grep -q -e "\[001\]" trace
base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
--
2.43.0
Alter the linker section of KUNIT_TABLE to move it out of INIT_DATA and
into DATA_DATA.
Data for KUnit tests does not need to be in the init section.
In order to run tests again after boot the KUnit data cannot be labeled as
init data as the kernel could write over it.
Add a KUNIT_INIT_TABLE in the next patch for KUnit tests that test init
data/functions.
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
Changes since v3:
- No changes
include/asm-generic/vmlinux.lds.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index bae0fe4d499b..1107905d37fc 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -370,7 +370,8 @@
BRANCH_PROFILE() \
TRACE_PRINTKS() \
BPF_RAW_TP() \
- TRACEPOINT_STR()
+ TRACEPOINT_STR() \
+ KUNIT_TABLE()
/*
* Data section helpers
@@ -699,8 +700,7 @@
THERMAL_TABLE(governor) \
EARLYCON_TABLE() \
LSM_TABLE() \
- EARLY_LSM_TABLE() \
- KUNIT_TABLE()
+ EARLY_LSM_TABLE()
#define INIT_TEXT \
*(.init.text .init.text.*) \
base-commit: b285ba6f8cc1b2bfece0b4350fdb92c8780bc698
--
2.43.0.472.g3155946c3a-goog
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Now that the trace_marker can write up to the max size of the sub buffer,
add a test to see if it actually can happen.
The README is updated to state that the trace_marker writes can be broken
up, and the test checks the README for that statement so that it does not
fail on older kernels that do not support this.
If the README does not have the specified update, the test will still test
whether the whole string is written, as that should work with older kernels.
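The README gating described above could be sketched roughly as follows. The helper name and the two sample README files are hypothetical; only the searched sentence comes from the patch. This is a sketch of the idea, not the test itself.

```shell
#!/bin/sh
MARKER_SPLIT_NOTE="May be broken into multiple events based on sub-buffer size"

# Hypothetical helper: succeed only if the README file named by $1 exists
# and carries the sub-buffer note.
readme_has_split_note() {
	[ -n "$1" ] && [ -f "$1" ] && grep -q "$MARKER_SPLIT_NOTE" "$1"
}

# Demo on two made-up README files rather than the real tracefs README.
new_readme=$(mktemp)
old_readme=$(mktemp)
printf '%s\n' "  trace_marker - Writes into this file writes into the kernel buffer" > "$old_readme"
cp "$old_readme" "$new_readme"
printf '%s\n' "  $MARKER_SPLIT_NOTE." >> "$new_readme"

# Only run the single-event check when the kernel advertises the behaviour.
testone_new=0
if readme_has_split_note "$new_readme"; then
	testone_new=1
fi

testone_old=0
if readme_has_split_note "$old_readme"; then
	testone_old=1
fi

echo "new README: testone=$testone_new, old README: testone=$testone_old"
rm -f "$new_readme" "$old_readme"
```

Keying off the README text lets the same script run on older kernels, where only the "whole string is present" check applies.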
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: linux-kselftest(a)vger.kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v1: https://lore.kernel.org/linux-trace-kernel/20231212135441.0337c3e9@gandalf.…
- Fix description as it was a cut and paste from the subbuffer size tests
that are not added yet.
kernel/trace/trace.c | 1 +
.../ftrace/test.d/00basic/trace_marker.tc | 112 ++++++++++++++++++
2 files changed, 113 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 2f8d59834c00..cbfcdd882590 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5595,6 +5595,7 @@ static const char readme_msg[] =
" delta: Delta difference against a buffer-wide timestamp\n"
" absolute: Absolute (standalone) timestamp\n"
"\n trace_marker\t\t- Writes into this file writes into the kernel buffer\n"
+ "\n May be broken into multiple events based on sub-buffer size.\n"
"\n trace_marker_raw\t\t- Writes into this file writes binary data into the kernel buffer\n"
" tracing_cpumask\t- Limit which CPUs to trace\n"
" instances\t\t- Make sub-buffers with: mkdir instances/foo\n"
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
new file mode 100755
index 000000000000..bf7f6f50c88a
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
@@ -0,0 +1,112 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Basic tests on writing to trace_marker
+# requires: trace_marker
+# flags: instance
+
+get_buffer_data_size() {
+ sed -ne 's/^.*data.*size:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_buffer_data_offset() {
+ sed -ne 's/^.*data.*offset:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_event_header_size() {
+ type_len=`sed -ne 's/^.*type_len.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ time_len=`sed -ne 's/^.*time_delta.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ array_len=`sed -ne 's/^.*array.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ total_bits=$((type_len+time_len+array_len))
+ total_bits=$((total_bits+7))
+ echo $((total_bits/8))
+}
+
+get_print_event_buf_offset() {
+ sed -ne 's/^.*buf.*offset:\([0-9][0-9]*\).*/\1/p' events/ftrace/print/format
+}
+
+event_header_size=`get_event_header_size`
+print_header_size=`get_print_event_buf_offset`
+
+# Find the README
+README=""
+if [ -f README ]; then
+ README="README"
+# instance?
+elif [ -f ../../README ]; then
+ README="../../README"
+fi
+
+testone=0
+if [ ! -z "$README" ]; then
+ if grep -q "May be broken into multiple events based on sub-buffer size" $README; then
+ testone=1
+ fi
+fi
+
+data_offset=`get_buffer_data_offset`
+
+marker_meta=$((event_header_size+print_header_size))
+
+make_str() {
+ cnt=$1
+ # subtract two for \n\0 as marker adds these
+ cnt=$((cnt-2))
+ printf -- 'X%.0s' $(seq $cnt)
+}
+
+write_buffer() {
+ size=$1
+
+ str=`make_str $size`
+
+ # clear the buffer
+ echo > trace
+
+ # write the string into the marker
+ echo -n $str > trace_marker
+
+ echo $str
+}
+
+test_buffer() {
+
+ size=`get_buffer_data_size`
+ oneline_size=$((size-marker_meta))
+ echo size = $size
+ echo meta size = $marker_meta
+
+ if [ $testone -eq 1 ]; then
+ echo oneline size = $oneline_size
+
+ str=`write_buffer $oneline_size`
+
+ # Should be in one single event
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: */,"");printf "%s", $0; exit}' trace`
+
+ if [ "$new_str" != "$str" ]; then
+ exit fail;
+ fi
+ fi
+
+ # Now add a little more the meta data overhead will overflow
+
+ str=`write_buffer $size`
+
+ # Make sure the line was broken
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; exit}' trace`
+
+ if [ "$new_str" = "$str" ]; then
+ exit fail;
+ fi
+
+ # Make sure the entire line can be found
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: */,"");printf "%s", $0; }' trace`
+
+ if [ "$new_str" != "$str" ]; then
+ exit fail;
+ fi
+}
+
+test_buffer
+
--
2.42.0
Here is the 3rd part of converting net selftests to run in unique namespace.
This part converts all srv6 and fib tests.
Note that patch 06 is a fix for testing fib_nexthop_multiprefix.
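For readers who have not seen parts 1 and 2, the idea of a "unique namespace" can be sketched as below. This is only an illustration under assumptions: register_ns and NS_NAME are hypothetical names, not the helpers in lib.sh, and the sketch only generates and records names, since creating a namespace with "ip netns add" needs root.

```shell
#!/bin/sh
# List of every namespace name handed out, so that cleanup can walk it later.
NS_LIST=""

# Hypothetical helper: derive a unique name from the base name $1, leave it
# in NS_NAME and remember it in NS_LIST.
register_ns() {
	NS_NAME="$1-$(mktemp -u XXXXXX)"
	NS_LIST="$NS_LIST $NS_NAME"
	# A real test would now run: ip netns add "$NS_NAME"
}

register_ns ns1
first=$NS_NAME
register_ns ns1
second=$NS_NAME

echo "names handed out:$NS_LIST"
```

Because each run gets its own random suffix, two tests that both ask for "ns1" no longer collide when they run at the same time.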
Here is the part 1 link:
https://lore.kernel.org/netdev/20231202020110.362433-1-liuhangbin@gmail.com
And part 2 link:
https://lore.kernel.org/netdev/20231206070801.1691247-1-liuhangbin@gmail.com
Hangbin Liu (13):
selftests/net: add variable NS_LIST for lib.sh
selftests/net: convert srv6_end_dt46_l3vpn_test.sh to run it in unique
namespace
selftests/net: convert srv6_end_dt4_l3vpn_test.sh to run it in unique
namespace
selftests/net: convert srv6_end_dt6_l3vpn_test.sh to run it in unique
namespace
selftests/net: convert fcnal-test.sh to run it in unique namespace
selftests/net: fix grep checking for fib_nexthop_multiprefix
selftests/net: convert fib_nexthop_multiprefix to run it in unique
namespace
selftests/net: convert fib_nexthop_nongw.sh to run it in unique
namespace
selftests/net: convert fib_nexthops.sh to run it in unique namespace
selftests/net: convert fib-onlink-tests.sh to run it in unique
namespace
selftests/net: convert fib_rule_tests.sh to run it in unique namespace
selftests/net: convert fib_tests.sh to run it in unique namespace
selftests/net: convert fdb_flush.sh to run it in unique namespace
tools/testing/selftests/net/fcnal-test.sh | 30 ++-
tools/testing/selftests/net/fdb_flush.sh | 11 +-
.../testing/selftests/net/fib-onlink-tests.sh | 9 +-
.../selftests/net/fib_nexthop_multiprefix.sh | 98 +++++-----
.../selftests/net/fib_nexthop_nongw.sh | 34 ++--
tools/testing/selftests/net/fib_nexthops.sh | 142 +++++++-------
tools/testing/selftests/net/fib_rule_tests.sh | 36 ++--
tools/testing/selftests/net/fib_tests.sh | 184 +++++++++---------
tools/testing/selftests/net/lib.sh | 8 +
tools/testing/selftests/net/settings | 2 +-
.../selftests/net/srv6_end_dt46_l3vpn_test.sh | 51 +++--
.../selftests/net/srv6_end_dt4_l3vpn_test.sh | 48 ++---
.../selftests/net/srv6_end_dt6_l3vpn_test.sh | 46 ++---
13 files changed, 332 insertions(+), 367 deletions(-)
--
2.43.0
Changes from v1
(https://lore.kernel.org/damon/20231212191206.52917-1-sj@kernel.org/)
- Fix conflicts on latest mm-unstable tree
Changes from RFC
(https://lore.kernel.org/damon/20231202000806.46210-1-sj@kernel.org/)
- Make the working set size estimation test more reliable
- Wordsmith coverletter and commit messages
- Rename _damon.py to _damon_sysfs.py
DAMON exports most of its functionality via its sysfs interface. Hence
most DAMON functionality tests could be implemented using the interface.
However, because the interfaces require simple but multiple operations
for many controls, writing all such tests from scratch could be
repetitive and time-consuming.
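To give a feel for the "simple but multiple operations", here is a minimal sketch of the writes needed just to start monitoring one process. The helper names are hypothetical, and the file layout is my understanding of the DAMON sysfs interface rather than something taken from this series, so treat the paths as assumptions. The demo runs against a scratch directory, so it needs neither root nor a DAMON-enabled kernel.

```shell
#!/bin/sh
# Root of the DAMON sysfs tree (assumed default). Overridable so the sketch
# can be tried on a scratch directory.
DAMON_ROOT=${DAMON_ROOT:-/sys/kernel/mm/damon/admin}

# Hypothetical helper: write value $2 to the file $1 below DAMON_ROOT.
damon_write() {
	printf '%s\n' "$2" > "$DAMON_ROOT/$1"
}

# Hypothetical helper: the write sequence to monitor the virtual address
# space of pid $1 with one kdamond, one context and one target.
damon_start_vaddr() {
	damon_write kdamonds/nr_kdamonds 1 &&
	damon_write kdamonds/0/contexts/nr_contexts 1 &&
	damon_write kdamonds/0/contexts/0/operations vaddr &&
	damon_write kdamonds/0/contexts/0/targets/nr_targets 1 &&
	damon_write kdamonds/0/contexts/0/targets/0/pid_target "$1" &&
	damon_write kdamonds/0/state on
}

# Demo against a scratch directory. The real sysfs creates the numbered
# directories when the nr_* files are written; here they are made by hand.
DAMON_ROOT=$(mktemp -d)
mkdir -p "$DAMON_ROOT/kdamonds/0/contexts/0/targets/0"
damon_start_vaddr 1234
started=$?
pid_seen=$(cat "$DAMON_ROOT/kdamonds/0/contexts/0/targets/0/pid_target")
state_seen=$(cat "$DAMON_ROOT/kdamonds/0/state")
echo "rc=$started pid_target=$pid_seen state=$state_seen"
rm -rf "$DAMON_ROOT"
```

Six writes for the simplest possible setup is the kind of repetition a shared control module is meant to hide from each test.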
Implement a minimum DAMON sysfs control module, and a couple of DAMON
functionality tests using the control module. The first test is for
ensuring minimum accuracy of data access monitoring, and the second test
is for finding if a previously found and fixed bug is introduced again.
Note that the DAMON sysfs control module is only for avoiding
duplicating code in tests. For convenient and general control of DAMON,
users should use DAMON user-space tools that are developed for the purpose,
such as damo[1].
[1] https://github.com/damonitor/damo
Patches Sequence
----------------
This patchset is constructed with five patches. The first three patches
implement a DAMON sysfs control module, written in Python, for use in
tests. The implementation is incrementally done in the
sequence of the basic data structure (first patch) first, kdamonds start
command (second patch) next, and finally DAMOS tried bytes update
command (third patch).
Then two patches for implementing selftests using the module follow.
The fourth patch implements a basic functionality test of DAMON for
working set estimation accuracy. Finally, the fifth patch implements a
corner case test for a previously found bug.
SeongJae Park (5):
selftests/damon: implement a python module for test-purpose DAMON
sysfs controls
selftests/damon/_damon_sysfs: implement kdamonds start function
selftests/damon/_damon_sysfs: implement updat_schemes_tried_bytes
command
selftests/damon: add a test for update_schemes_tried_regions sysfs
command
selftests/damon: add a test for update_schemes_tried_regions hang bug
tools/testing/selftests/damon/Makefile | 3 +
tools/testing/selftests/damon/_damon_sysfs.py | 322 ++++++++++++++++++
tools/testing/selftests/damon/access_memory.c | 41 +++
...sysfs_update_schemes_tried_regions_hang.py | 33 ++
...te_schemes_tried_regions_wss_estimation.py | 55 +++
5 files changed, 454 insertions(+)
create mode 100644 tools/testing/selftests/damon/_damon_sysfs.py
create mode 100644 tools/testing/selftests/damon/access_memory.c
create mode 100755 tools/testing/selftests/damon/sysfs_update_schemes_tried_regions_hang.py
create mode 100755 tools/testing/selftests/damon/sysfs_update_schemes_tried_regions_wss_estimation.py
base-commit: 091b8c820de390a6235595bdb281edab63b9befe
--
2.34.1
Changes from RFC
(https://lore.kernel.org/damon/20231202000806.46210-1-sj@kernel.org/)
- Make the working set size estimation test more reliable
- Wordsmith coverletter and commit messages
- Rename _damon.py to _damon_sysfs.py
DAMON exports most of its functionality via its sysfs interface. Hence
most DAMON functionality tests could be implemented using the interface.
However, because the interfaces require simple but multiple operations
for many controls, writing all such tests from scratch could be
repetitive and time-consuming.
Implement a minimum DAMON sysfs control module, and a couple of DAMON
functionality tests using the control module. The first test is for
ensuring minimum accuracy of data access monitoring, and the second test
is for finding if a previously found and fixed bug is introduced again.
Note that the DAMON sysfs control module is only for avoiding
duplicating code in tests. For convenient and general control of DAMON,
users should use DAMON user-space tools that are developed for the purpose,
such as damo[1].
[1] https://github.com/damonitor/damo
Patches Sequence
----------------
This patchset is constructed with five patches. The first three patches
implement a DAMON sysfs control module, written in Python, for use in
tests. The implementation is incrementally done in the
sequence of the basic data structure (first patch) first, kdamonds start
command (second patch) next, and finally DAMOS tried bytes update
command (third patch).
Then two patches for implementing selftests using the module follow.
The fourth patch implements a basic functionality test of DAMON for
working set estimation accuracy. Finally, the fifth patch implements a
corner case test for a previously found bug.
SeongJae Park (5):
selftests/damon: implement a python module for test-purpose DAMON
sysfs controls
selftests/damon/_damon_sysfs: implement kdamonds start function
selftests/damon/_damon_sysfs: implement updat_schemes_tried_bytes
command
selftests/damon: add a test for update_schemes_tried_regions sysfs
command
selftests/damon: add a test for update_schemes_tried_regions hang bug
tools/testing/selftests/damon/Makefile | 3 +
tools/testing/selftests/damon/_damon_sysfs.py | 322 ++++++++++++++++++
tools/testing/selftests/damon/access_memory.c | 41 +++
...sysfs_update_schemes_tried_regions_hang.py | 33 ++
...te_schemes_tried_regions_wss_estimation.py | 55 +++
5 files changed, 454 insertions(+)
create mode 100644 tools/testing/selftests/damon/_damon_sysfs.py
create mode 100644 tools/testing/selftests/damon/access_memory.c
create mode 100755 tools/testing/selftests/damon/sysfs_update_schemes_tried_regions_hang.py
create mode 100755 tools/testing/selftests/damon/sysfs_update_schemes_tried_regions_wss_estimation.py
base-commit: 5794dfaf6d1be564b0912d51d8a714baff329495
--
2.34.1
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Now that the trace_marker can write up to the max size of the sub buffer,
add a test to see if it actually can happen.
The README is updated to state that the trace_marker writes can be broken
up, and the test checks the README for that statement so that it does not
fail on older kernels that do not support this.
If the README does not have the specified update, the test will still test
whether the whole string is written (although it would be broken up), as that
should work with older kernels.
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: linux-kselftest(a)vger.kernel.org
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/trace.c | 1 +
.../ftrace/test.d/00basic/trace_marker.tc | 112 ++++++++++++++++++
2 files changed, 113 insertions(+)
create mode 100755 tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 2f8d59834c00..cbfcdd882590 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5595,6 +5595,7 @@ static const char readme_msg[] =
" delta: Delta difference against a buffer-wide timestamp\n"
" absolute: Absolute (standalone) timestamp\n"
"\n trace_marker\t\t- Writes into this file writes into the kernel buffer\n"
+ "\n May be broken into multiple events based on sub-buffer size.\n"
"\n trace_marker_raw\t\t- Writes into this file writes binary data into the kernel buffer\n"
" tracing_cpumask\t- Limit which CPUs to trace\n"
" instances\t\t- Make sub-buffers with: mkdir instances/foo\n"
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
new file mode 100755
index 000000000000..bcb2dc6b8a66
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/00basic/trace_marker.tc
@@ -0,0 +1,112 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Change the ringbuffer sub-buffer size
+# requires: trace_marker
+# flags: instance
+
+get_buffer_data_size() {
+ sed -ne 's/^.*data.*size:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_buffer_data_offset() {
+ sed -ne 's/^.*data.*offset:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_event_header_size() {
+ type_len=`sed -ne 's/^.*type_len.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ time_len=`sed -ne 's/^.*time_delta.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ array_len=`sed -ne 's/^.*array.*:[^0-9]*\([0-9][0-9]*\).*/\1/p' events/header_event`
+ total_bits=$((type_len+time_len+array_len))
+ total_bits=$((total_bits+7))
+ echo $((total_bits/8))
+}
+
+get_print_event_buf_offset() {
+ sed -ne 's/^.*buf.*offset:\([0-9][0-9]*\).*/\1/p' events/ftrace/print/format
+}
+
+event_header_size=`get_event_header_size`
+print_header_size=`get_print_event_buf_offset`
+
+# Find the README
+README=""
+if [ -f README ]; then
+ README="README"
+# instance?
+elif [ -f ../../README ]; then
+ README="../../README"
+fi
+
+testone=0
+if [ ! -z "$README" ]; then
+ if grep -q "May be broken into multiple events based on sub-buffer size" $README; then
+ testone=1
+ fi
+fi
+
+data_offset=`get_buffer_data_offset`
+
+marker_meta=$((event_header_size+print_header_size))
+
+make_str() {
+ cnt=$1
+ # subtract two for \n\0 as marker adds these
+ cnt=$((cnt-2))
+ printf -- 'X%.0s' $(seq $cnt)
+}
+
+write_buffer() {
+ size=$1
+
+ str=`make_str $size`
+
+ # clear the buffer
+ echo > trace
+
+ # write the string into the marker
+ echo -n $str > trace_marker
+
+ echo $str
+}
+
+test_buffer() {
+
+ size=`get_buffer_data_size`
+ oneline_size=$((size-marker_meta))
+ echo size = $size
+ echo meta size = $marker_meta
+
+ if [ $testone -eq 1 ]; then
+ echo oneline size = $oneline_size
+
+ str=`write_buffer $oneline_size`
+
+ # Should be in one single event
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: */,"");printf "%s", $0; exit}' trace`
+
+ if [ "$new_str" != "$str" ]; then
+ exit fail;
+ fi
+ fi
+
+ # Now add a little more the meta data overhead will overflow
+
+ str=`write_buffer $size`
+
+ # Make sure the line was broken
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: /,"");printf "%s", $0; exit}' trace`
+
+ if [ "$new_str" = "$str" ]; then
+ exit fail;
+ fi
+
+ # Make sure the entire line can be found
+ new_str=`awk ' /tracing_mark_write:/ { sub(/^.*tracing_mark_write: */,"");printf "%s", $0; }' trace`
+
+ if [ "$new_str" != "$str" ]; then
+ exit fail;
+ fi
+}
+
+test_buffer
+
--
2.42.0
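The size arithmetic that trace_marker.tc relies on can be sketched in isolation. This is a minimal illustration and not part of the patch: the function names are hypothetical, and the sample inputs (a 4080-byte sub-buffer data area, a 64-bit event header and a 16-byte print-event buf offset) are assumed values chosen for the example rather than read from a real tracefs.

```shell
#!/bin/sh
# Sketch only: mirrors the arithmetic of get_event_header_size() and
# test_buffer() without touching tracefs. All inputs are assumed values.

# Round a bit count up to whole bytes, as the test does with (bits+7)/8.
bits_to_bytes() {
	echo $(( ($1 + 7) / 8 ))
}

# Largest marker write expected to fit in one event: the sub-buffer data
# size minus the event header and the print-event header.
# $1 = sub-buffer data size, $2 = event header bytes, $3 = buf offset
max_single_event_size() {
	echo $(( $1 - ($2 + $3) ))
}

hdr=$(bits_to_bytes 64)
echo "event header bytes: $hdr"
echo "oneline size: $(max_single_event_size 4080 "$hdr" 16)"
```

A write of at most this size should show up as one tracing_mark_write event, while a write of the full sub-buffer data size cannot fit once the metadata is added, which is the split the test checks for.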
This reverts commit 9fc96c7c19df ("selftests: error out if kernel header
files are not yet built").
It turns out that requiring the kernel headers to be built as a
prerequisite to building selftests does not work in many cases. For
example, Peter Zijlstra writes:
"My biggest beef with the whole thing is that I simply do not want to use
'make headers', it doesn't work for me.
I have a ton of output directories and I don't care to build tools into
the output dirs, in fact some of them flat out refuse to work that way
(bpf comes to mind)." [1]
Therefore, stop erroring out on the selftests build. Additional patches
will be required in order to change over to not requiring the kernel
headers.
[1] https://lore.kernel.org/20231208221007.GO28727@noisy.programming.kicks-ass.…
Cc: Anders Roxell <anders.roxell(a)linaro.org>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Nathan Chancellor <nathan(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
---
tools/testing/selftests/Makefile | 21 +----------------
tools/testing/selftests/lib.mk | 40 +++-----------------------------
2 files changed, 4 insertions(+), 57 deletions(-)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 3b2061d1c1a5..8247a7c69c36 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -155,12 +155,10 @@ ifneq ($(KBUILD_OUTPUT),)
abs_objtree := $(realpath $(abs_objtree))
BUILD := $(abs_objtree)/kselftest
KHDR_INCLUDES := -isystem ${abs_objtree}/usr/include
- KHDR_DIR := ${abs_objtree}/usr/include
else
BUILD := $(CURDIR)
abs_srctree := $(shell cd $(top_srcdir) && pwd)
KHDR_INCLUDES := -isystem ${abs_srctree}/usr/include
- KHDR_DIR := ${abs_srctree}/usr/include
DEFAULT_INSTALL_HDR_PATH := 1
endif
@@ -174,7 +172,7 @@ export KHDR_INCLUDES
# all isn't the first target in the file.
.DEFAULT_GOAL := all
-all: kernel_header_files
+all:
@ret=1; \
for TARGET in $(TARGETS); do \
BUILD_TARGET=$$BUILD/$$TARGET; \
@@ -185,23 +183,6 @@ all: kernel_header_files
ret=$$((ret * $$?)); \
done; exit $$ret;
-kernel_header_files:
- @ls $(KHDR_DIR)/linux/*.h >/dev/null 2>/dev/null; \
- if [ $$? -ne 0 ]; then \
- RED='\033[1;31m'; \
- NOCOLOR='\033[0m'; \
- echo; \
- echo -e "$${RED}error$${NOCOLOR}: missing kernel header files."; \
- echo "Please run this and try again:"; \
- echo; \
- echo " cd $(top_srcdir)"; \
- echo " make headers"; \
- echo; \
- exit 1; \
- fi
-
-.PHONY: kernel_header_files
-
run_tests: all
@for TARGET in $(TARGETS); do \
BUILD_TARGET=$$BUILD/$$TARGET; \
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 118e0964bda9..aa646e0661f3 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -44,26 +44,10 @@ endif
selfdir = $(realpath $(dir $(filter %/lib.mk,$(MAKEFILE_LIST))))
top_srcdir = $(selfdir)/../../..
-ifeq ("$(origin O)", "command line")
- KBUILD_OUTPUT := $(O)
+ifeq ($(KHDR_INCLUDES),)
+KHDR_INCLUDES := -isystem $(top_srcdir)/usr/include
endif
-ifneq ($(KBUILD_OUTPUT),)
- # Make's built-in functions such as $(abspath ...), $(realpath ...) cannot
- # expand a shell special character '~'. We use a somewhat tedious way here.
- abs_objtree := $(shell cd $(top_srcdir) && mkdir -p $(KBUILD_OUTPUT) && cd $(KBUILD_OUTPUT) && pwd)
- $(if $(abs_objtree),, \
- $(error failed to create output directory "$(KBUILD_OUTPUT)"))
- # $(realpath ...) resolves symlinks
- abs_objtree := $(realpath $(abs_objtree))
- KHDR_DIR := ${abs_objtree}/usr/include
-else
- abs_srctree := $(shell cd $(top_srcdir) && pwd)
- KHDR_DIR := ${abs_srctree}/usr/include
-endif
-
-KHDR_INCLUDES := -isystem $(KHDR_DIR)
-
# The following are built by lib.mk common compile rules.
# TEST_CUSTOM_PROGS should be used by tests that require
# custom build rule and prevent common build rule use.
@@ -74,25 +58,7 @@ TEST_GEN_PROGS := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS))
TEST_GEN_PROGS_EXTENDED := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS_EXTENDED))
TEST_GEN_FILES := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_FILES))
-all: kernel_header_files $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) \
- $(TEST_GEN_FILES)
-
-kernel_header_files:
- @ls $(KHDR_DIR)/linux/*.h >/dev/null 2>/dev/null; \
- if [ $$? -ne 0 ]; then \
- RED='\033[1;31m'; \
- NOCOLOR='\033[0m'; \
- echo; \
- echo -e "$${RED}error$${NOCOLOR}: missing kernel header files."; \
- echo "Please run this and try again:"; \
- echo; \
- echo " cd $(top_srcdir)"; \
- echo " make headers"; \
- echo; \
- exit 1; \
- fi
-
-.PHONY: kernel_header_files
+all: $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES)
define RUN_TESTS
BASE_DIR="$(selfdir)"; \
--
2.43.0
Alter the linker section of KUNIT_TABLE to move it out of INIT_DATA and
into DATA_DATA.
Data for KUnit tests does not need to be in the init section.
To run tests again after boot, the KUnit data cannot be labeled as init
data, as the kernel could write over it.
Add a KUNIT_INIT_TABLE in the next patch for KUnit tests that test init
data/functions.
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
include/asm-generic/vmlinux.lds.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index bae0fe4d499b..1107905d37fc 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -370,7 +370,8 @@
BRANCH_PROFILE() \
TRACE_PRINTKS() \
BPF_RAW_TP() \
- TRACEPOINT_STR()
+ TRACEPOINT_STR() \
+ KUNIT_TABLE()
/*
* Data section helpers
@@ -699,8 +700,7 @@
THERMAL_TABLE(governor) \
EARLYCON_TABLE() \
LSM_TABLE() \
- EARLY_LSM_TABLE() \
- KUNIT_TABLE()
+ EARLY_LSM_TABLE()
#define INIT_TEXT \
*(.init.text .init.text.*) \
base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
--
2.43.0.rc2.451.g8631bc7472-goog
When TPIDR2 is not supported the tpidr2 ABI test prints the same message
for each skipped test:
ok 1 skipped, TPIDR2 not supported
which isn't ideal for test automation software since it tracks kselftest
results based on the string used to describe the test. This is also not
standard KTAP output; the expected format is:
ok 1 # SKIP default_value
Update the program to generate this, using the same set of test names that
we would run if the test actually executed.
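The KTAP skip format described above can be sketched in shell as follows. This is only an illustration of the output shape, not the C change made by the patch: the helper names skip_test and skip_all are hypothetical here, and only the test names are taken from the patch.

```shell
#!/bin/sh
# Sketch only: emit one KTAP "ok N # SKIP name" line per skipped test,
# so each skipped test keeps a distinct, stable name.
tests_run=0
tests_skipped=0

skip_test() {
	tests_run=$((tests_run + 1))
	tests_skipped=$((tests_skipped + 1))
	printf 'ok %d # SKIP %s\n' "$tests_run" "$1"
}

# Skip every test the program would otherwise have run.
skip_all() {
	for name in default_value write_read write_sleep_read \
		    write_fork_read write_clone_read; do
		skip_test "$name"
	done
}

skip_all
```

Because every line carries the test name after "# SKIP", automation that keys results on the description string sees five distinct entries instead of five identical ones.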
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/arm64/abi/tpidr2.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/arm64/abi/tpidr2.c b/tools/testing/selftests/arm64/abi/tpidr2.c
index 351a098b503a..02ee3a91b780 100644
--- a/tools/testing/selftests/arm64/abi/tpidr2.c
+++ b/tools/testing/selftests/arm64/abi/tpidr2.c
@@ -254,6 +254,12 @@ static int write_clone_read(void)
putnum(++tests_run); \
putstr(" " #name "\n");
+#define skip_test(name) \
+ tests_skipped++; \
+ putstr("ok "); \
+ putnum(++tests_run); \
+ putstr(" # SKIP " #name "\n");
+
int main(int argc, char **argv)
{
int ret, i;
@@ -283,13 +289,11 @@ int main(int argc, char **argv)
} else {
putstr("# SME support not present\n");
- for (i = 0; i < EXPECTED_TESTS; i++) {
- putstr("ok ");
- putnum(i);
- putstr(" skipped, TPIDR2 not supported\n");
- }
-
- tests_skipped += EXPECTED_TESTS;
+ skip_test(default_value);
+ skip_test(write_read);
+ skip_test(write_sleep_read);
+ skip_test(write_fork_read);
+ skip_test(write_clone_read);
}
print_summary();
---
base-commit: 98b1cc82c4affc16f5598d4fa14b1858671b2263
change-id: 20231124-kselftest-arm64-tpidr2-skip-43764f4ff4f4
Best regards,
--
Mark Brown <broonie(a)kernel.org>
virtio-net has two uses for hashes: one is RSS and the other is hash
reporting. Conventionally the hash calculation was done by the VMM.
However, computing the hash after the queue was chosen defeats the
purpose of RSS.
Another approach is to use an eBPF steering program. This approach has
another downside: it cannot report the calculated hash due to the
restrictive nature of eBPF.
Extend the steering program feature by introducing a dedicated program
type: BPF_PROG_TYPE_VNET_HASH. This program type is capable of reporting
the hash value and the queue to use at the same time.
This is a rewrite of an RFC patch series submitted by Yuri Benditovich that
incorporates feedback on that series and on V1 of this series:
https://lore.kernel.org/lkml/20210112194143.1494-1-yuri.benditovich@daynix.…
QEMU patched to use this new feature is available at:
https://github.com/daynix/qemu/tree/akihikodaki/bpf
The QEMU patches will soon be submitted upstream as an RFC too.
V1 -> V2:
Changed to introduce a new BPF program type.
Akihiko Odaki (7):
bpf: Introduce BPF_PROG_TYPE_VNET_HASH
bpf: Add vnet_hash members to __sk_buff
skbuff: Introduce SKB_EXT_TUN_VNET_HASH
virtio_net: Add virtio_net_hdr_v1_hash_from_skb()
tun: Support BPF_PROG_TYPE_VNET_HASH
selftests/bpf: Test BPF_PROG_TYPE_VNET_HASH
vhost_net: Support VIRTIO_NET_F_HASH_REPORT
Documentation/bpf/bpf_prog_run.rst | 1 +
Documentation/bpf/libbpf/program_types.rst | 2 +
drivers/net/tun.c | 158 +++++--
drivers/vhost/net.c | 16 +-
include/linux/bpf_types.h | 2 +
include/linux/filter.h | 7 +
include/linux/skbuff.h | 10 +
include/linux/virtio_net.h | 22 +
include/uapi/linux/bpf.h | 5 +
kernel/bpf/verifier.c | 6 +
net/core/filter.c | 86 +++-
net/core/skbuff.c | 3 +
tools/include/uapi/linux/bpf.h | 5 +
tools/lib/bpf/libbpf.c | 2 +
tools/testing/selftests/bpf/config | 1 +
tools/testing/selftests/bpf/config.aarch64 | 1 -
.../selftests/bpf/prog_tests/vnet_hash.c | 385 ++++++++++++++++++
tools/testing/selftests/bpf/progs/vnet_hash.c | 16 +
18 files changed, 681 insertions(+), 47 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/vnet_hash.c
create mode 100644 tools/testing/selftests/bpf/progs/vnet_hash.c
--
2.42.0
By default, all test output is printed to stdout, or to output.log if
-s is supplied. kselftest/runner.sh also supports per-test logs if the
variable per_test_logging is set, so add a new option -p to set this
variable. Note that the -p option conflicts with the -s option.
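The conflict between -s and -p is only documented in the usage text; the patch itself does not reject the combination. A minimal sketch of how such a check could look is shown below. The function parse_log_opts and the summary variable are hypothetical and are not part of run_kselftest.sh; only the option names and per_test_logging come from the patch.

```shell
#!/bin/sh
# Sketch only: parse the two logging options and refuse to combine them.
# parse_log_opts and "summary" are illustrative names, not from the patch.
parse_log_opts() {
	summary=0
	per_test_logging=0
	while [ $# -gt 0 ]; do
		case "$1" in
			-s | --summary)
				summary=1 ;;
			-p | --per_test_log)
				per_test_logging=1 ;;
			*)
				echo "unknown option: $1" >&2
				return 2 ;;
		esac
		shift
	done
	if [ "$summary" -eq 1 ] && [ "$per_test_logging" -eq 1 ]; then
		echo "error: -s and -p cannot be used together" >&2
		return 1
	fi
	return 0
}

parse_log_opts -p && echo "per_test_logging=$per_test_logging"
```

Returning a distinct status for the conflict lets a caller print the usage text and exit, rather than silently letting one option win.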
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
tools/testing/selftests/run_kselftest.sh | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/run_kselftest.sh b/tools/testing/selftests/run_kselftest.sh
index 92743980e553..965220a314ce 100755
--- a/tools/testing/selftests/run_kselftest.sh
+++ b/tools/testing/selftests/run_kselftest.sh
@@ -20,7 +20,8 @@ usage()
{
cat <<EOF
Usage: $0 [OPTIONS]
- -s | --summary Print summary with detailed log in output.log
+ -s | --summary Print summary with detailed log in output.log (conflict with -p)
+ -p | --per_test_log Print test log in /tmp with each test name (conflict with -s)
-t | --test COLLECTION:TEST Run TEST from COLLECTION
-c | --collection COLLECTION Run all tests from COLLECTION
-l | --list List the available collection:test entries
@@ -41,6 +42,9 @@ while true; do
logfile="$BASE_DIR"/output.log
cat /dev/null > $logfile
shift ;;
+ -p | --per_test_log)
+ per_test_logging=1
+ shift ;;
-t | --test)
TESTS="$TESTS $2"
shift 2 ;;
--
2.41.0
Hi,
Changes since v2 [1]:
* Added a new patch (sent separately earlier) at the end, to error out
if "make headers" has not yet been run.
* Reworked and simplified the uffd movement patch. Now it only moves
some uffd*() routines, not all, and doesn't have to touch the Makefile
at all. This lighter touch also allowed me to drop the "move psize(),
pshift() into vm_utils.c" entirely. I expect Peter Xu will be a little
happier with this new approach.
* Fixed the commit description for the MADV_COLLAPSE patch.
* Added more Reviewed-by tags from David Hildenbrand and Peter Xu.
[1] https://lore.kernel.org/all/20230603021558.95299-1-jhubbard@nvidia.com/
John Hubbard (11):
selftests/mm: fix uffd-stress unused function warning
selftests/mm: fix unused variable warnings in hugetlb-madvise.c,
migration.c
selftests/mm: fix "warning: expression which evaluates to zero..." in
mlock2-tests.c
selftests/mm: fix invocation of tests that are run via shell scripts
selftests/mm: .gitignore: add mkdirty, va_high_addr_switch
selftests/mm: fix two -Wformat-security warnings in uffd builds
selftests/mm: fix a "possibly uninitialized" warning in pkey-x86.h
selftests/mm: fix build failures due to missing MADV_COLLAPSE
selftests/mm: move certain uffd*() routines from vm_util.c to
uffd-common.c
Documentation: kselftest: "make headers" is a prerequisite
selftests: error out if kernel header files are not yet built
Documentation/dev-tools/kselftest.rst | 1 +
tools/testing/selftests/lib.mk | 36 +++++++++++-
tools/testing/selftests/mm/.gitignore | 2 +
tools/testing/selftests/mm/cow.c | 7 ---
tools/testing/selftests/mm/hugetlb-madvise.c | 8 ++-
tools/testing/selftests/mm/khugepaged.c | 10 ----
tools/testing/selftests/mm/migration.c | 5 +-
tools/testing/selftests/mm/mlock2-tests.c | 1 -
tools/testing/selftests/mm/pkey-x86.h | 2 +-
tools/testing/selftests/mm/run_vmtests.sh | 6 +-
tools/testing/selftests/mm/uffd-common.c | 59 ++++++++++++++++++++
tools/testing/selftests/mm/uffd-common.h | 5 ++
tools/testing/selftests/mm/uffd-stress.c | 10 ----
tools/testing/selftests/mm/uffd-unit-tests.c | 16 ++----
tools/testing/selftests/mm/vm_util.c | 59 --------------------
tools/testing/selftests/mm/vm_util.h | 14 +++--
16 files changed, 130 insertions(+), 111 deletions(-)
base-commit: f8dba31b0a826e691949cd4fdfa5c30defaac8c5
--
2.40.1
The kernel has recently added support for shadow stacks, currently
x86 only using their CET feature, but both arm64 and RISC-V have
equivalent features (GCS and Zicfiss respectively); I am actively
working on GCS[1]. With shadow stacks the hardware maintains an
additional stack containing only the return addresses for branch
instructions which is not generally writeable by userspace and ensures
that any returns are to the recorded addresses. This provides some
protection against ROP attacks and makes it easier to collect call
stacks. These shadow stacks are allocated in the address space of the
userspace process.
Our API for shadow stacks does not currently offer userspace any
flexibility for managing the allocation of shadow stacks for newly
created threads; instead the kernel allocates a new shadow stack with
the same size as the normal stack whenever a thread is created with the
feature enabled. The stacks allocated in this way are freed by the
kernel when the thread exits or shadow stacks are disabled for the
thread. This lack of flexibility and control isn't ideal: in the vast
majority of cases the shadow stack will be over-allocated, and the
implicit allocation and deallocation is not consistent with other
interfaces. As far as I can tell the interface is done in this manner
mainly because the shadow stack patches were in development since before
clone3() was implemented.
Since clone3() is readily extensible let's add support for specifying a
shadow stack when creating a new thread or process in a similar manner
to how the normal stack is specified, keeping the current implicit
allocation behaviour if one is not specified either with clone3() or
through the use of clone(). Unlike normal stacks, only the shadow stack
size is specified; similar issues to those that led to the creation of
map_shadow_stack() apply.
Please note that the x86 portions of this code are build tested only; I
don't appear to have a system that can run CET available to me. I have
done testing with an integration into my pending work for GCS. There is
some possibility that the arm64 implementation may require the use of
clone3() and explicit userspace allocation of shadow stacks; this is
still under discussion.
A new architecture feature Kconfig option for shadow stacks is added
here; this was suggested as part of the review comments for the arm64
GCS series and since we need to detect if shadow stacks are supported it
seemed sensible to roll it in here.
[1] https://lore.kernel.org/r/20231009-arm64-gcs-v6-0-78e55deaa4dd@kernel.org/
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v4:
- Formatting changes.
- Use a define for minimum shadow stack size and move some basic
validation to fork.c.
- Link to v3: https://lore.kernel.org/r/20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@ke…
Changes in v3:
- Rebase onto v6.7-rc2.
- Remove stale shadow_stack in internal kargs.
- If a shadow stack is specified unconditionally use it regardless of
CLONE_ parameters.
- Force enable shadow stacks in the selftest.
- Update changelogs for RISC-V feature rename.
- Link to v2: https://lore.kernel.org/r/20231114-clone3-shadow-stack-v2-0-b613f8681155@ke…
Changes in v2:
- Rebase onto v6.7-rc1.
- Remove ability to provide preallocated shadow stack, just specify the
desired size.
- Link to v1: https://lore.kernel.org/r/20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@ke…
---
Mark Brown (5):
mm: Introduce ARCH_HAS_USER_SHADOW_STACK
fork: Add shadow stack support to clone3()
selftests/clone3: Factor more of main loop into test_clone3()
selftests/clone3: Allow tests to flag if -E2BIG is a valid error code
kselftest/clone3: Test shadow stack support
arch/x86/Kconfig | 1 +
arch/x86/include/asm/shstk.h | 11 +-
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/shstk.c | 56 ++++--
fs/proc/task_mmu.c | 2 +-
include/linux/mm.h | 2 +-
include/linux/sched/task.h | 1 +
include/uapi/linux/sched.h | 4 +
kernel/fork.c | 53 ++++--
mm/Kconfig | 6 +
tools/testing/selftests/clone3/clone3.c | 200 +++++++++++++++++-----
tools/testing/selftests/clone3/clone3_selftests.h | 7 +
12 files changed, 268 insertions(+), 77 deletions(-)
---
base-commit: 98b1cc82c4affc16f5598d4fa14b1858671b2263
change-id: 20231019-clone3-shadow-stack-15d40d2bf536
Best regards,
--
Mark Brown <broonie(a)kernel.org>
This patch set enables the Intel flexible return and event delivery
(FRED) architecture with KVM VMX to allow guests to utilize FRED.
The FRED architecture defines simple new transitions that change
privilege level (ring transitions). The FRED architecture was
designed with the following goals:
1) Improve overall performance and response time by replacing event
delivery through the interrupt descriptor table (IDT event
delivery) and event return by the IRET instruction with lower
latency transitions.
2) Improve software robustness by ensuring that event delivery
establishes the full supervisor context and that event return
establishes the full user context.
The new transitions defined by the FRED architecture are FRED event
delivery and, for returning from events, two FRED return instructions.
FRED event delivery can effect a transition from ring 3 to ring 0, but
it is used also to deliver events incident to ring 0. One FRED
instruction (ERETU) effects a return from ring 0 to ring 3, while the
other (ERETS) returns while remaining in ring 0. Collectively, FRED
event delivery and the FRED return instructions are FRED transitions.
The Intel VMX architecture is extended to run FRED guests. The major
changes are:
1) New VMCS fields for FRED context management, which includes two new
event data VMCS fields, eight new guest FRED context VMCS fields and
eight new host FRED context VMCS fields.
2) VMX nested-Exception support for proper virtualization of stack
levels introduced with FRED architecture.
Search for the latest FRED spec in most search engines with this search pattern:
site:intel.com FRED (flexible return and event delivery) specification
We want to send out the FRED VMX patch set for review while the FRED
native patch set v12 is being reviewed at
https://lkml.kernel.org/kvm/20231003062458.23552-1-xin3.li@intel.com/.
For easier review, I have set up a base tree with the latest FRED native
patch set on top of tip tree in the 'fred_v12' branch of repo
https://github.com/xinli-intel/linux-fred-public.git.
Patches 1-2 are cleanups to VMX basic and misc MSRs, which were sent
out earlier as a preparation for FRED changes:
https://lore.kernel.org/kvm/20231030233940.438233-1-xin@zytor.com/.
Patches 3-14 add FRED support to VMX.
Patches 15-18 add FRED support to nested VMX.
Patch 19 exposes FRED to KVM guests to complete the enabling.
Patches 20-23 add FRED selftests.
Shan Kang (1):
KVM: selftests: Add fred exception tests
Xin Li (22):
KVM: VMX: Cleanup VMX basic information defines and usages
KVM: VMX: Cleanup VMX misc information defines and usages
KVM: VMX: Add support for the secondary VM exit controls
KVM: x86: Mark CR4.FRED as not reserved
KVM: VMX: Initialize FRED VM entry/exit controls in vmcs_config
KVM: VMX: Defer enabling FRED MSRs save/load until after set CPUID
KVM: VMX: Disable intercepting FRED MSRs
KVM: VMX: Initialize VMCS FRED fields
KVM: VMX: Switch FRED RSP0 between host and guest
KVM: VMX: Add support for FRED context save/restore
KVM: x86: Add kvm_is_fred_enabled()
KVM: VMX: Handle FRED event data
KVM: VMX: Handle VMX nested exception for FRED
KVM: VMX: Dump FRED context in dump_vmcs()
KVM: nVMX: Add support for the secondary VM exit controls
KVM: nVMX: Add FRED VMCS fields
KVM: nVMX: Add support for VMX FRED controls
KVM: nVMX: Add VMCS FRED states checking
KVM: x86: Allow FRED/LKGS/WRMSRNS to be exposed to guests
KVM: selftests: Add FRED VMCS fields to evmcs
KVM: selftests: Run debug_regs test with FRED enabled
KVM: selftests: Add a new VM guest mode to run user level code
Documentation/virt/kvm/x86/nested-vmx.rst | 19 +
arch/x86/include/asm/hyperv-tlfs.h | 19 +
arch/x86/include/asm/kvm_host.h | 9 +-
arch/x86/include/asm/msr-index.h | 15 +-
arch/x86/include/asm/vmx.h | 57 ++-
arch/x86/kvm/cpuid.c | 4 +-
arch/x86/kvm/kvm_cache_regs.h | 10 +
arch/x86/kvm/svm/svm.c | 4 +-
arch/x86/kvm/vmx/capabilities.h | 20 +-
arch/x86/kvm/vmx/hyperv.c | 61 ++-
arch/x86/kvm/vmx/nested.c | 315 ++++++++++++--
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmcs.h | 1 +
arch/x86/kvm/vmx/vmcs12.c | 19 +
arch/x86/kvm/vmx/vmcs12.h | 38 ++
arch/x86/kvm/vmx/vmcs_shadow_fields.h | 6 +-
arch/x86/kvm/vmx/vmx.c | 404 ++++++++++++++++--
arch/x86/kvm/vmx/vmx.h | 14 +-
arch/x86/kvm/x86.c | 55 ++-
arch/x86/kvm/x86.h | 5 +-
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/include/kvm_util_base.h | 1 +
.../selftests/kvm/include/x86_64/evmcs.h | 146 +++++++
.../selftests/kvm/include/x86_64/processor.h | 33 ++
.../selftests/kvm/include/x86_64/vmx.h | 20 +
tools/testing/selftests/kvm/lib/kvm_util.c | 5 +-
.../selftests/kvm/lib/x86_64/processor.c | 15 +-
tools/testing/selftests/kvm/lib/x86_64/vmx.c | 4 +-
.../testing/selftests/kvm/x86_64/debug_regs.c | 50 ++-
.../testing/selftests/kvm/x86_64/fred_test.c | 262 ++++++++++++
30 files changed, 1464 insertions(+), 150 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/fred_test.c
base-commit: d49b86c24e836941c85c4906e9519fca9426a6e0
--
2.42.0
Changes in RFC v3:
------------------
1. Pulled in the memory-provider dependency from Jakub's RFC[1] to make the
series reviewable and mergeable.
2. Implemented multi-rx-queue binding which was a todo in v2.
3. Fix to cmsg handling.
The sticking point in RFC v2[2] was the device reset required to refill
the device rx-queues after the dmabuf bind/unbind. The suggested
solution, as I understand it, is a subset of the per-queue management ops
Jakub suggested, or similar:
https://lore.kernel.org/netdev/20230815171638.4c057dcd@kernel.org/
This is not addressed in this revision, because:
1. This point was discussed at netconf & netdev and there is openness to
using the current approach of requiring a device reset.
2. Implementing individual queue resetting seems to be difficult for my
test bed with GVE. My prototype to test this ran into issues with the
rx-queues not coming back up properly if reset individually. At the
moment I'm unsure if it's a mistake in the POC or a genuine issue in
the virtualization stack behind GVE, which currently doesn't test
individual rx-queue restart.
3. Our usecases are not bothered by requiring a device reset to refill
the buffer queues, and we'd like to support NICs that run into this
limitation with resetting individual queues.
My thought is that drivers that have trouble with per-queue configs can
use the support in this series, while drivers that support new netdev
ops to reset individual queues can automatically reset the queue as
part of the dma-buf bind/unbind.
The same approach with device resets is presented again for consideration
with other sticking points addressed.
This proposal includes only the rx devmem path, proposed for merge. For a
snapshot of my entire tree which includes the GVE POC page pool support &
device memory support:
https://github.com/torvalds/linux/compare/master...mina:linux:tcpdevmem-v3
[1] https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168d79@redhat.…
[2] https://lore.kernel.org/netdev/CAHS8izOVJGJH5WF68OsRWFKJid1_huzzUK+hpKbLcL4…
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Jeroen de Borst <jeroendb(a)google.com>
Cc: Praveen Kaligineedi <pkaligineedi(a)google.com>
Changes in RFC v2:
------------------
The sticking point in RFC v1[1] was the dma-buf pages approach we used to
deliver the device memory to the TCP stack. RFC v2 is a proof-of-concept
that attempts to resolve this by implementing scatterlist support in the
networking stack, such that we can import the dma-buf scatterlist
directly. This is the approach proposed at a high level here[2].
Detailed changes:
1. Replaced dma-buf pages approach with importing scatterlist into the
page pool.
2. Replace the dma-buf pages centric API with a netlink API.
3. Removed the TX path implementation - there is no issue with
implementing the TX path with scatterlist approach, but leaving
out the TX path makes it easier to review.
4. Functionality is tested with this proposal, but I have not conducted
perf testing yet. I'm not sure there are regressions, but I removed
perf claims from the cover letter until they can be re-confirmed.
5. Added Signed-off-by: contributors to the implementation.
6. Fixed some bugs with the RX path since RFC v1.
Any feedback welcome, but specifically the biggest pending questions
needing feedback IMO are:
1. Feedback on the scatterlist-based approach in general.
2. Netlink API (Patch 1 & 2).
3. Approach to handle all the drivers that expect to receive pages from
the page pool (Patch 6).
[1] https://lore.kernel.org/netdev/dfe4bae7-13a0-3c5d-d671-f61b375cb0b4@gmail.c…
[2] https://lore.kernel.org/netdev/CAHS8izPm6XRS54LdCDZVd0C75tA1zHSu6jLVO8nzTLX…
----------------------
* TL;DR:
Device memory TCP (devmem TCP) is a proposal for transferring data to and/or
from device memory efficiently, without bouncing the data to a host memory
buffer.
* Problem:
A large number of data transfers have device memory as the source and/or
destination. Accelerators have drastically increased the volume of such transfers.
Some examples include:
- ML accelerators transferring large amounts of training data from storage into
GPU/TPU memory. In some cases ML training setup time can be as long as 50% of
TPU compute time; improving data transfer throughput & efficiency can help
improve GPU/TPU utilization.
- Distributed training, where ML accelerators, such as GPUs on different hosts,
exchange data among them.
- Distributed raw block storage applications transfer large amounts of data with
remote SSDs; much of this data does not require host processing.
Today, the majority of the Device-to-Device data transfers over the network are
implemented as the following low level operations: Device-to-Host copy,
Host-to-Host network transfer, and Host-to-Device copy.
The implementation is suboptimal, especially for bulk data transfers, and can
put significant strain on system resources, such as host memory bandwidth,
PCIe bandwidth, etc. One important reason behind the current state is the
kernel’s lack of semantics to express device to network transfers.
* Proposal:
In this patch series we attempt to optimize this use case by implementing
socket APIs that enable the user to:
1. send device memory across the network directly, and
2. receive incoming network packets directly into device memory.
Packet _payloads_ go directly from the NIC to device memory for receive and from
device memory to NIC for transmit.
Packet _headers_ go to/from host memory and are processed by the TCP/IP stack
normally. The NIC _must_ support header split to achieve this.
Advantages:
- Alleviate host memory bandwidth pressure, compared to existing
network-transfer + device-copy semantics.
- Alleviate PCIe BW pressure, by limiting data transfer to the lowest level
of the PCIe tree, compared to traditional path which sends data through the
root complex.
* Patch overview:
** Part 1: netlink API
Gives the user the ability to bind a dma-buf to an RX queue.
** Part 2: scatterlist support
Currently the standard for device memory sharing is DMABUF, which doesn't
generate struct pages. On the other hand, the networking stack (skbs, drivers,
and page pool) operates on pages. We have 2 options:
1. Generate struct pages for dmabuf device memory, or,
2. Modify the networking stack to process scatterlist.
Approach #1 was attempted in RFC v1. RFC v2 implements approach #2.
** Part 3: page pool support
We piggyback on the page pool memory providers proposal:
https://github.com/kuba-moo/linux/tree/pp-providers
It allows the page pool to define a memory provider that provides the
page allocation and freeing. It helps abstract most of the device memory
TCP changes from the driver.
** Part 4: support for unreadable skb frags
Page pool iovs are not accessible by the host; we implement changes
throughout the networking stack to correctly handle skbs with unreadable
frags.
** Part 5: recvmsg() APIs
We define user APIs for the user to send and receive device memory.
Not included with this RFC is the GVE devmem TCP support, just to
simplify the review. Code available here if desired:
https://github.com/mina/linux/tree/tcpdevmem
This RFC is built on top of net-next with Jakub's pp-providers changes
cherry-picked.
* NIC dependencies:
1. (strict) Devmem TCP requires the NIC to support header split, i.e. the
capability to split incoming packets into a header + payload and to put
each into a separate buffer. Devmem TCP works by using device memory
for the packet payload, and host memory for the packet headers.
2. (optional) Devmem TCP works better with flow steering support & RSS support,
i.e. the NIC's ability to steer flows into certain rx queues. This allows the
sysadmin to enable devmem TCP on a subset of the rx queues, and steer
devmem TCP traffic onto these queues and non-devmem TCP traffic elsewhere.
The NIC I have access to with these properties is the GVE with DQO support
running in Google Cloud, but any NIC that supports these features would suffice.
I may be able to help reviewers bring up devmem TCP on their NICs.
* Testing:
The series includes a udmabuf kselftest that shows a simple use case of
devmem TCP and validates the entire data path end to end without
a dependency on a specific dmabuf provider.
** Test Setup
Kernel: net-next with this RFC and memory provider API cherry-picked
locally.
Hardware: Google Cloud A3 VMs.
NIC: GVE with header split & RSS & flow steering support.
Jakub Kicinski (2):
net: page_pool: factor out releasing DMA from releasing the page
net: page_pool: create hooks for custom page providers
Mina Almasry (10):
net: netdev netlink api to bind dma-buf to a net device
netdev: support binding dma-buf to netdevice
netdev: netdevice devmem allocator
memory-provider: dmabuf devmem memory provider
page-pool: device memory support
net: support non paged skb frags
net: add support for skbs with unreadable frags
tcp: RX path for devmem TCP
net: add SO_DEVMEM_DONTNEED setsockopt to release RX pages
selftests: add ncdevmem, netcat for devmem TCP
Documentation/netlink/specs/netdev.yaml | 28 ++
include/linux/netdevice.h | 93 ++++
include/linux/skbuff.h | 56 ++-
include/linux/socket.h | 1 +
include/net/netdev_rx_queue.h | 1 +
include/net/page_pool/helpers.h | 151 ++++++-
include/net/page_pool/types.h | 55 +++
include/net/sock.h | 2 +
include/net/tcp.h | 5 +-
include/uapi/asm-generic/socket.h | 6 +
include/uapi/linux/netdev.h | 10 +
include/uapi/linux/uio.h | 10 +
net/core/datagram.c | 6 +
net/core/dev.c | 240 +++++++++++
net/core/gro.c | 7 +-
net/core/netdev-genl-gen.c | 14 +
net/core/netdev-genl-gen.h | 1 +
net/core/netdev-genl.c | 118 +++++
net/core/page_pool.c | 209 +++++++--
net/core/skbuff.c | 80 +++-
net/core/sock.c | 36 ++
net/ipv4/tcp.c | 205 ++++++++-
net/ipv4/tcp_input.c | 13 +-
net/ipv4/tcp_ipv4.c | 7 +
net/ipv4/tcp_output.c | 5 +-
net/packet/af_packet.c | 4 +-
tools/include/uapi/linux/netdev.h | 10 +
tools/net/ynl/generated/netdev-user.c | 42 ++
tools/net/ynl/generated/netdev-user.h | 47 ++
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/Makefile | 5 +
tools/testing/selftests/net/ncdevmem.c | 546 ++++++++++++++++++++++++
32 files changed, 1950 insertions(+), 64 deletions(-)
create mode 100644 tools/testing/selftests/net/ncdevmem.c
--
2.42.0.869.gea05f2083d-goog
We missed one of the casts of kfree() to kunit_action_t in kunit-test,
which was only enabled when debugfs was in use. This could potentially
break CFI.
Use the existing wrapper function instead.
Signed-off-by: David Gow <davidgow(a)google.com>
---
lib/kunit/kunit-test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/kunit/kunit-test.c b/lib/kunit/kunit-test.c
index 3e9c5192d095..ee6927c60979 100644
--- a/lib/kunit/kunit-test.c
+++ b/lib/kunit/kunit-test.c
@@ -559,7 +559,7 @@ static void kunit_log_test(struct kunit *test)
KUNIT_EXPECT_TRUE(test, test->log->append_newlines);
full_log = string_stream_get_string(test->log);
- kunit_add_action(test, (kunit_action_t *)kfree, full_log);
+ kunit_add_action(test, kfree_wrapper, full_log);
KUNIT_EXPECT_NOT_ERR_OR_NULL(test,
strstr(full_log, "put this in log."));
KUNIT_EXPECT_NOT_ERR_OR_NULL(test,
--
2.43.0.rc2.451.g8631bc7472-goog
Add parsing of attributes as diagnostic data. Fixes an issue where the test
plan was incorrectly parsed as diagnostic data when located after
suite-level attributes.
Note that if no test plan line exists, the diagnostic lines
between the suite header and the first result will be saved in the suite
log rather than the first test case log.
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
tools/testing/kunit/kunit_parser.py | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
index 79d8832c862a..ce34be15c929 100644
--- a/tools/testing/kunit/kunit_parser.py
+++ b/tools/testing/kunit/kunit_parser.py
@@ -450,7 +450,7 @@ def parse_diagnostic(lines: LineStream) -> List[str]:
Log of diagnostic lines
"""
log = [] # type: List[str]
- non_diagnostic_lines = [TEST_RESULT, TEST_HEADER, KTAP_START, TAP_START]
+ non_diagnostic_lines = [TEST_RESULT, TEST_HEADER, KTAP_START, TAP_START, TEST_PLAN]
while lines and not any(re.match(lines.peek())
for re in non_diagnostic_lines):
log.append(lines.pop())
@@ -726,6 +726,7 @@ def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest:
# test plan
test.name = "main"
ktap_line = parse_ktap_header(lines, test)
+ test.log.extend(parse_diagnostic(lines))
parse_test_plan(lines, test)
parent_test = True
else:
@@ -737,6 +738,7 @@ def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest:
if parent_test:
# If KTAP version line and/or subtest header is found, attempt
# to parse test plan and print test header
+ test.log.extend(parse_diagnostic(lines))
parse_test_plan(lines, test)
print_test_header(test)
expected_count = test.expected_count
base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
--
2.43.0.472.g3155946c3a-goog
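The effect of adding TEST_PLAN to the non-diagnostic patterns in the patch
above can be illustrated with a small standalone sketch. This is not the
kunit_parser.py code: the regular expressions are simplified stand-ins, a
plain list replaces LineStream, and the stop_at_plan flag and demo() helper
are hypothetical names introduced only to compare the old and new behaviour.

```python
import re
from typing import List, Tuple

# Simplified stand-ins for the parser's patterns (assumptions, not verbatim).
TEST_RESULT = re.compile(r'^\s*(ok|not ok) [0-9]+')
TEST_HEADER = re.compile(r'^\s*# Subtest: ')
KTAP_START = re.compile(r'^\s*KTAP version [0-9]+$')
TAP_START = re.compile(r'^\s*TAP version [0-9]+$')
TEST_PLAN = re.compile(r'^\s*1\.\.[0-9]+')


def parse_diagnostic(lines: List[str], stop_at_plan: bool = True) -> List[str]:
    """Pop leading lines until one matches a non-diagnostic pattern.

    stop_at_plan=False mimics the behaviour before the patch, where a
    test plan line was not recognised as the end of diagnostic data.
    """
    non_diagnostic = [TEST_RESULT, TEST_HEADER, KTAP_START, TAP_START]
    if stop_at_plan:
        non_diagnostic.append(TEST_PLAN)
    log = []  # type: List[str]
    while lines and not any(p.match(lines[0]) for p in non_diagnostic):
        log.append(lines.pop(0))
    return log


def demo(stop_at_plan: bool) -> Tuple[List[str], List[str]]:
    """Return (diagnostic log, remaining lines) for a hypothetical suite."""
    # A suite-level attribute line followed by a test plan and two results.
    lines = ["# module: example", "1..2", "ok 1 test_a", "ok 2 test_b"]
    log = parse_diagnostic(lines, stop_at_plan)
    return log, lines
```

With stop_at_plan=False the "1..2" line ends up in the diagnostic log, so a
later test plan parse would find nothing; with stop_at_plan=True it stays in
the stream for the plan parser.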