iommufd gives userspace the capability to manipulate iommu subsytem.
e.g. DMA map/unmap etc. In the near future, it will support iommu nested
translation. Different platform vendors have different implementation for
the nested translation. So before set up nested translation, userspace
needs to know the hardware iommu information. For example, Intel VT-d
supports using guest I/O page table as the stage-1 translation table. This
requires guest I/O page table be compatible with hardware IOMMU.
This series reports the iommu hardware information for a given iommufd_device
which has been bound to iommufd. It is preparation work for userspace to
allocate hwpt for given device. Like the nested translation support[1].
This series introduces an iommu op to report the iommu hardware info,
and an ioctl IOMMU_DEVICE_GET_HW_INFO is added to report such hardware
info to user. enum iommu_hw_info_type is defined to differentiate the
iommu hardware info reported to user hence user can decode them. This
series only adds the framework for iommu hw info reporting, the complete
reporting path needs vendor specific definition and driver support. The
full picture is available in [1] as well.
base-commit: 4c7e97cb6e65eab02991f60a5cc7a4fed5498c7a
[1] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
Change log:
v2:
- Drop patch 05 of v1 as it is already covered by other series
- Rename the capability info to be iommu hardware info
v1: https://lore.kernel.org/linux-iommu/20230209041642.9346-1-yi.l.liu@intel.co…
Regards,
Yi Liu
Lu Baolu (1):
iommu: Add new iommu op to get iommu hardware information
Nicolin Chen (1):
iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl
Yi Liu (2):
iommu: Move dev_iommu_ops() to private header
iommufd: Add IOMMU_DEVICE_GET_HW_INFO
drivers/iommu/iommu-priv.h | 11 +++
drivers/iommu/iommu.c | 2 +
drivers/iommu/iommufd/device.c | 75 +++++++++++++++++++
drivers/iommu/iommufd/iommufd_private.h | 1 +
drivers/iommu/iommufd/iommufd_test.h | 15 ++++
drivers/iommu/iommufd/main.c | 3 +
drivers/iommu/iommufd/selftest.c | 16 ++++
include/linux/iommu.h | 24 +++---
include/uapi/linux/iommufd.h | 47 ++++++++++++
tools/testing/selftests/iommu/iommufd.c | 17 ++++-
tools/testing/selftests/iommu/iommufd_utils.h | 26 +++++++
11 files changed, 225 insertions(+), 12 deletions(-)
--
2.34.1
There is a spelling mistake in an log message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
tools/testing/selftests/prctl/set-anon-vma-name-test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/prctl/set-anon-vma-name-test.c b/tools/testing/selftests/prctl/set-anon-vma-name-test.c
index 26d853c5a0c1..4275cb256dce 100644
--- a/tools/testing/selftests/prctl/set-anon-vma-name-test.c
+++ b/tools/testing/selftests/prctl/set-anon-vma-name-test.c
@@ -97,7 +97,7 @@ TEST_F(vma, renaming) {
TH_LOG("Try to pass invalid name (with non-printable character \\1) to rename the VMA");
EXPECT_EQ(rename_vma((unsigned long)self->ptr_anon, AREA_SIZE, BAD_NAME), -EINVAL);
- TH_LOG("Try to rename non-anonynous VMA");
+ TH_LOG("Try to rename non-anonymous VMA");
EXPECT_EQ(rename_vma((unsigned long) self->ptr_not_anon, AREA_SIZE, GOOD_NAME), -EINVAL);
}
--
2.30.2
Patch 1 removes an unneeded address copy in subflow_syn_recv_sock().
Patch 2 simplifies subflow_syn_recv_sock() to postpone some actions and
to avoid a bunch of conditionals.
Patch 3 stops reporting limits that are not taken into account when the
userspace PM is used.
Patch 4 adds a new test to validate that the 'subflows' field reported
by the kernel is correct. Such info can be retrieved via Netlink (e.g.
with ss) or getsockopt(SOL_MPTCP, MPTCP_INFO).
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Changes in v2:
- Patch 3/4's commit message has been updated to use the correct SHA
- Rebased on latest net-next
- Link to v1: https://lore.kernel.org/r/20230324-upstream-net-next-20230324-misc-features…
---
Geliang Tang (1):
selftests: mptcp: add mptcp_info tests
Matthieu Baerts (1):
mptcp: do not fill info not used by the PM in used
Paolo Abeni (2):
mptcp: avoid unneeded address copy
mptcp: simplify subflow_syn_recv_sock()
net/mptcp/sockopt.c | 20 +++++++----
net/mptcp/subflow.c | 43 +++++++---------------
tools/testing/selftests/net/mptcp/mptcp_join.sh | 47 ++++++++++++++++++++++++-
3 files changed, 72 insertions(+), 38 deletions(-)
---
base-commit: e5b42483ccce50d5b957f474fd332afd4ef0c27b
change-id: 20230324-upstream-net-next-20230324-misc-features-178b2b618414
Best regards,
--
Matthieu Baerts <matthieu.baerts(a)tessares.net>
The s390 specific test_unwind kunit test has 39 parameterized tests. The
results in debugfs are truncated since the full log doesn't fit into 1500
bytes.
Therefore increase KUNIT_LOG_SIZE to 2048 bytes in a similar way like it
was done recently with commit "kunit: fix bug in debugfs logs of
parameterized tests". With that the whole test result is present.
Reported-by: Alexander Egorenkov <egorenar(a)linux.ibm.com>
Signed-off-by: Heiko Carstens <hca(a)linux.ibm.com>
---
include/kunit/test.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index 9721584027d8..57b309c6ca27 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -34,7 +34,7 @@ DECLARE_STATIC_KEY_FALSE(kunit_running);
struct kunit;
/* Size of log associated with test. */
-#define KUNIT_LOG_SIZE 1500
+#define KUNIT_LOG_SIZE 2048
/* Maximum size of parameter description string. */
#define KUNIT_PARAM_DESC_SIZE 128
--
2.37.2
There's been a bunch of off-list discussions about this, including at
Plumbers. The original plan was to do something involving providing an
ISA string to userspace, but ISA strings just aren't sufficient for a
stable ABI any more: in order to parse an ISA string users need the
version of the specifications that the string is written to, the version
of each extension (sometimes at a finer granularity than the RISC-V
releases/versions encode), and the expected use case for the ISA string
(ie, is it a U-mode or M-mode string). That's a lot of complexity to
try and keep ABI compatible and it's probably going to continue to grow,
as even if there's no more complexity in the specifications we'll have
to deal with the various ISA string parsing oddities that end up all
over userspace.
Instead this patch set takes a very different approach and provides a set
of key/value pairs that encode various bits about the system. The big
advantage here is that we can clearly define what these mean so we can
ensure ABI stability, but it also allows us to encode information that's
unlikely to ever appear in an ISA string (see the misaligned access
performance, for example). The resulting interface looks a lot like
what arm64 and x86 do, and will hopefully fit well into something like
ACPI in the future.
The actual user interface is a syscall, with a vDSO function in front of
it. The vDSO function can answer some queries without a syscall at all,
and falls back to the syscall for cases it doesn't have answers to.
Currently we prepopulate it with an array of answers for all keys and
a CPU set of "all CPUs". This can be adjusted as necessary to provide
fast answers to the most common queries.
An example series in glibc exposing this syscall and using it in an
ifunc selector for memcpy can be found at [1]. I'm about to send a v2
of that series out that incorporates the vDSO function.
I was asked about the performance delta between this and something like
sysfs. I created a small test program [2] and ran it on a Nezha D1
Allwinner board. Doing each operation 100000 times and dividing, these
operations take the following amount of time:
- open()+read()+close() of /sys/kernel/cpu_byteorder: 3.8us
- access("/sys/kernel/cpu_byteorder", R_OK): 1.3us
- riscv_hwprobe() vDSO and syscall: .0094us
- riscv_hwprobe() vDSO with no syscall: 0.0091us
These numbers get farther apart if we query multiple keys, as sysfs will
scale linearly with the number of keys, where the dedicated syscall
stays the same. To frame these numbers, I also did a tight
fork/exec/wait loop, which I measured as 4.8ms. So doing 4
open/read/close operations is a delta of about 0.3%, versus a single vDSO
call is a delta of essentially zero.
[1] https://public-inbox.org/libc-alpha/20230206194819.1679472-1-evan@rivosinc.…
[2] https://pastebin.com/x84NEKaS
Changes in v5:
- Added tags
- Fixed misuse of ISA_EXT_c as bitmap, changed to use
riscv_isa_extension_available() (Heiko, Conor)
- Document the alternatives approach in the commit message (Conor and
Heiko).
- Fix __init call warnings by making probe_vendor_features() and
thead_feature_probe_func() __init_or_module.
- Fixed compat vdso compilation failure (lkp).
Changes in v4:
- Used real types in syscall prototypes (Arnd)
- Fixed static line break in do_riscv_hwprobe() (Conor)
- Added newlines between documentation lists (Conor)
- Crispen up size types to size_t, and cpu indices to int (Joe)
- Fix copy_from_user() return logic bug (found via kselftests!)
- Add __user to SYSCALL_DEFINE() to fix warning
- More newlines in BASE_BEHAVIOR_IMA documentation (Conor)
- Add newlines to CPUPERF_0 documentation (Conor)
- Add UNSUPPORTED value (Conor)
- Switched from DT to alternatives-based probing (Rob)
- Crispen up cpu index type to always be int (Conor)
- Fixed selftests commit description, no more tiny libc (Mark Brown)
- Fixed selftest syscall prototype types to match v4.
- Added a prototype to fix -Wmissing-prototype warning (lkp(a)intel.com)
- Fixed rv32 build failure (lkp(a)intel.com)
- Make vdso prototype match syscall types update
Changes in v3:
- Updated copyright date in cpufeature.h
- Fixed typo in cpufeature.h comment (Conor)
- Refactored functions so that kernel mode can query too, in
preparation for the vDSO data population.
- Changed the vendor/arch/imp IDs to return a value of -1 on mismatch
rather than failing the whole call.
- Const cpumask pointer in hwprobe_mid()
- Embellished documentation WRT cpu_set and the returned values.
- Renamed hwprobe_mid() to hwprobe_arch_id() (Conor)
- Fixed machine ID doc warnings, changed elements to c:macro:.
- Completed dangling unistd.h comment (Conor)
- Fixed line breaks and minor logic optimization (Conor).
- Use riscv_cached_mxxxid() (Conor)
- Refactored base ISA behavior probe to allow kernel probing as well,
in prep for vDSO data initialization.
- Fixed doc warnings in IMA text list, use :c:macro:.
- Have hwprobe_misaligned return int instead of long.
- Constify cpumask pointer in hwprobe_misaligned()
- Fix warnings in _PERF_O list documentation, use :c:macro:.
- Move include cpufeature.h to misaligned patch.
- Fix documentation mismatch for RISCV_HWPROBE_KEY_CPUPERF_0 (Conor)
- Use for_each_possible_cpu() instead of NR_CPUS (Conor)
- Break early in misaligned access iteration (Conor)
- Increase MISALIGNED_MASK from 2 bits to 3 for possible UNSUPPORTED future
value (Conor)
- Introduced vDSO function
Changes in v2:
- Factored the move of struct riscv_cpuinfo to its own header
- Changed the interface to look more like poll(). Rather than supplying
key_offset and getting back an array of values with numerically
contiguous keys, have the user pre-fill the key members of the array,
and the kernel will fill in the corresponding values. For any key it
doesn't recognize, it will set the key of that element to -1. This
allows usermode to quickly ask for exactly the elements it cares
about, and not get bogged down in a back and forth about newer keys
that older kernels might not recognize. In other words, the kernel
can communicate that it doesn't recognize some of the keys while
still providing the data for the keys it does know.
- Added a shortcut to the cpuset parameters that if a size of 0 and
NULL is provided for the CPU set, the kernel will use a cpu mask of
all online CPUs. This is convenient because I suspect most callers
will only want to act on a feature if it's supported on all CPUs, and
it's a headache to dynamically allocate an array of all 1s, not to
mention a waste to have the kernel loop over all of the offline bits.
- Fixed logic error in if(of_property_read_string...) that caused crash
- Include cpufeature.h in cpufeature.h to avoid undeclared variable
warning.
- Added a _MASK define
- Fix random checkpatch complaints
- Updated the selftests to the new API and added some more.
- Fixed indentation, comments in .S, and general checkpatch complaints.
Evan Green (6):
RISC-V: Move struct riscv_cpuinfo to new header
RISC-V: Add a syscall for HW probing
RISC-V: hwprobe: Add support for RISCV_HWPROBE_BASE_BEHAVIOR_IMA
RISC-V: hwprobe: Support probing of misaligned access performance
selftests: Test the new RISC-V hwprobe interface
RISC-V: Add hwprobe vDSO function and data
Documentation/riscv/hwprobe.rst | 86 +++++++
Documentation/riscv/index.rst | 1 +
arch/riscv/Kconfig | 1 +
arch/riscv/errata/thead/errata.c | 10 +
arch/riscv/include/asm/alternative.h | 5 +
arch/riscv/include/asm/cpufeature.h | 23 ++
arch/riscv/include/asm/hwprobe.h | 13 +
arch/riscv/include/asm/syscall.h | 4 +
arch/riscv/include/asm/vdso/data.h | 17 ++
arch/riscv/include/asm/vdso/gettimeofday.h | 8 +
arch/riscv/include/uapi/asm/hwprobe.h | 37 +++
arch/riscv/include/uapi/asm/unistd.h | 9 +
arch/riscv/kernel/alternative.c | 19 ++
arch/riscv/kernel/compat_vdso/Makefile | 2 +-
arch/riscv/kernel/cpu.c | 8 +-
arch/riscv/kernel/cpufeature.c | 3 +
arch/riscv/kernel/smpboot.c | 1 +
arch/riscv/kernel/sys_riscv.c | 225 +++++++++++++++++-
arch/riscv/kernel/vdso.c | 6 -
arch/riscv/kernel/vdso/Makefile | 4 +
arch/riscv/kernel/vdso/hwprobe.c | 52 ++++
arch/riscv/kernel/vdso/sys_hwprobe.S | 15 ++
arch/riscv/kernel/vdso/vdso.lds.S | 3 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/riscv/Makefile | 58 +++++
.../testing/selftests/riscv/hwprobe/Makefile | 10 +
.../testing/selftests/riscv/hwprobe/hwprobe.c | 90 +++++++
.../selftests/riscv/hwprobe/sys_hwprobe.S | 12 +
28 files changed, 709 insertions(+), 14 deletions(-)
create mode 100644 Documentation/riscv/hwprobe.rst
create mode 100644 arch/riscv/include/asm/cpufeature.h
create mode 100644 arch/riscv/include/asm/hwprobe.h
create mode 100644 arch/riscv/include/asm/vdso/data.h
create mode 100644 arch/riscv/include/uapi/asm/hwprobe.h
create mode 100644 arch/riscv/kernel/vdso/hwprobe.c
create mode 100644 arch/riscv/kernel/vdso/sys_hwprobe.S
create mode 100644 tools/testing/selftests/riscv/Makefile
create mode 100644 tools/testing/selftests/riscv/hwprobe/Makefile
create mode 100644 tools/testing/selftests/riscv/hwprobe/hwprobe.c
create mode 100644 tools/testing/selftests/riscv/hwprobe/sys_hwprobe.S
--
2.25.1
Hi Linus,
Please pull the following Kselftest fixes update for Linux 6.3-rc5.
This Kselftest fixes update for Linux 6.3-rc5 consists of one single
fix for sigaltstack test -Wuninitialized warning found when building
with clang.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 624c60f326c6e5a80b008e8a5c7feffe8c27dc72:
selftests: fix LLVM build for i386 and x86_64 (2023-03-10 13:41:10 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux-kselftest-fixes-6.3-rc5
for you to fetch changes up to 05107edc910135d27fe557267dc45be9630bf3dd:
selftests: sigaltstack: fix -Wuninitialized (2023-03-20 17:28:31 -0600)
----------------------------------------------------------------
linux-kselftest-fixes-6.3-rc5
This Kselftest fixes update for Linux 6.3-rc5 consists of one single
fix for sigaltstack test -Wuninitialized warning found when building
with clang.
----------------------------------------------------------------
Nick Desaulniers (1):
selftests: sigaltstack: fix -Wuninitialized
.../selftests/sigaltstack/current_stack_pointer.h | 23 ++++++++++++++++++++++
tools/testing/selftests/sigaltstack/sas.c | 7 +------
2 files changed, 24 insertions(+), 6 deletions(-)
create mode 100644 tools/testing/selftests/sigaltstack/current_stack_pointer.
----------------------------------------------------------------
Hi all,
Is there an established process for new kunit infrastructure?
For example, we have this macro that makes KUNIT_ARRAY_PARAM easier by
letting you just declare an array of test cases:
/* Similar to KUNIT_ARRAY_PARAM, but avoiding an extra function */
#define KUNIT_ARRAY_PARAM_DESC(name, array, desc_member) \
static const void *name##_gen_params(const void *prev, char *desc) \
{ \
typeof((array)[0]) *__next = prev ? ((typeof(__next)) prev) + 1 : (array); \
if (__next - (array) < ARRAY_SIZE((array))) { \
strscpy(desc, __next->desc_member, KUNIT_PARAM_DESC_SIZE); \
return __next; \
} \
return NULL; \
}
Also, since we're working on wifi and thus networking, we want e.g. SKBs
to be resource-managed, and added some helper macros/functions for using
kunit_alloc_resource() with SKBs, that will be used at least in cfg80211
and mac80211 soon, so it would seem appropriate to have
include/kunit/skb.h (and a corresponding C file somewhere) with these
helpers.
Is all of this just a case of "nobody needed it so far", or is there no
expectation to add such infrastructure more generally?
johannes
Dzień dobry,
zapoznałem się z Państwa ofertą i z przyjemnością przyznaję, że przyciąga uwagę i zachęca do dalszych rozmów.
Pomyślałem, że może mógłbym mieć swój wkład w Państwa rozwój i pomóc dotrzeć z tą ofertą do większego grona odbiorców. Pozycjonuję strony www, dzięki czemu generują świetny ruch w sieci.
Możemy porozmawiać w najbliższym czasie?
Pozdrawiam
Adam Charachuta
The ftrace selftests do not currently produce KTAP output, they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default, users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/ftrace/Makefile | 3 +-
tools/testing/selftests/ftrace/ftracetest | 63 ++++++++++++++++++++++++--
tools/testing/selftests/ftrace/ftracetest-ktap | 8 ++++
3 files changed, 70 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile
index d6e106fbce11..a1e955d2de4c 100644
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -1,7 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
all:
-TEST_PROGS := ftracetest
+TEST_PROGS_EXTENDED := ftracetest
+TEST_PROGS := ftracetest-ktap
TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index c3311c8c4089..539c8d6d5d71 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -13,6 +13,7 @@ echo "Usage: ftracetest [options] [testcase(s)] [testcase-directory(s)]"
echo " Options:"
echo " -h|--help Show help message"
echo " -k|--keep Keep passed test logs"
+echo " -K|--KTAP Output in KTAP format"
echo " -v|--verbose Increase verbosity of test messages"
echo " -vv Alias of -v -v (Show all results in stdout)"
echo " -vvv Alias of -v -v -v (Show all commands immediately)"
@@ -85,6 +86,10 @@ parse_opts() { # opts
KEEP_LOG=1
shift 1
;;
+ --ktap|-K)
+ KTAP=1
+ shift 1
+ ;;
--verbose|-v|-vv|-vvv)
if [ $VERBOSE -eq -1 ]; then
usage "--console can not use with --verbose"
@@ -178,6 +183,7 @@ TEST_DIR=$TOP_DIR/test.d
TEST_CASES=`find_testcases $TEST_DIR`
LOG_DIR=$TOP_DIR/logs/`date +%Y%m%d-%H%M%S`/
KEEP_LOG=0
+KTAP=0
DEBUG=0
VERBOSE=0
UNSUPPORTED_RESULT=0
@@ -229,7 +235,7 @@ prlog() { # messages
newline=
shift
fi
- printf "$*$newline"
+ [ "$KTAP" != "1" ] && printf "$*$newline"
[ "$LOG_FILE" ] && printf "$*$newline" | strip_esc >> $LOG_FILE
}
catlog() { #file
@@ -260,11 +266,11 @@ TOTAL_RESULT=0
INSTANCE=
CASENO=0
+CASENAME=
testcase() { # testfile
CASENO=$((CASENO+1))
- desc=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
- prlog -n "[$CASENO]$INSTANCE$desc"
+ CASENAME=`grep "^#[ \t]*description:" $1 | cut -f2- -d:`
}
checkreq() { # testfile
@@ -277,40 +283,68 @@ test_on_instance() { # testfile
grep -q "^#[ \t]*flags:.*instance" $1
}
+ktaptest() { # result comment
+ if [ "$KTAP" != "1" ]; then
+ return
+ fi
+
+ local result=
+ if [ "$1" = "1" ]; then
+ result="ok"
+ else
+ result="not ok"
+ fi
+ shift
+
+ local comment=$*
+ if [ "$comment" != "" ]; then
+ comment="# $comment"
+ fi
+
+ echo $CASENO $result $INSTANCE$CASENAME $comment
+}
+
eval_result() { # sigval
case $1 in
$PASS)
prlog " [${color_green}PASS${color_reset}]"
+ ktaptest 1
PASSED_CASES="$PASSED_CASES $CASENO"
return 0
;;
$FAIL)
prlog " [${color_red}FAIL${color_reset}]"
+ ktaptest 0
FAILED_CASES="$FAILED_CASES $CASENO"
return 1 # this is a bug.
;;
$UNRESOLVED)
prlog " [${color_blue}UNRESOLVED${color_reset}]"
+ ktaptest 0 UNRESOLVED
UNRESOLVED_CASES="$UNRESOLVED_CASES $CASENO"
return $UNRESOLVED_RESULT # depends on use case
;;
$UNTESTED)
prlog " [${color_blue}UNTESTED${color_reset}]"
+ ktaptest 1 SKIP
UNTESTED_CASES="$UNTESTED_CASES $CASENO"
return 0
;;
$UNSUPPORTED)
prlog " [${color_blue}UNSUPPORTED${color_reset}]"
+ ktaptest 1 SKIP
UNSUPPORTED_CASES="$UNSUPPORTED_CASES $CASENO"
return $UNSUPPORTED_RESULT # depends on use case
;;
$XFAIL)
prlog " [${color_green}XFAIL${color_reset}]"
+ ktaptest 1 XFAIL
XFAILED_CASES="$XFAILED_CASES $CASENO"
return 0
;;
*)
prlog " [${color_blue}UNDEFINED${color_reset}]"
+ ktaptest 0 error
UNDEFINED_CASES="$UNDEFINED_CASES $CASENO"
return 1 # this must be a test bug
;;
@@ -371,6 +405,7 @@ __run_test() { # testfile
run_test() { # testfile
local testname=`basename $1`
testcase $1
+ prlog -n "[$CASENO]$INSTANCE$CASENAME"
if [ ! -z "$LOG_FILE" ] ; then
local testlog=`mktemp $LOG_DIR/${CASENO}-${testname}-log.XXXXXX`
else
@@ -405,6 +440,17 @@ run_test() { # testfile
# load in the helper functions
. $TEST_DIR/functions
+if [ "$KTAP" = "1" ]; then
+ echo "TAP version 13"
+
+ casecount=`echo $TEST_CASES | wc -w`
+ for t in $TEST_CASES; do
+ test_on_instance $t || continue
+ casecount=$((casecount+1))
+ done
+ echo "1..${casecount}"
+fi
+
# Main loop
for t in $TEST_CASES; do
run_test $t
@@ -439,6 +485,17 @@ prlog "# of unsupported: " `echo $UNSUPPORTED_CASES | wc -w`
prlog "# of xfailed: " `echo $XFAILED_CASES | wc -w`
prlog "# of undefined(test bug): " `echo $UNDEFINED_CASES | wc -w`
+if [ "$KTAP" = "1" ]; then
+ echo -n "# Totals:"
+ echo -n " pass:"`echo $PASSED_CASES | wc -w`
+ echo -n " faii:"`echo $FAILED_CASES | wc -w`
+ echo -n " xfail:"`echo $XFAILED_CASES | wc -w`
+ echo -n " xpass:0"
+ echo -n " skip:"`echo $UNTESTED_CASES $UNSUPPORTED_CASES | wc -w`
+ echo -n " error:"`echo $UNRESOLVED_CASES $UNDEFINED_CASES | wc -w`
+ echo
+fi
+
cleanup
# if no error, return 0
diff --git a/tools/testing/selftests/ftrace/ftracetest-ktap b/tools/testing/selftests/ftrace/ftracetest-ktap
new file mode 100755
index 000000000000..b3284679ef3a
--- /dev/null
+++ b/tools/testing/selftests/ftrace/ftracetest-ktap
@@ -0,0 +1,8 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# ftracetest-ktap: Wrapper to integrate ftracetest with the kselftest runner
+#
+# Copyright (C) Arm Ltd., 2023
+
+./ftracetest -K
---
base-commit: fe15c26ee26efa11741a7b632e9f23b01aca4cc6
change-id: 20230302-ftrace-kselftest-ktap-9d7878691557
Best regards,
--
Mark Brown <broonie(a)kernel.org>
From: Roberto Sassu <roberto.sassu(a)huawei.com>
A User Mode Driver (UMD) is a specialization of a User Mode Helper (UMH),
which runs a user space process from a binary blob, and creates a
bidirectional pipe, so that the kernel can make a request to that process,
and the latter provides its response. It is currently used by bpfilter,
although it does not seem to do any useful work.
The problem is, if other users would like to implement a UMD similar to
bpfilter, they would have to duplicate the code. Instead, make an UMD
management library and API from the existing bpfilter and sockopt code,
and move it to common kernel code.
Also, define the software architecture and the main components of the
library: the UMD Manager, running in the kernel, acting as the frontend
interface to any user or kernel-originated request; the UMD Loader, also
running in the kernel, responsible to load the UMD Handler; the UMD
Handler, running in user space, responsible to handle requests from the UMD
Manager and to send to it the response.
I have two use cases, but for sake of brevity I will propose one.
I would like to add support for PGP keys and signatures in the kernel, so
that I can extend secure boot to applications, and allow/deny code
execution based on the signed file digests included in RPM headers.
While I proposed a patch set a while ago (based on a previous work of David
Howells), the main objection was that the PGP packet parser should not run
in the kernel.
That makes a perfect example for using a UMD. If the PGP parser is moved to
user space (UMD Handler), and the kernel (UMD Manager) just instantiates
the key and verifies the signature on already parsed data, this would
address the concern.
Patch 1 moves the function bpfilter_send_req() to usermode_driver.c and
makes the pipe between the kernel and the user space process suitable for
larger quantity of data (> 64K).
Patch 2 introduces the management library and API.
Patch 3 replaces the existing bpfilter and sockopt code with calls
to the management API. To use the new mechanism, sockopt itself (acts as
UMD Manager) now sends/receives messages to/from bpfilter_umh (acts as UMD
Handler), instead of bpfilter (acts as UMD Loader).
Patch 4 introduces a sample UMD, useful for other implementors, and uses it
for testing.
Patch 5 introduces the documentation of the new management library and API.
Roberto Sassu (5):
usermode_driver: Introduce umd_send_recv() from bpfilter
usermode_driver_mgmt: Introduce management of user mode drivers
bpfilter: Port to user mode driver management API
selftests/umd_mgmt: Add selftests for UMD management library
doc: Add documentation for the User Mode Driver management library
Documentation/driver-api/index.rst | 1 +
Documentation/driver-api/umd_mgmt.rst | 99 +++++++++++++
MAINTAINERS | 9 ++
include/linux/bpfilter.h | 12 +-
include/linux/usermode_driver.h | 2 +
include/linux/usermode_driver_mgmt.h | 35 +++++
kernel/Makefile | 2 +-
kernel/usermode_driver.c | 47 +++++-
kernel/usermode_driver_mgmt.c | 137 ++++++++++++++++++
net/bpfilter/bpfilter_kern.c | 120 +--------------
net/ipv4/bpfilter/sockopt.c | 67 +++++----
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/umd_mgmt/.gitignore | 1 +
tools/testing/selftests/umd_mgmt/Makefile | 14 ++
tools/testing/selftests/umd_mgmt/config | 1 +
.../selftests/umd_mgmt/sample_umd/Makefile | 22 +++
.../selftests/umd_mgmt/sample_umd/msgfmt.h | 13 ++
.../umd_mgmt/sample_umd/sample_binary_blob.S | 7 +
.../umd_mgmt/sample_umd/sample_handler.c | 81 +++++++++++
.../umd_mgmt/sample_umd/sample_loader.c | 28 ++++
.../umd_mgmt/sample_umd/sample_mgr.c | 124 ++++++++++++++++
tools/testing/selftests/umd_mgmt/umd_mgmt.sh | 40 +++++
22 files changed, 707 insertions(+), 156 deletions(-)
create mode 100644 Documentation/driver-api/umd_mgmt.rst
create mode 100644 include/linux/usermode_driver_mgmt.h
create mode 100644 kernel/usermode_driver_mgmt.c
create mode 100644 tools/testing/selftests/umd_mgmt/.gitignore
create mode 100644 tools/testing/selftests/umd_mgmt/Makefile
create mode 100644 tools/testing/selftests/umd_mgmt/config
create mode 100644 tools/testing/selftests/umd_mgmt/sample_umd/Makefile
create mode 100644 tools/testing/selftests/umd_mgmt/sample_umd/msgfmt.h
create mode 100644 tools/testing/selftests/umd_mgmt/sample_umd/sample_binary_blob.S
create mode 100644 tools/testing/selftests/umd_mgmt/sample_umd/sample_handler.c
create mode 100644 tools/testing/selftests/umd_mgmt/sample_umd/sample_loader.c
create mode 100644 tools/testing/selftests/umd_mgmt/sample_umd/sample_mgr.c
create mode 100755 tools/testing/selftests/umd_mgmt/umd_mgmt.sh
--
2.25.1
On Mon, Mar 27, 2023 at 07:56:03AM +0200, Markus Elfring wrote:
> >> 2. Can a cg_destroy() call ever work as expected if a cg_create() call failed?
> >
> > Perhaps next time you can answer your own question by spending 30
> > seconds actually reading the code you're "fixing":
> >
> > int cg_destroy(const char *cgroup)
> > {
> …
> > ret = rmdir(cgroup);
> …
> > if (ret && errno == ENOENT) <<< that case is explicitly handled here
> > ret = 0;
> >
> > return ret;
> > }
>
> Is it interesting somehow that a non-existing directory (which would occasionally
> not be found) is tolerated so far?
> https://elixir.bootlin.com/linux/v6.3-rc3/source/tools/testing/selftests/cg…
>
> Should such a function call be avoided because of a failed cg_create() call?
The point is that (a) you were wrong that this is fixing anything, and
(b) this patch is functionally useless. Sure, we could move some goto's
around and subjectively improve "something". Why? What's the point?
It's highly debatable that what you're doing is even an improvement, and
I'm not interested in wasting time pontificating about the merits of a
trivial "fix" for a test cleanup function that isn't even broken.
Several people have already either advised or directly asked you to stop
sending these patches. I'm not sure why you're choosing to ignore them,
but I'll throw my hat in the ring regardless and do the same. Please
stop sending these fake cleanup patches.
Add the test result "skip" to KTAP version 2 as an alternative way to
indicate a test was skipped.
The current spec uses the "#SKIP" directive to indicate that a test was
skipped. However, the "#SKIP" directive is not always evident when quickly
skimming through KTAP results.
The "skip" result would provide an alternative that could make it clearer
that a test has not successfully passed because it was skipped.
Before:
KTAP version 1
1..1
KTAP version 1
1..2
ok 1 case_1
ok 2 case_2 #SKIP
ok 1 suite
After:
KTAP version 2
1..1
KTAP version 2
1..2
ok 1 case_1
skip 2 case_2
ok 1 suite
Here is a link to a version of the KUnit parser that is able to parse
the skip test result for KTAP version 2. Note this parser is still able
to parse the "#SKIP" directive.
Link: https://kunit-review.googlesource.com/c/linux/+/5689
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
Note: this patch is based on Frank's ktap_spec_version_2 branch.
Documentation/dev-tools/ktap.rst | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/Documentation/dev-tools/ktap.rst b/Documentation/dev-tools/ktap.rst
index ff77f4aaa6ef..f48aa00db8f0 100644
--- a/Documentation/dev-tools/ktap.rst
+++ b/Documentation/dev-tools/ktap.rst
@@ -74,7 +74,8 @@ They are required and must have the format:
<result> <number> [<description>][ # [<directive>] [<diagnostic data>]]
The result can be either "ok", which indicates the test case passed,
-or "not ok", which indicates that the test case failed.
+"not ok", which indicates that the test case failed, or "skip", which indicates
+the test case did not run.
<number> represents the number of the test being performed. The first test must
have the number 1 and the number then must increase by 1 for each additional
@@ -91,12 +92,13 @@ A directive is a keyword that indicates a different outcome for a test other
than passed and failed. The directive is optional, and consists of a single
keyword preceding the diagnostic data. In the event that a parser encounters
a directive it doesn't support, it should fall back to the "ok" / "not ok"
-result.
+/ "skip" result.
Currently accepted directives are:
-- "SKIP", which indicates a test was skipped (note the result of the test case
- result line can be either "ok" or "not ok" if the SKIP directive is used)
+- "SKIP", which indicates a test was skipped (note this is an alternative to
+ the "skip" result type and if the SKIP directive is used, the
+ result can be any type - "ok", "not ok", or "skip")
- "TODO", which indicates that a test is not expected to pass at the moment,
e.g. because the feature it is testing is known to be broken. While this
directive is inherited from TAP, its use in the kernel is discouraged.
@@ -110,7 +112,7 @@ Currently accepted directives are:
The diagnostic data is a plain-text field which contains any additional details
about why this result was produced. This is typically an error message for ERROR
-or failed tests, or a description of missing dependencies for a SKIP result.
+or failed tests, or a description of missing dependencies for a skipped test.
The diagnostic data field is optional, and results which have neither a
directive nor any diagnostic data do not need to include the "#" field
@@ -130,11 +132,18 @@ The test "test_case_name" failed.
::
- ok 1 test # SKIP necessary dependency unavailable
+ skip 1 test # necessary dependency unavailable
-The test "test" was SKIPPED with the diagnostic message "necessary dependency
+The test "test" was skipped with the diagnostic message "necessary dependency
unavailable".
+::
+
+ ok 1 test_2 # SKIP this test should not run
+
+The test "test_2" was skipped with the diagnostic message "this test
+should not run".
+
::
not ok 1 test # TIMEOUT 30 seconds
@@ -225,7 +234,7 @@ An example format with multiple levels of nested testing:
not ok 1 test_1
ok 2 test_2
not ok 1 test_3
- ok 2 test_4 # SKIP
+ skip 2 test_4
not ok 1 example_test_1
ok 2 example_test_2
@@ -262,7 +271,7 @@ Example KTAP output
ok 1 example_test_1
KTAP version 2
1..2
- ok 1 test_1 # SKIP test_1 skipped
+ skip 1 test_1 # test_1 skipped
ok 2 test_2
ok 2 example_test_2
KTAP version 2
base-commit: 906f02e42adfbd5ae70d328ee71656ecb602aaf5
--
2.40.0.rc1.284.g88254d51c5-goog
On Sun, Mar 26, 2023 at 10:15:31AM +0200, Markus Elfring wrote:
[...]
> >>
> >> Fixes: a987785dcd6c8ae2915460582aebd6481c81eb67 ("Add tests for memory.oom.group")
> >
> > Fixes what in the what now?
>
> 1. Check repetition (which can be undesirable)
>
> 2. Can a cg_destroy() call ever work as expected if a cg_create() call failed?
Perhaps next time you can answer your own question by spending 30
seconds actually reading the code you're "fixing":
int cg_destroy(const char *cgroup)
{
int ret;
retry:
ret = rmdir(cgroup);
if (ret && errno == EBUSY) {
cg_killall(cgroup);
usleep(100);
goto retry;
}
if (ret && errno == ENOENT) <<< that case is explicitly handled here
ret = 0;
return ret;
}
This patch series includes miscellaneous update to the cpuset and its
testing code.
Patch 2 is actually a follow-up of commit 3fb906e7fabb ("cgroup/cpuset:
Don't filter offline CPUs in cpuset_cpus_allowed() for top cpuset tasks").
Patches 3-4 are for handling corner cases when dealing with
task_cpu_possible_mask().
Waiman Long (5):
cgroup/cpuset: Skip task update if hotplug doesn't affect current
cpuset
cgroup/cpuset: Include offline CPUs when tasks' cpumasks in top_cpuset
are updated
cgroup/cpuset: Find another usable CPU if none found in current cpuset
cgroup/cpuset: Add CONFIG_DEBUG_CPUSETS config for cpuset testing
cgroup/cpuset: Minor updates to test_cpuset_prs.sh
init/Kconfig | 5 +
kernel/cgroup/cpuset.c | 155 +++++++++++++++++-
.../selftests/cgroup/test_cpuset_prs.sh | 25 +--
3 files changed, 165 insertions(+), 20 deletions(-)
--
2.31.1
On Sat, Mar 25, 2023 at 07:30:21PM +0100, Markus Elfring wrote:
> Date: Sat, 25 Mar 2023 19:11:13 +0100
>
> The label “cleanup” was used to jump to another pointer check despite of
> the detail in the implementation of the function
> “test_memcg_oom_group_score_events” that it was determined already
> that a corresponding variable contained a null pointer.
This is poorly writte and confusing. Something like 'avoid unnecessary null
check/cg_destroy() invocation' would be far clearer.
>
> 1. Thus return directly after a call of the function “cg_name” failed.
>
This feels superfluious.
> 2. Use an additional label.
This also feels superfluious.
>
> 3. Delete a questionable check.
This seems superfluois and frankly, rude. It's not questionable, it's
readable, you should try to avoid language like 'questionable' when the
purpose of the check is obvious.
>
>
> This issue was detected by using the Coccinelle software.
>
> Fixes: a987785dcd6c8ae2915460582aebd6481c81eb67 ("Add tests for memory.oom.group")
Fixes what in the what now? This is not a bug fix, it's a 'questionable'
refactoring.
> Signed-off-by: Markus Elfring <elfring(a)users.sourceforge.net>
> ---
> tools/testing/selftests/cgroup/test_memcontrol.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index f4f7c0aef702..afcd1752413e 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -1242,12 +1242,11 @@ static int test_memcg_oom_group_score_events(const char *root)
> int safe_pid;
>
> memcg = cg_name(root, "memcg_test_0");
> -
> if (!memcg)
> - goto cleanup;
> + return ret;
>
> if (cg_create(memcg))
> - goto cleanup;
> + goto free_cg;
>
> if (cg_write(memcg, "memory.max", "50M"))
> goto cleanup;
> @@ -1275,8 +1274,8 @@ static int test_memcg_oom_group_score_events(const char *root)
> ret = KSFT_PASS;
>
> cleanup:
> - if (memcg)
> - cg_destroy(memcg);
> + cg_destroy(memcg);
> +free_cg:
> free(memcg);
>
> return ret;
> --
> 2.40.0
>
>
I dislike this patch, it adds complexity for no discernible purpose and
actively makes the code _less_ readable and in a self-test of all places (!)
Not all pedantic Coccinelle results are actionable. Remember that it's
humans who are reading the code.
Your email client/scripting is still somehow broken, I couldn't get b4 to
pull it correctly and it seems to have a duplicate message ID. You really
need to take a look at that (git send-email should do this fine for
example).
Please try to filter the output of Coccinelle and instead of spamming
thousands of pointless patches that add no value, try to choose those that
do add value.
My advice overall would be to just stop.
I've been trying to do some stuff with KUnit but I can't seem to
find a current tree where KUnit builds. Running on Debian stable
starting from a clean -next tree and running:
./tools/testing/kunit/kunit.py config
./tools/testing/kunit/kunit.py build
based on Documentation/dev-tools/kunit/start.rst. However I get:
[00:42:59] Configuring KUnit Kernel ...
[00:42:59] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make ARCH=um O=.kunit --jobs=8
ERROR:root:In file included from /usr/include/stdlib.h:1013,
from ../arch/x86/um/os-Linux/registers.c:8:
/usr/include/x86_64-linux-gnu/bits/stdlib-float.h: In function ‘atof’:
/usr/include/x86_64-linux-gnu/bits/stdlib-float.h:26:1: error: SSE register return with SSE disabled
26 | {
| ^
make[4]: *** [../scripts/Makefile.build:252: arch/x86/um/os-Linux/registers.o] Error 1
make[3]: *** [../scripts/Makefile.build:494: arch/x86/um/os-Linux] Error 2
make[3]: *** Waiting for unfinished jobs....
In file included from /usr/include/stdlib.h:1013,
from ../arch/um/drivers/fd.c:7:
/usr/include/x86_64-linux-gnu/bits/stdlib-float.h: In function ‘atof’:
/usr/include/x86_64-linux-gnu/bits/stdlib-float.h:26:1: error: SSE register return with SSE disabled
26 | {
| ^
make[3]: *** [../scripts/Makefile.build:252: arch/um/drivers/fd.o] Error 1
make[3]: *** Waiting for unfinished jobs....
In file included from /usr/include/stdlib.h:1013,
from ../arch/um/os-Linux/skas/process.c:7:
/usr/include/x86_64-linux-gnu/bits/stdlib-float.h: In function ‘atof’:
/usr/include/x86_64-linux-gnu/bits/stdlib-float.h:26:1: error: SSE register return with SSE disabled
26 | {
| ^
make[4]: *** [../scripts/Makefile.build:252: arch/um/os-Linux/skas/process.o] Error 1
make[3]: *** [../scripts/Makefile.build:494: arch/um/os-Linux/skas] Error 2
make[2]: *** [../scripts/Makefile.build:494: arch/um/os-Linux] Error 2
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [../scripts/Makefile.build:494: arch/x86/um] Error 2
make[2]: *** [../scripts/Makefile.build:494: arch/um/drivers] Error 2
In file included from /usr/include/stdlib.h:1013,
from arch/um/kernel/config.c:7:
/usr/include/x86_64-linux-gnu/bits/stdlib-float.h: In function ‘atof’:
/usr/include/x86_64-linux-gnu/bits/stdlib-float.h:26:1: error: SSE register return with SSE disabled
26 | {
| ^
make[3]: *** [../scripts/Makefile.build:252: arch/um/kernel/config.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [../scripts/Makefile.build:494: arch/um/kernel] Error 2
make[1]: *** [/home/broonie/git/bisect/Makefile:2028: .] Error 2
make: *** [Makefile:226: __sub-make] Error 2
[00:43:20] Elapsed time: 20.233s
which isn't ideal. v6.2 is also broken, albeit differently:
ERROR:root:`.exit.text' referenced in section `.uml.exitcall.exit' of arch/um/drivers/virtio_uml.o: defined in discarded section `.exit.text' of arch/um/drivers/virtio_uml.o
collect2: error: ld returned 1 exit status
make[2]: *** [../scripts/Makefile.vmlinux:35: vmlinux] Error 1
make[1]: *** [/home/broonie/git/linux/Makefile:1264: vmlinux] Error 2
make: *** [Makefile:242: __sub-make] Error 2
which makes bisecting a bit of an issue. The kunit-fixes, kunit
and kunit-next trees in -next have the former error. Can anyone
point me at a tree/config/commands that's suitable for working on
KUnit at the minute?
There is a 'malloc' call in vcpu_save_state function, which can
be unsuccessful. This patch will add the malloc failure checking
to avoid possible null dereference and give more information
about test fail reasons.
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
---
tools/testing/selftests/kvm/lib/x86_64/processor.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index c39a4353ba19..827647ff3d41 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -954,6 +954,7 @@ struct kvm_x86_state *vcpu_save_state(struct kvm_vcpu *vcpu)
vcpu_run_complete_io(vcpu);
state = malloc(sizeof(*state) + msr_list->nmsrs * sizeof(state->msrs.entries[0]));
+ TEST_ASSERT(state, "-ENOMEM when allocating kvm state");
vcpu_events_get(vcpu, &state->events);
vcpu_mp_state_get(vcpu, &state->mp_state);
--
2.34.1
In this version, I have integrated Aaron's changes to the amx_test. In
addition, we also integrated one fix patch for a kernel warning due to
xsave address issue.
Patch 1:
Fix a host FPU kernel warning due to missing XTILEDATA in xinit.
Patch 2-8:
Overhaul amx_test. These patches were basically from v2.
Patch 9-13:
Overhaul amx_test from Aaron. I modified the changelog a little bit.
v2 -> v3:
- integrate Aaron's 5 commits with minor changes on commit message.
- Add one fix patch for a kernel warning.
v2:
https://lore.kernel.org/all/20230214184606.510551-1-mizhang@google.com/
Aaron Lewis (5):
KVM: selftests: x86: Assert that XTILE is XSAVE-enabled
KVM: selftests: x86: Assert that both XTILE{CFG,DATA} are
XSAVE-enabled
KVM: selftests: x86: Remove redundant check that XSAVE is supported
KVM: selftests: x86: Check that the palette table exists before using
it
KVM: selftests: x86: Check that XTILEDATA supports XFD
Mingwei Zhang (8):
x86/fpu/xstate: Avoid getting xstate address of init_fpstate if
fpstate contains the component
KVM: selftests: x86: Add a working xstate data structure
KVM: selftests: x86: Fix an error in comment of amx_test
KVM: selftests: x86: Add check of CR0.TS in the #NM handler in
amx_test
KVM: selftests: x86: Add the XFD check to IA32_XFD in #NM handler
KVM: selftests: x86: Fix the checks to XFD_ERR using and operation
KVM: selftests: x86: Enable checking on xcomp_bv in amx_test
KVM: selftests: x86: Repeat the checking of xheader when
IA32_XFD[XTILEDATA] is set in amx_test
arch/x86/kernel/fpu/xstate.c | 10 ++-
.../selftests/kvm/include/x86_64/processor.h | 14 ++++
tools/testing/selftests/kvm/x86_64/amx_test.c | 80 +++++++++----------
3 files changed, 59 insertions(+), 45 deletions(-)
--
2.39.2.637.g21b0678d19-goog
Align the guest stack to match calling sequence requirements in
section "The Stack Frame" of the System V ABI AMD64 Architecture
Processor Supplement, which requires the value (%rsp + 8), NOT %rsp,
to be a multiple of 16 when control is transferred to the function
entry point. I.e. in a normal function call, %rsp needs to be 16-byte
aligned _before_ CALL, not after.
This fixes unexpected #GPs in guest code when the compiler uses SSE
instructions, e.g. to initialize memory, as many SSE instructions
require memory operands (including those on the stack) to be
16-byte-aligned.
Signed-off-by: Ackerley Tng <ackerleytng(a)google.com>
---
This patch is a follow-up from discussions at
https://lore.kernel.org/lkml/20230121001542.2472357-9-ackerleytng@google.co…
v1 -> v2: Cleaned the patch up after getting comments from Sean in
v1: https://lore.kernel.org/lkml/Y%2FfHLdvKHlK6D%2F1v@google.com/
Please also see
https://lore.kernel.org/lkml/20230227174654.94641-1-ackerleytng@google.com/
regarding providing alignment macros for selftests.
---
.../selftests/kvm/lib/x86_64/processor.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index ae1e573d94ce..a0669d31bb85 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -5,6 +5,7 @@
* Copyright (C) 2018, Google LLC.
*/
+#include "linux/bitmap.h"
#include "test_util.h"
#include "kvm_util.h"
#include "processor.h"
@@ -573,6 +574,21 @@ struct kvm_vcpu *vm_arch_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id,
DEFAULT_GUEST_STACK_VADDR_MIN,
MEM_REGION_DATA);
+ stack_vaddr += DEFAULT_STACK_PGS * getpagesize();
+
+ /*
+ * Align stack to match calling sequence requirements in section "The
+ * Stack Frame" of the System V ABI AMD64 Architecture Processor
+ * Supplement, which requires the value (%rsp + 8) to be a multiple of
+ * 16 when control is transferred to the function entry point.
+ *
+ * If this code is ever used to launch a vCPU with 32-bit entry point it
+ * may need to subtract 4 bytes instead of 8 bytes.
+ */
+ TEST_ASSERT(IS_ALIGNED(stack_vaddr, PAGE_SIZE),
+ "__vm_vaddr_alloc() did not provide a page-aligned address");
+ stack_vaddr -= 8;
+
vcpu = __vm_vcpu_add(vm, vcpu_id);
vcpu_init_cpuid(vcpu, kvm_get_supported_cpuid());
vcpu_setup(vm, vcpu);
@@ -580,7 +596,7 @@ struct kvm_vcpu *vm_arch_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id,
/* Setup guest general purpose registers */
vcpu_regs_get(vcpu, ®s);
regs.rflags = regs.rflags | 0x2;
- regs.rsp = stack_vaddr + (DEFAULT_STACK_PGS * getpagesize());
+ regs.rsp = stack_vaddr;
regs.rip = (unsigned long) guest_code;
vcpu_regs_set(vcpu, ®s);
--
2.39.2.722.g9855ee24e9-goog
On Fri, Mar 24, 2023 at 7:13 AM Markus Elfring <Markus.Elfring(a)web.de> wrote:
>
> Date: Fri, 24 Mar 2023 14:54:18 +0100
>
> The label “err_out” was used to jump to another pointer check despite of
> the detail in the implementation of the function “rbtree_add_and_remove”
> that it was determined already that a corresponding variable contained
> a null pointer.
>
> 1. Thus return directly after the first call of the function
> “bpf_obj_new” failed.
>
> 2. Delete two questionable checks.
>
> 3. Omit an extra initialisation (for the variable “m”)
> which became unnecessary with this refactoring.
>
>
> This issue was detected by using the Coccinelle software.
>
> Fixes: 215249f6adc0359e3546829e7ee622b5e309b0ad ("selftests/bpf: Add rbtree selftests")
> Signed-off-by: Markus Elfring <elfring(a)users.sourceforge.net>
Nack.
Please stop sending such "cleanup" patches.
Patch 1 removes an unneeded address copy in subflow_syn_recv_sock().
Patch 2 simplifies subflow_syn_recv_sock() to postpone some actions and
to avoid a bunch of conditionals.
Patch 3 stops reporting limits that are not taken into account when the
userspace PM is used.
Patch 4 adds a new test to validate that the 'subflows' field reported
by the kernel is correct. Such info can be retrieved via Netlink (e.g.
with ss) or getsockopt(SOL_MPTCP, MPTCP_INFO).
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
Geliang Tang (1):
selftests: mptcp: add mptcp_info tests
Matthieu Baerts (1):
mptcp: do not fill info not used by the PM in used
Paolo Abeni (2):
mptcp: avoid unneeded address copy
mptcp: simplify subflow_syn_recv_sock()
net/mptcp/sockopt.c | 20 +++++++----
net/mptcp/subflow.c | 43 +++++++---------------
tools/testing/selftests/net/mptcp/mptcp_join.sh | 47 ++++++++++++++++++++++++-
3 files changed, 72 insertions(+), 38 deletions(-)
---
base-commit: 323fe43cf9aef79159ba8937218a3f076bf505af
change-id: 20230324-upstream-net-next-20230324-misc-features-178b2b618414
Best regards,
--
Matthieu Baerts <matthieu.baerts(a)tessares.net>
[ This series depends on VFIO device cdev series ]
Changelog
v5:
* Kept the cmd->id in the iommufd_test_create_access() so the access can
be created with an ioas by default. Then, renamed the previous ioctl
IOMMU_TEST_OP_ACCESS_SET_IOAS to IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, so
it would be used to replace an access->ioas pointer.
* Added iommufd_access_replace() API after the introductions of the other
two APIs iommufd_access_attach() and iommufd_access_detach().
* Since vdev->iommufd_attached is also set in emulated pathway too, call
iommufd_access_update(), similar to the physical pathway.
v4:
https://lore.kernel.org/linux-iommu/cover.1678284812.git.nicolinc@nvidia.co…
* Rebased on top of Jason's series adding replace() and hwpt_alloc()
https://lore.kernel.org/linux-iommu/0-v2-51b9896e7862+8a8c-iommufd_alloc_jg…
* Rebased on top of cdev series v6
https://lore.kernel.org/kvm/20230308132903.465159-1-yi.l.liu@intel.com/
* Dropped the patch that's moved to cdev series.
* Added unmap function pointer sanity before calling it.
* Added "Reviewed-by" from Kevin and Yi.
* Added back the VFIO change updating the ATTACH uAPI.
v3:
https://lore.kernel.org/linux-iommu/cover.1677288789.git.nicolinc@nvidia.co…
* Rebased on top of Jason's iommufd_hwpt branch:
https://lore.kernel.org/linux-iommu/0-v2-406f7ac07936+6a-iommufd_hwpt_jgg@n…
* Dropped patches from this series accordingly. There were a couple of
VFIO patches that will be submitted after the VFIO cdev series. Also,
renamed the series to be "emulated".
* Moved dma_unmap sanity patch to the first in the series.
* Moved dma_unmap sanity to cover both VFIO and IOMMUFD pathways.
* Added Kevin's "Reviewed-by" to two of the patches.
* Fixed a NULL pointer bug in vfio_iommufd_emulated_bind().
* Moved unmap() call to the common place in iommufd_access_set_ioas().
v2:
https://lore.kernel.org/linux-iommu/cover.1675802050.git.nicolinc@nvidia.co…
* Rebased on top of vfio_device cdev v2 series.
* Update the kdoc and commit message of iommu_group_replace_domain().
* Dropped revert-to-core-domain part in iommu_group_replace_domain().
* Dropped !ops->dma_unmap check in vfio_iommufd_emulated_attach_ioas().
* Added missing rc value in vfio_iommufd_emulated_attach_ioas() from the
iommufd_access_set_ioas() call.
* Added a new patch in vfio_main to deny vfio_pin/unpin_pages() calls if
vdev->ops->dma_unmap is not implemented.
* Added a __iommmufd_device_detach helper and let the replace routine do
a partial detach().
* Added restriction on auto_domains to use the replace feature.
* Added the patch "iommufd/device: Make hwpt_list list_add/del symmetric"
from the has_group removal series.
v1:
https://lore.kernel.org/linux-iommu/cover.1675320212.git.nicolinc@nvidia.co…
Hi all,
The existing IOMMU APIs provide a pair of functions: iommu_attach_group()
for callers to attach a device from the default_domain (NULL if not being
supported) to a given iommu domain, and iommu_detach_group() for callers
to detach a device from a given domain to the default_domain. Internally,
the detach_dev op is deprecated for the newer drivers with default_domain.
This means that those drivers likely can switch an attaching domain to
another one, without stagging the device at a blocking or default domain,
for use cases such as:
1) vPASID mode, when a guest wants to replace a single pasid (PASID=0)
table with a larger table (PASID=N)
2) Nesting mode, when switching the attaching device from an S2 domain
to an S1 domain, or when switching between relevant S1 domains.
This series is rebased on top of Jason Gunthorpe's series that introduces
iommu_group_replace_domain API and IOMMUFD infrastructure for the IOMMUFD
"physical" devices. The IOMMUFD "emulated" deivces will need some extra
steps to replace the access->ioas object and its iopt pointer.
You can also find this series on Github:
https://github.com/nicolinc/iommufd/commits/iommu_group_replace_domain-v5
Thank you
Nicolin Chen
Nicolin Chen (4):
vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
iommufd: Add iommufd_access_replace() API
iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
vfio: Support IO page table replacement
drivers/iommu/iommufd/device.c | 60 +++++++++++++++----
drivers/iommu/iommufd/iommufd_test.h | 4 ++
drivers/iommu/iommufd/selftest.c | 19 ++++++
drivers/vfio/iommufd.c | 11 ++--
drivers/vfio/vfio_main.c | 4 ++
include/linux/iommufd.h | 1 +
include/uapi/linux/vfio.h | 6 ++
tools/testing/selftests/iommu/iommfd*.c | 0
tools/testing/selftests/iommu/iommufd.c | 29 ++++++++-
tools/testing/selftests/iommu/iommufd_utils.h | 19 ++++++
10 files changed, 135 insertions(+), 18 deletions(-)
create mode 100644 tools/testing/selftests/iommu/iommfd*.c
--
2.40.0
From: Zi Yan <ziy(a)nvidia.com>
Hi all,
File folio supports any order and people would like to support flexible orders
for anonymous folio[1] too. Currently, split_huge_page() only splits a huge
page to order-0 pages, but splitting to orders higher than 0 is also useful.
This patchset adds support for splitting a huge page to any lower order pages
and uses it during folio truncate operations.
The patchset is on top of mm-everything-2023-03-19-21-50.
* Patch 1 and 2 add new_order parameter split_page_memcg() and
split_page_owner() and prepare for upcoming changes.
* Patch 3 adds split_huge_page_to_list_to_order() to split a huge page
to any lower order. The original split_huge_page_to_list() calls
split_huge_page_to_list_to_order() with new_order = 0.
* Patch 4 uses split_huge_page_to_list_to_order() in large pagecache folio
truncation instead of split the large folio all the way down to order-0.
* Patch 5 adds a test API to debugfs and test cases in
split_huge_page_test selftests.
Comments and/or suggestions are welcome.
[1] https://lore.kernel.org/linux-mm/Y%2FblF0GIunm+pRIC@casper.infradead.org/
Zi Yan (5):
mm: memcg: make memcg huge page split support any order split.
mm: page_owner: add support for splitting to any order in split
page_owner.
mm: thp: split huge page to any lower order pages.
mm: truncate: split huge page cache page to a non-zero order if
possible.
mm: huge_memory: enable debugfs to split huge pages to any order.
include/linux/huge_mm.h | 10 +-
include/linux/memcontrol.h | 5 +-
include/linux/page_owner.h | 12 +-
mm/huge_memory.c | 138 ++++++++---
mm/memcontrol.c | 8 +-
mm/page_alloc.c | 8 +-
mm/page_owner.c | 11 +-
mm/truncate.c | 21 +-
.../selftests/mm/split_huge_page_test.c | 225 +++++++++++++++++-
9 files changed, 368 insertions(+), 70 deletions(-)
--
2.39.2
Add a new selftest for CMMA migration. Also fix a small issue found during
development of the test.
Nico Boehr (2):
KVM: s390: selftests: add selftest for CMMA migration
KVM: s390: fix KVM_S390_GET_CMMA_BITS for GFNs in memslot holes
arch/s390/kvm/kvm-s390.c | 4 +
tools/testing/selftests/kvm/Makefile | 1 +
tools/testing/selftests/kvm/s390x/cmma_test.c | 679 ++++++++++++++++++
3 files changed, 684 insertions(+)
create mode 100644 tools/testing/selftests/kvm/s390x/cmma_test.c
--
2.39.1
When reporting errors or skips we currently include the diagnostic message
indicating why we're failing or skipping. This isn't ideal since KTAP
defines the entire print as the test name, so if there's an error then test
systems won't detect the test as being the same one as a passing test. Move
the diagnostic to a separate ksft_print_msg() to avoid this issue, the test
name part will always be the same for passes, fails and skips and the
diagnostic information is still displayed.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/alsa/pcm-test.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/alsa/pcm-test.c b/tools/testing/selftests/alsa/pcm-test.c
index 58b525a4a32c..bab56ea67e89 100644
--- a/tools/testing/selftests/alsa/pcm-test.c
+++ b/tools/testing/selftests/alsa/pcm-test.c
@@ -489,17 +489,18 @@ static void test_pcm_time(struct pcm_data *data, enum test_class class,
}
if (!skip)
- ksft_test_result(pass, "%s.%s.%d.%d.%d.%s%s%s\n",
+ ksft_test_result(pass, "%s.%s.%d.%d.%d.%s\n",
test_class_name, test_name,
data->card, data->device, data->subdevice,
- snd_pcm_stream_name(data->stream),
- msg[0] ? " " : "", msg);
+ snd_pcm_stream_name(data->stream));
else
- ksft_test_result_skip("%s.%s.%d.%d.%d.%s%s%s\n",
+ ksft_test_result_skip("%s.%s.%d.%d.%d.%s\n",
test_class_name, test_name,
data->card, data->device, data->subdevice,
- snd_pcm_stream_name(data->stream),
- msg[0] ? " " : "", msg);
+ snd_pcm_stream_name(data->stream));
+
+ if (msg[0])
+ ksft_print_msg("%s\n", msg);
pthread_mutex_unlock(&results_lock);
---
base-commit: e8d018dd0257f744ca50a729e3d042cf2ec9da65
change-id: 20230323-alsa-pcm-test-names-bcd31b586ca9
Best regards,
--
Mark Brown <broonie(a)kernel.org>
This is useful when using nolibc for security-critical tools.
Using nolibc has the advantage that the code is easily auditable and
sandboxable with seccomp as no unexpected syscalls are used.
Using compiler-assistent stack protection provides another security
mechanism.
For this to work the compiler and libc have to collaborate.
This patch adds the following parts to nolibc that are required by the
compiler:
* __stack_chk_guard: random sentinel value
* __stack_chk_fail: handler for detected stack smashes
In addition an initialization function is added that randomizes the
sentinel value.
Only support for global guards is implemented.
Register guards are useful in multi-threaded context which nolibc does
not provide support for.
Link: https://lwn.net/Articles/584225/
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Changes in v2:
- Code and comments style fixes
- Only use raw syscalls in stackprotector functions
- Remove need for dedicated entrypoint and exec() during tests
- Add more rationale
- Shuffle some code around between commits
- Provide compatibility with the -fno-stack-protector patch
- Remove RFC status
- Link to v1: https://lore.kernel.org/r/20230223-nolibc-stackprotector-v1-0-3e74d81b3f21@…
This series is based on the current rcu/dev branch of Pauls rcu tree.
---
Thomas Weißschuh (8):
tools/nolibc: add definitions for standard fds
tools/nolibc: add helpers for wait() signal exits
tools/nolibc: tests: constify test_names
tools/nolibc: add support for stack protector
tools/nolibc: tests: fold in no-stack-protector cflags
tools/nolibc: tests: add test for -fstack-protector
tools/nolibc: i386: add stackprotector support
tools/nolibc: x86_64: add stackprotector support
tools/include/nolibc/Makefile | 4 +-
tools/include/nolibc/arch-i386.h | 7 ++-
tools/include/nolibc/arch-x86_64.h | 5 +++
tools/include/nolibc/nolibc.h | 1 +
tools/include/nolibc/stackprotector.h | 53 +++++++++++++++++++++++
tools/include/nolibc/types.h | 2 +
tools/include/nolibc/unistd.h | 5 +++
tools/testing/selftests/nolibc/Makefile | 11 ++++-
tools/testing/selftests/nolibc/nolibc-test.c | 64 ++++++++++++++++++++++++++--
9 files changed, 144 insertions(+), 8 deletions(-)
---
base-commit: a9b8406e51603238941dbc6fa1437f8915254ebb
change-id: 20230223-nolibc-stackprotector-d4d5f48ff771
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Hi,
This series adds initial KVM selftests support for powerpc
(64-bit, BookS). It spans 3 maintainers but it does not really
affect arch/powerpc, and it is well contained in selftests
code, just touches some makefiles and a tiny bit headers so
conflicts should be unlikely and trivial.
Hey Paolo and KVM group, if you didn't take the v1 series yet, could
you please take this instead. Otherwise I can send an incremental
fixup.
Since v1:
- r2 (TOC) was not being set for guest code
- MSR[VSX] was not being set for guest code
- Proper guest interrupt handling instead of quick hack that
just made a ucall out to host.
- Adjust subject to better match kvm selftests convention.
Thanks,
Nick
Nicholas Piggin (2):
KVM: PPC: selftests: implement support for powerpc
KVM: PPC: selftests: basic sanity tests
tools/testing/selftests/kvm/Makefile | 15 +
.../selftests/kvm/include/kvm_util_base.h | 13 +
.../selftests/kvm/include/powerpc/hcall.h | 22 +
.../selftests/kvm/include/powerpc/ppc_asm.h | 17 +
.../selftests/kvm/include/powerpc/processor.h | 32 ++
tools/testing/selftests/kvm/lib/kvm_util.c | 10 +
.../selftests/kvm/lib/powerpc/handlers.S | 96 ++++
.../testing/selftests/kvm/lib/powerpc/hcall.c | 45 ++
.../selftests/kvm/lib/powerpc/processor.c | 411 ++++++++++++++++++
.../testing/selftests/kvm/lib/powerpc/ucall.c | 30 ++
tools/testing/selftests/kvm/powerpc/helpers.h | 46 ++
.../testing/selftests/kvm/powerpc/null_test.c | 166 +++++++
.../selftests/kvm/powerpc/rtas_hcall.c | 146 +++++++
13 files changed, 1049 insertions(+)
create mode 100644 tools/testing/selftests/kvm/include/powerpc/hcall.h
create mode 100644 tools/testing/selftests/kvm/include/powerpc/ppc_asm.h
create mode 100644 tools/testing/selftests/kvm/include/powerpc/processor.h
create mode 100644 tools/testing/selftests/kvm/lib/powerpc/handlers.S
create mode 100644 tools/testing/selftests/kvm/lib/powerpc/hcall.c
create mode 100644 tools/testing/selftests/kvm/lib/powerpc/processor.c
create mode 100644 tools/testing/selftests/kvm/lib/powerpc/ucall.c
create mode 100644 tools/testing/selftests/kvm/powerpc/helpers.h
create mode 100644 tools/testing/selftests/kvm/powerpc/null_test.c
create mode 100644 tools/testing/selftests/kvm/powerpc/rtas_hcall.c
--
2.37.2
These patches are based on next-20230307 and UFFD_FEATURE_WP_UNPOPULATED
patches from Peter.
*Changes in v11*
- Rebase on top of next-20230307
- Base patches on UFFD_FEATURE_WP_UNPOPULATED (https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com)
- Do a lot of cosmetic changes and review updates
- Remove ENGAGE_WP + ! GET operation as it can be performed with UFFDIO_WRITEPROTECT
*Changes in v10*
- Add specific condition to return error if hugetlb is used with wp
async
- Move changes in tools/include/uapi/linux/fs.h to separate patch
- Add documentation
*Changes in v9:*
- Correct fault resolution for userfaultfd wp async
- Fix build warnings and errors which were happening on some configs
- Simplify pagemap ioctl's code
*Changes in v8:*
- Update uffd async wp implementation
- Improve PAGEMAP_IOCTL implementation
*Changes in v7:*
- Add uffd wp async
- Update the IOCTL to use uffd under the hood instead of soft-dirty
flags
Hello,
Note:
Soft-dirty pages and pages which have been written-to are synonyms. As
kernel already has soft-dirty feature inside which we have given up to
use, we are using written-to terminology while using UFFD async WP under
the hood.
This IOCTL, PAGEMAP_SCAN on pagemap file can be used to get and/or clear
the info about page table entries. The following operations are
supported in this ioctl:
- Get the information if the pages have been written-to (PAGE_IS_WRITTEN),
file mapped (PAGE_IS_FILE), present (PAGE_IS_PRESENT) or swapped
(PAGE_IS_SWAPPED).
- Write-protect the pages (PAGEMAP_WP_ENGAGE) to start finding which
pages have been written-to.
- Find pages which have been written-to and write protect the pages
(atomic PAGE_IS_WRITTEN + PAGEMAP_WP_ENGAGE)
It is possible to find and clear soft-dirty pages entirely in userspace.
But it isn't efficient:
- The mprotect and SIGSEGV handler for bookkeeping
- The userfaultfd wp (synchronous) with the handler for bookkeeping
Some benchmarks can be seen here[1]. This series adds features that weren't
present earlier:
- There is no atomic get soft-dirty/Written-to status and clear present in
the kernel.
- The pages which have been written-to can not be found in accurate way.
(Kernel's soft-dirty PTE bit + sof_dirty VMA bit shows more soft-dirty
pages than there actually are.)
Historically, soft-dirty PTE bit tracking has been used in the CRIU
project. The procfs interface is enough for finding the soft-dirty bit
status and clearing the soft-dirty bit of all the pages of a process.
We have the use case where we need to track the soft-dirty PTE bit for
only specific pages on-demand. We need this tracking and clear mechanism
of a region of memory while the process is running to emulate the
getWriteWatch() syscall of Windows.
*(Moved to using UFFD instead of soft-dirtyi feature to find pages which
have been written-to from v7 patch series)*:
Stop using the soft-dirty flags for finding which pages have been
written to. It is too delicate and wrong as it shows more soft-dirty
pages than the actual soft-dirty pages. There is no interest in
correcting it [2][3] as this is how the feature was written years ago.
It shouldn't be updated to changed behaviour. Peter Xu has suggested
using the async version of the UFFD WP [4] as it is based inherently
on the PTEs.
So in this patch series, I've added a new mode to the UFFD which is
asynchronous version of the write protect. When this variant of the
UFFD WP is used, the page faults are resolved automatically by the
kernel. The pages which have been written-to can be found by reading
pagemap file (!PM_UFFD_WP). This feature can be used successfully to
find which pages have been written to from the time the pages were
write protected. This works just like the soft-dirty flag without
showing any extra pages which aren't soft-dirty in reality.
The information related to pages if the page is file mapped, present and
swapped is required for the CRIU project [5][6]. The addition of the
required mask, any mask, excluded mask and return masks are also required
for the CRIU project [5].
The IOCTL returns the addresses of the pages which match the specific
masks. The page addresses are returned in struct page_region in a compact
form. The max_pages is needed to support a use case where user only wants
to get a specific number of pages. So there is no need to find all the
pages of interest in the range when max_pages is specified. The IOCTL
returns when the maximum number of the pages are found. The max_pages is
optional. If max_pages is specified, it must be equal or greater than the
vec_size. This restriction is needed to handle worse case when one
page_region only contains info of one page and it cannot be compacted.
This is needed to emulate the Windows getWriteWatch() syscall.
The patch series include the detailed selftest which can be used as an
example for the uffd async wp test and PAGEMAP_IOCTL. It shows the
interface usages as well.
[1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora…
[2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
[5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/
[6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/
Regards,
Muhammad Usama Anjum
Muhammad Usama Anjum (7):
userfaultfd: Add UFFD WP Async support
userfaultfd: Define dummy uffd_wp_range()
userfaultfd: update documentation to describe UFFD_FEATURE_WP_ASYNC
fs/proc/task_mmu: Implement IOCTL to get and optionally clear info
about PTEs
tools headers UAPI: Update linux/fs.h with the kernel sources
mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
selftests: mm: add pagemap ioctl tests
Documentation/admin-guide/mm/pagemap.rst | 56 ++
Documentation/admin-guide/mm/userfaultfd.rst | 21 +
fs/proc/task_mmu.c | 366 ++++++++
fs/userfaultfd.c | 25 +-
include/linux/userfaultfd_k.h | 14 +
include/uapi/linux/fs.h | 53 ++
include/uapi/linux/userfaultfd.h | 11 +-
mm/memory.c | 27 +-
tools/include/uapi/linux/fs.h | 53 ++
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 4 +-
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/pagemap_ioctl.c | 920 +++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +
14 files changed, 1549 insertions(+), 7 deletions(-)
create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c
mode change 100644 => 100755 tools/testing/selftests/mm/run_vmtests.sh
--
2.39.2
Although there is a provision for 52 bit VA on arm64 platform, it remains
unutilised and higher addresses are not allocated. In order to accommodate
4PB [2^52] virtual address space where supported, NR_CHUNKS_HIGH is changed
accordingly.
Array holding addresses is changed from static allocation to dynamic
allocation to accommodate its voluminous nature which otherwise might
overflow the stack.
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: linux-mm(a)kvack.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash(a)arm.com>
---
tools/testing/selftests/mm/virtual_address_range.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
index 50564512c5ee..bae0ceaf95b1 100644
--- a/tools/testing/selftests/mm/virtual_address_range.c
+++ b/tools/testing/selftests/mm/virtual_address_range.c
@@ -36,13 +36,15 @@
* till it reaches 512TB. One with size 128TB and the
* other being 384TB.
*
- * On Arm64 the address space is 256TB and no high mappings
- * are supported so far.
+ * On Arm64 the address space is 256TB and support for
+ * high mappings up to 4PB virtual address space has
+ * been added.
*/
#define NR_CHUNKS_128TB ((128 * SZ_1TB) / MAP_CHUNK_SIZE) /* Number of chunks for 128TB */
#define NR_CHUNKS_256TB (NR_CHUNKS_128TB * 2UL)
#define NR_CHUNKS_384TB (NR_CHUNKS_128TB * 3UL)
+#define NR_CHUNKS_3840TB (NR_CHUNKS_128TB * 30UL)
#define ADDR_MARK_128TB (1UL << 47) /* First address beyond 128TB */
#define ADDR_MARK_256TB (1UL << 48) /* First address beyond 256TB */
@@ -51,7 +53,7 @@
#define HIGH_ADDR_MARK ADDR_MARK_256TB
#define HIGH_ADDR_SHIFT 49
#define NR_CHUNKS_LOW NR_CHUNKS_256TB
-#define NR_CHUNKS_HIGH 0
+#define NR_CHUNKS_HIGH NR_CHUNKS_3840TB
#else
#define HIGH_ADDR_MARK ADDR_MARK_128TB
#define HIGH_ADDR_SHIFT 48
@@ -101,7 +103,7 @@ static int validate_lower_address_hint(void)
int main(int argc, char *argv[])
{
char *ptr[NR_CHUNKS_LOW];
- char *hptr[NR_CHUNKS_HIGH];
+ char **hptr;
char *hint;
unsigned long i, lchunks, hchunks;
@@ -119,6 +121,9 @@ int main(int argc, char *argv[])
return 1;
}
lchunks = i;
+ hptr = (char **) calloc(NR_CHUNKS_HIGH, sizeof(char *));
+ if (hptr == NULL)
+ return 1;
for (i = 0; i < NR_CHUNKS_HIGH; i++) {
hint = hind_addr();
@@ -139,5 +144,6 @@ int main(int argc, char *argv[])
for (i = 0; i < hchunks; i++)
munmap(hptr[i], MAP_CHUNK_SIZE);
+ free(hptr);
return 0;
}
--
2.30.2
mmap() fails to allocate 16GB virtual space chunk, skipping both low and
high VA range iterations. Hence, reduce MAP_CHUNK_SIZE to 1GB and update
relevant macros as required.
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: linux-mm(a)kvack.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash(a)arm.com>
---
tools/testing/selftests/mm/virtual_address_range.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
index c0592646ed93..50564512c5ee 100644
--- a/tools/testing/selftests/mm/virtual_address_range.c
+++ b/tools/testing/selftests/mm/virtual_address_range.c
@@ -15,11 +15,15 @@
/*
* Maximum address range mapped with a single mmap()
- * call is little bit more than 16GB. Hence 16GB is
+ * call is little bit more than 1GB. Hence 1GB is
* chosen as the single chunk size for address space
* mapping.
*/
-#define MAP_CHUNK_SIZE 17179869184UL /* 16GB */
+
+#define SZ_1GB (1024 * 1024 * 1024UL)
+#define SZ_1TB (1024 * 1024 * 1024 * 1024UL)
+
+#define MAP_CHUNK_SIZE SZ_1GB
/*
* Address space till 128TB is mapped without any hint
@@ -36,7 +40,7 @@
* are supported so far.
*/
-#define NR_CHUNKS_128TB 8192UL /* Number of 16GB chunks for 128TB */
+#define NR_CHUNKS_128TB ((128 * SZ_1TB) / MAP_CHUNK_SIZE) /* Number of chunks for 128TB */
#define NR_CHUNKS_256TB (NR_CHUNKS_128TB * 2UL)
#define NR_CHUNKS_384TB (NR_CHUNKS_128TB * 3UL)
--
2.30.2
Hi,
I of course took the opportunity at my first time to make a mistake: two
patches were missing in v1.. please note that patch #9 and #10 are newly
added.
Previous version:
v1: https://lore.kernel.org/rcu/20230315235454.2993-1-boqun.feng@gmail.com/
Changes since v1:
* Add two missing patches.
* Fix checkpatch warnings.
You will also be able to find the series at:
https://github.com/fbq/linux rcu/rcutorture.2023.03.20a
top commit is:
6bc6e6b27524
List of changes:
Bhaskar Chowdhury (1):
tools: rcu: Add usage function and check for argument
Paul E. McKenney (7):
rcutorture: Add test_nmis module parameter
rcutorture: Set CONFIG_BOOTPARAM_HOTPLUG_CPU0 to offline CPU 0
rcutorture: Make scenario TREE04 enable lazy call_rcu()
torture: Permit kvm-again.sh --duration to default to previous run
torture: Enable clocksource watchdog with "tsc=watchdog"
rcuscale: Move shutdown from wait_event() to wait_event_idle()
refscale: Move shutdown from wait_event() to wait_event_idle()
Yue Hu (1):
rcutorture: Eliminate variable n_rcu_torture_boost_rterror
Zqiang (1):
rcutorture: Create nocb kthreads only when testing rcu in
CONFIG_RCU_NOCB_CPU=y kernels
kernel/rcu/rcuscale.c | 7 ++-
kernel/rcu/rcutorture.c | 49 +++++++++++++++----
kernel/rcu/refscale.c | 2 +-
tools/rcu/extract-stall.sh | 26 +++++++---
.../selftests/rcutorture/bin/kvm-again.sh | 2 +-
.../selftests/rcutorture/bin/torture.sh | 6 +--
.../selftests/rcutorture/configs/rcu/TREE01 | 1 +
.../selftests/rcutorture/configs/rcu/TREE04 | 1 +
8 files changed, 69 insertions(+), 25 deletions(-)
mode change 100644 => 100755 tools/rcu/extract-stall.sh
--
2.38.1
This adds support for receiving KeyUpdate messages (RFC 8446, 4.6.3
[1]). A sender transmits a KeyUpdate message and then changes its TX
key. The receiver should react by updating its RX key before
processing the next message.
This patchset implements key updates by:
1. pausing decryption when a KeyUpdate message is received, to avoid
attempting to use the old key to decrypt a record encrypted with
the new key
2. returning -EKEYEXPIRED to syscalls that cannot receive the
KeyUpdate message, until the rekey has been performed by userspace
3. passing the KeyUpdate message to userspace as a control message
4. allowing updates of the crypto_info via the TLS_TX/TLS_RX
setsockopts
This API has been tested with gnutls to make sure that it allows
userspace libraries to implement key updates [2]. Thanks to Frantisek
Krenzelok <fkrenzel(a)redhat.com> for providing the implementation in
gnutls and testing the kernel patches.
Note: in a future series, I'll clean up tls_set_sw_offload and
eliminate the per-cipher copy-paste using tls_cipher_size_desc.
[1] https://www.rfc-editor.org/rfc/rfc8446#section-4.6.3
[2] https://gitlab.com/gnutls/gnutls/-/merge_requests/1625
Changes in v2
use reverse xmas tree ordering in tls_set_sw_offload and
do_tls_setsockopt_conf
turn the alt_crypto_info into an else if
selftests: add rekey_fail test
Vadim suggested simplifying tls_set_sw_offload by copying the new
crypto_info in the context in do_tls_setsockopt_conf, and then
detecting the rekey in tls_set_sw_offload based on whether the iv was
already set, but I don't think we can have a common error path
(otherwise we'd free the aead etc on rekey failure). I decided instead
to reorganize tls_set_sw_offload so that the context is unmodified
until we know the rekey cannot fail. Some fields will be touched
during the rekey, but they will be set to the same value they had
before the rekey (prot->rec_seq_size, etc).
Apoorv suggested to name the struct tls_crypto_info_keys "tls13"
rather than "tls12". Since we're using the same crypto_info data for
TLS1.3 as for 1.2, even if the tests only run for TLS1.3, I'd rather
keep the "tls12" name, in case we end up adding a
"tls13_crypto_info_aes_gcm_128" type in the future.
Kuniyuki and Apoorv also suggested preventing rekeys on RX when we
haven't received a matching KeyUpdate message, but I'd rather let
userspace handle this and have a symmetric API between TX and RX on
the kernel side. It's a bit of a foot-gun, but we can't really stop a
broken userspace from rolling back the rec_seq on an existing
crypto_info either, and that seems like a worse possible breakage.
Sabrina Dubroca (5):
tls: remove tls_context argument from tls_set_sw_offload
tls: block decryption when a rekey is pending
tls: implement rekey for TLS1.3
selftests: tls: add key_generation argument to tls_crypto_info_init
selftests: tls: add rekey tests
include/net/tls.h | 4 +
net/tls/tls.h | 3 +-
net/tls/tls_device.c | 2 +-
net/tls/tls_main.c | 37 +++-
net/tls/tls_sw.c | 189 +++++++++++++----
tools/testing/selftests/net/tls.c | 336 +++++++++++++++++++++++++++++-
6 files changed, 511 insertions(+), 60 deletions(-)
--
2.38.1