This series extends KVM RISC-V to allow Guest/VM discover and use
conditional operations related ISA extensions (namely XVentanaCondOps
and Zicond).
To try these patches, use KVMTOOL from riscv_zbx_zicntr_smstateen_condops_v1
branch at: https://github.com/avpatel/kvmtool.git
These patches are based upon the latest riscv_kvm_queue and can also be
found in the riscv_kvm_condops_v3 branch at:
https://github.com/avpatel/linux.git
Changes since v2:
- Dropped patch1, patch2, and patch5 since these patches don't meet
the requirements of patch acceptance policy.
Changes since v1:
- Rebased the series on riscv_kvm_queue
- Split PATCH1 and PATCH2 of v1 series into two patches
- Added separate test configs for XVentanaCondOps and Zicond in PATCH7
of v1 series.
Anup Patel (6):
dt-bindings: riscv: Add Zicond extension entry
RISC-V: Detect Zicond from ISA string
RISC-V: KVM: Allow Zicond extension for Guest/VM
KVM: riscv: selftests: Add senvcfg register to get-reg-list test
KVM: riscv: selftests: Add smstateen registers to get-reg-list test
KVM: riscv: selftests: Add condops extensions to get-reg-list test
.../devicetree/bindings/riscv/extensions.yaml | 6 +++
arch/riscv/include/asm/hwcap.h | 1 +
arch/riscv/include/uapi/asm/kvm.h | 1 +
arch/riscv/kernel/cpufeature.c | 1 +
arch/riscv/kvm/vcpu_onereg.c | 2 +
.../selftests/kvm/riscv/get-reg-list.c | 54 +++++++++++++++++++
6 files changed, 65 insertions(+)
--
2.34.1
Hello, all:
This list is being migrated to new vger infrastructure. No action is required
on your part and there will be no change in how you interact with this list
after the migration is completed.
There will be a short 30-minute delay to the list archives on lore.kernel.org.
Once the backend work is done, I will follow up with another message.
-K
Hi Miklos,
I got a couple of bug reports[1][2] this morning from teams that are
tracking regresssions in linux-next. My patch 513dfacefd71 ("fuse:
share lookup state between submount and its parent") is causing panics
in the fuse unmount path. The reports came from users with SLUB_DEBUG
enabled, and the additional debug sanitization catches the fact that the
submount_lookup field isn't getting initialized which could lead to a
subsequently bogus attempt to access the submount_lookup structure and
adjust its refcount.
I've added SLUB_DEBUG to my testing kconfig, and have reproduced the
problem using the memfd self-test that was triggering the problem for
both reporters. With the fix that follows this e-mail, I see no more
erroneous accesses of poisoned slub memory.
I'm a bit unsure of the desired approach for fixing these kinds of
problems. I'm also away from the office on Nov 10th and Nov 13th, but
expect to be back on the console on the Nov 14th. Given the gap, I've
prepared a pair of patches, but we only need one.
The first is simply a followup fix that addresses the problem in a
subsequent one-line commit.
If you'd rather revert the entire bad patch and go again, the second
patch in the series is a v5 of the original with the submount_lookup
initialization added.
Either should do, but I wasn't sure which approach was preferable.
Thanks, and my apologies for the inconvenience.
-K
[1] https://lore.kernel.org/linux-fsdevel/CA+G9fYue-dV7t-NrOhWwGshvyboXjb2B6HpC…
[2] https://lore.kernel.org/intel-gfx/SJ1PR11MB6129508509896AD7D0E03114B9AFA@SJ…
Currently, the sud_test expects the emulated syscall to return the
emulated syscall number. This assumption only works on architectures
were the syscall calling convention use the same register for syscall
number/syscall return value. This is not the case for RISC-V and thus
the return value must be also emulated using the provided ucontext.
Signed-off-by: Clément Léger <cleger(a)rivosinc.com>
---
tools/testing/selftests/syscall_user_dispatch/sud_test.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/tools/testing/selftests/syscall_user_dispatch/sud_test.c b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
index b5d592d4099e..1b5553c19700 100644
--- a/tools/testing/selftests/syscall_user_dispatch/sud_test.c
+++ b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
@@ -158,6 +158,14 @@ static void handle_sigsys(int sig, siginfo_t *info, void *ucontext)
/* In preparation for sigreturn. */
SYSCALL_DISPATCH_OFF(glob_sel);
+
+ /*
+ * Modify interrupted context returned value according to syscall
+ * calling convention
+ */
+#if defined(__riscv)
+ ((ucontext_t*)ucontext)->uc_mcontext.__gregs[REG_A0] = MAGIC_SYSCALL_1;
+#endif
}
TEST(dispatch_and_return)
--
2.40.1
Hi all,
Here's a series to improve resctrl selftests. It contains following
improvements:
- Excludes shareable bits from CAT test allocation to avoid interference
- Alters read pattern to defeat HW prefetcher optimizations
- Rewrites CAT test to make the CAT test reliable and truly measure
if CAT is working or not
- Introduces generalized test framework making easier to add new tests
- Adds L2 CAT test
- Lots of other cleanups & refactoring
The patches up to CAT test rewrite have been earlier on the mailing list.
I've tried to address all the comments made against them back then.
This series have been tested across a large number of systems from
different generations.
Ilpo Järvinen (24):
selftests/resctrl: Split fill_buf to allow tests finer-grained control
selftests/resctrl: Refactor fill_buf functions
selftests/resctrl: Refactor get_cbm_mask()
selftests/resctrl: Mark get_cache_size() cache_type const
selftests/resctrl: Create cache_size() helper
selftests/resctrl: Exclude shareable bits from schemata in CAT test
selftests/resctrl: Split measure_cache_vals() function
selftests/resctrl: Split show_cache_info() to test specific and
generic parts
selftests/resctrl: Remove unnecessary __u64 -> unsigned long
conversion
selftests/resctrl: Remove nested calls in perf event handling
selftests/resctrl: Consolidate naming of perf event related things
selftests/resctrl: Improve perf init
selftests/resctrl: Convert perf related globals to locals
selftests/resctrl: Move cat_val() to cat_test.c and rename to
cat_test()
selftests/resctrl: Read in less obvious order to defeat prefetch
optimizations
selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test
selftests/resctrl: Create struct for input parameter
selftests/resctrl: Introduce generalized test framework
selftests/resctrl: Pass write_schemata() resource instead of test name
selftests/resctrl: Add helper to convert L2/3 to integer
selftests/resctrl: Get resource id from cache id
selftests/resctrl: Add test groups and name L3 CAT test L3_CAT
selftests/resctrl: Add L2 CAT test
selftests/resctrl: Ignore failures from L2 CAT test with <= 2 bits
tools/testing/selftests/resctrl/cache.c | 263 +++---------
tools/testing/selftests/resctrl/cat_test.c | 386 ++++++++++++------
tools/testing/selftests/resctrl/cmt_test.c | 72 +++-
tools/testing/selftests/resctrl/fill_buf.c | 114 +++---
tools/testing/selftests/resctrl/mba_test.c | 24 +-
tools/testing/selftests/resctrl/mbm_test.c | 26 +-
tools/testing/selftests/resctrl/resctrl.h | 102 ++++-
.../testing/selftests/resctrl/resctrl_tests.c | 202 ++++-----
tools/testing/selftests/resctrl/resctrl_val.c | 6 +-
tools/testing/selftests/resctrl/resctrlfs.c | 234 +++++++----
10 files changed, 807 insertions(+), 622 deletions(-)
--
2.30.2
From: Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
[ Upstream commit cf77bf698887c3b9ebed76dea492b07a3c2c7632 ]
The lkdtm selftest config fragment enables CONFIG_UBSAN_TRAP to make the
ARRAY_BOUNDS test kill the calling process when an out-of-bound access
is detected by UBSAN. However, after this [1] commit, UBSAN is triggered
under many new scenarios that weren't detected before, such as in struct
definitions with fixed-size trailing arrays used as flexible arrays. As
a result, CONFIG_UBSAN_TRAP=y has become a very aggressive option to
enable except for specific situations.
`make kselftest-merge` applies CONFIG_UBSAN_TRAP=y to the kernel config
for all selftests, which makes many of them fail because of system hangs
during boot.
This change removes the config option from the lkdtm kselftest and
configures the ARRAY_BOUNDS test to look for UBSAN reports rather than
relying on the calling process being killed.
[1] commit 2d47c6956ab3 ("ubsan: Tighten UBSAN_BOUNDS on GCC")'
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Link: https://lore.kernel.org/r/20230802063252.1917997-1-ricardo.canuelo@collabor…
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/lkdtm/config | 1 -
tools/testing/selftests/lkdtm/tests.txt | 2 +-
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/lkdtm/config b/tools/testing/selftests/lkdtm/config
index 5d52f64dfb430..7afe05e8c4d79 100644
--- a/tools/testing/selftests/lkdtm/config
+++ b/tools/testing/selftests/lkdtm/config
@@ -9,7 +9,6 @@ CONFIG_INIT_ON_FREE_DEFAULT_ON=y
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
CONFIG_UBSAN=y
CONFIG_UBSAN_BOUNDS=y
-CONFIG_UBSAN_TRAP=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt
index 607b8d7e3ea34..2f3a1b96da6e3 100644
--- a/tools/testing/selftests/lkdtm/tests.txt
+++ b/tools/testing/selftests/lkdtm/tests.txt
@@ -7,7 +7,7 @@ EXCEPTION
#EXHAUST_STACK Corrupts memory on failure
#CORRUPT_STACK Crashes entire system on success
#CORRUPT_STACK_STRONG Crashes entire system on success
-ARRAY_BOUNDS
+ARRAY_BOUNDS call trace:|UBSAN: array-index-out-of-bounds
CORRUPT_LIST_ADD list_add corruption
CORRUPT_LIST_DEL list_del corruption
STACK_GUARD_PAGE_LEADING
--
2.42.0
From: Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
[ Upstream commit cf77bf698887c3b9ebed76dea492b07a3c2c7632 ]
The lkdtm selftest config fragment enables CONFIG_UBSAN_TRAP to make the
ARRAY_BOUNDS test kill the calling process when an out-of-bound access
is detected by UBSAN. However, after this [1] commit, UBSAN is triggered
under many new scenarios that weren't detected before, such as in struct
definitions with fixed-size trailing arrays used as flexible arrays. As
a result, CONFIG_UBSAN_TRAP=y has become a very aggressive option to
enable except for specific situations.
`make kselftest-merge` applies CONFIG_UBSAN_TRAP=y to the kernel config
for all selftests, which makes many of them fail because of system hangs
during boot.
This change removes the config option from the lkdtm kselftest and
configures the ARRAY_BOUNDS test to look for UBSAN reports rather than
relying on the calling process being killed.
[1] commit 2d47c6956ab3 ("ubsan: Tighten UBSAN_BOUNDS on GCC")'
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Link: https://lore.kernel.org/r/20230802063252.1917997-1-ricardo.canuelo@collabor…
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/lkdtm/config | 1 -
tools/testing/selftests/lkdtm/tests.txt | 2 +-
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/lkdtm/config b/tools/testing/selftests/lkdtm/config
index 5d52f64dfb430..7afe05e8c4d79 100644
--- a/tools/testing/selftests/lkdtm/config
+++ b/tools/testing/selftests/lkdtm/config
@@ -9,7 +9,6 @@ CONFIG_INIT_ON_FREE_DEFAULT_ON=y
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
CONFIG_UBSAN=y
CONFIG_UBSAN_BOUNDS=y
-CONFIG_UBSAN_TRAP=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt
index 607b8d7e3ea34..2f3a1b96da6e3 100644
--- a/tools/testing/selftests/lkdtm/tests.txt
+++ b/tools/testing/selftests/lkdtm/tests.txt
@@ -7,7 +7,7 @@ EXCEPTION
#EXHAUST_STACK Corrupts memory on failure
#CORRUPT_STACK Crashes entire system on success
#CORRUPT_STACK_STRONG Crashes entire system on success
-ARRAY_BOUNDS
+ARRAY_BOUNDS call trace:|UBSAN: array-index-out-of-bounds
CORRUPT_LIST_ADD list_add corruption
CORRUPT_LIST_DEL list_del corruption
STACK_GUARD_PAGE_LEADING
--
2.42.0
From: Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
[ Upstream commit cf77bf698887c3b9ebed76dea492b07a3c2c7632 ]
The lkdtm selftest config fragment enables CONFIG_UBSAN_TRAP to make the
ARRAY_BOUNDS test kill the calling process when an out-of-bound access
is detected by UBSAN. However, after this [1] commit, UBSAN is triggered
under many new scenarios that weren't detected before, such as in struct
definitions with fixed-size trailing arrays used as flexible arrays. As
a result, CONFIG_UBSAN_TRAP=y has become a very aggressive option to
enable except for specific situations.
`make kselftest-merge` applies CONFIG_UBSAN_TRAP=y to the kernel config
for all selftests, which makes many of them fail because of system hangs
during boot.
This change removes the config option from the lkdtm kselftest and
configures the ARRAY_BOUNDS test to look for UBSAN reports rather than
relying on the calling process being killed.
[1] commit 2d47c6956ab3 ("ubsan: Tighten UBSAN_BOUNDS on GCC")'
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo(a)collabora.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Link: https://lore.kernel.org/r/20230802063252.1917997-1-ricardo.canuelo@collabor…
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/lkdtm/config | 1 -
tools/testing/selftests/lkdtm/tests.txt | 2 +-
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/lkdtm/config b/tools/testing/selftests/lkdtm/config
index 5d52f64dfb430..7afe05e8c4d79 100644
--- a/tools/testing/selftests/lkdtm/config
+++ b/tools/testing/selftests/lkdtm/config
@@ -9,7 +9,6 @@ CONFIG_INIT_ON_FREE_DEFAULT_ON=y
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
CONFIG_UBSAN=y
CONFIG_UBSAN_BOUNDS=y
-CONFIG_UBSAN_TRAP=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt
index 607b8d7e3ea34..2f3a1b96da6e3 100644
--- a/tools/testing/selftests/lkdtm/tests.txt
+++ b/tools/testing/selftests/lkdtm/tests.txt
@@ -7,7 +7,7 @@ EXCEPTION
#EXHAUST_STACK Corrupts memory on failure
#CORRUPT_STACK Crashes entire system on success
#CORRUPT_STACK_STRONG Crashes entire system on success
-ARRAY_BOUNDS
+ARRAY_BOUNDS call trace:|UBSAN: array-index-out-of-bounds
CORRUPT_LIST_ADD list_add corruption
CORRUPT_LIST_DEL list_del corruption
STACK_GUARD_PAGE_LEADING
--
2.42.0
Building an arm64 kernel and seftests/bpf with defconfig +
selftests/bpf/config and selftests/bpf/config.aarch64 the fragment
CONFIG_DEBUG_INFO_REDUCED is enabled in arm64's defconfig, it should be
disabled in file sefltests/bpf/config.aarch64 since if its not disabled
CONFIG_DEBUG_INFO_BTF wont be enabled.
Signed-off-by: Anders Roxell <anders.roxell(a)linaro.org>
---
tools/testing/selftests/bpf/config.aarch64 | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/bpf/config.aarch64 b/tools/testing/selftests/bpf/config.aarch64
index 253821494884..7b6e1d48c309 100644
--- a/tools/testing/selftests/bpf/config.aarch64
+++ b/tools/testing/selftests/bpf/config.aarch64
@@ -37,6 +37,7 @@ CONFIG_CRYPTO_USER_API_SKCIPHER=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DEBUG_INFO_BTF=y
CONFIG_DEBUG_INFO_DWARF4=y
+CONFIG_DEBUG_INFO_REDUCED=n
CONFIG_DEBUG_LIST=y
CONFIG_DEBUG_LOCKDEP=y
CONFIG_DEBUG_NOTIFIERS=y
--
2.42.0
Good day!
I plan to migrate the linux-kselftest(a)vger.kernel.org list to the new
infrastructure this week. We're still doing it list-by-list to make sure that
we don't run into scaling issues with the new infra.
The migration will be performed live and should not require any downtime.
There will be no changes to how anyone interacts with the list after
migration is completed, so no action is required on anyone's part.
Please let me know if you have any concerns.
Best wishes,
-K
Goededag,
Ik ben mevrouw Joanna Liu en een medewerker van Citi Bank Hong Kong.
Kan ik € 100.000.000 aan u overmaken? Kan ik je vertrouwen
Ik wacht op jullie reacties
Met vriendelijke groeten
mevrouw Joanna Liu
Two bugfixes and some minor refactorings of the MIPS support.
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Changes in v2:
- Use olddefconfig instead of oldconfig
- mipso32{le,be} -> mips32{le,be}
- Link to v1: https://lore.kernel.org/r/20231105-nolibc-mips-be-v1-0-6c2ad3e50a1f@weisssc…
---
Thomas Weißschuh (6):
tools/nolibc: error out on unsupported architecture
tools/nolibc: move MIPS ABI validation into arch-mips.h
selftests/nolibc: use XARCH for MIPS
selftests/nolibc: explicitly specify ABI for MIPS
selftests/nolibc: extraconfig support
selftests/nolibc: add configuration for mipso32be
tools/include/nolibc/arch-mips.h | 4 ++++
tools/include/nolibc/arch.h | 4 +++-
tools/testing/selftests/nolibc/Makefile | 25 ++++++++++++++++++++-----
tools/testing/selftests/nolibc/run-tests.sh | 2 +-
4 files changed, 28 insertions(+), 7 deletions(-)
---
base-commit: 6de6466e41182875252fe09658f9b7d74c4fa43c
change-id: 20231105-nolibc-mips-be-892785dd3eaa
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Two bugfixes and some minor refactorings of the MIPS support.
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
Thomas Weißschuh (6):
tools/nolibc: error out on unsupported architecture
tools/nolibc: move MIPS ABI validation into arch-mips.h
selftests/nolibc: use XARCH for MIPS
selftests/nolibc: explicitly specify ABI for MIPS
selftests/nolibc: extraconfig support
selftests/nolibc: add configuration for mipso32be
tools/include/nolibc/arch-mips.h | 4 ++++
tools/include/nolibc/arch.h | 4 +++-
tools/testing/selftests/nolibc/Makefile | 25 ++++++++++++++++++++-----
tools/testing/selftests/nolibc/run-tests.sh | 2 +-
4 files changed, 28 insertions(+), 7 deletions(-)
---
base-commit: 6de6466e41182875252fe09658f9b7d74c4fa43c
change-id: 20231105-nolibc-mips-be-892785dd3eaa
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Dzień dobry,
zapoznałem się z Państwa ofertą i z przyjemnością przyznaję, że przyciąga uwagę i zachęca do dalszych rozmów.
Pomyślałem, że może mógłbym mieć swój wkład w Państwa rozwój i pomóc dotrzeć z tą ofertą do większego grona odbiorców. Pozycjonuję strony www, dzięki czemu generują świetny ruch w sieci.
Możemy porozmawiać w najbliższym czasie?
Pozdrawiam serdecznie
Adam Charachuta
The test mm `hugetlb_fault_after_madv` selftest needs one and only one
huge page to run, thus it sets `/proc/sys/vm/nr_hugepages` to 1.
The problem is that further tests require the previous number of
hugepages allocated in order to succeed.
Save the number of huge pages before changing it, and restore it once the
test finishes, so, further tests could run successfully.
Fixes: 116d57303a05 ("selftests/mm: add a new test for madv and hugetlb")
Reported-by: Ryan Roberts <ryan.roberts(a)arm.com>
Closes: https://lore.kernel.org/all/662df57e-47f1-4c15-9b84-f2f2d587fc5c@arm.com/
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
tools/testing/selftests/mm/run_vmtests.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index cc16f6ca8533..00757445278e 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -223,9 +223,12 @@ CATEGORY="hugetlb" run_test ./hugepage-mremap
CATEGORY="hugetlb" run_test ./hugepage-vmemmap
CATEGORY="hugetlb" run_test ./hugetlb-madvise
+nr_hugepages_tmp=$(cat /proc/sys/vm/nr_hugepages)
# For this test, we need one and just one huge page
echo 1 > /proc/sys/vm/nr_hugepages
CATEGORY="hugetlb" run_test ./hugetlb_fault_after_madv
+# Restore the previous number of huge pages, since further tests rely on it
+echo "$nr_hugepages_tmp" > /proc/sys/vm/nr_hugepages
if test_selected "hugetlb"; then
echo "NOTE: These hugetlb tests provide minimal coverage. Use"
--
2.34.1
Create a selftest that exercises the race between page faults and
madvise(MADV_DONTNEED) in the same huge page. Do it by running two
threads that touches the huge page and madvise(MADV_DONTNEED) at the same
time.
In case of a SIGBUS coming at pagefault, the test should fail, since we
hit the bug.
The test doesn't have a signal handler, and if it fails, it fails like
the following
----------------------------------
running ./hugetlb_fault_after_madv
----------------------------------
./run_vmtests.sh: line 186: 595563 Bus error (core dumped) "$@"
[FAIL]
This selftest goes together with the fix of the bug[1] itself.
[1] https://lore.kernel.org/all/20231001005659.2185316-1-riel@surriel.com/#r
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
tools/testing/selftests/mm/Makefile | 1 +
.../selftests/mm/hugetlb_fault_after_madv.c | 73 +++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +
3 files changed, 78 insertions(+)
create mode 100644 tools/testing/selftests/mm/hugetlb_fault_after_madv.c
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 6a9fc5693145..e71ec9910c62 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -68,6 +68,7 @@ TEST_GEN_FILES += split_huge_page_test
TEST_GEN_FILES += ksm_tests
TEST_GEN_FILES += ksm_functional_tests
TEST_GEN_FILES += mdwe_test
+TEST_GEN_FILES += hugetlb_fault_after_madv
ifneq ($(ARCH),arm64)
TEST_GEN_PROGS += soft-dirty
diff --git a/tools/testing/selftests/mm/hugetlb_fault_after_madv.c b/tools/testing/selftests/mm/hugetlb_fault_after_madv.c
new file mode 100644
index 000000000000..73b81c632366
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb_fault_after_madv.c
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <pthread.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "vm_util.h"
+#include "../kselftest.h"
+
+#define MMAP_SIZE (1 << 21)
+#define INLOOP_ITER 100
+
+char *huge_ptr;
+
+/* Touch the memory while it is being madvised() */
+void *touch(void *unused)
+{
+ char *ptr = (char *)huge_ptr;
+
+ for (int i = 0; i < INLOOP_ITER; i++)
+ ptr[0] = '.';
+
+ return NULL;
+}
+
+void *madv(void *unused)
+{
+ usleep(rand() % 10);
+
+ for (int i = 0; i < INLOOP_ITER; i++)
+ madvise(huge_ptr, MMAP_SIZE, MADV_DONTNEED);
+
+ return NULL;
+}
+
+int main(void)
+{
+ unsigned long free_hugepages;
+ pthread_t thread1, thread2;
+ /*
+ * On kernel 6.4, we are able to reproduce the problem with ~1000
+ * interactions
+ */
+ int max = 10000;
+
+ srand(getpid());
+
+ free_hugepages = get_free_hugepages();
+ if (free_hugepages != 1) {
+ ksft_exit_skip("This test needs one and only one page to execute. Got %lu\n",
+ free_hugepages);
+ }
+
+ while (max--) {
+ huge_ptr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
+ -1, 0);
+
+ if ((unsigned long)huge_ptr == -1)
+ ksft_exit_skip("Failed to allocated huge page\n");
+
+ pthread_create(&thread1, NULL, madv, NULL);
+ pthread_create(&thread2, NULL, touch, NULL);
+
+ pthread_join(thread1, NULL);
+ pthread_join(thread2, NULL);
+ munmap(huge_ptr, MMAP_SIZE);
+ }
+
+ return KSFT_PASS;
+}
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index 3e2bc818d566..9f53f7318a38 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -221,6 +221,10 @@ CATEGORY="hugetlb" run_test ./hugepage-mremap
CATEGORY="hugetlb" run_test ./hugepage-vmemmap
CATEGORY="hugetlb" run_test ./hugetlb-madvise
+# For this test, we need one and just one huge page
+echo 1 > /proc/sys/vm/nr_hugepages
+CATEGORY="hugetlb" run_test ./hugetlb_fault_after_madv
+
if test_selected "hugetlb"; then
echo "NOTE: These hugetlb tests provide minimal coverage. Use"
echo " https://github.com/libhugetlbfs/libhugetlbfs.git for"
--
2.34.1
In the PMTU test, when all previous tests are skipped and the new test
passes, the exit code is set to 0. However, the current check mistakenly
treats this as an assignment, causing the check to pass every time.
Consequently, regardless of how many tests have failed, if the latest test
passes, the PMTU test will report a pass.
Fixes: 2a9d3716b810 ("selftests: pmtu.sh: improve the test result processing")
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
v2: use "-eq" instead of "=" to make less error-prone
---
tools/testing/selftests/net/pmtu.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/pmtu.sh b/tools/testing/selftests/net/pmtu.sh
index f838dd370f6a..b3b2dc5a630c 100755
--- a/tools/testing/selftests/net/pmtu.sh
+++ b/tools/testing/selftests/net/pmtu.sh
@@ -2048,7 +2048,7 @@ run_test() {
case $ret in
0)
all_skipped=false
- [ $exitcode=$ksft_skip ] && exitcode=0
+ [ $exitcode -eq $ksft_skip ] && exitcode=0
;;
$ksft_skip)
[ $all_skipped = true ] && exitcode=$ksft_skip
--
2.41.0
This patch series introduces UFFDIO_MOVE feature to userfaultfd, which
has long been implemented and maintained by Andrea in his local tree [1],
but was not upstreamed due to lack of use cases where this approach would
be better than allocating a new page and copying the contents. Previous
upstraming attempts could be found at [6] and [7].
UFFDIO_COPY performs ~20% better than UFFDIO_MOVE when the application
needs pages to be allocated [2]. However, with UFFDIO_MOVE, if pages are
available (in userspace) for recycling, as is usually the case in heap
compaction algorithms, then we can avoid the page allocation and memcpy
(done by UFFDIO_COPY). Also, since the pages are recycled in the
userspace, we avoid the need to release (via madvise) the pages back to
the kernel [3].
We see over 40% reduction (on a Google pixel 6 device) in the compacting
thread’s completion time by using UFFDIO_MOVE vs. UFFDIO_COPY. This was
measured using a benchmark that emulates a heap compaction implementation
using userfaultfd (to allow concurrent accesses by application threads).
More details of the usecase are explained in [3].
Furthermore, UFFDIO_MOVE enables moving swapped-out pages without
touching them within the same vma. Today, it can only be done by mremap,
however it forces splitting the vma.
TODOs for follow-up improvements:
- cross-mm support. Known differences from single-mm and missing pieces:
- memcg recharging (might need to isolate pages in the process)
- mm counters
- cross-mm deposit table moves
- cross-mm test
- document the address space where src and dest reside in struct
uffdio_move
- TLB flush batching. Will require extensive changes to PTL locking in
move_pages_pte(). OTOH that might let us reuse parts of mremap code.
Changes since v3 [8]:
- changed retry path in folio_lock_anon_vma_read() to unlock and then
relock RCU, per Peter Xu
- removed cross-mm support from initial patchset, per David Hildenbrand
- replaced BUG_ONs with VM_WARN_ON or WARN_ON_ONCE, per David Hildenbrand
- added missing cache flushing, per Lokesh Gidra and Peter Xu
- updated manpage text in the patch description, per Peter Xu
- renamed internal functions from "remap" to "move", per Peter Xu
- added mmap_changing check after taking mmap_lock, per Peter Xu
- changed uffd context check to ensure dst_mm is registered onto uffd we
are operating on, Peter Xu and David Hildenbrand
- changed to non-maybe variants of maybe*_mkwrite(), per David Hildenbrand
- fixed warning for CONFIG_TRANSPARENT_HUGEPAGE=n, per kernel test robot
- comments cleanup, per David Hildenbrand and Peter Xu
- checks for VM_IO,VM_PFNMAP,VM_HUGETLB,..., per David Hildenbrand
- prevent moving pinned pages, per Peter Xu
- changed uffd tests to call move uffd_test_ctx_clear() at the end of the
test run instead of in the beginning of the next run
- added support for testcase-specific ops
- added test for moving PMD-aligned blocks
Changes since v2 [5]:
- renamed UFFDIO_REMAP to UFFDIO_MOVE, per David Hildenbrand
- rebase over mm-unstable to use folio_move_anon_rmap(),
per David Hildenbrand
- added text for manpage explaining DONTFORK and KSM requirements for this
feature, per David Hildenbrand
- check for anon_vma changes in the fast path of folio_lock_anon_vma_read,
per Peter Xu
- updated the title and description of the first patch,
per David Hildenbrand
- updating comments in folio_lock_anon_vma_read() explaining the need for
anon_vma checks, per David Hildenbrand
- changed all mapcount checks to PageAnonExclusive, per Jann Horn and
David Hildenbrand
- changed counters in remap_swap_pte() from MM_ANONPAGES to MM_SWAPENTS,
per Jann Horn
- added a check for PTE change after folio is locked in remap_pages_pte(),
per Jann Horn
- added handling of PMD migration entries and bailout when pmd_devmap(),
per Jann Horn
- added checks to ensure both src and dst VMAs are writable, per Peter Xu
- added UFFD_FEATURE_MOVE, per Peter Xu
- removed obsolete comments, per Peter Xu
- renamed remap_anon_pte to remap_present_pte, per Peter Xu
- added a comment for folio_get_anon_vma() explaining the need for
anon_vma checks, per Peter Xu
- changed error handling in remap_pages() to make it more clear,
per Peter Xu
- changed EFAULT to EAGAIN to retry when a hugepage appears or disappears
from under us, per Peter Xu
- added links to previous upstreaming attempts, per David Hildenbrand
Changes since v1 [4]:
- add mmget_not_zero in userfaultfd_remap, per Jann Horn
- removed extern from function definitions, per Matthew Wilcox
- converted to folios in remap_pages_huge_pmd, per Matthew Wilcox
- use PageAnonExclusive in remap_pages_huge_pmd, per David Hildenbrand
- handle pgtable transfers between MMs, per Jann Horn
- ignore concurrent A/D pte bit changes, per Jann Horn
- split functions into smaller units, per David Hildenbrand
- test for folio_test_large in remap_anon_pte, per Matthew Wilcox
- use pte_swp_exclusive for swapcount check, per David Hildenbrand
- eliminated use of mmu_notifier_invalidate_range_start_nonblock,
per Jann Horn
- simplified THP alignment checks, per Jann Horn
- refactored the loop inside remap_pages, per Jann Horn
- additional clarifying comments, per Jann Horn
Main changes since Andrea's last version [1]:
- Trivial translations from page to folio, mmap_sem to mmap_lock
- Replace pmd_trans_unstable() with pte_offset_map_nolock() and handle its
possible failure
- Move pte mapping into remap_pages_pte to allow for retries when source
page or anon_vma is contended. Since pte_offset_map_nolock() start RCU
read section, we can't block anymore after mapping a pte, so have to unmap
the ptesm do the locking and retry.
- Add and use anon_vma_trylock_write() to avoid blocking while in RCU
read section.
- Accommodate changes in mmu_notifier_range_init() API, switch to
mmu_notifier_invalidate_range_start_nonblock() to avoid blocking while in
RCU read section.
- Open-code now removed __swp_swapcount()
- Replace pmd_read_atomic() with pmdp_get_lockless()
- Add new selftest for UFFDIO_MOVE
[1] https://gitlab.com/aarcange/aa/-/commit/2aec7aea56b10438a3881a20a411aa4b1fc…
[2] https://lore.kernel.org/all/1425575884-2574-1-git-send-email-aarcange@redha…
[3] https://lore.kernel.org/linux-mm/CA+EESO4uO84SSnBhArH4HvLNhaUQ5nZKNKXqxRCyj…
[4] https://lore.kernel.org/all/20230914152620.2743033-1-surenb@google.com/
[5] https://lore.kernel.org/all/20230923013148.1390521-1-surenb@google.com/
[6] https://lore.kernel.org/all/1425575884-2574-21-git-send-email-aarcange@redh…
[7] https://lore.kernel.org/all/cover.1547251023.git.blake.caldwell@colorado.ed…
[8] https://lore.kernel.org/all/20231009064230.2952396-1-surenb@google.com/
The patchset applies over mm-unstable.
Andrea Arcangeli (2):
mm/rmap: support move to different root anon_vma in
folio_move_anon_rmap()
userfaultfd: UFFDIO_MOVE uABI
Suren Baghdasaryan (3):
selftests/mm: call uffd_test_ctx_clear at the end of the test
selftests/mm: add uffd_test_case_ops to allow test case-specific
operations
selftests/mm: add UFFDIO_MOVE ioctl test
Documentation/admin-guide/mm/userfaultfd.rst | 3 +
fs/userfaultfd.c | 72 +++
include/linux/rmap.h | 5 +
include/linux/userfaultfd_k.h | 11 +
include/uapi/linux/userfaultfd.h | 29 +-
mm/huge_memory.c | 122 ++++
mm/khugepaged.c | 3 +
mm/rmap.c | 30 +
mm/userfaultfd.c | 596 +++++++++++++++++++
tools/testing/selftests/mm/uffd-common.c | 51 +-
tools/testing/selftests/mm/uffd-common.h | 11 +
tools/testing/selftests/mm/uffd-stress.c | 5 +-
tools/testing/selftests/mm/uffd-unit-tests.c | 144 +++++
13 files changed, 1078 insertions(+), 4 deletions(-)
--
2.42.0.820.g83a721a137-goog
Üdvözlöm, van egy vállalkozásom, amelyre úgy hivatkoztam rád, mint
neked ugyanaz a vezetéknév, mint a néhai ügyfelem, de a részletek az
alábbiak lesznek értesítjük Önt, amikor megerősíti ennek az e-mailnek
a kézhezvételét. Üdvözlettel
*Changes in v33*:
- Add PAGE_IS_FILE support for THPs
*Changes in v31 and v32*:
- Minor updates
*Changes in v30*:
- Rebase on top of next-20230815
- Minor nitpicks
*Changes in v29:*
- Polish IOCTL and improve documentation
*Changes in v28:*
- Fix walk_end and add 17 test cases in selftests patch
*Changes in v27:*
- Handle review comments and minor improvements
- Add performance improvement patch on top with test for easy review
*Changes in v26:*
- Code re-structurring and API changes in PAGEMAP_IOCTL
*Changes in v25*:
- Do proper filtering on hole as well (hole got missed earlier)
*Changes in v24*:
- Rebase on top of next-20230710
- Place WP markers in case of hole as well
*Changes in v23*:
- Set vec_buf_index in loop only when vec_buf_index is set
- Return -EFAULT instead of -EINVAL if vec is NULL
- Correctly return the walk ending address to the page granularity
*Changes in v22*:
- Interface change:
- Replace [start start + len) with [start, end)
- Return the ending address of the address walk in start
*Changes in v21*:
- Abort walk instead of returning error if WP is to be performed on
partial hugetlb
*Changes in v20*
- Correct PAGE_IS_FILE and add PAGE_IS_PFNZERO
*Changes in v19*
- Minor changes and interface updates
*Changes in v18*
- Rebase on top of next-20230613
- Minor updates
*Changes in v17*
- Rebase on top of next-20230606
- Minor improvements in PAGEMAP_SCAN IOCTL patch
*Changes in v16*
- Fix a corner case
- Add exclusive PM_SCAN_OP_WP back
*Changes in v15*
- Build fix (Add missed build fix in RESEND)
*Changes in v14*
- Fix build error caused by #ifdef added at last minute in some configs
*Changes in v13*
- Rebase on top of next-20230414
- Give-up on using uffd_wp_range() and write new helpers, flush tlb only
once
*Changes in v12*
- Update and other memory types to UFFD_FEATURE_WP_ASYNC
- Rebaase on top of next-20230406
- Review updates
*Changes in v11*
- Rebase on top of next-20230307
- Base patches on UFFD_FEATURE_WP_UNPOPULATED
- Do a lot of cosmetic changes and review updates
- Remove ENGAGE_WP + !GET operation as it can be performed with
UFFDIO_WRITEPROTECT
*Changes in v10*
- Add specific condition to return error if hugetlb is used with wp
async
- Move changes in tools/include/uapi/linux/fs.h to separate patch
- Add documentation
*Changes in v9:*
- Correct fault resolution for userfaultfd wp async
- Fix build warnings and errors which were happening on some configs
- Simplify pagemap ioctl's code
*Changes in v8:*
- Update uffd async wp implementation
- Improve PAGEMAP_IOCTL implementation
*Changes in v7:*
- Add uffd wp async
- Update the IOCTL to use uffd under the hood instead of soft-dirty
flags
*Motivation*
The real motivation for adding PAGEMAP_SCAN IOCTL is to emulate Windows
GetWriteWatch() and ResetWriteWatch() syscalls [1]. The GetWriteWatch()
retrieves the addresses of the pages that are written to in a region of
virtual memory.
This syscall is used in Windows applications and games etc. This syscall is
being emulated in pretty slow manner in userspace. Our purpose is to
enhance the kernel such that we translate it efficiently in a better way.
Currently some out of tree hack patches are being used to efficiently
emulate it in some kernels. We intend to replace those with these patches.
So the whole gaming on Linux can effectively get benefit from this. It
means there would be tons of users of this code.
CRIU use case [2] was mentioned by Andrei and Danylo:
> Use cases for migrating sparse VMAs are binaries sanitized with ASAN,
> MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of
> shadow memory [4]. Being able to migrate such binaries allows to highly
> reduce the amount of work needed to identify and fix post-migration
> crashes, which happen constantly.
Andrei's defines the following uses of this code:
* it is more granular and allows us to track changed pages more
effectively. The current interface can clear dirty bits for the entire
process only. In addition, reading info about pages is a separate
operation. It means we must freeze the process to read information
about all its pages, reset dirty bits, only then we can start dumping
pages. The information about pages becomes more and more outdated,
while we are processing pages. The new interface solves both these
downsides. First, it allows us to read pte bits and clear the
soft-dirty bit atomically. It means that CRIU will not need to freeze
processes to pre-dump their memory. Second, it clears soft-dirty bits
for a specified region of memory. It means CRIU will have actual info
about pages to the moment of dumping them.
* The new interface has to be much faster because basic page filtering
is happening in the kernel. With the old interface, we have to read
pagemap for each page.
*Implementation Evolution (Short Summary)*
From the definition of GetWriteWatch(), we feel like kernel's soft-dirty
feature can be used under the hood with some additions like:
* reset soft-dirty flag for only a specific region of memory instead of
clearing the flag for the entire process
* get and clear soft-dirty flag for a specific region atomically
So we decided to use ioctl on pagemap file to read or/and reset soft-dirty
flag. But using soft-dirty flag, sometimes we get extra pages which weren't
even written. They had become soft-dirty because of VMA merging and
VM_SOFTDIRTY flag. This breaks the definition of GetWriteWatch(). We were
able to by-pass this short coming by ignoring VM_SOFTDIRTY until David
reported that mprotect etc messes up the soft-dirty flag while ignoring
VM_SOFTDIRTY [5]. This wasn't happening until [6] got introduced. We
discussed if we can revert these patches. But we could not reach to any
conclusion. So at this point, I made couple of tries to solve this whole
VM_SOFTDIRTY issue by correcting the soft-dirty implementation:
* [7] Correct the bug fixed wrongly back in 2014. It had potential to cause
regression. We left it behind.
* [8] Keep a list of soft-dirty part of a VMA across splits and merges. I
got the reply don't increase the size of the VMA by 8 bytes.
At this point, we left soft-dirty considering it is too much delicate and
userfaultfd [9] seemed like the only way forward. From there onward, we
have been basing soft-dirty emulation on userfaultfd wp feature where
kernel resolves the faults itself when WP_ASYNC feature is used. It was
straight forward to add WP_ASYNC feature in userfautlfd. Now we get only
those pages dirty or written-to which are really written in reality. (PS
There is another WP_UNPOPULATED userfautfd feature is required which is
needed to avoid pre-faulting memory before write-protecting [9].)
All the different masks were added on the request of CRIU devs to create
interface more generic and better.
[1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-…
[2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com
[3] https://github.com/google/sanitizers
[4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
[5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com
[6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/
[7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com
[10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com
* Original Cover letter from v8*
Hello,
Note:
Soft-dirty pages and pages which have been written-to are synonyms. As
kernel already has soft-dirty feature inside which we have given up to
use, we are using written-to terminology while using UFFD async WP under
the hood.
It is possible to find and clear soft-dirty pages entirely in userspace.
But it isn't efficient:
- The mprotect and SIGSEGV handler for bookkeeping
- The userfaultfd wp (synchronous) with the handler for bookkeeping
Some benchmarks can be seen here[1]. This series adds features that weren't
present earlier:
- There is no atomic get soft-dirty/Written-to status and clear present in
the kernel.
- The pages which have been written-to can not be found in accurate way.
(Kernel's soft-dirty PTE bit + sof_dirty VMA bit shows more soft-dirty
pages than there actually are.)
Historically, soft-dirty PTE bit tracking has been used in the CRIU
project. The procfs interface is enough for finding the soft-dirty bit
status and clearing the soft-dirty bit of all the pages of a process.
We have the use case where we need to track the soft-dirty PTE bit for
only specific pages on-demand. We need this tracking and clear mechanism
of a region of memory while the process is running to emulate the
getWriteWatch() syscall of Windows.
*(Moved to using UFFD instead of soft-dirty feature to find pages which
have been written-to from v7 patch series)*:
Stop using the soft-dirty flags for finding which pages have been
written to. It is too delicate and wrong as it shows more soft-dirty
pages than the actual soft-dirty pages. There is no interest in
correcting it [2][3] as this is how the feature was written years ago.
It shouldn't be updated to changed behaviour. Peter Xu has suggested
using the async version of the UFFD WP [4] as it is based inherently
on the PTEs.
So in this patch series, I've added a new mode to the UFFD which is
asynchronous version of the write protect. When this variant of the
UFFD WP is used, the page faults are resolved automatically by the
kernel. The pages which have been written-to can be found by reading
pagemap file (!PM_UFFD_WP). This feature can be used successfully to
find which pages have been written to from the time the pages were
write protected. This works just like the soft-dirty flag without
showing any extra pages which aren't soft-dirty in reality.
The information related to pages if the page is file mapped, present and
swapped is required for the CRIU project [5][6]. The addition of the
required mask, any mask, excluded mask and return masks are also required
for the CRIU project [5].
The IOCTL returns the addresses of the pages which match the specific
masks. The page addresses are returned in struct page_region in a compact
form. The max_pages is needed to support a use case where user only wants
to get a specific number of pages. So there is no need to find all the
pages of interest in the range when max_pages is specified. The IOCTL
returns when the maximum number of the pages are found. The max_pages is
optional. If max_pages is specified, it must be equal or greater than the
vec_size. This restriction is needed to handle worse case when one
page_region only contains info of one page and it cannot be compacted.
This is needed to emulate the Windows getWriteWatch() syscall.
The patch series include the detailed selftest which can be used as an
example for the uffd async wp test and PAGEMAP_IOCTL. It shows the
interface usages as well.
[1] https://lore.kernel.org/lkml/54d4c322-cd6e-eefd-b161-2af2b56aae24@collabora…
[2] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.…
[3] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.…
[4] https://lore.kernel.org/all/Y6Hc2d+7eTKs7AiH@x1n
[5] https://lore.kernel.org/all/YyiDg79flhWoMDZB@gmail.com/
[6] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com/
Regards,
Muhammad Usama Anjum
Muhammad Usama Anjum (5):
fs/proc/task_mmu: Implement IOCTL to get and optionally clear info
about PTEs
fs/proc/task_mmu: Add fast paths to get/clear PAGE_IS_WRITTEN flag
tools headers UAPI: Update linux/fs.h with the kernel sources
mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
selftests: mm: add pagemap ioctl tests
Peter Xu (1):
userfaultfd: UFFD_FEATURE_WP_ASYNC
Documentation/admin-guide/mm/pagemap.rst | 89 +
Documentation/admin-guide/mm/userfaultfd.rst | 35 +
fs/proc/task_mmu.c | 722 ++++++++
fs/userfaultfd.c | 26 +-
include/linux/hugetlb.h | 1 +
include/linux/userfaultfd_k.h | 28 +-
include/uapi/linux/fs.h | 59 +
include/uapi/linux/userfaultfd.h | 9 +-
mm/hugetlb.c | 34 +-
mm/memory.c | 28 +-
tools/include/uapi/linux/fs.h | 59 +
tools/testing/selftests/mm/.gitignore | 2 +
tools/testing/selftests/mm/Makefile | 3 +-
tools/testing/selftests/mm/config | 1 +
tools/testing/selftests/mm/pagemap_ioctl.c | 1660 ++++++++++++++++++
tools/testing/selftests/mm/run_vmtests.sh | 4 +
16 files changed, 2736 insertions(+), 24 deletions(-)
create mode 100644 tools/testing/selftests/mm/pagemap_ioctl.c
--
2.40.1
Immediate is incorrectly cast to u32 before being spilled, losing sign
information. The range information is incorrect after load again. Fix
immediate spill by remove the cast. The second patch add a test case
for this.
Signed-off-by: Hao Sun <sunhao.th(a)gmail.com>
---
Changes in v3:
- Change the expected log to fix the test case
- Link to v2: https://lore.kernel.org/r/20231101-fix-check-stack-write-v2-0-cb7c17b869b0@…
Changes in v2:
- Add fix and cc tags.
- Link to v1: https://lore.kernel.org/r/20231026-fix-check-stack-write-v1-0-6b325ef3ce7e@…
---
Hao Sun (2):
bpf: Fix check_stack_write_fixed_off() to correctly spill imm
selftests/bpf: Add test for immediate spilled to stack
kernel/bpf/verifier.c | 2 +-
tools/testing/selftests/bpf/verifier/bpf_st_mem.c | 32 +++++++++++++++++++++++
2 files changed, 33 insertions(+), 1 deletion(-)
---
base-commit: f2fbb908112311423b09cd0d2b4978f174b99585
change-id: 20231026-fix-check-stack-write-c40996694dfa
Best regards,
--
Hao Sun <sunhao.th(a)gmail.com>
Hi Linus,
Please pull the following KUnit next update for Linux 6.7-rc1.
This kunit update for Linux 6.7-rc1 consists of:
-- string-stream testing enhancements
-- several fixes memory leaks
-- fix to reset status during parameter handling
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit ce9ecca0238b140b88f43859b211c9fdfd8e5b70:
Linux 6.6-rc2 (2023-09-17 14:40:24 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-kunit-6.7-rc1
for you to fetch changes up to 8040345fdae4cb256c5d981f91ae0f22bea8adcc:
kunit: test: Fix the possible memory leak in executor_test (2023-09-28 08:51:07 -0600)
----------------------------------------------------------------
linux_kselftest-kunit-6.7-rc1
This kunit update for Linux 6.7-rc1 consists of:
-- string-stream testing enhancements
-- several fixes memory leaks
-- fix to reset status during parameter handling
----------------------------------------------------------------
Jinjie Ruan (4):
kunit: Fix missed memory release in kunit_free_suite_set()
kunit: Fix the wrong kfree of copy for kunit_filter_suites()
kunit: Fix possible memory leak in kunit_filter_suites()
kunit: test: Fix the possible memory leak in executor_test
Michal Wajdeczko (1):
kunit: Reset test status on each param iteration
Richard Fitzgerald (10):
kunit: string-stream: Don't create a fragment for empty strings
kunit: string-stream: Improve testing of string_stream
kunit: string-stream: Add option to make all lines end with newline
kunit: string-stream-test: Add cases for string_stream newline appending
kunit: Don't use a managed alloc in is_literal()
kunit: string-stream: Add kunit_alloc_string_stream()
kunit: string-stream: Decouple string_stream from kunit
kunit: string-stream: Add tests for freeing resource-managed string_stream
kunit: Use string_stream for test log
kunit: string-stream: Test performance of string_stream
include/kunit/test.h | 14 +-
lib/kunit/assert.c | 14 +-
lib/kunit/debugfs.c | 36 ++-
lib/kunit/executor.c | 23 +-
lib/kunit/executor_test.c | 36 +--
lib/kunit/kunit-example-test.c | 5 +-
lib/kunit/kunit-test.c | 56 ++++-
lib/kunit/string-stream-test.c | 525 +++++++++++++++++++++++++++++++++++++++--
lib/kunit/string-stream.c | 100 ++++++--
lib/kunit/string-stream.h | 16 +-
lib/kunit/test.c | 56 +----
11 files changed, 734 insertions(+), 147 deletions(-)
----------------------------------------------------------------
This patchset adds two kfunc helpers, bpf_xdp_get_xfrm_state() and
bpf_xdp_xfrm_state_release() that wrap xfrm_state_lookup() and
xfrm_state_put(). The intent is to support software RSS (via XDP) for
the ongoing/upcoming ipsec pcpu work [0]. Recent experiments performed
on (hopefully) reproducible AWS testbeds indicate that single tunnel
pcpu ipsec can reach line rate on 100G ENA nics.
Note this patchset only tests/shows generic xfrm_state access. The
"secret sauce" (if you can really even call it that) involves accessing
a soon-to-be-upstreamed pcpu_num field in xfrm_state. Early example is
available here [1].
[0]: https://datatracker.ietf.org/doc/html/draft-ietf-ipsecme-multi-sa-performan…
[1]: https://github.com/danobi/xdp-tools/blob/e89a1c617aba3b50d990f779357d6ce286…
Changes from RFCv1:
* Add Antony's commit tags
* Add KF_ACQUIRE and KF_RELEASE semantics
Daniel Xu (7):
bpf: xfrm: Add bpf_xdp_get_xfrm_state() kfunc
bpf: xfrm: Add bpf_xdp_xfrm_state_release() kfunc
bpf: selftests: test_tunnel: Use ping -6 over ping6
bpf: selftests: test_tunnel: Mount bpffs if necessary
bpf: selftests: test_tunnel: Use vmlinux.h declarations
bpf: selftests: test_tunnel: Disable CO-RE relocations
bpf: xfrm: Add selftest for bpf_xdp_get_xfrm_state()
include/net/xfrm.h | 9 ++
net/xfrm/Makefile | 1 +
net/xfrm/xfrm_policy.c | 2 +
net/xfrm/xfrm_state_bpf.c | 121 ++++++++++++++++++
.../selftests/bpf/progs/bpf_tracing_net.h | 1 +
.../selftests/bpf/progs/test_tunnel_kern.c | 98 ++++++++------
tools/testing/selftests/bpf/test_tunnel.sh | 43 +++++--
7 files changed, 221 insertions(+), 54 deletions(-)
create mode 100644 net/xfrm/xfrm_state_bpf.c
--
2.42.0
Immediate is incorrectly cast to u32 before being spilled, losing sign
information. The range information is incorrect after load again. Fix
immediate spill by remove the cast. The second patch add a test case
for this.
Signed-off-by: Hao Sun <sunhao.th(a)gmail.com>
---
Changes in v2:
- Add fix and cc tags.
- Link to v1: https://lore.kernel.org/r/20231026-fix-check-stack-write-v1-0-6b325ef3ce7e@…
---
Hao Sun (2):
bpf: Fix check_stack_write_fixed_off() to correctly spill imm
selftests/bpf: Add test for immediate spilled to stack
kernel/bpf/verifier.c | 2 +-
tools/testing/selftests/bpf/verifier/bpf_st_mem.c | 32 +++++++++++++++++++++++
2 files changed, 33 insertions(+), 1 deletion(-)
---
base-commit: f1c73396133cb3d913e2075298005644ee8dfade
change-id: 20231026-fix-check-stack-write-c40996694dfa
Best regards,
--
Hao Sun <sunhao.th(a)gmail.com>