Changes since v6:
* Remove inlines from arm_pmuv3.c
* Use format attribute mechanism from SPE
* Re-arrange attributes so that threshold comes last and can
potentially be extended
* Emit an error if the max threshold is exceeded rather than clamping
* Convert all register fields to GENMASK
Changes since v5:
* Restructure the docs and add some more explanations
* PMMIR.WIDTH -> PMMIR.THWIDTH in one comment
* Don't write EVTYPER.TC if TH is 0. Doesn't have any functional
effect but it might be a bit easier to understand the code.
* Expand the format field #define names
Changes since v4:
* Rebase onto v6.7-rc1, it no longer depends on kvmarm/next
* Remove change that moved ARMV8_PMU_EVTYPE_MASK to the asm files.
This actually depended on those files being included in a certain
order with arm_pmuv3.h to avoid circular includes. Now the
definition is done programmatically in arm_pmuv3.c instead.
Changes since v3:
* Drop #include changes to KVM source files because since
commit bc512d6a9b92 ("KVM: arm64: Make PMEVTYPER<n>_EL0.NSH RES0 if
EL2 isn't advertised"), KVM doesn't use ARMV8_PMU_EVTYPE_MASK
anymore
Changes since v2:
* Split threshold_control attribute into two, threshold_compare and
threshold_count so that it's easier to use
* Add some notes to the first commit message and the cover letter
about the behavior in KVM
* Update the docs commit with regards to the split attribute
Changes since v1:
* Fix build on aarch32 by disabling FEAT_PMUv3_TH and splitting event
type mask between the platforms
* Change armv8pmu_write_evtype() to take unsigned long instead of u64
so it isn't unnecessarily wide on aarch32
* Add UL suffix to aarch64 event type mask definition
----
FEAT_PMUv3_TH (Armv8.8) is a new feature that allows conditional
counting of PMU events depending on how much the event increments on
a single cycle. Two new config fields for perf_event_open have been
added, and a PMU cap file for reading the max_threshold. See the second
commit message and the docs in the last commit for more details.
The feature is not currently supported on KVM guests, and PMMIR is set
to read as zero, so it's not advertised as available. But it can be
added at a later time. Writes to PMEVTYPER.TC and TH from guests are
already RES0.
The change has been validated on the Arm FVP model:
# Zero values, works as expected (as before).
$ perf stat -e dtlb_walk/threshold=0,threshold_compare=0/ -- true
5962 dtlb_walk/threshold=0,threshold_compare=0/
# Threshold >= 255 causes count to be 0 because dtlb_walk doesn't
# increase by more than 1 per cycle.
$ perf stat -e dtlb_walk/threshold=255,threshold_compare=2/ -- true
0 dtlb_walk/threshold=255,threshold_compare=2/
# Keeping comparison as >= but lowering the threshold to 1 makes the
# count return.
$ perf stat -e dtlb_walk/threshold=1,threshold_compare=2/ -- true
6329 dtlb_walk/threshold=1,threshold_compare=2/
James Clark (11):
arm: perf: Remove inlines from arm_pmuv3.c
arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_N
arm: perf: Use GENMASK for PMMIR fields
arm: perf: Convert remaining fields to use GENMASK
arm64: perf: Include threshold control fields in PMEVTYPER mask
arm: pmu: Share user ABI format mechanism with SPE
perf/arm_dmc620: Remove duplicate format attribute #defines
KVM: selftests: aarch64: Update tools copy of arm_pmuv3.h
arm: pmu: Move error message and -EOPNOTSUPP to individual PMUs
arm64: perf: Add support for event counting threshold
Documentation: arm64: Document the PMU event counting threshold
feature
Documentation/arch/arm64/perf.rst | 72 +++++++
arch/arm/kernel/perf_event_v7.c | 6 +-
arch/arm64/kvm/pmu-emul.c | 8 +-
arch/arm64/kvm/sys_regs.c | 4 +-
drivers/perf/apple_m1_cpu_pmu.c | 6 +-
drivers/perf/arm_dmc620_pmu.c | 22 +--
drivers/perf/arm_pmu.c | 11 +-
drivers/perf/arm_pmuv3.c | 175 ++++++++++++++----
drivers/perf/arm_spe_pmu.c | 22 ---
include/linux/perf/arm_pmu.h | 22 +++
include/linux/perf/arm_pmuv3.h | 34 ++--
tools/include/perf/arm_pmuv3.h | 43 +++--
.../kvm/aarch64/vpmu_counter_access.c | 5 +-
13 files changed, 296 insertions(+), 134 deletions(-)
--
2.34.1
Here is the 3rd part of converting net selftests to run in unique namespace.
This part converts all srv6 and fib tests.
Note that patch 06 is a fix for testing fib_nexthop_multiprefix.
Here is the part 1 link:
https://lore.kernel.org/netdev/20231202020110.362433-1-liuhangbin@gmail.com
And part 2 link:
https://lore.kernel.org/netdev/20231206070801.1691247-1-liuhangbin@gmail.com
v1 -> v2:
- indent the test result in case --- cut off the patch (Jakub Kicinski)
Hangbin Liu (13):
selftests/net: add variable NS_LIST for lib.sh
selftests/net: convert srv6_end_dt46_l3vpn_test.sh to run it in unique
namespace
selftests/net: convert srv6_end_dt4_l3vpn_test.sh to run it in unique
namespace
selftests/net: convert srv6_end_dt6_l3vpn_test.sh to run it in unique
namespace
selftests/net: convert fcnal-test.sh to run it in unique namespace
selftests/net: fix grep checking for fib_nexthop_multiprefix
selftests/net: convert fib_nexthop_multiprefix to run it in unique
namespace
selftests/net: convert fib_nexthop_nongw.sh to run it in unique
namespace
selftests/net: convert fib_nexthops.sh to run it in unique namespace
selftests/net: convert fib-onlink-tests.sh to run it in unique
namespace
selftests/net: convert fib_rule_tests.sh to run it in unique namespace
selftests/net: convert fib_tests.sh to run it in unique namespace
selftests/net: convert fdb_flush.sh to run it in unique namespace
tools/testing/selftests/net/fcnal-test.sh | 30 ++-
tools/testing/selftests/net/fdb_flush.sh | 11 +-
.../testing/selftests/net/fib-onlink-tests.sh | 9 +-
.../selftests/net/fib_nexthop_multiprefix.sh | 98 +++++-----
.../selftests/net/fib_nexthop_nongw.sh | 34 ++--
tools/testing/selftests/net/fib_nexthops.sh | 142 +++++++-------
tools/testing/selftests/net/fib_rule_tests.sh | 36 ++--
tools/testing/selftests/net/fib_tests.sh | 184 +++++++++---------
tools/testing/selftests/net/lib.sh | 8 +
tools/testing/selftests/net/settings | 2 +-
.../selftests/net/srv6_end_dt46_l3vpn_test.sh | 51 +++--
.../selftests/net/srv6_end_dt4_l3vpn_test.sh | 48 ++---
.../selftests/net/srv6_end_dt6_l3vpn_test.sh | 46 ++---
13 files changed, 332 insertions(+), 367 deletions(-)
--
2.43.0
This patchset adds two kfunc helpers, bpf_xdp_get_xfrm_state() and
bpf_xdp_xfrm_state_release() that wrap xfrm_state_lookup() and
xfrm_state_put(). The intent is to support software RSS (via XDP) for
the ongoing/upcoming ipsec pcpu work [0]. Recent experiments performed
on (hopefully) reproducible AWS testbeds indicate that single tunnel
pcpu ipsec can reach line rate on 100G ENA nics.
Note this patchset only tests/shows generic xfrm_state access. The
"secret sauce" (if you can really even call it that) involves accessing
a soon-to-be-upstreamed pcpu_num field in xfrm_state. Early example is
available here [1].
[0]: https://datatracker.ietf.org/doc/draft-ietf-ipsecme-multi-sa-performance/03/
[1]: https://github.com/danobi/xdp-tools/blob/e89a1c617aba3b50d990f779357d6ce286…
Changes from v5:
* Improve kfunc doc comments
* Remove extraneous replay-window setting on selftest reverse path
* Squash two kfunc commits into one
* Rebase to bpf-next to pick up bitfield write patches
* Remove testing of opts.error in selftest prog
Changes from v4:
* Fixup commit message for selftest
* Set opts->error -ENOENT for !x
* Revert single file xfrm + bpf
Changes from v3:
* Place all xfrm bpf integrations in xfrm_bpf.c
* Avoid using nval as a temporary
* Rebase to bpf-next
* Remove extraneous __failure_unpriv annotation for verifier tests
Changes from v2:
* Fix/simplify BPF_CORE_WRITE_BITFIELD() algorithm
* Added verifier tests for bitfield writes
* Fix state leakage across test_tunnel subtests
Changes from v1:
* Move xfrm tunnel tests to test_progs
* Fix writing to opts->error when opts is invalid
* Use __bpf_kfunc_start_defs()
* Remove unused vxlanhdr definition
* Add and use BPF_CORE_WRITE_BITFIELD() macro
* Make series bisect clean
Changes from RFCv2:
* Rebased to ipsec-next
* Fix netns leak
Changes from RFCv1:
* Add Antony's commit tags
* Add KF_ACQUIRE and KF_RELEASE semantics
Daniel Xu (5):
bpf: xfrm: Add bpf_xdp_get_xfrm_state() kfunc
bpf: selftests: test_tunnel: Setup fresh topology for each subtest
bpf: selftests: test_tunnel: Use vmlinux.h declarations
bpf: selftests: Move xfrm tunnel test to test_progs
bpf: xfrm: Add selftest for bpf_xdp_get_xfrm_state()
include/net/xfrm.h | 9 +
net/xfrm/Makefile | 1 +
net/xfrm/xfrm_policy.c | 2 +
net/xfrm/xfrm_state_bpf.c | 134 +++++++++++++++
.../selftests/bpf/prog_tests/test_tunnel.c | 162 +++++++++++++++++-
.../selftests/bpf/progs/bpf_tracing_net.h | 1 +
.../selftests/bpf/progs/test_tunnel_kern.c | 138 ++++++++-------
tools/testing/selftests/bpf/test_tunnel.sh | 92 ----------
8 files changed, 384 insertions(+), 155 deletions(-)
create mode 100644 net/xfrm/xfrm_state_bpf.c
--
2.42.1
This patchset adds two kfunc helpers, bpf_xdp_get_xfrm_state() and
bpf_xdp_xfrm_state_release() that wrap xfrm_state_lookup() and
xfrm_state_put(). The intent is to support software RSS (via XDP) for
the ongoing/upcoming ipsec pcpu work [0]. Recent experiments performed
on (hopefully) reproducible AWS testbeds indicate that single tunnel
pcpu ipsec can reach line rate on 100G ENA nics.
Note this patchset only tests/shows generic xfrm_state access. The
"secret sauce" (if you can really even call it that) involves accessing
a soon-to-be-upstreamed pcpu_num field in xfrm_state. Early example is
available here [1].
[0]: https://datatracker.ietf.org/doc/draft-ietf-ipsecme-multi-sa-performance/03/
[1]: https://github.com/danobi/xdp-tools/blob/e89a1c617aba3b50d990f779357d6ce286…
Changes from v4:
* Fixup commit message for selftest
* Set opts->error -ENOENT for !x
* Revert single file xfrm + bpf
Changes from v3:
* Place all xfrm bpf integrations in xfrm_bpf.c
* Avoid using nval as a temporary
* Rebase to bpf-next
* Remove extraneous __failure_unpriv annotation for verifier tests
Changes from v2:
* Fix/simplify BPF_CORE_WRITE_BITFIELD() algorithm
* Added verifier tests for bitfield writes
* Fix state leakage across test_tunnel subtests
Changes from v1:
* Move xfrm tunnel tests to test_progs
* Fix writing to opts->error when opts is invalid
* Use __bpf_kfunc_start_defs()
* Remove unused vxlanhdr definition
* Add and use BPF_CORE_WRITE_BITFIELD() macro
* Make series bisect clean
Changes from RFCv2:
* Rebased to ipsec-next
* Fix netns leak
Changes from RFCv1:
* Add Antony's commit tags
* Add KF_ACQUIRE and KF_RELEASE semantics
Daniel Xu (9):
bpf: xfrm: Add bpf_xdp_get_xfrm_state() kfunc
bpf: xfrm: Add bpf_xdp_xfrm_state_release() kfunc
libbpf: Add BPF_CORE_WRITE_BITFIELD() macro
bpf: selftests: test_loader: Support __btf_path() annotation
bpf: selftests: Add verifier tests for CO-RE bitfield writes
bpf: selftests: test_tunnel: Setup fresh topology for each subtest
bpf: selftests: test_tunnel: Use vmlinux.h declarations
bpf: selftests: Move xfrm tunnel test to test_progs
bpf: xfrm: Add selftest for bpf_xdp_get_xfrm_state()
include/net/xfrm.h | 9 +
net/xfrm/Makefile | 1 +
net/xfrm/xfrm_policy.c | 2 +
net/xfrm/xfrm_state_bpf.c | 130 ++++++++++++++
tools/lib/bpf/bpf_core_read.h | 32 ++++
.../selftests/bpf/prog_tests/test_tunnel.c | 162 +++++++++++++++++-
.../selftests/bpf/prog_tests/verifier.c | 2 +
tools/testing/selftests/bpf/progs/bpf_misc.h | 1 +
.../selftests/bpf/progs/bpf_tracing_net.h | 1 +
.../selftests/bpf/progs/test_tunnel_kern.c | 138 ++++++++-------
.../bpf/progs/verifier_bitfield_write.c | 100 +++++++++++
tools/testing/selftests/bpf/test_loader.c | 7 +
tools/testing/selftests/bpf/test_tunnel.sh | 92 ----------
13 files changed, 522 insertions(+), 155 deletions(-)
create mode 100644 net/xfrm/xfrm_state_bpf.c
create mode 100644 tools/testing/selftests/bpf/progs/verifier_bitfield_write.c
--
2.42.1
The test is inspired by the pmu_event_filter_test which implemented by x86. On
the arm64 platform, there is the same ability to set the pmu_event_filter
through the KVM_ARM_VCPU_PMU_V3_FILTER attribute. So add the test for arm64.
The series first move some pmu common code from vpmu_counter_access to
lib/aarch64/vpmu.c and include/aarch64/vpmu.h, which can be used by
pmu_event_filter_test. Then fix a bug related to the [enable|disable]_counter,
and at last, implement the test itself.
Changelog:
----------
v1->v2:
- Improve the commit message. [Eric]
- Fix the bug in [enable|disable]_counter. [Raghavendra & Marc]
- Add the check if kvm has attr KVM_ARM_VCPU_PMU_V3_FILTER.
- Add if host pmu support the test event throught pmceid0.
- Split the test_invalid_filter() to another patch. [Eric]
v1: https://lore.kernel.org/all/20231123063750.2176250-1-shahuang@redhat.com/
Shaoqin Huang (5):
KVM: selftests: aarch64: Make the [create|destroy]_vpmu_vm() public
KVM: selftests: aarch64: Move pmu helper functions into vpmu.h
KVM: selftests: aarch64: Fix the buggy [enable|disable]_counter
KVM: selftests: aarch64: Introduce pmu_event_filter_test
KVM: selftests: aarch64: Add invalid filter test in
pmu_event_filter_test
tools/testing/selftests/kvm/Makefile | 2 +
.../kvm/aarch64/pmu_event_filter_test.c | 267 ++++++++++++++++++
.../kvm/aarch64/vpmu_counter_access.c | 218 ++------------
.../selftests/kvm/include/aarch64/vpmu.h | 135 +++++++++
.../testing/selftests/kvm/lib/aarch64/vpmu.c | 74 +++++
5 files changed, 502 insertions(+), 194 deletions(-)
create mode 100644 tools/testing/selftests/kvm/aarch64/pmu_event_filter_test.c
create mode 100644 tools/testing/selftests/kvm/include/aarch64/vpmu.h
create mode 100644 tools/testing/selftests/kvm/lib/aarch64/vpmu.c
--
2.40.1
Hi all,
Here's v3 series to improve resctrl selftests with generalized test
framework and rewritten CAT test. As agreed, v3 does not include the
group naming patch which will become part of Maciej's non-contiguous
serie. The error handling cleanups (return errno, perror() & return
value comment cleanups) and CPU affinity restore for CAT test add to
the patch count.
The series contains following improvements:
- Excludes shareable bits from CAT test allocation to avoid interference
- Replaces file "sink" with a volatile variable
- Alters read pattern to defeat HW prefetcher optimizations
- Rewrites CAT test to make the CAT test reliable and truly measure
if CAT is working or not
- Introduces generalized test framework making easier to add new tests
- Lots of other cleanups & refactoring
This serie have been tested across a large number of systems from
different generations.
v3:
- New patches to handle return errno, perror() and return value comments
- Tweak changelogs
- Moved error printout removal to other patch
- Zero bit CBM returns error
- Tweak comments
- Make get_shareable_mask() static
- Return directly without storing result into ret variable first
- llc -> LLC
- Altered changelog and removed "the whole time" wording because
llc occu results are still unsigned long
- Altered changelog's wording to not say "a volatile pointer"
- Make min_diff_percent and MIN_DIFF_PERCENT_PER_BIT unsigned long
- Add patch to restore CPU affinity after CAT test
- Move uparams clear into init function
- Add CPU vendor ID bitmask comment
- Use test_resource_feature_check(test) in CMT
- "feature" -> "resource" in function comment
v2:
- Postpone adding L2 CAT test as more investigations are necessary
- Add patch to remove ctrlc_handler() from wrong place
- Improvements to changelogs
- Function comments improvements & comment cleanups
- Move some parts of the changes into more logical patch
- If checks: buf == NULL -> !buf
- Variable naming:
- p -> buf
- cbm_mask_path -> cbm_path
- Function naming:
- get_cbm_mask() -> get_full_cbm()
- cache_size() -> cache_portion_size()
- Use PATH_MAX
- Improved cache_portion_size() parameter names
- int count -> unsigned int
- Pass filename to measurement taking functions instead of
resctrl_val_param
- !lines ? : reversal
- Removed bogus static from function local variable
- Open perf fd only once, reset & enable in the innermost test loop
- Add perf fd ioctl() error handling
- Add patch to change compiler optimization prevention "sink" from file
to volatile variable
- Remove cpu_no and resource (the latter was added in v1) members from
resctrl_val_param (pass uparams and test where those are needed)
- Removed ARRAY_SIZE() macro
- Add patch to rename "resource_id" to "domain_id"
Ilpo Järvinen (29):
selftests/resctrl: Convert perror() to ksft_perror() or
ksft_print_msg()
selftests/resctrl: Return -1 instead of errno on error
selftests/resctrl: Don't use ctrlc_handler() outside signal handling
selftests/resctrl: Change function comments to say < 0 on error
selftests/resctrl: Split fill_buf to allow tests finer-grained control
selftests/resctrl: Refactor fill_buf functions
selftests/resctrl: Refactor get_cbm_mask() and rename to
get_full_cbm()
selftests/resctrl: Mark get_cache_size() cache_type const
selftests/resctrl: Create cache_portion_size() helper
selftests/resctrl: Exclude shareable bits from schemata in CAT test
selftests/resctrl: Split measure_cache_vals()
selftests/resctrl: Split show_cache_info() to test specific and
generic parts
selftests/resctrl: Remove unnecessary __u64 -> unsigned long
conversion
selftests/resctrl: Remove nested calls in perf event handling
selftests/resctrl: Consolidate naming of perf event related things
selftests/resctrl: Improve perf init
selftests/resctrl: Convert perf related globals to locals
selftests/resctrl: Move cat_val() to cat_test.c and rename to
cat_test()
selftests/resctrl: Open perf fd before start & add error handling
selftests/resctrl: Replace file write with volatile variable
selftests/resctrl: Read in less obvious order to defeat prefetch
optimizations
selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test
selftests/resctrl: Restore the CPU affinity after CAT test
selftests/resctrl: Create struct for input parameters
selftests/resctrl: Introduce generalized test framework
selftests/resctrl: Pass write_schemata() resource instead of test name
selftests/resctrl: Add helper to convert L2/3 to integer
selftests/resctrl: Rename resource ID to domain ID
selftests/resctrl: Get domain id from cache id
tools/testing/selftests/resctrl/cache.c | 287 +++++----------
tools/testing/selftests/resctrl/cat_test.c | 337 +++++++++++-------
tools/testing/selftests/resctrl/cmt_test.c | 80 +++--
tools/testing/selftests/resctrl/fill_buf.c | 132 ++++---
tools/testing/selftests/resctrl/mba_test.c | 30 +-
tools/testing/selftests/resctrl/mbm_test.c | 32 +-
tools/testing/selftests/resctrl/resctrl.h | 126 +++++--
.../testing/selftests/resctrl/resctrl_tests.c | 197 ++++------
tools/testing/selftests/resctrl/resctrl_val.c | 138 +++----
tools/testing/selftests/resctrl/resctrlfs.c | 321 +++++++++++------
10 files changed, 936 insertions(+), 744 deletions(-)
--
2.30.2
KUnit tests often need to provide a struct device, and thus far have
mostly been using root_device_register() or platform devices to create
a 'fake device' for use with, e.g., code which uses device-managed
resources. This has several disadvantages, including not being designed
for test use, scattering files in sysfs, and requiring manual teardown
on test exit, which may not always be possible in case of failure.
Instead, introduce a set of helper functions which allow devices
(internally a struct kunit_device) to be created and managed by KUnit --
i.e., they will be automatically unregistered on test exit. These
helpers can either use a user-provided struct device_driver, or have one
automatically created and managed by KUnit. In both cases, the device
lives on a new kunit_bus.
This is a follow-up to a previous proposal here:
https://lore.kernel.org/linux-kselftest/20230325043104.3761770-1-davidgow@g…
(The kunit_defer() function in the first patch there has since been
merged as the 'deferred actions' feature.)
My intention is to take this whole series in via the kselftest/kunit
branch, but I'm equally okay with splitting up the later patches which
use this to go via the various subsystem trees in case there are merge
conflicts.
Cheers,
-- David
Signed-off-by: David Gow <davidgow(a)google.com>
---
Changes in v3:
- Port the DRM tests to these new helpers (Thanks, Maxime!)
- Include the lib/kunit/device-impl.h file, which was missing from the
previous revision.
- Fix a use-after-free bug in kunit_device_driver_test, which resulted
in memory corruption on some clang-built UML builds.
- The 'test_state' is now allocated with kunit_kzalloc(), not on the
stack, as the stack will be gone when cleanup occurs.
- Link to v2: https://lore.kernel.org/r/20231208-kunit_bus-v2-0-e95905d9b325@google.com
Changes in v2:
- Simplify device/driver/bus matching, removing the no-longer-required
kunit_bus_match function. (Thanks, Greg)
- The return values are both more consistent (kunit_device_register now
returns an explicit error pointer, rather than failing the test), and
better documented.
- Add some basic documentation to the implementations as well as the
headers. The documentation in the headers is still more complete, and
is now properly compiled into the HTML documentation (under
dev-tools/kunit/api/resources.html). (Thanks, Matti)
- Moved the internal-only kunit_bus_init() function to a private header,
lib/kunit/device-impl.h to avoid polluting the public headers, and
match other internal-only headers. (Thanks, Greg)
- Alphabetise KUnit includes in other test modules. (Thanks, Amadeusz.)
- Several code cleanups, particularly around error handling and
allocation. (Thanks Greg, Maxime)
- Several const-correctness and casting improvements. (Thanks, Greg)
- Added a new test to verify KUnit cleanup triggers device cleanup.
(Thanks, Maxime).
- Improved the user-specified device test to verify that probe/remove
hooks are called correctly. (Thanks, Maxime).
- The overflow test no-longer needlessly calls
kunit_device_unregister().
- Several other minor cleanups and documentation improvements, which
hopefully make this a bit clearer and more robust.
- Link to v1: https://lore.kernel.org/r/20231205-kunit_bus-v1-0-635036d3bc13@google.com
---
David Gow (4):
kunit: Add APIs for managing devices
fortify: test: Use kunit_device
overflow: Replace fake root_device with kunit_device
ASoC: topology: Replace fake root_device with kunit_device in tests
Maxime Ripard (1):
drm/tests: Switch to kunit devices
Documentation/dev-tools/kunit/api/resource.rst | 9 ++
Documentation/dev-tools/kunit/usage.rst | 50 +++++++
drivers/gpu/drm/tests/drm_kunit_helpers.c | 66 +--------
include/kunit/device.h | 80 +++++++++++
lib/fortify_kunit.c | 5 +-
lib/kunit/Makefile | 3 +-
lib/kunit/device-impl.h | 17 +++
lib/kunit/device.c | 181 +++++++++++++++++++++++++
lib/kunit/kunit-test.c | 134 +++++++++++++++++-
lib/kunit/test.c | 3 +
lib/overflow_kunit.c | 5 +-
sound/soc/soc-topology-test.c | 10 +-
12 files changed, 485 insertions(+), 78 deletions(-)
---
base-commit: b285ba6f8cc1b2bfece0b4350fdb92c8780bc698
change-id: 20230718-kunit_bus-ab19c4ef48dc
Best regards,
--
David Gow <davidgow(a)google.com>
From: Paul Durrant <pdurrant(a)amazon.com>
There are four new patches in the series over what was in version 9 [1]:
* KVM: xen: separate initialization of shared_info cache and content
* KVM: xen: (re-)initialize shared_info if guest (32/64-bit) mode is set
These deal with a missing re-initialization of shared_info if either the
guest or VMM changes the 'long_mode' flag. This was discovred in testing
when the guest wallclock reverted to the Unix epoch because the pvclock
information in the shared_info page was not in the correct place, and so
the guest read zeroes instead.
* KVM: xen: don't block on pfncache locks in kvm_xen_set_evtchn_fast()
* KVM: pfncache: check the need for invalidation under read lock first
The first of these fixes a bug discovered when compiling the kernel with
CONFIG_PROVE_RAW_LOCK_NESTING: kvm_xen_set_evtchn_fast() can be called from
the callback of a HRTIMER_MODE_ABS_HARD timer and hence be executed in
IRQ context. It should therefore not block on any lock. Thus two
occurrences of a read_lock() are converted to a read_trylock() which
kick the code down a slow-path if they fail.
The second patch removes a 'false' contention on the pfncache lock that
could result in taking that slow-path: the MMU notifier callback need only
take a pfncache read lock; it only need take a write lock if a match is
found.
Apart from these new patches...
* KVM: xen: split up kvm_xen_set_evtchn_fast()
... has been re-worked to (hopefully) improve readability and also validate
the 'correct' vcpu_info structure depending on whether the guest is in long
mode or not.
[1] https://lore.kernel.org/kvm/20231122121822.1042-1-paul@xen.org/
Paul Durrant (19):
KVM: pfncache: Add a map helper function
KVM: pfncache: remove unnecessary exports
KVM: xen: mark guest pages dirty with the pfncache lock held
KVM: pfncache: add a mark-dirty helper
KVM: pfncache: remove KVM_GUEST_USES_PFN usage
KVM: pfncache: stop open-coding offset_in_page()
KVM: pfncache: include page offset in uhva and use it consistently
KVM: pfncache: allow a cache to be activated with a fixed (userspace)
HVA
KVM: xen: separate initialization of shared_info cache and content
KVM: xen: (re-)initialize shared_info if guest (32/64-bit) mode is set
KVM: xen: allow shared_info to be mapped by fixed HVA
KVM: xen: allow vcpu_info to be mapped by fixed HVA
KVM: selftests / xen: map shared_info using HVA rather than GFN
KVM: selftests / xen: re-map vcpu_info using HVA rather than GPA
KVM: xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA capability
KVM: xen: split up kvm_xen_set_evtchn_fast()
KVM: xen: don't block on pfncache locks in kvm_xen_set_evtchn_fast()
KVM: pfncache: check the need for invalidation under read lock first
KVM: xen: allow vcpu_info content to be 'safely' copied
Documentation/virt/kvm/api.rst | 53 ++-
arch/x86/kvm/x86.c | 7 +-
arch/x86/kvm/xen.c | 358 +++++++++++-------
include/linux/kvm_host.h | 40 +-
include/linux/kvm_types.h | 8 -
include/uapi/linux/kvm.h | 9 +-
.../selftests/kvm/x86_64/xen_shinfo_test.c | 59 ++-
virt/kvm/pfncache.c | 185 ++++-----
8 files changed, 461 insertions(+), 258 deletions(-)
base-commit: 1ab097653e4dd8d23272d028a61352c23486fd4a
--
2.39.2
This patch series introduces UFFDIO_MOVE feature to userfaultfd, which
has long been implemented and maintained by Andrea in his local tree [1],
but was not upstreamed due to lack of use cases where this approach would
be better than allocating a new page and copying the contents. Previous
upstraming attempts could be found at [6] and [7].
UFFDIO_COPY performs ~20% better than UFFDIO_MOVE when the application
needs pages to be allocated [2]. However, with UFFDIO_MOVE, if pages are
available (in userspace) for recycling, as is usually the case in heap
compaction algorithms, then we can avoid the page allocation and memcpy
(done by UFFDIO_COPY). Also, since the pages are recycled in the
userspace, we avoid the need to release (via madvise) the pages back to
the kernel [3].
We see over 40% reduction (on a Google pixel 6 device) in the compacting
thread’s completion time by using UFFDIO_MOVE vs. UFFDIO_COPY. This was
measured using a benchmark that emulates a heap compaction implementation
using userfaultfd (to allow concurrent accesses by application threads).
More details of the usecase are explained in [3].
Furthermore, UFFDIO_MOVE enables moving swapped-out pages without
touching them within the same vma. Today, it can only be done by mremap,
however it forces splitting the vma.
TODOs for follow-up improvements:
- cross-mm support. Known differences from single-mm and missing pieces:
- memcg recharging (might need to isolate pages in the process)
- mm counters
- cross-mm deposit table moves
- cross-mm test
- document the address space where src and dest reside in struct
uffdio_move
- TLB flush batching. Will require extensive changes to PTL locking in
move_pages_pte(). OTOH that might let us reuse parts of mremap code.
Changes since v5 [10]:
- added logic to split large folios in move_pages_pte(),
per David Hildenbrand
- added check for PAE before split_huge_pmd() to avoid the split if the
move operation can't be done
- replaced calls to default_huge_page_size() with read_pmd_pagesize() in
uffd_move_pmd test, per David Hildenbrand
- fixed the condition in uffd_move_test_common() checking if area
alignment is needed
Changes since v4 [9]:
- added Acked-by in patch 1, per Peter Xu
- added description for ctx, mm and mode parameters of move_pages(),
per kernel test robot
- added Reviewed-by's, per Peter Xu and Axel Rasmussen
- removed unused operations in uffd_test_case_ops
- refactored uffd-unit-test changes to avoid using global variables and
handle pmd moves without page size overrides, per Peter Xu
Changes since v3 [8]:
- changed retry path in folio_lock_anon_vma_read() to unlock and then
relock RCU, per Peter Xu
- removed cross-mm support from initial patchset, per David Hildenbrand
- replaced BUG_ONs with VM_WARN_ON or WARN_ON_ONCE, per David Hildenbrand
- added missing cache flushing, per Lokesh Gidra and Peter Xu
- updated manpage text in the patch description, per Peter Xu
- renamed internal functions from "remap" to "move", per Peter Xu
- added mmap_changing check after taking mmap_lock, per Peter Xu
- changed uffd context check to ensure dst_mm is registered onto uffd we
are operating on, Peter Xu and David Hildenbrand
- changed to non-maybe variants of maybe*_mkwrite(), per David Hildenbrand
- fixed warning for CONFIG_TRANSPARENT_HUGEPAGE=n, per kernel test robot
- comments cleanup, per David Hildenbrand and Peter Xu
- checks for VM_IO,VM_PFNMAP,VM_HUGETLB,..., per David Hildenbrand
- prevent moving pinned pages, per Peter Xu
- changed uffd tests to call move uffd_test_ctx_clear() at the end of the
test run instead of in the beginning of the next run
- added support for testcase-specific ops
- added test for moving PMD-aligned blocks
Changes since v2 [5]:
- renamed UFFDIO_REMAP to UFFDIO_MOVE, per David Hildenbrand
- rebase over mm-unstable to use folio_move_anon_rmap(),
per David Hildenbrand
- added text for manpage explaining DONTFORK and KSM requirements for this
feature, per David Hildenbrand
- check for anon_vma changes in the fast path of folio_lock_anon_vma_read,
per Peter Xu
- updated the title and description of the first patch,
per David Hildenbrand
- updating comments in folio_lock_anon_vma_read() explaining the need for
anon_vma checks, per David Hildenbrand
- changed all mapcount checks to PageAnonExclusive, per Jann Horn and
David Hildenbrand
- changed counters in remap_swap_pte() from MM_ANONPAGES to MM_SWAPENTS,
per Jann Horn
- added a check for PTE change after folio is locked in remap_pages_pte(),
per Jann Horn
- added handling of PMD migration entries and bailout when pmd_devmap(),
per Jann Horn
- added checks to ensure both src and dst VMAs are writable, per Peter Xu
- added UFFD_FEATURE_MOVE, per Peter Xu
- removed obsolete comments, per Peter Xu
- renamed remap_anon_pte to remap_present_pte, per Peter Xu
- added a comment for folio_get_anon_vma() explaining the need for
anon_vma checks, per Peter Xu
- changed error handling in remap_pages() to make it more clear,
per Peter Xu
- changed EFAULT to EAGAIN to retry when a hugepage appears or disappears
from under us, per Peter Xu
- added links to previous upstreaming attempts, per David Hildenbrand
Changes since v1 [4]:
- add mmget_not_zero in userfaultfd_remap, per Jann Horn
- removed extern from function definitions, per Matthew Wilcox
- converted to folios in remap_pages_huge_pmd, per Matthew Wilcox
- use PageAnonExclusive in remap_pages_huge_pmd, per David Hildenbrand
- handle pgtable transfers between MMs, per Jann Horn
- ignore concurrent A/D pte bit changes, per Jann Horn
- split functions into smaller units, per David Hildenbrand
- test for folio_test_large in remap_anon_pte, per Matthew Wilcox
- use pte_swp_exclusive for swapcount check, per David Hildenbrand
- eliminated use of mmu_notifier_invalidate_range_start_nonblock,
per Jann Horn
- simplified THP alignment checks, per Jann Horn
- refactored the loop inside remap_pages, per Jann Horn
- additional clarifying comments, per Jann Horn
Main changes since Andrea's last version [1]:
- Trivial translations from page to folio, mmap_sem to mmap_lock
- Replace pmd_trans_unstable() with pte_offset_map_nolock() and handle its
possible failure
- Move pte mapping into remap_pages_pte to allow for retries when source
page or anon_vma is contended. Since pte_offset_map_nolock() start RCU
read section, we can't block anymore after mapping a pte, so have to unmap
the ptesm do the locking and retry.
- Add and use anon_vma_trylock_write() to avoid blocking while in RCU
read section.
- Accommodate changes in mmu_notifier_range_init() API, switch to
mmu_notifier_invalidate_range_start_nonblock() to avoid blocking while in
RCU read section.
- Open-code now removed __swp_swapcount()
- Replace pmd_read_atomic() with pmdp_get_lockless()
- Add new selftest for UFFDIO_MOVE
[1] https://gitlab.com/aarcange/aa/-/commit/2aec7aea56b10438a3881a20a411aa4b1fc…
[2] https://lore.kernel.org/all/1425575884-2574-1-git-send-email-aarcange@redha…
[3] https://lore.kernel.org/linux-mm/CA+EESO4uO84SSnBhArH4HvLNhaUQ5nZKNKXqxRCyj…
[4] https://lore.kernel.org/all/20230914152620.2743033-1-surenb@google.com/
[5] https://lore.kernel.org/all/20230923013148.1390521-1-surenb@google.com/
[6] https://lore.kernel.org/all/1425575884-2574-21-git-send-email-aarcange@redh…
[7] https://lore.kernel.org/all/cover.1547251023.git.blake.caldwell@colorado.ed…
[8] https://lore.kernel.org/all/20231009064230.2952396-1-surenb@google.com/
[9] https://lore.kernel.org/all/20231028003819.652322-1-surenb@google.com/
[10] https://lore.kernel.org/all/20231121171643.3719880-1-surenb@google.com/
Andrea Arcangeli (2):
mm/rmap: support move to different root anon_vma in
folio_move_anon_rmap()
userfaultfd: UFFDIO_MOVE uABI
Suren Baghdasaryan (3):
selftests/mm: call uffd_test_ctx_clear at the end of the test
selftests/mm: add uffd_test_case_ops to allow test case-specific
operations
selftests/mm: add UFFDIO_MOVE ioctl test
Documentation/admin-guide/mm/userfaultfd.rst | 3 +
fs/userfaultfd.c | 72 +++
include/linux/rmap.h | 5 +
include/linux/userfaultfd_k.h | 11 +
include/uapi/linux/userfaultfd.h | 29 +-
mm/huge_memory.c | 122 ++++
mm/khugepaged.c | 3 +
mm/rmap.c | 30 +
mm/userfaultfd.c | 614 +++++++++++++++++++
tools/testing/selftests/mm/uffd-common.c | 39 +-
tools/testing/selftests/mm/uffd-common.h | 9 +
tools/testing/selftests/mm/uffd-stress.c | 5 +-
tools/testing/selftests/mm/uffd-unit-tests.c | 192 ++++++
13 files changed, 1130 insertions(+), 4 deletions(-)
--
2.43.0.rc2.451.g8631bc7472-goog