Changelog: v2 -> v3
* Minimal code refactoring
* Rebased on v6.6-rc1
RFC v1:
https://lore.kernel.org/all/20210611124154.56427-1-psampat@linux.ibm.com/
RFC v2:
https://lore.kernel.org/all/20230828061530.126588-2-aboorvad@linux.vnet.ibm…
Other related RFC:
https://lore.kernel.org/all/20210430082804.38018-1-psampat@linux.ibm.com/
Userspace selftest:
https://lkml.org/lkml/2020/9/2/356
----
A kernel module + userspace driver to estimate the wakeup latency
caused by going into stop states. The motivation behind this program is
to find significant deviations behind advertised latency and residency
values.
The patchset measures latencies for two kinds of events. IPIs and Timers
As this is a software-only mechanism, there will be additional latencies
of the kernel-firmware-hardware interactions. To account for that, the
program also measures a baseline latency on a 100 percent loaded CPU
and the latencies achieved must be in view relative to that.
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/powerpc/latency_test/ for,
IPI test:
ipi_cpu_dest = Destination CPU for the IPI
ipi_cpu_src = Origin of the IPI
ipi_latency_ns = Measured latency time in ns
Timeout test:
timeout_cpu_src = CPU on which the timer to be queued
timeout_expected_ns = Timer duration
timeout_diff_ns = Difference of actual duration vs expected timer
Sample output is as follows:
# --IPI Latency Test---
# Baseline Avg IPI latency(ns): 2720
# Observed Avg IPI latency(ns) - State snooze: 2565
# Observed Avg IPI latency(ns) - State stop0_lite: 3856
# Observed Avg IPI latency(ns) - State stop0: 3670
# Observed Avg IPI latency(ns) - State stop1: 3872
# Observed Avg IPI latency(ns) - State stop2: 17421
# Observed Avg IPI latency(ns) - State stop4: 1003922
# Observed Avg IPI latency(ns) - State stop5: 1058870
#
# --Timeout Latency Test--
# Baseline Avg timeout diff(ns): 1435
# Observed Avg timeout diff(ns) - State snooze: 1709
# Observed Avg timeout diff(ns) - State stop0_lite: 2028
# Observed Avg timeout diff(ns) - State stop0: 1954
# Observed Avg timeout diff(ns) - State stop1: 1895
# Observed Avg timeout diff(ns) - State stop2: 14556
# Observed Avg timeout diff(ns) - State stop4: 873988
# Observed Avg timeout diff(ns) - State stop5: 959137
Aboorva Devarajan (2):
powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events
powerpc/selftest: Add support for cpuidle latency measurement
arch/powerpc/Kconfig.debug | 10 +
arch/powerpc/kernel/Makefile | 1 +
arch/powerpc/kernel/test_cpuidle_latency.c | 154 ++++++
tools/testing/selftests/powerpc/Makefile | 1 +
.../powerpc/cpuidle_latency/.gitignore | 2 +
.../powerpc/cpuidle_latency/Makefile | 6 +
.../cpuidle_latency/cpuidle_latency.sh | 443 ++++++++++++++++++
.../powerpc/cpuidle_latency/settings | 1 +
8 files changed, 618 insertions(+)
create mode 100644 arch/powerpc/kernel/test_cpuidle_latency.c
create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/Makefile
create mode 100755 tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/settings
--
2.25.1
When execute the following command to test clone3 under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
we can see the following error info:
# [7538] Trying clone3() with flags 0x80 (size 0)
# Invalid argument - Failed to create new process
# [7538] clone3() with flags says: -22 expected 0
not ok 18 [7538] Result (-22) is different than expected (0)
...
# Totals: pass:18 fail:1 xfail:0 xpass:0 skip:0 error:0
This is because if CONFIG_TIME_NS is not set, but the flag
CLONE_NEWTIME (0x80) is used to clone a time namespace, it
will return -EINVAL in copy_time_ns().
If kernel does not support CONFIG_TIME_NS, /proc/self/ns/time
will be not exist, and then we should skip clone3() test with
CLONE_NEWTIME.
With this patch under !CONFIG_TIME_NS:
# make headers && cd tools/testing/selftests/clone3 && make && ./clone3
...
# Time namespaces are not supported
ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME
...
# Totals: pass:18 fail:0 xfail:0 xpass:0 skip:1 error:0
Fixes: 515bddf0ec41 ("selftests/clone3: test clone3 with CLONE_NEWTIME")
Suggested-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Tiezhu Yang <yangtiezhu(a)loongson.cn>
---
v6: Rebase on 6.5-rc1 and update the commit message
tools/testing/selftests/clone3/clone3.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/clone3/clone3.c b/tools/testing/selftests/clone3/clone3.c
index e60cf4d..1c61e3c 100644
--- a/tools/testing/selftests/clone3/clone3.c
+++ b/tools/testing/selftests/clone3/clone3.c
@@ -196,7 +196,12 @@ int main(int argc, char *argv[])
CLONE3_ARGS_NO_TEST);
/* Do a clone3() in a new time namespace */
- test_clone3(CLONE_NEWTIME, 0, 0, CLONE3_ARGS_NO_TEST);
+ if (access("/proc/self/ns/time", F_OK) == 0) {
+ test_clone3(CLONE_NEWTIME, 0, 0, CLONE3_ARGS_NO_TEST);
+ } else {
+ ksft_print_msg("Time namespaces are not supported\n");
+ ksft_test_result_skip("Skipping clone3() with CLONE_NEWTIME\n");
+ }
/* Do a clone3() with exit signal (SIGCHLD) in flags */
test_clone3(SIGCHLD, 0, -EINVAL, CLONE3_ARGS_NO_TEST);
--
2.1.0
v2: Rebase on 6.5-rc1 and update the commit message
Tiezhu Yang (2):
selftests/vDSO: Add support for LoongArch
selftests/vDSO: Get version and name for all archs
tools/testing/selftests/vDSO/vdso_config.h | 6 ++++-
tools/testing/selftests/vDSO/vdso_test_getcpu.c | 16 +++++--------
.../selftests/vDSO/vdso_test_gettimeofday.c | 26 ++++++----------------
3 files changed, 18 insertions(+), 30 deletions(-)
--
2.1.0
And this is the last(?) revision of this series which should now compile
with or without CONFIG_HID_BPF set.
I had to do changes because [1] was failing
Nick, I kept your Tested-by, even if I made small changes in 1/3. Feel
free to shout if you don't want me to keep it.
Eduard, You helped us a lot in the review of v1 but never sent your
Reviewed-by or Acked-by. Do you want me to add one?
Cheers,
Benjamin
[1] https://gitlab.freedesktop.org/bentiss/hid/-/jobs/49754306
For reference, the v2 cover letter:
| Hi, I am sending this series on behalf of myself and Benjamin Tissoires. There
| existed an initial n=3 patch series which was later expanded to n=4 and
| is now back to n=3 with some fixes added in and rebased against
| mainline.
|
| This patch series aims to ensure that the hid/bpf selftests can be built
| without errors.
|
| Here's Benjamin's initial cover letter for context:
| | These fixes have been triggered by [0]:
| | basically, if you do not recompile the kernel first, and are
| | running on an old kernel, vmlinux.h doesn't have the required
| | symbols and the compilation fails.
| |
| | The tests will fail if you run them on that very same machine,
| | of course, but the binary should compile.
| |
| | And while I was sorting out why it was failing, I realized I
| | could do a couple of improvements on the Makefile.
| |
| | [0] https://lore.kernel.org/linux-input/56ba8125-2c6f-a9c9-d498-0ca1c153dcb2@re…
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Changes in v3:
- Also overwrite all of the enum symbols in patch 1/3
- Link to v2: https://lore.kernel.org/r/20230908-kselftest-09-08-v2-0-0def978a4c1b@google…
Changes in v2:
- roll Justin's fix into patch 1/3
- add __attribute__((preserve_access_index)) (thanks Eduard)
- rebased onto mainline (2dde18cd1d8fac735875f2e4987f11817cc0bc2c)
- Link to v1: https://lore.kernel.org/r/20230825-wip-selftests-v1-0-c862769020a8@kernel.o…
Link: https://github.com/ClangBuiltLinux/linux/issues/1698
Link: https://github.com/ClangBuiltLinux/continuous-integration2/issues/61
---
Benjamin Tissoires (3):
selftests/hid: ensure we can compile the tests on kernels pre-6.3
selftests/hid: do not manually call headers_install
selftests/hid: force using our compiled libbpf headers
tools/testing/selftests/hid/Makefile | 10 ++-
tools/testing/selftests/hid/progs/hid.c | 3 -
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 77 ++++++++++++++++++++++
3 files changed, 81 insertions(+), 9 deletions(-)
---
base-commit: 29aa98d0fe013e2ab62aae4266231b7fb05d47a2
change-id: 20230825-wip-selftests-9a7502b56542
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
A number of corner cases were caught when trying to run the selftests on
older systems. Missed skip conditions, some error cases, and outdated
python setups would all report failures but the issue would actually be
related to some other condition rather than the selftest suite.
Address these individual cases.
Aaron Conole (4):
selftests: openvswitch: Add version check for pyroute2
selftests: openvswitch: Catch cases where the tests are killed
selftests: openvswitch: Skip drop testing on older kernels
selftests: openvswitch: Fix the ct_tuple for v4
.../selftests/net/openvswitch/openvswitch.sh | 21 ++++++++-
.../selftests/net/openvswitch/ovs-dpctl.py | 46 ++++++++++++++++++-
2 files changed, 65 insertions(+), 2 deletions(-)
--
2.40.1
On Sun, Oct 8, 2023 at 7:22 AM Akihiko Odaki <akihiko.odaki(a)daynix.com> wrote:
>
> virtio-net have two usage of hashes: one is RSS and another is hash
> reporting. Conventionally the hash calculation was done by the VMM.
> However, computing the hash after the queue was chosen defeats the
> purpose of RSS.
>
> Another approach is to use eBPF steering program. This approach has
> another downside: it cannot report the calculated hash due to the
> restrictive nature of eBPF.
>
> Introduce the code to compute hashes to the kernel in order to overcome
> thse challenges. An alternative solution is to extend the eBPF steering
> program so that it will be able to report to the userspace, but it makes
> little sense to allow to implement different hashing algorithms with
> eBPF since the hash value reported by virtio-net is strictly defined by
> the specification.
>
> The hash value already stored in sk_buff is not used and computed
> independently since it may have been computed in a way not conformant
> with the specification.
>
> Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com>
> ---
> +static const struct tun_vnet_hash_cap tun_vnet_hash_cap = {
> + .max_indirection_table_length =
> + TUN_VNET_HASH_MAX_INDIRECTION_TABLE_LENGTH,
> +
> + .types = VIRTIO_NET_SUPPORTED_HASH_TYPES
> +};
No need to have explicit capabilities exchange like this? Tun either
supports all or none.
> case TUNSETSTEERINGEBPF:
> - ret = tun_set_ebpf(tun, &tun->steering_prog, argp);
> + bpf_ret = tun_set_ebpf(tun, &tun->steering_prog, argp);
> + if (IS_ERR(bpf_ret))
> + ret = PTR_ERR(bpf_ret);
> + else if (bpf_ret)
> + tun->vnet_hash.flags &= ~TUN_VNET_HASH_RSS;
Don't make one feature disable another.
TUNSETSTEERINGEBPF and TUNSETVNETHASH are mutually exclusive
functions. If one is enabled the other call should fail, with EBUSY
for instance.
> + case TUNSETVNETHASH:
> + len = sizeof(vnet_hash);
> + if (copy_from_user(&vnet_hash, argp, len)) {
> + ret = -EFAULT;
> + break;
> + }
> +
> + if (((vnet_hash.flags & TUN_VNET_HASH_REPORT) &&
> + (tun->vnet_hdr_sz < sizeof(struct virtio_net_hdr_v1_hash) ||
> + !tun_is_little_endian(tun))) ||
> + vnet_hash.indirection_table_mask >=
> + TUN_VNET_HASH_MAX_INDIRECTION_TABLE_LENGTH) {
> + ret = -EINVAL;
> + break;
> + }
> +
> + argp = (u8 __user *)argp + len;
> + len = (vnet_hash.indirection_table_mask + 1) * 2;
> + if (copy_from_user(vnet_hash_indirection_table, argp, len)) {
> + ret = -EFAULT;
> + break;
> + }
> +
> + argp = (u8 __user *)argp + len;
> + len = virtio_net_hash_key_length(vnet_hash.types);
> +
> + if (copy_from_user(vnet_hash_key, argp, len)) {
> + ret = -EFAULT;
> + break;
> + }
Probably easier and less error-prone to define a fixed size control
struct with the max indirection table size.
Btw: please trim the CC: list considerably on future patches.
qemu-system-ppc64 can handle both big and little endian kernels.
While some setups, like Debian, provide a symlink to execute
qemu-system-ppc64 as qemu-system-ppc64le, others, like ArchLinux, do not.
So always use qemu-system-ppc64 directly.
Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net>
---
tools/testing/selftests/nolibc/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/nolibc/Makefile b/tools/testing/selftests/nolibc/Makefile
index 891aa396163d..af60e07d3c12 100644
--- a/tools/testing/selftests/nolibc/Makefile
+++ b/tools/testing/selftests/nolibc/Makefile
@@ -82,7 +82,7 @@ QEMU_ARCH_arm = arm
QEMU_ARCH_mips = mipsel # works with malta_defconfig
QEMU_ARCH_ppc = ppc
QEMU_ARCH_ppc64 = ppc64
-QEMU_ARCH_ppc64le = ppc64le
+QEMU_ARCH_ppc64le = ppc64
QEMU_ARCH_riscv = riscv64
QEMU_ARCH_s390 = s390x
QEMU_ARCH_loongarch = loongarch64
---
base-commit: 361fbc295e965a3c7f606d281e6107e098d33730
change-id: 20231008-nolibc-qemu-ppc64-07b4f74043a6
Best regards,
--
Thomas Weißschuh <linux(a)weissschuh.net>
Hi all:
The core frequency is subjected to the process variation in semiconductors.
Not all cores are able to reach the maximum frequency respecting the
infrastructure limits. Consequently, AMD has redefined the concept of
maximum frequency of a part. This means that a fraction of cores can reach
maximum frequency. To find the best process scheduling policy for a given
scenario, OS needs to know the core ordering informed by the platform through
highest performance capability register of the CPPC interface.
Earlier implementations of amd-pstate preferred core only support a static
core ranking and targeted performance. Now it has the ability to dynamically
change the preferred core based on the workload and platform conditions and
accounting for thermals and aging.
Amd-pstate driver utilizes the functions and data structures provided by
the ITMT architecture to enable the scheduler to favor scheduling on cores
which can be get a higher frequency with lower voltage.
We call it amd-pstate preferred core.
Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable ITMT feature.
Amd-pstate driver uses the highest performance value to indicate
the priority of CPU. The higher value has a higher priority.
Amd-pstate driver will provide an initial core ordering at boot time.
It relies on the CPPC interface to communicate the core ranking to the
operating system and scheduler to make sure that OS is choosing the cores
with highest performance firstly for scheduling the process. When amd-pstate
driver receives a message with the highest performance change, it will
update the core ranking.
Changes form V7->V8:
- all:
- - pick up Review-By flag added by Mario and Ray.
- cpufreq: amd-pstate:
- - use hw_prefcore embeds into cpudata structure.
- - delete preferred core init from cpu online/off.
Changes form V6->V7:
- x86:
- - Modify kconfig about X86_AMD_PSTATE.
- cpufreq: amd-pstate:
- - modify incorrect comments about scheduler_work().
- - convert highest_perf data type.
- - modify preferred core init when cpu init and online.
- acpi: cppc:
- - modify link of CPPC highest performance.
- cpufreq:
- - modify link of CPPC highest performance changed.
Changes form V5->V6:
- cpufreq: amd-pstate:
- - modify the wrong tag order.
- - modify warning about hw_prefcore sysfs attribute.
- - delete duplicate comments.
- - modify the variable name cppc_highest_perf to prefcore_ranking.
- - modify judgment conditions for setting highest_perf.
- - modify sysfs attribute for CPPC highest perf to pr_debug message.
- Documentation: amd-pstate:
- - modify warning: title underline too short.
Changes form V4->V5:
- cpufreq: amd-pstate:
- - modify sysfs attribute for CPPC highest perf.
- - modify warning about comments
- - rebase linux-next
- cpufreq:
- - Moidfy warning about function declarations.
- Documentation: amd-pstate:
- - align with ``amd-pstat``
Changes form V3->V4:
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes form V2->V3:
- x86:
- - Modify kconfig and description.
- cpufreq: amd-pstate:
- - Add Co-developed-by tag in commit message.
- cpufreq:
- - Modify commit message.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
Changes form V1->V2:
- acpi: cppc:
- - Add reference link.
- cpufreq:
- - Moidfy link error.
- cpufreq: amd-pstate:
- - Init the priorities of all online CPUs
- - Use a single variable to represent the status of preferred core.
- Documentation:
- - Default enabled preferred core.
- Documentation: amd-pstate:
- - Modify inappropriate descriptions.
- - Default enabled preferred core.
- - Use a single variable to represent the status of preferred core.
Meng Li (7):
x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
acpi: cppc: Add get the highest performance cppc control
cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
cpufreq: Add a notification message that the highest perf has changed
cpufreq: amd-pstate: Update amd-pstate preferred core ranking
dynamically
Documentation: amd-pstate: introduce amd-pstate preferred core
Documentation: introduce amd-pstate preferrd core mode kernel command
line options
.../admin-guide/kernel-parameters.txt | 5 +
Documentation/admin-guide/pm/amd-pstate.rst | 59 +++++-
arch/x86/Kconfig | 5 +-
drivers/acpi/cppc_acpi.c | 13 ++
drivers/acpi/processor_driver.c | 6 +
drivers/cpufreq/amd-pstate.c | 186 ++++++++++++++++--
drivers/cpufreq/cpufreq.c | 13 ++
include/acpi/cppc_acpi.h | 5 +
include/linux/amd-pstate.h | 10 +
include/linux/cpufreq.h | 5 +
10 files changed, 285 insertions(+), 22 deletions(-)
--
2.34.1
Hi all,
This series implements the Permission Overlay Extension introduced in 2022
VMSA enhancements [1]. It is based on v6.6-rc3.
The Permission Overlay Extension allows to constrain permissions on memory
regions. This can be used from userspace (EL0) without a system call or TLB
invalidation.
POE is used to implement the Memory Protection Keys [2] Linux syscall.
The first few patches add the basic framework, then the PKEYS interface is
implemented, and then the selftests are made to work on arm64.
There was discussion about what the 'default' protection key value should be,
I used disallow-all (apart from pkey 0), which matches what x86 does.
Patch 15 contains a call to cpus_have_const_cap(), which I couldn't avoid
until Mark's patch to re-order when the alternatives were applied [3] is
committed.
The KVM part isn't tested yet.
I have tested the modified protection_keys test on x86_64 [4], but not PPC.
Hopefully I have CC'd everyone correctly.
Thanks,
Joey
Joey Gouly (20):
arm64/sysreg: add system register POR_EL{0,1}
arm64/sysreg: update CPACR_EL1 register
arm64: cpufeature: add Permission Overlay Extension cpucap
arm64: disable trapping of POR_EL0 to EL2
arm64: context switch POR_EL0 register
KVM: arm64: Save/restore POE registers
arm64: enable the Permission Overlay Extension for EL0
arm64: add POIndex defines
arm64: define VM_PKEY_BIT* for arm64
arm64: mask out POIndex when modifying a PTE
arm64: enable ARCH_HAS_PKEYS on arm64
arm64: handle PKEY/POE faults
arm64: stop using generic mm_hooks.h
arm64: implement PKEYS support
arm64: add POE signal support
arm64: enable PKEY support for CPUs with S1POE
arm64: enable POE and PIE to coexist
kselftest/arm64: move get_header()
selftests: mm: move fpregs printing
selftests: mm: make protection_keys test work on arm64
Documentation/arch/arm64/elf_hwcaps.rst | 3 +
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/el2_setup.h | 10 +-
arch/arm64/include/asm/hwcap.h | 1 +
arch/arm64/include/asm/kvm_host.h | 4 +
arch/arm64/include/asm/mman.h | 6 +-
arch/arm64/include/asm/mmu.h | 2 +
arch/arm64/include/asm/mmu_context.h | 51 ++++++-
arch/arm64/include/asm/pgtable-hwdef.h | 10 ++
arch/arm64/include/asm/pgtable-prot.h | 8 +-
arch/arm64/include/asm/pgtable.h | 28 +++-
arch/arm64/include/asm/pkeys.h | 110 ++++++++++++++
arch/arm64/include/asm/por.h | 33 +++++
arch/arm64/include/asm/processor.h | 1 +
arch/arm64/include/asm/sysreg.h | 16 ++
arch/arm64/include/asm/traps.h | 1 +
arch/arm64/include/uapi/asm/hwcap.h | 1 +
arch/arm64/include/uapi/asm/sigcontext.h | 7 +
arch/arm64/kernel/cpufeature.c | 15 ++
arch/arm64/kernel/cpuinfo.c | 1 +
arch/arm64/kernel/process.c | 16 ++
arch/arm64/kernel/signal.c | 51 +++++++
arch/arm64/kernel/traps.c | 12 +-
arch/arm64/kvm/sys_regs.c | 2 +
arch/arm64/mm/fault.c | 44 +++++-
arch/arm64/mm/mmap.c | 7 +
arch/arm64/mm/mmu.c | 38 +++++
arch/arm64/tools/cpucaps | 1 +
arch/arm64/tools/sysreg | 11 +-
fs/proc/task_mmu.c | 2 +
include/linux/mm.h | 11 +-
.../arm64/signal/testcases/testcases.c | 23 ---
.../arm64/signal/testcases/testcases.h | 26 +++-
tools/testing/selftests/mm/Makefile | 2 +-
tools/testing/selftests/mm/pkey-arm64.h | 138 ++++++++++++++++++
tools/testing/selftests/mm/pkey-helpers.h | 8 +
tools/testing/selftests/mm/pkey-powerpc.h | 3 +
tools/testing/selftests/mm/pkey-x86.h | 4 +
tools/testing/selftests/mm/protection_keys.c | 29 ++--
39 files changed, 685 insertions(+), 52 deletions(-)
create mode 100644 arch/arm64/include/asm/pkeys.h
create mode 100644 arch/arm64/include/asm/por.h
create mode 100644 tools/testing/selftests/mm/pkey-arm64.h
--
2.25.1