On Wed, Nov 13, 2024 at 2:31 AM Paolo Bonzini <pbonzini(a)redhat.com> wrote:
>
>
>
> Il mar 12 nov 2024, 21:44 Doug Covelli <doug.covelli(a)broadcom.com> ha scritto:
>>
>> > Split irqchip should be the best tradeoff. Without it, moves from cr8
>> > stay in the kernel, but moves to cr8 always go to userspace with a
>> > KVM_EXIT_SET_TPR exit. You also won't be able to use Intel
>> > flexpriority (in-processor accelerated TPR) because KVM does not know
>> > which bits are set in IRR. So it will be *really* every move to cr8
>> > that goes to userspace.
>>
>> Sorry to hijack this thread but is there a technical reason not to allow CR8
>> based accesses to the TPR (not MMIO accesses) when the in-kernel local APIC is
>> not in use?
>
>
> No worries, you're not hijacking :) The only reason is that it would be more code for a seldom used feature and anyway with worse performance. (To be clear, CR8 based accesses are allowed, but stores cause an exit in order to check the new TPR against IRR. That's because KVM's API does not have an equivalent of the TPR threshold as you point out below).
I have not really looked at the code but it seems like it could also
simplify things as CR8 would be handled more uniformly regardless of
who is virtualizing the local APIC.
>> Also I could not find these documented anywhere but with MSFT's APIC our monitor
>> relies on extensions for trapping certain events such as INIT/SIPI plus LINT0
>> and SVR writes:
>>
>> UINT64 X64ApicInitSipiExitTrap : 1; // WHvRunVpExitReasonX64ApicInitSipiTrap
>> UINT64 X64ApicWriteLint0ExitTrap : 1; // WHvRunVpExitReasonX64ApicWriteTrap
>> UINT64 X64ApicWriteLint1ExitTrap : 1; // WHvRunVpExitReasonX64ApicWriteTrap
>> UINT64 X64ApicWriteSvrExitTrap : 1; // WHvRunVpExitReasonX64ApicWriteTrap
>
>
> There's no need for this in KVM's in-kernel APIC model. INIT and SIPI are handled in the hypervisor and you can get the current state of APs via KVM_GET_MPSTATE. LINT0 and LINT1 are injected with KVM_INTERRUPT and KVM_NMI respectively, and they obey IF/PPR and NMI blocking respectively, plus the interrupt shadow; so there's no need for userspace to know when LINT0/LINT1 themselves change. The spurious interrupt vector register is also handled completely in kernel.
I realize that KVM can handle LINT0/SVR updates themselves but our
interrupt subsystem relies on knowing the current values of these
registers even when not virtualizing the local APIC. I suppose we
could use KVM_GET_LAPIC to sync things up on demand but that seems
like it might nor be great from a performance point of view.
>> I did not see any similar functionality for KVM. Does anything like that exist?
>> In any case we would be happy to add support for handling CR8 accesses w/o
>> exiting w/o the in-kernel APIC along with some sort of a way to configure the
>> TPR threshold if folks are not opposed to that.
>
>
> As far I know everybody who's using KVM (whether proprietary or open source) has had no need for that, so I don't think it's a good idea to make the API more complex. Performance of Windows guests is going to be bad anyway with userspace APIC.
From what I have seen the exit cost with KVM is significantly lower
than with WHP/Hyper-V. I don't think performance of Windows guests
with userspace APIC emulation would be bad if CR8 exits could be
avoided (Linux guests perf isn't bad from what I have observed and the
main difference is the astronomical number of CR8 exits). It seems
like it would be pretty decent although I agree if you want the
absolute best performance then you would want to use the in kernel
APIC to speed up handling of ICR/EOI writes but those are relatively
infrequent compared to CR8 accesses .
Anyway I just saw Sean's response while writing this and it seems he
is not in favor of avoiding CR8 exits w/o the in kernel APIC either so
I suppose we will have to look into making use of the in kernel APIC.
Doug
> Paolo
>
>> Doug
>>
>> > > For now I think it makes sense to handle BDOOR_CMD_GET_VCPU_INFO at userlevel
>> > > like we do on Windows and macOS.
>> > >
>> > > BDOOR_CMD_GETTIME/BDOOR_CMD_GETTIMEFULL are similar with the former being
>> > > deprecated in favor of the latter. Both do essentially the same thing which is
>> > > to return the host OS's time - on Linux this is obtained via gettimeofday. I
>> > > believe this is mainly used by tools to fix up the VM's time when resuming from
>> > > suspend. I think it is fine to continue handling these at userlevel.
>> >
>> > As long as the TSC is not involved it should be okay.
>> >
>> > Paolo
>> >
>> > > > >> Anyway, one question apart from this: is the API the same for the I/O
>> > > > >> port and hypercall backdoors?
>> > > > >
>> > > > > Yeah the calls and arguments are the same. The hypercall based
>> > > > > interface is an attempt to modernize the backdoor since as you pointed
>> > > > > out the I/O based interface is kind of hacky as it bypasses the normal
>> > > > > checks for an I/O port access at CPL3. It would be nice to get rid of
>> > > > > it but unfortunately I don't think that will happen in the foreseeable
>> > > > > future as there are a lot of existing VMs out there with older SW that
>> > > > > still uses this interface.
>> > > >
>> > > > Yeah, but I think it still justifies that the KVM_ENABLE_CAP API can
>> > > > enable the hypercall but not the I/O port.
>> > > >
>> > > > Paolo
>> >
>>
>> --
>> This electronic communication and the information and any files transmitted
>> with it, or attached to it, are confidential and are intended solely for
>> the use of the individual or entity to whom it is addressed and may contain
>> information that is confidential, legally privileged, protected by privacy
>> laws, or otherwise restricted from disclosure to anyone else. If you are
>> not the intended recipient or the person responsible for delivering the
>> e-mail to the intended recipient, you are hereby notified that any use,
>> copying, distributing, dissemination, forwarding, printing, or copying of
>> this e-mail is strictly prohibited. If you received this e-mail in error,
>> please return the e-mail to the sender, delete it from your computer, and
>> destroy any printed copy of it.
>>
--
This electronic communication and the information and any files transmitted
with it, or attached to it, are confidential and are intended solely for
the use of the individual or entity to whom it is addressed and may contain
information that is confidential, legally privileged, protected by privacy
laws, or otherwise restricted from disclosure to anyone else. If you are
not the intended recipient or the person responsible for delivering the
e-mail to the intended recipient, you are hereby notified that any use,
copying, distributing, dissemination, forwarding, printing, or copying of
this e-mail is strictly prohibited. If you received this e-mail in error,
please return the e-mail to the sender, delete it from your computer, and
destroy any printed copy of it.
Hi All,
This patch-set aims to improve precision of BPF_MUL and add testcases
to illustrate precision gains using signed and unsigned bounds.
Thanks for taking the time to review and for all the feedback!
Best,
Matan
Changes from v1:
- Fixed typo made in patch.
Changes from v2:
- Added signed multiplication to BPF_MUL.
- Added test cases to exercise BPF_MUL.
- Reordered patches in the series.
Changes from v3:
- Coding style fixes.
Matan Shachnai (2):
bpf, verifier: Improve precision of BPF_MUL
selftests/bpf: Add testcases for BPF_MUL
kernel/bpf/verifier.c | 80 +++++------
.../selftests/bpf/progs/verifier_bounds.c | 134 ++++++++++++++++++
2 files changed, 170 insertions(+), 44 deletions(-)
--
2.25.1
In commit 03c7527e97f7 ("KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits
to be overridden") we made that bitfield in the ID registers unwritable
however the change neglected to make the corresponding update to set_id_regs
resulting in it failing:
# ok 56 ID_AA64MMFR0_EL1_BIGEND
# ==== Test Assertion Failure ====
# aarch64/set_id_regs.c:434: masks[idx] & ftr_bits[j].mask == ftr_bits[j].mask
# pid=5566 tid=5566 errno=22 - Invalid argument
# 1 0x00000000004034a7: test_vm_ftr_id_regs at set_id_regs.c:434
# 2 0x0000000000401b53: main at set_id_regs.c:684
# 3 0x0000ffff8e6b7543: ?? ??:0
# 4 0x0000ffff8e6b7617: ?? ??:0
# 5 0x0000000000401e6f: _start at ??:?
# 0 != 0xf0 (masks[idx] & ftr_bits[j].mask != ftr_bits[j].mask)
not ok 8 selftests: kvm: set_id_regs # exit=254
Remove ID_AA64MMFR1_EL1.ASIDBITS from the set of bitfields we test for
writeability.
Fixes: 03c7527e97f7 ("KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden")
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/kvm/aarch64/set_id_regs.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/aarch64/set_id_regs.c b/tools/testing/selftests/kvm/aarch64/set_id_regs.c
index a79b7f18452d2ec336ae623b8aa5c9cf329b6b4e..3a97c160b5fec990aaf8dfaf100a907b913f057c 100644
--- a/tools/testing/selftests/kvm/aarch64/set_id_regs.c
+++ b/tools/testing/selftests/kvm/aarch64/set_id_regs.c
@@ -152,7 +152,6 @@ static const struct reg_ftr_bits ftr_id_aa64mmfr0_el1[] = {
REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, BIGENDEL0, 0),
REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, SNSMEM, 0),
REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, BIGEND, 0),
- REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, ASIDBITS, 0),
REG_FTR_BITS(FTR_LOWER_SAFE, ID_AA64MMFR0_EL1, PARANGE, 0),
REG_FTR_END,
};
---
base-commit: 78d4f34e2115b517bcbfe7ec0d018bbbb6f9b0b8
change-id: 20241216-kvm-arm64-fix-set-id-asidbits-9bede25b7ad3
Best regards,
--
Mark Brown <broonie(a)kernel.org>
Hi All,
This patch-set aims to improve precision of BPF_MUL and add testcases
to illustrate precision gains using signed and unsigned bounds.
Thanks for taking the time to review and specifically for Eduard's feedback!
Best,
Matan
Changes from v1:
- Fixed typo made in patch
Changes from v2:
- Added signed multiplication to BPF_MUL
- Added test cases to exercise BPF_MUL
- Reordered patches in the series.
Matan Shachnai (2):
bpf, verifier: Improve precision of BPF_MUL
selftests/bpf: Add testcases for BPF_MUL
kernel/bpf/verifier.c | 72 +++++-----
.../selftests/bpf/progs/verifier_bounds.c | 134 ++++++++++++++++++
2 files changed, 166 insertions(+), 40 deletions(-)
--
2.25.1
If the selftest is not running as root, it should skip not
fail and give an appropriate warning to the user. This patch adds
ksft_exit_skip() if the test is not running as root.
Logs:
Before change:
TAP version 13
1..1
ok 1 # SKIP This test needs root to run!
After change:
TAP version 13
1..1
ok 2 # SKIP This test needs root to run!
Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0
Signed-off-by: Shivam Chaudhary <cvam0000(a)gmail.com>
---
v1->v2 : Replace ksft_exit_fail_msg -> ksft_exit_skip
v1 : https://lore.kernel.org/all/20241115191721.621381-1-cvam0000@gmail.com/
tools/testing/selftests/acct/acct_syscall.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/acct/acct_syscall.c b/tools/testing/selftests/acct/acct_syscall.c
index e44e8fe1f4a3..87c044fb9293 100644
--- a/tools/testing/selftests/acct/acct_syscall.c
+++ b/tools/testing/selftests/acct/acct_syscall.c
@@ -24,7 +24,7 @@ int main(void)
// Check if test is run a root
if (geteuid()) {
- ksft_test_result_skip("This test needs root to run!\n");
+ ksft_exit_skip("This test needs root to run!\n");
return 1;
}
--
2.34.1
When adapting the test to the kselftest framework, a few printf() calls
indicating test progress were not updated.
Fix this by replacing these printf() calls by ksft_print_msg() calls.
Fixes: ce7d101750ff8450 ("selftests: timers: clocksource-switch: adapt to kselftest framework")
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
---
v2:
- Add Reviewed-by.
When just running the test, the output looks like:
# Validating clocksource arch_sys_counter
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
# Validating clocksource ffca0000.timer
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
When redirecting the test output to a file, the progress prints are not
interspersed with the test output, but collated at the end:
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:6 error:0
# Validating clocksource arch_sys_counter
# Validating clocksource ffca0000.timer
...
This makes it hard to match the test results with the timer under test.
Is there a way to fix this? The test does use fork().
Thanks!
---
tools/testing/selftests/timers/clocksource-switch.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/timers/clocksource-switch.c b/tools/testing/selftests/timers/clocksource-switch.c
index c5264594064c8516..83faa4e354e389c2 100644
--- a/tools/testing/selftests/timers/clocksource-switch.c
+++ b/tools/testing/selftests/timers/clocksource-switch.c
@@ -156,8 +156,8 @@ int main(int argc, char **argv)
/* Check everything is sane before we start switching asynchronously */
if (do_sanity_check) {
for (i = 0; i < count; i++) {
- printf("Validating clocksource %s\n",
- clocksource_list[i]);
+ ksft_print_msg("Validating clocksource %s\n",
+ clocksource_list[i]);
if (change_clocksource(clocksource_list[i])) {
status = -1;
goto out;
@@ -169,7 +169,7 @@ int main(int argc, char **argv)
}
}
- printf("Running Asynchronous Switching Tests...\n");
+ ksft_print_msg("Running Asynchronous Switching Tests...\n");
pid = fork();
if (!pid)
return run_tests(runtime);
--
2.34.1
Hi all,
This patch series continues the work to migrate the script tests into
prog_tests.
test_xdp_meta.sh uses the BPF programs defined in progs/test_xdp_meta.c
to do a simple XDP/TC functional test that checks the metadata
allocation performed by the bpf_xdp_adjust_meta() helper.
This is already partly covered by two tests under prog_tests/:
- xdp_context_test_run.c uses bpf_prog_test_run_opts() to verify the
validity of the xdp_md context after a call to bpf_xdp_adjust_meta()
- xdp_metadata.c ensures that these meta-data can be exchanged through
an AF_XDP socket.
However test_xdp_meta.sh also verifies that the meta-data initialized
in the struct xdp_md is forwarded to the struct __sk_buff used by BPF
programs at 'TC level'. To cover this, I add a test case in
xdp_context_test_run.c that uses the same BPF programs from
progs/test_xdp_meta.c.
---
Changes in v2:
- Add missing close_netns()
- Use a unique 'close' label
- Link to v1: https://lore.kernel.org/r/20241206-xdp_meta-v1-0-5c150618f6e9@bootlin.com
---
Bastien Curutchet (2):
selftests/bpf: test_xdp_meta: Rename BPF sections
selftests/bpf: Migrate test_xdp_meta.sh into xdp_context_test_run.c
tools/testing/selftests/bpf/Makefile | 1 -
.../bpf/prog_tests/xdp_context_test_run.c | 87 ++++++++++++++++++++++
tools/testing/selftests/bpf/progs/test_xdp_meta.c | 4 +-
tools/testing/selftests/bpf/test_xdp_meta.sh | 58 ---------------
4 files changed, 89 insertions(+), 61 deletions(-)
---
base-commit: 0c30734c4f35c4784d3d3ca1bb89d9779045878c
change-id: 20241203-xdp_meta-868307cd0e03
Best regards,
--
Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>