This patch series introduces a new char misc driver, /dev/ntsync, which is used
to implement Windows NT synchronization primitives.
This was previously submitted as an RFC [1]. Since there were no major changes
requested to the last RFC revision, I've stripped the RFC prefix.
[1] https://lore.kernel.org/lkml/20240131021356.10322-1-zfigura@codeweavers.com/
== Background ==
The Wine project emulates the Windows API in user space. One particular part of
that API, namely the NT synchronization primitives, have historically been
implemented via RPC to a dedicated "kernel" process. However, more recent
applications use these APIs more strenuously, and the overhead of RPC has become
a bottleneck.
The NT synchronization APIs are too complex to implement on top of existing
primitives without sacrificing correctness. Certain operations, such as
NtPulseEvent() or the "wait-for-all" mode of NtWaitForMultipleObjects(), require
direct control over the underlying wait queue, and implementing a wait queue
sufficiently robust for Wine in user space is not possible. This proposed
driver, therefore, implements the problematic interfaces directly in the Linux
kernel.
This driver was presented at Linux Plumbers Conference 2023. For those further
interested in the history of synchronization in Wine and past attempts to solve
this problem in user space, a recording of the presentation can be viewed here:
https://www.youtube.com/watch?v=NjU4nyWyhU8
== Performance ==
The gain in performance varies wildly depending on the application in question
and the user's hardware. For some games NT synchronization is not a bottleneck
and no change can be observed, but for others frame rate improvements of 50 to
150 percent are not atypical. The following table lists frame rate measurements
from a variety of games on a variety of hardware, taken by users Dmitry
Skvortsov, FuzzyQuils, OnMars, and myself:
Game Upstream ntsync improvement
===========================================================================
Anger Foot 69 99 43%
Call of Juarez 99.8 224.1 125%
Dirt 3 110.6 860.7 678%
Forza Horizon 5 108 160 48%
Lara Croft: Temple of Osiris 141 326 131%
Metro 2033 164.4 199.2 21%
Resident Evil 2 26 77 196%
The Crew 26 51 96%
Tiny Tina's Wonderlands 130 360 177%
Total War Saga: Troy 109 146 34%
===========================================================================
== Patches ==
The intended semantics of the patches are broadly intended to match those of the
corresponding Windows functions. For those not already familiar with the Windows
functions (or their undocumented behaviour), patch 31/31 provides a detailed
specification, and individual patches also include a brief description of the
API they are implementing.
The patches making use of this driver in Wine can be retrieved or browsed here:
https://repo.or.cz/wine/zf.git/shortlog/refs/heads/ntsync5
== Implementation ==
Some aspects of the implementation may deserve particular comment:
* In the interest of performance, each object is governed only by a single
spinlock. However, NTSYNC_IOC_WAIT_ALL requires that the state of multiple
objects be changed as a single atomic operation. In order to achieve this, we
first take a device-wide lock ("wait_all_lock") any time we are going to lock
more than one object at a time.
The maximum number of objects that can be used in a vectored wait, and
therefore the maximum that can be locked simultaneously, is 64. This number is
NT's own limit.
The acquisition of multiple spinlocks will degrade performance. This is a
conscious choice, however. Wait-for-all is known to be a very rare operation
in practice, especially with counts that approach the maximum, and it is the
intent of the ntsync driver to optimize wait-for-any at the expense of
wait-for-all as much as possible.
* NT mutexes are tied to their threads on an OS level, and the kernel includes
builtin support for "robust" mutexes. In order to keep the ntsync driver
self-contained and avoid touching more code than necessary, it does not hook
into task exit nor use pids.
Instead, the user space emulator is expected to manage thread IDs and pass
them as an argument to any relevant functions; this is the "owner" field of
ntsync_wait_args and ntsync_mutex_args.
When the emulator detects that a thread dies, it should therefore call
NTSYNC_IOC_MUTEX_KILL on any open mutexes.
* ntsync is module-capable mostly because there was nothing preventing it, and
because it aided development. It is not a hard requirement, though.
== Previous versions ==
Changes from the last (v2) RFC:
* Add a new wait flag NTSYNC_WAIT_REALTIME. I had originally missed a corner
case in NtWaitForMultipleObjects() related to its interaction with system time
adjustments. Essentially the function is sometimes supposed to respect system
time adjustments and sometimes supposed to ignore them, so in order to achieve
this I've added a function that controls which flag is being synchronized to.
Thanks Piotr Caban for catching this.
* Add tests for overflowing semaphore and mutex counters, and a test for
exceeding NTSYNC_MAX_WAIT_COUNT, per Andi Kleen.
* Add a more intense and realistic test involving multiple threads using the
same mutex to access data, per Andi Kleen.
* Use check_add_overflow() instead of writing out overflow checking manually
[and thereby avoid relying on -fwrapv].
* Add some missing headers that were being implicitly included: atomic.h,
hrtimer.h, ktime.h, sched.h, sched/signal.h, spinlock.h.
* Link to RFC v2: https://lore.kernel.org/lkml/20240131021356.10322-1-zfigura@codeweavers.com/
* Link to RFC v1: https://lore.kernel.org/lkml/20240124004028.16826-1-zfigura@codeweavers.com/
Elizabeth Figura (31):
ntsync: Introduce the ntsync driver and character device.
ntsync: Introduce NTSYNC_IOC_CREATE_SEM.
ntsync: Introduce NTSYNC_IOC_SEM_POST.
ntsync: Introduce NTSYNC_IOC_WAIT_ANY.
ntsync: Introduce NTSYNC_IOC_WAIT_ALL.
ntsync: Introduce NTSYNC_IOC_CREATE_MUTEX.
ntsync: Introduce NTSYNC_IOC_MUTEX_UNLOCK.
ntsync: Introduce NTSYNC_IOC_MUTEX_KILL.
ntsync: Introduce NTSYNC_IOC_CREATE_EVENT.
ntsync: Introduce NTSYNC_IOC_EVENT_SET.
ntsync: Introduce NTSYNC_IOC_EVENT_RESET.
ntsync: Introduce NTSYNC_IOC_EVENT_PULSE.
ntsync: Introduce NTSYNC_IOC_SEM_READ.
ntsync: Introduce NTSYNC_IOC_MUTEX_READ.
ntsync: Introduce NTSYNC_IOC_EVENT_READ.
ntsync: Introduce alertable waits.
ntsync: Allow waits to use the REALTIME clock.
selftests: ntsync: Add some tests for semaphore state.
selftests: ntsync: Add some tests for mutex state.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ANY.
selftests: ntsync: Add some tests for wakeup signaling with
WINESYNC_IOC_WAIT_ALL.
selftests: ntsync: Add some tests for manual-reset event state.
selftests: ntsync: Add some tests for auto-reset event state.
selftests: ntsync: Add some tests for wakeup signaling with events.
selftests: ntsync: Add tests for alertable waits.
selftests: ntsync: Add some tests for wakeup signaling via alerts.
selftests: ntsync: Add a stress test for contended waits.
maintainers: Add an entry for ntsync.
docs: ntsync: Add documentation for the ntsync uAPI.
Documentation/userspace-api/index.rst | 1 +
.../userspace-api/ioctl/ioctl-number.rst | 2 +
Documentation/userspace-api/ntsync.rst | 399 +++++
MAINTAINERS | 9 +
drivers/misc/Kconfig | 9 +
drivers/misc/Makefile | 1 +
drivers/misc/ntsync.c | 1146 ++++++++++++++
include/uapi/linux/ntsync.h | 62 +
tools/testing/selftests/Makefile | 1 +
.../testing/selftests/drivers/ntsync/Makefile | 8 +
tools/testing/selftests/drivers/ntsync/config | 1 +
.../testing/selftests/drivers/ntsync/ntsync.c | 1407 +++++++++++++++++
12 files changed, 3046 insertions(+)
create mode 100644 Documentation/userspace-api/ntsync.rst
create mode 100644 drivers/misc/ntsync.c
create mode 100644 include/uapi/linux/ntsync.h
create mode 100644 tools/testing/selftests/drivers/ntsync/Makefile
create mode 100644 tools/testing/selftests/drivers/ntsync/config
create mode 100644 tools/testing/selftests/drivers/ntsync/ntsync.c
base-commit: e21817acb23ece75d41a4fa7b40c85550f147389
--
2.43.0
When adapting the test to the kselftest framework, a few printf() calls
indicating test progress were not updated.
Fix this by replacing these printf() calls by ksft_print_msg() calls.
Fixes: ce7d101750ff8450 ("selftests: timers: clocksource-switch: adapt to kselftest framework")
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
---
When just running the test, the output looks like:
# Validating clocksource arch_sys_counter
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
# Validating clocksource ffca0000.timer
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
When redirecting the test output to a file, the progress prints are not
interspersed with the test output, but collated at the end:
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
TAP version 13
1..12
ok 1 CLOCK_REALTIME
...
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:6 error:0
# Validating clocksource arch_sys_counter
# Validating clocksource ffca0000.timer
...
This makes it hard to match the test results with the timer under test.
Is there a way to fix this? The test does use fork().
Thanks!
---
tools/testing/selftests/timers/clocksource-switch.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/timers/clocksource-switch.c b/tools/testing/selftests/timers/clocksource-switch.c
index c5264594064c8516..83faa4e354e389c2 100644
--- a/tools/testing/selftests/timers/clocksource-switch.c
+++ b/tools/testing/selftests/timers/clocksource-switch.c
@@ -156,8 +156,8 @@ int main(int argc, char **argv)
/* Check everything is sane before we start switching asynchronously */
if (do_sanity_check) {
for (i = 0; i < count; i++) {
- printf("Validating clocksource %s\n",
- clocksource_list[i]);
+ ksft_print_msg("Validating clocksource %s\n",
+ clocksource_list[i]);
if (change_clocksource(clocksource_list[i])) {
status = -1;
goto out;
@@ -169,7 +169,7 @@ int main(int argc, char **argv)
}
}
- printf("Running Asynchronous Switching Tests...\n");
+ ksft_print_msg("Running Asynchronous Switching Tests...\n");
pid = fork();
if (!pid)
return run_tests(runtime);
--
2.34.1
Make sure the IOAM data insertion is not applied on cloned skb's. As a
consequence, ioam selftests needed a refactoring.
Justin Iurman (2):
ioam6: fix write to cloned skb in ipv6_hop_ioam()
selftests: ioam6: refactoring to align with the fix
net/ipv6/exthdrs.c | 8 ++
tools/testing/selftests/net/ioam6.sh | 38 ++++----
tools/testing/selftests/net/ioam6_parser.c | 101 +++++++++++----------
3 files changed, 81 insertions(+), 66 deletions(-)
base-commit: 166c2c8a6a4dc2e4ceba9e10cfe81c3e469e3210
--
2.34.1
Hi!
When running selftests for our subsystem in our CI we'd like all
tests to pass. Currently some tests use SKIP for cases they
expect to fail, because the kselftest_harness limits the return
codes to pass/fail/skip.
Clean up and support the use of the full range of ksft exit codes
under kselftest_harness.
To avoid conflicts and get the functionality into the networking
tree ASAP I'd like to put the patches on shared branch so that
both linux-kselftest and net-next can pull it in. Shuah, please
LMK if that'd work for you, and if so which -rc should I base
the branch on. Or is merging directly into net-next okay?
Jakub Kicinski (4):
selftests: kselftest_harness: pass step via shared memory
selftests: kselftest_harness: use KSFT_* exit codes
selftests: kselftest_harness: support using xfail
selftests: ip_local_port_range: use XFAIL instead of SKIP
tools/testing/selftests/kselftest_harness.h | 67 ++++++++++++++-----
.../selftests/net/ip_local_port_range.c | 2 +-
2 files changed, 52 insertions(+), 17 deletions(-)
--
2.43.0
Should set the SSADE (Second Stage Access/Dirty bit Enable) bit of the
pasid entry when attaching a device to a nested domain if its parent
has already enabled dirty tracking.
Fixes: 111bf85c68f6 ("iommu/vt-d: Add helper to setup pasid nested translation")
Signed-off-by: Yi Liu <yi.l.liu(a)intel.com>
---
base commit: 547ab8fc4cb04a1a6b34377dd8fad34cd2c8a8e3
---
drivers/iommu/intel/pasid.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 3239cefa4c33..9be24bb762cf 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -658,6 +658,8 @@ int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
pasid_set_domain_id(pte, did);
pasid_set_address_width(pte, s2_domain->agaw);
pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
+ if (s2_domain->dirty_tracking)
+ pasid_set_ssade(pte);
pasid_set_translation_type(pte, PASID_ENTRY_PGTT_NESTED);
pasid_set_present(pte);
spin_unlock(&iommu->lock);
--
2.34.1
Add a common result printing helper and always include test name
in the result line. Previously when SKIP or XPASS would happen
we printed:
ok 1 # SKIP unknown
without the test name. Now we'll print:
ok 1 global.no_pad # SKIP unknown
This appears to be more inline with:
https://docs.kernel.org/dev-tools/ktap.html
and makes parsing results easier.
First 3 patches rearrange kselftest_harness to use exit code
as an enum rather than separate passed/skip/xfail members.
Rest of the series builds a ksft_test_result_code() helper.
This series is on top of:
https://lore.kernel.org/all/20240216002619.1999225-1-kuba@kernel.org/
Jakub Kicinski (7):
selftests: kselftest_harness: generate test name once
selftests: kselftest_harness: save full exit code in metadata
selftests: kselftest_harness: use exit code to store skip and xfail
selftests: kselftest: add ksft_test_result_code(), handling all exit
codes
selftests: kselftest_harness: print test name for SKIP and XFAIL
selftests: kselftest_harness: let ksft_test_result_code() handle line
termination
selftests: kselftest_harness: let PASS / FAIL provide diagnostic
tools/testing/selftests/kselftest.h | 45 ++++++++++
tools/testing/selftests/kselftest_harness.h | 96 ++++++++++-----------
tools/testing/selftests/net/tls.c | 2 +-
3 files changed, 91 insertions(+), 52 deletions(-)
--
2.43.0