Hi,
I noticed that only the RSEQ membarrier command allows specifying a specific cpu. I have a (extremely toy) lib I was playing around with [1] and noticed this and that being able to specify the cpu would be useful to me.
I'm by no means an expert in this code though - and so could be totally missing something.
Additionally this seems really difficult to actually test. I added a self test that just proces "some" interrupts are being sent, but nothing more than that. I don't know what else can be done there.
Patch 1 is the main change Patch 2 is the self-test. Which maybe you don't want at all.
[1]: https://github.com/DylanZA/rseqlock/commit/be7bc7214fd5aacec47e26126118f8bbd...
Thanks, Dylan
Dylan Yudaken (2): membarrier: allow cpu_id to be set on more commands membarrier: self test for cpu specific calls
kernel/sched/membarrier.c | 6 +- tools/testing/selftests/membarrier/.gitignore | 1 + tools/testing/selftests/membarrier/Makefile | 3 +- .../membarrier/membarrier_test_expedited.c | 135 ++++++++++++++++++ .../membarrier/membarrier_test_impl.h | 5 + 5 files changed, 148 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/membarrier/membarrier_test_expedited.c
base-commit: ee88bddf7f2f5d1f1da87dd7bedc734048b70e88
No reason to not allow MEMBARRIER_CMD_FLAG_CPU on MEMBARRIER_CMD_PRIVATE_EXPEDITED or MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE.
If it is known specifically what cpu you want to interrupt then there is a decent efficiency saving in not interrupting all the other ones.
Also - the code already works as is for them.
Signed-off-by: Dylan Yudaken dyudaken@gmail.com --- kernel/sched/membarrier.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c index 809194cd779f..def6d4094ad6 100644 --- a/kernel/sched/membarrier.c +++ b/kernel/sched/membarrier.c @@ -595,7 +595,9 @@ static int membarrier_get_registrations(void) * contains the CPU on which to interrupt (= restart) * the RSEQ critical section. * @cpu_id: if @flags == MEMBARRIER_CMD_FLAG_CPU, indicates the cpu on which - * RSEQ CS should be interrupted (@cmd must be + * RSEQ CS should be interrupted (@cmd must be one of + * MEMBARRIER_CMD_PRIVATE_EXPEDITED, + * MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, * MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ). * * If this system call is not implemented, -ENOSYS is returned. If the @@ -625,6 +627,8 @@ static int membarrier_get_registrations(void) SYSCALL_DEFINE3(membarrier, int, cmd, unsigned int, flags, int, cpu_id) { switch (cmd) { + case MEMBARRIER_CMD_PRIVATE_EXPEDITED: + case MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE: case MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ: if (unlikely(flags && flags != MEMBARRIER_CMD_FLAG_CPU)) return -EINVAL;
On 2025-06-26 11:52, Dylan Yudaken wrote:
No reason to not allow MEMBARRIER_CMD_FLAG_CPU on MEMBARRIER_CMD_PRIVATE_EXPEDITED or MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE.
If it is known specifically what cpu you want to interrupt then there is a decent efficiency saving in not interrupting all the other ones.
Also - the code already works as is for them.
Can you elaborate on a concrete use-case justifying adding this ?
Thanks,
Mathieu
Signed-off-by: Dylan Yudaken dyudaken@gmail.com
kernel/sched/membarrier.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c index 809194cd779f..def6d4094ad6 100644 --- a/kernel/sched/membarrier.c +++ b/kernel/sched/membarrier.c @@ -595,7 +595,9 @@ static int membarrier_get_registrations(void)
contains the CPU on which to interrupt (= restart)
the RSEQ critical section.
- @cpu_id: if @flags == MEMBARRIER_CMD_FLAG_CPU, indicates the cpu on which
RSEQ CS should be interrupted (@cmd must be
RSEQ CS should be interrupted (@cmd must be one of
MEMBARRIER_CMD_PRIVATE_EXPEDITED,
MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE,
MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ).
- If this system call is not implemented, -ENOSYS is returned. If the
@@ -625,6 +627,8 @@ static int membarrier_get_registrations(void) SYSCALL_DEFINE3(membarrier, int, cmd, unsigned int, flags, int, cpu_id) { switch (cmd) {
- case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
- case MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE: case MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ: if (unlikely(flags && flags != MEMBARRIER_CMD_FLAG_CPU)) return -EINVAL;
On Thu, Jun 26, 2025 at 5:07 PM Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
On 2025-06-26 11:52, Dylan Yudaken wrote:
No reason to not allow MEMBARRIER_CMD_FLAG_CPU on MEMBARRIER_CMD_PRIVATE_EXPEDITED or MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE.
If it is known specifically what cpu you want to interrupt then there is a decent efficiency saving in not interrupting all the other ones.
Also - the code already works as is for them.
Can you elaborate on a concrete use-case justifying adding this ?
Thanks,
Mathieu
So my use case is for core-local data such as performance counters.
I have a library that allows a fast thread to "lock" a core -> do some work (probably incrementing some performance counters) -> unlock. The "lock" uses restartable sequences (ie no serializing instructions), and the unlock just writes a 0 to memory (again, no serializing instructions).
A slow thread will occasionally (say every few minutes) try and read data computed in the work section. It does this by disabling locking and firing off a membarrier(RSEQ) on that core to be sure that the core is either "locked" or "unlocked". It then spins waiting for it to be unlocked. At this point my understanding is a bit fuzzy - but I believe you need that core to have a memory barrier since there is no serializing instruction and the processor would happily reorder some "work" after the "unlock" instruction.
That serializing instruction is what I want from this. But since I know the cpu_id that I am working with I don't need to do a barrier on _all_ the cores.
To be clear: (1) I don't have a current real world use case, and (2) my library/design/understanding might be buggy. (3) I don't have a use case for the SYNC_CORE part, but again it seemed easy enough to add and I presume others might have a use case.
Add a self test for the cpu specific calls to membarrier.
This works by figuring out the number of interrupts on a given core before/after calling membarrier(2) to assert that at least some interrupts have happened. This feels like it might be a bit flaky if for example the worker thread was switched out. To mitigate this there are some checks such as making sure it stays on one core, and also it asserts only 1 interrupt for every 2 calls to membarrier(2)
Signed-off-by: Dylan Yudaken dyudaken@gmail.com --- tools/testing/selftests/membarrier/.gitignore | 1 + tools/testing/selftests/membarrier/Makefile | 3 +- .../membarrier/membarrier_test_expedited.c | 135 ++++++++++++++++++ .../membarrier/membarrier_test_impl.h | 5 + 4 files changed, 143 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/membarrier/membarrier_test_expedited.c
diff --git a/tools/testing/selftests/membarrier/.gitignore b/tools/testing/selftests/membarrier/.gitignore index f2fbba178601..39cdadb11c01 100644 --- a/tools/testing/selftests/membarrier/.gitignore +++ b/tools/testing/selftests/membarrier/.gitignore @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only membarrier_test_multi_thread membarrier_test_single_thread +membarrier_test_expedited diff --git a/tools/testing/selftests/membarrier/Makefile b/tools/testing/selftests/membarrier/Makefile index fc840e06ff56..f3e7920a900c 100644 --- a/tools/testing/selftests/membarrier/Makefile +++ b/tools/testing/selftests/membarrier/Makefile @@ -3,6 +3,7 @@ CFLAGS += -g $(KHDR_INCLUDES) LDLIBS += -lpthread
TEST_GEN_PROGS := membarrier_test_single_thread \ - membarrier_test_multi_thread + membarrier_test_multi_thread \ + membarrier_test_expedited
include ../lib.mk diff --git a/tools/testing/selftests/membarrier/membarrier_test_expedited.c b/tools/testing/selftests/membarrier/membarrier_test_expedited.c new file mode 100644 index 000000000000..aaea36381282 --- /dev/null +++ b/tools/testing/selftests/membarrier/membarrier_test_expedited.c @@ -0,0 +1,135 @@ +// SPDX-License-Identifier: GPL-2.0 +#define _GNU_SOURCE +#include <linux/membarrier.h> +#include <syscall.h> +#include <stdio.h> +#include <string.h> +#include <pthread.h> +#include <stdatomic.h> + +#include "membarrier_test_impl.h" + +struct thread_state { + atomic_int thread_cpu; + atomic_bool end_thread; + pthread_mutex_t mutex; +}; + +void *test_membarrier_thread(void *arg) +{ + struct thread_state *ts = (struct thread_state *)arg; + + ts->thread_cpu = sched_getcpu(); + pthread_mutex_unlock(&ts->mutex); + if (ts->thread_cpu < 0) + return 0; + while (!ts->end_thread) + ts->thread_cpu = sched_getcpu(); + return NULL; +} + +static long read_interrupts(int cpu) +{ + char line[4096]; + FILE *fp = fopen("/proc/interrupts", "r"); + long res = 0; + + if (!fp) + ksft_exit_fail_msg("unable to open /proc/interrupts\n"); + + fgets(line, sizeof(line), fp); /* skip first line */ + while (fgets(line, sizeof(line), fp) != NULL) { + char *save; + int next_cpu = 0; + + for (char *token = strtok_r(line, " ", &save); token; + token = strtok_r(NULL, " ", &save)) { + if (*token < '0' || *token > '9') + continue; + if (next_cpu++ == cpu) + res += atol(token); + } + } + fclose(fp); + return res; +} + +static int test_membarrier(const char *name, int cmd, int register_cmd) +{ + int runs = 0; + long irq = 0; + pthread_t test_thread; + int ret = 0; + + struct thread_state ts = { .thread_cpu = -1, + .end_thread = 0, + .mutex = PTHREAD_MUTEX_INITIALIZER }; + if (sys_membarrier_cpu(cmd, 0) == 0) + ksft_exit_fail_msg("%s: expected failure before register\n", + name); + if (sys_membarrier(register_cmd, 0) != 0) + ksft_exit_fail_msg("%s: unable to register\n", name); + + /* nothing interesting in single processor machines */ + if (sysconf(_SC_NPROCESSORS_ONLN) == 1) + goto success; + + pthread_mutex_lock(&ts.mutex); + pthread_create(&test_thread, NULL, test_membarrier_thread, &ts); + + /* wait for thread to start */ + pthread_mutex_lock(&ts.mutex); + pthread_mutex_unlock(&ts.mutex); + + for (int i = 0; i < 1000; i++) { + int cpu_start, cpu_end, cpu_this; + long irq_start, irq_end; + + cpu_start = ts.thread_cpu; + if (cpu_start < 0) + ksft_exit_fail_msg("sched_getcpu() failed\n"); + + irq_start = read_interrupts(cpu_start); + if (sys_membarrier_cpu(cmd, cpu_start)) + ksft_exit_fail_msg("%s: sys_membarrier failed\n", name); + cpu_end = ts.thread_cpu; + cpu_this = sched_getcpu(); + + /* maybe it was moved to a different cpu, so we cannot trust the irq count */ + /* If we are on the same cpu we wouldnt expect an interrupt */ + if (cpu_end != cpu_start || cpu_this == cpu_end) + continue; + irq_end = read_interrupts(cpu_end); + irq += (irq_end - irq_start); + runs++; + } + ts.end_thread = 1; + pthread_join(test_thread, NULL); + + if (!runs) + ksft_exit_fail_msg("%s: no successful runs\n", name); + + /* Every run should probably have had an interrupt, but use at least half + * to be safe. + */ + if (irq < runs / 2) + ksft_exit_fail_msg("%s: only had %d / %d irqs\n", name, irq, + runs); +success: + ksft_test_result_pass("expedited %s\n", name); + return 0; +} + +int main(int argc, char **argv) +{ + ksft_print_header(); + ksft_set_plan(3); + + test_membarrier("EXPEDITED", MEMBARRIER_CMD_PRIVATE_EXPEDITED, + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED); + test_membarrier("RSEQ", MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ, + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ); + test_membarrier("SYNC_CORE", MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE); + ksft_exit_pass(); +} diff --git a/tools/testing/selftests/membarrier/membarrier_test_impl.h b/tools/testing/selftests/membarrier/membarrier_test_impl.h index af89855adb7b..c10a8af4612e 100644 --- a/tools/testing/selftests/membarrier/membarrier_test_impl.h +++ b/tools/testing/selftests/membarrier/membarrier_test_impl.h @@ -16,6 +16,11 @@ static int sys_membarrier(int cmd, int flags) return syscall(__NR_membarrier, cmd, flags); }
+static int sys_membarrier_cpu(int cmd, int cpu) +{ + return syscall(__NR_membarrier, cmd, MEMBARRIER_CMD_FLAG_CPU, cpu); +} + static int test_membarrier_get_registrations(int cmd) { int ret, flags = 0;
linux-kselftest-mirror@lists.linaro.org