Real-time setups try hard to ensure proper isolation between time
critical applications and e.g. network processing performed by the
network stack in softirq and RPS is used to move the softirq
activity away from the isolated core.
If the network configuration is dynamic, with netns and devices
routinely created at run-time, enforcing the correct RPS setting
on each newly created device allowing to transient bad configuration
became complex.
These series try to address the above, introducing a new
sysctl knob: rps_default_mask. The new sysctl entry allows
configuring a systemwide RPS mask, to be enforced since receive
queue creation time without any fourther per device configuration
required.
Additionally, a simple self-test is introduced to check the
rps_default_mask behavior.
v1 -> v2:
- fix sparse warning in patch 2/3
Paolo Abeni (3):
net/sysctl: factor-out netdev_rx_queue_set_rps_mask() helper
net/core: introduce default_rps_mask netns attribute
self-tests: introduce self-tests for RPS default mask
Documentation/admin-guide/sysctl/net.rst | 6 ++
include/linux/netdevice.h | 1 +
net/core/net-sysfs.c | 73 +++++++++++--------
net/core/sysctl_net_core.c | 58 +++++++++++++++
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/config | 3 +
.../testing/selftests/net/rps_default_mask.sh | 57 +++++++++++++++
7 files changed, 169 insertions(+), 30 deletions(-)
create mode 100755 tools/testing/selftests/net/rps_default_mask.sh
--
2.26.2
From: Vincent Cheng <vincent.cheng.xh(a)renesas.com>
This series adds adjust phase to the PTP Hardware Clock device interface.
Some PTP hardware clocks have a write phase mode that has
a built-in hardware filtering capability. The write phase mode
utilizes a phase offset control word instead of a frequency offset
control word. Add adjust phase function to take advantage of this
capability.
Changes since v1:
- As suggested by Richard Cochran:
1. ops->adjphase is new so need to check for non-null function pointer.
2. Kernel coding style uses lower_case_underscores.
3. Use existing PTP clock API for delayed worker.
Vincent Cheng (3):
ptp: Add adjphase function to support phase offset control.
ptp: Add adjust_phase to ptp_clock_caps capability.
ptp: ptp_clockmatrix: Add adjphase() to support PHC write phase mode.
drivers/ptp/ptp_chardev.c | 1 +
drivers/ptp/ptp_clock.c | 3 ++
drivers/ptp/ptp_clockmatrix.c | 92 +++++++++++++++++++++++++++++++++++
drivers/ptp/ptp_clockmatrix.h | 8 ++-
include/linux/ptp_clock_kernel.h | 6 ++-
include/uapi/linux/ptp_clock.h | 4 +-
tools/testing/selftests/ptp/testptp.c | 6 ++-
7 files changed, 114 insertions(+), 6 deletions(-)
--
2.7.4
Add RSEQ, restartable sequence, support and related selftest to RISCV.
The Kconfig option HAVE_REGS_AND_STACK_ACCESS_API is also required by
RSEQ because RSEQ will modify the content of pt_regs.sepc through
instruction_pointer_set() during the fixup procedure. In order to select
the config HAVE_REGS_AND_STACK_ACCESS_API, the missing APIs for accessing
pt_regs are also added in this patch set.
The relevant RSEQ tests in kselftest require the Binutils patch "RISC-V:
Fix linker problems with TLS copy relocs" to avoid placing
PREINIT_ARRAY and TLS variable of librseq.so at the same address.
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=3e7bd7f…
A segmental fault will happen if binutils misses this patch.
Patrick Stählin (1):
riscv: add required functions to enable HAVE_REGS_AND_STACK_ACCESS_API
Vincent Chen (2):
riscv: Add support for restartable sequence
rseq/selftests: Add support for riscv
Changes since v1:
1. Use the correct register name to access pt_regs
arch/riscv/Kconfig | 2 +
arch/riscv/include/asm/ptrace.h | 29 +-
arch/riscv/kernel/entry.S | 4 +
arch/riscv/kernel/ptrace.c | 99 +++++
arch/riscv/kernel/signal.c | 2 +
tools/testing/selftests/rseq/param_test.c | 23 ++
tools/testing/selftests/rseq/rseq-riscv.h | 622 ++++++++++++++++++++++++++++++
tools/testing/selftests/rseq/rseq.h | 2 +
8 files changed, 782 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/rseq/rseq-riscv.h
--
2.7.4
Hi,
This is v8 of syscall user dispatch. Last version got some acks but
there was one small documentation fix I wanted to do, as requested by
Florian. This also addresses the commit message fixup Peter requested.
The only actual code change from v7 is solving a trivial merge conflict
I myself created with the entry code fixup I made week and with
something else in the TIP tree.
I also shared this with glibc and there wasn't any complaints other than
the matter about user-notif vs. siginfo, which was discussed in v7 and
the understanding is that it is not necessary now and can be added
later, if needed, on the same infrastructure without a new api.
I'm not sure about TIP the rules, but is it too late to be queued for
the next merge window? I'd love to have this in 5.11 if possible, since
it has been flying for quite a while.
This is based on tip/master.
As usual, a working tree with this patchset is available at:
https://gitlab.collabora.com/krisman/linux -b syscall-user-dispatch-v8
Previous submissions are archived at:
RFC/v1: https://lkml.org/lkml/2020/7/8/96
v2: https://lkml.org/lkml/2020/7/9/17
v3: https://lkml.org/lkml/2020/7/12/4
v4: https://www.spinics.net/lists/linux-kselftest/msg16377.html
v5: https://lkml.org/lkml/2020/8/10/1320
v6: https://lkml.org/lkml/2020/9/4/1122
v7: https://lwn.net/Articles/837598/
Gabriel Krisman Bertazi (7):
x86: vdso: Expose sigreturn address on vdso to the kernel
signal: Expose SYS_USER_DISPATCH si_code type
kernel: Implement selective syscall userspace redirection
entry: Support Syscall User Dispatch on common syscall entry
selftests: Add kselftest for syscall user dispatch
selftests: Add benchmark for syscall user dispatch
docs: Document Syscall User Dispatch
.../admin-guide/syscall-user-dispatch.rst | 87 +++++
arch/x86/entry/vdso/vdso2c.c | 2 +
arch/x86/entry/vdso/vdso32/sigreturn.S | 2 +
arch/x86/entry/vdso/vma.c | 15 +
arch/x86/include/asm/elf.h | 2 +
arch/x86/include/asm/vdso.h | 2 +
arch/x86/kernel/signal_compat.c | 2 +-
fs/exec.c | 3 +
include/linux/entry-common.h | 2 +
include/linux/sched.h | 2 +
include/linux/syscall_user_dispatch.h | 40 +++
include/linux/thread_info.h | 2 +
include/uapi/asm-generic/siginfo.h | 3 +-
include/uapi/linux/prctl.h | 5 +
kernel/entry/Makefile | 2 +-
kernel/entry/common.c | 17 +
kernel/entry/common.h | 16 +
kernel/entry/syscall_user_dispatch.c | 102 ++++++
kernel/fork.c | 1 +
kernel/sys.c | 5 +
tools/testing/selftests/Makefile | 1 +
.../syscall_user_dispatch/.gitignore | 3 +
.../selftests/syscall_user_dispatch/Makefile | 9 +
.../selftests/syscall_user_dispatch/config | 1 +
.../syscall_user_dispatch/sud_benchmark.c | 200 +++++++++++
.../syscall_user_dispatch/sud_test.c | 310 ++++++++++++++++++
26 files changed, 833 insertions(+), 3 deletions(-)
create mode 100644 Documentation/admin-guide/syscall-user-dispatch.rst
create mode 100644 include/linux/syscall_user_dispatch.h
create mode 100644 kernel/entry/common.h
create mode 100644 kernel/entry/syscall_user_dispatch.c
create mode 100644 tools/testing/selftests/syscall_user_dispatch/.gitignore
create mode 100644 tools/testing/selftests/syscall_user_dispatch/Makefile
create mode 100644 tools/testing/selftests/syscall_user_dispatch/config
create mode 100644 tools/testing/selftests/syscall_user_dispatch/sud_benchmark.c
create mode 100644 tools/testing/selftests/syscall_user_dispatch/sud_test.c
--
2.29.2
Base
====
This series is based on (and therefore should apply cleanly to) the tag
"v5.12-rc7-mmots-2021-04-11-20-49", additionally with Peter's selftest cleanup
series applied first:
https://lore.kernel.org/patchwork/cover/1412450/
Changelog
=========
v4->v5:
- Picked up {Reviewed,Acked}-by's.
- Fix cleanup in error path in shmem_mcopy_atomic_pte(). [Hugh, Peter]
- Mention switching to lru_cache_add() in the commit message of 9/10. [Hugh]
- Split + reorder commits, so now we 1) implement the faulting path, 2)
implement the CONTINUE ioctl, and 3) advertise the feature. Squash the
documentation update into step (3). [Hugh, Peter]
- Reorder install_pte() cleanup to come before selftest changes. [Hugh]
v3->v4:
- Fix handling of the shmem private mcopy case. Previously, I had (incorrectly)
assumed that !vma_is_anonymous() was equivalent to "the page will be in the
page cache". But, in this case we have an optimization where we allocate a new
*anonymous* page. So, use a new "bool page_in_cache" instead, which checks if
page->mapping is set. Correct several places with this new check. [Hugh]
- Fix calling mm_counter() before page_add_..._rmap(). [Hugh]
- When modifying shmem_mcopy_atomic_pte() to use the new install_pte() helper,
just use lru_cache_add_inactive_or_unevictable(), no need to branch and maybe
use lru_cache_add(). [Hugh]
- De-pluralize mcopy_atomic_install_pte(s). [Hugh]
- Make "writable" a bool, and initialize consistently. [Hugh]
v2->v3:
- Picked up {Reviewed,Acked}-by's.
- Reorder commits: introduce CONTINUE before MINOR registration. [Hugh, Peter]
- Don't try to {unlock,put}_page an xarray value in shmem_getpage_gfp. [Hugh]
- Move enum mcopy_atomic_mode forward declare out of CONFIG_HUGETLB_PAGE. [Hugh]
- Keep mistakenly removed UFFD_USER_MODE_ONLY in selftest. [Peter]
- Cleanup context management in self test (make clear implicit, remove unneeded
return values now that we have err()). [Peter]
- Correct dst_pte argument to dst_pmd in shmem_mcopy_atomic_pte macro. [Hugh]
- Mention the new shmem support feature in documentation. [Hugh]
v1->v2:
- Pick up Reviewed-by's.
- Don't swapin page when a minor fault occurs. Notice that it needs to be
swapped in, and just immediately fire the minor fault. Let a future CONTINUE
deal with swapping in the page. [Peter]
- Clarify comment about i_size checks in mm/userfaultfd.c. [Peter]
- Only forward declare once (out of #ifdef) in hugetlb.h. [Peter]
Changes since [2]:
- Squash the fixes ([2]) in with the original series ([1]). This makes reviewing
easier, as we no longer have to sift through deltas undoing what we had done
before. [Hugh, Peter]
- Modify shmem_mcopy_atomic_pte() to use the new mcopy_atomic_install_ptes()
helper, reducing code duplication. [Hugh]
- Properly trigger handle_userfault() in the shmem_swapin_page() case. [Hugh]
- Use shmem_getpage() instead of find_lock_page() to lookup the existing page in
for continue. This properly deals with swapped-out pages. [Hugh]
- Unconditionally pte_mkdirty() for anon memory (as before). [Peter]
- Don't include userfaultfd_k.h in either hugetlb.h or shmem_fs.h. [Hugh]
- Add comment for UFFD_FEATURE_MINOR_SHMEM (to match _HUGETLBFS). [Hugh]
- Fix some small cleanup issues (parens, reworded conditionals, reduced plumbing
of some parameters, simplify labels/gotos, ...). [Hugh, Peter]
Overview
========
See the series which added minor faults for hugetlbfs [3] for a detailed
overview of minor fault handling in general. This series adds the same support
for shmem-backed areas.
This series is structured as follows:
- Commits 1 and 2 are cleanups.
- Commits 3 and 4 implement the new feature (minor fault handling for shmem).
- Commit 5 advertises that the feature is now available since at this point it's
fully implemented.
- Commit 6 is a final cleanup, modifying an existing code path to re-use a new
helper we've introduced.
- Commits 7, 8, 9, 10 update the userfaultfd selftest to exercise the feature.
Use Case
========
In some cases it is useful to have VM memory backed by tmpfs instead of
hugetlbfs. So, this feature will be used to support the same VM live migration
use case described in my original series.
Additionally, Android folks (Lokesh Gidra <lokeshgidra(a)google.com>) hope to
optimize the Android Runtime garbage collector using this feature:
"The plan is to use userfaultfd for concurrently compacting the heap. With
this feature, the heap can be shared-mapped at another location where the
GC-thread(s) could continue the compaction operation without the need to
invoke userfault ioctl(UFFDIO_COPY) each time. OTOH, if and when Java threads
get faults on the heap, UFFDIO_CONTINUE can be used to resume execution.
Furthermore, this feature enables updating references in the 'non-moving'
portion of the heap efficiently. Without this feature, uneccessary page
copying (ioctl(UFFDIO_COPY)) would be required."
[1] https://lore.kernel.org/patchwork/cover/1388144/
[2] https://lore.kernel.org/patchwork/patch/1408161/
[3] https://lore.kernel.org/linux-fsdevel/20210301222728.176417-1-axelrasmussen…
Axel Rasmussen (10):
userfaultfd/hugetlbfs: avoid including userfaultfd_k.h in hugetlb.h
userfaultfd/shmem: combine shmem_{mcopy_atomic,mfill_zeropage}_pte
userfaultfd/shmem: support minor fault registration for shmem
userfaultfd/shmem: support UFFDIO_CONTINUE for shmem
userfaultfd/shmem: advertise shmem minor fault support
userfaultfd/shmem: modify shmem_mcopy_atomic_pte to use install_pte()
userfaultfd/selftests: use memfd_create for shmem test type
userfaultfd/selftests: create alias mappings in the shmem test
userfaultfd/selftests: reinitialize test context in each test
userfaultfd/selftests: exercise minor fault handling shmem support
Documentation/admin-guide/mm/userfaultfd.rst | 3 +-
fs/userfaultfd.c | 6 +-
include/linux/hugetlb.h | 4 +-
include/linux/shmem_fs.h | 17 +-
include/linux/userfaultfd_k.h | 5 +
include/uapi/linux/userfaultfd.h | 7 +-
mm/hugetlb.c | 1 +
mm/memory.c | 8 +-
mm/shmem.c | 110 +++-----
mm/userfaultfd.c | 175 ++++++++----
tools/testing/selftests/vm/userfaultfd.c | 274 ++++++++++++-------
11 files changed, 360 insertions(+), 250 deletions(-)
--
2.31.1.498.g6c1eba8ee3d-goog