This series provides initial support for the ARMv9 Scalable Matrix
Extension (SME). SME takes the approach used for vectors in SVE and
extends this to provide architectural support for matrix operations. A
more detailed overview can be found in [1].
For the kernel SME can be thought of as a series of features which are
intended to be used together by applications but operate mostly
orthogonally:
- The ZA matrix register.
- Streaming mode, in which ZA can be accessed and a subset of SVE
features are available.
- A second vector length, used for streaming mode SVE and ZA and
controlled using a similar interface to that for SVE.
- TPIDR2, a new userspace controllable system register intended for use
by the C library for storing context related to the ZA ABI.
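As a rough illustration of the "similar interface to that for SVE" mentioned above, the sketch below queries the streaming mode vector length through prctl(). It is only a sketch under assumptions: the PR_SME_GET_VL and PR_SME_VL_LEN_MASK values are taken to match the mainline uapi prctl.h header and are defined locally only when the installed headers lack them, and the helper names are hypothetical.

```c
#include <stdio.h>
#include <sys/prctl.h>

/*
 * Assumed values, mirroring the mainline uapi header; only used when the
 * installed headers predate SME support.
 */
#ifndef PR_SME_GET_VL
#define PR_SME_GET_VL 64
#endif
#ifndef PR_SME_VL_LEN_MASK
#define PR_SME_VL_LEN_MASK 0xffff
#endif

/* Extract the vector length in bytes from a PR_SME_GET_VL return value. */
static int sme_vl_bytes(int prctl_ret)
{
	return prctl_ret & PR_SME_VL_LEN_MASK;
}

/*
 * Hypothetical helper: return the current streaming mode vector length in
 * bytes, or -1 if the kernel or CPU does not support SME.
 */
static int sme_get_vl(void)
{
	int ret = prctl(PR_SME_GET_VL, 0, 0, 0, 0);

	if (ret < 0)
		return -1;
	return sme_vl_bytes(ret);
}
```

A caller would typically check for -1 before touching any SME state, since the prctl() is expected to fail on systems without SME.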
A substantial part of the series is dedicated to refactoring the
existing SVE support so that we don't need to duplicate code for
handling vector lengths and the SVE registers. This involves creating an
array of vector types and making the users take the vector type as a
parameter. I'm not 100% happy with this but wasn't able to come up with
anything better; duplicating code definitely felt like a bad idea, so
this felt like the least bad thing. If this approach makes sense to
people it might make sense to split this off into a separate series
and/or merge it while the rest is pending review. The series is very
large, so it would probably be easier to digest if some of the
preparatory refactoring could be merged before the rest is ready.
One feature of the architecture of particular note is that switching
to and from streaming mode may change the size of and invalidate the
contents of the SVE registers, and when in streaming mode the FFR is not
accessible. This complicates aspects of the ABI like signal handling
and ptrace.
This initial implementation is mainly intended to get the ABI in place.
There are several areas which will be worked on going forwards; some of
these will be blockers, others could be handled in followup series:
- SME is currently not supported for KVM guests; this will be done as a
followup series. A host system can use SME and run KVM guests but
SME is not available in the guests.
- The KVM host support is done in a very simplistic way, were anyone to
attempt to use it in production there would be performance impacts on
hosts with SME support. As part of this we also add enumeration of
fine grained traps.
- There is currently no ptrace or signal support for TPIDR2; this will be
done as a followup series.
- No support is currently provided for scheduler control of SME or SME
applications. Given the size of the SME register state the context
switch overhead may be noticeable, so this may be needed, especially for
real time applications. Similar concerns already exist for larger
SVE vector lengths but are amplified for SME, particularly as the
vector length increases.
- There has been no work on optimising the performance of anything the
kernel does.
It is not expected that any systems will be encountered that support SME
but not SVE, since SME is an ARMv9 feature and SVE is mandatory for ARMv9.
The code attempts to handle any such systems that are encountered but
this hasn't been tested extensively.
v11:
- Rebase onto v5.17-rc3.
- Provide a sme-inst.h to collect manual encodings in kselftest.
v10:
- Actually do the rebase of fixups from the previous version into
relevant patches.
v9:
- Remove defensive programming around IS_ENABLED() and FGT in KVM code.
- Fix naming of TPIDR2 FGT register bit.
- Add patches making handling of floating point register bits more
consistent (also sent as separate series).
- Drop now unused enumeration of fine grained traps.
v8:
- Rebase onto v5.17-rc1.
- Support interoperation with KVM, SME is disabled for KVM guests with
minimal handling for cleaning up SME state when entering and leaving
the guest.
- Document and implement that signal handlers are invoked with ZA and
streaming mode disabled.
- Use the RDSVL instruction introduced in EAC2 of the architecture to
obtain the streaming mode vector length during enumeration, ZA state
loading/saving and in test programs.
- Store a pointer to SVCR in fpsimd_last_state and use it in fpsimd_save()
for interoperation with KVM.
- Add a test case sme_trap_no_sm checking that we generate a SIGILL
when using an instruction that requires streaming mode without
enabling it.
- Add basic ZA context form validation to testcases helper library.
- Move signal tests over to validating streaming VL from ZA information.
- Pulled in patch removing ARRAY_SIZE() so that kselftest builds
cleanly and to avoid trivial conflicts.
v7:
- Rebase onto v5.16-rc3.
- Reduce indentation when supporting custom triggers for signal tests
as suggested by Catalin.
- Change to specifying a width for all CPU features rather than adding
single bit specific infrastructure.
- Don't require zeroing of non-shared SVE state during syscalls.
v6:
- Rebase onto v5.16-rc1.
- Return to disabling TIF_SVE on kernel entry even if we have SME
state, this avoids the need for KVM to handle the case where TIF_SVE
is set on guest entry.
- Add syscall-abi.h to SME updates to syscall-abi, mistakenly omitted
from commit.
v5:
- Rebase onto currently merged SVE and kselftest patches.
- Add support for the FA64 option, introduced in the recently published
EAC1 update to the specification.
- Pull in test program for the syscall ABI previously sent separately
with some revisions and add coverage for the SME ABI.
- Fix checking for options with 1 bit fields in ID_AA64SMFR0_EL1.
- Minor fixes and clarifications to the ABI documentation.
v4:
- Rebase onto merged patches.
- Remove an unneeded NULL check in vec_proc_do_default_vl().
- Include patch to factor out utility routines in kselftests written in
assembler.
- Specify -ffreestanding when building TPIDR2 test.
v3:
- Skip FFR rather than predicate registers in sve_flush_live().
- Don't assume a bool is all zeros in sve_flush_live() as per AAPCS.
- Don't redundantly specify a zero index when clearing FFR.
v2:
- Fix several issues with !SME and !SVE configurations.
- Preserve TPIDR2 when creating a new thread/process unless
CLONE_SETTLS is set.
- Report traps due to using features in an invalid mode as SIGILL.
- Spell out streaming mode behaviour in SVE ABI documentation more
directly.
- Document TPIDR2 in the ABI document.
- Use SMSTART and SMSTOP rather than read/modify/write sequences.
- Rework logic for exiting streaming mode on syscall.
- Don't needlessly initialise SVCR on access trap.
- Always restore SME VL for userspace if SME traps are disabled.
- Only yield to encourage preemption every 128 iterations in za-test,
otherwise do a getpid(), and validate SVCR after syscall.
- Leave streaming mode disabled except when reading the vector length
in za-test, and disable ZA after detecting a mismatch.
- Add SME support to vlset.
- Clarifications and typo fixes in comments.
- Move sme_alloc() forward declaration back a patch.
[1] https://community.arm.com/developer/ip-products/processors/b/processors-ip-…
Mark Brown (40):
arm64: Define CPACR_EL1_FPEN similarly to other floating point
controls
arm64: Always use individual bits in CPACR floating point enables
arm64: cpufeature: Always specify and use a field width for
capabilities
kselftest/arm64: Remove local ARRAY_SIZE() definitions
kselftest/arm64: signal: Allow tests to be incompatible with features
arm64/sme: Provide ABI documentation for SME
arm64/sme: System register and exception syndrome definitions
arm64/sme: Manually encode SME instructions
arm64/sme: Early CPU setup for SME
arm64/sme: Basic enumeration support
arm64/sme: Identify supported SME vector lengths at boot
arm64/sme: Implement sysctl to set the default vector length
arm64/sme: Implement vector length configuration prctl()s
arm64/sme: Implement support for TPIDR2
arm64/sme: Implement SVCR context switching
arm64/sme: Implement streaming SVE context switching
arm64/sme: Implement ZA context switching
arm64/sme: Implement traps and syscall handling for SME
arm64/sme: Disable ZA and streaming mode when handling signals
arm64/sme: Implement streaming SVE signal handling
arm64/sme: Implement ZA signal handling
arm64/sme: Implement ptrace support for streaming mode SVE registers
arm64/sme: Add ptrace support for ZA
arm64/sme: Disable streaming mode and ZA when flushing CPU state
arm64/sme: Save and restore streaming mode over EFI runtime calls
KVM: arm64: Hide SME system registers from guests
KVM: arm64: Trap SME usage in guest
KVM: arm64: Handle SME host state when running guests
arm64/sme: Provide Kconfig for SME
kselftest/arm64: Add manual encodings for SME instructions
kselftest/arm64: sme: Add SME support to vlset
kselftest/arm64: Add tests for TPIDR2
kselftest/arm64: Extend vector configuration API tests to cover SME
kselftest/arm64: sme: Provide streaming mode SVE stress test
kselftest/arm64: signal: Handle ZA signal context in core code
kselftest/arm64: Add stress test for SME ZA context switching
kselftest/arm64: signal: Add SME signal handling tests
kselftest/arm64: Add streaming SVE to SVE ptrace tests
kselftest/arm64: Add coverage for the ZA ptrace interface
kselftest/arm64: Add SME support to syscall ABI test
Documentation/arm64/elf_hwcaps.rst | 33 +
Documentation/arm64/index.rst | 1 +
Documentation/arm64/sme.rst | 432 +++++++++++++
Documentation/arm64/sve.rst | 70 ++-
arch/arm64/Kconfig | 11 +
arch/arm64/include/asm/cpu.h | 4 +
arch/arm64/include/asm/cpufeature.h | 25 +
arch/arm64/include/asm/el2_setup.h | 64 +-
arch/arm64/include/asm/esr.h | 13 +-
arch/arm64/include/asm/exception.h | 1 +
arch/arm64/include/asm/fpsimd.h | 110 +++-
arch/arm64/include/asm/fpsimdmacros.h | 86 +++
arch/arm64/include/asm/hwcap.h | 8 +
arch/arm64/include/asm/kvm_arm.h | 5 +-
arch/arm64/include/asm/kvm_host.h | 4 +
arch/arm64/include/asm/processor.h | 18 +-
arch/arm64/include/asm/sysreg.h | 67 +-
arch/arm64/include/asm/thread_info.h | 2 +
arch/arm64/include/uapi/asm/hwcap.h | 8 +
arch/arm64/include/uapi/asm/ptrace.h | 69 ++-
arch/arm64/include/uapi/asm/sigcontext.h | 55 +-
arch/arm64/kernel/cpufeature.c | 273 ++++++--
arch/arm64/kernel/cpuinfo.c | 13 +
arch/arm64/kernel/entry-common.c | 11 +
arch/arm64/kernel/entry-fpsimd.S | 36 ++
arch/arm64/kernel/fpsimd.c | 585 ++++++++++++++++--
arch/arm64/kernel/process.c | 28 +-
arch/arm64/kernel/ptrace.c | 356 +++++++++--
arch/arm64/kernel/signal.c | 194 +++++-
arch/arm64/kernel/syscall.c | 34 +-
arch/arm64/kernel/traps.c | 1 +
arch/arm64/kvm/fpsimd.c | 43 +-
arch/arm64/kvm/hyp/include/hyp/switch.h | 4 +-
arch/arm64/kvm/hyp/nvhe/switch.c | 30 +
arch/arm64/kvm/hyp/vhe/switch.c | 15 +-
arch/arm64/kvm/sys_regs.c | 9 +-
arch/arm64/tools/cpucaps | 2 +
include/uapi/linux/elf.h | 2 +
include/uapi/linux/prctl.h | 9 +
kernel/sys.c | 12 +
tools/testing/selftests/arm64/abi/.gitignore | 1 +
tools/testing/selftests/arm64/abi/Makefile | 9 +-
.../selftests/arm64/abi/syscall-abi-asm.S | 69 ++-
.../testing/selftests/arm64/abi/syscall-abi.c | 205 +++++-
.../testing/selftests/arm64/abi/syscall-abi.h | 15 +
tools/testing/selftests/arm64/abi/tpidr2.c | 298 +++++++++
tools/testing/selftests/arm64/fp/.gitignore | 4 +
tools/testing/selftests/arm64/fp/Makefile | 12 +-
tools/testing/selftests/arm64/fp/rdvl-sme.c | 14 +
tools/testing/selftests/arm64/fp/rdvl.S | 10 +
tools/testing/selftests/arm64/fp/rdvl.h | 1 +
tools/testing/selftests/arm64/fp/sme-inst.h | 51 ++
tools/testing/selftests/arm64/fp/ssve-stress | 59 ++
tools/testing/selftests/arm64/fp/sve-ptrace.c | 13 +-
tools/testing/selftests/arm64/fp/sve-test.S | 20 +
tools/testing/selftests/arm64/fp/vec-syscfg.c | 10 +
tools/testing/selftests/arm64/fp/vlset.c | 10 +-
tools/testing/selftests/arm64/fp/za-ptrace.c | 354 +++++++++++
tools/testing/selftests/arm64/fp/za-stress | 59 ++
tools/testing/selftests/arm64/fp/za-test.S | 388 ++++++++++++
.../testing/selftests/arm64/signal/.gitignore | 2 +
.../selftests/arm64/signal/test_signals.h | 5 +
.../arm64/signal/test_signals_utils.c | 40 +-
.../arm64/signal/test_signals_utils.h | 2 +
.../testcases/fake_sigreturn_sme_change_vl.c | 92 +++
.../arm64/signal/testcases/sme_trap_no_sm.c | 38 ++
.../signal/testcases/sme_trap_non_streaming.c | 45 ++
.../arm64/signal/testcases/sme_trap_za.c | 36 ++
.../selftests/arm64/signal/testcases/sme_vl.c | 68 ++
.../arm64/signal/testcases/ssve_regs.c | 129 ++++
.../arm64/signal/testcases/testcases.c | 36 ++
.../arm64/signal/testcases/testcases.h | 3 +-
72 files changed, 4590 insertions(+), 251 deletions(-)
create mode 100644 Documentation/arm64/sme.rst
create mode 100644 tools/testing/selftests/arm64/abi/syscall-abi.h
create mode 100644 tools/testing/selftests/arm64/abi/tpidr2.c
create mode 100644 tools/testing/selftests/arm64/fp/rdvl-sme.c
create mode 100644 tools/testing/selftests/arm64/fp/sme-inst.h
create mode 100644 tools/testing/selftests/arm64/fp/ssve-stress
create mode 100644 tools/testing/selftests/arm64/fp/za-ptrace.c
create mode 100644 tools/testing/selftests/arm64/fp/za-stress
create mode 100644 tools/testing/selftests/arm64/fp/za-test.S
create mode 100644 tools/testing/selftests/arm64/signal/testcases/fake_sigreturn_sme_change_vl.c
create mode 100644 tools/testing/selftests/arm64/signal/testcases/sme_trap_no_sm.c
create mode 100644 tools/testing/selftests/arm64/signal/testcases/sme_trap_non_streaming.c
create mode 100644 tools/testing/selftests/arm64/signal/testcases/sme_trap_za.c
create mode 100644 tools/testing/selftests/arm64/signal/testcases/sme_vl.c
create mode 100644 tools/testing/selftests/arm64/signal/testcases/ssve_regs.c
base-commit: dfd42facf1e4ada021b939b4e19c935dcdd55566
--
2.30.2
From: Mike Kravetz <mike.kravetz(a)oracle.com>
[ Upstream commit fda153c89af344d21df281009a9d046cf587ea0f ]
Running the memfd script ./run_hugetlbfs_test.sh will often end in error
as follows:
memfd-hugetlb: CREATE
memfd-hugetlb: BASIC
memfd-hugetlb: SEAL-WRITE
memfd-hugetlb: SEAL-FUTURE-WRITE
memfd-hugetlb: SEAL-SHRINK
fallocate(ALLOC) failed: No space left on device
./run_hugetlbfs_test.sh: line 60: 166855 Aborted (core dumped) ./memfd_test hugetlbfs
opening: ./mnt/memfd
fuse: DONE
If no hugetlb pages have been preallocated, run_hugetlbfs_test.sh will
allocate 'just enough' pages to run the test. In the SEAL-FUTURE-WRITE
test the mfd_fail_write routine maps the file, but does not unmap. As a
result, two hugetlb pages remain reserved for the mapping. When the
fallocate call in the SEAL-SHRINK test attempts to allocate all hugetlb
pages, it is short by the two reserved pages.
Fix by making sure to unmap in mfd_fail_write.
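The pairing the fix relies on can be sketched in isolation. This is not the selftest's code: the helper names are hypothetical, and an ordinary temporary file stands in for the hugetlbfs-backed memfd, so it only illustrates that every successful mmap() in a test helper is matched by a munmap() before returning.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `size` bytes of `fd` read-only, look at the
 * mapping, and always unmap before returning so that nothing stays
 * reserved on behalf of a leaked mapping.
 * Returns 0 on success, -1 on failure.
 */
static int map_check_unmap(int fd, size_t size)
{
	int ok;
	void *p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);

	if (p == MAP_FAILED)
		return -1;

	/* A freshly extended file reads back as zeroes. */
	ok = (((volatile unsigned char *)p)[0] == 0);

	/* The step the original routine was missing. */
	if (munmap(p, size) != 0)
		return -1;
	return ok ? 0 : -1;
}

/* Create an unlinked temporary file of `size` bytes and run the helper. */
static int demo_map_unmap(size_t size)
{
	FILE *f = tmpfile();
	int ret;

	if (!f)
		return -1;
	if (ftruncate(fileno(f), (off_t)size) != 0) {
		fclose(f);
		return -1;
	}
	ret = map_check_unmap(fileno(f), size);
	fclose(f);
	return ret;
}
```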
Link: https://lkml.kernel.org/r/20220219004340.56478-1-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Joel Fernandes <joel(a)joelfernandes.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/memfd/memfd_test.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 26546892cd545..faab09215c88b 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -373,6 +373,7 @@ static void mfd_fail_write(int fd)
printf("mmap()+mprotect() didn't fail as expected\n");
abort();
}
+ munmap(p, mfd_def_size);
}
/* verify PUNCH_HOLE fails */
--
2.34.1
From: Mike Kravetz <mike.kravetz(a)oracle.com>
[ Upstream commit fda153c89af344d21df281009a9d046cf587ea0f ]
Running the memfd script ./run_hugetlbfs_test.sh will often end in error
as follows:
memfd-hugetlb: CREATE
memfd-hugetlb: BASIC
memfd-hugetlb: SEAL-WRITE
memfd-hugetlb: SEAL-FUTURE-WRITE
memfd-hugetlb: SEAL-SHRINK
fallocate(ALLOC) failed: No space left on device
./run_hugetlbfs_test.sh: line 60: 166855 Aborted (core dumped) ./memfd_test hugetlbfs
opening: ./mnt/memfd
fuse: DONE
If no hugetlb pages have been preallocated, run_hugetlbfs_test.sh will
allocate 'just enough' pages to run the test. In the SEAL-FUTURE-WRITE
test the mfd_fail_write routine maps the file, but does not unmap. As a
result, two hugetlb pages remain reserved for the mapping. When the
fallocate call in the SEAL-SHRINK test attempts to allocate all hugetlb
pages, it is short by the two reserved pages.
Fix by making sure to unmap in mfd_fail_write.
Link: https://lkml.kernel.org/r/20220219004340.56478-1-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Joel Fernandes <joel(a)joelfernandes.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/memfd/memfd_test.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 845e5f67b6f02..cf4c5276eb06a 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -416,6 +416,7 @@ static void mfd_fail_write(int fd)
printf("mmap()+mprotect() didn't fail as expected\n");
abort();
}
+ munmap(p, mfd_def_size);
}
/* verify PUNCH_HOLE fails */
--
2.34.1
From: Mike Kravetz <mike.kravetz(a)oracle.com>
[ Upstream commit fda153c89af344d21df281009a9d046cf587ea0f ]
Running the memfd script ./run_hugetlbfs_test.sh will often end in error
as follows:
memfd-hugetlb: CREATE
memfd-hugetlb: BASIC
memfd-hugetlb: SEAL-WRITE
memfd-hugetlb: SEAL-FUTURE-WRITE
memfd-hugetlb: SEAL-SHRINK
fallocate(ALLOC) failed: No space left on device
./run_hugetlbfs_test.sh: line 60: 166855 Aborted (core dumped) ./memfd_test hugetlbfs
opening: ./mnt/memfd
fuse: DONE
If no hugetlb pages have been preallocated, run_hugetlbfs_test.sh will
allocate 'just enough' pages to run the test. In the SEAL-FUTURE-WRITE
test the mfd_fail_write routine maps the file, but does not unmap. As a
result, two hugetlb pages remain reserved for the mapping. When the
fallocate call in the SEAL-SHRINK test attempts to allocate all hugetlb
pages, it is short by the two reserved pages.
Fix by making sure to unmap in mfd_fail_write.
Link: https://lkml.kernel.org/r/20220219004340.56478-1-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Joel Fernandes <joel(a)joelfernandes.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/memfd/memfd_test.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 10baa1652fc2a..a4e520b94e431 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -386,6 +386,7 @@ static void mfd_fail_write(int fd)
printf("mmap()+mprotect() didn't fail as expected\n");
abort();
}
+ munmap(p, mfd_def_size);
}
/* verify PUNCH_HOLE fails */
--
2.34.1
Hello,
The aim of this series is to make resctrl_tests run using the
kselftest framework.
- I modify the resctrl_tests Makefile and the kselftest Makefile
to enable building/running resctrl_tests using the kselftest framework.
Of course, users can still build/run resctrl_tests without
the framework, as before.
- I change the default timeout for resctrl_tests to 120 seconds, to
ensure that resctrl_tests finishes within the time limit on different environments.
- When the resctrl file system is not supported by the environment or
resctrl_tests is not run as root, return the kselftest framework's skip code.
- If resctrl_tests does not finish within the time limit, terminate it in
the same way as pressing ctrl+c, which kills both the parent and child processes.
Difference from v2:
- I rewrote the changelog of this patch series.
- I added how to use the framework to run resctrl to the README. [PATCH v3 2/5]
- The license patch has no dependency on this patch series, so I separated it from this series into its own patch.
https://lore.kernel.org/lkml/20211213100154.180599-1-tan.shaopeng@jp.fujits…
With regard to the timeout, I think 120s is not a problem since some tests have a longer
timeout (e.g. the net test uses 300s). Please let me know if this is wrong.
Thanks,
Shaopeng Tan (5):
selftests/resctrl: Kill child process before parent process terminates
if SIGTERM is received
selftests/resctrl: Make resctrl_tests run using kselftest framework
selftests/resctrl: Update README about using kselftest framework to
build/run resctrl_tests
selftests/resctrl: Change the default limited time to 120 seconds
selftests/resctrl: Fix resctrl_tests' return code to work with
selftest framework
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/resctrl/Makefile | 20 ++++-------
tools/testing/selftests/resctrl/README | 34 +++++++++++++++++++
.../testing/selftests/resctrl/resctrl_tests.c | 4 +--
tools/testing/selftests/resctrl/resctrl_val.c | 1 +
tools/testing/selftests/resctrl/settings | 1 +
6 files changed, 45 insertions(+), 16 deletions(-)
create mode 100644 tools/testing/selftests/resctrl/settings
--
2.27.0
Hi there,
This series introduces support of eBPF for HID devices.
I have several use cases where eBPF could be interesting for those
input devices:
- simple fixup of report descriptor:
In the HID tree, we have half of the drivers that are "simple" and
that just fix one key or one byte in the report descriptor.
Currently, for users of such devices, the process of fixing them
is long and painful.
With eBPF, we could externalize those fixups in one external repo,
ship various CoRe bpf programs and have those programs loaded at boot
time without having to install a new kernel (and wait 6 months for the
fix to land in the distro kernel)
- Universal Stylus Interface (or any other new fancy feature that
requires a new kernel API)
See [0].
Basically, USI pens are requiring a new kernel API because there are
some channels of communication our HID and input stack are not capable
of. Instead of using hidraw or creating new sysfs or ioctls, we can rely
on eBPF to have the kernel API controlled by the consumer and to avoid
hurting performance by waking up userspace every time there is an
event.
- Surface Dial
This device is a "puck" from Microsoft, basically a rotary dial with a
push button. The kernel already exports it as such but doesn't handle
the haptic feedback we can get out of it.
Furthermore, that device is not recognized by userspace and so it's a
nice paperweight in the end.
With eBPF, we can morph that device into a mouse, and convert the dial
events into wheel events. Also, we can set/unset the haptic feedback
from userspace. The convenient part of BPF makes it that the kernel
doesn't make any choice that would need to be reverted because that
specific userspace doesn't handle it properly or because that other
one expects it to be different.
- firewall
What if we want to prevent other users from accessing a specific feature of a
device? (think a possibly bonker firmware update entry point)
With eBPF, we can intercept any HID command emitted to the device and
validate it or not.
This also allows syncing the state between userspace and the
kernel/bpf program because we can intercept any incoming command.
- tracing
The last usage I have in mind is tracing events and all the fun we can
have with BPF to summarize and analyze events.
Right now, tracing relies on hidraw. It works well except for a couple
of issues:
1. if the driver doesn't export a hidraw node, we can't trace anything
(eBPF will be a "god-mode" there, so it might raise some eyebrows)
2. hidraw doesn't catch the other process requests to the device, which
means that we have cases where we need to add printks to the kernel
to understand what is happening.
With that long introduction, here is the v1 of the support of eBPF in
HID.
I have targeted bpf-next here because the parts that will have the most
conflicts are in bpf. There might be a trivial minor conflict in
include/linux/hid.h with another series I have pending[1].
I am relatively new to bpf, so having some feedback would be very
welcome.
A couple of notes though:
- The series is missing a SEC("hid/driver_event") which would allow
intercepting incoming requests to the device from anybody. I left it
out because it's not critical to have it from day one (we are more
interested right now in the USI case above)
- I am still wondering how to integrate the tracing part:
right now, if a bpf program is loaded before we start the tracer, we
will see *modified* events in the tracer. However, it might be
interesting to decide to see either unmodified (raw events from the
device) or modified events.
I think a flag might be able to solve that. The flag will control
whether we add the new program at the beginning of the list or at the
tail, but I am not sure if this is common practice in eBPF or if
there is a better way.
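To make the head-versus-tail question concrete, here is a toy model in plain C. It is not BPF and not the HID-BPF API from this series: the list, the "programs" and the event byte are all hypothetical. It only shows that a tracer attached at the head of the list sees the raw event while one attached at the tail sees the event after earlier programs have modified it.

```c
#include <stddef.h>

#define MAX_PROGS 8

/* A toy "program": takes one event byte and returns a possibly changed one. */
typedef unsigned char (*prog_fn)(unsigned char event);

struct prog_list {
	prog_fn progs[MAX_PROGS];
	size_t n;
};

/* Attach at the head when at_head is non-zero, otherwise at the tail. */
static int prog_attach(struct prog_list *l, prog_fn fn, int at_head)
{
	size_t i;

	if (l->n >= MAX_PROGS)
		return -1;
	if (at_head) {
		/* Shift existing programs down by one slot. */
		for (i = l->n; i > 0; i--)
			l->progs[i] = l->progs[i - 1];
		l->progs[0] = fn;
	} else {
		l->progs[l->n] = fn;
	}
	l->n++;
	return 0;
}

/* Run every program in order, feeding each one the previous output. */
static unsigned char prog_run(const struct prog_list *l, unsigned char event)
{
	size_t i;

	for (i = 0; i < l->n; i++)
		event = l->progs[i](event);
	return event;
}

/* Last event observed by the toy tracer. */
static unsigned char seen_by_tracer;

/* A fixup that sets the top bit of the event. */
static unsigned char fixup(unsigned char event)
{
	return event | 0x80;
}

/* A tracer that records the event and passes it through unchanged. */
static unsigned char tracer(unsigned char event)
{
	seen_by_tracer = event;
	return event;
}
```

With fixup attached first, attaching tracer with at_head set makes it observe the raw value, while attaching it at the tail makes it observe the fixed-up value.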
Cheers,
Benjamin
[0] https://lore.kernel.org/linux-input/20211215134220.1735144-1-tero.kristo@li…
[1] https://lore.kernel.org/linux-input/20220203143226.4023622-1-benjamin.tisso…
Benjamin Tissoires (6):
HID: initial BPF implementation
HID: bpf: allow to change the report descriptor from an eBPF program
HID: bpf: add hid_{get|set}_data helpers
HID: bpf: add new BPF type to trigger commands from userspace
HID: bpf: tests: rely on uhid event to know if a test device is ready
HID: bpf: add bpf_hid_raw_request helper function
drivers/hid/Makefile | 1 +
drivers/hid/hid-bpf.c | 327 +++++++++
drivers/hid/hid-core.c | 31 +-
include/linux/bpf-hid.h | 98 +++
include/linux/bpf_types.h | 4 +
include/linux/hid.h | 25 +
include/uapi/linux/bpf.h | 33 +
include/uapi/linux/bpf_hid.h | 56 ++
kernel/bpf/Makefile | 3 +
kernel/bpf/hid.c | 653 ++++++++++++++++++
kernel/bpf/syscall.c | 12 +
samples/bpf/.gitignore | 1 +
samples/bpf/Makefile | 4 +
samples/bpf/hid_mouse_kern.c | 91 +++
samples/bpf/hid_mouse_user.c | 129 ++++
tools/include/uapi/linux/bpf.h | 33 +
tools/lib/bpf/libbpf.c | 9 +
tools/lib/bpf/libbpf.h | 2 +
tools/lib/bpf/libbpf.map | 1 +
tools/testing/selftests/bpf/prog_tests/hid.c | 685 +++++++++++++++++++
tools/testing/selftests/bpf/progs/hid.c | 149 ++++
21 files changed, 2339 insertions(+), 8 deletions(-)
create mode 100644 drivers/hid/hid-bpf.c
create mode 100644 include/linux/bpf-hid.h
create mode 100644 include/uapi/linux/bpf_hid.h
create mode 100644 kernel/bpf/hid.c
create mode 100644 samples/bpf/hid_mouse_kern.c
create mode 100644 samples/bpf/hid_mouse_user.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/hid.c
create mode 100644 tools/testing/selftests/bpf/progs/hid.c
--
2.35.1
Changes from Previous Version (v2)
==================================
Compared to the v2 of this patchset
(https://lore.kernel.org/linux-mm/20220225130712.12682-1-sj@kernel.org/), this
version contains the changes below.
- Put real details in the ABI document (Greg KH)
- Update 'Date:' in ABI document from Feb 2022 to Mar 2022 (Greg KH)
Introduction
============
DAMON's debugfs-based user interface (DAMON_DBGFS) has served very well so far.
However, it unnecessarily depends on debugfs, while DAMON is not intended to be
used only for debugging. Also, the interface receives multiple values via one
file. For example, the schemes file receives 18 values. As a result, it is
inefficient, hard to use, and difficult to extend. In particular,
keeping backward compatibility of user space tools is getting ever more challenging.
It would be better to implement another reliable and flexible interface and
deprecate DAMON_DBGFS in the long term.
For this reason, this patchset introduces a new sysfs-based user interface for
DAMON. The idea of the new interface is to use directory hierarchies and
have one dedicated file for each value. As a short example, users can do
virtual address monitoring via the interface as below:
# cd /sys/kernel/mm/damon/admin/
# echo 1 > kdamonds/nr_kdamonds
# echo 1 > kdamonds/0/contexts/nr_contexts
# echo vaddr > kdamonds/0/contexts/0/operations
# echo 1 > kdamonds/0/contexts/0/targets/nr_targets
# echo $(pidof <workload>) > kdamonds/0/contexts/0/targets/0/pid_target
# echo on > kdamonds/0/state
A brief representation of the file hierarchy of the DAMON sysfs interface is as
below. Children are represented with indentation, directories have a '/'
suffix, and files in each directory are separated by commas.
/sys/kernel/mm/damon/admin
│ kdamonds/nr_kdamonds
│ │ 0/state,pid
│ │ │ contexts/nr_contexts
│ │ │ │ 0/operations
│ │ │ │ │ monitoring_attrs/
│ │ │ │ │ │ intervals/sample_us,aggr_us,update_us
│ │ │ │ │ │ nr_regions/min,max
│ │ │ │ │ targets/nr_targets
│ │ │ │ │ │ 0/pid_target
│ │ │ │ │ │ │ regions/nr_regions
│ │ │ │ │ │ │ │ 0/start,end
│ │ │ │ │ │ │ │ ...
│ │ │ │ │ │ ...
│ │ │ │ │ schemes/nr_schemes
│ │ │ │ │ │ 0/action
│ │ │ │ │ │ │ access_pattern/
│ │ │ │ │ │ │ │ sz/min,max
│ │ │ │ │ │ │ │ nr_accesses/min,max
│ │ │ │ │ │ │ │ age/min,max
│ │ │ │ │ │ │ quotas/ms,bytes,reset_interval_ms
│ │ │ │ │ │ │ │ weights/sz_permil,nr_accesses_permil,age_permil
│ │ │ │ │ │ │ watermarks/metric,interval_us,high,mid,low
│ │ │ │ │ │ │ stats/nr_tried,sz_tried,nr_applied,sz_applied,qt_exceeds
│ │ │ │ │ │ ...
│ │ │ │ ...
│ │ ...
Detailed usage of the files will be described in the final Documentation patch
of this patchset.
Main Difference Between DAMON_DBGFS and DAMON_SYSFS
---------------------------------------------------
At the moment, DAMON_DBGFS and DAMON_SYSFS provide the same features. One
important difference between them is their exclusiveness. DAMON_DBGFS works in
an exclusive manner, so that no other DAMON worker thread (kdamond) in the system can
run concurrently and interfere somehow. For this reason, DAMON_DBGFS asks users
to construct all monitoring contexts and start them at once. It's not a big
problem but makes the operation a little bit complex and inflexible.
For more flexible usage, DAMON_SYSFS moves the responsibility of preventing any
possible interference to the admins and works in a non-exclusive manner. That
is, users can configure and start contexts one by one. Note that DAMON
respects both exclusive groups and non-exclusive groups of contexts, in a
manner similar to that of reader-writer locks. That is, if any exclusive
monitoring contexts (e.g., contexts that started via DAMON_DBGFS) are running,
DAMON_SYSFS does not start new contexts, and vice versa.
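The reader-writer-lock-like rule can be sketched as follows. This is a simplified model under assumptions, not the code in mm/damon/core.c: the struct and function names are hypothetical and all locking is omitted. An exclusive group may start only when nothing is running, and a non-exclusive group may start only when no exclusive group is running.

```c
#include <errno.h>

/* Hypothetical bookkeeping for running monitoring contexts. */
struct damon_run_state {
	int nr_running;		/* number of contexts currently running */
	int running_exclusive;	/* non-zero if they were started exclusively */
};

/*
 * Try to start a group of nr_ctxs contexts. Returns 0 on success or
 * -EBUSY when the request conflicts with what is already running.
 */
static int ctx_group_start(struct damon_run_state *s, int nr_ctxs,
			   int exclusive)
{
	if ((exclusive && s->nr_running) ||
	    (!exclusive && s->running_exclusive))
		return -EBUSY;

	s->nr_running += nr_ctxs;
	if (exclusive)
		s->running_exclusive = 1;
	return 0;
}

/* Account for nr_ctxs contexts having stopped. */
static void ctx_group_stop(struct damon_run_state *s, int nr_ctxs)
{
	s->nr_running -= nr_ctxs;
	if (s->nr_running == 0)
		s->running_exclusive = 0;
}
```

Under this model, two non-exclusive starts in a row succeed, an exclusive start in between is refused, and it succeeds again once every running context has stopped.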
Future Plan of DAMON_DBGFS Deprecation
======================================
Once this patchset is merged, DAMON_DBGFS development will be frozen. That is,
we will maintain it to work as it does now so that no users will break. But it
will not be extended to provide any new feature of DAMON. The support will be
continued only until the next LTS release. After that, we will drop DAMON_DBGFS.
User-space Tooling Compatibility
--------------------------------
As DAMON_SYSFS provides all features of DAMON_DBGFS, all user space tooling can
move to DAMON_SYSFS. As we will continue supporting DAMON_DBGFS until the next LTS
kernel release, user space tools will have enough time to move to DAMON_SYSFS.
The official user space tool, damo[1], already supports both DAMON_SYSFS
and DAMON_DBGFS. Both correctness tests[2] and performance tests[3] of DAMON
using DAMON_SYSFS also passed.
[1] https://github.com/awslabs/damo
[2] https://github.com/awslabs/damon-tests/tree/master/corr
[3] https://github.com/awslabs/damon-tests/tree/master/perf
Complete Git Tree
=================
You can get the complete git tree from
https://git.kernel.org/sj/h/damon/sysfs/patches/v2.
Sequence of Patches
===================
The first two patches (patches 1-2) make core changes for DAMON_SYSFS. The first
one (patch 1) allows non-exclusive DAMON contexts so that DAMON_SYSFS can work
in non-exclusive mode, while the second one (patch 2) adds the size of DAMON enum
types so that DAMON API users can safely iterate the enums.
The third patch (patch 3) implements a basic sysfs stub for virtual address space
monitoring. Note that this implements only the sysfs files and DAMON is not
linked. The fourth patch (patch 4) links DAMON_SYSFS to DAMON so that users
can control DAMON using the sysfs files.
The following six patches (patches 5-10) implement the other DAMON features that
DAMON_DBGFS supports one by one (physical address space monitoring, DAMON-based
operation schemes, schemes quotas, schemes prioritization weights, schemes
watermarks, and schemes stats).
The following patch (patch 11) adds a simple selftest for DAMON_SYSFS, patch 12
documents DAMON_SYSFS, and the final one (patch 13) adds the ABI document.
Patch History
=============
Changes from v2
(https://lore.kernel.org/linux-mm/20220225130712.12682-1-sj@kernel.org/)
- Put real details in the ABI document (Greg KH)
- Update 'Date:' in ABI document from Feb 2022 to Mar 2022 (Greg KH)
Changes from v1
(https://lore.kernel.org/linux-mm/20220223152051.22936-1-sj@kernel.org/)
- Use __ATTR_R{O,W}_MODE() instead of __ATTR() (Greg KH)
- Change some file names for using __ATTR_R{O,W}_MODE() (Greg KH)
- Add ABI document (Greg KH)
Changes from RFC
(https://lore.kernel.org/linux-mm/20220217161938.8874-1-sj@kernel.org/)
- Implement all DAMON debugfs interface providing features
- Writeup documents
- Add more selftests
SeongJae Park (13):
mm/damon/core: Allow non-exclusive DAMON start/stop
mm/damon/core: Add number of each enum type values
mm/damon: Implement a minimal stub for sysfs-based DAMON interface
mm/damon/sysfs: Link DAMON for virtual address spaces monitoring
mm/damon/sysfs: Support the physical address space monitoring
mm/damon/sysfs: Support DAMON-based Operation Schemes
mm/damon/sysfs: Support DAMOS quotas
mm/damon/sysfs: Support schemes prioritization
mm/damon/sysfs: Support DAMOS watermarks
mm/damon/sysfs: Support DAMOS stats
selftests/damon: Add a test for DAMON sysfs interface
Docs/admin-guide/mm/damon/usage: Document DAMON sysfs interface
Docs/ABI/testing: Add DAMON sysfs interface ABI document
.../ABI/testing/sysfs-kernel-mm-damon | 274 ++
Documentation/admin-guide/mm/damon/usage.rst | 350 ++-
MAINTAINERS | 1 +
include/linux/damon.h | 6 +-
mm/damon/Kconfig | 7 +
mm/damon/Makefile | 1 +
mm/damon/core.c | 23 +-
mm/damon/dbgfs.c | 2 +-
mm/damon/reclaim.c | 2 +-
mm/damon/sysfs.c | 2594 +++++++++++++++++
tools/testing/selftests/damon/Makefile | 1 +
tools/testing/selftests/damon/sysfs.sh | 306 ++
12 files changed, 3550 insertions(+), 17 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-damon
create mode 100644 mm/damon/sysfs.c
create mode 100755 tools/testing/selftests/damon/sysfs.sh
--
2.17.1
Extend the interoperability with IMA, to give wider flexibility for the
implementation of integrity-focused LSMs based on eBPF.
Patch 1 fixes some style issues.
Patches 2-4 give eBPF-based LSMs the ability to take advantage of the
measurement capability of IMA without needing to set up a policy in IMA
(those LSMs might implement the policy capability themselves).
Patches 5-6 allow eBPF-based LSMs to evaluate files read by the kernel.
Changelog
v1:
- Modify ima_file_hash() only and allow the usage of the function with the
modified behavior by eBPF-based LSMs through the new function
bpf_ima_file_hash() (suggested by Mimi)
- Make bpf_lsm_kernel_read_file() sleepable so that bpf_ima_inode_hash()
and bpf_ima_file_hash() can be called inside the implementation of
eBPF-based LSMs for this hook
Roberto Sassu (6):
ima: Fix documentation-related warnings in ima_main.c
ima: Always return a file measurement in ima_file_hash()
bpf-lsm: Introduce new helper bpf_ima_file_hash()
selftests/bpf: Add test for bpf_ima_file_hash()
bpf-lsm: Make bpf_lsm_kernel_read_file() as sleepable
selftests/bpf: Add test for bpf_lsm_kernel_read_file()
include/uapi/linux/bpf.h | 11 +++++
kernel/bpf/bpf_lsm.c | 21 +++++++++
security/integrity/ima/ima_main.c | 47 ++++++++++++-------
tools/include/uapi/linux/bpf.h | 11 +++++
tools/testing/selftests/bpf/ima_setup.sh | 2 +
.../selftests/bpf/prog_tests/test_ima.c | 30 ++++++++++--
tools/testing/selftests/bpf/progs/ima.c | 34 ++++++++++++--
7 files changed, 132 insertions(+), 24 deletions(-)
--
2.32.0
Before, our help output contained lines like
--kconfig_add KCONFIG_ADD
--qemu_config qemu_config
--jobs jobs
They're not very helpful.
The former kind come from the automatic 'metavar' we get from argparse,
the uppercase version of the flag name.
The latter are where we manually specified metavar as the flag name.
After:
--build_dir DIR
--make_options X=Y
--kunitconfig KUNITCONFIG
--kconfig_add CONFIG_X=Y
--arch ARCH
--cross_compile PREFIX
--qemu_config FILE
--jobs N
--timeout SECONDS
--raw_output [{all,kunit}]
--json [FILE]
This patch tries to make the code more clear by specifying the _type_ of
input we expect, e.g. --build_dir is a DIR, --qemu_config is a FILE.
I also switched it to uppercase since it looked more clearly like
placeholder text that way.
This patch also changes --raw_output to specify `choices` to make it
more clear what the options are, and this way argparse can validate it
for us, as shown by the added test case.
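
As a rough illustration of the two argparse features described above, the
following stand-alone sketch builds a tiny parser with explicit `metavar`
values and a `choices` list. It is not the kunit.py code: `build_parser`, the
program name 'demo', and the help strings are made up for the example, and only
three flags are reproduced.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a small demo parser (hypothetical; not kunit.py itself)."""
    parser = argparse.ArgumentParser(prog='demo')
    # metavar describes the kind of value expected, so help shows
    # "--build_dir DIR" instead of "--build_dir BUILD_DIR".
    parser.add_argument('--build_dir', type=str, default='.kunit',
                        metavar='DIR', help='Build directory.')
    parser.add_argument('--jobs', type=int, default=1,
                        metavar='N', help='Number of parallel jobs.')
    # nargs='?' with const lets a bare "--raw_output" mean 'all';
    # choices makes argparse reject anything else with an error.
    parser.add_argument('--raw_output', type=str, nargs='?', const='all',
                        default=None, choices=['all', 'kunit'],
                        help='Do not format kernel output.')
    return parser


if __name__ == '__main__':
    print(build_parser().format_help())
```

With `choices` set, an unknown value such as `--raw_output=invalid` makes
argparse print a usage error and raise SystemExit, which is the behaviour the
added test case relies on.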
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
tools/testing/kunit/kunit.py | 26 ++++++++++++--------------
tools/testing/kunit/kunit_tool_test.py | 5 +++++
2 files changed, 17 insertions(+), 14 deletions(-)
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index 9274c6355809..566404f5e42a 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -206,8 +206,6 @@ def parse_tests(request: KunitParseRequest, input_data: Iterable[str]) -> Tuple[
pass
elif request.raw_output == 'kunit':
output = kunit_parser.extract_tap_lines(output)
- else:
- print(f'Unknown --raw_output option "{request.raw_output}"', file=sys.stderr)
for line in output:
print(line.rstrip())
@@ -281,10 +279,10 @@ def add_common_opts(parser) -> None:
parser.add_argument('--build_dir',
help='As in the make command, it specifies the build '
'directory.',
- type=str, default='.kunit', metavar='build_dir')
+ type=str, default='.kunit', metavar='DIR')
parser.add_argument('--make_options',
help='X=Y make option, can be repeated.',
- action='append')
+ action='append', metavar='X=Y')
parser.add_argument('--alltests',
help='Run all KUnit tests through allyesconfig',
action='store_true')
@@ -292,11 +290,11 @@ def add_common_opts(parser) -> None:
help='Path to Kconfig fragment that enables KUnit tests.'
' If given a directory, (e.g. lib/kunit), "/.kunitconfig" '
'will get automatically appended.',
- metavar='kunitconfig')
+ metavar='KUNITCONFIG')
parser.add_argument('--kconfig_add',
help='Additional Kconfig options to append to the '
'.kunitconfig, e.g. CONFIG_KASAN=y. Can be repeated.',
- action='append')
+ action='append', metavar='CONFIG_X=Y')
parser.add_argument('--arch',
help=('Specifies the architecture to run tests under. '
@@ -304,7 +302,7 @@ def add_common_opts(parser) -> None:
'string passed to the ARCH make param, '
'e.g. i386, x86_64, arm, um, etc. Non-UML '
'architectures run on QEMU.'),
- type=str, default='um', metavar='arch')
+ type=str, default='um', metavar='ARCH')
parser.add_argument('--cross_compile',
help=('Sets make\'s CROSS_COMPILE variable; it should '
@@ -316,18 +314,18 @@ def add_common_opts(parser) -> None:
'if you have downloaded the microblaze toolchain '
'from the 0-day website to a directory in your '
'home directory called `toolchains`).'),
- metavar='cross_compile')
+ metavar='PREFIX')
parser.add_argument('--qemu_config',
help=('Takes a path to a path to a file containing '
'a QemuArchParams object.'),
- type=str, metavar='qemu_config')
+ type=str, metavar='FILE')
def add_build_opts(parser) -> None:
parser.add_argument('--jobs',
help='As in the make command, "Specifies the number of '
'jobs (commands) to run simultaneously."',
- type=int, default=get_default_jobs(), metavar='jobs')
+ type=int, default=get_default_jobs(), metavar='N')
def add_exec_opts(parser) -> None:
parser.add_argument('--timeout',
@@ -336,7 +334,7 @@ def add_exec_opts(parser) -> None:
'tests.',
type=int,
default=300,
- metavar='timeout')
+ metavar='SECONDS')
parser.add_argument('filter_glob',
help='Filter which KUnit test suites/tests run at '
'boot-time, e.g. list* or list*.*del_test',
@@ -346,7 +344,7 @@ def add_exec_opts(parser) -> None:
metavar='filter_glob')
parser.add_argument('--kernel_args',
help='Kernel command-line parameters. Maybe be repeated',
- action='append')
+ action='append', metavar='')
parser.add_argument('--run_isolated', help='If set, boot the kernel for each '
'individual suite/test. This is can be useful for debugging '
'a non-hermetic test, one that might pass/fail based on '
@@ -357,13 +355,13 @@ def add_exec_opts(parser) -> None:
def add_parse_opts(parser) -> None:
parser.add_argument('--raw_output', help='If set don\'t format output from kernel. '
'If set to --raw_output=kunit, filters to just KUnit output.',
- type=str, nargs='?', const='all', default=None)
+ type=str, nargs='?', const='all', default=None, choices=['all', 'kunit'])
parser.add_argument('--json',
nargs='?',
help='Stores test results in a JSON, and either '
'prints to stdout or saves to file if a '
'filename is specified',
- type=str, const='stdout', default=None)
+ type=str, const='stdout', default=None, metavar='FILE')
def main(argv, linux=None):
parser = argparse.ArgumentParser(
diff --git a/tools/testing/kunit/kunit_tool_test.py b/tools/testing/kunit/kunit_tool_test.py
index 352369dffbd9..eb2011d12c78 100755
--- a/tools/testing/kunit/kunit_tool_test.py
+++ b/tools/testing/kunit/kunit_tool_test.py
@@ -595,6 +595,11 @@ class KUnitMainTest(unittest.TestCase):
self.assertNotEqual(call, mock.call(StrContains('Testing complete.')))
self.assertNotEqual(call, mock.call(StrContains(' 0 tests run')))
+ def test_run_raw_output_invalid(self):
+ self.linux_source_mock.run_kernel = mock.Mock(return_value=[])
+ with self.assertRaises(SystemExit) as e:
+ kunit.main(['run', '--raw_output=invalid'], self.linux_source_mock)
+
def test_run_raw_output_does_not_take_positional_args(self):
# --raw_output is a string flag, but we don't want it to consume
# any positional arguments, only ones after an '='
base-commit: 5debe5bfa02c4c8922bd2d0f82c9c3a70bec8944
--
2.35.1.574.g5d30c73bfb-goog
Some problems with reading the RTC time may happen rarely, for example
while the RTC is updating. So read the RTC many times to catch these
problems. For example, a previous attempt for my
commit ea6fa4961aab ("rtc: mc146818-lib: fix RTC presence check")
was incorrect and would have triggered this selftest.
To avoid the risk of damaging the hardware, wait 11ms before consecutive
reads.
In rtc_time_to_timestamp I copied values manually instead of casting -
just to be on the safe side. The 11ms wait period was chosen so that it is
not a divisor of 1000ms.
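
The commit message does not spell out why a non-divisor matters. One plausible
reading, which is an assumption here and not a statement from the patch, is
that reads spaced by a period sharing no factor with 1000ms drift across every
millisecond offset within a second instead of revisiting the same few offsets.
The idealised sketch below (it ignores the time taken by the read itself and
any sleep jitter; `distinct_offsets` is a hypothetical helper) counts how many
distinct offsets are visited:

```python
from math import gcd


def distinct_offsets(step_ms: int, period_ms: int = 1000) -> int:
    """Count distinct offsets (mod period_ms) hit by reads step_ms apart."""
    return len({(i * step_ms) % period_ms for i in range(period_ms)})


if __name__ == '__main__':
    for step in (10, 11):
        # The number of distinct offsets equals period / gcd(step, period).
        print(step, gcd(step, 1000), distinct_offsets(step))
```

Under these assumptions an 11ms step eventually visits all 1000 offsets, while
a 10ms step would only ever visit 100 of them.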
Signed-off-by: Mateusz Jończyk <mat.jonczyk(a)o2.pl>
Cc: Alessandro Zummo <a.zummo(a)towertech.it>
Cc: Alexandre Belloni <alexandre.belloni(a)bootlin.com>
Cc: Shuah Khan <shuah(a)kernel.org>
---
Also, before
commit cdedc45c579f ("rtc: cmos: avoid UIP when reading alarm time")
reading the RTC alarm time during RTC update produced incorrect results
on many Intel platforms. Preparing a similar selftest for this case
would be more difficult, though, because the RTC alarm time is cached by
the kernel. Direct access would have to be exposed somehow, for example
in debugfs. I may prepare a patch for it in the future.
---
tools/testing/selftests/rtc/rtctest.c | 66 +++++++++++++++++++++++++++
tools/testing/selftests/rtc/settings | 2 +-
2 files changed, 67 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/rtc/rtctest.c b/tools/testing/selftests/rtc/rtctest.c
index 66af608fb4c6..2b9d929a24ed 100644
--- a/tools/testing/selftests/rtc/rtctest.c
+++ b/tools/testing/selftests/rtc/rtctest.c
@@ -20,6 +20,8 @@
#define NUM_UIE 3
#define ALARM_DELTA 3
+#define READ_LOOP_DURATION_SEC 30
+#define READ_LOOP_SLEEP_MS 11
static char *rtc_file = "/dev/rtc0";
@@ -49,6 +51,70 @@ TEST_F(rtc, date_read) {
rtc_tm.tm_hour, rtc_tm.tm_min, rtc_tm.tm_sec);
}
+static time_t rtc_time_to_timestamp(struct rtc_time *rtc_time)
+{
+ struct tm tm_time = {
+ .tm_sec = rtc_time->tm_sec,
+ .tm_min = rtc_time->tm_min,
+ .tm_hour = rtc_time->tm_hour,
+ .tm_mday = rtc_time->tm_mday,
+ .tm_mon = rtc_time->tm_mon,
+ .tm_year = rtc_time->tm_year,
+ };
+
+ return mktime(&tm_time);
+}
+
+static void nanosleep_with_retries(long ns)
+{
+ struct timespec req = {
+ .tv_sec = 0,
+ .tv_nsec = ns,
+ };
+ struct timespec rem;
+
+ while (nanosleep(&req, &rem) != 0) {
+ req.tv_sec = rem.tv_sec;
+ req.tv_nsec = rem.tv_nsec;
+ }
+}
+
+TEST_F_TIMEOUT(rtc, date_read_loop, READ_LOOP_DURATION_SEC + 2) {
+ int rc;
+ long iter_count = 0;
+ struct rtc_time rtc_tm;
+ time_t start_rtc_read, prev_rtc_read;
+
+ TH_LOG("Continuously reading RTC time for %ds (with %dms breaks after every read).",
+ READ_LOOP_DURATION_SEC, READ_LOOP_SLEEP_MS);
+
+ rc = ioctl(self->fd, RTC_RD_TIME, &rtc_tm);
+ ASSERT_NE(-1, rc);
+ start_rtc_read = rtc_time_to_timestamp(&rtc_tm);
+ prev_rtc_read = start_rtc_read;
+
+ do {
+ time_t rtc_read;
+
+ rc = ioctl(self->fd, RTC_RD_TIME, &rtc_tm);
+ ASSERT_NE(-1, rc);
+
+ rtc_read = rtc_time_to_timestamp(&rtc_tm);
+ /* Time should not go backwards */
+ ASSERT_LE(prev_rtc_read, rtc_read);
+ /* Time should not increase more then 1s at a time */
+ ASSERT_GE(prev_rtc_read + 1, rtc_read);
+
+ /* Sleep 11ms to avoid killing / overheating the RTC */
+ nanosleep_with_retries(READ_LOOP_SLEEP_MS * 1000000);
+
+ prev_rtc_read = rtc_read;
+ iter_count++;
+ } while (prev_rtc_read <= start_rtc_read + READ_LOOP_DURATION_SEC);
+
+ TH_LOG("Performed %ld RTC time reads.", iter_count);
+}
+
TEST_F_TIMEOUT(rtc, uie_read, NUM_UIE + 2) {
int i, rc, irq = 0;
unsigned long data;
diff --git a/tools/testing/selftests/rtc/settings b/tools/testing/selftests/rtc/settings
index a953c96aa16e..0c1a2075d5f3 100644
--- a/tools/testing/selftests/rtc/settings
+++ b/tools/testing/selftests/rtc/settings
@@ -1 +1 @@
-timeout=180
+timeout=210
--
2.25.1
Changes from Previous Version (v1)
==================================
Compared to v1 of this patchset
(https://lore.kernel.org/linux-mm/20220223152051.22936-1-sj@kernel.org/), this
version contains the changes below.
- Use __ATTR_R{O,W}_MODE() instead of __ATTR() (Greg KH)
- Change some file names for using __ATTR_R{O,W}_MODE() (Greg KH)
- Add ABI document (Greg KH)
Introduction
============
DAMON's debugfs-based user interface (DAMON_DBGFS) has served very well so far.
However, it unnecessarily depends on debugfs, even though DAMON is not intended
only for debugging. Also, the interface receives multiple values via one
file. For example, the schemes file receives 18 values. As a result, it is
inefficient, hard to use, and difficult to extend. In particular, keeping
backward compatibility for user space tools is becoming increasingly
challenging. It would be better to implement another reliable and flexible
interface and deprecate DAMON_DBGFS in the long term.
For this reason, this patchset introduces a new sysfs-based user interface for
DAMON. The idea of the new interface is to use directory hierarchies and have
one dedicated file for each value. As a short example, users can do virtual
address monitoring via the interface as below:
# cd /sys/kernel/mm/damon/admin/
# echo 1 > kdamonds/nr_kdamonds
# echo 1 > kdamonds/0/contexts/nr_contexts
# echo vaddr > kdamonds/0/contexts/0/operations
# echo 1 > kdamonds/0/contexts/0/targets/nr_targets
# echo $(pidof <workload>) > kdamonds/0/contexts/0/targets/0/pid_target
# echo on > kdamonds/0/state
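
The same sequence of writes can also be driven from a script. The sketch below
is a minimal illustration, not part of the patchset: the helper names
`vaddr_monitoring_writes` and `apply_writes` are hypothetical, while the file
names and values are taken from the shell example above. Actually applying the
writes is assumed to need root privileges and a kernel built with DAMON_SYSFS.

```python
from pathlib import Path
from typing import List, Tuple

ADMIN_DIR = Path('/sys/kernel/mm/damon/admin')


def vaddr_monitoring_writes(pid: int) -> List[Tuple[str, str]]:
    """Return (relative file, value) pairs in the order of the shell example."""
    ctx = 'kdamonds/0/contexts/0'
    return [
        ('kdamonds/nr_kdamonds', '1'),
        ('kdamonds/0/contexts/nr_contexts', '1'),
        (f'{ctx}/operations', 'vaddr'),
        (f'{ctx}/targets/nr_targets', '1'),
        (f'{ctx}/targets/0/pid_target', str(pid)),
        ('kdamonds/0/state', 'on'),
    ]


def apply_writes(writes: List[Tuple[str, str]], root: Path = ADMIN_DIR) -> None:
    """Write each value to its file under root, one value per file."""
    for rel_path, value in writes:
        (root / rel_path).write_text(value)


if __name__ == '__main__':
    # Print the plan instead of touching sysfs; 1234 is a placeholder pid.
    for rel_path, value in vaddr_monitoring_writes(1234):
        print(f'echo {value} > {rel_path}')
```

The order is kept the same as in the shell example, since the numbered
directories appear to be populated only after the corresponding nr_* file is
written.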
A brief representation of the file hierarchy of the DAMON sysfs interface is as
below. Children are represented with indentation, directories have a '/'
suffix, and files in each directory are separated by commas.
/sys/kernel/mm/damon/admin
│ kdamonds/nr_kdamonds
│ │ 0/state,pid
│ │ │ contexts/nr_contexts
│ │ │ │ 0/operations
│ │ │ │ │ monitoring_attrs/
│ │ │ │ │ │ intervals/sample_us,aggr_us,update_us
│ │ │ │ │ │ nr_regions/min,max
│ │ │ │ │ targets/nr_targets
│ │ │ │ │ │ 0/pid_target
│ │ │ │ │ │ │ regions/nr_regions
│ │ │ │ │ │ │ │ 0/start,end
│ │ │ │ │ │ │ │ ...
│ │ │ │ │ │ ...
│ │ │ │ │ schemes/nr_schemes
│ │ │ │ │ │ 0/action
│ │ │ │ │ │ │ access_pattern/
│ │ │ │ │ │ │ │ sz/min,max
│ │ │ │ │ │ │ │ nr_accesses/min,max
│ │ │ │ │ │ │ │ age/min,max
│ │ │ │ │ │ │ quotas/ms,bytes,reset_interval_ms
│ │ │ │ │ │ │ │ weights/sz_permil,nr_accesses_permil,age_permil
│ │ │ │ │ │ │ watermarks/metric,interval_us,high,mid,low
│ │ │ │ │ │ │ stats/nr_tried,sz_tried,nr_applied,sz_applied,qt_exceeds
│ │ │ │ │ │ ...
│ │ │ │ ...
│ │ ...
Detailed usage of the files will be described in the final Documentation patch
of this patchset.
Main Difference Between DAMON_DBGFS and DAMON_SYSFS
---------------------------------------------------
At the moment, DAMON_DBGFS and DAMON_SYSFS provide the same features. One
important difference between them is their exclusiveness. DAMON_DBGFS works in
an exclusive manner, so that no DAMON worker thread (kdamond) in the system can
run concurrently and interfere. For this reason, DAMON_DBGFS asks users to
construct all monitoring contexts and start them at once. It's not a big
problem, but it makes the operation a little complex and inflexible.
For more flexible usage, DAMON_SYSFS moves the responsibility of preventing any
possible interference to the admins and works in a non-exclusive manner. That
is, users can configure and start contexts one by one. Note that DAMON
respects both exclusive groups and non-exclusive groups of contexts, in a
manner similar to that of reader-writer locks. That is, if any exclusive
monitoring contexts (e.g., contexts that started via DAMON_DBGFS) are running,
DAMON_SYSFS does not start new contexts, and vice versa.
Future Plan of DAMON_DBGFS Deprecation
======================================
Once this patchset is merged, DAMON_DBGFS development will be frozen. That is,
we will maintain it so that it keeps working as it does now and no users will
break, but it will not be extended to provide any new DAMON features. Support
will continue only until the next LTS release. After that, we will drop
DAMON_DBGFS.
User-space Tooling Compatibility
--------------------------------
As DAMON_SYSFS provides all features of DAMON_DBGFS, all user space tooling can
move to DAMON_SYSFS. As we will continue supporting DAMON_DBGFS until the next
LTS kernel release, user space tools will have enough time to move to
DAMON_SYSFS. The official user space tool, damo[1], already supports both
DAMON_SYSFS and DAMON_DBGFS. Both correctness tests[2] and performance tests[3]
of DAMON using DAMON_SYSFS also passed.
[1] https://github.com/awslabs/damo
[2] https://github.com/awslabs/damon-tests/tree/master/corr
[3] https://github.com/awslabs/damon-tests/tree/master/perf
Complete Git Tree
=================
You can get the complete git tree from
https://git.kernel.org/sj/h/damon/sysfs/patches/v2.
Sequence of Patches
===================
The first two patches (patches 1-2) make core changes for DAMON_SYSFS. The
first one (patch 1) allows non-exclusive DAMON contexts so that DAMON_SYSFS can
work in non-exclusive mode, while the second one (patch 2) adds the number of
values of each DAMON enum type so that DAMON API users can safely iterate the
enums.
The third patch (patch 3) implements a basic sysfs stub for virtual address
space monitoring. Note that this implements only the sysfs files; DAMON is not
yet linked. The fourth patch (patch 4) links DAMON_SYSFS to DAMON so that users
can control DAMON using the sysfs files.
The following six patches (patches 5-10) implement, one by one, the other DAMON
features that DAMON_DBGFS supports (physical address space monitoring,
DAMON-based operation schemes, schemes quotas, schemes prioritization weights,
schemes watermarks, and schemes stats).
The next patch (patch 11) adds a simple selftest for DAMON_SYSFS, and the
final one (patch 12) documents DAMON_SYSFS.
Patch History
=============
Changes from v1
(https://lore.kernel.org/linux-mm/20220223152051.22936-1-sj@kernel.org/)
- Use __ATTR_R{O,W}_MODE() instead of __ATTR() (Greg KH)
- Change some file names for using __ATTR_R{O,W}_MODE() (Greg KH)
- Add ABI document (Greg KH)
Changes from RFC
(https://lore.kernel.org/linux-mm/20220217161938.8874-1-sj@kernel.org/)
- Implement all DAMON debugfs interface providing features
- Writeup documents
- Add more selftests
SeongJae Park (13):
mm/damon/core: Allow non-exclusive DAMON start/stop
mm/damon/core: Add number of each enum type values
mm/damon: Implement a minimal stub for sysfs-based DAMON interface
mm/damon/sysfs: Link DAMON for virtual address spaces monitoring
mm/damon/sysfs: Support the physical address space monitoring
mm/damon/sysfs: Support DAMON-based Operation Schemes
mm/damon/sysfs: Support DAMOS quotas
mm/damon/sysfs: Support schemes prioritization
mm/damon/sysfs: Support DAMOS watermarks
mm/damon/sysfs: Support DAMOS stats
selftests/damon: Add a test for DAMON sysfs interface
Docs/admin-guide/mm/damon/usage: Document DAMON sysfs interface
Docs/ABI/testing: Add DAMON sysfs interface ABI document
.../ABI/testing/sysfs-kernel-mm-damon | 276 ++
Documentation/admin-guide/mm/damon/usage.rst | 350 ++-
MAINTAINERS | 1 +
include/linux/damon.h | 6 +-
mm/damon/Kconfig | 7 +
mm/damon/Makefile | 1 +
mm/damon/core.c | 23 +-
mm/damon/dbgfs.c | 2 +-
mm/damon/reclaim.c | 2 +-
mm/damon/sysfs.c | 2594 +++++++++++++++++
tools/testing/selftests/damon/Makefile | 1 +
tools/testing/selftests/damon/sysfs.sh | 306 ++
12 files changed, 3552 insertions(+), 17 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-damon
create mode 100644 mm/damon/sysfs.c
create mode 100755 tools/testing/selftests/damon/sysfs.sh
--
2.17.1
Changes from Previous Version (RFC)
===================================
Compared to the RFC version of this patchset
(https://lore.kernel.org/linux-mm/20220217161938.8874-1-sj@kernel.org/), this
version contains the changes below.
- Implement all DAMON debugfs interface providing features
- Writeup documents
- Add more selftests
Introduction
============
DAMON's debugfs-based user interface (DAMON_DBGFS) has served very well so far.
However, it unnecessarily depends on debugfs, even though DAMON is not intended
only for debugging. Also, the interface receives multiple values via one
file. For example, the schemes file receives 18 values. As a result, it is
inefficient, hard to use, and difficult to extend. In particular, keeping
backward compatibility for user space tools is becoming increasingly
challenging. It would be better to implement another reliable and flexible
interface and deprecate DAMON_DBGFS in the long term.
For this reason, this patchset introduces a new sysfs-based user interface for
DAMON. The idea of the new interface is to use directory hierarchies and have
one dedicated file for each value. As a short example, users can do virtual
address monitoring via the interface as below:
# cd /sys/kernel/mm/damon/admin/
# echo 1 > kdamonds/nr
# echo 1 > kdamonds/0/contexts/nr
# echo vaddr > kdamonds/0/contexts/0/operations
# echo 1 > kdamonds/0/contexts/0/targets/nr
# echo $(pidof <workload>) > kdamonds/0/contexts/0/targets/0/pid
# echo on > kdamonds/0/state
A brief representation of the file hierarchy of the DAMON sysfs interface is as
below. Children are represented with indentation, directories have a '/'
suffix, and files in each directory are separated by commas.
/sys/kernel/mm/damon/admin
│ kdamonds/nr
│ │ 0/state,pid
│ │ │ contexts/nr
│ │ │ │ 0/operations
│ │ │ │ │ monitoring_attrs/
│ │ │ │ │ │ intervals/sample_us,aggr_us,update_us
│ │ │ │ │ │ nr_regions/min,max
│ │ │ │ │ targets/nr
│ │ │ │ │ │ 0/pid
│ │ │ │ │ │ │ regions/nr
│ │ │ │ │ │ │ │ 0/start,end
│ │ │ │ │ │ │ │ ...
│ │ │ │ │ │ ...
│ │ │ │ │ schemes/nr
│ │ │ │ │ 0/action
│ │ │ │ │ │ access_pattern/
│ │ │ │ │ │ │ sz/min,max
│ │ │ │ │ │ │ nr_accesses/min,max
│ │ │ │ │ │ │ age/min,max
│ │ │ │ │ │ quotas/ms,sz,reset_interval_ms
│ │ │ │ │ │ │ weights/sz,nr_accesses,age
│ │ │ │ │ │ watermarks/metric,interval_us,high,mid,low
│ │ │ │ │ │ stats/nr_tried,sz_tried,nr_applied,sz_applied,qt_exceeds
│ │ │ │ │ ...
│ │ ...
Detailed usage of the files will be described in the final Documentation patch
of this patchset.
Main Difference Between DAMON_DBGFS and DAMON_SYSFS
---------------------------------------------------
At the moment, DAMON_DBGFS and DAMON_SYSFS provide the same features. One
important difference between them is their exclusiveness. DAMON_DBGFS works in
an exclusive manner, so that no DAMON worker thread (kdamond) in the system can
run concurrently and interfere. For this reason, DAMON_DBGFS asks users to
construct all monitoring contexts and start them at once. It's not a big
problem, but it makes the operation a little complex and inflexible.
For more flexible usage, DAMON_SYSFS moves the responsibility of preventing any
possible interference to the admins and works in a non-exclusive manner. That
is, users can configure and start contexts one by one. Note that DAMON
respects both exclusive groups and non-exclusive groups of contexts, in a
manner similar to that of reader-writer locks. That is, if any exclusive
monitoring contexts (e.g., contexts that started via DAMON_DBGFS) are running,
DAMON_SYSFS does not start new contexts, and vice versa.
Future Plan of DAMON_DBGFS Deprecation
======================================
Once this patchset is merged, DAMON_DBGFS development will be frozen. That is,
we will maintain it so that it keeps working as it does now and no users will
break, but it will not be extended to provide any new DAMON features. Support
will continue only until the next LTS release. After that, we will drop
DAMON_DBGFS.
User-space Tooling Compatibility
--------------------------------
As DAMON_SYSFS provides all features of DAMON_DBGFS, all user space tooling can
move to DAMON_SYSFS. As we will continue supporting DAMON_DBGFS until the next
LTS kernel release, user space tools will have enough time to move to
DAMON_SYSFS. The official user space tool, damo[1], already supports both
DAMON_SYSFS and DAMON_DBGFS. Both correctness tests[2] and performance tests[3]
of DAMON using DAMON_SYSFS also passed.
[1] https://github.com/awslabs/damo
[2] https://github.com/awslabs/damon-tests/tree/master/corr
[3] https://github.com/awslabs/damon-tests/tree/master/perf
Complete Git Tree
=================
You can get the complete git tree from
https://git.kernel.org/sj/h/damon/sysfs/patches/v1.
Sequence of Patches
===================
The first two patches (patches 1-2) make core changes for DAMON_SYSFS. The
first one (patch 1) allows non-exclusive DAMON contexts so that DAMON_SYSFS can
work in non-exclusive mode, while the second one (patch 2) adds the number of
values of each DAMON enum type so that DAMON API users can safely iterate the
enums.
The third patch (patch 3) implements a basic sysfs stub for virtual address
space monitoring. Note that this implements only the sysfs files; DAMON is not
yet linked. The fourth patch (patch 4) links DAMON_SYSFS to DAMON so that users
can control DAMON using the sysfs files.
The following six patches (patches 5-10) implement, one by one, the other DAMON
features that DAMON_DBGFS supports (physical address space monitoring,
DAMON-based operation schemes, schemes quotas, schemes prioritization weights,
schemes watermarks, and schemes stats).
The next patch (patch 11) adds a simple selftest for DAMON_SYSFS, and the
final one (patch 12) documents DAMON_SYSFS.
SeongJae Park (12):
mm/damon/core: Allow non-exclusive DAMON start/stop
mm/damon/core: Add number of each enum type values
mm/damon: Implement a minimal stub for sysfs-based DAMON interface
mm/damon/sysfs: Link DAMON for virtual address spaces monitoring
mm/damon/sysfs: Support physical address space monitoring
mm/damon/sysfs: Support DAMON-based Operation Schemes
mm/damon/sysfs: Support DAMOS quotas
mm/damon/sysfs: Support schemes prioritization weights
mm/damon/sysfs: Support DAMOS watermarks
mm/damon/sysfs: Support DAMOS stats
selftests/damon: Add a test for DAMON sysfs interface
Docs/admin-guide/mm/damon/usage: Document DAMON sysfs interface
Documentation/admin-guide/mm/damon/usage.rst | 349 ++-
include/linux/damon.h | 6 +-
mm/damon/Kconfig | 7 +
mm/damon/Makefile | 1 +
mm/damon/core.c | 23 +-
mm/damon/dbgfs.c | 2 +-
mm/damon/reclaim.c | 2 +-
mm/damon/sysfs.c | 2684 ++++++++++++++++++
tools/testing/selftests/damon/Makefile | 1 +
tools/testing/selftests/damon/sysfs.sh | 306 ++
10 files changed, 3364 insertions(+), 17 deletions(-)
create mode 100644 mm/damon/sysfs.c
create mode 100755 tools/testing/selftests/damon/sysfs.sh
--
2.17.1
The list_del_init_careful() function was added[1] after the list KUnit
test. Add a very basic test to cover it.
Note that this test only covers the single-threaded behaviour (which
matches list_del_init()), as is already the case with the test for
list_empty_careful().
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
Signed-off-by: David Gow <davidgow(a)google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
---
Changes since v3:
https://lore.kernel.org/lkml/20220209052813.854014-1-davidgow@google.com/
- Fix a comment style issue.
- Add Reviewed-by tags.
Changes since v2:
https://lore.kernel.org/linux-kselftest/20220208040122.695258-1-davidgow@go…
- Fix the test calling list_del_init() instead of
list_del_init_careful()
- Improve the comment noting we only test single-threaded behaviour.
Changes since v1:
https://lore.kernel.org/linux-kselftest/20220205061539.273330-1-davidgow@go…
- Patch 1/3 unchanged
---
lib/list-test.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/lib/list-test.c b/lib/list-test.c
index ee09505df16f..302b7382bff4 100644
--- a/lib/list-test.c
+++ b/lib/list-test.c
@@ -161,6 +161,26 @@ static void list_test_list_del_init(struct kunit *test)
KUNIT_EXPECT_TRUE(test, list_empty_careful(&a));
}
+static void list_test_list_del_init_careful(struct kunit *test)
+{
+ /* NOTE: This test only checks the behaviour of this function in
+ * isolation. It does not verify memory model guarantees.
+ */
+ struct list_head a, b;
+ LIST_HEAD(list);
+
+ list_add_tail(&a, &list);
+ list_add_tail(&b, &list);
+
+ /* before: [list] -> a -> b */
+ list_del_init_careful(&a);
+ /* after: [list] -> b, a initialised */
+
+ KUNIT_EXPECT_PTR_EQ(test, list.next, &b);
+ KUNIT_EXPECT_PTR_EQ(test, b.prev, &list);
+ KUNIT_EXPECT_TRUE(test, list_empty_careful(&a));
+}
+
static void list_test_list_move(struct kunit *test)
{
struct list_head a, b;
@@ -707,6 +727,7 @@ static struct kunit_case list_test_cases[] = {
KUNIT_CASE(list_test_list_replace_init),
KUNIT_CASE(list_test_list_swap),
KUNIT_CASE(list_test_list_del_init),
+ KUNIT_CASE(list_test_list_del_init_careful),
KUNIT_CASE(list_test_list_move),
KUNIT_CASE(list_test_list_move_tail),
KUNIT_CASE(list_test_list_bulk_move_tail),
--
2.35.1.574.g5d30c73bfb-goog
This series is a result of looking deeper into breakage of
tools/testing/selftests/rlimits/rlimits-per-userns.c after
https://lore.kernel.org/r/20220204181144.24462-1-mkoutny@suse.com/
is applied.
The description of the original problem that lead to RLIMIT_NPROC et al.
ucounts rewrite could be ambiguously interpretted as supporting either
the case of:
- never-fork service or
- fork (RLIMIT_NPROC-1) times service.
The scenario is weird anyway given existence of pids controller.
The realization of that scenario relies not only on tracking number of
processes per user_ns but also newly allows the root to override limit through
set*uid. The commit message didn't mention that, so it's unclear if it
was the intention too.
I also noticed that the RLIMIT_NPROC enforcement in fork seems subject to a TOCTOU
race (check(nr_tasks),...,nr_tasks++), so the limit is rather advisory (but
that's not a new thing related to the ucounts rewrite).
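The check-then-increment pattern mentioned above can be illustrated with a
small model. This is only a sketch, not kernel code: the function names, the
two-phase replay of the interleaving, and the numbers are all made up for
illustration. It shows how several forkers that all pass the check against
the same counter value can push the count past the limit, while a combined
check-and-increment cannot.

```python
def racy_forks(limit: int, nr_tasks: int, forkers: int) -> int:
    """Replay the worst-case interleaving of check(nr_tasks), ..., nr_tasks++.

    Hypothetical model: every forker runs its check before any of them
    increments, so all checks see the same stale counter value.
    """
    # Phase 1: all forkers check against the same snapshot of the counter.
    passed = [nr_tasks < limit for _ in range(forkers)]
    # Phase 2: every forker that passed the check increments the counter.
    for ok in passed:
        if ok:
            nr_tasks += 1
    return nr_tasks


def atomic_forks(limit: int, nr_tasks: int, forkers: int) -> int:
    """Same workload, but check and increment form one indivisible step."""
    for _ in range(forkers):
        if nr_tasks < limit:
            nr_tasks += 1
    return nr_tasks


if __name__ == "__main__":
    # Limit of 3 with 2 tasks already running and 2 concurrent forkers.
    print("racy:  ", racy_forks(limit=3, nr_tasks=2, forkers=2))
    print("atomic:", atomic_forks(limit=3, nr_tasks=2, forkers=2))
```

In the racy replay both forkers see nr_tasks == 2 and both proceed, which is
why the limit is better described as advisory than as a hard bound.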
This series is an RFC to discuss the relevance of the subtle changes that the
RLIMIT_NPROC to ucounts rewrite introduced.
Michal Koutný (6):
set_user: Perform RLIMIT_NPROC capability check against new user
credentials
set*uid: Check RLIMIT_PROC against new credentials
cred: Count tasks by their real uid into RLIMIT_NPROC
ucounts: Allow root to override RLIMIT_NPROC
selftests: Challenge RLIMIT_NPROC in user namespaces
selftests: Test RLIMIT_NPROC in clone-created user namespaces
fs/exec.c | 2 +-
include/linux/cred.h | 2 +-
kernel/cred.c | 29 ++-
kernel/fork.c | 2 +-
kernel/sys.c | 20 +-
kernel/ucount.c | 3 +
kernel/user_namespace.c | 2 +-
.../selftests/rlimits/rlimits-per-userns.c | 233 +++++++++++++++---
8 files changed, 229 insertions(+), 64 deletions(-)
--
2.34.1
Use a more idiomatic check that a list is non-empty (`if mylist:`) and
simplify the function body by dedenting and using a dict to map between
the kunit TestStatus enum => KernelCI json status string.
The dict hopefully makes it less likely to have bugs like commit
9a6bb30a8830 ("kunit: tool: fix --json output for skipped tests").
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
---
Note: this series is based on my earlier set of kunit tool cleanups for
5.18, https://lore.kernel.org/linux-kselftest/20220118190922.1557074-1-dlatypov@g…
There's no interesting semantic dependency, just some boring merge
conflicts, specifically with patch #4 there, https://lore.kernel.org/linux-kselftest/20220118190922.1557074-5-dlatypov@g…
---
tools/testing/kunit/kunit_json.py | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/tools/testing/kunit/kunit_json.py b/tools/testing/kunit/kunit_json.py
index 24d103049bca..14a480d3308a 100644
--- a/tools/testing/kunit/kunit_json.py
+++ b/tools/testing/kunit/kunit_json.py
@@ -16,24 +16,24 @@ from typing import Any, Dict
JsonObj = Dict[str, Any]
+_status_map: Dict[TestStatus, str] = {
+ TestStatus.SUCCESS: "PASS",
+ TestStatus.SKIPPED: "SKIP",
+ TestStatus.TEST_CRASHED: "ERROR",
+}
+
def _get_group_json(test: Test, def_config: str, build_dir: str) -> JsonObj:
sub_groups = [] # List[JsonObj]
test_cases = [] # List[JsonObj]
for subtest in test.subtests:
- if len(subtest.subtests):
+ if subtest.subtests:
sub_group = _get_group_json(subtest, def_config,
build_dir)
sub_groups.append(sub_group)
- else:
- test_case = {"name": subtest.name, "status": "FAIL"}
- if subtest.status == TestStatus.SUCCESS:
- test_case["status"] = "PASS"
- elif subtest.status == TestStatus.SKIPPED:
- test_case["status"] = "SKIP"
- elif subtest.status == TestStatus.TEST_CRASHED:
- test_case["status"] = "ERROR"
- test_cases.append(test_case)
+ continue
+ status = _status_map.get(subtest.status, "FAIL")
+ test_cases.append({"name": subtest.name, "status": status})
test_group = {
"name": test.name,
--
2.35.1.473.g83b2b277ed-goog
The first patch of this series is an improvement to the existing
syncookie BPF helper.
The two other patches add new functionality that allows XDP to
accelerate iptables synproxy.
v1 of this series [1] used to include a patch that exposed conntrack
lookup to BPF using stable helpers. It was superseded by series [2] by
Kumar Kartikeya Dwivedi, which implements this functionality using
unstable helpers.
The second patch adds new helpers to issue and check SYN cookies without
binding to a socket, which is useful in the synproxy scenario.
The third patch adds a selftest, which consists of a script, an XDP
program and a userspace control application. The XDP program uses
socketless SYN cookie helpers and queries conntrack status instead of
socket status. The userspace control application allows tuning the
parameters of the XDP program. This program also serves as a minimal
example of usage of the new functionality.
The draft of the new functionality was presented at Netdev 0x15 [3].
v2 changes:
Split into two series, submitted bugfixes to bpf, dropped the conntrack
patches, implemented the timestamp cookie in BPF using bpf_loop, dropped
the timestamp cookie patch.
[1]: https://lore.kernel.org/bpf/20211020095815.GJ28644@breakpoint.cc/t/
[2]: https://lore.kernel.org/bpf/20220114163953.1455836-1-memxor@gmail.com/
[3]: https://netdevconf.info/0x15/session.html?Accelerating-synproxy-with-XDP
Maxim Mikityanskiy (3):
bpf: Make errors of bpf_tcp_check_syncookie distinguishable
bpf: Add helpers to issue and check SYN cookies in XDP
bpf: Add selftests for raw syncookie helpers
include/net/tcp.h | 1 +
include/uapi/linux/bpf.h | 75 +-
net/core/filter.c | 128 ++-
net/ipv4/tcp_input.c | 3 +-
tools/include/uapi/linux/bpf.h | 75 +-
tools/testing/selftests/bpf/.gitignore | 1 +
tools/testing/selftests/bpf/Makefile | 5 +-
.../selftests/bpf/progs/xdp_synproxy_kern.c | 743 ++++++++++++++++++
.../selftests/bpf/test_xdp_synproxy.sh | 71 ++
tools/testing/selftests/bpf/xdp_synproxy.c | 418 ++++++++++
10 files changed, 1510 insertions(+), 10 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
create mode 100755 tools/testing/selftests/bpf/test_xdp_synproxy.sh
create mode 100644 tools/testing/selftests/bpf/xdp_synproxy.c
--
2.30.2
This series starts by adding support for SA filtering to the bridge,
which is then allowed to be offloaded to switchdev devices. Furthermore
an offloading implementation is supplied for the mv88e6xxx driver.
Public Local Area Networks are often deployed such that there is a
risk of unauthorized or unattended clients getting access to the LAN.
To prevent such access we introduce SA filtering, such that ports
designated as secure ports are set in locked mode, so that only
authorized source MAC addresses are given access by adding them to
the bridge's forwarding database. Incoming packets with source MAC
addresses that are not in the forwarding database of the bridge are
discarded. It is then the task of user space daemons to populate the
bridge's forwarding database with static entries of authorized entities.
The most common approach is to use the IEEE 802.1X protocol to take
care of authorizing users, opening the port for the source address of
each authorized host.
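The ingress decision described above can be modelled in a few lines. This is
a hedged sketch of the behaviour as the cover letter describes it, not the
bridge implementation: the function name, the dict-based forwarding database,
and the port names are hypothetical. The requirement that the FDB entry point
at the ingress port is an assumption added here; the text above only states
that the source MAC must be present in the forwarding database.

```python
from typing import Dict


def admit_frame(fdb: Dict[str, str], port: str, locked: bool, src_mac: str) -> bool:
    """Return True if a frame arriving on `port` should be accepted.

    fdb maps a source MAC address to the port it was authorized on
    (a toy stand-in for the bridge's forwarding database).
    """
    if not locked:
        # Unlocked ports accept frames from any source address.
        return True
    # Locked port: only sources that user space has added to the FDB pass.
    # Assumption: the entry must also belong to the ingress port.
    return fdb.get(src_mac) == port


if __name__ == "__main__":
    # A user space daemon (e.g. an 802.1X authenticator) authorized host a.
    fdb = {"00:11:22:33:44:0a": "swp1"}
    print(admit_frame(fdb, "swp1", True, "00:11:22:33:44:0a"))  # host a
    print(admit_frame(fdb, "swp1", True, "00:11:22:33:44:0b"))  # host b
```

Because admission depends only on FDB contents, the number of hosts that can
be authenticated behind one port is bounded by the FDB size rather than by
the port count.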
With the current use of the bridge parameter in hostapd, there is
a limitation in using this for IEEE 802.1X port authentication. It
depends on hostapd attaching the port on which it has a successful
authentication to the bridge, but that only allows for a single
authentication per port. This patch set allows for the use of
IEEE 802.1X port authentication in a more general network context with
multiple 802.1X aware hosts behind a single port as depicted, which is
a commonly used commercial use-case, as it is only the number of
available entries in the forwarding database that limits the number of
authenticated clients.
+--------------------------------+
| |
| Bridge/Authenticator |
| |
+-------------+------------------+
802.1X port |
|
|
+------+-------+
| |
| Hub/Switch |
| |
+-+----------+-+
| |
+--+--+ +--+--+
| | | |
Hosts | a | | b | . . .
| | | |
+-----+ +-----+
The 802.1X standard involves three different components, a Supplicant
(Host), an Authenticator (Network Access Point) and an Authentication
Server which is typically a Radius server. This patch set thus enables
the bridge module together with an authenticator application to serve
as an Authenticator on designated ports.
For the bridge to become an IEEE 802.1X Authenticator, a solution using
hostapd with the bridge driver can be found at
https://github.com/westermo/hostapd/tree/bridge_driver .
The relevant components work transparently regardless of whether the
bridge module or the offloaded switchcore is in use.
Hans Schultz (5):
net: bridge: Add support for bridge port in locked mode
net: bridge: Add support for offloading of locked port flag
net: dsa: Include BR_PORT_LOCKED in the list of synced brport flags
net: dsa: mv88e6xxx: Add support for bridge port locked mode
selftests: forwarding: tests of locked port feature
drivers/net/dsa/mv88e6xxx/chip.c | 9 +-
drivers/net/dsa/mv88e6xxx/port.c | 29 +++
drivers/net/dsa/mv88e6xxx/port.h | 9 +-
include/linux/if_bridge.h | 1 +
include/uapi/linux/if_link.h | 1 +
net/bridge/br_input.c | 11 +-
net/bridge/br_netlink.c | 6 +-
net/bridge/br_switchdev.c | 2 +-
net/dsa/port.c | 4 +-
.../testing/selftests/net/forwarding/Makefile | 1 +
.../net/forwarding/bridge_locked_port.sh | 180 ++++++++++++++++++
tools/testing/selftests/net/forwarding/lib.sh | 8 +
12 files changed, 254 insertions(+), 7 deletions(-)
create mode 100755 tools/testing/selftests/net/forwarding/bridge_locked_port.sh
--
2.30.2
Changes since V1:
- V1: https://lore.kernel.org/linux-sgx/cover.1643393473.git.reinette.chatre@inte…
- All changes impact the commit messages only, no changes to code.
- Rewrite commit message of 1/4 (Dave).
- Detail in 2/4 commit log what callers will see with this change (Dave).
- Add Acked-by from Dave to 2/4 and 4/4.
Hi Everybody,
Please find included a few fixes that address problems encountered after
venturing into the enclave loading error handling code of the SGX
selftests.
Reinette
Reinette Chatre (4):
selftests/sgx: Fix NULL-pointer-dereference upon early test failure
selftests/sgx: Do not attempt enclave build without valid enclave
selftests/sgx: Ensure enclave data available during debug print
selftests/sgx: Remove extra newlines in test output
tools/testing/selftests/sgx/load.c | 9 +++++----
tools/testing/selftests/sgx/main.c | 9 +++++----
2 files changed, 10 insertions(+), 8 deletions(-)
base-commit: 2056e2989bf47ad7274ecc5e9dda2add53c112f9
--
2.25.1
The arch_timer and vgic_irq kselftests assume that they can create a
vgic-v3, using the library function vgic_v3_setup() which aborts with a
test failure if it is not possible to do so. Since vgic-v3 can only be
instantiated on systems where the host has GICv3 this leads to false
positives on older systems where that is not the case.
Fix this by changing vgic_v3_setup() to return an error if the vgic can't
be instantiated and have the callers skip if this happens. We could also
exit flagging a skip in vgic_v3_setup() but this would prevent future test
cases conditionally deciding which GIC to use or generally doing more
complex output.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Reviewed-by: Andrew Jones <drjones(a)redhat.com>
Tested-by: Ricardo Koller <ricarkol(a)google.com>
---
tools/testing/selftests/kvm/aarch64/arch_timer.c | 7 ++++++-
tools/testing/selftests/kvm/aarch64/vgic_irq.c | 4 ++++
tools/testing/selftests/kvm/lib/aarch64/vgic.c | 4 +++-
3 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/aarch64/arch_timer.c b/tools/testing/selftests/kvm/aarch64/arch_timer.c
index 9ad38bd360a4..b08d30bf71c5 100644
--- a/tools/testing/selftests/kvm/aarch64/arch_timer.c
+++ b/tools/testing/selftests/kvm/aarch64/arch_timer.c
@@ -366,6 +366,7 @@ static struct kvm_vm *test_vm_create(void)
{
struct kvm_vm *vm;
unsigned int i;
+ int ret;
int nr_vcpus = test_args.nr_vcpus;
vm = vm_create_default_with_vcpus(nr_vcpus, 0, 0, guest_code, NULL);
@@ -382,7 +383,11 @@ static struct kvm_vm *test_vm_create(void)
ucall_init(vm, NULL);
test_init_timer_irq(vm);
- vgic_v3_setup(vm, nr_vcpus, 64, GICD_BASE_GPA, GICR_BASE_GPA);
+ ret = vgic_v3_setup(vm, nr_vcpus, 64, GICD_BASE_GPA, GICR_BASE_GPA);
+ if (ret < 0) {
+ print_skip("Failed to create vgic-v3");
+ exit(KSFT_SKIP);
+ }
/* Make all the test's cmdline args visible to the guest */
sync_global_to_guest(vm, test_args);
diff --git a/tools/testing/selftests/kvm/aarch64/vgic_irq.c b/tools/testing/selftests/kvm/aarch64/vgic_irq.c
index f0230711fbe9..554ca649d470 100644
--- a/tools/testing/selftests/kvm/aarch64/vgic_irq.c
+++ b/tools/testing/selftests/kvm/aarch64/vgic_irq.c
@@ -767,6 +767,10 @@ static void test_vgic(uint32_t nr_irqs, bool level_sensitive, bool eoi_split)
gic_fd = vgic_v3_setup(vm, 1, nr_irqs,
GICD_BASE_GPA, GICR_BASE_GPA);
+ if (gic_fd < 0) {
+ print_skip("Failed to create vgic-v3, skipping");
+ exit(KSFT_SKIP);
+ }
vm_install_exception_handler(vm, VECTOR_IRQ_CURRENT,
guest_irq_handlers[args.eoi_split][args.level_sensitive]);
diff --git a/tools/testing/selftests/kvm/lib/aarch64/vgic.c b/tools/testing/selftests/kvm/lib/aarch64/vgic.c
index f365c32a7296..5d45046c1b80 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/vgic.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/vgic.c
@@ -52,7 +52,9 @@ int vgic_v3_setup(struct kvm_vm *vm, unsigned int nr_vcpus, uint32_t nr_irqs,
nr_vcpus, nr_vcpus_created);
/* Distributor setup */
- gic_fd = kvm_create_device(vm, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+ if (_kvm_create_device(vm, KVM_DEV_TYPE_ARM_VGIC_V3,
+ false, &gic_fd) != 0)
+ return -1;
kvm_device_access(gic_fd, KVM_DEV_ARM_VGIC_GRP_NR_IRQS,
0, &nr_irqs, true);
--
2.30.2
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)canonical.com>
[ Upstream commit 6fec1ab67f8d60704cc7de64abcfd389ab131542 ]
The PREEMPT_RT patchset does not use the do_softirq() function, thus trying
to filter for do_softirq fails for such a kernel:
echo do_softirq
ftracetest: 81: echo: echo: I/O error
Choose some other visible function for the test. The function does not
have to be actually executed during the test, because it is only testing
filter API interface.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)canonical.com>
Reviewed-by: Shuah Khan <skhan(a)linuxfoundation.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Reviewed-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc b/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc
index 51f6e6146bd93..951b4311930c5 100644
--- a/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc
+++ b/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc
@@ -22,7 +22,7 @@ fail() { # mesg
FILTER=set_ftrace_filter
FUNC1="schedule"
-FUNC2="do_softirq"
+FUNC2="scheduler_tick"
ALL_FUNCS="#### all functions enabled ####"
--
2.34.1
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)canonical.com>
[ Upstream commit 6fec1ab67f8d60704cc7de64abcfd389ab131542 ]
The PREEMPT_RT patchset does not use the do_softirq() function, thus trying
to filter for do_softirq fails for such a kernel:
echo do_softirq
ftracetest: 81: echo: echo: I/O error
Choose some other visible function for the test. The function does not
have to be actually executed during the test, because it is only testing
filter API interface.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)canonical.com>
Reviewed-by: Shuah Khan <skhan(a)linuxfoundation.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Reviewed-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
.../selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc b/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc
index e96e279e0533a..25432b8cd5bd2 100644
--- a/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc
+++ b/tools/testing/selftests/ftrace/test.d/ftrace/func_set_ftrace_file.tc
@@ -19,7 +19,7 @@ fail() { # mesg
FILTER=set_ftrace_filter
FUNC1="schedule"
-FUNC2="do_softirq"
+FUNC2="scheduler_tick"
ALL_FUNCS="#### all functions enabled ####"
--
2.34.1
From: Sherry Yang <sherry.yang(a)oracle.com>
[ Upstream commit 21bffcb76ee2fbafc7d5946cef10abc9df5cfff7 ]
seccomp_bpf failed on tests 47 global.user_notification_filter_empty
and 48 global.user_notification_filter_empty_threaded when it's
tested on an updated kernel but with old kernel headers, because old
kernel headers don't have a definition of the macro __NR_clone3, which is
required for these two tests. Since under selftests/, we can install
headers once for all tests (the default INSTALL_HDR_PATH is
usr/include), fix it by adding usr/include to the list of directories
to be searched. Use "-isystem" to indicate it's a system directory as
the real kernel headers directories are.
Signed-off-by: Sherry Yang <sherry.yang(a)oracle.com>
Tested-by: Sherry Yang <sherry.yang(a)oracle.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/seccomp/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/seccomp/Makefile b/tools/testing/selftests/seccomp/Makefile
index 0ebfe8b0e147f..585f7a0c10cbe 100644
--- a/tools/testing/selftests/seccomp/Makefile
+++ b/tools/testing/selftests/seccomp/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-CFLAGS += -Wl,-no-as-needed -Wall
+CFLAGS += -Wl,-no-as-needed -Wall -isystem ../../../../usr/include/
LDFLAGS += -lpthread
TEST_GEN_PROGS := seccomp_bpf seccomp_benchmark
--
2.34.1