This patch series extends the previously add __ksym externs with btf
info.
Right now the __ksym externs are treated as pure 64-bit scalar value.
Libbpf replaces ld_imm64 insn of __ksym by its kernel address at load
time. This patch series extend those extern with their btf info. Note
that btf support for __ksym must come with the btf that has VARs encoded
to work properly. Therefore, these patches are tested against a btf
generated by a patched pahole, whose change will available in released
pahole soon.
There are a couple of design choices that I would like feedbacks from
bpf/btf experts.
1. Because the newly added pseudo_btf_id needs to carry both a kernel
address (64 bits) and a btf id (32 bits), I used the 'off' fields
of ld_imm insn to carry btf id. I wonder if this breaks anything or
if there is a better idea.
2. Since only a subset of vars are going to be encoded into the new
btf, if a ksym that doesn't find its btf id, it doesn't get
converted into pseudo_btf_id. It is still treated as pure scalar
value. But we require kernel btf to be loaded in libbpf if there is
any ksym in the bpf prog.
This is RFC as it requires pahole changes that encode kernel vars into
btf.
Hao Luo (2):
bpf: BTF support for __ksym externs
selftests/bpf: Test __ksym externs with BTF
include/uapi/linux/bpf.h | 37 ++++++++++----
kernel/bpf/verifier.c | 26 ++++++++--
tools/include/uapi/linux/bpf.h | 37 ++++++++++----
tools/lib/bpf/libbpf.c | 50 ++++++++++++++++++-
.../testing/selftests/bpf/prog_tests/ksyms.c | 2 +
.../testing/selftests/bpf/progs/test_ksyms.c | 14 ++++++
6 files changed, 143 insertions(+), 23 deletions(-)
--
2.27.0.389.gc38d7665816-goog
v1: https://lkml.org/lkml/2020/7/7/1036
Changelog v1 --> v2
1. Based on Shuah Khan's comment, changed exit code to ksft_skip to
indicate the test is being skipped
2. Change the busy workload for baseline measurement from
"yes > /dev/null" to "cat /dev/random to /dev/null", based on
observed CPU utilization for "yes" consuming ~60% CPU while the
latter consumes 100% of CPUs, giving more accurate baseline numbers
---
The patch series introduces a mechanism to measure wakeup latency for
IPI and timer based interrupts
The motivation behind this series is to find significant deviations
behind advertised latency and resisdency values
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/latency_test/ for,
1. IPI test:
ipi_cpu_dest # Destination CPU for the IPI
ipi_cpu_src # Origin of the IPI
ipi_latency_ns # Measured latency time in ns
2. Timeout test:
timeout_cpu_src # CPU on which the timer to be queued
timeout_expected_ns # Timer duration
timeout_diff_ns # Difference of actual duration vs expected timer
To include the module, check option and include as module
kernel hacking -> Cpuidle latency selftests
The selftest inserts the module, disables all the idle states and
enables them one by one testing the following:
1. Keeping source CPU constant, iterates through all the CPUS measuring
IPI latency for baseline (CPU is busy with
"cat /dev/random > /dev/null" workload) and the when the CPU is
allowed to be at rest
2. Iterating through all the CPUs, sending expected timer durations to
be equivalent to the residency of the the deepest idle state
enabled and extracting the difference in time between the time of
wakeup and the expected timer duration
Usage
-----
Can be used in conjuction to the rest of the selftests.
Default Output location in: tools/testing/cpuidle/cpuidle.log
To run this test specifically:
$ make -C tools/testing/selftests TARGETS="cpuidle" run_tests
There are a few optinal arguments too that the script can take
[-h <help>]
[-m <location of the module>]
[-o <location of the output>]
Sample output snippet
---------------------
--IPI Latency Test---
--Baseline IPI Latency measurement: CPU Busy--
SRC_CPU DEST_CPU IPI_Latency(ns)
...
0 8 1996
0 9 2125
0 10 1264
0 11 1788
0 12 2045
Baseline Average IPI latency(ns): 1843
---Enabling state: 5---
SRC_CPU DEST_CPU IPI_Latency(ns)
0 8 621719
0 9 624752
0 10 622218
0 11 623968
0 12 621303
Expected IPI latency(ns): 100000
Observed Average IPI latency(ns): 622792
--Timeout Latency Test--
--Baseline Timeout Latency measurement: CPU Busy--
Wakeup_src Baseline_delay(ns)
...
8 2249
9 2226
10 2211
11 2183
12 2263
Baseline Average timeout diff(ns): 2226
---Enabling state: 5---
8 10749
9 10911
10 10912
11 12100
12 73276
Expected timeout(ns): 10000200
Observed Average timeout diff(ns): 23589
Pratik Rajesh Sampat (2):
cpuidle: Trace IPI based and timer based wakeup latency from idle
states
selftest/cpuidle: Add support for cpuidle latency measurement
drivers/cpuidle/Makefile | 1 +
drivers/cpuidle/test-cpuidle_latency.c | 150 ++++++++++++
lib/Kconfig.debug | 10 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/cpuidle/Makefile | 6 +
tools/testing/selftests/cpuidle/cpuidle.sh | 257 +++++++++++++++++++++
tools/testing/selftests/cpuidle/settings | 1 +
7 files changed, 426 insertions(+)
create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
create mode 100644 tools/testing/selftests/cpuidle/Makefile
create mode 100755 tools/testing/selftests/cpuidle/cpuidle.sh
create mode 100644 tools/testing/selftests/cpuidle/settings
--
2.25.4
The goal for this series is to avoid device private memory TLB
invalidations when migrating a range of addresses from system
memory to device private memory and some of those pages have already
been migrated. The approach taken is to introduce a new mmu notifier
invalidation event type and use that in the device driver to skip
invalidation callbacks from migrate_vma_setup(). The device driver is
also then expected to handle device MMU invalidations as part of the
migrate_vma_setup(), migrate_vma_pages(), migrate_vma_finalize() process.
Note that this is opt-in. A device driver can simply invalidate its MMU
in the mmu notifier callback and not handle MMU invalidations in the
migration sequence.
This series is based on Jason Gunthorpe's HMM tree (linux-5.8.0-rc4).
Also, this replaces the need for the following two patches I sent:
("mm: fix migrate_vma_setup() src_owner and normal pages")
https://lore.kernel.org/linux-mm/20200622222008.9971-1-rcampbell@nvidia.com
("nouveau: fix mixed normal and device private page migration")
https://lore.kernel.org/lkml/20200622233854.10889-3-rcampbell@nvidia.com
Changes in v2:
Rebase to Jason Gunthorpe's HMM tree.
Added reviewed-by from Bharata B Rao.
Rename the mmu_notifier_range::data field to migrate_pgmap_owner as
suggested by Jason Gunthorpe.
Ralph Campbell (5):
nouveau: fix storing invalid ptes
mm/migrate: add a direction parameter to migrate_vma
mm/notifier: add migration invalidation type
nouveau/svm: use the new migration invalidation
mm/hmm/test: use the new migration invalidation
arch/powerpc/kvm/book3s_hv_uvmem.c | 2 ++
drivers/gpu/drm/nouveau/nouveau_dmem.c | 13 ++++++--
drivers/gpu/drm/nouveau/nouveau_svm.c | 10 +++++-
drivers/gpu/drm/nouveau/nouveau_svm.h | 1 +
.../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 13 +++++---
include/linux/migrate.h | 12 +++++--
include/linux/mmu_notifier.h | 7 ++++
lib/test_hmm.c | 33 +++++++++++--------
mm/migrate.c | 13 ++++++--
9 files changed, 77 insertions(+), 27 deletions(-)
--
2.20.1
Currently, KUnit does not allow the use of tests as a module.
This prevents the implementation of tests that require userspace.
This patchset makes this possible by introducing the use of
the root filesystem in KUnit. And it allows the use of tests
that can be compiled as a module
Vitor Massaru Iha (3):
kunit: tool: Add support root filesystem in kunit-tool
lib: Allows to borrow mm in userspace on KUnit
lib: Convert test_user_copy to KUnit test
include/kunit/test.h | 1 +
lib/Kconfig.debug | 17 ++
lib/Makefile | 2 +-
lib/kunit/try-catch.c | 15 +-
lib/{test_user_copy.c => user_copy_kunit.c} | 196 +++++++++-----------
tools/testing/kunit/kunit.py | 37 +++-
tools/testing/kunit/kunit_kernel.py | 105 +++++++++--
7 files changed, 238 insertions(+), 135 deletions(-)
rename lib/{test_user_copy.c => user_copy_kunit.c} (55%)
base-commit: 725aca9585956676687c4cb803e88f770b0df2b2
prerequisite-patch-id: 582b6d9d28ce4b71628890ec832df6522ca68de0
--
2.26.2
This fixes the way the Authority Mask Register (AMR) is updated
by the existing pkey tests and adds a new test to verify the
functionality of execute-disabled pkeys.
Previous versions can be found at:
v2: https://lore.kernel.org/linuxppc-dev/20200527030342.13712-1-sandipan@linux.…
v1: https://lore.kernel.org/linuxppc-dev/20200508162332.65316-1-sandipan@linux.…
Changes in v3:
- Fixed AMR writes for existing pkey tests (new patch).
- Moved Hash MMU check under utilities (new patch) and removed duplicate
code.
- Fixed comments on why the pkey permission bits were redefined.
- Switched to existing mfspr() macro for reading AMR.
- Switched to sig_atomic_t as data type for variables updated in the
signal handlers.
- Switched to exit()-ing if the signal handlers come across an unexpected
condition instead of trying to reset page and pkey permissions.
- Switched to write() from printf() for printing error messages from
the signal handlers.
- Switched to getpagesize().
- Renamed fault counter to denote remaining faults.
- Dropped unnecessary randomization for choosing an address to fault at.
- Added additional information on change in permissions due to AMR and
IAMR bits in comments.
- Switched the first instruction word of the executable region to a trap
to test if it is actually overwritten by a no-op later.
- Added an new test scenario where the pkey imposes no restrictions and
an attempt is made to jump to the executable region again.
Changes in v2:
- Added .gitignore entry for test binary.
- Fixed builds for older distros where siginfo_t might not have si_pkey as
a formal member based on discussion with Michael.
Sandipan Das (3):
selftests: powerpc: Fix pkey access right updates
selftests: powerpc: Move Hash MMU check to utilities
selftests: powerpc: Add test for execute-disabled pkeys
tools/testing/selftests/powerpc/include/reg.h | 6 +
.../testing/selftests/powerpc/include/utils.h | 1 +
tools/testing/selftests/powerpc/mm/.gitignore | 1 +
tools/testing/selftests/powerpc/mm/Makefile | 5 +-
.../selftests/powerpc/mm/bad_accesses.c | 28 --
.../selftests/powerpc/mm/pkey_exec_prot.c | 388 ++++++++++++++++++
.../selftests/powerpc/ptrace/core-pkey.c | 2 +-
.../selftests/powerpc/ptrace/ptrace-pkey.c | 2 +-
tools/testing/selftests/powerpc/utils.c | 28 ++
9 files changed, 429 insertions(+), 32 deletions(-)
create mode 100644 tools/testing/selftests/powerpc/mm/pkey_exec_prot.c
--
2.25.1
On Thu, Jul 09, 2020 at 09:27:43AM -0700, Andy Lutomirski wrote:
> On Thu, Jul 9, 2020 at 9:22 AM Dave Hansen <dave.hansen(a)intel.com> wrote:
> >
> > On 7/9/20 9:07 AM, Andy Lutomirski wrote:
> > > On Thu, Jul 9, 2020 at 8:56 AM Dave Hansen <dave.hansen(a)intel.com> wrote:
> > >> On 7/9/20 8:44 AM, Andersen, John wrote:
> > >>> Bits which are allowed to be pinned default to WP for CR0 and SMEP,
> > >>> SMAP, and UMIP for CR4.
> > >> I think it also makes sense to have FSGSBASE in this set.
> > >>
> > >> I know it hasn't been tested, but I think we should do the legwork to
> > >> test it. If not in this set, can we agree that it's a logical next step?
> > > I have no objection to pinning FSGSBASE, but is there a clear
> > > description of the threat model that this whole series is meant to
> > > address? The idea is to provide a degree of protection against an
> > > attacker who is able to convince a guest kernel to write something
> > > inappropriate to CR4, right? How realistic is this?
> >
> > If a quick search can find this:
> >
> > > https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-…
> >
> > I'd pretty confident that the guys doing actual bad things have it in
> > their toolbox too.
> >
>
> True, but we have the existing software CR4 pinning. I suppose the
> virtualization version is stronger.
>
Yes, as Kees said this will be stronger because it stops ROP and other gadget
based techniques which avoid the use of native_write_cr0/4().
With regards to what should be done in this patchset and what in other
patchsets. I have a fix for kexec thanks to Arvind's note about
TRAMPOLINE_32BIT_CODE_SIZE. The physical host boots fine now and the virtual
one can kexec fine.
What remains to be done on that front is to add some identifying information to
the kernel image to declare that it supports paravirtualized control register
pinning or not.
Liran suggested adding a section to the built image acting as a flag to signify
support for being kexec'd by a kernel with pinning enabled. If anyone has any
opinions on how they'd like to see this implemented please let me know.
Otherwise I'll just take a stab at it and you'll all see it hopefully in the
next version.
With regards to FSGSBASE, are we open to validating and adding that to the
DEFAULT set as a part of a separate patchset? This patchset is focused on
replicating the functionality we already have natively.
(If anyone got this email twice, sorry I messed up the From: field the first
time around)