Dzień dobry,
chciałbym poinformować Państwa o możliwości pozyskania nowych zleceń ze strony www.
Widzimy zainteresowanie potencjalnych Klientów Państwa firmą, dlatego chętnie pomożemy Państwu dotrzeć z ofertą do większego grona odbiorców poprzez efektywne metody pozycjonowania strony w Google.
Czy mógłbym liczyć na kontakt zwrotny?
Pozdrawiam,
Mikołaj Rudzik
This patch series is preparation to fix the problems when execute
the multiple_kprobes.tc selftest on mips, some more work needs to
be done.
Tiezhu Yang (2):
selftests/ftrace: Save kprobe_events to test log
MIPS: Use NOKPROBE_SYMBOL() instead of __kprobes annotation
arch/mips/kernel/kprobes.c | 45 +++++++++++++++-------
arch/mips/mm/fault.c | 6 ++-
.../ftrace/test.d/kprobe/multiple_kprobes.tc | 2 +
3 files changed, 38 insertions(+), 15 deletions(-)
--
2.1.0
H e l l o,
I lead family investment vehicles who want to invest a proportion of their funds with a trust party .
Please are you interested in discussing investment in your sector?
Please email, or simply write to me here: allen.large(a)cheapnet.it I value promptness and will make every attempt to respond within a short time.
Thank you.
Allen S.
From: Frank Rowand <frank.rowand(a)sony.com>
The process to create version 2 of the KTAP Specification is documented
in email discussions. I am attempting to capture this information at
https://elinux.org/Test_Results_Format_Notes#KTAP_version_2
I am already not following the suggested process, which says:
"...please try to follow this principal of one major topic per email
thread." I think that is ok in this case because the two patches
are related and (hopefully) not controversial.
Changes since patch version 1:
- drop patch 1/2. Jonathan Corbet has already applied this patch
into version 1 of the Specification
- rename patch 2/2 to patch 1/2, with updated patch comment
- add new patch 2/2
Frank Rowand (2):
ktap_v2: change version to 2-rc in KTAP specification
ktap_v2: change "version 1" to "version 2" in examples
Documentation/dev-tools/ktap.rst | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
--
Frank Rowand <frank.rowand(a)sony.com>
When writing tests, it'd often be very useful to be able to intercept
calls to a function in the code being tested and replace it with a
test-specific stub. This has always been an obviously missing piece of
KUnit, and the solutions always involve some tradeoffs with cleanliness,
performance, or impact on non-test code. See the folowing document for
some of the challenges:
https://kunit.dev/mocking.html
This series consists of two prototype patches which add support for this
sort of redirection to KUnit tests:
1: static_stub: Any function which might want to be intercepted adds a
call to a macro which checks if a test has redirected calls to it, and
calls the corresponding replacement.
2: ftrace_stub: Functions are intercepted using ftrace and livepatch.
This doesn't require adding a new prologue to each function being
replaced, but does have more dependencies (which restricts it to a small
number of architectures, not including UML), and doesn't work well with
inline functions.
The API for both implementations is very similar, so it should be easy
to migrate from one to the other if necessary. Both of these
implementations restrict the redirection to the test context: it is
automatically undone after the KUnit test completes, and does not affect
calls in other threads. If CONFIG_KUNIT is not enabled, there should be
no overhead in either implementation.
Does either (or both) of these features sound useful, and is this
sort-of API the right model? (Personally, I think there's a reasonable
scope for both.) Is anything obviously missing or wrong? Do the names,
descriptions etc. make any sense?
Note that these patches are definitely still at the "prototype" level,
and things like error-handling, documentation, and testing are still
pretty sparse. There is also quite a bit of room for optimisation.
These'll all be improved for v1 if the concept seems good.
Cheers,
-- David
Daniel Latypov (1):
kunit: expose ftrace-based API for stubbing out functions during tests
David Gow (1):
kunit: Expose 'static stub' API to redirect functions
include/kunit/ftrace_stub.h | 84 +++++++++++++++++
include/kunit/static_stub.h | 106 +++++++++++++++++++++
lib/kunit/Kconfig | 11 +++
lib/kunit/Makefile | 5 +
lib/kunit/ftrace_stub.c | 138 ++++++++++++++++++++++++++++
lib/kunit/kunit-example-test.c | 64 +++++++++++++
lib/kunit/static_stub.c | 125 +++++++++++++++++++++++++
lib/kunit/stubs_example.kunitconfig | 11 +++
8 files changed, 544 insertions(+)
create mode 100644 include/kunit/ftrace_stub.h
create mode 100644 include/kunit/static_stub.h
create mode 100644 lib/kunit/ftrace_stub.c
create mode 100644 lib/kunit/static_stub.c
create mode 100644 lib/kunit/stubs_example.kunitconfig
--
2.35.1.894.gb6a874cedc-goog
Currently the arm64 kselftests attempt to locate the ABI headers using
custom logic which doesn't work correctly in the case of out of tree builds
if KBUILD_OUTPUT is not specified. Since lib.mk defines KHDR_INCLUDES with
the appropriate flags we can simply remove the custom logic and use that
instead.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
This fix is required to get us able to run the arm64 kselftests
in KernelCI, it does out of tree kselftest builds triggering the
issue especially in conjunction with the addition of the new
definitions for SME.
tools/testing/selftests/arm64/Makefile | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/tools/testing/selftests/arm64/Makefile b/tools/testing/selftests/arm64/Makefile
index 1e8d9a8f59df..9460cbe81bcc 100644
--- a/tools/testing/selftests/arm64/Makefile
+++ b/tools/testing/selftests/arm64/Makefile
@@ -17,16 +17,7 @@ top_srcdir = $(realpath ../../../../)
# Additional include paths needed by kselftest.h and local headers
CFLAGS += -I$(top_srcdir)/tools/testing/selftests/
-# Guessing where the Kernel headers could have been installed
-# depending on ENV config
-ifeq ($(KBUILD_OUTPUT),)
-khdr_dir = $(top_srcdir)/usr/include
-else
-# the KSFT preferred location when KBUILD_OUTPUT is set
-khdr_dir = $(KBUILD_OUTPUT)/kselftest/usr/include
-endif
-
-CFLAGS += -I$(khdr_dir)
+CFLAGS += $(KHDR_INCLUDES)
export CFLAGS
export top_srcdir
--
2.30.2
There is a spelling mistake in an error message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 29c973f606b2..136df5b76319 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -4320,7 +4320,7 @@ static ssize_t get_nth(struct __test_metadata *_metadata, const char *path,
f = fopen(path, "r");
ASSERT_NE(f, NULL) {
- TH_LOG("Coud not open %s: %s", path, strerror(errno));
+ TH_LOG("Could not open %s: %s", path, strerror(errno));
}
for (i = 0; i < position; i++) {
--
2.35.1
v10:
- Relax constraints for changes made to "cpuset.cpus"
and "cpuset.cpus.partition" as suggested. Now almost all changes
are allowed.
v9:
- Add a new patch 1 to remove the child cpuset restriction on parent's
"cpuset.cpus".
- Relax initial root partition entry limitation to allow cpuset.cpus to
overlap that of parent's.
- An "isolated invalid" displayed type is added to
cpuset.cpus.partition.
- Resetting partition root to "member" will leave child partition root
as invalid.
- Update documentation and test accordingly.
v8:
- Reorganize the patch series and rationalize the features and
constraints of a partition.
- Update patch descriptions and documentation accordingly.
This patchset include the following enhancements to the cpuset v2
partition code.
1) Allow partitions that have no task to have empty effective cpus.
2) Relax the constraints on what changes are allowed in cpuset.cpus
and cpuset.cpus.partition. However, the partition remain invalid
until the constraints of a valid partition root is satisfied.
3) Add a new "isolated" partition type for partitions with no load
balancing which is available in v1 but not yet in v2.
4) Allow the reading of cpuset.cpus.partition to include a reason
string as to why the partition remain invalid.
In addition, the cgroup-v2.rst documentation file is updated and
a self test is added to verify the correctness the partition code.
Waiman Long (8):
cgroup/cpuset: Add top_cpuset check in update_tasks_cpumask()
cgroup/cpuset: Miscellaneous cleanups & add helper functions
cgroup/cpuset: Allow no-task partition to have empty
cpuset.cpus.effective
cgroup/cpuset: Relax constraints to partition & cpus changes
cgroup/cpuset: Add a new isolated cpus.partition type
cgroup/cpuset: Show invalid partition reason string
cgroup/cpuset: Update description of cpuset.cpus.partition in
cgroup-v2.rst
kselftest/cgroup: Add cpuset v2 partition root state test
Documentation/admin-guide/cgroup-v2.rst | 145 ++--
kernel/cgroup/cpuset.c | 712 +++++++++++-------
tools/testing/selftests/cgroup/Makefile | 5 +-
.../selftests/cgroup/test_cpuset_prs.sh | 674 +++++++++++++++++
tools/testing/selftests/cgroup/wait_inotify.c | 87 +++
5 files changed, 1295 insertions(+), 328 deletions(-)
create mode 100755 tools/testing/selftests/cgroup/test_cpuset_prs.sh
create mode 100644 tools/testing/selftests/cgroup/wait_inotify.c
--
2.27.0
This patch series adds a memory.reclaim proactive reclaim interface.
The rationale behind the interface and how it works are in the first
patch.
---
Changes in V5:
- Fixed comment formating and added Co-developed-by in patch 1.
- Modified selftest to work if swap is enabled or not, and retry
multiple times to wait for background allocation before failing
with a clear message.
Changes in V4:
mm/memcontrol.c:
- Return -EINTR on signal_pending().
- On the final retry, drain percpu lru caches hoping that it might
introduce some evictable pages for reclaim.
- Simplified the retry loop as suggested by Dan Schatzberg.
selftests:
- Always return -errno on failure from cg_write() (whether open() or
write() fail), also update cg_read() and read_text() to return -errno
as well for consistency. Also make sure to correctly check that the
whole buffer was written in cg_write().
- Added a maximum number of retries for the reclaim selftest.
Changes in V3:
- Fix cg_write() (in patch 2) to properly return -1 if open() fails
and not fail if len == errno.
- Remove debug printf() in patch 3.
Changes in V2:
- Add the interface to root as well.
- Added a selftest.
- Documented the interface as a nested-keyed interface, which makes
adding optional arguments in the future easier (see doc updates in the
first patch).
- Modified the commit message to reflect changes and added a timeout
argument as a suggested possible extension
- Return -EAGAIN if the kernel fails to reclaim the full requested
amount.
---
Shakeel Butt (1):
memcg: introduce per-memcg reclaim interface
Yosry Ahmed (3):
selftests: cgroup: return -errno from cg_read()/cg_write() on failure
selftests: cgroup: fix alloc_anon_noexit() instantly freeing memory
selftests: cgroup: add a selftest for memory.reclaim
Documentation/admin-guide/cgroup-v2.rst | 21 ++++
mm/memcontrol.c | 45 +++++++
tools/testing/selftests/cgroup/cgroup_util.c | 44 +++----
.../selftests/cgroup/test_memcontrol.c | 114 +++++++++++++++++-
4 files changed, 197 insertions(+), 27 deletions(-)
--
2.36.0.rc2.479.g8af0fa9b8e-goog
The first patch of this series is a documentation fix.
The second patch allows BPF helpers to accept memory regions of fixed
size without doing runtime size checks.
The two next patches add new functionality that allows XDP to
accelerate iptables synproxy.
v1 of this series [1] used to include a patch that exposed conntrack
lookup to BPF using stable helpers. It was superseded by series [2] by
Kumar Kartikeya Dwivedi, which implements this functionality using
unstable helpers.
The third patch adds new helpers to issue and check SYN cookies without
binding to a socket, which is useful in the synproxy scenario.
The fourth patch adds a selftest, which includes an XDP program and a
userspace control application. The XDP program uses socketless SYN
cookie helpers and queries conntrack status instead of socket status.
The userspace control application allows to tune parameters of the XDP
program. This program also serves as a minimal example of usage of the
new functionality.
The last patch exposes the new helpers to TC BPF.
The draft of the new functionality was presented on Netdev 0x15 [3].
v2 changes:
Split into two series, submitted bugfixes to bpf, dropped the conntrack
patches, implemented the timestamp cookie in BPF using bpf_loop, dropped
the timestamp cookie patch.
v3 changes:
Moved some patches from bpf to bpf-next, dropped the patch that changed
error codes, split the new helpers into IPv4/IPv6, added verifier
functionality to accept memory regions of fixed size.
v4 changes:
Converted the selftest to the test_progs runner. Replaced some
deprecated functions in xdp_synproxy userspace helper.
v5 changes:
Fixed a bug in the selftest. Added questionable functionality to support
new helpers in TC BPF, added selftests for it.
v6 changes:
Wrap the new helpers themselves into #ifdef CONFIG_SYN_COOKIES, replaced
fclose with pclose and fixed the MSS for IPv6 in the selftest.
v7 changes:
Fixed the off-by-one error in indices, changed the section name to
"xdp", added missing kernel config options to vmtest in CI.
v8 changes:
Properly rebased, dropped the first patch (the same change was applied
by someone else), updated the cover letter.
[1]: https://lore.kernel.org/bpf/20211020095815.GJ28644@breakpoint.cc/t/
[2]: https://lore.kernel.org/bpf/20220114163953.1455836-1-memxor@gmail.com/
[3]: https://netdevconf.info/0x15/session.html?Accelerating-synproxy-with-XDP
Maxim Mikityanskiy (5):
bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie
bpf: Allow helpers to accept pointers with a fixed size
bpf: Add helpers to issue and check SYN cookies in XDP
bpf: Add selftests for raw syncookie helpers
bpf: Allow the new syncookie helpers to work with SKBs
include/linux/bpf.h | 10 +
include/net/tcp.h | 1 +
include/uapi/linux/bpf.h | 88 +-
kernel/bpf/verifier.c | 26 +-
net/core/filter.c | 128 +++
net/ipv4/tcp_input.c | 3 +-
scripts/bpf_doc.py | 4 +
tools/include/uapi/linux/bpf.h | 88 +-
tools/testing/selftests/bpf/.gitignore | 1 +
tools/testing/selftests/bpf/Makefile | 2 +-
.../selftests/bpf/prog_tests/xdp_synproxy.c | 144 +++
.../selftests/bpf/progs/xdp_synproxy_kern.c | 819 ++++++++++++++++++
tools/testing/selftests/bpf/xdp_synproxy.c | 466 ++++++++++
13 files changed, 1759 insertions(+), 21 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_synproxy.c
create mode 100644 tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
create mode 100644 tools/testing/selftests/bpf/xdp_synproxy.c
--
2.30.2
Dear linux-kselftest
We are interested in having some of your hot selling product in
our stores and outlets spread all over United Kingdom, Northern
Island and Africa. ASDA Stores Limited is one of the highest-
ranking Wholesale & Retail outlets in the United Kingdom.
We shall furnish our detailed company profile in our next
correspondent. However, it would be appreciated if you can send
us your catalog through email to learn more about your company's
products and wholesale quote. It is hopeful that we can start a
viable long-lasting business relationship (partnership) with you.
Your prompt response would be delightfully appreciated.
Best Wishes
Hanes S. Thomas
Procurement Office.
ASDA Stores Limited
Tel: + 44 - 7451271650
WhatsApp: + 44 – 7441440360
Website: www.asda.co.uk
This splits up the get_proc_stat function to make it so we can use it as a
generic helper to read the nth field from multiple different files, versus
replicating the logic in multiple places.
Signed-off-by: Sargun Dhillon <sargun(a)sargun.me>
Cc: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/seccomp/seccomp_bpf.c | 54 +++++++++++++------
1 file changed, 38 insertions(+), 16 deletions(-)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index ab340c4759a3..4fb5eda89223 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -4231,32 +4231,54 @@ TEST(user_notification_addfd_rlimit)
close(memfd);
}
-static char get_proc_stat(int pid)
+/*
+ * gen_nth - Get the nth, space separated entry in a file.
+ *
+ * Returns the length of the read field.
+ * Throws error if field is zero-lengthed.
+ */
+static ssize_t get_nth(struct __test_metadata *_metadata, const char *path,
+ const unsigned int position, char **entry)
{
- char proc_path[100] = {0};
char *line = NULL;
- size_t len = 0;
+ unsigned int i;
ssize_t nread;
- char status;
+ size_t len = 0;
FILE *f;
- int i;
- snprintf(proc_path, sizeof(proc_path), "/proc/%d/stat", pid);
- f = fopen(proc_path, "r");
- if (f == NULL)
- ksft_exit_fail_msg("%s - Could not open %s\n",
- strerror(errno), proc_path);
+ f = fopen(path, "r");
+ ASSERT_NE(f, NULL) {
+ TH_LOG("Coud not open %s: %s", path, strerror(errno));
+ }
- for (i = 0; i < 3; i++) {
+ for (i = 0; i < position; i++) {
nread = getdelim(&line, &len, ' ', f);
- if (nread <= 0)
- ksft_exit_fail_msg("Failed to read status: %s\n",
- strerror(errno));
+ ASSERT_GE(nread, 0) {
+ TH_LOG("Failed to read %d entry in file %s", i, path);
+ }
}
+ fclose(f);
+
+ ASSERT_GT(nread, 0) {
+ TH_LOG("Entry in file %s had zero length", path);
+ }
+
+ *entry = line;
+ return nread - 1;
+}
+
+/* For a given PID, get the task state (D, R, etc...) */
+static char get_proc_stat(struct __test_metadata *_metadata, pid_t pid)
+{
+ char proc_path[100] = {0};
+ char status;
+ char *line;
+
+ snprintf(proc_path, sizeof(proc_path), "/proc/%d/stat", pid);
+ ASSERT_EQ(get_nth(_metadata, proc_path, 3, &line), 1);
status = *line;
free(line);
- fclose(f);
return status;
}
@@ -4317,7 +4339,7 @@ TEST(user_notification_fifo)
/* This spins until all of the children are sleeping */
restart_wait:
for (i = 0; i < ARRAY_SIZE(pids); i++) {
- if (get_proc_stat(pids[i]) != 'S') {
+ if (get_proc_stat(_metadata, pids[i]) != 'S') {
nanosleep(&delay, NULL);
goto restart_wait;
}
--
2.25.1
This patch series revisits the proposal for a GPU cgroup controller to
track and limit memory allocations by various device/allocator
subsystems. The patch series also contains a simple prototype to
illustrate how Android intends to implement DMA-BUF allocator
attribution using the GPU cgroup controller. The prototype does not
include resource limit enforcements.
Changelog:
v6:
Move documentation into cgroup-v2.rst per Tejun Heo.
Rename BINDER_FD{A}_FLAG_SENDER_NO_NEED ->
BINDER_FD{A}_FLAG_XFER_CHARGE per Carlos Llamas.
Return error on transfer failure per Carlos Llamas.
v5:
Rebase on top of v5.18-rc3
Drop the global GPU cgroup "total" (sum of all device totals) portion
of the design since there is no currently known use for this per
Tejun Heo.
Fix commit message which still contained the old name for
dma_buf_transfer_charge per Michal Koutný.
Remove all GPU cgroup code except what's necessary to support charge transfer
from dma_buf. Previously charging was done in export, but for non-Android
graphics use-cases this is not ideal since there may be a delay between
allocation and export, during which time there is no accounting.
Merge dmabuf: Use the GPU cgroup charge/uncharge APIs patch into
dmabuf: heaps: export system_heap buffers with GPU cgroup charging as a
result of above.
Put the charge and uncharge code in the same file (system_heap_allocate,
system_heap_dma_buf_release) instead of splitting them between the heap and
the dma_buf_release. This avoids asymmetric management of the gpucg charges.
Modify the dma_buf_transfer_charge API to accept a task_struct instead
of a gpucg. This avoids requiring the caller to manage the refcount
of the gpucg upon failure and confusing ownership transfer logic.
Support all strings for gpucg_register_bucket instead of just string
literals.
Enforce globally unique gpucg_bucket names.
Constrain gpucg_bucket name lengths to 64 bytes.
Append "-heap" to gpucg_bucket names from dmabuf-heaps.
Drop patch 7 from the series, which changed the types of
binder_transaction_data's sender_pid and sender_euid fields. This was
done in another commit here:
https://lore.kernel.org/all/20220210021129.3386083-4-masahiroy@kernel.org/
Rename:
gpucg_try_charge -> gpucg_charge
find_cg_rpool_locked -> cg_rpool_find_locked
init_cg_rpool -> cg_rpool_init
get_cg_rpool_locked -> cg_rpool_get_locked
"gpu cgroup controller" -> "GPU controller"
gpucg_device -> gpucg_bucket
usage -> size
Tests:
Support both binder_fd_array_object and binder_fd_object. This is
necessary because new versions of Android will use binder_fd_object
instead of binder_fd_array_object, and we need to support both.
Tests for both binder_fd_array_object and binder_fd_object.
For binder_utils return error codes instead of
struct binder{fs}_ctx.
Use ifdef __ANDROID__ to choose platform-dependent temp path instead
of a runtime fallback.
Ensure binderfs_mntpt ends with a trailing '/' character instead of
prepending it where used.
v4:
Skip test if not run as root per Shuah Khan
Add better test logging for abnormal child termination per Shuah Khan
Adjust ordering of charge/uncharge during transfer to avoid potentially
hitting cgroup limit per Michal Koutný
Adjust gpucg_try_charge critical section for charge transfer functionality
Fix uninitialized return code error for dmabuf_try_charge error case
v3:
Remove Upstreaming Plan from gpu-cgroup.rst per John Stultz
Use more common dual author commit message format per John Stultz
Remove android from binder changes title per Todd Kjos
Add a kselftest for this new behavior per Greg Kroah-Hartman
Include details on behavior for all combinations of kernel/userspace
versions in changelog (thanks Suren Baghdasaryan) per Greg Kroah-Hartman.
Fix pid and uid types in binder UAPI header
v2:
See the previous revision of this change submitted by Hridya Valsaraju
at: https://lore.kernel.org/all/20220115010622.3185921-1-hridya@google.com/
Move dma-buf cgroup charge transfer from a dma_buf_op defined by every
heap to a single dma-buf function for all heaps per Daniel Vetter and
Christian König. Pointers to struct gpucg and struct gpucg_device
tracking the current associations were added to the dma_buf struct to
achieve this.
Fix incorrect Kconfig help section indentation per Randy Dunlap.
History of the GPU cgroup controller
====================================
The GPU/DRM cgroup controller came into being when a consensus[1]
was reached that the resources it tracked were unsuitable to be integrated
into memcg. Originally, the proposed controller was specific to the DRM
subsystem and was intended to track GEM buffers and GPU-specific
resources[2]. In order to help establish a unified memory accounting model
for all GPU and all related subsystems, Daniel Vetter put forth a
suggestion to move it out of the DRM subsystem so that it can be used by
other DMA-BUF exporters as well[3]. This RFC proposes an interface that
does the same.
[1]: https://patchwork.kernel.org/project/dri-devel/cover/20190501140438.9506-1-…
[2]: https://lore.kernel.org/amd-gfx/20210126214626.16260-1-brian.welty@intel.co…
[3]: https://lore.kernel.org/amd-gfx/YCVOl8%2F87bqRSQei@phenom.ffwll.local/
Hridya Valsaraju (3):
gpu: rfc: Proposal for a GPU cgroup controller
cgroup: gpu: Add a cgroup controller for allocator attribution of GPU
memory
binder: Add flags to relinquish ownership of fds
T.J. Mercier (3):
dmabuf: heaps: export system_heap buffers with GPU cgroup charging
dmabuf: Add gpu cgroup charge transfer function
selftests: Add binder cgroup gpu memory transfer tests
Documentation/admin-guide/cgroup-v2.rst | 24 +
drivers/android/binder.c | 31 +-
drivers/dma-buf/dma-buf.c | 80 ++-
drivers/dma-buf/dma-heap.c | 39 ++
drivers/dma-buf/heaps/system_heap.c | 28 +-
include/linux/cgroup_gpu.h | 137 +++++
include/linux/cgroup_subsys.h | 4 +
include/linux/dma-buf.h | 49 +-
include/linux/dma-heap.h | 15 +
include/uapi/linux/android/binder.h | 23 +-
init/Kconfig | 7 +
kernel/cgroup/Makefile | 1 +
kernel/cgroup/gpu.c | 386 +++++++++++++
.../selftests/drivers/android/binder/Makefile | 8 +
.../drivers/android/binder/binder_util.c | 250 +++++++++
.../drivers/android/binder/binder_util.h | 32 ++
.../selftests/drivers/android/binder/config | 4 +
.../binder/test_dmabuf_cgroup_transfer.c | 526 ++++++++++++++++++
18 files changed, 1621 insertions(+), 23 deletions(-)
create mode 100644 include/linux/cgroup_gpu.h
create mode 100644 kernel/cgroup/gpu.c
create mode 100644 tools/testing/selftests/drivers/android/binder/Makefile
create mode 100644 tools/testing/selftests/drivers/android/binder/binder_util.c
create mode 100644 tools/testing/selftests/drivers/android/binder/binder_util.h
create mode 100644 tools/testing/selftests/drivers/android/binder/config
create mode 100644 tools/testing/selftests/drivers/android/binder/test_dmabuf_cgroup_transfer.c
--
2.36.0.464.gb9c8b46e94-goog
If a memop fails due to key checked protection, after already having
written to the guest, don't indicate suppression to the guest, as that
would imply that memory wasn't modified.
This could be considered a fix to the code introducing storage key
support, however this is a bug in KVM only if we emulate an
instructions writing to an operand spanning multiple pages, which I
don't believe we do.
v1 -> v2
* Reword commit message of patch 1
Janis Schoetterl-Glausch (2):
KVM: s390: Don't indicate suppression on dirtying, failing memop
KVM: s390: selftest: Test suppression indication on key prot exception
arch/s390/kvm/gaccess.c | 47 ++++++++++++++---------
tools/testing/selftests/kvm/s390x/memop.c | 43 ++++++++++++++++++++-
2 files changed, 70 insertions(+), 20 deletions(-)
base-commit: af2d861d4cd2a4da5137f795ee3509e6f944a25b
--
2.32.0
These names sound more general than they are.
The _end() function increments a `static int kunit_suite_counter`, so it
can only safely be called on suites, aka top-level subtests.
It would need to have a separate counter for each level of subtest to be
generic enough.
So rename it to make it clear it's only appropriate for suites.
Signed-off-by: Daniel Latypov <dlatypov(a)google.com>
Reviewed-by: David Gow <davidgow(a)google.com>
---
v1 -> v2: no change (see patch 2 and 4)
---
lib/kunit/test.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 0f66c13d126e..64ee6a9d8003 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -134,7 +134,7 @@ size_t kunit_suite_num_test_cases(struct kunit_suite *suite)
}
EXPORT_SYMBOL_GPL(kunit_suite_num_test_cases);
-static void kunit_print_subtest_start(struct kunit_suite *suite)
+static void kunit_print_suite_start(struct kunit_suite *suite)
{
kunit_log(KERN_INFO, suite, KUNIT_SUBTEST_INDENT "# Subtest: %s",
suite->name);
@@ -192,7 +192,7 @@ EXPORT_SYMBOL_GPL(kunit_suite_has_succeeded);
static size_t kunit_suite_counter = 1;
-static void kunit_print_subtest_end(struct kunit_suite *suite)
+static void kunit_print_suite_end(struct kunit_suite *suite)
{
kunit_print_ok_not_ok((void *)suite, false,
kunit_suite_has_succeeded(suite),
@@ -498,7 +498,7 @@ int kunit_run_tests(struct kunit_suite *suite)
struct kunit_result_stats suite_stats = { 0 };
struct kunit_result_stats total_stats = { 0 };
- kunit_print_subtest_start(suite);
+ kunit_print_suite_start(suite);
kunit_suite_for_each_test_case(suite, test_case) {
struct kunit test = { .param_value = NULL, .param_index = 0 };
@@ -552,7 +552,7 @@ int kunit_run_tests(struct kunit_suite *suite)
}
kunit_print_suite_stats(suite, suite_stats, total_stats);
- kunit_print_subtest_end(suite);
+ kunit_print_suite_end(suite);
return 0;
}
base-commit: 59729170afcd4900e08997a482467ffda8d88c7f
--
2.36.0.464.gb9c8b46e94-goog
The kvm_binary_stats_test test currently does not have any output (unless
one of the TEST_ASSERT statement fails), so it's hard to say for a user
how far it did proceed already. Thus let's make this a little bit more
user-friendly and include some TAP output via the kselftest.h interface.
Signed-off-by: Thomas Huth <thuth(a)redhat.com>
---
.../testing/selftests/kvm/kvm_binary_stats_test.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index 17f65d514915..aa648834e178 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -19,6 +19,7 @@
#include "kvm_util.h"
#include "asm/kvm.h"
#include "linux/kvm.h"
+#include "kselftest.h"
static void stats_test(int stats_fd)
{
@@ -52,7 +53,7 @@ static void stats_test(int stats_fd)
/* Sanity check for other fields in header */
if (header->num_desc == 0) {
- printf("No KVM stats defined!");
+ ksft_print_msg("No KVM stats defined!\n");
return;
}
/* Check overlap */
@@ -219,12 +220,15 @@ int main(int argc, char *argv[])
max_vcpu = DEFAULT_NUM_VCPU;
}
+ ksft_print_header();
+
/* Check the extension for binary stats */
if (kvm_check_cap(KVM_CAP_BINARY_STATS_FD) <= 0) {
- print_skip("Binary form statistics interface is not supported");
- exit(KSFT_SKIP);
+ ksft_exit_skip("Binary form statistics interface is not supported\n");
}
+ ksft_set_plan(max_vm);
+
/* Create VMs and VCPUs */
vms = malloc(sizeof(vms[0]) * max_vm);
TEST_ASSERT(vms, "Allocate memory for storing VM pointers");
@@ -240,10 +244,12 @@ int main(int argc, char *argv[])
vm_stats_test(vms[i]);
for (j = 0; j < max_vcpu; ++j)
vcpu_stats_test(vms[i], j);
+ ksft_test_result_pass("vm%i\n", i);
}
for (i = 0; i < max_vm; ++i)
kvm_vm_free(vms[i]);
free(vms);
- return 0;
+
+ ksft_finished(); /* Print results and exit() accordingly */
}
--
2.27.0
This patch series is motivated by Shuah's suggestion here:
https://lore.kernel.org/kvm/d576d8f7-980f-3bc6-87ad-5a6ae45609b8@linuxfound…
Many s390x KVM selftests do not output any information about which
tests have been run, so it's hard to say whether a test binary
contains a certain sub-test or not. To improve this situation let's
add some TAP output via the kselftest.h interface to these tests,
so that it easier to understand what has been executed or not.
v2:
- Reworked the extension checking in the first patch
- Make sure to always print the TAP 13 header in the second patch
- Reworked the SKIP printing in the third patch
Thomas Huth (4):
KVM: s390: selftests: Use TAP interface in the memop test
KVM: s390: selftests: Use TAP interface in the sync_regs test
KVM: s390: selftests: Use TAP interface in the tprot test
KVM: s390: selftests: Use TAP interface in the reset test
tools/testing/selftests/kvm/s390x/memop.c | 90 +++++++++++++++----
tools/testing/selftests/kvm/s390x/resets.c | 38 ++++++--
.../selftests/kvm/s390x/sync_regs_test.c | 87 +++++++++++++-----
tools/testing/selftests/kvm/s390x/tprot.c | 28 ++++--
4 files changed, 192 insertions(+), 51 deletions(-)
--
2.27.0
From: Frank Rowand <frank.rowand(a)sony.com>
An August 2021 RFC patch [1] to create the KTAP Specification resulted in
some discussion of possible items to add to the specification.
The conversation ended without completing the document.
Progress resumed with a December 2021 RFC patch [2] to add a KTAP
Specification file (Version 1) to the Linux kernel. Many of the
suggestions from the August 2021 discussion were not included in
Version 1. This patch series is intended to revisit some of the
suggestions from the August 2021 discussion.
Patch 1 changes the Specification version to "2-rc" to indicate
that following patches are not yet accepted into a final version 2.
Patch 2 is an example of a simple change to the Specification. The
change does not change the content of the Specification, but updates
a formatting directive as suggested by the Documentation maintainer.
I intend to take some specific suggestions from the August 2021
discussion to create stand-alone RFC patches to the Specification
instead of adding them as additional patches in this series. The
intent is to focus discussion on a single area of the Specification
in each patch email thread.
[1] https://lore.kernel.org/r/CA+GJov6tdjvY9x12JsJT14qn6c7NViJxqaJk+r-K1YJzPggF…
[2] https://lore.kernel.org/r/20211207190251.18426-1-davidgow@google.com
Frank Rowand (2):
Documentation: dev-tools: KTAP spec change version to 2-rc
Documentation: dev-tools: use literal block instead of code-block
Documentation/dev-tools/ktap.rst | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)
--
Frank Rowand <frank.rowand(a)sony.com>
Currently the arm64 selftests don't support building with O=, this
series fixes that, bringing them more into line with how the kselftest
Makefiles want to work.
v3:
- Rebase onto arm64/for-next/core.
v2:
- Rebase onto v5.18-rc3.
Mark Brown (4):
selftests/arm64: Use TEST_GEN_PROGS_EXTENDED in the FP Makefile
selftests/arm64: Define top_srcdir for the fp tests
selftests/arm64: Clean the fp helper libraries
selftests/arm64: Fix O= builds for the floating point tests
tools/testing/selftests/arm64/fp/Makefile | 49 ++++++++++++-----------
1 file changed, 26 insertions(+), 23 deletions(-)
base-commit: 5c346f94d2933ba320af8325cfe77fc58c6e537a
--
2.30.2
The first patch of this series is an improvement to the existing
syncookie BPF helper. The second patch is a documentation fix.
The third patch allows BPF helpers to accept memory regions of fixed
size without doing runtime size checks.
The two last patches add new functionality that allows XDP to
accelerate iptables synproxy.
v1 of this series [1] used to include a patch that exposed conntrack
lookup to BPF using stable helpers. It was superseded by series [2] by
Kumar Kartikeya Dwivedi, which implements this functionality using
unstable helpers.
The fourth patch adds new helpers to issue and check SYN cookies without
binding to a socket, which is useful in the synproxy scenario.
The fifth patch adds a selftest, which consists of a script, an XDP
program and a userspace control application. The XDP program uses
socketless SYN cookie helpers and queries conntrack status instead of
socket status. The userspace control application allows to tune
parameters of the XDP program. This program also serves as a minimal
example of usage of the new functionality.
The draft of the new functionality was presented on Netdev 0x15 [3].
v2 changes:
Split into two series, submitted bugfixes to bpf, dropped the conntrack
patches, implemented the timestamp cookie in BPF using bpf_loop, dropped
the timestamp cookie patch.
v3 changes:
Moved some patches from bpf to bpf-next, dropped the patch that changed
error codes, split the new helpers into IPv4/IPv6, added verifier
functionality to accept memory regions of fixed size.
v4 changes:
Converted the selftest to the test_progs runner. Replaced some
deprecated functions in xdp_synproxy userspace helper.
v5 changes:
Fixed a bug in the selftest. Added questionable functionality to support
new helpers in TC BPF, added selftests for it.
v6 changes:
Wrap the new helpers themselves into #ifdef CONFIG_SYN_COOKIES, replaced
fclose with pclose and fixed the MSS for IPv6 in the selftest.
v7 changes:
Fixed the off-by-one error in indices, changed the section name to
"xdp", added missing kernel config options to vmtest in CI.
[1]: https://lore.kernel.org/bpf/20211020095815.GJ28644@breakpoint.cc/t/
[2]: https://lore.kernel.org/bpf/20220114163953.1455836-1-memxor@gmail.com/
[3]: https://netdevconf.info/0x15/session.html?Accelerating-synproxy-with-XDP
Maxim Mikityanskiy (6):
bpf: Use ipv6_only_sock in bpf_tcp_gen_syncookie
bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie
bpf: Allow helpers to accept pointers with a fixed size
bpf: Add helpers to issue and check SYN cookies in XDP
bpf: Add selftests for raw syncookie helpers
bpf: Allow the new syncookie helpers to work with SKBs
include/linux/bpf.h | 10 +
include/net/tcp.h | 1 +
include/uapi/linux/bpf.h | 88 +-
kernel/bpf/verifier.c | 26 +-
net/core/filter.c | 130 ++-
net/ipv4/tcp_input.c | 3 +-
scripts/bpf_doc.py | 4 +
tools/include/uapi/linux/bpf.h | 88 +-
tools/testing/selftests/bpf/.gitignore | 1 +
tools/testing/selftests/bpf/Makefile | 2 +-
.../selftests/bpf/prog_tests/xdp_synproxy.c | 144 +++
.../selftests/bpf/progs/xdp_synproxy_kern.c | 819 ++++++++++++++++++
tools/testing/selftests/bpf/xdp_synproxy.c | 466 ++++++++++
13 files changed, 1760 insertions(+), 22 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_synproxy.c
create mode 100644 tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
create mode 100644 tools/testing/selftests/bpf/xdp_synproxy.c
--
2.30.2
Dzień dobry,
stworzyliśmy specjalną ofertę dla firm, na kompleksową obsługę inwestycji w fotowoltaikę.
Specjalizujemy się w zakresie doboru, montażu i serwisie instalacji fotowoltaicznych, dysponujemy najnowocześniejszymi rozwiązania, które zapewnią Państwu oczekiwane rezultaty.
Możemy przygotować dla Państwa wstępną kalkulację i przeanalizować efekty możliwe do osiągnięcia.
Czy są Państwo otwarci na wstępną rozmowę w tym temacie?
Pozdrawiam
Arkadiusz Sokołowski
The first patch of this series is an improvement to the existing
syncookie BPF helper. The second patch is a documentation fix.
The third patch allows BPF helpers to accept memory regions of fixed
size without doing runtime size checks.
The two last patches add new functionality that allows XDP to
accelerate iptables synproxy.
v1 of this series [1] used to include a patch that exposed conntrack
lookup to BPF using stable helpers. It was superseded by series [2] by
Kumar Kartikeya Dwivedi, which implements this functionality using
unstable helpers.
The fourth patch adds new helpers to issue and check SYN cookies without
binding to a socket, which is useful in the synproxy scenario.
The fifth patch adds a selftest, which consists of a script, an XDP
program and a userspace control application. The XDP program uses
socketless SYN cookie helpers and queries conntrack status instead of
socket status. The userspace control application allows to tune
parameters of the XDP program. This program also serves as a minimal
example of usage of the new functionality.
The draft of the new functionality was presented on Netdev 0x15 [3].
v2 changes:
Split into two series, submitted bugfixes to bpf, dropped the conntrack
patches, implemented the timestamp cookie in BPF using bpf_loop, dropped
the timestamp cookie patch.
v3 changes:
Moved some patches from bpf to bpf-next, dropped the patch that changed
error codes, split the new helpers into IPv4/IPv6, added verifier
functionality to accept memory regions of fixed size.
v4 changes:
Converted the selftest to the test_progs runner. Replaced some
deprecated functions in xdp_synproxy userspace helper.
v5 changes:
Fixed a bug in the selftest. Added questionable functionality to support
new helpers in TC BPF, added selftests for it.
v6 changes:
Wrap the new helpers themselves into #ifdef CONFIG_SYN_COOKIES, replaced
fclose with pclose and fixed the MSS for IPv6 in the selftest.
[1]: https://lore.kernel.org/bpf/20211020095815.GJ28644@breakpoint.cc/t/
[2]: https://lore.kernel.org/bpf/20220114163953.1455836-1-memxor@gmail.com/
[3]: https://netdevconf.info/0x15/session.html?Accelerating-synproxy-with-XDP
Maxim Mikityanskiy (6):
bpf: Use ipv6_only_sock in bpf_tcp_gen_syncookie
bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie
bpf: Allow helpers to accept pointers with a fixed size
bpf: Add helpers to issue and check SYN cookies in XDP
bpf: Add selftests for raw syncookie helpers
bpf: Allow the new syncookie helpers to work with SKBs
include/linux/bpf.h | 10 +
include/net/tcp.h | 1 +
include/uapi/linux/bpf.h | 88 +-
kernel/bpf/verifier.c | 26 +-
net/core/filter.c | 130 ++-
net/ipv4/tcp_input.c | 3 +-
scripts/bpf_doc.py | 4 +
tools/include/uapi/linux/bpf.h | 88 +-
tools/testing/selftests/bpf/.gitignore | 1 +
tools/testing/selftests/bpf/Makefile | 2 +-
.../selftests/bpf/prog_tests/xdp_synproxy.c | 144 +++
.../selftests/bpf/progs/xdp_synproxy_kern.c | 819 ++++++++++++++++++
tools/testing/selftests/bpf/xdp_synproxy.c | 466 ++++++++++
13 files changed, 1760 insertions(+), 22 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_synproxy.c
create mode 100644 tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
create mode 100644 tools/testing/selftests/bpf/xdp_synproxy.c
--
2.30.2
Currently the arm64 selftests don't support building with O=, this
series fixes that, bringing them more into line with how the kselftest
Makefiles want to work.
v2:
- Rebase onto v5.18-rc3.
Mark Brown (4):
selftests/arm64: Use TEST_GEN_PROGS_EXTENDED in the FP Makefile
selftests/arm64: Define top_srcdir for the fp tests
selftests/arm64: Clean the fp helper libraries
selftests/arm64: Fix O= builds for the floating point tests
tools/testing/selftests/arm64/fp/Makefile | 29 +++++++++++++----------
1 file changed, 17 insertions(+), 12 deletions(-)
base-commit: b2d229d4ddb17db541098b83524d901257e93845
--
2.30.2
The code is copied from the Android Open Source Project and the author(
Maciej ��enczykowski) has gave permission to relicense it under GPLv2.
The test is to change input IPv6 packets to IPv4 ones and output IPv4 to
IPv6 with bpf_skb_change_proto.
Signed-off-by: Maciej ��enczykowski <maze(a)google.com>
Signed-off-by: Lina Wang <lina.wang(a)mediatek.com>
---
tools/testing/selftests/bpf/progs/nat6to4.c | 293 ++++++++++++++++++++
1 file changed, 293 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/nat6to4.c
diff --git a/tools/testing/selftests/bpf/progs/nat6to4.c b/tools/testing/selftests/bpf/progs/nat6to4.c
new file mode 100644
index 000000000000..099950f7a6cc
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/nat6to4.c
@@ -0,0 +1,293 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * This code is taken from the Android Open Source Project and the author
+ * (Maciej ��enczykowski) has gave permission to relicense it under the
+ * GPLv2. Therefore this program is free software;
+ * You can redistribute it and/or modify it under the terms of the GNU
+ * General Public License version 2 as published by the Free Software
+ * Foundation
+
+ * The original headers, including the original license headers, are
+ * included below for completeness.
+ *
+ * Copyright (C) 2019 The Android Open Source Project
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#include <linux/bpf.h>
+#include <linux/if.h>
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+#include <linux/in.h>
+#include <linux/in6.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/pkt_cls.h>
+#include <linux/swab.h>
+#include <stdbool.h>
+#include <stdint.h>
+
+// bionic kernel uapi linux/udp.h header is munged...
+#define __kernel_udphdr udphdr
+#include <linux/udp.h>
+
+#include <bpf/bpf_helpers.h>
+
+#define htons(x) (__builtin_constant_p(x) ? ___constant_swab16(x) : __builtin_bswap16(x))
+#define htonl(x) (__builtin_constant_p(x) ? ___constant_swab32(x) : __builtin_bswap32(x))
+#define ntohs(x) htons(x)
+#define ntohl(x) htonl(x)
+
+// From kernel:include/net/ip.h
+#define IP_DF 0x4000 // Flag: "Don't Fragment"
+
+SEC("schedcls/ingress6/nat_6")
+int sched_cls_ingress6_nat_6_prog(struct __sk_buff *skb)
+{
+
+ const int l2_header_size = sizeof(struct ethhdr);
+ void *data = (void *)(long)skb->data;
+ const void *data_end = (void *)(long)skb->data_end;
+ const struct ethhdr * const eth = data; // used iff is_ethernet
+ const struct ipv6hdr * const ip6 = (void *)(eth + 1);
+
+ // Require ethernet dst mac address to be our unicast address.
+ if (skb->pkt_type != PACKET_HOST)
+ return TC_ACT_OK;
+
+ // Must be meta-ethernet IPv6 frame
+ if (skb->protocol != htons(ETH_P_IPV6))
+ return TC_ACT_OK;
+
+ // Must have (ethernet and) ipv6 header
+ if (data + l2_header_size + sizeof(*ip6) > data_end)
+ return TC_ACT_OK;
+
+ // Ethertype - if present - must be IPv6
+ if (eth->h_proto != htons(ETH_P_IPV6))
+ return TC_ACT_OK;
+
+ // IP version must be 6
+ if (ip6->version != 6)
+ return TC_ACT_OK;
+ // Maximum IPv6 payload length that can be translated to IPv4
+ if (ntohs(ip6->payload_len) > 0xFFFF - sizeof(struct iphdr))
+ return TC_ACT_OK;
+ switch (ip6->nexthdr) {
+ case IPPROTO_TCP: // For TCP & UDP the checksum neutrality of the chosen IPv6
+ case IPPROTO_UDP: // address means there is no need to update their checksums.
+ case IPPROTO_GRE: // We do not need to bother looking at GRE/ESP headers,
+ case IPPROTO_ESP: // since there is never a checksum to update.
+ break;
+ default: // do not know how to handle anything else
+ return TC_ACT_OK;
+ }
+
+ struct ethhdr eth2; // used iff is_ethernet
+
+ eth2 = *eth; // Copy over the ethernet header (src/dst mac)
+ eth2.h_proto = htons(ETH_P_IP); // But replace the ethertype
+
+ struct iphdr ip = {
+ .version = 4, // u4
+ .ihl = sizeof(struct iphdr) / sizeof(__u32), // u4
+ .tos = (ip6->priority << 4) + (ip6->flow_lbl[0] >> 4), // u8
+ .tot_len = htons(ntohs(ip6->payload_len) + sizeof(struct iphdr)), // u16
+ .id = 0, // u16
+ .frag_off = htons(IP_DF), // u16
+ .ttl = ip6->hop_limit, // u8
+ .protocol = ip6->nexthdr, // u8
+ .check = 0, // u16
+ .saddr = 0x0201a8c0, // u32
+ .daddr = 0x0101a8c0, // u32
+ };
+
+ // Calculate the IPv4 one's complement checksum of the IPv4 header.
+ __wsum sum4 = 0;
+
+ for (int i = 0; i < sizeof(ip) / sizeof(__u16); ++i)
+ sum4 += ((__u16 *)&ip)[i];
+
+ // Note that sum4 is guaranteed to be non-zero by virtue of ip.version == 4
+ sum4 = (sum4 & 0xFFFF) + (sum4 >> 16); // collapse u32 into range 1 .. 0x1FFFE
+ sum4 = (sum4 & 0xFFFF) + (sum4 >> 16); // collapse any potential carry into u16
+ ip.check = (__u16)~sum4; // sum4 cannot be zero, so this is never 0xFFFF
+
+ // Calculate the *negative* IPv6 16-bit one's complement checksum of the IPv6 header.
+ __wsum sum6 = 0;
+ // We'll end up with a non-zero sum due to ip6->version == 6 (which has '0' bits)
+ for (int i = 0; i < sizeof(*ip6) / sizeof(__u16); ++i)
+ sum6 += ~((__u16 *)ip6)[i]; // note the bitwise negation
+
+ // Note that there is no L4 checksum update: we are relying on the checksum neutrality
+ // of the ipv6 address chosen by netd's ClatdController.
+
+ // Packet mutations begin - point of no return, but if this first modification fails
+ // the packet is probably still pristine, so let clatd handle it.
+ if (bpf_skb_change_proto(skb, htons(ETH_P_IP), 0))
+ return TC_ACT_OK;
+ bpf_csum_update(skb, sum6);
+
+ data = (void *)(long)skb->data;
+ data_end = (void *)(long)skb->data_end;
+ if (data + l2_header_size + sizeof(struct iphdr) > data_end)
+ return TC_ACT_SHOT;
+
+ struct ethhdr *new_eth = data;
+
+ // Copy over the updated ethernet header
+ *new_eth = eth2;
+
+ // Copy over the new ipv4 header.
+ *(struct iphdr *)(new_eth + 1) = ip;
+ return bpf_redirect(skb->ifindex, BPF_F_INGRESS);
+}
+SEC("schedcls/egress4/snat4")
+int sched_cls_egress4_snat4_prog(struct __sk_buff *skb)
+{
+ const int l2_header_size = sizeof(struct ethhdr);
+ void *data = (void *)(long)skb->data;
+ const void *data_end = (void *)(long)skb->data_end;
+ const struct ethhdr *const eth = data; // used iff is_ethernet
+ const struct iphdr *const ip4 = (void *)(eth + 1);
+
+
+ // Must be meta-ethernet IPv4 frame
+ if (skb->protocol != htons(ETH_P_IP))
+ return TC_ACT_OK;
+
+ // Must have ipv4 header
+ if (data + l2_header_size + sizeof(struct ipv6hdr) > data_end)
+ return TC_ACT_OK;
+
+ // Ethertype - if present - must be IPv4
+ if (eth->h_proto != htons(ETH_P_IP))
+ return TC_ACT_OK;
+
+ // IP version must be 4
+ if (ip4->version != 4)
+ return TC_ACT_OK;
+
+ // We cannot handle IP options, just standard 20 byte == 5 dword minimal IPv4 header
+ if (ip4->ihl != 5)
+ return TC_ACT_OK;
+
+ // Maximum IPv6 payload length that can be translated to IPv4
+ if (htons(ip4->tot_len) > 0xFFFF - sizeof(struct ipv6hdr))
+ return TC_ACT_OK;
+
+ // Calculate the IPv4 one's complement checksum of the IPv4 header.
+ __wsum sum4 = 0;
+
+ for (int i = 0; i < sizeof(*ip4) / sizeof(__u16); ++i)
+ sum4 += ((__u16 *)ip4)[i];
+
+ // Note that sum4 is guaranteed to be non-zero by virtue of ip4->version == 4
+ sum4 = (sum4 & 0xFFFF) + (sum4 >> 16); // collapse u32 into range 1 .. 0x1FFFE
+ sum4 = (sum4 & 0xFFFF) + (sum4 >> 16); // collapse any potential carry into u16
+ // for a correct checksum we should get *a* zero, but sum4 must be positive, ie 0xFFFF
+ if (sum4 != 0xFFFF)
+ return TC_ACT_OK;
+
+ // Minimum IPv4 total length is the size of the header
+ if (ntohs(ip4->tot_len) < sizeof(*ip4))
+ return TC_ACT_OK;
+
+ // We are incapable of dealing with IPv4 fragments
+ if (ip4->frag_off & ~htons(IP_DF))
+ return TC_ACT_OK;
+
+ switch (ip4->protocol) {
+ case IPPROTO_TCP: // For TCP & UDP the checksum neutrality of the chosen IPv6
+ case IPPROTO_GRE: // address means there is no need to update their checksums.
+ case IPPROTO_ESP: // We do not need to bother looking at GRE/ESP headers,
+ break; // since there is never a checksum to update.
+
+ case IPPROTO_UDP: // See above comment, but must also have UDP header...
+ if (data + sizeof(*ip4) + sizeof(struct udphdr) > data_end)
+ return TC_ACT_OK;
+ const struct udphdr *uh = (const struct udphdr *)(ip4 + 1);
+ // If IPv4/UDP checksum is 0 then fallback to clatd so it can calculate the
+ // checksum. Otherwise the network or more likely the NAT64 gateway might
+ // drop the packet because in most cases IPv6/UDP packets with a zero checksum
+ // are invalid. See RFC 6935. TODO: calculate checksum via bpf_csum_diff()
+ if (!uh->check)
+ return TC_ACT_OK;
+ break;
+
+ default: // do not know how to handle anything else
+ return TC_ACT_OK;
+ }
+ struct ethhdr eth2; // used iff is_ethernet
+
+ eth2 = *eth; // Copy over the ethernet header (src/dst mac)
+ eth2.h_proto = htons(ETH_P_IPV6); // But replace the ethertype
+
+ struct ipv6hdr ip6 = {
+ .version = 6, // __u8:4
+ .priority = ip4->tos >> 4, // __u8:4
+ .flow_lbl = {(ip4->tos & 0xF) << 4, 0, 0}, // __u8[3]
+ .payload_len = htons(ntohs(ip4->tot_len) - 20), // __be16
+ .nexthdr = ip4->protocol, // __u8
+ .hop_limit = ip4->ttl, // __u8
+ };
+ ip6.saddr.in6_u.u6_addr32[0] = htonl(0x20010db8);
+ ip6.saddr.in6_u.u6_addr32[1] = 0;
+ ip6.saddr.in6_u.u6_addr32[2] = 0;
+ ip6.saddr.in6_u.u6_addr32[3] = htonl(1);
+ ip6.daddr.in6_u.u6_addr32[0] = htonl(0x20010db8);
+ ip6.daddr.in6_u.u6_addr32[1] = 0;
+ ip6.daddr.in6_u.u6_addr32[2] = 0;
+ ip6.daddr.in6_u.u6_addr32[3] = htonl(2);
+
+ // Calculate the IPv6 16-bit one's complement checksum of the IPv6 header.
+ __wsum sum6 = 0;
+ // We'll end up with a non-zero sum due to ip6.version == 6
+ for (int i = 0; i < sizeof(ip6) / sizeof(__u16); ++i)
+ sum6 += ((__u16 *)&ip6)[i];
+
+ // Packet mutations begin - point of no return, but if this first modification fails
+ // the packet is probably still pristine, so let clatd handle it.
+ if (bpf_skb_change_proto(skb, htons(ETH_P_IPV6), 0))
+ return TC_ACT_OK;
+
+ // This takes care of updating the skb->csum field for a CHECKSUM_COMPLETE packet.
+ // In such a case, skb->csum is a 16-bit one's complement sum of the entire payload,
+ // thus we need to subtract out the ipv4 header's sum, and add in the ipv6 header's sum.
+ // However, we've already verified the ipv4 checksum is correct and thus 0.
+ // Thus we only need to add the ipv6 header's sum.
+ //
+ // bpf_csum_update() always succeeds if the skb is CHECKSUM_COMPLETE and returns an error
+ // (-ENOTSUPP) if it isn't. So we just ignore the return code (see above for more details).
+ bpf_csum_update(skb, sum6);
+
+ // bpf_skb_change_proto() invalidates all pointers - reload them.
+ data = (void *)(long)skb->data;
+ data_end = (void *)(long)skb->data_end;
+
+ // I cannot think of any valid way for this error condition to trigger, however I do
+ // believe the explicit check is required to keep the in kernel ebpf verifier happy.
+ if (data + l2_header_size + sizeof(ip6) > data_end)
+ return TC_ACT_SHOT;
+
+ struct ethhdr *new_eth = data;
+
+ // Copy over the updated ethernet header
+ *new_eth = eth2;
+ // Copy over the new ipv4 header.
+ *(struct ipv6hdr *)(new_eth + 1) = ip6;
+ return TC_ACT_OK;
+}
+
+char _license[] SEC("license") = ("GPL");
+
--
2.18.0
The kunit_remove_resource() function is used to unlink a resource from
the list of resources in the test, making it no longer show up in
kunit_find_resource().
However, this could lead to a race condition if two threads called
kunit_remove_resource() on the same resource at the same time: the
resource would be removed from the list twice (causing a crash at the
second list_del()), and the refcount for the resource would be
decremented twice (instead of once, for the reference held by the
resource list).
Fix both problems, the first by using list_del_init(), and the second by
checking if the resource has already been removed using list_empty(),
and only decrementing its refcount if it has not.
Also add a KUnit test for the kunit_remove_resource() function which
tests this behaviour.
Reported-by: Daniel Latypov <dlatypov(a)google.com>
Signed-off-by: David Gow <davidgow(a)google.com>
Reviewed-by: Brendan Higgins <brendanhiggins(a)google.com>
---
Changes since v1:
https://lore.kernel.org/linux-kselftest/20220318064959.3298768-1-davidgow@g…
- Rebased on top of Daniel's split of the resource system into
resource.{c,h}
- https://lore.kernel.org/linux-kselftest/20220328174143.857262-1-dlatypov@go…
- https://lore.kernel.org/linux-kselftest/20220328174143.857262-2-dlatypov@go…
lib/kunit/kunit-test.c | 35 +++++++++++++++++++++++++++++++++++
lib/kunit/resource.c | 8 ++++++--
2 files changed, 41 insertions(+), 2 deletions(-)
diff --git a/lib/kunit/kunit-test.c b/lib/kunit/kunit-test.c
index 555601d17f79..9005034558aa 100644
--- a/lib/kunit/kunit-test.c
+++ b/lib/kunit/kunit-test.c
@@ -190,6 +190,40 @@ static void kunit_resource_test_destroy_resource(struct kunit *test)
KUNIT_EXPECT_TRUE(test, list_empty(&ctx->test.resources));
}
+static void kunit_resource_test_remove_resource(struct kunit *test)
+{
+ struct kunit_test_resource_context *ctx = test->priv;
+ struct kunit_resource *res = kunit_alloc_and_get_resource(
+ &ctx->test,
+ fake_resource_init,
+ fake_resource_free,
+ GFP_KERNEL,
+ ctx);
+
+ /* The resource is in the list */
+ KUNIT_EXPECT_FALSE(test, list_empty(&ctx->test.resources));
+
+ /* Remove the resource. The pointer is still valid, but it can't be
+ * found.
+ */
+ kunit_remove_resource(test, res);
+ KUNIT_EXPECT_TRUE(test, list_empty(&ctx->test.resources));
+ /* We haven't been freed yet. */
+ KUNIT_EXPECT_TRUE(test, ctx->is_resource_initialized);
+
+ /* Removing the resource multiple times is valid. */
+ kunit_remove_resource(test, res);
+ KUNIT_EXPECT_TRUE(test, list_empty(&ctx->test.resources));
+ /* Despite having been removed twice (from only one reference), the
+ * resource still has not been freed.
+ */
+ KUNIT_EXPECT_TRUE(test, ctx->is_resource_initialized);
+
+ /* Free the resource. */
+ kunit_put_resource(res);
+ KUNIT_EXPECT_FALSE(test, ctx->is_resource_initialized);
+}
+
static void kunit_resource_test_cleanup_resources(struct kunit *test)
{
int i;
@@ -387,6 +421,7 @@ static struct kunit_case kunit_resource_test_cases[] = {
KUNIT_CASE(kunit_resource_test_init_resources),
KUNIT_CASE(kunit_resource_test_alloc_resource),
KUNIT_CASE(kunit_resource_test_destroy_resource),
+ KUNIT_CASE(kunit_resource_test_remove_resource),
KUNIT_CASE(kunit_resource_test_cleanup_resources),
KUNIT_CASE(kunit_resource_test_proper_free_ordering),
KUNIT_CASE(kunit_resource_test_static),
diff --git a/lib/kunit/resource.c b/lib/kunit/resource.c
index b8bced246217..09ec392d2323 100644
--- a/lib/kunit/resource.c
+++ b/lib/kunit/resource.c
@@ -98,11 +98,15 @@ EXPORT_SYMBOL_GPL(kunit_alloc_and_get_resource);
void kunit_remove_resource(struct kunit *test, struct kunit_resource *res)
{
unsigned long flags;
+ bool was_linked;
spin_lock_irqsave(&test->lock, flags);
- list_del(&res->node);
+ was_linked = !list_empty(&res->node);
+ list_del_init(&res->node);
spin_unlock_irqrestore(&test->lock, flags);
- kunit_put_resource(res);
+
+ if (was_linked)
+ kunit_put_resource(res);
}
EXPORT_SYMBOL_GPL(kunit_remove_resource);
--
2.35.1.1094.g7c7d902a7c-goog
Currently if opening /dev/null fails to open then file pointer fp
is null and further access to fp via fprintf will cause a null
pointer dereference. Fix this by returning a negative error value
when a null fp is detected.
Detected using cppcheck static analysis:
tools/testing/selftests/resctrl/fill_buf.c:124:6: note: Assuming
that condition '!fp' is not redundant
if (!fp)
^
tools/testing/selftests/resctrl/fill_buf.c:126:10: note: Null
pointer dereference
fprintf(fp, "Sum: %d ", ret);
Fixes: a2561b12fe39 ("selftests/resctrl: Add built in benchmark")
Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
V2: Add cppcheck analysis information
---
tools/testing/selftests/resctrl/fill_buf.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
index 51e5cf22632f..56ccbeae0638 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -121,8 +121,10 @@ static int fill_cache_read(unsigned char *start_ptr, unsigned char *end_ptr,
/* Consume read result so that reading memory is not optimized out. */
fp = fopen("/dev/null", "w");
- if (!fp)
+ if (!fp) {
perror("Unable to write to /dev/null");
+ return -1;
+ }
fprintf(fp, "Sum: %d ", ret);
fclose(fp);
--
2.35.1
Hello,
The aim of this series is to print a message to let users know a possible
cause of failure, if the result of MBM&CMT tests is failed on Intel CPU.
In order to detect Intel vendor, I extended AMD vendor detect function.
Difference from v4:
- Fixed the typos.
- Changed "get_vendor() != ARCH_AMD" to "get_vendor() == ARCH_INTEL".
- Reorder the declarations based on line length from longest to shortest.
https://lore.kernel.org/lkml/20220316055940.292550-1-tan.shaopeng@jp.fujits… [PATCH v4]
This patch series is based on v5.17.
Shaopeng Tan (2):
selftests/resctrl: Extend CPU vendor detection
selftests/resctrl: Print a message if the result of MBM&CMT tests is
failed on Intel CPU
tools/testing/selftests/resctrl/cat_test.c | 2 +-
tools/testing/selftests/resctrl/resctrl.h | 5 ++-
.../testing/selftests/resctrl/resctrl_tests.c | 45 +++++++++++++------
tools/testing/selftests/resctrl/resctrlfs.c | 2 +-
4 files changed, 37 insertions(+), 17 deletions(-)
--
2.27.0
Now that the discussions surrounding the support for SGX2 is settling,
the kselftest audience is added to the discussion for the first time
to consider the testing of the new features.
V3: https://lore.kernel.org/lkml/cover.1648847675.git.reinette.chatre@intel.com/
Changes since V3 that directly impact user space:
- SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS ioctl()'s struct
sgx_enclave_restrict_permissions no longer provides entire secinfo,
just the new permissions in new "permissions" struct member. (Jarkko)
- Rename SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl() to
SGX_IOC_ENCLAVE_MODIFY_TYPES. (Jarkko)
- SGX_IOC_ENCLAVE_MODIFY_TYPES ioctl()'s struct sgx_enclave_modify_type
no longer provides entire secinfo, just the new page type in new
"page_type" struct member. (Jarkko)
Details about changes since V3 that do not directly impact user space:
- Add new patch to enable VA pages to be added without invoking reclaimer
directly if no EPC pages are available, failing instead. This enables
VA pages to be added with enclave's mutex held. Fixes an issue
encountered by Haitao. More details in new patch "x86/sgx: Support VA page
allocation without reclaiming".
- While refactoring, change existing code to consistently use
IS_ALIGNED(). (Jarkko)
- Many patches received a tag from Jarkko.
- Many smaller changes, please refer to individual patches.
V2: https://lore.kernel.org/lkml/cover.1644274683.git.reinette.chatre@intel.com/
Changes since V2 that directly impact user space:
- Maximum allowed permissions of dynamically added pages is RWX,
previously limited to RW. (Jarkko)
Dynamically added pages are initially created with architecturally
limited EPCM permissions of RW. mmap() and mprotect() of these pages
with RWX permissions would no longer be blocked by SGX driver. PROT_EXEC
on dynamically added pages will be possible after running ENCLU[EMODPE]
from within the enclave with appropriate VMA permissions.
- The kernel no longer attempts to track the EPCM runtime permissions. (Jarkko)
Consequences are:
- Kernel does not modify PTEs to follow EPCM permissions. User space
will receive #PF with SGX error code in cases where the V2
implementation would have resulted in regular (non-SGX) page fault
error code.
- SGX_IOC_ENCLAVE_RELAX_PERMISSIONS is removed. This ioctl() was used
to clear PTEs after permissions were modified from within the enclave
and ensure correct PTEs are installed. Since PTEs no longer track
EPCM permissions the changes in EPCM permissions would not impact PTEs.
As long as new permissions are within the maximum vetted permissions
(vm_max_prot_bits) only ENCLU[EMODPE] from within enclave is needed,
as accompanied by appropriate VMA permissions.
- struct sgx_enclave_restrict_perm renamed to
sgx_enclave_restrict_permissions (Jarkko)
- struct sgx_enclave_modt renamed to struct sgx_enclave_modify_type
to be consistent with the verbose naming of other SGX uapi structs.
Details about changes since V2 that do not directly impact user space:
- Kernel no longer tracks the runtime EPCM permissions with the aim of
installing accurate PTEs. (Jarkko)
- In support of this change the following patches were removed:
Documentation/x86: Document SGX permission details
x86/sgx: Support VMA permissions more relaxed than enclave permissions
x86/sgx: Add pfn_mkwrite() handler for present PTEs
x86/sgx: Add sgx_encl_page->vm_run_prot_bits for dynamic permission changes
x86/sgx: Support relaxing of enclave page permissions
- No more handling of scenarios where VMA permissions may be more
relaxed than what the EPCM allows. Enclaves are not prevented
from accessing such pages and the EPCM permissions are entrusted
to control access as supported by the SGX error code in page faults.
- No more explicit setting of protection bits in page fault handler.
Protection bits are inherited from VMA similar to SGX1 support.
- Selftest patches are moved to the end of the series. (Jarkko)
- New patch contributed by Jarkko to avoid duplicated code:
x86/sgx: Export sgx_encl_page_alloc()
- New patch separating changes from existing patch. (Jarkko)
x86/sgx: Export sgx_encl_{grow,shrink}()
- New patch to keep one required benefit from the (now removed) kernel
EPCM permission tracking:
x86/sgx: Support loading enclave page without VMA permissions check
- Updated cover letter to reflect architecture changes.
- Many smaller changes, please refer to individual patches.
V1: https://lore.kernel.org/linux-sgx/cover.1638381245.git.reinette.chatre@inte…
Changes since V1 that directly impact user space:
- SGX2 permission changes changed from a single ioctl() named
SGX_IOC_PAGE_MODP to two new ioctl()s:
SGX_IOC_ENCLAVE_RELAX_PERMISSIONS and
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS, supported by two different
parameter structures (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS does
not support a result output parameter) (Jarkko).
User space flow impact: After user space runs ENCLU[EMODPE] it
needs to call SGX_IOC_ENCLAVE_RELAX_PERMISSIONS to have PTEs
updated. Previously running SGX_IOC_PAGE_MODP in this scenario
resulted in EPCM.PR being set but calling
SGX_IOC_ENCLAVE_RELAX_PERMISSIONS will not result in EPCM.PR
being set anymore and thus no need for an additional
ENCLU[EACCEPT].
- SGX_IOC_ENCLAVE_RELAX_PERMISSIONS and
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
obtain new permissions from secinfo as parameter instead of
the permissions directly (Jarkko).
- ioctl() supporting SGX2 page type change is renamed from
SGX_IOC_PAGE_MODT to SGX_IOC_ENCLAVE_MODIFY_TYPE (Jarkko).
- SGX_IOC_ENCLAVE_MODIFY_TYPE obtains new page type from secinfo
as parameter instead of the page type directly (Jarkko).
- ioctl() supporting SGX2 page removal is renamed from
SGX_IOC_PAGE_REMOVE to SGX_IOC_ENCLAVE_REMOVE_PAGES (Jarkko).
- All ioctl() parameter structures have been renamed as a result of the
ioctl() renaming:
SGX_IOC_ENCLAVE_RELAX_PERMISSIONS => struct sgx_enclave_relax_perm
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS => struct sgx_enclave_restrict_perm
SGX_IOC_ENCLAVE_MODIFY_TYPE => struct sgx_enclave_modt
SGX_IOC_ENCLAVE_REMOVE_PAGES => struct sgx_enclave_remove_pages
Changes since V1 that do not directly impact user space:
- Number of patches in series increased from 25 to 32 primarily because
of splitting the original submission:
- Wrappers for the new SGX2 functions are introduced in three separate
patches replacing the original "x86/sgx: Add wrappers for SGX2
functions"
(Jarkko).
- Moving and renaming sgx_encl_ewb_cpumask() is done with two patches
replacing the original "x86/sgx: Use more generic name for enclave
cpumask function" (Jarkko).
- Support for SGX2 EPCM permission changes is split into two ioctls(),
one for relaxing and one for restricting permissions, each introduced
by a new patch replacing the original "x86/sgx: Support enclave page
permission changes" (Jarkko).
- Extracted code used by existing ioctls() for usage by new ioctl()s
into a new utility in new patch "x86/sgx: Create utility to validate
user provided offset and length" (Dave did not specifically ask for
this but it addresses his review feedback).
- Two new Documentation patches to support the SGX2 work
("Documentation/x86: Introduce enclave runtime management") and
a dedicated section on the enclave permission management
("Documentation/x86: Document SGX permission details") (Andy).
- Most patches were reworked to improve the language by:
* aiming to refer to exact item instead of English rephrasing (Jarkko).
* use ioctl() instead of ioctl throughout (Dave).
* Use "relaxed" instead of "exceed" when referring to permissions
(Dave).
- Improved documentation with several additions to
Documentation/x86/sgx.rst.
- Many smaller changes, please refer to individual patches.
Hi Everybody,
The current Linux kernel support for SGX includes support for SGX1 that
requires that an enclave be created with properties that accommodate all
usages over its (the enclave's) lifetime. This includes properties such
as permissions of enclave pages, the number of enclave pages, and the
number of threads supported by the enclave.
Consequences of this requirement to have the enclave be created to
accommodate all usages include:
* pages needing to support relocated code are required to have RWX
permissions for their entire lifetime,
* an enclave needs to be created with the maximum stack and heap
projected to be needed during the enclave's entire lifetime which
can be longer than the processes running within it,
* an enclave needs to be created with support for the maximum number
of threads projected to run in the enclave.
Since SGX1 a few more functions were introduced, collectively called
SGX2, that support modifications to an initialized enclave. Hardware
supporting these functions are already available as listed on
https://github.com/ayeks/SGX-hardware
This series adds support for SGX2, also referred to as Enclave Dynamic
Memory Management (EDMM). This includes:
* Support modifying EPCM permissions of regular enclave pages belonging
to an initialized enclave. Only permission restriction is supported
via a new ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS. Relaxing of
EPCM permissions can only be done from within the enclave with the
SGX instruction ENCLU[EMODPE].
* Support dynamic addition of regular enclave pages to an initialized
enclave. At creation new pages are architecturally limited to RW EPCM
permissions but will be accessible with PROT_EXEC after the enclave
runs ENCLU[EMODPE] to relax EPCM permissions to RWX.
Pages are dynamically added to an initialized enclave from the SGX
page fault handler.
* Support expanding an initialized enclave to accommodate more threads.
More threads can be accommodated by an enclave with the addition of
Thread Control Structure (TCS) pages that is done by changing the
type of regular enclave pages to TCS pages using a new ioctl()
SGX_IOC_ENCLAVE_MODIFY_TYPES.
* Support removing regular and TCS pages from an initialized enclave.
Removing pages is accomplished in two stages as supported by two new
ioctl()s SGX_IOC_ENCLAVE_MODIFY_TYPES (same ioctl() as mentioned in
previous bullet) and SGX_IOC_ENCLAVE_REMOVE_PAGES.
* Tests covering all the new flows, some edge cases, and one
comprehensive stress scenario.
No additional work is needed to support SGX2 in a virtualized
environment. All tests included in this series passed when run from
a guest as tested with the recent QEMU release based on 6.2.0
that supports SGX.
Patches 1 through 14 prepare the existing code for SGX2 support by
introducing the SGX2 functions, refactoring code, and tracking enclave
page types.
Patches 15 through 21 enable the SGX2 features and include a
Documentation patch.
Patches 22 through 31 test several scenarios of all the enabled
SGX2 features.
This series is based on v5.18-rc2.
Your feedback will be greatly appreciated.
Regards,
Reinette
Jarkko Sakkinen (1):
x86/sgx: Export sgx_encl_page_alloc()
Reinette Chatre (30):
x86/sgx: Add short descriptions to ENCLS wrappers
x86/sgx: Add wrapper for SGX2 EMODPR function
x86/sgx: Add wrapper for SGX2 EMODT function
x86/sgx: Add wrapper for SGX2 EAUG function
x86/sgx: Support loading enclave page without VMA permissions check
x86/sgx: Export sgx_encl_ewb_cpumask()
x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask()
x86/sgx: Move PTE zap code to new sgx_zap_enclave_ptes()
x86/sgx: Make sgx_ipi_cb() available internally
x86/sgx: Create utility to validate user provided offset and length
x86/sgx: Keep record of SGX page type
x86/sgx: Export sgx_encl_{grow,shrink}()
x86/sgx: Support VA page allocation without reclaiming
x86/sgx: Support restricting of enclave page permissions
x86/sgx: Support adding of pages to an initialized enclave
x86/sgx: Tighten accessible memory range after enclave initialization
x86/sgx: Support modifying SGX page type
x86/sgx: Support complete page removal
x86/sgx: Free up EPC pages directly to support large page ranges
Documentation/x86: Introduce enclave runtime management section
selftests/sgx: Add test for EPCM permission changes
selftests/sgx: Add test for TCS page permission changes
selftests/sgx: Test two different SGX2 EAUG flows
selftests/sgx: Introduce dynamic entry point
selftests/sgx: Introduce TCS initialization enclave operation
selftests/sgx: Test complete changing of page type flow
selftests/sgx: Test faulty enclave behavior
selftests/sgx: Test invalid access to removed enclave page
selftests/sgx: Test reclaiming of untouched page
selftests/sgx: Page removal stress test
Documentation/x86/sgx.rst | 15 +
arch/x86/include/asm/sgx.h | 8 +
arch/x86/include/uapi/asm/sgx.h | 61 +
arch/x86/kernel/cpu/sgx/encl.c | 329 +++-
arch/x86/kernel/cpu/sgx/encl.h | 15 +-
arch/x86/kernel/cpu/sgx/encls.h | 33 +
arch/x86/kernel/cpu/sgx/ioctl.c | 640 +++++++-
arch/x86/kernel/cpu/sgx/main.c | 75 +-
arch/x86/kernel/cpu/sgx/sgx.h | 3 +
tools/testing/selftests/sgx/defines.h | 23 +
tools/testing/selftests/sgx/load.c | 41 +
tools/testing/selftests/sgx/main.c | 1435 +++++++++++++++++
tools/testing/selftests/sgx/main.h | 1 +
tools/testing/selftests/sgx/test_encl.c | 68 +
.../selftests/sgx/test_encl_bootstrap.S | 6 +
15 files changed, 2625 insertions(+), 128 deletions(-)
base-commit: ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e
--
2.25.1