Nested translation is a hardware feature supported by many modern IOMMUs.
It uses two stages of address translation (stage-1 and stage-2) to reach
the physical address. The stage-1 translation table is owned by userspace
(e.g. by a guest OS), while the stage-2 table is owned by the kernel. Changes
to the stage-1 translation table should be followed by an IOTLB invalidation.
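
As a rough illustration of why a stage-1 change needs an IOTLB invalidation,
here is a toy userspace model of two-stage translation. It is only a sketch:
struct nested_iommu, the single-level tables, the 16-page address space and
all function names are assumptions made for this example, and none of them
correspond to the kernel or hardware interfaces.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define NPAGES     16          /* toy address space: 16 pages per stage */
#define INVALID    UINT64_MAX

/* Hypothetical toy model: single-level tables indexed by page frame number. */
struct nested_iommu {
	uint64_t stage1[NPAGES];   /* GIOVA pfn -> GPA pfn, owned by the guest */
	uint64_t stage2[NPAGES];   /* GPA pfn -> HPA pfn, owned by the host */
	uint64_t iotlb[NPAGES];    /* cached GIOVA pfn -> HPA pfn results */
};

static void nested_init(struct nested_iommu *m)
{
	for (int i = 0; i < NPAGES; i++)
		m->stage1[i] = m->stage2[i] = m->iotlb[i] = INVALID;
}

/* Walk stage-1, then stage-2, and cache the combined result in the IOTLB. */
static uint64_t nested_translate(struct nested_iommu *m, uint64_t giova)
{
	uint64_t pfn = giova >> PAGE_SHIFT;
	uint64_t off = giova & ((1ULL << PAGE_SHIFT) - 1);

	if (pfn >= NPAGES)
		return INVALID;

	if (m->iotlb[pfn] == INVALID) {
		uint64_t gpa_pfn = m->stage1[pfn];

		if (gpa_pfn >= NPAGES || m->stage2[gpa_pfn] == INVALID)
			return INVALID;
		m->iotlb[pfn] = m->stage2[gpa_pfn];
	}
	return (m->iotlb[pfn] << PAGE_SHIFT) | off;
}

/* Drop the cached translation for one GIOVA page. */
static void nested_invalidate(struct nested_iommu *m, uint64_t giova)
{
	uint64_t pfn = giova >> PAGE_SHIFT;

	if (pfn < NPAGES)
		m->iotlb[pfn] = INVALID;
}
```

In this model, rewriting a stage1[] entry after a translation has been cached
keeps returning the old HPA until nested_invalidate() is called for that
GIOVA, which mirrors the requirement stated above.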
Take Intel VT-d as an example: the stage-1 translation table is the I/O page
table. As the diagram below shows, the guest I/O page table pointer, expressed
as a GPA (guest physical address), is passed to the host and used to perform
the stage-1 address translation. Likewise, modifications to present mappings
in the guest I/O page table should be followed by an IOTLB invalidation.

    .-------------.  .---------------------------.
    |   vIOMMU    |  | Guest I/O page table      |
    |             |  '---------------------------'
    .----------------/
    | PASID Entry |--- PASID cache flush --+
    '-------------'                        |
    |             |                        V
    |             |           I/O page table pointer in GPA
    '-------------'
Guest
------| Shadow |---------------------------|--------
      v        v                           v
Host
    .-------------.  .------------------------.
    |   pIOMMU    |  |  FS for GIOVA->GPA     |
    |             |  '------------------------'
    .----------------/  |
    | PASID Entry |     V (Nested xlate)
    '----------------\.----------------------------------.
    |             |   | SS for GPA->HPA, unmanaged domain|
    |             |   '----------------------------------'
    '-------------'
Where:
 - FS = First stage page tables
 - SS = Second stage page tables
<Intel VT-d Nested translation>

In IOMMUFD, all translation tables are tracked by a hw_pagetable (hwpt),
and each hwpt has an iommu_domain allocated from the iommu driver. So in this
series, hw_pagetable and iommu_domain mean the same thing unless noted
otherwise. IOMMUFD already supports allocating a hw_pagetable that is linked
with an IOAS. However, nesting requires IOMMUFD to allow allocating a
hw_pagetable with driver-specific parameters, and to provide an interface to
sync the stage-1 IOTLB, since the user owns the stage-1 translation table.
This series is based on the iommu hw info reporting series [1]. It first
introduces a new iommu op for allocating domains with user data and an op
for invalidating the stage-1 IOTLB, then extends the IOMMUFD internal
infrastructure to accept user_data and a parent hwpt and relay the data to
the iommu core to allocate a user iommu_domain. After that, it extends the
IOMMU_HWPT_ALLOC ioctl to accept user data and a stage-2 hwpt ID to allocate
a hwpt. Along with it, the IOMMU_HWPT_INVALIDATE ioctl is added to invalidate
the stage-1 IOTLB, which is needed for user-managed hwpts. Selftests are added
as well to cover the new ioctls.
Complete code can be found in [2]; the QEMU code can be found in [3].
Finally, this is a team effort together with Nicolin Chen and Lu Baolu.
Thanks to them for the help. ^_^ Looking forward to your feedback.
[1] https://lore.kernel.org/linux-iommu/20230724105936.107042-1-yi.l.liu@intel.…
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
[3] https://github.com/yiliu1765/qemu/tree/wip/iommufd_rfcv4_nesting
Change log:
v3:
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for
sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report reserved iova ranges to userspace so
that userspace can exclude the ranges in the stage-1 HWPT (e.g. guest I/O
page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.c…
- Add union iommu_domain_user_data to include all user data structures to avoid
passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.…
Thanks,
Yi Liu
Lu Baolu (2):
iommu: Add new iommu op to create domains owned by userspace
iommu: Add nested domain support
Nicolin Chen (6):
iommufd/hw_pagetable: Do not populate user-managed hw_pagetables
iommufd: Only enforce IOMMU_RESV_SW_MSI when attaching user-managed
HWPT
iommufd/selftest: Add domain_alloc_user() support in iommu mock
iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with user data
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (9):
iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation
iommufd: Pass in hwpt_type/parent/user_data to
iommufd_hw_pagetable_alloc()
iommufd: Add IOMMU_RESV_IOVA_RANGES
iommufd: IOMMU_HWPT_ALLOC allocation with user data
iommufd: Add IOMMU_HWPT_INVALIDATE
iommufd/selftest: Add a helper to get test device
iommufd/selftest: Add IOMMU_TEST_OP_DEV_[ADD|DEL]_RESERVED to add/del
reserved regions to selftest device
iommufd/selftest: Add .get_resv_regions() for mock_dev
iommufd/selftest: Add coverage for IOMMU_RESV_IOVA_RANGES
drivers/iommu/iommufd/device.c | 9 +-
drivers/iommu/iommufd/hw_pagetable.c | 181 +++++++++++-
drivers/iommu/iommufd/io_pagetable.c | 5 +-
drivers/iommu/iommufd/iommufd_private.h | 20 +-
drivers/iommu/iommufd/iommufd_test.h | 36 +++
drivers/iommu/iommufd/main.c | 59 +++-
drivers/iommu/iommufd/selftest.c | 266 ++++++++++++++++--
include/linux/iommu.h | 34 +++
include/uapi/linux/iommufd.h | 96 ++++++-
tools/testing/selftests/iommu/iommufd.c | 224 ++++++++++++++-
tools/testing/selftests/iommu/iommufd_utils.h | 70 +++++
11 files changed, 958 insertions(+), 42 deletions(-)
--
2.34.1
test_kmem_basic creates 100,000 negative dentries, each of which maps
to a slab object. After memory.high is set, these are reclaimed through
the shrink_slab function call, which reclaims all 100,000 entries. The
test passes the majority of the time because slab1 is often above 0 when
it is calculated; however, 0 is also an acceptable value.
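
The distinction between a failed read and a legitimate zero can be sketched
in plain C. This is only an illustration: read_key_long() and slab_check()
are hypothetical stand-ins written for this example and operate on an
in-memory string, not on the cgroup selftest helpers or on a real memory.stat
file.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the value following "key" at the start of a
 * line in a memory.stat-like buffer, or -1 if the key is not present.
 */
static long read_key_long(const char *stat, const char *key)
{
	size_t klen = strlen(key);
	const char *line = stat;

	while (line && *line) {
		if (strncmp(line, key, klen) == 0)
			return atol(line + klen);
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Only a negative value is an error; 0 means everything was reclaimed. */
static int slab_check(const char *stat)
{
	long slab = read_key_long(stat, "slab ");

	return slab < 0 ? -1 : 0;
}
```

With this shape, a buffer containing "slab 0" is accepted, while a buffer
with no "slab " line is still reported as a failure.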
Signed-off-by: Lucas Karpinski <lkarpins(a)redhat.com>
---
https://lore.kernel.org/all/m6jbt5hzq27ygt3l4xyiaxxb7i5auvb2lahbcj4yaxxigqz…
V2: Corrected title
tools/testing/selftests/cgroup/test_kmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
index 258ddc565deb..ba0a0bfc5a98 100644
--- a/tools/testing/selftests/cgroup/test_kmem.c
+++ b/tools/testing/selftests/cgroup/test_kmem.c
@@ -71,7 +71,7 @@ static int test_kmem_basic(const char *root)
cg_write(cg, "memory.high", "1M");
slab1 = cg_read_key_long(cg, "memory.stat", "slab ");
- if (slab1 <= 0)
+ if (slab1 < 0)
goto cleanup;
current = cg_read_long(cg, "memory.current");
--
2.41.0
The following error happens:
In file included from vstate_exec_nolibc.c:2:
/usr/include/riscv64-linux-gnu/sys/prctl.h:42:12: error: conflicting types for ‘prctl’; have ‘int(int, ...)’
42 | extern int prctl (int __option, ...) __THROW;
| ^~~~~
In file included from ./../../../../include/nolibc/nolibc.h:99,
from <command-line>:
./../../../../include/nolibc/sys.h:892:5: note: previous definition of ‘prctl’ with type ‘int(int, long unsigned int, long unsigned int, long unsigned int, long unsigned int)’
892 | int prctl(int option, unsigned long arg2, unsigned long arg3,
| ^~~~~
Fix this by not including <sys/prctl.h>, which is not needed here since
the prctl syscall is called directly using its number.
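
For readers unfamiliar with the approach, invoking prctl by syscall number
without <sys/prctl.h> can look roughly like the sketch below. Note the
assumptions: it is written against a regular libc (using syscall() from
<unistd.h>) rather than nolibc, raw_prctl() and get_dumpable() are names made
up for this example, and PR_GET_DUMPABLE is used only because it is harmless
to query.

```c
#define _GNU_SOURCE
#include <linux/prctl.h>   /* PR_* constants only; does not declare prctl() */
#include <sys/syscall.h>   /* SYS_prctl */
#include <unistd.h>        /* syscall() */

/* Hypothetical wrapper: issue the prctl syscall directly by number. */
static long raw_prctl(int option, unsigned long arg2, unsigned long arg3,
		      unsigned long arg4, unsigned long arg5)
{
	return syscall(SYS_prctl, option, arg2, arg3, arg4, arg5);
}

/* Query the "dumpable" attribute of the calling process. */
static long get_dumpable(void)
{
	return raw_prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}
```

Because no header declaring prctl() is pulled in, there is nothing for a
second prctl() definition to conflict with.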
Fixes: 7cf6198ce22d ("selftests: Test RISC-V Vector prctl interface")
Signed-off-by: Alexandre Ghiti <alexghiti(a)rivosinc.com>
---
tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c b/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c
index 5cbc392944a6..2c0d2b1126c1 100644
--- a/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c
+++ b/tools/testing/selftests/riscv/vector/vstate_exec_nolibc.c
@@ -1,6 +1,4 @@
// SPDX-License-Identifier: GPL-2.0-only
-#include <sys/prctl.h>
-
#define THIS_PROGRAM "./vstate_exec_nolibc"
int main(int argc, char **argv)
--
2.39.2
[ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ]
Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:

static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
	u8 val;
	int ret;

	ret = kstrtou8(buf, 10, &val);
	if (ret)
		return ret;

	mutex_lock(&test_fw_mutex);
	*(u8 *)cfg = val;
	mutex_unlock(&test_fw_mutex);

	/* Always return full write size even if we didn't consume all */
	return size;
}

static ssize_t config_num_requests_store(struct device *dev,
					 struct device_attribute *attr,
					 const char *buf, size_t count)
{
	int rc;

	mutex_lock(&test_fw_mutex);
	if (test_fw_config->reqs) {
		pr_err("Must call release_all_firmware prior to changing config\n");
		rc = -EINVAL;
		mutex_unlock(&test_fw_mutex);
		goto out;
	}
	mutex_unlock(&test_fw_mutex);

	// NOTE: HERE is the race!!! Function can be preempted!

	// test_fw_config->reqs can change between the release of
	// the lock above and acquire of the lock in the
	// test_dev_config_update_u8()

	rc = test_dev_config_update_u8(buf, count,
				       &test_fw_config->num_requests);

out:
	return rc;
}

static ssize_t config_read_fw_idx_store(struct device *dev,
					struct device_attribute *attr,
					const char *buf, size_t count)
{
	return test_dev_config_update_u8(buf, count,
					 &test_fw_config->read_fw_idx);
}

The function test_dev_config_update_u8() is called from both locked and
unlocked contexts: config_num_requests_store() and
config_read_fw_idx_store() can both be called asynchronously because they
are driver methods, while test_dev_config_update_u8() and its siblings
modify the value pointed to by u8 *cfg or a similar pointer.
To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.
Having two locks wouldn't assure a race-proof mutual exclusion.
This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:

static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
	int ret;

	mutex_lock(&test_fw_mutex);
	ret = __test_dev_config_update_u8(buf, size, cfg);
	mutex_unlock(&test_fw_mutex);

	return ret;
}

doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.
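
The same pattern can be tried outside the kernel. The following is a
userspace sketch only, using a pthread mutex in place of test_fw_mutex; the
names cfg_mutex, struct config, config_update_u8_locked(), config_update_u8()
and num_requests_store() are made up for this illustration and are not the
driver's code.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

static pthread_mutex_t cfg_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical configuration guarded by cfg_mutex. */
struct config {
	uint8_t num_requests;
	int busy;               /* plays the role of test_fw_config->reqs */
};
static struct config cfg;

/* Unlocked primitive: the caller must already hold cfg_mutex. */
static int config_update_u8_locked(const char *buf, uint8_t *field)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf || val < 0 || val > UINT8_MAX)
		return -EINVAL;

	*field = (uint8_t)val;
	return 0;
}

/* Locked wrapper for callers that do not hold cfg_mutex. */
static int config_update_u8(const char *buf, uint8_t *field)
{
	int ret;

	pthread_mutex_lock(&cfg_mutex);
	ret = config_update_u8_locked(buf, field);
	pthread_mutex_unlock(&cfg_mutex);

	return ret;
}

/* The check and the update happen under one lock hold, so no race window. */
static int num_requests_store(const char *buf)
{
	int ret;

	pthread_mutex_lock(&cfg_mutex);
	if (cfg.busy)
		ret = -EINVAL;
	else
		ret = config_update_u8_locked(buf, &cfg.num_requests);
	pthread_mutex_unlock(&cfg_mutex);

	return ret;
}
```

The design point is that the busy check and the write share a single critical
section, while callers without the lock still get a safe entry point.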
Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests")
Cc: Luis R. Rodriguez <mcgrof(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Russ Weight <russell.h.weight(a)intel.com>
Cc: Takashi Iwai <tiwai(a)suse.de>
Cc: Tianfei Zhang <tianfei.zhang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Colin Ian King <colin.i.king(a)gmail.com>
Cc: Randy Dunlap <rdunlap(a)infradead.org>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # v5.4, 4.19
Suggested-by: Dan Carpenter <error27(a)gmail.com>
Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg…
Signed-off-by: Mirsad Todorovac <mirsad.todorovac(a)alu.unizg.hr>
[ This is the patch to fix the race condition in locking for the 5.4, ]
[ 4.19 and 4.4 stable branches. Not all the fixes from the upstream ]
[ commit apply, but those which do are identical to those in the ]
[ upstream commit. ]
---
v2:
bundled locking and ENOSPC patches together.
tested on 5.4 and 4.19 stable.
lib/test_firmware.c | 37 ++++++++++++++++++++++++++++---------
1 file changed, 28 insertions(+), 9 deletions(-)
diff --git a/lib/test_firmware.c b/lib/test_firmware.c
index 38553944e967..92d7195d5b5b 100644
--- a/lib/test_firmware.c
+++ b/lib/test_firmware.c
@@ -301,16 +301,26 @@ static ssize_t config_test_show_str(char *dst,
return len;
}
-static int test_dev_config_update_bool(const char *buf, size_t size,
- bool *cfg)
+static inline int __test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
{
int ret;
- mutex_lock(&test_fw_mutex);
if (strtobool(buf, cfg) < 0)
ret = -EINVAL;
else
ret = size;
+
+ return ret;
+}
+
+static int test_dev_config_update_bool(const char *buf, size_t size,
+ bool *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_bool(buf, size, cfg);
mutex_unlock(&test_fw_mutex);
return ret;
@@ -340,7 +350,7 @@ static ssize_t test_dev_config_show_int(char *buf, int cfg)
return snprintf(buf, PAGE_SIZE, "%d\n", val);
}
-static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+static inline int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
int ret;
long new;
@@ -352,14 +362,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
if (new > U8_MAX)
return -EINVAL;
- mutex_lock(&test_fw_mutex);
*(u8 *)cfg = new;
- mutex_unlock(&test_fw_mutex);
/* Always return full write size even if we didn't consume all */
return size;
}
+static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
+{
+ int ret;
+
+ mutex_lock(&test_fw_mutex);
+ ret = __test_dev_config_update_u8(buf, size, cfg);
+ mutex_unlock(&test_fw_mutex);
+
+ return ret;
+}
+
static ssize_t test_dev_config_show_u8(char *buf, u8 cfg)
{
u8 val;
@@ -392,10 +411,10 @@ static ssize_t config_num_requests_store(struct device *dev,
mutex_unlock(&test_fw_mutex);
goto out;
}
- mutex_unlock(&test_fw_mutex);
- rc = test_dev_config_update_u8(buf, count,
- &test_fw_config->num_requests);
+ rc = __test_dev_config_update_u8(buf, count,
+ &test_fw_config->num_requests);
+ mutex_unlock(&test_fw_mutex);
out:
return rc;
--
2.39.3
It seems that the most critical issue with vm.memfd_noexec=2 (the fact
that passing MFD_EXEC would bypass it entirely[1]) has been fixed in
Andrew's tree[2], but there are still some outstanding issues that need
to be addressed:
* The dmesg warnings are pr_warn_once, which on most systems means that
they will be used up by systemd or some other boot process and
userspace developers will never see them. The original patch posted to
the ML used pr_warn_ratelimited, but the merged patch changed it
(with a comment about it being "per review"). Given that the
current warnings are useless, pr_warn_ratelimited makes far more
sense.
* vm.memfd_noexec=2 shouldn't reject old-style memfd_create(2) syscalls
because it will make it far too difficult to ever migrate. Instead it
should imply MFD_EXEC.
* The ratcheting mechanism for vm.memfd_noexec doesn't make sense as a
security mechanism because a CAP_SYS_ADMIN capable user can create
executable binaries in a hidden tmpfs very easily, not to mention the
many other things they can do.
* The memfd selftests would not exit with a non-zero error code when
certain tests that ran in a forked process (specifically the ones
related to MFD_EXEC and MFD_NOEXEC_SEAL) failed.
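
For the last point, propagating a forked child's failure to the parent
generally comes down to checking the wait status. The sketch below shows one
way to do it; run_in_child(), passing_test() and failing_test() are
hypothetical names for this example and are not taken from memfd_test.c.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run fn() in a forked child. Return 0 only if the child exited normally
 * with status 0; return -1 on fork/wait errors, abnormal termination, or a
 * non-zero exit status.
 */
static int run_in_child(int (*fn)(void))
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0)
		_exit(fn() ? EXIT_FAILURE : EXIT_SUCCESS);

	if (waitpid(pid, &status, 0) != pid)
		return -1;

	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;

	return 0;
}

/* Two trivial stand-in tests: 0 means pass, non-zero means fail. */
static int passing_test(void) { return 0; }
static int failing_test(void) { return 1; }
```

A test runner built this way can exit non-zero whenever run_in_child()
returns -1, so a failure inside the child is no longer silently dropped.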
(This patchset is based on top of Jeff Xu's patches[2] fixing the
MFD_EXEC bug in vm.memfd_noexec=2.)
[1]: https://lore.kernel.org/all/ZJwcsU0vI-nzgOB_@codewreck.org/
[2]: https://lore.kernel.org/all/20230705063315.3680666-1-jeffxu@google.com/
Aleksa Sarai (3):
memfd: cleanups for vm.memfd_noexec handling
memfd: remove racheting feature from vm.memfd_noexec
selftests: memfd: error out test process when child test fails
include/linux/pid_namespace.h | 16 +++------
kernel/pid_sysctl.h | 7 ----
mm/memfd.c | 32 +++++++----------
tools/testing/selftests/memfd/memfd_test.c | 41 ++++++++++++++++++----
4 files changed, 51 insertions(+), 45 deletions(-)
--
2.41.0
As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
"Your app should create sockets with IPPROTO_MPTCP as the proto:
( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
forced to create and use MPTCP sockets instead of TCP ones via the
mptcpize command bundled with the mptcpd daemon."
But the mptcpize (LD_PRELOAD technique) command has some limitations
[2]:
- it doesn't work if the application is not using libc (e.g. GoLang
apps)
- in some envs, it might not be easy to set env vars / change the way
apps are launched, e.g. on Android
- mptcpize needs to be launched with all apps that want MPTCP: we could
have more control from BPF to enable MPTCP only for some apps or all the
ones of a netns or a cgroup, etc.
- it is not in BPF, we cannot talk about it at netdev conf.
So this patchset attempts to use BPF to implement functionality similar to
mptcpize.
The main idea is to add a hook in sys_socket() to change the protocol id
from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
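
The decision such a hook has to make can be sketched as a plain C function.
This is an illustration only: mptcpify() here is an ordinary userspace
function written for this example, not the BPF program or the kernel hook
from the series, and the fallback definition of IPPROTO_MPTCP is there only
for older libc headers that lack it.

```c
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262   /* fallback for older libc headers */
#endif

/*
 * Return the protocol to use for socket(family, type, protocol):
 * TCP stream sockets over IPv4/IPv6 are switched to MPTCP, everything
 * else is left unchanged.
 */
static int mptcpify(int family, int type, int protocol)
{
	if ((family == AF_INET || family == AF_INET6) &&
	    type == SOCK_STREAM &&
	    (protocol == 0 || protocol == IPPROTO_TCP))
		return IPPROTO_MPTCP;

	return protocol;
}
```

An application would then see socket(AF_INET, SOCK_STREAM, 0) behave as if it
had asked for IPPROTO_MPTCP, while UDP or AF_UNIX sockets are unaffected.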
[1]
https://github.com/multipath-tcp/mptcp_net-next/wiki
[2]
https://github.com/multipath-tcp/mptcp_net-next/issues/79
v8:
- drop the additional checks on the 'protocol' value after the
'update_socket_protocol()' call.
v7:
- add __weak and __diag_* for update_socket_protocol.
v6:
- add update_socket_protocol.
v5:
- add bpf_mptcpify helper.
v4:
- use lsm_cgroup/socket_create
v3:
- patch 8: char cmd[128]; -> char cmd[256];
v2:
- Fix build selftests errors reported by CI
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
Geliang Tang (4):
bpf: Add update_socket_protocol hook
selftests/bpf: Use random netns name for mptcp
selftests/bpf: Add two mptcp netns helpers
selftests/bpf: Add mptcpify test
net/mptcp/bpf.c | 17 +++
net/socket.c | 25 ++++
.../testing/selftests/bpf/prog_tests/mptcp.c | 125 ++++++++++++++++--
tools/testing/selftests/bpf/progs/mptcpify.c | 25 ++++
4 files changed, 183 insertions(+), 9 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
--
2.35.3
Hi, Willy
Here is v5, which includes only the ppc parts, with two critical fixups
for the latest gcc 13.1.0 toolchain. Both run and run-user now pass.
Here is the run-user test report:
// with local toolchains
$ for arch in ppc ppc64 ppc64le; do make run-user XARCH=$arch | grep "status: "; done
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
// with latest toolchains
$ for arch in ppc ppc64 ppc64le; do make run-user XARCH=$arch CC=/path/to/gcc-13.1.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc | grep status; file nolibc-test; done
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
nolibc-test: ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1 (SYSV), statically linked, stripped
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
nolibc-test: ELF 64-bit MSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, stripped
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
nolibc-test: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically linked, stripped
Since the missing serial console enabling patch [1] for ppc32 has
already received a Reviewed-by line from the ppc maintainer, the ppc
defconfig now aligns with the others, so the nolibc-test-config related
changes can simply move to the next tinyconfig series.
Based on v4 [2], besides removing several nolibc-test-config related
patches, two bugs with the latest gcc 13.1.0 have been fixed.
Changes from v4 --> v5:
* tools/nolibc: add support for powerpc64
selftests/nolibc: add XARCH and ARCH mapping support
selftests/nolibc: add test support for ppc64
selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
Almost the same as v4.
* tools/nolibc: add support for powerpc
For 32-bit PowerPC, with newer gcc compilers (e.g. gcc 13.1.0),
"omit-frame-pointer" fails with __attribute__((no_stack_protector)) but
works with __attribute__((__optimize__("-fno-stack-protector"))).
Use the latter for ppc32 to work around the issue.
* selftests/nolibc: add test support for ppc
Add default CFLAGS for ppc to allow building with the
latest powerpc64-linux-gcc toolchain from
https://mirrors.edge.kernel.org/pub/tools/crosstool/
* selftests/nolibc: add test support for ppc64le
Align with the kernel: prefer the elfv2 ABI to the elfv1 ABI when the
toolchain supports it; otherwise, the ABI-mismatched binary will not run.
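
The attribute choice described for 32-bit PowerPC can be expressed with a
small preprocessor switch. The sketch below is an assumption about how one
might write it, not the code from the patch: HAVE_NO_SP_ATTR,
NO_STACK_PROTECTOR and entry_like() are names invented for this example.

```c
/* Detect whether the compiler knows the no_stack_protector attribute. */
#if defined(__has_attribute)
#  if __has_attribute(no_stack_protector)
#    define HAVE_NO_SP_ATTR 1
#  endif
#endif

#if defined(__powerpc__) && !defined(__powerpc64__)
/* 32-bit PowerPC: always use the optimize attribute, per the note above. */
#  define NO_STACK_PROTECTOR \
	__attribute__((__optimize__("-fno-stack-protector")))
#elif defined(HAVE_NO_SP_ATTR)
#  define NO_STACK_PROTECTOR __attribute__((no_stack_protector))
#else
/* Older compilers without no_stack_protector. */
#  define NO_STACK_PROTECTOR \
	__attribute__((__optimize__("-fno-stack-protector")))
#endif

/* Hypothetical function that must be built without a stack protector. */
NO_STACK_PROTECTOR
static int entry_like(int x)
{
	return x + 1;
}
```

On non-PowerPC builds this simply picks whichever attribute the compiler
supports, so the same source can be compiled everywhere.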
Best regards,
Zhangjin Wu
---
[1]: https://lore.kernel.org/lkml/bb7b5f9958b3e3a20f6573ff7ce7c5dc566e7e32.16909…
[2]: https://lore.kernel.org/lkml/cover.1690916314.git.falcon@tinylab.org/
Zhangjin Wu (8):
tools/nolibc: add support for powerpc
tools/nolibc: add support for powerpc64
selftests/nolibc: add XARCH and ARCH mapping support
selftests/nolibc: add test support for ppc
selftests/nolibc: add test support for ppc64le
selftests/nolibc: add test support for ppc64
selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
tools/include/nolibc/arch-powerpc.h | 213 ++++++++++++++++++++++++
tools/include/nolibc/arch.h | 2 +
tools/testing/selftests/nolibc/Makefile | 74 ++++++--
3 files changed, 277 insertions(+), 12 deletions(-)
create mode 100644 tools/include/nolibc/arch-powerpc.h
--
2.25.1
Hi, Willy, Hi Thomas
v4 mainly adds a new nolibc-test-config target based on your
suggestions and reorders some patches so that nolibc-test-config can
be introduced earlier in the series.
run-user tests for all of the powerpc variants:
$ for arch in ppc ppc64 ppc64le; do make run-user XARCH=$arch | grep status; done
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
165 test(s): 157 passed, 8 skipped, 0 failed => status: warning
and defconfig + run for ppc:
$ make nolibc-test-config XARCH=ppc
$ make run XARCH=ppc
165 test(s): 159 passed, 6 skipped, 0 failed => status: warning
* tools/nolibc: add support for powerpc
tools/nolibc: add support for powerpc64
No change.
* selftests/nolibc: fix up O= option support
selftests/nolibc: add macros to reduce duplicated changes
From the tinyconfig-part1 patchset; required by our nolibc-test-config target.
These let nolibc-test-config use objtree and the kernel-related
macros directly.
* selftests/nolibc: add XARCH and ARCH mapping support
Moved before nolibc-test-config, for the NOLIBC_TEST_CONFIG macro used by
the nolibc-test-config target.
Willy mentioned this twice; it lets nolibc-test-config use the
nolibc-test-$(XARCH).config listed in NOLIBC_TEST_CONFIG directly.
* selftests/nolibc: add nolibc-test-config target
selftests/nolibc: add help for nolibc-test-config target
A new generic nolibc-test-config target is added, which allows enabling
additional options for a top-level config target.
defconfig is reserved as an alias of nolibc-test-config.
As suggested by Thomas and Willy.
* selftests/nolibc: add test support for ppc
selftests/nolibc: add test support for ppc64le
selftests/nolibc: add test support for ppc64
Renamed from $(XARCH).config to nolibc-test-$(XARCH).config
As suggested by Willy.
* selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
Moved here as suggested by Willy.
Best regards,
Zhangjin
---
[1]: https://lore.kernel.org/lkml/cover.1690468707.git.falcon@tinylab.org/
Zhangjin Wu (12):
tools/nolibc: add support for powerpc
tools/nolibc: add support for powerpc64
selftests/nolibc: fix up O= option support
selftests/nolibc: add macros to reduce duplicated changes
selftests/nolibc: add XARCH and ARCH mapping support
selftests/nolibc: add nolibc-test-config target
selftests/nolibc: add help for nolibc-test-config target
selftests/nolibc: add test support for ppc
selftests/nolibc: add test support for ppc64le
selftests/nolibc: add test support for ppc64
selftests/nolibc: allow customize CROSS_COMPILE by architecture
selftests/nolibc: customize CROSS_COMPILE for 32/64-bit powerpc
tools/include/nolibc/arch-powerpc.h | 202 ++++++++++++++++++
tools/include/nolibc/arch.h | 2 +
tools/testing/selftests/nolibc/Makefile | 157 ++++++++++----
.../nolibc/configs/nolibc-test-ppc.config | 3 +
4 files changed, 327 insertions(+), 37 deletions(-)
create mode 100644 tools/include/nolibc/arch-powerpc.h
create mode 100644 tools/testing/selftests/nolibc/configs/nolibc-test-ppc.config
--
2.25.1
As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
"Your app should create sockets with IPPROTO_MPTCP as the proto:
( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
forced to create and use MPTCP sockets instead of TCP ones via the
mptcpize command bundled with the mptcpd daemon."
But the mptcpize (LD_PRELOAD technique) command has some limitations
[2]:
- it doesn't work if the application is not using libc (e.g. GoLang
apps)
- in some envs, it might not be easy to set env vars / change the way
apps are launched, e.g. on Android
- mptcpize needs to be launched with all apps that want MPTCP: we could
have more control from BPF to enable MPTCP only for some apps or all the
ones of a netns or a cgroup, etc.
- it is not in BPF, we cannot talk about it at netdev conf.
So this patchset attempts to use BPF to implement functionality similar to
mptcpize.
The main idea is to add a hook in sys_socket() to change the protocol id
from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
[1]
https://github.com/multipath-tcp/mptcp_net-next/wiki
[2]
https://github.com/multipath-tcp/mptcp_net-next/issues/79
v7:
- add __weak and __diag_* for update_socket_protocol.
v6:
- add update_socket_protocol.
v5:
- add bpf_mptcpify helper.
v4:
- use lsm_cgroup/socket_create
v3:
- patch 8: char cmd[128]; -> char cmd[256];
v2:
- Fix build selftests errors reported by CI
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
Geliang Tang (6):
net: socket: add update_socket_protocol hook
bpf: Register mptcp modret set
selftests/bpf: Add mptcpify program
selftests/bpf: use random netns name for mptcp
selftests/bpf: add two mptcp netns helpers
selftests/bpf: Add mptcpify selftest
net/mptcp/bpf.c | 17 +++
net/socket.c | 26 ++++
.../testing/selftests/bpf/prog_tests/mptcp.c | 125 ++++++++++++++++--
tools/testing/selftests/bpf/progs/mptcpify.c | 25 ++++
4 files changed, 184 insertions(+), 9 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/mptcpify.c
--
2.35.3