Hi Linus,
Please pull this kselftest fixes update for Linux 6.10.
This kselftest fixes update for Linux 6.10 consists of fixes for clang
build failures in the timens and vDSO tests, and fixes to the vDSO Makefile.
Note: the Makefile fixes are included to avoid conflicts during the 6.11
merge window.
diff is attached.
thanks,
-- Shuah
----------------------------------------------------------------
The following changes since commit 48236960c06d32370bfa6f2cc408e786873262c8:
selftests/resctrl: Fix non-contiguous CBM for AMD (2024-06-26 13:22:34 -0600)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux_kselftest-fixes-6.10
for you to fetch changes up to 66cde337fa1b7c6cf31f856fa015bd91a4d383e7:
selftests/vDSO: remove duplicate compiler invocations from Makefile (2024-07-05 14:12:34 -0600)
----------------------------------------------------------------
linux_kselftest-fixes-6.10
This kselftest fixes update for Linux 6.10 consists of fixes to clang
build failures to timerns, vDSO tests and fixes to vDSO makefile.
----------------------------------------------------------------
John Hubbard (4):
selftest/timerns: fix clang build failures for abs() calls
selftests/vDSO: fix clang build errors and warnings
selftests/vDSO: remove partially duplicated "all:" target in Makefile
selftests/vDSO: remove duplicate compiler invocations from Makefile
tools/testing/selftests/timens/exec.c | 6 ++---
tools/testing/selftests/timens/timer.c | 2 +-
tools/testing/selftests/timens/timerfd.c | 2 +-
tools/testing/selftests/timens/vfork_exec.c | 4 +--
tools/testing/selftests/vDSO/Makefile | 29 +++++++++-------------
tools/testing/selftests/vDSO/parse_vdso.c | 16 ++++++++----
.../selftests/vDSO/vdso_standalone_test_x86.c | 18 ++++++++++++--
7 files changed, 46 insertions(+), 31 deletions(-)
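For readers unfamiliar with the abs() class of clang failures named in the shortlog: clang's -Wabsolute-value diagnostic fires when abs(), which takes an int, is given a wider argument such as a long. A minimal sketch of the usual remedy follows, assuming the fix is to pick the width-matched variant (labs() for long). The helper name within_tolerance() is hypothetical and is not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: report whether two time_t values differ by at most
 * `tolerance` seconds. The difference is held in a long, so labs() is used
 * instead of abs(), whose parameter is only an int.
 */
static bool within_tolerance(time_t a, time_t b, long tolerance)
{
	long delta = (long)(a - b);

	assert(tolerance >= 0);
	return labs(delta) <= tolerance;
}
```

A call such as within_tolerance(now.tv_sec, expected.tv_sec, 5) then builds without the truncation warning on targets where time_t is a long.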
----------------------------------------------------------------
For cgroup v1, if RT_GROUP_SCHED is turned on, any cgroup in the "cpu"
hierarchy needs an RT budget assigned; otherwise the processes in it will not
be able to get RT at all. The problem with RT group scheduling is that it
requires the budget to be assigned, but there's no way we could assign a
default budget, since the values to assign are both upper and lower time
limits, are absolute, and need to sum up to < 1 for each individual cgroup.
That means we cannot really come up with values that would work by default in
the general case.[1]
For cgroup v2, it's almost unusable as well. If it is turned on, the cpu
controller can only be enabled when all RT processes are in the root cgroup.
But placing all RT processes in the same cgroup loses the benefits of
cgroup v2.
Red Hat, Gentoo, Arch Linux and Debian all disable it. systemd also doesn't
support it.[2]
I leave tools/testing/selftests/bpf/config.{s390x,aarch64} untouched because
I don't know whether bpf testing requires it.
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1229700
[2]: https://github.com/systemd/systemd/issues/13781#issuecomment-549164383
Celeste Liu (6):
riscv: defconfig: drop RT_GROUP_SCHED=y
loongarch: defconfig: drop RT_GROUP_SCHED=y
mips: defconfig: drop RT_GROUP_SCHED=y from generic/db1xxx/eyeq5
powerpc: defconfig: drop RT_GROUP_SCHED=y from ppc6xx_defconfig
sh: defconfig: drop RT_GROUP_SCHED=y from sdk7786/urquell
arm: defconfig: drop RT_GROUP_SCHED=y from bcm2835/tegra/omap2plus
arch/arm/configs/bcm2835_defconfig | 1 -
arch/arm/configs/omap2plus_defconfig | 1 -
arch/arm/configs/tegra_defconfig | 1 -
arch/loongarch/configs/loongson3_defconfig | 1 -
arch/mips/configs/db1xxx_defconfig | 1 -
arch/mips/configs/eyeq5_defconfig | 1 -
arch/mips/configs/generic_defconfig | 1 -
arch/powerpc/configs/ppc6xx_defconfig | 1 -
arch/riscv/configs/defconfig | 1 -
arch/sh/configs/sdk7786_defconfig | 1 -
arch/sh/configs/urquell_defconfig | 1 -
11 files changed, 11 deletions(-)
--
2.45.1
The randomize() function opens /dev/urandom but never closes it, so the
file descriptor is leaked. Close it once the read is done.
Signed-off-by: Liu Jing <liujing(a)cmss.chinamobile.com>
---
tools/testing/selftests/net/tcp_mmap.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/net/tcp_mmap.c b/tools/testing/selftests/net/tcp_mmap.c
index 4fcce5150850..ab305e262d0a 100644
--- a/tools/testing/selftests/net/tcp_mmap.c
+++ b/tools/testing/selftests/net/tcp_mmap.c
@@ -438,6 +438,7 @@ static void randomize(void *target, size_t count)
perror("read /dev/urandom");
exit(1);
}
+ close(urandom);
}
int main(int argc, char *argv[])
--
2.33.0
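The open/read/close pairing the patch above restores can be sketched as a stand-alone helper. This is an illustration only: the name fill_random() is hypothetical, and unlike the selftest (which calls exit(1) on failure) it returns an error code so that every path after a successful open() closes the descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill `target` with `count` bytes from /dev/urandom.
 * Returns 0 on success and -1 on failure. The descriptor is closed on every
 * path that follows a successful open().
 */
static int fill_random(void *target, size_t count)
{
	unsigned char *p = target;
	int fd;

	assert(target != NULL || count == 0);

	fd = open("/dev/urandom", O_RDONLY);
	if (fd < 0)
		return -1;

	while (count > 0) {
		ssize_t n = read(fd, p, count);

		if (n <= 0) {
			close(fd);	/* do not leak the fd on error */
			return -1;
		}
		p += n;
		count -= (size_t)n;
	}

	close(fd);
	return 0;
}
```

In a short-lived test binary the leak is harmless in practice, but closing the descriptor keeps the helper safe to call repeatedly.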
+ Dave Miller, Jakub Kicinski, Paolo Abeni, Shuah Khan,
linux-kselftest(a)vger.kernel.org
On Mon, Jul 08, 2024 at 09:04:05PM +0000, zijianzhang(a)bytedance.com wrote:
> From: Zijian Zhang <zijianzhang(a)bytedance.com>
>
> We update selftests/net/msg_zerocopy.c to accommodate the new mechanism;
> cfg_notification_limit has the same semantics for both methods. Test
> results are as follows. We update skb_orphan_frags_rx to be the same as
> skb_orphan_frags to support zerocopy in the localhost test.
>
> cfg_notification_limit = 1: both methods get notifications after 1 call
> to sendmsg. In this case, the new method has around 17% cpu savings in TCP
> and 23% cpu savings in UDP.
> +---------------------+---------+---------+---------+---------+
> | Test Type / Protocol| TCP v4 | TCP v6 | UDP v4 | UDP v6 |
> +---------------------+---------+---------+---------+---------+
> | ZCopy (MB) | 7523 | 7706 | 7489 | 7304 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy (MB) | 8834 | 8993 | 9053 | 9228 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy / ZCopy | 117.42% | 116.70% | 120.88% | 126.34% |
> +---------------------+---------+---------+---------+---------+
>
> cfg_notification_limit = 32: both methods get notifications after 32 calls
> to sendmsg, which means more chances to coalesce notifications and less
> overhead of poll + recvmsg for the original method. In this case, the new
> method has around 7% cpu savings in TCP and slightly better cpu usage in
> UDP. In the context of the selftest, TCP notifications are more likely to
> be out of order than UDP ones, so it's easier to coalesce more notifications
> in UDP. The original method can get one notification with a range of 32 in a
> recvmsg most of the time. In TCP, most notifications' range is around 2, so
> the original method needs around 16 recvmsgs to get notified in one round.
> That's the reason for the "New ZCopy / ZCopy" difference between TCP and UDP.
> +---------------------+---------+---------+---------+---------+
> | Test Type / Protocol| TCP v4 | TCP v6 | UDP v4 | UDP v6 |
> +---------------------+---------+---------+---------+---------+
> | ZCopy (MB) | 8842 | 8735 | 10072 | 9380 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy (MB) | 9366 | 9477 | 10108 | 9385 |
> +---------------------+---------+---------+---------+---------+
> | New ZCopy / ZCopy | 106.00% | 108.28% | 100.31% | 100.01% |
> +---------------------+---------+---------+---------+---------+
>
> In conclusion, when the notification interval is small or notifications are
> hard to coalesce, the new mechanism is highly recommended. Otherwise,
> the performance gain from the new mechanism is very limited.
>
> Signed-off-by: Zijian Zhang <zijianzhang(a)bytedance.com>
> Signed-off-by: Xiaochun Lu <xiaochun.lu(a)bytedance.com>
> ---
> tools/testing/selftests/net/msg_zerocopy.c | 111 ++++++++++++++++++--
> tools/testing/selftests/net/msg_zerocopy.sh | 1 +
> 2 files changed, 105 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/net/msg_zerocopy.c b/tools/testing/selftests/net/msg_zerocopy.c
...
> @@ -466,6 +504,44 @@ static void do_recv_completions(int fd, int domain)
> sends_since_notify = 0;
> }
>
> +static void do_recv_completions2(void)
> +{
> + struct cmsghdr *cm = (struct cmsghdr *)zc_ckbuf;
> + struct zc_info *zc_info;
> + __u32 hi, lo, range;
> + __u8 zerocopy;
> + int i;
> +
> + zc_info = (struct zc_info *)CMSG_DATA(cm);
> + for (i = 0; i < zc_info->size; i++) {
> + hi = zc_info->arr[i].hi;
> + lo = zc_info->arr[i].lo;
> + zerocopy = zc_info->arr[i].zerocopy;
> + range = hi - lo + 1;
> +
> + if (cfg_verbose && lo != next_completion)
> + fprintf(stderr, "gap: %u..%u does not append to %u\n",
> + lo, hi, next_completion);
> + next_completion = hi + 1;
> +
> + if (zerocopied == -1)
> + zerocopied = zerocopy;
> + else if (zerocopied != zerocopy) {
> + fprintf(stderr, "serr: inconsistent\n");
> + zerocopied = zerocopy;
> + }
nit: If any arms of a conditional have {}, then all arms should have them
> +
> + completions += range;
> +
> + if (cfg_verbose >= 2)
> + fprintf(stderr, "completed: %u (h=%u l=%u)\n",
> + range, hi, lo);
> + }
> +
> + sends_since_notify = 0;
> + added_zcopy_info = false;
> +}
...
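The brace nit raised in the review above refers to the kernel coding-style rule that if one branch of a conditional uses braces, all branches should. A minimal sketch of the quoted zerocopied check with braces on both arms follows. The helper update_zerocopied() and its return value are hypothetical and exist only to make the fragment self-contained.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-alone version of the consistency check. `seen` holds
 * the previously observed zerocopy flag, or -1 if none has been seen yet.
 * Returns 1 if the flag changed between notifications, 0 otherwise.
 */
static int update_zerocopied(int *seen, int zerocopy)
{
	int inconsistent = 0;

	assert(seen != NULL);

	/* Both arms carry braces because the second arm needs them. */
	if (*seen == -1) {
		*seen = zerocopy;
	} else if (*seen != zerocopy) {
		fprintf(stderr, "serr: inconsistent\n");
		*seen = zerocopy;
		inconsistent = 1;
	}

	return inconsistent;
}
```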
From: Geliang Tang <tanggeliang(a)kylinos.cn>
When running this BPF selftest (./test_progs -t sockmap_basic) on a LoongArch
platform, a kernel panic occurs:
'''
Oops[#1]:
CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ #18
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
... ...
ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 0000000c (PPLV0 +PIE +PWE)
EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
BADV: 0000000000000040
PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
Stack : ...
...
Call Trace:
[<9000000004162774>] copy_page_to_iter+0x74/0x1c0
[<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
[<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
[<90000000049aae34>] inet_recvmsg+0x54/0x100
[<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
[<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
[<900000000481e27c>] sys_recvfrom+0x1c/0x40
[<9000000004c076ec>] do_syscall+0x8c/0xc0
[<9000000003731da4>] handle_syscall+0xc4/0x160
Code: ...
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception
Kernel relocated by 0x3510000
.text @ 0x9000000003710000
.data @ 0x9000000004d70000
.bss @ 0x9000000006469400
---[ end Kernel panic - not syncing: Fatal exception ]---
'''
This crash happens every time the sockmap_skb_verdict_shutdown subtest in
sockmap_basic is run.
The crash occurs because a NULL pointer is passed to page_address() in
sk_msg_recvmsg(). Because the implementation differs between architectures,
page_address(NULL) triggers a panic on LoongArch but not on x86. So this bug
was hidden on x86 for a while, but is now exposed on LoongArch.
The root cause is that a zero-length skb (skb->len == 0) is put on the queue.
This zero-length skb is a TCP FIN packet, which is sent by shutdown(),
invoked in test_sockmap_skb_verdict_shutdown():
	shutdown(p1, SHUT_WR);
In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no
page is put into this sge (see sg_set_page()), but this empty sge is still
queued onto the ingress_msg list.
Then in sk_msg_recvmsg(), this empty sge is used, and sg_page(sge) returns a
NULL page. This NULL page is passed to copy_page_to_iter(), which passes it
to kmap_local_page() and on to page_address(), and the kernel panics.
To solve this, skip the zero-length skb: in sk_msg_recvmsg(), if copy is
zero, it is a zero-length skb, so do not invoke copy_page_to_iter(). The
-EFAULT return that copy_page_to_iter() used to trigger is kept, because
tcp_bpf.c uses it to check for is_fin.
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Suggested-by: John Fastabend <john.fastabend(a)gmail.com>
Signed-off-by: Geliang Tang <tanggeliang(a)kylinos.cn>
---
v5:
- update v5 as John suggested.
- skmsg: skip zero length skb in sk_msg_recvmsg
v4:
- skmsg: skip empty sge in sk_msg_recvmsg
v3:
- skmsg: prevent empty ingress skb from enqueuing
v2:
- skmsg: null check for sg_page in sk_msg_recvmsg
---
net/core/skmsg.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index fd20aae30be2..bbf40b999713 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -434,7 +434,8 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
page = sg_page(sge);
if (copied + copy > len)
copy = len - copied;
- copy = copy_page_to_iter(page, sge->offset, copy, iter);
+ if (copy)
+ copy = copy_page_to_iter(page, sge->offset, copy, iter);
if (!copy) {
copied = copied ? copied : -EFAULT;
goto out;
--
2.43.0
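The control flow of the fix above can be modelled in user space. The sketch below is not kernel code: struct seg, copy_seg() and recv_segs() are hypothetical stand-ins for the scatterlist entry, copy_page_to_iter() and the copy loop in sk_msg_recvmsg(). It only illustrates that a zero-length entry is never dereferenced while the -EFAULT result is preserved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a scatterlist entry; data is NULL when len is 0. */
struct seg {
	const char *data;
	size_t len;
};

/* Stand-in for copy_page_to_iter(): must never see a NULL data pointer. */
static size_t copy_seg(char *dst, const struct seg *s, size_t n)
{
	assert(s->data != NULL);
	memcpy(dst, s->data, n);
	return n;
}

/*
 * Copy up to `len` bytes from `segs` into `buf`. Returns the number of bytes
 * copied, or -EFAULT if an empty segment is hit before anything was copied.
 */
static long recv_segs(const struct seg *segs, size_t nsegs, char *buf, size_t len)
{
	size_t copied = 0;

	for (size_t i = 0; i < nsegs && copied < len; i++) {
		size_t copy = segs[i].len;

		if (copied + copy > len)
			copy = len - copied;
		/* The fix: only touch the segment when there is data to copy. */
		if (copy)
			copy = copy_seg(buf + copied, &segs[i], copy);
		if (!copy)
			return copied ? (long)copied : -EFAULT;
		copied += copy;
	}

	return (long)copied;
}
```

A lone empty segment, the analogue of the FIN skb, yields -EFAULT without copy_seg() ever being called.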
Changes v3:
- Reworked patch 2.
- Changed minor things in patch 1 like function name and made
corrections to the patch message.
Changes v2:
- Removed patches 2 and 3 since now this part will be supported by the
kernel.
Sub-NUMA Clustering (SNC) allows splitting CPU cores, caches and memory
into multiple NUMA nodes. When enabled, NUMA-aware applications can
achieve better performance on bigger server platforms.
SNC support in the kernel is currently in review [1]. With SNC enabled
and kernel support in place all the tests will function normally (aside
from effective cache size). There might be a problem when SNC is enabled
but the system is still using an older kernel version without SNC
support. Currently the only message displayed in that situation is a
guess that SNC might be enabled and is causing issues. That message is also
displayed whenever the test fails on an Intel platform.
Add a mechanism to discover kernel support for SNC which will add more
meaning and certainty to the error message.
Add runtime SNC mode detection and verify how reliable that information
is.
The series was tested on Ice Lake server platforms with SNC disabled, SNC-2
and SNC-4. The tests were also run with and without kernel support for
SNC.
Series applies cleanly on kselftest/next.
[1] https://lore.kernel.org/all/20240628215619.76401-1-tony.luck@intel.com/
Previous versions of this series:
[v1] https://lore.kernel.org/all/cover.1709721159.git.maciej.wieczor-retman@inte…
[v2] https://lore.kernel.org/all/cover.1715769576.git.maciej.wieczor-retman@inte…
Maciej Wieczor-Retman (2):
selftests/resctrl: Adjust effective L3 cache size with SNC enabled
selftests/resctrl: Adjust SNC support messages
tools/testing/selftests/resctrl/cache.c | 3 +
tools/testing/selftests/resctrl/cmt_test.c | 4 +-
tools/testing/selftests/resctrl/mba_test.c | 4 +
tools/testing/selftests/resctrl/mbm_test.c | 6 +-
tools/testing/selftests/resctrl/resctrl.h | 8 +
.../testing/selftests/resctrl/resctrl_tests.c | 7 +
tools/testing/selftests/resctrl/resctrlfs.c | 138 ++++++++++++++++++
7 files changed, 166 insertions(+), 4 deletions(-)
--
2.45.2
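To make the "effective cache size" remark in the cover letter concrete, here is a rough sketch of one way such an adjustment could work. It rests on two assumptions that the cover letter does not spell out: that the SNC mode can be estimated as the ratio of CPUs sharing an L3 cache to CPUs in NUMA node 0, and that the effective L3 size is the total size divided by that ratio. Both function names are hypothetical.

```c
#include <assert.h>

/*
 * Assumed heuristic: with SNC-N enabled, one L3 cache spans N NUMA nodes, so
 * the CPUs sharing the L3 outnumber the CPUs in node 0 by a factor of N.
 * Falls back to 1 (SNC disabled) when the counts do not suggest a split.
 */
static unsigned int snc_nodes_per_l3(unsigned int l3_cpus, unsigned int node0_cpus)
{
	if (node0_cpus == 0 || l3_cpus <= node0_cpus)
		return 1;
	return l3_cpus / node0_cpus;
}

/* Assumed adjustment: each SNC node sees an equal share of the L3 cache. */
static unsigned long effective_l3_size(unsigned long l3_size, unsigned int snc_nodes)
{
	assert(snc_nodes > 0);
	return l3_size / snc_nodes;
}
```

Under these assumptions, a 48 MiB L3 shared by 64 CPUs with 32 CPUs in node 0 would be treated as SNC-2 with 24 MiB of effective cache per node.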
Even if a vgem device is configured in, we will skip the import_vgem_fd()
test almost every time.
TAP version 13
1..11
# Testing heap: system
# =======================================
# Testing allocation and importing:
ok 1 # SKIP Could not open vgem -1
The problem is that we use the DRM_IOCTL_VERSION ioctl to query the driver
version information but leave the name field a non-null-terminated string.
Terminate it properly to actually test against the vgem device.
Signed-off-by: Zenghui Yu <yuzenghui(a)huawei.com>
---
tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
index 5f541522364f..2fcc74998fa9 100644
--- a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
+++ b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
@@ -32,6 +32,8 @@ static int check_vgem(int fd)
if (ret)
return 0;
+ name[4] = '\0';
+
return !strcmp(name, "vgem");
}
--
2.33.0
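The termination issue fixed above is easy to reproduce outside the DRM ioctl path. The sketch below assumes a buffer that was filled with name_len bytes and no trailing NUL, as the commit message describes. The helper name_matches() is hypothetical, and it adds a capacity check that the selftest does not need because its buffer size is fixed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: `buf` has room for `cap` bytes, of which the first
 * `name_len` were filled in without a terminating NUL. Terminate the string
 * before comparing, since strcmp() reads until it finds a NUL byte.
 */
static bool name_matches(char *buf, size_t cap, size_t name_len, const char *expected)
{
	assert(buf != NULL && expected != NULL);

	if (name_len >= cap)
		return false;	/* no room for the terminator */

	buf[name_len] = '\0';
	return strcmp(buf, expected) == 0;
}
```

An alternative that avoids writing to the buffer is to compare lengths first and then use memcmp() over name_len bytes.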