November 2023 - Linux-kselftest-mirror

[RFC 0/7] Add SIOV virtual device support

by Yi Liu

Intel SIOV allows creating virtual devices whose vRID is represented by
a PASID of a physical device. Such a device is called a SIOV virtual
device in this series. It can be bound to an iommufd just as a physical
device can, and later be attached to an IOAS/hwpt using that PASID. Such
a PASID is called a default PASID.

iommufd already supports PASID attach [1]. So a simple way to support
SIOV virtual device attachment is to let the device driver call
iommufd_device_pasid_attach() and pass in the default PASID for the
virtual device. This should work for now, but it may become a problem if
the iommufd core wants to differentiate default PASIDs from other kinds
of PASIDs (e.g. a PASID given by userspace). When page requests are later
forwarded to userspace, default PASIDs are not supposed to be sent to
userspace, as they are mainly used by the SIOV device driver.

For this reason, this series adds a new API to bind the default PASID to
iommufd, and extends iommufd_device_attach() to convert the attachment
into a PASID attach with the default PASID. Device drivers (e.g. VFIO)
that support SIOV shall call the APIs below to interact with iommufd:

  - iommufd_device_bind_pasid(): Bind virtual device (a pasid of a device)
                                 to iommufd;
  - iommufd_device_attach(): Attach a SIOV virtual device to IOAS/HWPT;
  - iommufd_device_replace(): Replace IOAS/HWPT of a SIOV virtual device;
  - iommufd_device_detach(): Detach IOAS/HWPT of a SIOV virtual device;
  - iommufd_device_unbind(): Unbind virtual device from iommufd;

For vfio devices, the device drivers that support SIOV should:

  - use the API below to register the vdev for a SIOV virtual device:
        vfio_register_pasid_iommu_dev()
  - use the API below to bind the vdev to iommufd in the .bind_iommufd()
    callback:
        iommufd_device_bind_pasid()
  - allocate the pasid by themselves before calling
    iommufd_device_bind_pasid()

Complete code can be found at [2].

[1] https://lore.kernel.org/linux-iommu/20230926092651.17041-1-yi.l.liu@intel.c…
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_pasid_siov

Regards,
	Yi Liu

Kevin Tian (5):
  iommufd: Handle unsafe interrupts in a separate function
  iommufd: Introduce iommufd_alloc_device()
  iommufd: Add iommufd_device_bind_pasid()
  iommufd: Support attach/replace for SIOV virtual device {dev, pasid}
  vfio: Add vfio_register_pasid_iommu_dev()

Yi Liu (2):
  iommufd/selftest: Extend IOMMU_TEST_OP_MOCK_DOMAIN to pass in pasid
  iommufd/selftest: Add test coverage for SIOV virtual device

 drivers/iommu/iommufd/device.c                | 163 ++++++++++++++----
 drivers/iommu/iommufd/iommufd_private.h       |   7 +
 drivers/iommu/iommufd/iommufd_test.h          |   2 +
 drivers/iommu/iommufd/selftest.c              |  10 +-
 drivers/vfio/group.c                          |  18 ++
 drivers/vfio/vfio.h                           |   8 +
 drivers/vfio/vfio_main.c                      |  10 ++
 include/linux/iommufd.h                       |   3 +
 include/linux/vfio.h                          |   1 +
 tools/testing/selftests/iommu/iommufd.c       |  75 ++++++--
 .../selftests/iommu/iommufd_fail_nth.c        |  42 ++++-
 tools/testing/selftests/iommu/iommufd_utils.h |  21 ++-
 12 files changed, 296 insertions(+), 64 deletions(-)

--
2.34.1

[PATCH] selftests/memfd: fix a memleak

by zhujun2

Memory allocated within a function should be released before the
function returns; otherwise a memory leak will occur.

Signed-off-by: zhujun2 <zhujun2(a)cmss.chinamobile.com>
---
 tools/testing/selftests/memfd/fuse_test.c  |  3 +++
 tools/testing/selftests/memfd/memfd_test.c | 10 ++++++++++
 2 files changed, 13 insertions(+)

diff --git a/tools/testing/selftests/memfd/fuse_test.c b/tools/testing/selftests/memfd/fuse_test.c
index 93798c8c5d54..f302294a9001 100644
--- a/tools/testing/selftests/memfd/fuse_test.c
+++ b/tools/testing/selftests/memfd/fuse_test.c
@@ -205,6 +205,7 @@ static pid_t spawn_sealing_thread(void)
 	stack = malloc(STACK_SIZE);
 	if (!stack) {
 		printf("malloc(STACK_SIZE) failed: %m\n");
+		free(stack);
 		abort();
 	}
 
@@ -214,9 +215,11 @@ static pid_t spawn_sealing_thread(void)
 			  NULL);
 	if (pid < 0) {
 		printf("clone() failed: %m\n");
+		free(stack);
 		abort();
 	}
 
+	free(stack);
 	return pid;
 }
 
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 3df008677239..917ffc210723 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -658,15 +658,18 @@ static void mfd_assert_grow_write(int fd)
 	buf = malloc(mfd_def_size * 8);
 	if (!buf) {
 		printf("malloc(%zu) failed: %m\n", mfd_def_size * 8);
+		free(buf);
 		abort();
 	}
 
 	l = pwrite(fd, buf, mfd_def_size * 8, 0);
 	if (l != (mfd_def_size * 8)) {
 		printf("pwrite() failed: %m\n");
+		free(buf);
 		abort();
 	}
 
+	free(buf);
 	mfd_assert_size(fd, mfd_def_size * 8);
 }
 
@@ -682,14 +685,18 @@ static void mfd_fail_grow_write(int fd)
 	buf = malloc(mfd_def_size * 8);
 	if (!buf) {
 		printf("malloc(%zu) failed: %m\n", mfd_def_size * 8);
+		free(buf);
 		abort();
 	}
 
 	l = pwrite(fd, buf, mfd_def_size * 8, 0);
 	if (l == (mfd_def_size * 8)) {
 		printf("pwrite() didn't fail as expected\n");
+		free(buf);
 		abort();
 	}
+
+	free(buf);
 }
 
 static void mfd_assert_mode(int fd, int mode)
@@ -771,15 +778,18 @@ static pid_t spawn_thread(unsigned int flags, int (*fn)(void *), void *arg)
 	stack = malloc(STACK_SIZE);
 	if (!stack) {
 		printf("malloc(STACK_SIZE) failed: %m\n");
+		free(stack);
 		abort();
 	}
 
 	pid = clone(fn, stack + STACK_SIZE, SIGCHLD | flags, arg);
 	if (pid < 0) {
 		printf("clone() failed: %m\n");
+		free(stack);
 		abort();
 	}
 
+	free(stack);
 	return pid;
 }
 
--
2.17.1

[PATCH bpf-next v3 0/4] selftests/bpf: Update multiple prog_tests to use ASSERT_ macros

by Yuran Pereira

Multiple files/programs in `tools/testing/selftests/bpf/prog_tests/` still
heavily use the `CHECK` macro, even when better `ASSERT_` alternatives are
available.

As Yonghong Song already pointed out [1], in the bpf selftests the
ASSERT_* series of macros is preferred over the CHECK macro.

This patchset replaces uses of the `CHECK(` macro with the equivalent
`ASSERT_` family of macros in the following prog_tests:
- bind_perm.c
- bpf_obj_id.c
- bpf_tcp_ca.c
- vmlinux.c

[1] https://lore.kernel.org/lkml/0a142924-633c-44e6-9a92-2dc019656bf2@linux.dev

Changes in v3:
- Addressed the following points mentioned by Yonghong Song
  - Improved `bpf_map_lookup_elem` assertion in bpf_tcp_ca.
  - Replaced assertion introduced in v2 with one that checks `thread_ret`
    instead of `pthread_join`. This ensures that `server`'s return value
    (thread_ret) is the one being checked, as opposed to `pthread_join`'s
    return value, since the latter is less likely to fail.

Changes in v2:
- Fixed pthread_join assertion that broke the previous test

Previous versions:
v2 - https://lore.kernel.org/lkml/GV1PR10MB6563AECF8E94798A1E5B36A4E8B6A@GV1PR10…
v1 - https://lore.kernel.org/lkml/GV1PR10MB6563FCFF1C5DEBE84FEA985FE8B0A@GV1PR10…

Yuran Pereira (4):
  Replaces the usage of CHECK calls for ASSERTs in bpf_tcp_ca
  Replaces the usage of CHECK calls for ASSERTs in bind_perm
  Replaces the usage of CHECK calls for ASSERTs in bpf_obj_id
  selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in
    vmlinux

 .../selftests/bpf/prog_tests/bind_perm.c      |   6 +-
 .../selftests/bpf/prog_tests/bpf_obj_id.c     | 204 +++++++-----------
 .../selftests/bpf/prog_tests/bpf_tcp_ca.c     |  48 ++---
 .../selftests/bpf/prog_tests/vmlinux.c        |  16 +-
 4 files changed, 105 insertions(+), 169 deletions(-)

--
2.25.1

[PATCH v2 0/7] Fix Python string escapes

by Benjamin Gray

Changes from v1:
  * Dropped some changes that were independently fixed[1]
  * No longer separate the f strings to their own patch
  * Use r strings when the value is a regular expression
  * Updated verification script

In retrospect a script to find the instances and apply fixes isn't that
useful for review, so the attached script this time just looks for
differences in the AST. Apply the series and run the script, with the
two references to compare as arguments.

There are some intentional changes to the AST now though, as the r
strings turn '\t' from a single character tab into a backslash and 't'
character pair (similar for '\n'). This does not affect the correctness
of the regular expression though.

v1: https://lore.kernel.org/all/20230814060704.79655-1-bgray@linux.ibm.com/

[1]: https://lore.kernel.org/all/20230816122133.1231599-1-vishalc@linux.ibm.com/

---

#!/usr/bin/env python3
"""
Verify Python syntax trees are equivalent between two references
"""

import argparse
import ast
from pathlib import Path
import subprocess as sp


def read_file(path: Path, ref: str) -> str:
    return sp.run(f"git show {ref}:{path}", stdout=sp.PIPE, shell=True,
                  encoding="utf-8", check=True).stdout


parser = argparse.ArgumentParser("Compare Python ASTs between revisions")
parser.add_argument("ref1", type=str, help="First revision to use")
parser.add_argument("ref2", type=str, help="Second revision to use")
args = parser.parse_args()

for pyfile in Path(".").glob("**/*.py"):
    try:
        ref1_content = read_file(pyfile, args.ref1)
        ref2_content = read_file(pyfile, args.ref2)
    except Exception as e:
        print(f"ERROR:{pyfile}: Failed to read ({e})")
        continue

    try:
        ref1_syntax = ast.parse(ref1_content, filename=pyfile)
        ref2_syntax = ast.parse(ref2_content, filename=pyfile)
    except SyntaxError as e:
        print(f"ERROR:{pyfile}: Failed to parse, is it Python3? ({e})")
        continue

    if ast.dump(ref1_syntax) != ast.dump(ref2_syntax):
        print(f"ERROR:{pyfile}: Revisions have different AST")
        cmd = f"diff <(git show {args.ref1}:{pyfile} | python -m ast) <(git show {args.ref2}:{pyfile} | python -m ast)"
        print(cmd)
        sp.run(cmd, shell=True)
        continue

Benjamin Gray (7):
  ia64: fix Python string escapes
  Documentation/sphinx: fix Python string escapes
  drivers/comedi: fix Python string escapes
  scripts: fix Python string escapes
  tools/perf: fix Python string escapes
  tools/power: fix Python string escapes
  selftests/bpf: fix Python string escapes

 Documentation/sphinx/cdomain.py               |  2 +-
 Documentation/sphinx/kernel_abi.py            |  2 +-
 Documentation/sphinx/kernel_feat.py           |  2 +-
 Documentation/sphinx/kerneldoc.py             |  2 +-
 Documentation/sphinx/maintainers_include.py   |  8 +++---
 arch/ia64/scripts/unwcheck.py                 |  2 +-
 .../ni_routing/tools/convert_csv_to_c.py      |  2 +-
 scripts/clang-tools/gen_compile_commands.py   |  2 +-
 scripts/gdb/linux/symbols.py                  |  2 +-
 tools/perf/pmu-events/jevents.py              |  2 +-
 .../scripts/python/arm-cs-trace-disasm.py     |  4 +--
 tools/perf/scripts/python/compaction-times.py |  2 +-
 .../scripts/python/exported-sql-viewer.py     |  4 +--
 tools/power/pm-graph/bootgraph.py             | 12 ++++-----
 .../selftests/bpf/test_bpftool_synctypes.py   | 26 +++++++++----------
 tools/testing/selftests/bpf/test_offload.py   |  2 +-
 16 files changed, 38 insertions(+), 38 deletions(-)

--
2.41.0
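
The intentional AST change described in the cover letter can be illustrated with a small sketch. It is not part of the series; the two one-line "source files" and all variable names are made up for illustration. It shows that switching a regex literal to an r string changes the parsed constant, while the `re` module still matches the same text.

```python
import ast
import re

# Two hypothetical one-line source files: the same regex literal written as
# a plain string and as a raw string.
plain_src = r"pattern = '\tfoo\n'"
raw_src = r"pattern = r'\tfoo\n'"

plain_tree = ast.parse(plain_src)
raw_tree = ast.parse(raw_src)

# The ASTs differ, because the string constants hold different characters.
asts_differ = ast.dump(plain_tree) != ast.dump(raw_tree)

plain_value = plain_tree.body[0].value.value  # tab, "foo", newline
raw_value = raw_tree.body[0].value.value      # backslash, "t", "foo", backslash, "n"

# The re module interprets the two-character sequences \t and \n itself, so
# both patterns match the same text.
text = "\tfoo\n"
both_match = (re.fullmatch(plain_value, text) is not None
              and re.fullmatch(raw_value, text) is not None)

print(asts_differ, len(plain_value), len(raw_value), both_match)
```

This is why a pure AST comparison reports a difference for the r-string patches even though the regular expressions behave the same.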

[PATCH] selftests/media_tests: fix a resource leak

by zhujun2

The opened file should be closed in main(); otherwise a resource leak
will occur. This problem was discovered by code reading.

Signed-off-by: zhujun2 <zhujun2(a)cmss.chinamobile.com>
---
 tools/testing/selftests/media_tests/media_device_open.c | 3 +++
 tools/testing/selftests/media_tests/media_device_test.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/tools/testing/selftests/media_tests/media_device_open.c b/tools/testing/selftests/media_tests/media_device_open.c
index 93183a37b133..2dfb2a11b148 100644
--- a/tools/testing/selftests/media_tests/media_device_open.c
+++ b/tools/testing/selftests/media_tests/media_device_open.c
@@ -70,6 +70,7 @@ int main(int argc, char **argv)
 	fd = open(media_device, O_RDWR);
 	if (fd == -1) {
 		printf("Media Device open errno %s\n", strerror(errno));
+		close(fd);
 		exit(-1);
 	}
 
@@ -79,4 +80,6 @@ int main(int argc, char **argv)
 	else
 		printf("Media device model %s driver %s\n",
 			mdi.model, mdi.driver);
+
+	close(fd);
 }
diff --git a/tools/testing/selftests/media_tests/media_device_test.c b/tools/testing/selftests/media_tests/media_device_test.c
index 4b9953359e40..7cabb62535a7 100644
--- a/tools/testing/selftests/media_tests/media_device_test.c
+++ b/tools/testing/selftests/media_tests/media_device_test.c
@@ -79,6 +79,7 @@ int main(int argc, char **argv)
 	fd = open(media_device, O_RDWR);
 	if (fd == -1) {
 		printf("Media Device open errno %s\n", strerror(errno));
+		close(fd);
 		exit(-1);
 	}
 
@@ -100,4 +101,6 @@ int main(int argc, char **argv)
 		sleep(10);
 		count--;
 	}
+
+	close(fd);
 }
--
2.17.1

[PATCH][next] selftests/mm: Fix spelling mistake "succedded" -> "succeeded"

by Colin Ian King

There is a spelling mistake in a ksft_exit_fail_msg message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
 tools/testing/selftests/mm/vm_util.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index 4aeb8d5299ff..05736c615734 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -75,7 +75,7 @@ static bool pagemap_scan_supported(int fd, char *start)
 	/* Provide an invalid address in order to trigger EFAULT. */
 	ret = __pagemap_scan_get_categories(fd, start, (struct page_region *) ~0UL);
 	if (ret == 0)
-		ksft_exit_fail_msg("PAGEMAP_SCAN succedded unexpectedly\n");
+		ksft_exit_fail_msg("PAGEMAP_SCAN succeeded unexpectedly\n");
 
 	supported = errno == EFAULT;
 
--
2.39.2

[PATCH net-next] selftests: net: verify fq per-band packet limit

by Willem de Bruijn

From: Willem de Bruijn <willemb(a)google.com>

Commit 29f834aa326e ("net_sched: sch_fq: add 3 bands and WRR
scheduling") introduces multiple traffic bands, and per-band maximum
packet count.

Per-band limits ensure that packets in one class cannot fill the
entire qdisc and so cause DoS to the traffic in the other classes.

Verify this behavior:
  1. set the limit to 10 per band
  2. send 20 pkts on band A: verify that 10 are queued, 10 dropped
  3. send 20 pkts on band A: verify that  0 are queued, 20 dropped
  4. send 20 pkts on band B: verify that 10 are queued, 10 dropped

Packets must remain queued for a period to trigger this behavior.
Use SO_TXTIME to store packets for 100 msec.

The test reuses existing upstream test infra. The script is a fork
of cmsg_time.sh. The scripts call cmsg_sender.

The test extends cmsg_sender with two arguments:

* '-P' SO_PRIORITY
  There is a subtle difference between IPv4 and IPv6 stack behavior:
  PF_INET/IP_TOS        sets IP header bits and sk_priority
  PF_INET6/IPV6_TCLASS  sets IP header bits BUT NOT sk_priority

* '-n' num pkts
  Send multiple packets in quick succession.
  I first attempted a for loop in the script, but this is too slow
  in virtualized environments, causing flakiness as the 100ms
  timeout is reached and packets are dequeued.

Also do not wait for timestamps to be queued unless timestamps are
requested.

Signed-off-by: Willem de Bruijn <willemb(a)google.com>
---
 tools/testing/selftests/net/Makefile          |  1 +
 tools/testing/selftests/net/cmsg_sender.c     | 50 ++++++++++------
 .../testing/selftests/net/fq_band_pktlimit.sh | 57 +++++++++++++++++++
 3 files changed, 91 insertions(+), 17 deletions(-)
 create mode 100755 tools/testing/selftests/net/fq_band_pktlimit.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 5b2aca4c5f10..9274edfb76ff 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -91,6 +91,7 @@ TEST_PROGS += test_bridge_neigh_suppress.sh
 TEST_PROGS += test_vxlan_nolocalbypass.sh
 TEST_PROGS += test_bridge_backup_port.sh
 TEST_PROGS += fdb_flush.sh
+TEST_PROGS += fq_band_pktlimit.sh
 
 TEST_FILES := settings
 
diff --git a/tools/testing/selftests/net/cmsg_sender.c b/tools/testing/selftests/net/cmsg_sender.c
index 24b21b15ed3f..8d7575389f58 100644
--- a/tools/testing/selftests/net/cmsg_sender.c
+++ b/tools/testing/selftests/net/cmsg_sender.c
@@ -45,11 +45,13 @@ struct options {
 	const char *host;
 	const char *service;
 	unsigned int size;
+	unsigned int num_pkt;
 	struct {
 		unsigned int mark;
 		unsigned int dontfrag;
 		unsigned int tclass;
 		unsigned int hlimit;
+		unsigned int priority;
 	} sockopt;
 	struct {
 		unsigned int family;
@@ -72,6 +74,7 @@ struct options {
 	} v6;
 } opt = {
 	.size = 13,
+	.num_pkt = 1,
 	.sock = {
 		.family	= AF_UNSPEC,
 		.type	= SOCK_DGRAM,
@@ -112,7 +115,7 @@ static void cs_parse_args(int argc, char *argv[])
 {
 	int o;
 
-	while ((o = getopt(argc, argv, "46sS:p:m:M:d:tf:F:c:C:l:L:H:")) != -1) {
+	while ((o = getopt(argc, argv, "46sS:p:P:m:M:n:d:tf:F:c:C:l:L:H:")) != -1) {
 		switch (o) {
 		case 's':
 			opt.silent_send = true;
@@ -138,7 +141,9 @@ static void cs_parse_args(int argc, char *argv[])
 				cs_usage(argv[0]);
 			}
 			break;
-
+		case 'P':
+			opt.sockopt.priority = atoi(optarg);
+			break;
 		case 'm':
 			opt.mark.ena = true;
 			opt.mark.val = atoi(optarg);
@@ -146,6 +151,9 @@ static void cs_parse_args(int argc, char *argv[])
 		case 'M':
 			opt.sockopt.mark = atoi(optarg);
 			break;
+		case 'n':
+			opt.num_pkt = atoi(optarg);
+			break;
 		case 'd':
 			opt.txtime.ena = true;
 			opt.txtime.delay = atoi(optarg);
@@ -410,6 +418,10 @@ static void ca_set_sockopts(int fd)
 	    setsockopt(fd, SOL_IPV6, IPV6_UNICAST_HOPS,
 		       &opt.sockopt.hlimit, sizeof(opt.sockopt.hlimit)))
 		error(ERN_SOCKOPT, errno, "setsockopt IPV6_HOPLIMIT");
+	if (opt.sockopt.priority &&
+	    setsockopt(fd, SOL_SOCKET, SO_PRIORITY,
+		       &opt.sockopt.priority, sizeof(opt.sockopt.priority)))
+		error(ERN_SOCKOPT, errno, "setsockopt SO_PRIORITY");
 }
 
 int main(int argc, char *argv[])
@@ -421,6 +433,7 @@ int main(int argc, char *argv[])
 	char *buf;
 	int err;
 	int fd;
+	int i;
 
 	cs_parse_args(argc, argv);
 
@@ -480,24 +493,27 @@ int main(int argc, char *argv[])
 
 	cs_write_cmsg(fd, &msg, cbuf, sizeof(cbuf));
 
-	err = sendmsg(fd, &msg, 0);
-	if (err < 0) {
-		if (!opt.silent_send)
-			fprintf(stderr, "send failed: %s\n", strerror(errno));
-		err = ERN_SEND;
-		goto err_out;
-	} else if (err != (int)opt.size) {
-		fprintf(stderr, "short send\n");
-		err = ERN_SEND_SHORT;
-		goto err_out;
-	} else {
-		err = ERN_SUCCESS;
+	for (i = 0; i < opt.num_pkt; i++) {
+		err = sendmsg(fd, &msg, 0);
+		if (err < 0) {
+			if (!opt.silent_send)
+				fprintf(stderr, "send failed: %s\n", strerror(errno));
+			err = ERN_SEND;
+			goto err_out;
+		} else if (err != (int)opt.size) {
+			fprintf(stderr, "short send\n");
+			err = ERN_SEND_SHORT;
+			goto err_out;
+		}
 	}
+	err = ERN_SUCCESS;
 
-	/* Make sure all timestamps have time to loop back */
-	usleep(opt.txtime.delay);
+	if (opt.ts.ena) {
+		/* Make sure all timestamps have time to loop back */
+		usleep(opt.txtime.delay);
 
-	cs_read_cmsg(fd, &msg, cbuf, sizeof(cbuf));
+		cs_read_cmsg(fd, &msg, cbuf, sizeof(cbuf));
+	}
 
 err_out:
 	close(fd);
diff --git a/tools/testing/selftests/net/fq_band_pktlimit.sh b/tools/testing/selftests/net/fq_band_pktlimit.sh
new file mode 100755
index 000000000000..24b77bdf41ff
--- /dev/null
+++ b/tools/testing/selftests/net/fq_band_pktlimit.sh
@@ -0,0 +1,57 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Verify that FQ has a packet limit per band:
+#
+# 1. set the limit to 10 per band
+# 2. send 20 pkts on band A: verify that 10 are queued, 10 dropped
+# 3. send 20 pkts on band A: verify that  0 are queued, 20 dropped
+# 4. send 20 pkts on band B: verify that 10 are queued, 10 dropped
+#
+# Send packets with a 100ms delay to ensure that previously sent
+# packets are still queued when later ones are sent.
+# Use SO_TXTIME for this.
+
+die() {
+	echo "$1"
+	exit 1
+}
+
+# run inside private netns
+if [[ $# -eq 0 ]]; then
+	./in_netns.sh "$0" __subprocess
+	exit
+fi
+
+ip link add type dummy
+ip link set dev dummy0 up
+ip -6 addr add fdaa::1/128 dev dummy0
+ip -6 route add fdaa::/64 dev dummy0
+tc qdisc replace dev dummy0 root handle 1: fq quantum 1514 initial_quantum 1514 limit 10
+
+./cmsg_sender -6 -p u -d 100000 -n 20 fdaa::2 8000
+OUT1="$(tc -s qdisc show dev dummy0 | grep '^\ Sent')"
+
+./cmsg_sender -6 -p u -d 100000 -n 20 fdaa::2 8000
+OUT2="$(tc -s qdisc show dev dummy0 | grep '^\ Sent')"
+
+./cmsg_sender -6 -p u -d 100000 -n 20 -P 7 fdaa::2 8000
+OUT3="$(tc -s qdisc show dev dummy0 | grep '^\ Sent')"
+
+# Initial stats will report zero sent, as all packets are still
+# queued in FQ. Sleep for the delay period (100ms) and see that
+# twenty are now sent.
+sleep 0.1
+OUT4="$(tc -s qdisc show dev dummy0 | grep '^\ Sent')"
+
+# Log the output after the test
+echo "${OUT1}"
+echo "${OUT2}"
+echo "${OUT3}"
+echo "${OUT4}"
+
+# Test the output for expected values
+echo "${OUT1}" | grep -q '0\ pkt\ (dropped\ 10' || die "unexpected drop count at 1"
+echo "${OUT2}" | grep -q '0\ pkt\ (dropped\ 30' || die "unexpected drop count at 2"
+echo "${OUT3}" | grep -q '0\ pkt\ (dropped\ 40' || die "unexpected drop count at 3"
+echo "${OUT4}" | grep -q '20\ pkt\ (dropped\ 40' || die "unexpected accept count at 4"
--
2.43.0.rc1.413.gea7ed67945-goog
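
The cumulative counters that the script greps for (dropped 10, 30, 40, then 20 pkt sent) follow directly from the four steps in the commit message. The sketch below is a toy model of that arithmetic only, assuming a hard cap of 10 queued packets per band; `BandLimitedQueue` and its methods are hypothetical names and do not reflect how sch_fq is implemented.

```python
from collections import defaultdict


class BandLimitedQueue:
    """Toy model: each band holds at most `limit` packets; extras are dropped."""

    def __init__(self, limit):
        self.limit = limit
        self.queued = defaultdict(int)  # packets currently held, per band
        self.dropped = 0                # cumulative drops across all bands
        self.sent = 0                   # cumulative packets released

    def enqueue(self, band, count):
        room = max(self.limit - self.queued[band], 0)
        accepted = min(count, room)
        self.queued[band] += accepted
        self.dropped += count - accepted
        return accepted

    def flush(self):
        """Model the SO_TXTIME delay expiring: everything queued is sent."""
        self.sent += sum(self.queued.values())
        self.queued.clear()


q = BandLimitedQueue(limit=10)
stats = []
for band in ("A", "A", "B"):      # steps 2, 3 and 4 of the commit message
    q.enqueue(band, 20)
    stats.append((q.sent, q.dropped))
q.flush()                         # corresponds to the final "sleep 0.1"
stats.append((q.sent, q.dropped))

print(stats)  # [(0, 10), (0, 30), (0, 40), (20, 40)]
```

Each tuple is (sent, dropped) at the point where the script captures OUT1 through OUT4.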

Re: [PATCH v2 0/6] IOMMUFD: Deliver IO page faults to user space

by Jason Gunthorpe

On Wed, Nov 15, 2023 at 01:17:06PM +0800, Liu, Jing2 wrote:

> This is the right way to approach it,
>
> I learned that there was discussion about using io_uring to get the
> page fault without eventfd notification in [1], and I am new at
> io_uring and studying the man page of liburing, but there're questions
> in my mind on how can QEMU get the coming page fault with a good
> performance.
>
> Since both QEMU and Kernel don't know when comes faults, after QEMU
> submits one read task to io_uring, we want kernel pending until fault
> comes. While based on hwpt_fault_fops_read() in [patch v2 4/6], it just
> returns 0 since there's now no fault, thus this round of read completes
> to CQ but it's not what we want. So I'm wondering how kernel pending on
> the read until fault comes. Does fops callback need special work to

Implement a fops with poll support that triggers when a new event is
pushed and everything will be fine. There are many examples in the
kernel. The ones in the mlx5 vfio driver spring to mind as a scheme I
recently looked at.

Jason
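
From the consumer's side, a file descriptor with poll support lets the reader sleep until an event is pushed instead of seeing an empty read complete immediately. The sketch below is a userspace stand-in only: an ordinary pipe plays the role of the fault fd, a timer thread plays the role of the kernel pushing an event, and `wait_for_event` is a hypothetical helper. It says nothing about how the iommufd fault file is actually implemented.

```python
import os
import select
import threading


def wait_for_event(fd, timeout_ms):
    """Sleep in poll() until fd is readable, then read; None on timeout."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    if not poller.poll(timeout_ms):
        return None
    return os.read(fd, 4096)


read_fd, write_fd = os.pipe()

# Nothing has been pushed yet: a zero-timeout poll reports "not ready"
# rather than returning an empty read.
nothing_yet = wait_for_event(read_fd, 0)

# Push one event 50 ms from now, standing in for a fault being queued.
producer = threading.Timer(0.05, os.write, args=(write_fd, b"fault"))
producer.start()

# The consumer blocks in poll() until the event arrives (2 s safety timeout).
event = wait_for_event(read_fd, 2000)
producer.join()

os.close(read_fd)
os.close(write_fd)
print(nothing_yet, event)
```

An io_uring or epoll based consumer would rely on the same readiness signal, which is why the reply points at the poll callback.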

[PATCH bpf-next v2 0/4] selftests/bpf: Update multiple prog_tests to use ASSERT_ macros

by Yuran Pereira

Multiple files/programs in `tools/testing/selftests/bpf/prog_tests/` still
heavily use the `CHECK` macro, even when better `ASSERT_` alternatives are
available.

As Yonghong Song already pointed out [1], in the bpf selftests the
ASSERT_* series of macros is preferred over the CHECK macro.

This patchset replaces uses of the `CHECK(` macro with the equivalent
`ASSERT_` family of macros in the following prog_tests:
- bind_perm.c
- bpf_obj_id.c
- bpf_tcp_ca.c
- vmlinux.c

[1] https://lore.kernel.org/lkml/0a142924-633c-44e6-9a92-2dc019656bf2@linux.dev

Changes in v2:
- Fixed pthread_join assertion that broke the previous test

Previous version:
v1 - https://lore.kernel.org/lkml/GV1PR10MB6563FCFF1C5DEBE84FEA985FE8B0A@GV1PR10…

Yuran Pereira (4):
  Replaces the usage of CHECK calls for ASSERTs in bpf_tcp_ca
  Replaces the usage of CHECK calls for ASSERTs in bind_perm
  Replaces the usage of CHECK calls for ASSERTs in bpf_obj_id
  selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in
    vmlinux

 .../selftests/bpf/prog_tests/bind_perm.c      |   6 +-
 .../selftests/bpf/prog_tests/bpf_obj_id.c     | 204 +++++++-----------
 .../selftests/bpf/prog_tests/bpf_tcp_ca.c     |  50 ++---
 .../selftests/bpf/prog_tests/vmlinux.c        |  16 +-
 4 files changed, 106 insertions(+), 170 deletions(-)

--
2.25.1

[PATCH v4 0/5] cgroup/cpuset: Improve CPU isolation in isolated partitions

by Waiman Long

v4:
 - Update patch 1 to move apply_wqattrs_lock() and apply_wqattrs_unlock()
   down into CONFIG_SYSFS block to avoid compilation warnings.

v3:
 - Break out a separate patch to make workqueue_set_unbound_cpumask()
   static and move it down to the CONFIG_SYSFS section.
 - Remove the "__DEBUG__." prefix and the CFTYPE_DEBUG flag from the
   new root only cpuset.cpus.isolated control files and update the
   test accordingly.

v2:
 - Add 2 read-only workqueue sysfs files to expose the user requested
   cpumask as well as the isolated CPUs to be excluded from
   wq_unbound_cpumask.
 - Ensure that the caller of the new workqueue_unbound_exclude_cpumask()
   holds cpus_read_lock.
 - Update the cpuset code to make sure the cpus_read_lock is held
   whenever workqueue_unbound_exclude_cpumask() may be called.

An isolated cpuset partition can currently be created to contain an
exclusive set of CPUs not used in other cgroups and with load balancing
disabled to reduce interference from the scheduler.

The main purpose of this isolated partition type is to dynamically
emulate what can be done via the "isolcpus" boot command line option,
specifically the default domain flag. One effect of the "isolcpus" option
is to remove the isolated CPUs from the cpumasks of unbound workqueues
since running work functions on an isolated CPU can be a major source
of interference. Changing the unbound workqueue cpumasks can be done at
run time by writing an appropriate cpumask without the isolated CPUs to
/sys/devices/virtual/workqueue/cpumask. So one can set up an isolated
cpuset partition and then write to the cpumask sysfs file to achieve a
similar level of CPU isolation. However, this manual process can be
error prone.

This patch series implements automatic exclusion of isolated CPUs from
unbound workqueue cpumasks when an isolated cpuset partition is created
and then adds those CPUs back when the isolated partition is destroyed.

There are also other places in the kernel that look at the HK_FLAG_DOMAIN
cpumask or other HK_FLAG_* cpumasks and exclude the isolated CPUs from
certain actions to further reduce interference. CPUs in an isolated
cpuset partition will not be able to avoid that interference yet. That
may change in the future as the need arises.

Waiman Long (5):
  workqueue: Make workqueue_set_unbound_cpumask() static
  workqueue: Add workqueue_unbound_exclude_cpumask() to exclude CPUs
    from wq_unbound_cpumask
  selftests/cgroup: Minor code cleanup and reorganization of
    test_cpuset_prs.sh
  cgroup/cpuset: Keep track of CPUs in isolated partitions
  cgroup/cpuset: Take isolated CPUs out of workqueue unbound cpumask

 Documentation/admin-guide/cgroup-v2.rst       |  10 +-
 include/linux/workqueue.h                     |   2 +-
 kernel/cgroup/cpuset.c                        | 286 +++++++++++++-----
 kernel/workqueue.c                            | 165 +++++++---
 .../selftests/cgroup/test_cpuset_prs.sh       | 216 ++++++++-----
 5 files changed, 475 insertions(+), 204 deletions(-)

--
2.39.3
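
The bookkeeping the series automates can be pictured as "effective mask = requested mask minus isolated CPUs". The sketch below is a toy model under that assumption; `parse_cpulist` and `UnboundMaskModel` are hypothetical names, plain Python sets stand in for kernel cpumasks, and locking and the case where the exclusion would leave no CPUs are not modelled.

```python
def parse_cpulist(text):
    """Parse a kernel-style CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


class UnboundMaskModel:
    """Tracks the user-requested mask and the CPUs currently isolated."""

    def __init__(self, requested):
        self.requested = set(requested)
        self.isolated = set()

    def effective(self):
        # CPUs that unbound work may run on right now.
        return self.requested - self.isolated

    def create_isolated_partition(self, cpus):
        self.isolated |= set(cpus)

    def destroy_isolated_partition(self, cpus):
        self.isolated -= set(cpus)


model = UnboundMaskModel(parse_cpulist("0-7"))

model.create_isolated_partition(parse_cpulist("2-3"))
during = sorted(model.effective())   # CPUs 2 and 3 are excluded

model.destroy_isolated_partition(parse_cpulist("2-3"))
after = sorted(model.effective())    # CPUs 2 and 3 are added back

print(during, after)
```

Keeping the requested mask separate from the isolated set is what allows the CPUs to be restored when the partition goes away, which matches the two read-only sysfs files mentioned in the v2 changelog.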
