- Linux-kselftest-mirror - lists.linaro.org

[PATCH v1 RESEND 0/6] mm: (pte|pmd)_mkdirty() should not unconditionally allow for write access

by David Hildenbrand

This is the follow-up on [1], adding selftests (testing for known issues we
added workarounds for and other issues that haven't been fixed yet), fixing
sparc64, reverting the workarounds, and performing one cleanup.

The patch from [1] was modified slightly (updated/extended patch
description, dropped one unnecessary NOP instruction from the ASM in
__pte_mkhwwrite()).

Retested on x86_64 and sparc64 (sun4u in QEMU).

I scanned most architectures to make sure their (pte|pmd)_mkdirty() handling
is correct. To be sure, we can run the selftests and find out if other
architectures are still affected (loongarch was fixed recently as well).

Based on master for now. I don't expect surprises regarding the mm trees, but
I can rebase if there are any problems.

[1] https://lkml.kernel.org/r/20221212130213.136267-1-david@redhat.com

Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>

David Hildenbrand (6):
  selftests/mm: reuse read_pmd_pagesize() in COW selftest
  selftests/mm: mkdirty: test behavior of (pte|pmd)_mkdirty on VMAs
    without write permissions
  sparc/mm: don't unconditionally set HW writable bit when setting PTE
    dirty on 64bit
  mm/migrate: revert "mm/migrate: fix wrongly apply write bit after
    mkdirty on sparc64"
  mm/huge_memory: revert "Partly revert "mm/thp: carry over dirty bit
    when thp splits on pmd""
  mm/huge_memory: conditionally call maybe_mkwrite() and drop
    pte_wrprotect() in __split_huge_pmd_locked()

 arch/sparc/include/asm/pgtable_64.h           | 116 +++---
 mm/huge_memory.c                              |  16 +-
 mm/migrate.c                                  |   2 -
 tools/testing/selftests/mm/Makefile           |   2 +
 tools/testing/selftests/mm/cow.c              |  33 +-
 tools/testing/selftests/mm/khugepaged.c       |   4 +
 tools/testing/selftests/mm/mkdirty.c          | 379 ++++++++++++++++++
 tools/testing/selftests/mm/soft-dirty.c       |   3 +
 .../selftests/mm/split_huge_page_test.c       |   4 +
 tools/testing/selftests/mm/vm_util.c          |   4 +-
 10 files changed, 468 insertions(+), 95 deletions(-)
 create mode 100644 tools/testing/selftests/mm/mkdirty.c

--
2.39.2

2 years, 2 months


[RFC PATCH 0/5] cgroup/cpuset: A new "isolcpus" partition

by Waiman Long

This patch series introduces a new "isolcpus" partition type to the
existing list of {member, root, isolated} types. The primary reason for
adding this new "isolcpus" partition is to facilitate the distribution
of isolated CPUs down the cgroup v2 hierarchy.

The other non-member partition types have the limitation that their
parents have to be valid partitions too. It will be hard to create a
partition a few layers down the hierarchy.

It is relatively rare to have applications that require creation of
a separate scheduling domain (root). However, it is more common to
have applications that require the use of isolated CPUs (isolated),
e.g. DPDK. One can use the "isolcpus" or "nohz_full" boot command options
to get that statically. Of course, the "isolated" partition is another
way to achieve that dynamically.

Modern container orchestration tools like Kubernetes use the cgroup
hierarchy to manage different containers. If a container needs to use
isolated CPUs, it is hard to get those with the existing set of cpuset
partition types. With this patch series, a new "isolcpus" partition
can be created to hold a set of isolated CPUs that can be pulled into
other "isolated" partitions.

The "isolcpus" partition is special in that there can be at most one
instance of it in a system. It serves as a pool for isolated CPUs
and cannot hold tasks or sub-cpusets underneath it. It is also not
cpu-exclusive so that the isolated CPUs can be distributed down the
sibling hierarchies, though those isolated CPUs will not be useable
until the partition type becomes "isolated".

Once isolated CPUs are needed in a cgroup, the administrator can write
a list of isolated CPUs into its "cpuset.cpus" and change its partition
type to "isolated" to pull in those isolated CPUs from the "isolcpus"
partition and use them in that cgroup. That will make the distribution
of isolated CPUs to cgroups that need them much easier.

In the future, we may be able to extend this special "isolcpus" partition
type to support other isolation attributes like those that can be
specified with the "isolcpus" boot command line and related options.

Waiman Long (5):
  cgroup/cpuset: Extract out CS_CPU_EXCLUSIVE & CS_SCHED_LOAD_BALANCE
    handling
  cgroup/cpuset: Add a new "isolcpus" partition root state
  cgroup/cpuset: Make isolated partition pull CPUs from isolcpus
    partition
  cgroup/cpuset: Documentation update for the new "isolcpus" partition
  cgroup/cpuset: Extend test_cpuset_prs.sh to test isolcpus partition

 Documentation/admin-guide/cgroup-v2.rst       |  89 ++-
 kernel/cgroup/cpuset.c                        | 548 +++++++++++++++---
 .../selftests/cgroup/test_cpuset_prs.sh       | 376 ++++++++----
 3 files changed, 789 insertions(+), 224 deletions(-)

--
2.31.1
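
The workflow described in this cover letter can be sketched in shell. This is only an illustration under stated assumptions: the "isolcpus" value for cpuset.cpus.partition exists only with this RFC series applied, and the mount point, cgroup names (pool, apps/dpdk), and helper functions are hypothetical. The script defaults to a dry run that prints the writes instead of performing them.

```shell
#!/bin/sh
# Sketch only: "isolcpus" is the partition value proposed by this RFC.
# CGROOT, the cgroup names, and all helper names are hypothetical.
CGROOT=${CGROOT:-/sys/fs/cgroup}
DRY_RUN=${DRY_RUN:-1}

# Write a value to a cgroup control file, or print the command when DRY_RUN=1.
cg_write() {
	if [ "$DRY_RUN" -eq 1 ]; then
		echo "echo $1 > $2"
	else
		echo "$1" > "$2"
	fi
}

# Create the single pool that holds the isolated CPUs.
# $1 = pool cgroup, $2 = cpu list
make_isolcpus_pool() {
	cg_write "$2" "$CGROOT/$1/cpuset.cpus"
	cg_write isolcpus "$CGROOT/$1/cpuset.cpus.partition"
}

# List the wanted CPUs in a cgroup, then switch it to "isolated" so that
# it pulls those CPUs from the pool.
# $1 = cgroup, $2 = cpu list
pull_isolated_cpus() {
	cg_write "$2" "$CGROOT/$1/cpuset.cpus"
	cg_write isolated "$CGROOT/$1/cpuset.cpus.partition"
}

make_isolcpus_pool pool 2-3
pull_isolated_cpus apps/dpdk 3
```

Setting DRY_RUN=0 would perform the writes, which requires root, a cgroup v2 mount, and a kernel carrying this series.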

2 years, 2 months


[PATCH v8 0/6] Some improvements of resctrl selftest

by Shaopeng Tan

Hello,

The aim of this patch series is to improve the resctrl selftest.
Without these fixes, some unnecessary processing will be executed
and test results will be confusing.
There is no behavior change in the tests themselves.

[patch 1] Make write_schemata() run to set up schemata with 100% allocation
          on first run in MBM test.
[patch 2] The MBA test result message is always output as "ok".
          Make the output message "not ok" if the MBA check fails.
[patch 3] When a child process is created by fork(), the buffer of the
          parent process is also copied. Flush the buffer before
          executing fork().
[patch 4] Whether an error occurs in the parent process or the child
          process, the parent process always kills the child process and
          runs umount_resctrlfs(), and the child process always waits to
          be killed by the parent process.
[patch 5] To clean up properly before the parent process exits when a
          signal is received, commonize the signal handler registered for
          the CMT/MBM/MBA tests and reuse it in CAT. Also unregister the
          signal handler at the end of each test.
[patch 6] Before each of the CMT/CAT/MBM/MBA tests exits, the functions
          that clear the test result files, cat/cmt/mbm/mba_test_cleanup(),
          are called twice. Delete one of the calls.

This patch series is based on Linux v6.2-rc7.

Difference from v7:
[patch 4]
  - Fix commitlog.
[patch 5]
  - Fix commitlog.

Previous versions of this series:
[v1] https://lore.kernel.org/lkml/20220914015147.3071025-1-tan.shaopeng@jp.fujit…
[v2] https://lore.kernel.org/lkml/20221005013933.1486054-1-tan.shaopeng@jp.fujit…
[v3] https://lore.kernel.org/lkml/20221101094341.3383073-1-tan.shaopeng@jp.fujit…
[v4] https://lore.kernel.org/lkml/20221117010541.1014481-1-tan.shaopeng@jp.fujit…
[v5] https://lore.kernel.org/lkml/20230111075802.3556803-1-tan.shaopeng@jp.fujit…
[v6] https://lore.kernel.org/lkml/20230131054655.396270-1-tan.shaopeng@jp.fujits…
[v7] https://lore.kernel.org/lkml/20230213062428.1721572-1-tan.shaopeng@jp.fujit…

Shaopeng Tan (6):
  selftests/resctrl: Fix set up schemata with 100% allocation on first
    run in MBM test
  selftests/resctrl: Return MBA check result and make it to output
    message
  selftests/resctrl: Flush stdout file buffer before executing fork()
  selftests/resctrl: Cleanup properly when an error occurs in CAT test
  selftests/resctrl: Commonize the signal handler register/unregister
    for all tests
  selftests/resctrl: Remove duplicate codes that clear each test result
    file

 tools/testing/selftests/resctrl/cat_test.c    | 29 ++++----
 tools/testing/selftests/resctrl/cmt_test.c    |  7 +-
 tools/testing/selftests/resctrl/fill_buf.c    | 14 ----
 tools/testing/selftests/resctrl/mba_test.c    | 23 +++----
 tools/testing/selftests/resctrl/mbm_test.c    | 20 +++---
 tools/testing/selftests/resctrl/resctrl.h     |  2 +
 .../testing/selftests/resctrl/resctrl_tests.c |  4 --
 tools/testing/selftests/resctrl/resctrl_val.c | 67 ++++++++++++++-----
 tools/testing/selftests/resctrl/resctrlfs.c   |  5 +-
 9 files changed, 96 insertions(+), 75 deletions(-)

--
2.27.0

2 years, 2 months


Re: [PATCH v6 1/3] mm: add new api to enable ksm per process

by Matthew Wilcox

On Tue, Apr 11, 2023 at 08:16:46PM -0700, Stefan Roesch wrote:
>  	case PR_SET_VMA:
>  		error = prctl_set_vma(arg2, arg3, arg4, arg5);
>  		break;
> +#ifdef CONFIG_KSM
> +	case PR_SET_MEMORY_MERGE:
> +		if (mmap_write_lock_killable(me->mm))
> +			return -EINTR;
> +
> +		if (arg2) {
> +			int err = ksm_add_mm(me->mm);
> +
> +			if (!err)
> +				ksm_add_vmas(me->mm);

In the last version of this patch, you reported the error. Now you
swallow the error. I have no idea which is correct, but you've changed
the behaviour without explaining it, so I assume it's wrong.

> +		} else {
> +			clear_bit(MMF_VM_MERGE_ANY, &me->mm->flags);
> +		}
> +		mmap_write_unlock(me->mm);
> +		break;
> +	case PR_GET_MEMORY_MERGE:
> +		if (arg2 || arg3 || arg4 || arg5)
> +			return -EINVAL;
> +
> +		error = !!test_bit(MMF_VM_MERGE_ANY, &me->mm->flags);
> +		break;

Why do we need a GET? Just for symmetry, or is there an actual need for
it?

2 years, 2 months


[PATCH bpf-next v6 4/4] selftests: xsk: Add tests for 8K and 9K frame sizes

by Kal Conley

Add tests:
- RUN_TO_COMPLETION_8K_FRAME_SIZE: frame_size=8192 (aligned)
- UNALIGNED_9K_FRAME_SIZE: frame_size=9000 (unaligned)

Signed-off-by: Kal Conley <kal.conley(a)dectris.com>
Acked-by: Magnus Karlsson <magnus.karlsson(a)intel.com>
---
 tools/testing/selftests/bpf/xskxceiver.c | 25 ++++++++++++++++++++++++
 tools/testing/selftests/bpf/xskxceiver.h |  2 ++
 2 files changed, 27 insertions(+)

diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
index 7eccf57a0ccc..86797de7fc50 100644
--- a/tools/testing/selftests/bpf/xskxceiver.c
+++ b/tools/testing/selftests/bpf/xskxceiver.c
@@ -1841,6 +1841,17 @@ static void run_pkt_test(struct test_spec *test, enum test_mode mode, enum test_
 		pkt_stream_replace(test, DEFAULT_PKT_CNT, PKT_SIZE);
 		testapp_validate_traffic(test);
 		break;
+	case TEST_TYPE_RUN_TO_COMPLETION_8K_FRAME:
+		if (!hugepages_present(test->ifobj_tx)) {
+			ksft_test_result_skip("No 2M huge pages present.\n");
+			return;
+		}
+		test_spec_set_name(test, "RUN_TO_COMPLETION_8K_FRAME_SIZE");
+		test->ifobj_tx->umem->frame_size = 8192;
+		test->ifobj_rx->umem->frame_size = 8192;
+		pkt_stream_replace(test, DEFAULT_PKT_CNT, PKT_SIZE);
+		testapp_validate_traffic(test);
+		break;
 	case TEST_TYPE_RX_POLL:
 		test->ifobj_rx->use_poll = true;
 		test_spec_set_name(test, "POLL_RX");
@@ -1904,6 +1915,20 @@ static void run_pkt_test(struct test_spec *test, enum test_mode mode, enum test_
 		if (!testapp_unaligned(test))
 			return;
 		break;
+	case TEST_TYPE_UNALIGNED_9K_FRAME:
+		if (!hugepages_present(test->ifobj_tx)) {
+			ksft_test_result_skip("No 2M huge pages present.\n");
+			return;
+		}
+		test_spec_set_name(test, "UNALIGNED_9K_FRAME_SIZE");
+		test->ifobj_tx->umem->frame_size = 9000;
+		test->ifobj_rx->umem->frame_size = 9000;
+		test->ifobj_tx->umem->unaligned_mode = true;
+		test->ifobj_rx->umem->unaligned_mode = true;
+		pkt_stream_replace(test, DEFAULT_PKT_CNT, PKT_SIZE);
+		test->ifobj_rx->pkt_stream->use_addr_for_fill = true;
+		testapp_validate_traffic(test);
+		break;
 	case TEST_TYPE_HEADROOM:
 		testapp_headroom(test);
 		break;
diff --git a/tools/testing/selftests/bpf/xskxceiver.h b/tools/testing/selftests/bpf/xskxceiver.h
index 919327807a4e..7f52f737f5e9 100644
--- a/tools/testing/selftests/bpf/xskxceiver.h
+++ b/tools/testing/selftests/bpf/xskxceiver.h
@@ -69,12 +69,14 @@ enum test_mode {
 enum test_type {
 	TEST_TYPE_RUN_TO_COMPLETION,
 	TEST_TYPE_RUN_TO_COMPLETION_2K_FRAME,
+	TEST_TYPE_RUN_TO_COMPLETION_8K_FRAME,
 	TEST_TYPE_RUN_TO_COMPLETION_SINGLE_PKT,
 	TEST_TYPE_RX_POLL,
 	TEST_TYPE_TX_POLL,
 	TEST_TYPE_POLL_RXQ_TMOUT,
 	TEST_TYPE_POLL_TXQ_TMOUT,
 	TEST_TYPE_UNALIGNED,
+	TEST_TYPE_UNALIGNED_9K_FRAME,
 	TEST_TYPE_ALIGNED_INV_DESC,
 	TEST_TYPE_ALIGNED_INV_DESC_2K_FRAME,
 	TEST_TYPE_UNALIGNED_INV_DESC,
--
2.39.2
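
The difference between the two frame sizes in this patch can be checked with a few lines of shell. This is a sketch under one assumption that the patch itself does not state: that "aligned" means the frame size is a power of two, as AF_XDP's aligned chunk mode is generally understood to require. The helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: classify a UMEM frame size the way the two new tests label theirs.
# Assumption: "aligned" means the frame size is a power of two.

# is_pow2 N: succeed when N is a positive power of two.
is_pow2() {
	[ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# frame_mode N: print which UMEM mode a frame size of N bytes would need.
frame_mode() {
	if is_pow2 "$1"; then
		echo aligned
	else
		echo unaligned
	fi
}

frame_mode 8192   # frame size used by RUN_TO_COMPLETION_8K_FRAME_SIZE
frame_mode 9000   # frame size used by UNALIGNED_9K_FRAME_SIZE
```

Under that assumption 8192 is classified as aligned and 9000 as unaligned, matching the labels in the commit message.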

2 years, 2 months


[PATCH bpf-next v6 3/4] selftests: xsk: Use hugepages when umem->frame_size > PAGE_SIZE

by Kal Conley

HugeTLB UMEMs now support chunk_size > PAGE_SIZE. Set MAP_HUGETLB when
frame_size > PAGE_SIZE for future tests.

Signed-off-by: Kal Conley <kal.conley(a)dectris.com>
Acked-by: Magnus Karlsson <magnus.karlsson(a)intel.com>
---
 tools/testing/selftests/bpf/xskxceiver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
index 5a9691e942de..7eccf57a0ccc 100644
--- a/tools/testing/selftests/bpf/xskxceiver.c
+++ b/tools/testing/selftests/bpf/xskxceiver.c
@@ -1289,7 +1289,7 @@ static void thread_common_ops(struct test_spec *test, struct ifobject *ifobject)
 	void *bufs;
 	int ret;

-	if (ifobject->umem->unaligned_mode)
+	if (ifobject->umem->frame_size > sysconf(_SC_PAGESIZE) || ifobject->umem->unaligned_mode)
 		mmap_flags |= MAP_HUGETLB;

 	if (ifobject->shared_umem)
--
2.39.2
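
The patched condition can be restated as a small shell predicate to see which test configurations end up with MAP_HUGETLB. This is a sketch only: needs_hugetlb is a hypothetical helper, not part of xskxceiver, and the 4096-byte page size passed in the loop is an assumed example value.

```shell
#!/bin/sh
# Sketch of the patched condition in thread_common_ops():
#   frame_size > page size || unaligned_mode  ->  add MAP_HUGETLB
# needs_hugetlb is a hypothetical helper, not part of xskxceiver.

# needs_hugetlb FRAME_SIZE UNALIGNED [PAGE_SIZE]
# Succeeds when the UMEM buffer would be mapped with MAP_HUGETLB.
# PAGE_SIZE defaults to the running system's page size.
needs_hugetlb() {
	page_size=${3:-$(getconf PAGESIZE)}
	[ "$1" -gt "$page_size" ] || [ "$2" -eq 1 ]
}

# Each spec is "<frame_size> <unaligned_mode>"; 4096 is an assumed page size.
for spec in "2048 0" "4096 1" "8192 0"; do
	set -- $spec
	if needs_hugetlb "$1" "$2" 4096; then
		echo "frame_size=$1 unaligned=$2: MAP_HUGETLB"
	else
		echo "frame_size=$1 unaligned=$2: regular pages"
	fi
done
```

Before the patch only the unaligned case would have requested huge pages; with it, an aligned 8192-byte frame on a 4096-byte-page system does too.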

2 years, 2 months


[RFC PATCH 5/5] cgroup/cpuset: Extend test_cpuset_prs.sh to test isolcpus partition

by Waiman Long

This patch extends the test_cpuset_prs.sh test script to support testing
the new isolcpus partition by adding new tests specifically for the
isolcpus partition. In addition, the following changes are also made:
 1) Remove the first column of the TEST_MATRIX as it is always the same
    and so is redundant.
 2) Add a new C1 cgroup directory for testing and add that column to
    the TEST_MATRIX.
 3) Add support for the .__DEBUG__.cpuset.cpus.subpartitions file if
    the "cgroup_debug" kernel boot option is specified, and a new column
    in TEST_MATRIX for testing against this cgroup control file.
 4) Add another column for the list of expected isolated CPUs and
    compare it with the actual value by looking at the state of
    /sys/kernel/debug/sched/domains.

Signed-off-by: Waiman Long <longman(a)redhat.com>
---
 .../selftests/cgroup/test_cpuset_prs.sh       | 376 ++++++++++++------
 1 file changed, 258 insertions(+), 118 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index 2b5215cc599f..7fa2bfe6c1c0 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -23,18 +23,18 @@ WAIT_INOTIFY=$(cd $(dirname $0); pwd)/wait_inotify
 CGROUP2=$(mount -t cgroup2 | head -1 | awk -e '{print $3}')
 [[ -n "$CGROUP2" ]] || skip_test "Cgroup v2 mount point not found!"

-CPUS=$(lscpu | grep "^CPU(s):" | sed -e "s/.*:[[:space:]]*//")
-[[ $CPUS -lt 8 ]] && skip_test "Test needs at least 8 cpus available!"
+NR_CPUS=$(lscpu | grep "^CPU(s):" | sed -e "s/.*:[[:space:]]*//")
+[[ $NR_CPUS -lt 8 ]] && skip_test "Test needs at least 8 cpus available!"
# Set verbose flag and delay factor PROG=$1 -VERBOSE= +VERBOSE=0 DELAY_FACTOR=1 SCHED_DEBUG= while [[ "$1" = -* ]] do case "$1" in - -v) VERBOSE=1 + -v) ((VERBOSE++)) # Enable sched/verbose can slow thing down [[ $DELAY_FACTOR -eq 1 ]] && DELAY_FACTOR=2 @@ -52,7 +52,7 @@ do done # Set sched verbose flag if available when "-v" option is specified -if [[ -n "$VERBOSE" && -d /sys/kernel/debug/sched ]] +if [[ $VERBOSE -gt 0 && -d /sys/kernel/debug/sched ]] then # Used to restore the original setting during cleanup SCHED_DEBUG=$(cat /sys/kernel/debug/sched/verbose) @@ -103,7 +103,7 @@ test_partition() [[ $? -eq 0 ]] || exit 1 ACTUAL_VAL=$(cat cpuset.cpus.partition) [[ $ACTUAL_VAL != $EXPECTED_VAL ]] && { - echo "cpuset.cpus.partition: expect $EXPECTED_VAL, found $EXPECTED_VAL" + echo "cpuset.cpus.partition: expect $EXPECTED_VAL, found $ACTUAL_VAL" echo "Test FAILED" exit 1 } @@ -114,7 +114,7 @@ test_effective_cpus() EXPECTED_VAL=$1 ACTUAL_VAL=$(cat cpuset.cpus.effective) [[ "$ACTUAL_VAL" != "$EXPECTED_VAL" ]] && { - echo "cpuset.cpus.effective: expect '$EXPECTED_VAL', found '$EXPECTED_VAL'" + echo "cpuset.cpus.effective: expect '$EXPECTED_VAL', found '$ACTUAL_VAL'" echo "Test FAILED" exit 1 } @@ -204,124 +204,175 @@ test_isolated() # Cgroup test hierarchy # # test -- A1 -- A2 -- A3 -# \- B1 +# +- B1 +# +- C1 # -# P<v> = set cpus.partition (0:member, 1:root, 2:isolated, -1:root invalid) +# P<v> = set cpus.partition (0:member, 1:root, 2:isolated, 3: isolcpus) # C<l> = add cpu-list # S<p> = use prefix in subtree_control # T = put a task into cgroup -# O<c>-<v> = Write <v> to CPU online file of <c> +# O<c>=<v> = Write <v> to CPU online file of <c> # SETUP_A123_PARTITIONS="C1-3:P1:S+ C2-3:P1:S+ C3:P1" TEST_MATRIX=( - # test old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 fail ECPUs Pstate - # ---- ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ - " S+ C0-1 . . C2-3 S+ C4-5 . . 0 A2:0-1" - " S+ C0-1 . . C2-3 P1 . . . 0 " - " S+ C0-1 . . 
C2-3 P1:S+ C0-1:P1 . . 0 " - " S+ C0-1 . . C2-3 P1:S+ C1:P1 . . 0 " - " S+ C0-1:S+ . . C2-3 . . . P1 0 " - " S+ C0-1:P1 . . C2-3 S+ C1 . . 0 " - " S+ C0-1:P1 . . C2-3 S+ C1:P1 . . 0 " - " S+ C0-1:P1 . . C2-3 S+ C1:P1 . P1 0 " - " S+ C0-1:P1 . . C2-3 C4-5 . . . 0 A1:4-5" - " S+ C0-1:P1 . . C2-3 S+:C4-5 . . . 0 A1:4-5" - " S+ C0-1 . . C2-3:P1 . . . C2 0 " - " S+ C0-1 . . C2-3:P1 . . . C4-5 0 B1:4-5" - " S+ C0-3:P1:S+ C2-3:P1 . . . . . . 0 A1:0-1,A2:2-3" - " S+ C0-3:P1:S+ C2-3:P1 . . C1-3 . . . 0 A1:1,A2:2-3" - " S+ C2-3:P1:S+ C3:P1 . . C3 . . . 0 A1:,A2:3 A1:P1,A2:P1" - " S+ C2-3:P1:S+ C3:P1 . . C3 P0 . . 0 A1:3,A2:3 A1:P1,A2:P0" - " S+ C2-3:P1:S+ C2:P1 . . C2-4 . . . 0 A1:3-4,A2:2" - " S+ C2-3:P1:S+ C3:P1 . . C3 . . C0-2 0 A1:,B1:0-2 A1:P1,A2:P1" - " S+ $SETUP_A123_PARTITIONS . C2-3 . . . 0 A1:,A2:2,A3:3 A1:P1,A2:P1,A3:P1" + # old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 new-C1 fail ECPUs Pstate PCPUS ISOLCPUS + # ------ ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ ----- -------- + " C0-1 . . C2-3 S+ C4-5 . . . 0 A2:0-1" + " C0-1 . . C2-3 P1 . . . . 0 " + " C0-1 . . C2-3 P1:S+ C0-1:P1 . . . 0 " + " C0-1 . . C2-3 P1:S+ C1:P1 . . . 0 " + " C0-1:S+ . . C2-3 . . . P1 . 0 " + " C0-1:P1 . . C2-3 S+ C1 . . . 0 " + " C0-1:P1 . . C2-3 S+ C1:P1 . . . 0 " + " C0-1:P1 . . C2-3 S+ C1:P1 . P1 . 0 " + " C0-1:P1 . . C2-3 C4-5 . . . . 0 A1:4-5" + " C0-1:P1 . . C2-3 S+:C4-5 . . . . 0 A1:4-5" + " C0-1 . . C2-3:P1 . . . C2 . 0 " + " C0-1 . . C2-3:P1 . . . C4-5 . 0 B1:4-5" + "C0-3:P1:S+ C2-3:P1 . . . . . . . 0 A1:0-1,A2:2-3" + "C0-3:P1:S+ C2-3:P1 . . C1-3 . . . . 0 A1:1,A2:2-3" + "C2-3:P1:S+ C3:P1 . . C3 . . . . 0 A1:,A2:3 A1:P1,A2:P1" + "C2-3:P1:S+ C3:P1 . . C3 P0 . . . 0 A1:3,A2:3 A1:P1,A2:P0" + "C2-3:P1:S+ C2:P1 . . C2-4 . . . . 0 A1:3-4,A2:2" + "C2-3:P1:S+ C3:P1 . . C3 . . C0-2 . 0 A1:,B1:0-2 A1:P1,A2:P1" + "$SETUP_A123_PARTITIONS . C2-3 . . . . 0 A1:,A2:2,A3:3 A1:P1,A2:P1,A3:P1" # CPU offlining cases: - " S+ C0-1 . . C2-3 S+ C4-5 . 
O2-0 0 A1:0-1,B1:3" - " S+ C0-3:P1:S+ C2-3:P1 . . O2-0 . . . 0 A1:0-1,A2:3" - " S+ C0-3:P1:S+ C2-3:P1 . . O2-0 O2-1 . . 0 A1:0-1,A2:2-3" - " S+ C0-3:P1:S+ C2-3:P1 . . O1-0 . . . 0 A1:0,A2:2-3" - " S+ C0-3:P1:S+ C2-3:P1 . . O1-0 O1-1 . . 0 A1:0-1,A2:2-3" - " S+ C2-3:P1:S+ C3:P1 . . O3-0 O3-1 . . 0 A1:2,A2:3 A1:P1,A2:P1" - " S+ C2-3:P1:S+ C3:P2 . . O3-0 O3-1 . . 0 A1:2,A2:3 A1:P1,A2:P2" - " S+ C2-3:P1:S+ C3:P1 . . O2-0 O2-1 . . 0 A1:2,A2:3 A1:P1,A2:P1" - " S+ C2-3:P1:S+ C3:P2 . . O2-0 O2-1 . . 0 A1:2,A2:3 A1:P1,A2:P2" - " S+ C2-3:P1:S+ C3:P1 . . O2-0 . . . 0 A1:,A2:3 A1:P1,A2:P1" - " S+ C2-3:P1:S+ C3:P1 . . O3-0 . . . 0 A1:2,A2: A1:P1,A2:P1" - " S+ C2-3:P1:S+ C3:P1 . . T:O2-0 . . . 0 A1:3,A2:3 A1:P1,A2:P-1" - " S+ C2-3:P1:S+ C3:P1 . . . T:O3-0 . . 0 A1:2,A2:2 A1:P1,A2:P-1" - " S+ $SETUP_A123_PARTITIONS . O1-0 . . . 0 A1:,A2:2,A3:3 A1:P1,A2:P1,A3:P1" - " S+ $SETUP_A123_PARTITIONS . O2-0 . . . 0 A1:1,A2:,A3:3 A1:P1,A2:P1,A3:P1" - " S+ $SETUP_A123_PARTITIONS . O3-0 . . . 0 A1:1,A2:2,A3: A1:P1,A2:P1,A3:P1" - " S+ $SETUP_A123_PARTITIONS . T:O1-0 . . . 0 A1:2-3,A2:2-3,A3:3 A1:P1,A2:P-1,A3:P-1" - " S+ $SETUP_A123_PARTITIONS . . T:O2-0 . . 0 A1:1,A2:3,A3:3 A1:P1,A2:P1,A3:P-1" - " S+ $SETUP_A123_PARTITIONS . . . T:O3-0 . 0 A1:1,A2:2,A3:2 A1:P1,A2:P1,A3:P-1" - " S+ $SETUP_A123_PARTITIONS . T:O1-0 O1-1 . . 0 A1:1,A2:2,A3:3 A1:P1,A2:P1,A3:P1" - " S+ $SETUP_A123_PARTITIONS . . T:O2-0 O2-1 . 0 A1:1,A2:2,A3:3 A1:P1,A2:P1,A3:P1" - " S+ $SETUP_A123_PARTITIONS . . . T:O3-0 O3-1 0 A1:1,A2:2,A3:3 A1:P1,A2:P1,A3:P1" - " S+ $SETUP_A123_PARTITIONS . T:O1-0 O2-0 O1-1 . 0 A1:1,A2:,A3:3 A1:P1,A2:P1,A3:P1" - " S+ $SETUP_A123_PARTITIONS . T:O1-0 O2-0 O2-1 . 0 A1:2-3,A2:2-3,A3:3 A1:P1,A2:P-1,A3:P-1" - - # test old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 fail ECPUs Pstate - # ---- ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ + " C0-1 . . C2-3 S+ C4-5 . O2=0 . 0 A1:0-1,B1:3" + "C0-3:P1:S+ C2-3:P1 . . O2=0 . . . . 0 A1:0-1,A2:3" + "C0-3:P1:S+ C2-3:P1 . . 
O2=0 O2=1 . . . 0 A1:0-1,A2:2-3" + "C0-3:P1:S+ C2-3:P1 . . O1=0 . . . . 0 A1:0,A2:2-3" + "C0-3:P1:S+ C2-3:P1 . . O1=0 O1=1 . . . 0 A1:0-1,A2:2-3" + "C2-3:P1:S+ C3:P1 . . O3=0 O3=1 . . . 0 A1:2,A2:3 A1:P1,A2:P1" + "C2-3:P1:S+ C3:P2 . . O3=0 O3=1 . . . 0 A1:2,A2:3 A1:P1,A2:P2" + "C2-3:P1:S+ C3:P1 . . O2=0 O2=1 . . . 0 A1:2,A2:3 A1:P1,A2:P1" + "C2-3:P1:S+ C3:P2 . . O2=0 O2=1 . . . 0 A1:2,A2:3 A1:P1,A2:P2" + "C2-3:P1:S+ C3:P1 . . O2=0 . . . . 0 A1:,A2:3 A1:P1,A2:P1" + "C2-3:P1:S+ C3:P1 . . O3=0 . . . . 0 A1:2,A2: A1:P1,A2:P1" + "C2-3:P1:S+ C3:P1 . . T:O2=0 . . . . 0 A1:3,A2:3 A1:P1,A2:P-1" + "C2-3:P1:S+ C3:P1 . . . T:O3=0 . . . 0 A1:2,A2:2 A1:P1,A2:P-1" + "$SETUP_A123_PARTITIONS . O1=0 . . . . 0 A1:,A2:2,A3:3 A1:P1,A2:P1,A3:P1" + "$SETUP_A123_PARTITIONS . O2=0 . . . . 0 A1:1,A2:,A3:3 A1:P1,A2:P1,A3:P1" + "$SETUP_A123_PARTITIONS . O3=0 . . . . 0 A1:1,A2:2,A3: A1:P1,A2:P1,A3:P1" + "$SETUP_A123_PARTITIONS . T:O1=0 . . . . 0 A1:2-3,A2:2-3,A3:3 A1:P1,A2:P-1,A3:P-1" + "$SETUP_A123_PARTITIONS . . T:O2=0 . . . 0 A1:1,A2:3,A3:3 A1:P1,A2:P1,A3:P-1" + "$SETUP_A123_PARTITIONS . . . T:O3=0 . . 0 A1:1,A2:2,A3:2 A1:P1,A2:P1,A3:P-1" + "$SETUP_A123_PARTITIONS . T:O1=0 O1=1 . . . 0 A1:1,A2:2,A3:3 A1:P1,A2:P1,A3:P1" + "$SETUP_A123_PARTITIONS . . T:O2=0 O2=1 . . 0 A1:1,A2:2,A3:3 A1:P1,A2:P1,A3:P1" + "$SETUP_A123_PARTITIONS . . . T:O3=0 O3=1 . 0 A1:1,A2:2,A3:3 A1:P1,A2:P1,A3:P1" + "$SETUP_A123_PARTITIONS . T:O1=0 O2=0 O1=1 . . 0 A1:1,A2:,A3:3 A1:P1,A2:P1,A3:P1" + "$SETUP_A123_PARTITIONS . T:O1=0 O2=0 O2=1 . . 0 A1:2-3,A2:2-3,A3:3 A1:P1,A2:P-1,A3:P-1" + + # old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 new-C1 fail ECPUs Pstate PCPUS ISOLCPUS + # ------ ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ ----- -------- + # + # isolcpus partition tests + # + + # isolcpus partition can have empty cpuset.cpus & effective cpus + " . . . P3 . . . . . 0 B1: B1:P3" + + # isolcpus partition is not exclusive + " C1-2 . . C3:P3 C1-3:S+ C3 . . . 
0 A1:1-2,A2:1-2,B1:3 B1:P3" + " C1-3 . . C3 . . . P3 . 0 A1:1-2,B1:3 B1:P3" + + # Only 1 isolcpus partition is allowed + " . . . C3:P3 C1:P3 . . . . 0 A1:1,B1:3 A1:P-3,B1:P3" + + # Isolated partition can pull isolated cpus from isolcpus partition + " C1-3:S+ C3 . C3:P3 . P2 . . . 0 A1:1-2,A2:3,B1: A2:P2,B1:P3 .:3,B1:3 3" + " C1-3:S+ C3 . C3:P3 . P2 . C2-3 . 0 A1:1,A2:3,B1:2 A2:P2,B1:P3 .:2-3,B1:3 2-3" + + # Isolated partition becomes invalid if cpu update fails pulling + " C1-3:S+ C3 . C3:P3 . P2:C2-3 . . . 0 A1:1-2,A2:2,B1:3 A2:P-2,B1:P3 .:3,B1: 3" + " C1-3:S+ C3 . C3:P3 . P2 . C1 . 0 A1:2-3,A2:3,B1:1 A2:P-2,B1:P3 .:1,B1: 1" + + # Once isolated partition pulls cpus from isolcpus, parent can shrink cpu list + " C1-3:S+ C3:P2 . C3:P3 C1-2 . . . . 0 A1:1-2,A2:3,B1: A2:P2,B1:P3 . 3" + " C1-3:S+ C3:P2 . C3:P3 C1 . . . . 0 A1:1,A2:3,B1: A2:P2,B1:P3 . 3" + + # Isolated partition can't be enabled if it can't pull all isolated cpus from parent or isolcpus + " C1-3:S+ C2 . C3:P3 . P2 . . . 0 A1:1-2,A2:2,B1:3 A2:P-2,B1:P3" + + # Isolated/isolcpus partition online/offline tests + " C1-3:S+ C3 . C2-3:P3 . P2 O2=0 . . 0 A1:1,A2:3,B1: A2:P2,B1:P3 .:2-3,B1:3 2-3" + " C1-3:S+ C3 . C2-3:P3 . P2 O2=0 O2=1 . 0 A1:1,A2:3,B1:2 A2:P2,B1:P3 .:2-3,B1:3 2-3" + " C1-3:S+ C2-3 . C2-3:P3 . P2 O2=0 . . 0 A1:1,A2:3,B1: A2:P2,B1:P3 .:2-3,B1:2-3 2-3" + " C1-3:S+ C2-3 . C2-3:P3 . P2 O2=0 O2=1 . 0 A1:1,A2:2-3,B1: A2:P2,B1:P3 .:2-3,B1:2-3 2-3" + + # Isolated partition pulling from isolcpus become invalid if all isolated cpus gone + " C1-3:S+ C3 . C2-3:P3 . P2 O3=0 . . 0 A1:1,A2:1,B1:2 A2:P-2,B1:P3 .:2-3,B1:" + " C1-3:S+ C3 . C2-3:P3 . P2 O3=0 O3=1 . 0 A1:1,A2:1,B1:2-3 A2:P-2,B1:P3 .:2-3,B1:" + + # Hotplug won't affect isolcpus partition with empty cpus_allowed + " C1-3 . . P3 . . O1=0 . . 
0 A1:2-3,B1: B1:P3" + + # old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 new-C1 fail ECPUs Pstate PCPUS ISOLCPUS + # ------ ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ ----- -------- # # Incorrect change to cpuset.cpus invalidates partition root # # Adding CPUs to partition root that are not in parent's # cpuset.cpus is allowed, but those extra CPUs are ignored. - " S+ C2-3:P1:S+ C3:P1 . . . C2-4 . . 0 A1:,A2:2-3 A1:P1,A2:P1" + "C2-3:P1:S+ C3:P1 . . . C2-4 . . . 0 A1:,A2:2-3 A1:P1,A2:P1" # Taking away all CPUs from parent or itself if there are tasks # will make the partition invalid. - " S+ C2-3:P1:S+ C3:P1 . . T C2-3 . . 0 A1:2-3,A2:2-3 A1:P1,A2:P-1" - " S+ C3:P1:S+ C3 . . T P1 . . 0 A1:3,A2:3 A1:P1,A2:P-1" - " S+ $SETUP_A123_PARTITIONS . T:C2-3 . . . 0 A1:2-3,A2:2-3,A3:3 A1:P1,A2:P-1,A3:P-1" - " S+ $SETUP_A123_PARTITIONS . T:C2-3:C1-3 . . . 0 A1:1,A2:2,A3:3 A1:P1,A2:P1,A3:P1" + "C2-3:P1:S+ C3:P1 . . T C2-3 . . . 0 A1:2-3,A2:2-3 A1:P1,A2:P-1" + " C3:P1:S+ C3 . . T P1 . . . 0 A1:3,A2:3 A1:P1,A2:P-1" + "$SETUP_A123_PARTITIONS . T:C2-3 . . . . 0 A1:2-3,A2:2-3,A3:3 A1:P1,A2:P-1,A3:P-1" + "$SETUP_A123_PARTITIONS . T:C2-3:C1-3 . . . . 0 A1:1,A2:2,A3:3 A1:P1,A2:P1,A3:P1" # Changing a partition root to member makes child partitions invalid - " S+ C2-3:P1:S+ C3:P1 . . P0 . . . 0 A1:2-3,A2:3 A1:P0,A2:P-1" - " S+ $SETUP_A123_PARTITIONS . C2-3 P0 . . 0 A1:2-3,A2:2-3,A3:3 A1:P1,A2:P0,A3:P-1" + "C2-3:P1:S+ C3:P1 . . P0 . . . . 0 A1:2-3,A2:3 A1:P0,A2:P-1" + "$SETUP_A123_PARTITIONS . C2-3 P0 . . . 0 A1:2-3,A2:2-3,A3:3 A1:P1,A2:P0,A3:P-1" # cpuset.cpus can contains cpus not in parent's cpuset.cpus as long # as they overlap. - " S+ C2-3:P1:S+ . . . . C3-4:P1 . . 0 A1:2,A2:3 A1:P1,A2:P1" + "C2-3:P1:S+ . . . . C3-4:P1 . . . 0 A1:2,A2:3 A1:P1,A2:P1" # Deletion of CPUs distributed to child cgroup is allowed. - " S+ C0-1:P1:S+ C1 . C2-3 C4-5 . . . 0 A1:4-5,A2:4-5" + "C0-1:P1:S+ C1 . C2-3 C4-5 . . . . 
0 A1:4-5,A2:4-5" # To become a valid partition root, cpuset.cpus must overlap parent's # cpuset.cpus. - " S+ C0-1:P1 . . C2-3 S+ C4-5:P1 . . 0 A1:0-1,A2:0-1 A1:P1,A2:P-1" + " C0-1:P1 . . C2-3 S+ C4-5:P1 . . . 0 A1:0-1,A2:0-1 A1:P1,A2:P-1" # Enabling partition with child cpusets is allowed - " S+ C0-1:S+ C1 . C2-3 P1 . . . 0 A1:0-1,A2:1 A1:P1" + " C0-1:S+ C1 . C2-3 P1 . . . . 0 A1:0-1,A2:1 A1:P1" # A partition root with non-partition root parent is invalid, but it # can be made valid if its parent becomes a partition root too. - " S+ C0-1:S+ C1 . C2-3 . P2 . . 0 A1:0-1,A2:1 A1:P0,A2:P-2" - " S+ C0-1:S+ C1:P2 . C2-3 P1 . . . 0 A1:0,A2:1 A1:P1,A2:P2" + " C0-1:S+ C1 . C2-3 . P2 . . . 0 A1:0-1,A2:1 A1:P0,A2:P-2" + " C0-1:S+ C1:P2 . C2-3 P1 . . . . 0 A1:0,A2:1 A1:P1,A2:P2" # A non-exclusive cpuset.cpus change will invalidate partition and its siblings - " S+ C0-1:P1 . . C2-3 C0-2 . . . 0 A1:0-2,B1:2-3 A1:P-1,B1:P0" - " S+ C0-1:P1 . . P1:C2-3 C0-2 . . . 0 A1:0-2,B1:2-3 A1:P-1,B1:P-1" - " S+ C0-1 . . P1:C2-3 C0-2 . . . 0 A1:0-2,B1:2-3 A1:P0,B1:P-1" + " C0-1:P1 . . C2-3 C0-2 . . . . 0 A1:0-2,B1:2-3 A1:P-1,B1:P0" + " C0-1:P1 . . P1:C2-3 C0-2 . . . . 0 A1:0-2,B1:2-3 A1:P-1,B1:P-1" + " C0-1 . . P1:C2-3 C0-2 . . . . 0 A1:0-2,B1:2-3 A1:P0,B1:P-1" - # test old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 fail ECPUs Pstate - # ---- ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ + # old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 new-C1 fail ECPUs Pstate PCPUS ISOLCPUS + # ------ ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ ----- -------- # Failure cases: # A task cannot be added to a partition with no cpu - " S+ C2-3:P1:S+ C3:P1 . . O2-0:T . . . 1 A1:,A2:3 A1:P1,A2:P1" + "C2-3:P1:S+ C3:P1 . . O2=0:T . . . . 1 A1:,A2:3 A1:P1,A2:P1" + + # Task is not allowed in an isolcpus partition + " . . . C3:P3 . . . T . 1" + + # Child cpuset is not allowed under an isolcpus partition + " C1:P3 . . . S+ . . . . 
1" ) # # Write to the cpu online file -# $1 - <c>-<v> where <c> = cpu number, <v> value to be written +# $1 - <c>=<v> where <c> = cpu number, <v> value to be written # write_cpu_online() { - CPU=${1%-*} - VAL=${1#*-} + CPU=${1%=*} + VAL=${1#*=} CPUFILE=//sys/devices/system/cpu/cpu${CPU}/online if [[ $VAL -eq 0 ]] then @@ -349,11 +400,12 @@ set_ctrl_state() TMPMSG=/tmp/.msg_$$ CGRP=$1 STATE=$2 - SHOWERR=${3}${VERBOSE} + SHOWERR=${3} CTRL=${CTRL:=$CONTROLLER} HASERR=0 REDIRECT="2> $TMPMSG" [[ -z "$STATE" || "$STATE" = '.' ]] && return 0 + [[ $VERBOSE -gt 0 ]] && SHOWERR=1 rm -f $TMPMSG for CMD in $(echo $STATE | sed -e "s/:/ /g") @@ -383,6 +435,9 @@ set_ctrl_state() ;; 2) VAL=isolated ;; + 3) + VAL=isolcpus + ;; *) echo "Invalid partition state - $VAL" exit 1 @@ -430,7 +485,7 @@ online_cpus() [[ -n "OFFLINE_CPUS" ]] && { for C in $OFFLINE_CPUS do - write_cpu_online ${C}-1 + write_cpu_online ${C}=1 done } } @@ -442,19 +497,23 @@ reset_cgroup_states() { echo 0 > $CGROUP2/cgroup.procs online_cpus - rmdir A1/A2/A3 A1/A2 A1 B1 > /dev/null 2>&1 + rmdir A1/A2/A3 A1/A2 A1 B1 C1 > /dev/null 2>&1 set_ctrl_state . S- pause 0.01 } dump_states() { - for DIR in A1 A1/A2 A1/A2/A3 B1 + for DIR in . 
A1 A1/A2 A1/A2/A3 B1 C1 do ECPUS=$DIR/cpuset.cpus.effective PRS=$DIR/cpuset.cpus.partition + PCPUS=$DIR/cpuset.cpus.subpartitions + [[ -e $PCPUS ]] || + PCPUS=$DIR/.__DEBUG__.cpuset.cpus.subpartitions [[ -e $ECPUS ]] && echo "$ECPUS: $(cat $ECPUS)" [[ -e $PRS ]] && echo "$PRS: $(cat $PRS)" + [[ -e $PCPUS ]] && echo "$PCPUS: $(cat $PCPUS)" done } @@ -478,6 +537,26 @@ check_effective_cpus() done } +# +# Check subparts cpus +# $1 - check string, format: <cgroup>:<cpu-list>[,<cgroup>:<cpu-list>]* +# +check_subparts_cpus() +{ + CHK_STR=$1 + for CHK in $(echo $CHK_STR | sed -e "s/,/ /g") + do + set -- $(echo $CHK | sed -e "s/:/ /g") + CGRP=$1 + CPUS=$2 + [[ $CGRP = A2 ]] && CGRP=A1/A2 + [[ $CGRP = A3 ]] && CGRP=A1/A2/A3 + FILE=$CGRP/.__DEBUG__.cpuset.cpus.subpartitions + [[ -e $FILE ]] || return 0 # Skip test + [[ $CPUS = $(cat $FILE) ]] || return 1 + done +} + # # Check cgroup states # $1 - check string, format: <cgroup>:<state>[,<cgroup>:<state>]* @@ -512,18 +591,80 @@ check_cgroup_states() isolated) VAL=2 ;; + isolcpus) + VAL=3 + ;; "root invalid"*) VAL=-1 ;; "isolated invalid"*) VAL=-2 ;; + "isolcpus invalid"*) + VAL=-3 + ;; esac [[ $EVAL != $VAL ]] && return 1 done return 0 } +# +# Get isolated (including offline) CPUs by looking at +# /sys/kernel/debug/sched/domains and compare that with the expected value. +# +# $1 - expected isolated cpu list +# +check_isolcpus() +{ + EXPECT_VAL=$1 + ISOLCPUS= + LASTISOLCPU= + SCHED_DOMAINS=/sys/kernel/debug/sched/domains + [[ -d $SCHED_DOMAINS ]] || return 0 # Skip check + + for ((CPU=0; CPU < $NR_CPUS; CPU++)) + do + [[ -n "$(ls ${SCHED_DOMAINS}/cpu$CPU)" ]] && continue + + if [[ -z "$LASTISOLCPU" ]] + then + ISOLCPUS=$CPU + LASTISOLCPU=$CPU + elif [[ "$LASTISOLCPU" -eq $((CPU - 1)) ]] + then + echo $ISOLCPUS | grep -q "\<$LASTISOLCPU\$" + if [[ $? 
-eq 0 ]] + then + ISOLCPUS=${ISOLCPUS}- + fi + LASTISOLCPU=$CPU + else + if [[ $ISOLCPUS = *- ]] + then + ISOLCPUS=${ISOLCPUS}$LASTISOLCPU + fi + ISOLCPUS=${ISOLCPUS},$CPU + LASTISOLCPU=$CPU + fi + done + [[ "$ISOLCPUS" = *- ]] && ISOLCPUS=${ISOLCPUS}$LASTISOLCPU + [[ $EXPECT_VAL = $ISOLCPUS ]] +} + +test_fail() +{ + TESTNUM=$1 + TESTTYPE=$2 + ADDINFO=$3 + echo "Test $TEST[$TESTNUM] failed $TESTTYPE check!" + [[ -n "$ADDINFO" ]] && echo "*** $ADDINFO ***" + eval echo \"\${$TEST[$I]}\" + echo + dump_states + exit 1 +} + # # Run cpuset state transition test # $1 - test matrix name @@ -548,60 +689,59 @@ run_state_test() while [[ $I -lt $CNT ]] do echo "Running test $I ..." > /dev/console + [[ $VERBOSE -gt 1 ]] && eval echo \"\${$TEST[$I]}\" eval set -- "\${$TEST[$I]}" - ROOT=$1 - OLD_A1=$2 - OLD_A2=$3 - OLD_A3=$4 - OLD_B1=$5 - NEW_A1=$6 - NEW_A2=$7 - NEW_A3=$8 - NEW_B1=$9 + OLD_A1=$1 + OLD_A2=$2 + OLD_A3=$3 + OLD_B1=$4 + NEW_A1=$5 + NEW_A2=$6 + NEW_A3=$7 + NEW_B1=$8 + NEW_C1=$9 RESULT=${10} ECPUS=${11} STATES=${12} + PCPUS=${13} + ICPUS=${14} - set_ctrl_state_noerr . $ROOT + set_ctrl_state_noerr . "S+" + set_ctrl_state_noerr B1 $OLD_B1 set_ctrl_state_noerr A1 $OLD_A1 set_ctrl_state_noerr A1/A2 $OLD_A2 set_ctrl_state_noerr A1/A2/A3 $OLD_A3 - set_ctrl_state_noerr B1 $OLD_B1 RETVAL=0 set_ctrl_state A1 $NEW_A1; ((RETVAL += $?)) set_ctrl_state A1/A2 $NEW_A2; ((RETVAL += $?)) set_ctrl_state A1/A2/A3 $NEW_A3; ((RETVAL += $?)) set_ctrl_state B1 $NEW_B1; ((RETVAL += $?)) + set_ctrl_state C1 $NEW_C1; ((RETVAL += $?)) - [[ $RETVAL -ne $RESULT ]] && { - echo "Test $TEST[$I] failed result check!" - eval echo \"\${$TEST[$I]}\" - dump_states - exit 1 - } + [[ $RETVAL -ne $RESULT ]] && test_fail $I result [[ -n "$ECPUS" && "$ECPUS" != . ]] && { check_effective_cpus $ECPUS - [[ $? -ne 0 ]] && { - echo "Test $TEST[$I] failed effective CPU check!" - eval echo \"\${$TEST[$I]}\" - echo - dump_states - exit 1 - } + [[ $? 
-ne 0 ]] && test_fail $I "effective CPU" } - [[ -n "$STATES" ]] && { + [[ -n "$STATES" && "$STATES" != . ]] && { check_cgroup_states $STATES - [[ $? -ne 0 ]] && { - echo "FAILED: Test $TEST[$I] failed states check!" - eval echo \"\${$TEST[$I]}\" - echo - dump_states - exit 1 - } + [[ $? -ne 0 ]] && test_fail $I states } + [[ -n "$PCPUS" && "$PCPUS" != . ]] && { + check_subparts_cpus $PCPUS + [[ $? -ne 0 ]] && test_fail $I "subpartitions CPU" + } + + # Compare the expected isolated CPUs with the actual ones, + # if available + [[ -n "$ICPUS" ]] && { + check_isolcpus $ICPUS + [[ $? -ne 0 ]] && test_fail $I "isolated CPU" \ + "Expect $ICPUS, get $ISOLCPUS instead" + } reset_cgroup_states # # Check to see if effective cpu list changes @@ -612,7 +752,7 @@ run_state_test() echo "Effective cpus changed to $NEWLIST after test $I!" exit 1 } - [[ -n "$VERBOSE" ]] && echo "Test $I done." + [[ $VERBOSE -gt 0 ]] && echo "Test $I done." ((I++)) done echo "All $I tests of $TEST PASSED." @@ -655,7 +795,7 @@ test_inotify() rm -f $PRS wait_inotify $PWD/cpuset.cpus.partition $PRS & pause 0.01 - set_ctrl_state . "O1-0" + set_ctrl_state . "O1=0" pause 0.01 check_cgroup_states ".:P-1" if [[ $? -ne 0 ]] -- 2.31.1
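The new check_isolcpus() helper in this patch collects the CPUs whose directory under /sys/kernel/debug/sched/domains is empty and folds consecutive CPU numbers into cpuset list syntax before comparing the result with the expected string. The folding step on its own can be sketched as below. This is a standalone illustration, not the code from the patch: the name compress_cpu_list is hypothetical, and the arguments are assumed to be CPU numbers in ascending order.

```shell
#!/bin/bash
#
# compress_cpu_list - hypothetical helper, not part of the patch.
# Fold an ascending list of CPU numbers into cpuset list syntax,
# e.g. "0 1 2 5" becomes "0-2,5".
#
compress_cpu_list()
{
	local out="" start="" last="" cpu run

	# "end" is a sentinel that forces the last run to be flushed
	for cpu in "$@" end
	do
		if [[ -n "$start" && "$cpu" != end && "$cpu" -eq $((last + 1)) ]]
		then
			last=$cpu	# CPU extends the current run
			continue
		fi
		if [[ -n "$start" ]]
		then
			# Flush the finished run as "N" or "N-M"
			run=$start
			[[ "$last" -ne "$start" ]] && run="$start-$last"
			out="${out:+$out,}$run"
		fi
		start=$cpu
		last=$cpu
	done
	echo "$out"
}
```

Calling `compress_cpu_list 0 1 2 5` yields a string in the same form that the test matrix uses for its expected isolated CPU column, so the two can be compared directly.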


[RFC PATCH 4/5] cgroup/cpuset: Documentation update for the new "isolcpus" partition

by Waiman Long

This patch updates the cgroup-v2.rst file to include information about the new "isolcpus" partition type. Signed-off-by: Waiman Long <longman(a)redhat.com> --- Documentation/admin-guide/cgroup-v2.rst | 89 +++++++++++++++++++------ 1 file changed, 70 insertions(+), 19 deletions(-) diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst index f67c0829350b..352a02849fa7 100644 --- a/Documentation/admin-guide/cgroup-v2.rst +++ b/Documentation/admin-guide/cgroup-v2.rst @@ -2225,7 +2225,8 @@ Cpuset Interface Files ========== ===================================== "member" Non-root member of a partition "root" Partition root - "isolated" Partition root without load balancing + "isolcpus" Partition root for isolated CPUs pool + "isolated" Partition root for isolated CPUs ========== ===================================== The root cgroup is always a partition root and its state @@ -2237,24 +2238,41 @@ Cpuset Interface Files its descendants except those that are separate partition roots themselves and their descendants. + When set to "isolcpus", the CPUs in that partition root will + be in an isolated state without any load balancing from the + scheduler. This partition root is special as there can be at + most one instance of it in a system and no task or child cpuset + is allowed in this cgroup. It acts as a pool of isolated CPUs to + be pulled into other "isolated" partitions. The "cpuset.cpus" + of an "isolcpus" partition root contains the list of isolated + CPUs it holds, where "cpuset.cpus.effective" contains the list + of freely available isolated CPUs that are ready to be pull + into other "isolated" partition. + When set to "isolated", the CPUs in that partition root will be in an isolated state without any load balancing from the scheduler. Tasks placed in such a partition with multiple CPUs should be carefully distributed and bound to each of the - individual CPUs for optimal performance. 
- - The value shown in "cpuset.cpus.effective" of a partition root - is the CPUs that the partition root can dedicate to a potential - new child partition root. The new child subtracts available - CPUs from its parent "cpuset.cpus.effective". - - A partition root ("root" or "isolated") can be in one of the - two possible states - valid or invalid. An invalid partition - root is in a degraded state where some state information may - be retained, but behaves more like a "member". - - All possible state transitions among "member", "root" and - "isolated" are allowed. + individual CPUs for optimal performance. The isolated CPUs can + come from either the parent partition root or from an "isolcpus" + partition if the parent cannot satisfy its request. + + The value shown in "cpuset.cpus.effective" of a partition root is + the CPUs that the partition root can dedicate to a potential new + child partition root. The new child partition subtracts available + CPUs from its parent "cpuset.cpus.effective". An exception is + an "isolated" partition that pulls its isolated CPUs from the + "isolcpus" partition root that is not its direct parent. + + A partition root can be in one of the two possible states - + valid or invalid. An invalid partition root is in a degraded + state where some state information may be retained, but behaves + more like a "member". + + All possible state transitions among "member", "root", "isolcpus" + and "isolated" are allowed. However, the partition root may + not be valid if the corresponding prerequisite conditions are + not met. On read, the "cpuset.cpus.partition" file can show the following values. 
@@ -2262,16 +2280,18 @@ Cpuset Interface Files ============================= ===================================== "member" Non-root member of a partition "root" Partition root - "isolated" Partition root without load balancing + "isolcpus" Partition root for isolated CPUs pool + "isolated" Partition root for isolated CPUs "root invalid (<reason>)" Invalid partition root + "isolcpus invalid (<reason>)" Invalid isolcpus partition root "isolated invalid (<reason>)" Invalid isolated partition root ============================= ===================================== In the case of an invalid partition root, a descriptive string on - why the partition is invalid is included within parentheses. + why the partition is invalid may be included within parentheses. - For a partition root to become valid, the following conditions - must be met. + For a "root" partition root to become valid, the following + conditions must be met. 1) The "cpuset.cpus" is exclusive with its siblings , i.e. they are not shared by any of its siblings (exclusivity rule). @@ -2281,6 +2301,37 @@ Cpuset Interface Files 4) The "cpuset.cpus.effective" cannot be empty unless there is no task associated with this partition. + A valid "isolcpus" partition root requires the following + conditions. + + 1) The parent cgroup is a valid partition root. + 2) The "cpuset.cpus" must be a subset of parent's "cpuset.cpus" + including an empty cpu list. + 3) There can be no more than one valid "isolcpus" partition. + 4) No task or child cpuset is allowed. + + Note that an "isolcpus" partition is not exclusive and its + isolated CPUs can be distributed down sibling cgroups even + though they may not appear in their "cpuset.cpus.effective". + + A valid "isolated" partition root can pull isolated CPUs from + either its parent partition or from the "isolcpus" partition. + It also requires the following conditions to be met. + + 1) The "cpuset.cpus" is exclusive with its siblings , i.e. 
they + are not shared by any of its siblings (exclusivity rule). + 2) The "cpuset.cpus" is not empty and must be a subset of + parent's "cpuset.cpus". + 3) The "cpuset.cpus.effective" cannot be empty unless there is + no task associated with this partition. + + If pulling isolated CPUS from "isolcpus" partition, + the "cpuset.cpus" must also be a subset of "isolcpus" + partition's "cpuset.cpus" and all the requested CPUs must + be available for pulling, i.e. in "isolcpus" partition's + "cpuset.cpus.effective". In this case, its hierarchical parent + does not need to be a valid partition root. + External events like hotplug or changes to "cpuset.cpus" can cause a valid partition root to become invalid and vice versa. Note that a task cannot be moved to a cgroup with empty -- 2.31.1
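As a rough illustration of the interface this documentation patch describes, the sequence below creates an "isolcpus" pool and then an "isolated" partition that requests CPUs from it. It is only a sketch under assumptions: the "isolcpus" value is the one proposed in this RFC series, the function names and the cgroup names passed to them are hypothetical, the first argument is assumed to be the cgroup v2 mount point, and the cpuset controller is assumed to be enabled already in the relevant cgroup.subtree_control files.

```shell
#!/bin/bash
#
# Hypothetical helpers, not part of the patch.
# $1 - cgroup v2 mount point (assumed, e.g. /sys/fs/cgroup)
# $2 - cgroup path relative to the mount point
# $3 - cpu list to place in cpuset.cpus
#

# Create the pool cgroup and turn it into the "isolcpus" partition
make_isolcpus_pool()
{
	local cgroot=$1 pool=$2 cpus=$3

	mkdir -p "$cgroot/$pool" || return 1
	echo "$cpus" > "$cgroot/$pool/cpuset.cpus" || return 1
	echo isolcpus > "$cgroot/$pool/cpuset.cpus.partition"
}

# Request an "isolated" partition and print the resulting state
make_isolated()
{
	local cgroot=$1 cgrp=$2 cpus=$3

	mkdir -p "$cgroot/$cgrp" || return 1
	echo "$cpus" > "$cgroot/$cgrp/cpuset.cpus" || return 1
	echo isolated > "$cgroot/$cgrp/cpuset.cpus.partition" || return 1
	# Reads back "isolated" or "isolated invalid (<reason>)"
	cat "$cgroot/$cgrp/cpuset.cpus.partition"
}
```

Reading "cpuset.cpus.partition" back after the write is the documented way to tell whether the partition became valid, since the write itself may succeed while the partition ends up in the invalid state.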


[RFC PATCH 3/5] cgroup/cpuset: Make isolated partition pull CPUs from isolcpus partition

by Waiman Long

With the addition of a new "isolcpus" partition in a previous patch, this patch adds the capability for a privileged user to pull isolated CPUs from the "isolcpus" partition into an "isolated" partition if its parent cannot satisfy its request directly.

The following conditions must be true for the pulling of isolated CPUs from the "isolcpus" partition to be successful.

(1) The value of "cpuset.cpus" must still be a subset of its parent's "cpuset.cpus" to ensure proper inheritance even though these CPUs cannot be used until the cpuset becomes an "isolated" partition.
(2) All the CPUs in "cpuset.cpus" are freely available in the "isolcpus" partition, i.e. in its "cpuset.cpus.effective" and not yet claimed by other isolated partitions.

With this change, the CPUs in an "isolated" partition can come either from the "isolcpus" partition or from its direct parent, but not both. The parent of an isolated partition no longer needs to be a partition root. Because of the cpu exclusive nature of an "isolated" partition, these isolated CPUs cannot be distributed to other siblings of that isolated partition. Changes to "cpuset.cpus" of such an isolated partition are allowed as long as all the newly requested CPUs can be granted from the "isolcpus" partition. Otherwise, the partition will become invalid.

This makes the management and distribution of isolated CPUs to those applications that require them much easier. An "isolated" partition that pulls CPUs from the special "isolcpus" partition can now have 2 parents: the "isolcpus" partition where it gets its isolated CPUs and its hierarchical parent where it gets all the other resources. However, such an "isolated" partition cannot have subpartitions as all the CPUs from "isolcpus" must be in the same isolated state.
Signed-off-by: Waiman Long <longman(a)redhat.com> --- kernel/cgroup/cpuset.c | 282 ++++++++++++++++++++++++++++++++++++++--- 1 file changed, 264 insertions(+), 18 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 444eae3a9a6b..a5bbd43ed46e 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -101,6 +101,7 @@ enum prs_errcode { PERR_ISOLCPUS, PERR_ISOLTASK, PERR_ISOLCHILD, + PERR_ISOPARENT, }; static const char * const perr_strings[] = { @@ -114,6 +115,7 @@ static const char * const perr_strings[] = { [PERR_ISOLCPUS] = "An isolcpus partition is already present", [PERR_ISOLTASK] = "Isolcpus partition can't have tasks", [PERR_ISOLCHILD] = "Isolcpus partition can't have children", + [PERR_ISOPARENT] = "Isolated/isolcpus parent can't have subpartition", }; struct cpuset { @@ -1333,6 +1335,195 @@ static void update_partition_sd_lb(struct cpuset *cs, int old_prs) rebuild_sched_domains_locked(); } +/* + * isolcpus_pull - Enable or disable pulling of isolated cpus from isolcpus + * @cs: the cpuset to update + * @cmd: the command code (only partcmd_enable or partcmd_disable) + * Return: 1 if successful, 0 if error + * + * Note that pulling isolated cpus from isolcpus or cpus from parent does + * not require rebuilding sched domains. So we can change the flags directly. + */ +static int isolcpus_pull(struct cpuset *cs, enum subparts_cmd cmd) +{ + struct cpuset *parent = parent_cs(cs); + + if (!isolcpus_cs) + return 0; + + /* + * To enable pulling of isolated CPUs from isolcpus, cpus_allowed + * must be a subset of both its parent's cpus_allowed and isolcpus_cs's + * effective_cpus and the user has sysadmin privilege. + */ + if ((cmd == partcmd_enable) && capable(CAP_SYS_ADMIN) && + cpumask_subset(cs->cpus_allowed, isolcpus_cs->effective_cpus) && + cpumask_subset(cs->cpus_allowed, parent->cpus_allowed)) { + /* + * Move cpus from effective_cpus to subparts_cpus & make + * cs a child of isolcpus partition. 
+ */ + spin_lock_irq(&callback_lock); + cpumask_andnot(isolcpus_cs->effective_cpus, + isolcpus_cs->effective_cpus, cs->cpus_allowed); + cpumask_or(isolcpus_cs->subparts_cpus, + isolcpus_cs->subparts_cpus, cs->cpus_allowed); + cpumask_copy(cs->effective_cpus, cs->cpus_allowed); + isolcpus_cs->nr_subparts_cpus + = cpumask_weight(isolcpus_cs->subparts_cpus); + + if (cs->use_parent_ecpus) { + cs->use_parent_ecpus = false; + parent->child_ecpus_count--; + } + list_add(&cs->isol_sibling, &isol_children); + clear_bit(CS_SCHED_LOAD_BALANCE, &cs->flags); + spin_unlock_irq(&callback_lock); + return 1; + } + + if ((cmd == partcmd_disable) && !list_empty(&cs->isol_sibling)) { + /* + * This can be called after isolcpus shrinks its cpu list. + * So not all the cpus should be returned back to isolcpus. + */ + WARN_ON_ONCE(cs->partition_root_state != PRS_ISOLATED); + spin_lock_irq(&callback_lock); + cpumask_andnot(isolcpus_cs->subparts_cpus, + isolcpus_cs->subparts_cpus, cs->cpus_allowed); + cpumask_or(isolcpus_cs->effective_cpus, + isolcpus_cs->effective_cpus, cs->effective_cpus); + cpumask_and(isolcpus_cs->effective_cpus, + isolcpus_cs->effective_cpus, + isolcpus_cs->cpus_allowed); + cpumask_and(isolcpus_cs->effective_cpus, + isolcpus_cs->effective_cpus, cpu_active_mask); + isolcpus_cs->nr_subparts_cpus + = cpumask_weight(isolcpus_cs->subparts_cpus); + + if (!cpumask_and(cs->effective_cpus, parent->effective_cpus, + cs->cpus_allowed)) { + cs->use_parent_ecpus = true; + parent->child_ecpus_count++; + cpumask_copy(cs->effective_cpus, + parent->effective_cpus); + } + list_del_init(&cs->isol_sibling); + cs->partition_root_state = PRS_INVALID_ISOLATED; + cs->prs_err = PERR_INVCPUS; + + set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags); + clear_bit(CS_CPU_EXCLUSIVE, &cs->flags); + spin_unlock_irq(&callback_lock); + return 1; + } + return 0; +} + +static void isolcpus_disable(void) +{ + struct cpuset *child, *next; + + list_for_each_entry_safe(child, next, &isol_children, isol_sibling) + 
WARN_ON_ONCE(isolcpus_pull(child, partcmd_disable)); + + isolcpus_cs = NULL; +} + +/* + * isolcpus_cpus_update - cpuset.cpus change in isolcpus partition + */ +static void isolcpus_cpus_update(struct cpuset *cs) +{ + struct cpuset *child, *next; + + if (WARN_ON_ONCE(isolcpus_cs != cs)) + return; + + if (list_empty(&isol_children)) + return; + + /* + * Remove child isolated partitions that are not fully covered by + * subparts_cpus. + */ + list_for_each_entry_safe(child, next, &isol_children, + isol_sibling) { + if (cpumask_subset(child->cpus_allowed, + cs->subparts_cpus)) + continue; + + isolcpus_pull(child, partcmd_disable); + } +} + +/* + * isolated_cpus_update - cpuset.cpus change in isolated partition + * + * Return: 1 if no further action needs, 0 otherwise + */ +static int isolated_cpus_update(struct cpuset *cs, struct cpumask *newmask, + struct tmpmasks *tmp) +{ + struct cpumask *addmask = tmp->addmask; + struct cpumask *delmask = tmp->delmask; + + if (WARN_ON_ONCE(cs->partition_root_state != PRS_ISOLATED) || + list_empty(&cs->isol_sibling)) + return 0; + + if (WARN_ON_ONCE(!isolcpus_cs) || cpumask_empty(newmask)) { + isolcpus_pull(cs, partcmd_disable); + return 0; + } + + if (cpumask_andnot(addmask, newmask, cs->cpus_allowed)) { + /* + * Check if isolcpus partition can provide the new CPUs + */ + if (!cpumask_subset(addmask, isolcpus_cs->cpus_allowed) || + cpumask_intersects(addmask, isolcpus_cs->subparts_cpus)) { + isolcpus_pull(cs, partcmd_disable); + return 0; + } + + /* + * Pull addmask isolated CPUs from isolcpus partition + */ + spin_lock_irq(&callback_lock); + cpumask_andnot(isolcpus_cs->subparts_cpus, + isolcpus_cs->subparts_cpus, addmask); + cpumask_andnot(isolcpus_cs->effective_cpus, + isolcpus_cs->effective_cpus, addmask); + isolcpus_cs->nr_subparts_cpus + = cpumask_weight(isolcpus_cs->subparts_cpus); + spin_unlock_irq(&callback_lock); + } + + if (cpumask_andnot(tmp->delmask, cs->cpus_allowed, newmask)) { + /* + * Return isolated CPUs back to 
isolcpus partition + */ + spin_lock_irq(&callback_lock); + cpumask_or(isolcpus_cs->subparts_cpus, + isolcpus_cs->subparts_cpus, delmask); + cpumask_or(isolcpus_cs->effective_cpus, + isolcpus_cs->effective_cpus, delmask); + cpumask_and(isolcpus_cs->effective_cpus, + isolcpus_cs->effective_cpus, cpu_active_mask); + isolcpus_cs->nr_subparts_cpus + = cpumask_weight(isolcpus_cs->subparts_cpus); + spin_unlock_irq(&callback_lock); + } + + spin_lock_irq(&callback_lock); + cpumask_copy(cs->cpus_allowed, newmask); + cpumask_andnot(cs->effective_cpus, newmask, cs->subparts_cpus); + cpumask_and(cs->effective_cpus, cs->effective_cpus, cpu_active_mask); + spin_unlock_irq(&callback_lock); + return 1; +} + /** * update_parent_subparts_cpumask - update subparts_cpus mask of parent cpuset * @cs: The cpuset that requests change in partition root state @@ -1579,7 +1770,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd, spin_unlock_irq(&callback_lock); if ((isolcpus_cs == cs) && (cs->partition_root_state != PRS_ISOLCPUS)) - isolcpus_cs = NULL; + isolcpus_disable(); if (adding || deleting) update_tasks_cpumask(parent, tmp->addmask); @@ -1625,6 +1816,12 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp, struct cpuset *parent = parent_cs(cp); bool update_parent = false; + /* + * Skip isolated cpuset that pull isolated CPUs from isolcpus + */ + if (!list_empty(&cp->isol_sibling)) + continue; + compute_effective_cpumask(tmp->new_cpus, cp, parent); /* @@ -1742,7 +1939,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp, WARN_ON(!is_in_v2_mode() && !cpumask_equal(cp->cpus_allowed, cp->effective_cpus)); - update_tasks_cpumask(cp, tmp->new_cpus); + update_tasks_cpumask(cp, cp->effective_cpus); /* * On legacy hierarchy, if the effective cpumask of any non- @@ -1888,6 +2085,10 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs, return retval; if (cs->partition_root_state) { + if 
(!list_empty(&cs->isol_sibling) && + isolated_cpus_update(cs, trialcs->cpus_allowed, &tmp)) + goto update_hier; /* CPUs update done */ + if (invalidate) update_parent_subparts_cpumask(cs, partcmd_invalidate, NULL, &tmp); @@ -1920,6 +2121,7 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs, } spin_unlock_irq(&callback_lock); +update_hier: #ifdef CONFIG_CPUMASK_OFFSTACK /* Now trialcs->cpus_allowed is available */ tmp.new_cpus = trialcs->cpus_allowed; @@ -1928,8 +2130,7 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs, /* effective_cpus will be updated here */ update_cpumasks_hier(cs, &tmp, false); - if (cs->partition_root_state) { - bool force = (cs->partition_root_state == PRS_ISOLCPUS); + if (cs->partition_root_state && list_empty(&cs->isol_sibling)) { struct cpuset *parent = parent_cs(cs); /* @@ -1937,8 +2138,12 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs, * cpusets if they use parent's effective_cpus or when * the current cpuset is an isolcpus partition. */ - if (parent->child_ecpus_count || force) - update_sibling_cpumasks(parent, cs, &tmp, force); + if (cs->partition_root_state == PRS_ISOLCPUS) { + update_sibling_cpumasks(parent, cs, &tmp, true); + isolcpus_cpus_update(cs); + } else if (parent->child_ecpus_count) { + update_sibling_cpumasks(parent, cs, &tmp, false); + } /* Update CS_SCHED_LOAD_BALANCE and/or sched_domains */ update_partition_sd_lb(cs, old_prs); @@ -2307,7 +2512,7 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs, return err; } -/** +/* * update_prstate - update partition_root_state * @cs: the cpuset to update * @new_prs: new partition root state @@ -2325,13 +2530,10 @@ static int update_prstate(struct cpuset *cs, int new_prs) return 0; /* - * For a previously invalid partition root, leave it at being - * invalid if new_prs is not "member". + * For a previously invalid partition root, treat it like a "member". 
*/ - if (new_prs && is_prs_invalid(old_prs)) { - cs->partition_root_state = -new_prs; - return 0; - } + if (new_prs && is_prs_invalid(old_prs)) + old_prs = PRS_MEMBER; if (alloc_cpumasks(NULL, &tmpmask)) return -ENOMEM; @@ -2371,6 +2573,21 @@ static int update_prstate(struct cpuset *cs, int new_prs) } } + /* + * A parent isolated partition that gets its isolated CPUs from + * isolcpus cannot have subpartition. + */ + if (new_prs && !list_empty(&parent->isol_sibling)) { + err = PERR_ISOPARENT; + goto out; + } + + if ((old_prs == PRS_ISOLATED) && !list_empty(&cs->isol_sibling)) { + isolcpus_pull(cs, partcmd_disable); + old_prs = 0; + } + WARN_ON_ONCE(!list_empty(&cs->isol_sibling)); + err = update_partition_exclusive(cs, new_prs); if (err) goto out; @@ -2386,6 +2603,10 @@ static int update_prstate(struct cpuset *cs, int new_prs) err = update_parent_subparts_cpumask(cs, partcmd_enable, NULL, &tmpmask); + if (err && (new_prs == PRS_ISOLATED) && + isolcpus_pull(cs, partcmd_enable)) + err = 0; /* Successful isolcpus pull */ + if (err) goto out; } else if (old_prs && new_prs) { @@ -2445,7 +2666,7 @@ static int update_prstate(struct cpuset *cs, int new_prs) if (new_prs == PRS_ISOLCPUS) isolcpus_cs = cs; else if (cs == isolcpus_cs) - isolcpus_cs = NULL; + isolcpus_disable(); /* * Update child cpusets, if present. @@ -3674,8 +3895,31 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp) } parent = parent_cs(cs); - compute_effective_cpumask(&new_cpus, cs, parent); nodes_and(new_mems, cs->mems_allowed, parent->effective_mems); + /* + * In the special case of a valid isolated cpuset pulling isolated + * cpus from isolcpus. We just need to mask offline cpus from + * cpus_allowed unless all the isolated cpus are gone. 
+ */ + if (!list_empty(&cs->isol_sibling)) { + if (!cpumask_and(&new_cpus, cs->cpus_allowed, cpu_active_mask)) + isolcpus_pull(cs, partcmd_disable); + } else if ((cs->partition_root_state == PRS_ISOLCPUS) && + cpumask_empty(cs->cpus_allowed)) { + /* + * For isolcpus with empty cpus_allowed, just update + * effective_mems and be done with it. + */ + spin_lock_irq(&callback_lock); + if (nodes_empty(new_mems)) + cs->effective_mems = parent->effective_mems; + else + cs->effective_mems = new_mems; + spin_unlock_irq(&callback_lock); + goto unlock; + } else { + compute_effective_cpumask(&new_cpus, cs, parent); + } if (cs->nr_subparts_cpus) /* @@ -3707,10 +3951,12 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp) * the following conditions hold: * 1) empty effective cpus but not valid empty partition. * 2) parent is invalid or doesn't grant any cpus to child - * partitions. + * partitions and not an isolated cpuset pulling cpus from + * isolcpus. */ - if (is_partition_valid(cs) && (!parent->nr_subparts_cpus || - (cpumask_empty(&new_cpus) && partition_is_populated(cs, NULL)))) { + if (is_partition_valid(cs) && + ((!parent->nr_subparts_cpus && list_empty(&cs->isol_sibling)) || + (cpumask_empty(&new_cpus) && partition_is_populated(cs, NULL)))) { int old_prs, parent_prs; update_parent_subparts_cpumask(cs, partcmd_disable, NULL, tmp); -- 2.31.1
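The two cpumask_subset() tests that isolcpus_pull() applies before enabling a pull can be modelled in userspace on cpu list strings. The sketch below is an illustration only, not the kernel logic itself: all function names are hypothetical, the CAP_SYS_ADMIN check is left out, and rejecting an empty "cpuset.cpus" follows the documented rule for "isolated" partitions rather than anything isolcpus_pull() tests directly.

```shell
#!/bin/bash
#
# Hypothetical helpers modelling the pull preconditions on cpu lists.
#

# Expand a cpu list such as "0-2,5" into "0 1 2 5"
expand_cpu_list()
{
	local item lo hi cpu out=""

	for item in ${1//,/ }
	do
		lo=${item%-*}	# "0-2" gives 0, "5" gives 5
		hi=${item#*-}	# "0-2" gives 2, "5" gives 5
		for ((cpu = lo; cpu <= hi; cpu++))
		do
			out="$out $cpu"
		done
	done
	echo $out
}

# Return 0 if every CPU in list $1 is also in list $2
cpus_subset()
{
	local cpu super

	super=" $(expand_cpu_list "$2") "
	for cpu in $(expand_cpu_list "$1")
	do
		[[ "$super" == *" $cpu "* ]] || return 1
	done
	return 0
}

#
# $1 - "cpuset.cpus" of the cpuset asking to become "isolated"
# $2 - "cpuset.cpus" of its hierarchical parent
# $3 - "cpuset.cpus.effective" of the "isolcpus" partition
#
can_pull_isolated()
{
	[[ -n "$1" ]] || return 1	# assumption: empty list is rejected
	cpus_subset "$1" "$2" && cpus_subset "$1" "$3"
}
```

For example, `can_pull_isolated 2-3 0-7 2-5` succeeds, while `can_pull_isolated 2-3 0-7 3-5` fails because CPU 2 is no longer free in the pool.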


[RFC PATCH 2/5] cgroup/cpuset: Add a new "isolcpus" partition root state

by Waiman Long

One can use "cpuset.cpus.partition" to create multiple scheduling domains or to produce a set of isolated CPUs where load balancing is disabled. The former use case is less common but the latter one can be frequently used, especially for Telco use cases like DPDK.

The existing "isolated" partition can be used to produce isolated CPUs if the applications have full control of a system. However, in a containerized environment where all the apps are run in a container, it is hard to distribute isolated CPUs from the root down given the unified hierarchy nature of cgroup v2. The container running on isolated CPUs can be several layers down from the root. The current partition feature requires that all the ancestors of a leaf partition root must be partition roots themselves. This can be hard to manage.

This patch introduces a new special partition root state called "isolcpus" that serves as a pool of isolated CPUs to be pulled into other "isolated" partitions. At most one instance of the "isolcpus" partition is allowed in a system, preferably as a child of the top cpuset.

In a valid "isolcpus" partition, "cpuset.cpus" contains the set of isolated CPUs and "cpuset.cpus.effective" contains the set of freely available isolated CPUs that have not yet been pulled into other "isolated" cpusets.

The special "isolcpus" partition cannot have normal cpuset children. So we are not allowed to enable child cpuset in its "cgroup.subtree_control" file if it has children. Tasks are also not allowed in the "cgroup.procs" of the "isolcpus" partition. Unlike other partition roots, empty "cpuset.cpus" is allowed in the "isolcpus" partition as this special cpuset is not designed to hold tasks.

The CPUs in the "isolcpus" partition are not exclusive so that those isolated CPUs can be distributed down sibling hierarchies as usual even though they will not show up in their "cpuset.cpus.effective".

Right now, an "isolcpus" partition only disables load balancing of the isolated CPUs.
In the near future, it may be extended to support additional isolation attributes like those currently supported by the "isolcpus" or related kernel boot command line options. In a subsequent patch, a privileged user can change a "member" cpuset to an "isolated" partition root by pulling isolated CPUs from the "isolcpus" partition if its parent is not a partition root that can directly satisfy the request. Signed-off-by: Waiman Long <longman(a)redhat.com> --- kernel/cgroup/cpuset.c | 158 ++++++++++++++++++++++++++++++++++------- 1 file changed, 133 insertions(+), 25 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 83a7193e0f2c..444eae3a9a6b 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -98,6 +98,9 @@ enum prs_errcode { PERR_NOCPUS, PERR_HOTPLUG, PERR_CPUSEMPTY, + PERR_ISOLCPUS, + PERR_ISOLTASK, + PERR_ISOLCHILD, }; static const char * const perr_strings[] = { @@ -108,6 +111,9 @@ static const char * const perr_strings[] = { [PERR_NOCPUS] = "Parent unable to distribute cpu downstream", [PERR_HOTPLUG] = "No cpu available due to hotplug", [PERR_CPUSEMPTY] = "cpuset.cpus is empty", + [PERR_ISOLCPUS] = "An isolcpus partition is already present", + [PERR_ISOLTASK] = "Isolcpus partition can't have tasks", + [PERR_ISOLCHILD] = "Isolcpus partition can't have children", }; struct cpuset { @@ -198,6 +204,9 @@ struct cpuset { /* Handle for cpuset.cpus.partition */ struct cgroup_file partition_file; + + /* siblings list anchored at isol_children */ + struct list_head isol_sibling; }; /* @@ -206,14 +215,26 @@ struct cpuset { * 0 - member (not a partition root) * 1 - partition root * 2 - partition root without load balancing (isolated) + * 3 - isolated cpu pool (isolcpus) * -1 - invalid partition root * -2 - invalid isolated partition root + * -3 - invalid isolated cpu pool + * + * An isolated cpu pool is a special isolated partition root. At most one + * instance of it is allowed in a system. 
It provides a pool of isolated + * cpus that a normal isolated partition root can pull from, if privileged, + * in case its parent cannot fulfill its request. */ #define PRS_MEMBER 0 #define PRS_ROOT 1 #define PRS_ISOLATED 2 +#define PRS_ISOLCPUS 3 #define PRS_INVALID_ROOT -1 #define PRS_INVALID_ISOLATED -2 +#define PRS_INVALID_ISOLCPUS -3 + +static struct cpuset *isolcpus_cs; /* System isolcpus partition root */ +static struct list_head isol_children; /* Children that pull isolated cpus */ static inline bool is_prs_invalid(int prs_state) { @@ -335,6 +356,7 @@ static struct cpuset top_cpuset = { .flags = ((1 << CS_ONLINE) | (1 << CS_CPU_EXCLUSIVE) | (1 << CS_MEM_EXCLUSIVE)), .partition_root_state = PRS_ROOT, + .isol_sibling = LIST_HEAD_INIT(top_cpuset.isol_sibling), }; /** @@ -1282,7 +1304,7 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs, */ static int update_partition_exclusive(struct cpuset *cs, int new_prs) { - bool exclusive = (new_prs > 0); + bool exclusive = (new_prs == PRS_ROOT) || (new_prs == PRS_ISOLATED); if (exclusive && !is_cpu_exclusive(cs)) { if (update_flag(CS_CPU_EXCLUSIVE, cs, 1)) @@ -1303,7 +1325,7 @@ static int update_partition_exclusive(struct cpuset *cs, int new_prs) static void update_partition_sd_lb(struct cpuset *cs, int old_prs) { int new_prs = cs->partition_root_state; - bool new_lb = (new_prs != PRS_ISOLATED); + bool new_lb = (new_prs != PRS_ISOLATED) && (new_prs != PRS_ISOLCPUS); if (new_lb != !!is_sched_load_balance(cs)) update_flag(CS_SCHED_LOAD_BALANCE, cs, new_lb); @@ -1360,18 +1382,20 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd, int part_error = PERR_NONE; /* Partition error? */ percpu_rwsem_assert_held(&cpuset_rwsem); + old_prs = new_prs = cs->partition_root_state; /* * The parent must be a partition root. * The new cpumask, if present, or the current cpus_allowed must - * not be empty. + * not be empty except for isolcpus partition. 
*/ if (!is_partition_valid(parent)) { return is_partition_invalid(parent) ? PERR_INVPARENT : PERR_NOTPART; } - if ((newmask && cpumask_empty(newmask)) || - (!newmask && cpumask_empty(cs->cpus_allowed))) + if ((new_prs != PRS_ISOLCPUS) && + ((newmask && cpumask_empty(newmask)) || + (!newmask && cpumask_empty(cs->cpus_allowed)))) return PERR_CPUSEMPTY; /* @@ -1379,7 +1403,6 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd, * partcmd_invalidate commands. */ adding = deleting = false; - old_prs = new_prs = cs->partition_root_state; if (cmd == partcmd_enable) { /* * Enabling partition root is not allowed if cpus_allowed @@ -1498,11 +1521,13 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd, switch (cs->partition_root_state) { case PRS_ROOT: case PRS_ISOLATED: + case PRS_ISOLCPUS: if (part_error) new_prs = -old_prs; break; case PRS_INVALID_ROOT: case PRS_INVALID_ISOLATED: + case PRS_INVALID_ISOLCPUS: if (!part_error) new_prs = -old_prs; break; @@ -1553,6 +1578,9 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd, spin_unlock_irq(&callback_lock); + if ((isolcpus_cs == cs) && (cs->partition_root_state != PRS_ISOLCPUS)) + isolcpus_cs = NULL; + if (adding || deleting) update_tasks_cpumask(parent, tmp->addmask); @@ -1640,7 +1668,14 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp, */ old_prs = new_prs = cp->partition_root_state; if ((cp != cs) && old_prs) { - switch (parent->partition_root_state) { + int parent_prs = parent->partition_root_state; + + /* + * isolcpus partition parent can't have children + */ + WARN_ON_ONCE(parent_prs == PRS_ISOLCPUS); + + switch (parent_prs) { case PRS_ROOT: case PRS_ISOLATED: update_parent = true; @@ -1735,9 +1770,10 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp, * @parent: Parent cpuset * @cs: Current cpuset * @tmp: Temp variables + * @force: Force update if set */ static void update_sibling_cpumasks(struct cpuset 
*parent, struct cpuset *cs, - struct tmpmasks *tmp) + struct tmpmasks *tmp, bool force) { struct cpuset *sibling; struct cgroup_subsys_state *pos_css; @@ -1756,7 +1792,7 @@ static void update_sibling_cpumasks(struct cpuset *parent, struct cpuset *cs, cpuset_for_each_child(sibling, pos_css, parent) { if (sibling == cs) continue; - if (!sibling->use_parent_ecpus) + if (!sibling->use_parent_ecpus && !force) continue; if (!css_tryget_online(&sibling->css)) continue; @@ -1893,14 +1929,16 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs, update_cpumasks_hier(cs, &tmp, false); if (cs->partition_root_state) { + bool force = (cs->partition_root_state == PRS_ISOLCPUS); struct cpuset *parent = parent_cs(cs); /* * For partition root, update the cpumasks of sibling - * cpusets if they use parent's effective_cpus. + * cpusets if they use parent's effective_cpus or when + * the current cpuset is an isolcpus partition. */ - if (parent->child_ecpus_count) - update_sibling_cpumasks(parent, cs, &tmp); + if (parent->child_ecpus_count || force) + update_sibling_cpumasks(parent, cs, &tmp, force); /* Update CS_SCHED_LOAD_BALANCE and/or sched_domains */ update_partition_sd_lb(cs, old_prs); @@ -2298,6 +2336,41 @@ static int update_prstate(struct cpuset *cs, int new_prs) if (alloc_cpumasks(NULL, &tmpmask)) return -ENOMEM; + /* + * Only one isolcpus partition is allowed and it can't have children + * or tasks in it. The isolcpus partition is also not exclusive so + * that the isolated but unused cpus can be distributed down the + * hierarchy. 
+ */ + if (new_prs == PRS_ISOLCPUS) { + if (isolcpus_cs) + err = PERR_ISOLCPUS; + else if (!list_empty(&cs->css.children)) + err = PERR_ISOLCHILD; + else if (cs->css.cgroup->nr_populated_csets) + err = PERR_ISOLTASK; + + if (err && old_prs) { + /* + * A previous valid partition root is now invalid + */ + goto disable_partition; + } else if (err) { + goto out; + } + + /* + * Unlike other partition types, an isolated cpu pool can + * be empty as it is essentially a place holder for isolated + * CPUs. + */ + if (!old_prs && cpumask_empty(cs->cpus_allowed)) { + /* Force effective_cpus to be empty too */ + cpumask_clear(cs->effective_cpus); + goto out; + } + } + err = update_partition_exclusive(cs, new_prs); if (err) goto out; @@ -2316,11 +2389,9 @@ static int update_prstate(struct cpuset *cs, int new_prs) if (err) goto out; } else if (old_prs && new_prs) { - /* - * A change in load balance state only, no change in cpumasks. - */ - goto out; + goto out; /* Skip cpuset and sibling task update */ } else { +disable_partition: /* * Switching back to member is always allowed even if it * disables child partitions. @@ -2342,8 +2413,13 @@ static int update_prstate(struct cpuset *cs, int new_prs) update_tasks_cpumask(parent, tmpmask.new_cpus); - if (parent->child_ecpus_count) - update_sibling_cpumasks(parent, cs, &tmpmask); + /* + * Since isolcpus partition is not exclusive, we have to update + * sibling hierarchies as well. + */ + if ((new_prs == PRS_ISOLCPUS) || parent->child_ecpus_count) + update_sibling_cpumasks(parent, cs, &tmpmask, + new_prs == PRS_ISOLCPUS); out: /* @@ -2363,6 +2439,14 @@ static int update_prstate(struct cpuset *cs, int new_prs) /* Update sched domains and load balance flag */ update_partition_sd_lb(cs, old_prs); + /* + * Check isolcpus_cs state + */ + if (new_prs == PRS_ISOLCPUS) + isolcpus_cs = cs; + else if (cs == isolcpus_cs) + isolcpus_cs = NULL; + /* * Update child cpusets, if present. * Force update if switching back to member. 
@@ -2486,7 +2570,12 @@ static struct cpuset *cpuset_attach_old_cs; */ static int cpuset_can_attach_check(struct cpuset *cs) { + /* + * Task cannot be moved to a cpuset with empty effective cpus or + * is an isolcpus partition. + */ if (cpumask_empty(cs->effective_cpus) || + (cs->partition_root_state == PRS_ISOLCPUS) || (!is_in_v2_mode() && nodes_empty(cs->mems_allowed))) return -ENOSPC; return 0; @@ -2902,24 +2991,30 @@ static s64 cpuset_read_s64(struct cgroup_subsys_state *css, struct cftype *cft) static int sched_partition_show(struct seq_file *seq, void *v) { struct cpuset *cs = css_cs(seq_css(seq)); + int prs = cs->partition_root_state; const char *err, *type = NULL; - switch (cs->partition_root_state) { + switch (prs) { case PRS_ROOT: seq_puts(seq, "root\n"); break; case PRS_ISOLATED: seq_puts(seq, "isolated\n"); break; + case PRS_ISOLCPUS: + seq_puts(seq, "isolcpus\n"); + break; case PRS_MEMBER: seq_puts(seq, "member\n"); break; - case PRS_INVALID_ROOT: - type = "root"; - fallthrough; - case PRS_INVALID_ISOLATED: - if (!type) + default: + if (prs == PRS_INVALID_ROOT) + type = "root"; + else if (prs == PRS_INVALID_ISOLATED) type = "isolated"; + else + type = "isolcpus"; + err = perr_strings[READ_ONCE(cs->prs_err)]; if (err) seq_printf(seq, "%s invalid (%s)\n", type, err); @@ -2948,6 +3043,8 @@ static ssize_t sched_partition_write(struct kernfs_open_file *of, char *buf, val = PRS_MEMBER; else if (!strcmp(buf, "isolated")) val = PRS_ISOLATED; + else if (!strcmp(buf, "isolcpus")) + val = PRS_ISOLCPUS; else return -EINVAL; @@ -3157,6 +3254,7 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css) nodes_clear(cs->effective_mems); fmeter_init(&cs->fmeter); cs->relax_domain_level = -1; + INIT_LIST_HEAD(&cs->isol_sibling); /* Set CS_MEMORY_MIGRATE for default hierarchy */ if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys)) @@ -3171,6 +3269,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css) struct cpuset *parent = parent_cs(cs); struct cpuset *tmp_cs; 
struct cgroup_subsys_state *pos_css; + int err = 0; if (!parent) return 0; @@ -3178,6 +3277,14 @@ static int cpuset_css_online(struct cgroup_subsys_state *css) cpus_read_lock(); percpu_down_write(&cpuset_rwsem); + /* + * An isolcpus partition cannot have direct children. + */ + if (parent->partition_root_state == PRS_ISOLCPUS) { + err = -EINVAL; + goto out_unlock; + } + set_bit(CS_ONLINE, &cs->flags); if (is_spread_page(parent)) set_bit(CS_SPREAD_PAGE, &cs->flags); @@ -3229,7 +3336,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css) out_unlock: percpu_up_write(&cpuset_rwsem); cpus_read_unlock(); - return 0; + return err; } /* @@ -3434,6 +3541,7 @@ int __init cpuset_init(void) fmeter_init(&top_cpuset.fmeter); set_bit(CS_SCHED_LOAD_BALANCE, &top_cpuset.flags); top_cpuset.relax_domain_level = -1; + INIT_LIST_HEAD(&isol_children); BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL)); -- 2.31.1
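The order of the precondition checks that update_prstate() applies before enabling an isolcpus partition can be modelled in user space. This is a minimal sketch, not kernel code: `struct cpuset_model` and `isolcpus_precheck()` are hypothetical names, and the numeric values of the PERR_* codes are placeholders chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Error codes named in the patch; the numeric values are placeholders. */
enum prs_errcode { PERR_NONE = 0, PERR_ISOLCPUS, PERR_ISOLCHILD, PERR_ISOLTASK };

/* Hypothetical, reduced view of a cpuset for this sketch. */
struct cpuset_model {
	bool has_children;	/* stands in for !list_empty(&cs->css.children) */
	int nr_populated;	/* stands in for cs->css.cgroup->nr_populated_csets */
};

/*
 * Same order as the patch: an existing isolcpus partition is rejected
 * first, then child cpusets, then attached tasks.
 */
static enum prs_errcode isolcpus_precheck(const struct cpuset_model *cs,
					  const struct cpuset_model *isolcpus_cs)
{
	if (isolcpus_cs)
		return PERR_ISOLCPUS;
	if (cs->has_children)
		return PERR_ISOLCHILD;
	if (cs->nr_populated)
		return PERR_ISOLTASK;
	return PERR_NONE;
}
```

In the patch itself, a nonzero result either invalidates a previously valid partition (the disable_partition path) or jumps straight to the out label.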

2 years, 2 months

[RFC PATCH 1/5] cgroup/cpuset: Extract out CS_CPU_EXCLUSIVE & CS_SCHED_LOAD_BALANCE handling

by Waiman Long

Extract out the setting of CS_CPU_EXCLUSIVE and CS_SCHED_LOAD_BALANCE flags as well as the rebuilding of scheduling domains into the new update_partition_exclusive() and update_partition_sd_lb() helper functions to simplify the logic. The update_partition_exclusive() helper is called mainly at the beginning of the caller, but it may be called at the end too. The update_partition_sd_lb() helper is called at the end of the caller. This patch should reduce the chance that cpuset partition will end up in an incorrect state. Signed-off-by: Waiman Long <longman(a)redhat.com> --- kernel/cgroup/cpuset.c | 124 ++++++++++++++++++++++++----------------- 1 file changed, 72 insertions(+), 52 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 937ef4d60cd4..83a7193e0f2c 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1252,7 +1252,7 @@ static void update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus) static void compute_effective_cpumask(struct cpumask *new_cpus, struct cpuset *cs, struct cpuset *parent) { - if (parent->nr_subparts_cpus) { + if (parent->nr_subparts_cpus && is_partition_valid(cs)) { cpumask_or(new_cpus, parent->effective_cpus, parent->subparts_cpus); cpumask_and(new_cpus, new_cpus, cs->cpus_allowed); @@ -1274,6 +1274,43 @@ enum subparts_cmd { static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs, int turning_on); + +/* + * Update partition exclusive flag + * + * Return: 0 if successful, an error code otherwise + */ +static int update_partition_exclusive(struct cpuset *cs, int new_prs) +{ + bool exclusive = (new_prs > 0); + + if (exclusive && !is_cpu_exclusive(cs)) { + if (update_flag(CS_CPU_EXCLUSIVE, cs, 1)) + return PERR_NOTEXCL; + } else if (!exclusive && is_cpu_exclusive(cs)) { + /* Turning off CS_CPU_EXCLUSIVE will not return error */ + update_flag(CS_CPU_EXCLUSIVE, cs, 0); + } + return 0; +} + +/* + * Update partition load balance flag and/or rebuild sched domain + * + * Changing load 
balance flag will automatically call + * rebuild_sched_domains_locked(). + */ +static void update_partition_sd_lb(struct cpuset *cs, int old_prs) +{ + int new_prs = cs->partition_root_state; + bool new_lb = (new_prs != PRS_ISOLATED); + + if (new_lb != !!is_sched_load_balance(cs)) + update_flag(CS_SCHED_LOAD_BALANCE, cs, new_lb); + else if ((new_prs > 0) || (old_prs > 0)) + rebuild_sched_domains_locked(); +} + /** * update_parent_subparts_cpumask - update subparts_cpus mask of parent cpuset * @cs: The cpuset that requests change in partition root state @@ -1477,14 +1514,13 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd, /* * Transitioning between invalid to valid or vice versa may require - * changing CS_CPU_EXCLUSIVE and CS_SCHED_LOAD_BALANCE. + * changing CS_CPU_EXCLUSIVE. */ if (old_prs != new_prs) { - if (is_prs_invalid(old_prs) && !is_cpu_exclusive(cs) && - (update_flag(CS_CPU_EXCLUSIVE, cs, 1) < 0)) - return PERR_NOTEXCL; - if (is_prs_invalid(new_prs) && is_cpu_exclusive(cs)) - update_flag(CS_CPU_EXCLUSIVE, cs, 0); + int err = update_partition_exclusive(cs, new_prs); + + if (err) + return err; } /* @@ -1521,15 +1557,16 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd, update_tasks_cpumask(parent, tmp->addmask); /* - * Set or clear CS_SCHED_LOAD_BALANCE when partcmd_update, if necessary. - * rebuild_sched_domains_locked() may be called. + * For partcmd_update without newmask, it is being called from + * cpuset_hotplug_workfn() where cpus_read_lock() wasn't taken. + * Update the load balance flag and scheduling domain if + * cpus_read_trylock() is successful. 
*/ - if (old_prs != new_prs) { - if (old_prs == PRS_ISOLATED) - update_flag(CS_SCHED_LOAD_BALANCE, cs, 1); - else if (new_prs == PRS_ISOLATED) - update_flag(CS_SCHED_LOAD_BALANCE, cs, 0); + if ((cmd == partcmd_update) && !newmask && cpus_read_trylock()) { + update_partition_sd_lb(cs, old_prs); + cpus_read_unlock(); } + notify_partition_change(cs, old_prs); return 0; } @@ -1744,6 +1781,7 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs, int retval; struct tmpmasks tmp; bool invalidate = false; + int old_prs = cs->partition_root_state; /* top_cpuset.cpus_allowed tracks cpu_online_mask; it's read-only */ if (cs == &top_cpuset) @@ -1863,6 +1901,9 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs, */ if (parent->child_ecpus_count) update_sibling_cpumasks(parent, cs, &tmp); + + /* Update CS_SCHED_LOAD_BALANCE and/or sched_domains */ + update_partition_sd_lb(cs, old_prs); } return 0; } @@ -2239,7 +2280,6 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs, static int update_prstate(struct cpuset *cs, int new_prs) { int err = PERR_NONE, old_prs = cs->partition_root_state; - bool sched_domain_rebuilt = false; struct cpuset *parent = parent_cs(cs); struct tmpmasks tmpmask; @@ -2258,45 +2298,28 @@ static int update_prstate(struct cpuset *cs, int new_prs) if (alloc_cpumasks(NULL, &tmpmask)) return -ENOMEM; + err = update_partition_exclusive(cs, new_prs); + if (err) + goto out; + if (!old_prs) { /* - * Turning on partition root requires setting the - * CS_CPU_EXCLUSIVE bit implicitly as well and cpus_allowed - * cannot be empty. + * cpus_allowed cannot be empty. 
*/ if (cpumask_empty(cs->cpus_allowed)) { err = PERR_CPUSEMPTY; goto out; } - err = update_flag(CS_CPU_EXCLUSIVE, cs, 1); - if (err) { - err = PERR_NOTEXCL; - goto out; - } - err = update_parent_subparts_cpumask(cs, partcmd_enable, NULL, &tmpmask); - if (err) { - update_flag(CS_CPU_EXCLUSIVE, cs, 0); + if (err) goto out; - } - - if (new_prs == PRS_ISOLATED) { - /* - * Disable the load balance flag should not return an - * error unless the system is running out of memory. - */ - update_flag(CS_SCHED_LOAD_BALANCE, cs, 0); - sched_domain_rebuilt = true; - } } else if (old_prs && new_prs) { /* * A change in load balance state only, no change in cpumasks. */ - update_flag(CS_SCHED_LOAD_BALANCE, cs, (new_prs != PRS_ISOLATED)); - sched_domain_rebuilt = true; - goto out; /* Sched domain is rebuilt in update_flag() */ + goto out; } else { /* * Switching back to member is always allowed even if it @@ -2315,15 +2338,6 @@ static int update_prstate(struct cpuset *cs, int new_prs) compute_effective_cpumask(cs->effective_cpus, cs, parent); spin_unlock_irq(&callback_lock); } - - /* Turning off CS_CPU_EXCLUSIVE will not return error */ - update_flag(CS_CPU_EXCLUSIVE, cs, 0); - - if (!is_sched_load_balance(cs)) { - /* Make sure load balance is on */ - update_flag(CS_SCHED_LOAD_BALANCE, cs, 1); - sched_domain_rebuilt = true; - } } update_tasks_cpumask(parent, tmpmask.new_cpus); @@ -2331,18 +2345,24 @@ static int update_prstate(struct cpuset *cs, int new_prs) if (parent->child_ecpus_count) update_sibling_cpumasks(parent, cs, &tmpmask); - if (!sched_domain_rebuilt) - rebuild_sched_domains_locked(); out: /* - * Make partition invalid if an error happen + * Make partition invalid & disable CS_CPU_EXCLUSIVE if an error + * happens. 
*/ - if (err) + if (err) { new_prs = -new_prs; + update_partition_exclusive(cs, new_prs); + } + spin_lock_irq(&callback_lock); cs->partition_root_state = new_prs; WRITE_ONCE(cs->prs_err, err); spin_unlock_irq(&callback_lock); + + /* Update sched domains and load balance flag */ + update_partition_sd_lb(cs, old_prs); + /* * Update child cpusets, if present. * Force update if switching back to member. -- 2.31.1
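The decision logic of the two new helpers can be exercised outside the kernel. This is a minimal sketch under stated assumptions: the numeric PRS_* values are assumed (the patch only shows that valid states are positive and invalid ones are their negation), and `enum sd_lb_action`, `wants_cpu_exclusive()` and `decide_sd_lb()` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Partition root states; numeric values are assumed for this sketch. */
enum { PRS_MEMBER = 0, PRS_ROOT = 1, PRS_ISOLATED = 2 };

/* Hypothetical action codes, introduced only for this example. */
enum sd_lb_action { SD_LB_NONE, SD_LB_SET_FLAG, SD_LB_REBUILD };

/* Mirrors "bool exclusive = (new_prs > 0)" in update_partition_exclusive(). */
static bool wants_cpu_exclusive(int new_prs)
{
	return new_prs > 0;
}

/*
 * Mirrors the branches of update_partition_sd_lb(): change the load
 * balance flag when it disagrees with the new state (which rebuilds the
 * sched domains as a side effect), otherwise rebuild only when a valid
 * partition is involved.
 */
static enum sd_lb_action decide_sd_lb(int old_prs, int new_prs,
				      bool lb_flag_set, bool *new_lb)
{
	*new_lb = (new_prs != PRS_ISOLATED);
	if (*new_lb != lb_flag_set)
		return SD_LB_SET_FLAG;
	if (new_prs > 0 || old_prs > 0)
		return SD_LB_REBUILD;
	return SD_LB_NONE;
}
```

Keeping the decision in one place is what lets update_prstate() drop its sched_domain_rebuilt bookkeeping.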

2 years, 2 months

[PATCH 00/11] selftests: hid: import the tests from hid-tools

by Benjamin Tissoires

I have been running hid-tools for a while, but it was in its own
separate repository for multiple reasons. Over the past few weeks
I finally managed to get the kernel tests in that repo into a state
where we can merge them in the kernel tree directly:

- the tests run in ~2 to 3 minutes
- the tests are way more reliable than previously
- the tests are mostly self-contained now (with the exception
  of the Sony ones)

To be able to run the tests we need to use the latest release
of hid-tools, as this project still keeps the HID parsing logic
and is capable of generating the HID events.

The series also ensures we can run the tests with vmtest.sh,
allowing for quick development and testing in the tree itself.

This should allow us to require tests to be added to a series
when we see fit and keep them alive properly instead of having
to deal with 2 repositories.

In Cc are all of the people who participated in the elaboration
of those tests, so please send back a signed-off-by for each
commit you are part of.

This series applies on top of the for-6.3/hid-bpf branch, which
is the one that added the tools/testing/selftests/hid directory.
Given that it is unlikely this series will make the cut for 6.3,
we might just consider this series to be based on top of the
future 6.3-rc1.

Cheers,
Benjamin

Signed-off-by: Benjamin Tissoires <benjamin.tissoires(a)redhat.com>
---
Benjamin Tissoires (11):
      selftests: hid: make vmtest rely on make
      selftests: hid: import hid-tools hid-core tests
      selftests: hid: import hid-tools hid-gamepad tests
      selftests: hid: import hid-tools hid-keyboards tests
      selftests: hid: import hid-tools hid-mouse tests
      selftests: hid: import hid-tools hid-multitouch and hid-tablets tests
      selftests: hid: import hid-tools wacom tests
      selftests: hid: import hid-tools hid-apple tests
      selftests: hid: import hid-tools hid-ite tests
      selftests: hid: import hid-tools hid-sony and hid-playstation tests
      selftests: hid: import hid-tools usb-crash tests

 tools/testing/selftests/hid/Makefile | 12 +
 tools/testing/selftests/hid/config | 11 +
 tools/testing/selftests/hid/hid-apple.sh | 7 +
 tools/testing/selftests/hid/hid-core.sh | 7 +
 tools/testing/selftests/hid/hid-gamepad.sh | 7 +
 tools/testing/selftests/hid/hid-ite.sh | 7 +
 tools/testing/selftests/hid/hid-keyboard.sh | 7 +
 tools/testing/selftests/hid/hid-mouse.sh | 7 +
 tools/testing/selftests/hid/hid-multitouch.sh | 7 +
 tools/testing/selftests/hid/hid-sony.sh | 7 +
 tools/testing/selftests/hid/hid-tablet.sh | 7 +
 tools/testing/selftests/hid/hid-usb_crash.sh | 7 +
 tools/testing/selftests/hid/hid-wacom.sh | 7 +
 tools/testing/selftests/hid/run-hid-tools-tests.sh | 28 +
 tools/testing/selftests/hid/settings | 3 +
 tools/testing/selftests/hid/tests/__init__.py | 2 +
 tools/testing/selftests/hid/tests/base.py | 345 ++++
 tools/testing/selftests/hid/tests/conftest.py | 81 +
 .../selftests/hid/tests/descriptors_wacom.py | 1360 +++++++++++++
 .../selftests/hid/tests/test_apple_keyboard.py | 440 +++++
 tools/testing/selftests/hid/tests/test_gamepad.py | 209 ++
 tools/testing/selftests/hid/tests/test_hid_core.py | 154 ++
 .../selftests/hid/tests/test_ite_keyboard.py | 166 ++
 tools/testing/selftests/hid/tests/test_keyboard.py | 485 +++++
 tools/testing/selftests/hid/tests/test_mouse.py | 977 +++++++++
 .../testing/selftests/hid/tests/test_multitouch.py | 2088 ++++++++++++++++++++
 tools/testing/selftests/hid/tests/test_sony.py | 282 +++
 tools/testing/selftests/hid/tests/test_tablet.py | 872 ++++++++
 .../testing/selftests/hid/tests/test_usb_crash.py | 103 +
 .../selftests/hid/tests/test_wacom_generic.py | 844 ++++++++
 tools/testing/selftests/hid/vmtest.sh | 25 +-
 31 files changed, 8554 insertions(+), 10 deletions(-)
---
base-commit: 2f7f4efb9411770b4ad99eb314d6418e980248b4
change-id: 20230217-import-hid-tools-tests-dc0cd4f3c8a8

Best regards,
-- 
Benjamin Tissoires <benjamin.tissoires(a)redhat.com>

2 years, 2 months

[PATCH] mm: huge_memory: Replace obsolete memalign() with posix_memalign()

by Deming Wang

memalign() is obsolete according to its manpage.

Replace memalign() with posix_memalign().

As a pointer is passed into posix_memalign(), initialize *one_page
to NULL to silence a warning about the function's return value being
used as uninitialized (which is not valid anyway because the error
is properly checked before p is returned).

Signed-off-by: Deming Wang <wangdeming(a)inspur.com>
---
 tools/testing/selftests/mm/split_huge_page_test.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index cbb5e6893cbf..94c7dffc4d7d 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -96,10 +96,10 @@ void split_pmd_thp(void)
 	char *one_page;
 	size_t len = 4 * pmd_pagesize;
 	size_t i;
+	int ret;
 
-	one_page = memalign(pmd_pagesize, len);
-
-	if (!one_page) {
+	ret = posix_memalign((void **)&one_page, pmd_pagesize, len);
+	if (ret < 0) {
 		printf("Fail to allocate memory\n");
 		exit(EXIT_FAILURE);
 	}
-- 
2.27.0
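For reference, POSIX specifies that posix_memalign() returns zero on success and a positive error number on failure, reported through the return value rather than through errno. A failed allocation is therefore caught by testing for a nonzero return, which a `ret < 0` comparison would not do. The following is a minimal sketch of that idiom; `alloc_aligned()` is a hypothetical helper and not part of the selftests.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns an aligned buffer, or NULL on failure.
 * The pointer is initialized to NULL so it is never read uninitialized.
 */
static void *alloc_aligned(size_t alignment, size_t len)
{
	void *p = NULL;
	int ret = posix_memalign(&p, alignment, len);

	if (ret != 0) {
		/* ret is the error number itself, so pass it to strerror() */
		fprintf(stderr, "posix_memalign: %s\n", strerror(ret));
		return NULL;
	}
	return p;
}
```

Taking a `void *` local and passing its address also avoids the `(void **)` cast on a `char *` variable; the caller can assign the result to any object pointer type.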

2 years, 2 months

[PATCH] selftests/powerpc: Replace obsolete memalign() with posix_memalign()

by Deming Wang

memalign() is obsolete according to its manpage.

Replace memalign() with posix_memalign() and remove malloc.h include
that was there for memalign().

As a pointer is passed into posix_memalign(), initialize *s to NULL
to silence a warning about the function's return value being used
as uninitialized (which is not valid anyway because the error
is properly checked before p is returned).

Signed-off-by: Deming Wang <wangdeming(a)inspur.com>
---
 tools/testing/selftests/powerpc/stringloops/strlen.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/powerpc/stringloops/strlen.c b/tools/testing/selftests/powerpc/stringloops/strlen.c
index 9055ebc484d0..f9c1f9cc2d32 100644
--- a/tools/testing/selftests/powerpc/stringloops/strlen.c
+++ b/tools/testing/selftests/powerpc/stringloops/strlen.c
@@ -1,5 +1,4 @@
 // SPDX-License-Identifier: GPL-2.0
-#include <malloc.h>
 #include <stdlib.h>
 #include <string.h>
 #include <time.h>
@@ -51,10 +50,11 @@ static void bench_test(char *s)
 static int testcase(void)
 {
 	char *s;
+	int ret;
 	unsigned long i;
 
-	s = memalign(128, SIZE);
-	if (!s) {
+	ret = posix_memalign((void **)&s, 128, SIZE);
+	if (ret < 0) {
 		perror("memalign");
 		exit(1);
 	}
-- 
2.27.0

2 years, 2 months

[PATCH] selftests/mm: Replace obsolete memalign() with posix_memalign()

by Deming Wang

memalign() is obsolete according to its manpage.

Replace memalign() with posix_memalign().

As a pointer is passed into posix_memalign(), initialize *map to NULL,
to silence a warning about the function's return value being used
as uninitialized (which is not valid anyway because the error
is properly checked before p is returned).

Signed-off-by: Deming Wang <wangdeming(a)inspur.com>
---
 tools/testing/selftests/mm/soft-dirty.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
index 21d8830c5f24..c99350e110ec 100644
--- a/tools/testing/selftests/mm/soft-dirty.c
+++ b/tools/testing/selftests/mm/soft-dirty.c
@@ -80,9 +80,9 @@ static void test_hugepage(int pagemap_fd, int pagesize)
 	int i, ret;
 	size_t hpage_len = read_pmd_pagesize();
 
-	map = memalign(hpage_len, hpage_len);
-	if (!map)
-		ksft_exit_fail_msg("memalign failed\n");
+	ret = posix_memalign((void **)(&map), hpage_len, hpage_len);
+	if (ret < 0)
+		ksft_exit_fail_msg("posix_memalign failed\n");
 
 	ret = madvise(map, hpage_len, MADV_HUGEPAGE);
 	if (ret)
-- 
2.27.0

2 years, 2 months

[PATCH] selftests/sgx: Add "test_encl.elf" to TEST_FILES

by Yi Lai

The "test_encl.elf" file used by test_sgx is not installed in INSTALL_PATH.
Attempting to execute test_sgx causes a false negative:

"
enclave executable open(): No such file or directory
main.c:188:unclobbered_vdso:Failed to load the test enclave.
"

Add "test_encl.elf" to TEST_FILES so that it will be installed.

Fixes: 2adcba79e69d ("selftests/x86: Add a selftest for SGX")
Signed-off-by: Yi Lai <yi1.lai(a)intel.com>
---
 tools/testing/selftests/sgx/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/sgx/Makefile b/tools/testing/selftests/sgx/Makefile
index 75af864e07b6..50aab6b57da3 100644
--- a/tools/testing/selftests/sgx/Makefile
+++ b/tools/testing/selftests/sgx/Makefile
@@ -17,6 +17,7 @@ ENCL_CFLAGS := -Wall -Werror -static -nostdlib -nostartfiles -fPIC \
 	       -fno-stack-protector -mrdrnd $(INCLUDES)
 
 TEST_CUSTOM_PROGS := $(OUTPUT)/test_sgx
+TEST_FILES := $(OUTPUT)/test_encl.elf
 
 ifeq ($(CAN_BUILD_X86_64), 1)
 all: $(TEST_CUSTOM_PROGS) $(OUTPUT)/test_encl.elf
-- 
2.25.1

2 years, 2 months

[PATCH] mm: huge_memory: Replace obsolete memalign() with posix_memalign()

by Deming Wang

memalign() is obsolete according to its manpage.

Replace memalign() with posix_memalign() and remove malloc.h include
that was there for memalign().

As a pointer is passed into posix_memalign(), initialize *p to NULL
to silence a warning about the function's return value being used
as uninitialized (which is not valid anyway because the error
is properly checked before p is returned).

Signed-off-by: Deming Wang <wangdeming(a)inspur.com>
---
 tools/testing/selftests/mm/split_huge_page_test.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index cbb5e6893cbf..8f48f07bc821 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -96,10 +96,10 @@ void split_pmd_thp(void)
 	char *one_page;
 	size_t len = 4 * pmd_pagesize;
 	size_t i;
+	int ret;
 
-	one_page = memalign(pmd_pagesize, len);
-
-	if (!one_page) {
+	ret = posix_memalign((void **)(&one_page), pmd_pagesize, len);
+	if (ret < 0) {
 		printf("Fail to allocate memory\n");
 		exit(EXIT_FAILURE);
 	}
-- 
2.27.0

2 years, 2 months

[PATCH] selftests/mm: Replace obsolete memalign() with posix_memalign()

by Deming Wang

memalign() is obsolete according to its manpage.

Replace memalign() with posix_memalign() and remove malloc.h include
that was there for memalign().

As a pointer is passed into posix_memalign(), initialize *p to NULL
to silence a warning about the function's return value being used
as uninitialized (which is not valid anyway because the error
is properly checked before p is returned).

Signed-off-by: Deming Wang <wangdeming(a)inspur.com>
---
 tools/testing/selftests/mm/soft-dirty.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
index 21d8830c5f24..4bb7421141a2 100644
--- a/tools/testing/selftests/mm/soft-dirty.c
+++ b/tools/testing/selftests/mm/soft-dirty.c
@@ -80,8 +80,8 @@ static void test_hugepage(int pagemap_fd, int pagesize)
 	int i, ret;
 	size_t hpage_len = read_pmd_pagesize();
 
-	map = memalign(hpage_len, hpage_len);
-	if (!map)
+	ret = posix_memalign((void *)(&map), hpage_len, hpage_len);
+	if (ret < 0)
 		ksft_exit_fail_msg("memalign failed\n");
 
 	ret = madvise(map, hpage_len, MADV_HUGEPAGE);
-- 
2.27.0

2 years, 2 months

Keywords for search ranking

by Adam Charachuta

Good morning, I have looked at your offer and am pleased to say that it attracts attention and invites further discussion. I thought I might be able to contribute to your growth and help this offer reach a wider audience. I do search engine positioning for websites, which lets them generate excellent web traffic. Could we talk sometime soon? Regards, Adam Charachuta

2 years, 2 months

[PATCH] selftests/mm: Replace obsolete memalign() with posix_memalign()

by Deming Wang

memalign() is obsolete according to its manpage.

Replace memalign() with posix_memalign() and remove malloc.h include
that was there for memalign().

As a pointer is passed into posix_memalign(), initialize *p to NULL
to silence a warning about the function's return value being used
as uninitialized (which is not valid anyway because the error
is properly checked before p is returned).

Signed-off-by: Deming Wang <wangdeming(a)inspur.com>
---
 tools/testing/selftests/mm/soft-dirty.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
index 21d8830c5f24..4bb7421141a2 100644
--- a/tools/testing/selftests/mm/soft-dirty.c
+++ b/tools/testing/selftests/mm/soft-dirty.c
@@ -80,8 +80,8 @@ static void test_hugepage(int pagemap_fd, int pagesize)
 	int i, ret;
 	size_t hpage_len = read_pmd_pagesize();
 
-	map = memalign(hpage_len, hpage_len);
-	if (!map)
+	ret = posix_memalign((void *)(&map), hpage_len, hpage_len);
+	if (ret < 0)
 		ksft_exit_fail_msg("memalign failed\n");
 
 	ret = madvise(map, hpage_len, MADV_HUGEPAGE);
-- 
2.27.0

2 years, 2 months

kselftest/next kselftest-seccomp: 7 runs, 3 regressions (v6.3-rc1-17-g266679ffd867)

by kernelci.org bot

kselftest/next kselftest-seccomp: 7 runs, 3 regressions (v6.3-rc1-17-g266679ffd867)

Regressions Summary
-------------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite              | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1
mt8183-kukui-...uniper-sku16 | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:  https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.3-rc1-17-g266…

  Test:     kselftest-seccomp
  Tree:     kselftest
  Branch:   next
  Describe: v6.3-rc1-17-g266679ffd867
  URL:      https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
  SHA:      266679ffd867cb247c36717ea4d7998e9304823b


Test Regressions
----------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite              | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1

  Details:     https://kernelci.org/test/plan/id/6435dc7e25588a277f2e85fd

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: multi_v7_defconfig+kselftest
  Compiler:    gcc-10 (arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm/…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm/…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-seccomp.login: https://kernelci.org/test/case/id/6435dc7e25588a277f2e85fe
        failing since 105 days (last pass: v6.1-rc1-24-gd5ba85d6d8be, first fail: v6.2-rc1)


platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/6435dc6f25588a277f2e85f9

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-seccomp.login: https://kernelci.org/test/case/id/6435dc6f25588a277f2e85fa
        failing since 175 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)


platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
mt8183-kukui-...uniper-sku16 | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/6435dc52e7ccc9b6702e85fa

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-seccomp.login: https://kernelci.org/test/case/id/6435dc52e7ccc9b6702e85fb
        failing since 175 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)

2 years, 2 months

kselftest/next kselftest-lkdtm: 7 runs, 3 regressions (v6.3-rc1-17-g266679ffd867)

by kernelci.org bot

kselftest/next kselftest-lkdtm: 7 runs, 3 regressions (v6.3-rc1-17-g266679ffd867)

Regressions Summary
-------------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite              | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1
mt8183-kukui-...uniper-sku16 | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:  https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.3-rc1-17-g266…

  Test:     kselftest-lkdtm
  Tree:     kselftest
  Branch:   next
  Describe: v6.3-rc1-17-g266679ffd867
  URL:      https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
  SHA:      266679ffd867cb247c36717ea4d7998e9304823b


Test Regressions
----------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite              | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1

  Details:     https://kernelci.org/test/plan/id/6435dca8da3c60967e2e85f1

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: multi_v7_defconfig+kselftest
  Compiler:    gcc-10 (arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm/…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm/…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-lkdtm.login: https://kernelci.org/test/case/id/6435dca8da3c60967e2e85f2
        failing since 132 days (last pass: v6.1-rc1-23-g00dd59519141, first fail: v6.1-rc1-23-g8008d88e6d16)


platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/6435ddbdba6077fa6a2e85fa

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-lkdtm.login: https://kernelci.org/test/case/id/6435ddbdba6077fa6a2e85fb
        failing since 175 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)


platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
mt8183-kukui-...uniper-sku16 | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/6435ddc2ba6077fa6a2e85fd

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-lkdtm.login: https://kernelci.org/test/case/id/6435ddc2ba6077fa6a2e85fe
        failing since 175 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)


kselftest/next kselftest-livepatch: 3 runs, 2 regressions (v6.3-rc1-17-g266679ffd867)

by kernelci.org bot

kselftest/next kselftest-livepatch: 3 runs, 2 regressions (v6.3-rc1-17-g266679ffd867) Regressions Summary ------------------- platform | arch | lab | compiler | defconfig | regressions ----------------+-------+---------------+----------+------------------------------+------------ imx6q-sabrelite | arm | lab-collabora | gcc-10 | multi_v7_defconfig+kselftest | 1 mt8173-elm-hana | arm64 | lab-collabora | gcc-10 | defconfig+kse...4-chromebook | 1 Details: https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.3-rc1-17-g266… Test: kselftest-livepatch Tree: kselftest Branch: next Describe: v6.3-rc1-17-g266679ffd867 URL: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git SHA: 266679ffd867cb247c36717ea4d7998e9304823b Test Regressions ---------------- platform | arch | lab | compiler | defconfig | regressions ----------------+-------+---------------+----------+------------------------------+------------ imx6q-sabrelite | arm | lab-collabora | gcc-10 | multi_v7_defconfig+kselftest | 1 Details: https://kernelci.org/test/plan/id/6435dc6c5c96d1d5aa2e85f1 Results: 0 PASS, 1 FAIL, 0 SKIP Full config: multi_v7_defconfig+kselftest Compiler: gcc-10 (arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110) Plain log: https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm/… HTML log: https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm/… Rootfs: http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040… * kselftest-livepatch.login: https://kernelci.org/test/case/id/6435dc6c5c96d1d5aa2e85f2 failing since 105 days (last pass: v6.1-rc1-24-gd5ba85d6d8be, first fail: v6.2-rc1) platform | arch | lab | compiler | defconfig | regressions ----------------+-------+---------------+----------+------------------------------+------------ mt8173-elm-hana | arm64 | lab-collabora | gcc-10 | defconfig+kse...4-chromebook | 1 Details: https://kernelci.org/test/plan/id/6435dc5c26e20546922e85f3 Results: 0 PASS, 1 
FAIL, 0 SKIP Full config: defconfig+kselftest+arm64-chromebook Compiler: gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110) Plain log: https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6… HTML log: https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6… Rootfs: http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040… * kselftest-livepatch.login: https://kernelci.org/test/case/id/6435dc5c26e20546922e85f4 failing since 175 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)


kselftest/next kselftest-lib: 5 runs, 3 regressions (v6.3-rc1-17-g266679ffd867)

by kernelci.org bot

kselftest/next kselftest-lib: 5 runs, 3 regressions (v6.3-rc1-17-g266679ffd867) Regressions Summary ------------------- platform | arch | lab | compiler | defconfig | regressions -----------------------------+-------+---------------+----------+------------------------------+------------ imx6q-sabrelite | arm | lab-collabora | gcc-10 | multi_v7_defconfig+kselftest | 1 meson-g12b-a311d-khadas-vim3 | arm64 | lab-collabora | gcc-10 | defconfig+kselftest | 1 mt8173-elm-hana | arm64 | lab-collabora | gcc-10 | defconfig+kse...4-chromebook | 1 Details: https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.3-rc1-17-g266… Test: kselftest-lib Tree: kselftest Branch: next Describe: v6.3-rc1-17-g266679ffd867 URL: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git SHA: 266679ffd867cb247c36717ea4d7998e9304823b Test Regressions ---------------- platform | arch | lab | compiler | defconfig | regressions -----------------------------+-------+---------------+----------+------------------------------+------------ imx6q-sabrelite | arm | lab-collabora | gcc-10 | multi_v7_defconfig+kselftest | 1 Details: https://kernelci.org/test/plan/id/6435dd5a674292a4ab2e85f8 Results: 0 PASS, 1 FAIL, 0 SKIP Full config: multi_v7_defconfig+kselftest Compiler: gcc-10 (arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110) Plain log: https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm/… HTML log: https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm/… Rootfs: http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040… * kselftest-lib.login: https://kernelci.org/test/case/id/6435dd5a674292a4ab2e85f9 failing since 105 days (last pass: v6.1-rc1-24-gd5ba85d6d8be, first fail: v6.2-rc1) platform | arch | lab | compiler | defconfig | regressions -----------------------------+-------+---------------+----------+------------------------------+------------ meson-g12b-a311d-khadas-vim3 | arm64 | lab-collabora | gcc-10 | 
defconfig+kselftest | 1 Details: https://kernelci.org/test/plan/id/6435dc6dd35dcacff12e85f5 Results: 0 PASS, 1 FAIL, 0 SKIP Full config: defconfig+kselftest Compiler: gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110) Plain log: https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6… HTML log: https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6… Rootfs: http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040… * kselftest-lib.login: https://kernelci.org/test/case/id/6435dc6dd35dcacff12e85f6 failing since 175 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1) platform | arch | lab | compiler | defconfig | regressions -----------------------------+-------+---------------+----------+------------------------------+------------ mt8173-elm-hana | arm64 | lab-collabora | gcc-10 | defconfig+kse...4-chromebook | 1 Details: https://kernelci.org/test/plan/id/6435dc47e7ccc9b6702e85e6 Results: 0 PASS, 1 FAIL, 0 SKIP Full config: defconfig+kselftest+arm64-chromebook Compiler: gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110) Plain log: https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6… HTML log: https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6… Rootfs: http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040… * kselftest-lib.login: https://kernelci.org/test/case/id/6435dc47e7ccc9b6702e85e7 failing since 175 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)


kselftest/next kselftest-cpufreq: 8 runs, 1 regressions (v6.3-rc1-17-g266679ffd867)

by kernelci.org bot

kselftest/next kselftest-cpufreq: 8 runs, 1 regressions (v6.3-rc1-17-g266679ffd867)

Regressions Summary
-------------------

platform        | arch  | lab           | compiler | defconfig                    | regressions
----------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:  https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.3-rc1-17-g266…

  Test:     kselftest-cpufreq
  Tree:     kselftest
  Branch:   next
  Describe: v6.3-rc1-17-g266679ffd867
  URL:      https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
  SHA:      266679ffd867cb247c36717ea4d7998e9304823b


Test Regressions
----------------

platform        | arch  | lab           | compiler | defconfig                    | regressions
----------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/6435dc4e5050e9421c2e85f1

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-17-g266679ffd867/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-cpufreq.login: https://kernelci.org/test/case/id/6435dc4e5050e9421c2e85f2
        failing since 175 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)


Re: [PATCH v5 1/3] mm: add new api to enable ksm per process

by Matthew Wilcox

On Thu, Apr 06, 2023 at 09:53:37AM -0700, Stefan Roesch wrote:
> +	case PR_SET_MEMORY_MERGE:
> +		if (mmap_write_lock_killable(me->mm))
> +			return -EINTR;
> +
> +		if (arg2) {
> +			int err = ksm_add_mm(me->mm);
> +			if (err)
> +				return err;

You'll return to userspace with the mutex held, no?
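
To illustrate the concern raised above, here is a minimal userspace sketch, not the kernel code. It assumes a toy lock modelled as an int flag; `struct demo_mm`, `demo_lock()`, `demo_unlock()`, `demo_add_mm()` and `demo_set_memory_merge()` are all hypothetical names made up for this example. The point is only the shape: once the lock is taken, the error path has to reach the unlock before returning.

```c
#include <errno.h>

/* Toy stand-in for an mm; lock_held models the mmap lock being taken. */
struct demo_mm {
	int lock_held;
};

/* Hypothetical lock helper: fails if the lock is already held. */
static int demo_lock(struct demo_mm *mm)
{
	if (mm->lock_held)
		return -EINTR;	/* toy model: treat contention as an interrupted wait */
	mm->lock_held = 1;
	return 0;
}

static void demo_unlock(struct demo_mm *mm)
{
	mm->lock_held = 0;
}

/* Hypothetical stand-in for ksm_add_mm(); fails on request. */
static int demo_add_mm(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/* Every path taken after the lock succeeds falls through to the unlock. */
static int demo_set_memory_merge(struct demo_mm *mm, int enable, int should_fail)
{
	int err = 0;

	if (demo_lock(mm))
		return -EINTR;

	if (enable)
		err = demo_add_mm(should_fail);

	demo_unlock(mm);
	return err;
}
```

With this shape the failing call still returns its error code, but the lock flag is clear again afterwards, which is the property the quoted hunk appears to be missing.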


Re: [PATCH v5 2/3] mm: add new KSM process and sysfs knobs

by David Hildenbrand

On 06.04.23 18:53, Stefan Roesch wrote:
> This adds the general_profit KSM sysfs knob and the process profit metric
> and process merge type knobs to ksm_stat.
>
> 1) expose general_profit metric
>
>    The documentation mentions a general profit metric, however this
>    metric is not calculated. In addition the formula depends on the size
>    of internal structures, which makes it more difficult for an
>    administrator to make the calculation. Adding the metric for a better
>    user experience.
>
> 2) document general_profit sysfs knob
>
> 3) calculate ksm process profit metric
>
>    The ksm documentation mentions the process profit metric and how to
>    calculate it. This adds the calculation of the metric.
>
> 4) add ksm_merge_type() function
>
>    This adds the ksm_merge_type function. The function returns the
>    merge type for the process. For madvise it returns "madvise", for
>    prctl it returns "process" and otherwise it returns "none".

I'm curious, why exactly is this change required in this context? It might be sufficient to observe if the prctl is set for a process. If not, the ksm stats can reveal whether KSM is still active for that process -> madvise.

For your use case, I'd assume it's pretty unnecessary to expose that.

If there is no compelling reason, I'd suggest to drop this and limit this patch to exposing the general/per-mm profit, which I can understand why it's desirable when fine-tuning a workload.

[...]

> Signed-off-by: Stefan Roesch <shr(a)devkernel.io>
> Reviewed-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
> Cc: David Hildenbrand <david(a)redhat.com>
> Cc: Johannes Weiner <hannes(a)cmpxchg.org>
> Cc: Michal Hocko <mhocko(a)suse.com>
> Cc: Rik van Riel <riel(a)surriel.com>
> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
> ---
>  Documentation/ABI/testing/sysfs-kernel-mm-ksm |  8 +++++
>  Documentation/admin-guide/mm/ksm.rst          |  8 ++++-
>  fs/proc/base.c                                |  5 +++
>  include/linux/ksm.h                           |  5 +++
>  mm/ksm.c                                      | 32 +++++++++++++++++++
>  5 files changed, 57 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-ksm b/Documentation/ABI/testing/sysfs-kernel-mm-ksm
> index d244674a9480..7768e90f7a8f 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-mm-ksm
> +++ b/Documentation/ABI/testing/sysfs-kernel-mm-ksm
> @@ -51,3 +51,11 @@ Description:	Control merging pages across different NUMA nodes.
>
>  		When it is set to 0 only pages from the same node are merged,
>  		otherwise pages from all nodes can be merged together (default).
> +
> +What:		/sys/kernel/mm/ksm/general_profit
> +Date:		January 2023

^ No

> +KernelVersion:	6.1

^ Outdated

(kind of weird having to come up with the right numbers before getting it merged)

[...]

>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 07463ad4a70a..c74450318e05 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -96,6 +96,7 @@
>  #include <linux/time_namespace.h>
>  #include <linux/resctrl.h>
>  #include <linux/cn_proc.h>
> +#include <linux/ksm.h>
>  #include <trace/events/oom.h>
>  #include "internal.h"
>  #include "fd.h"
> @@ -3199,6 +3200,7 @@ static int proc_pid_ksm_merging_pages(struct seq_file *m, struct pid_namespace *
>
>  	return 0;
>  }
> +

^ unrelated change

>  static int proc_pid_ksm_stat(struct seq_file *m, struct pid_namespace *ns,
>  				struct pid *pid, struct task_struct *task)
>  {
> @@ -3208,6 +3210,9 @@ static int proc_pid_ksm_stat(struct seq_file *m, struct pid_namespace *ns,
>  	if (mm) {
>  		seq_printf(m, "ksm_rmap_items %lu\n", mm->ksm_rmap_items);
>  		seq_printf(m, "zero_pages_sharing %lu\n", mm->ksm_zero_pages_sharing);
> +		seq_printf(m, "ksm_merging_pages %lu\n", mm->ksm_merging_pages);
> +		seq_printf(m, "ksm_merge_type %s\n", ksm_merge_type(mm));
> +		seq_printf(m, "ksm_process_profit %ld\n", ksm_process_profit(mm));
>  		mmput(mm);
>  	}
>
> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
> index c65455bf124c..4c32f9bca723 100644
> --- a/include/linux/ksm.h
> +++ b/include/linux/ksm.h
> @@ -60,6 +60,11 @@ struct page *ksm_might_need_to_copy(struct page *page,
>  void rmap_walk_ksm(struct folio *folio, struct rmap_walk_control *rwc);
>  void folio_migrate_ksm(struct folio *newfolio, struct folio *folio);
>
> +#ifdef CONFIG_PROC_FS
> +long ksm_process_profit(struct mm_struct *);
> +const char *ksm_merge_type(struct mm_struct *mm);
> +#endif /* CONFIG_PROC_FS */
> +
>  #else  /* !CONFIG_KSM */
>
>  static inline int ksm_add_mm(struct mm_struct *mm)
> diff --git a/mm/ksm.c b/mm/ksm.c
> index ab95ae0f9def..76b10ff840ac 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -3042,6 +3042,25 @@ static void wait_while_offlining(void)
>  }
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
>
> +#ifdef CONFIG_PROC_FS
> +long ksm_process_profit(struct mm_struct *mm)
> +{
> +	return (long)mm->ksm_merging_pages * PAGE_SIZE -

Do we really need the cast to long? mm->ksm_merging_pages is defined as "unsigned long". Just like "ksm_pages_sharing" below.

> +		mm->ksm_rmap_items * sizeof(struct ksm_rmap_item);
> +}
> +
> +/* Return merge type name as string. */
> +const char *ksm_merge_type(struct mm_struct *mm)
> +{
> +	if (test_bit(MMF_VM_MERGE_ANY, &mm->flags))
> +		return "process";
> +	else if (test_bit(MMF_VM_MERGEABLE, &mm->flags))
> +		return "madvise";
> +	else
> +		return "none";
> +}
> +#endif /* CONFIG_PROC_FS */
> +

Apart from these nits, LGTM (again, I don't see why the merge type should belong into this patch, and why there is a real need to expose it like that).

Acked-by: David Hildenbrand <david(a)redhat.com>

-- 
Thanks,

David / dhildenb
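
For readers following the profit calculation discussed in the review, a small userspace sketch of the same arithmetic may help. It is only an illustration: the page size (4096) and the per-item size (64 bytes) are assumed values, and `demo_process_profit()` is a hypothetical name, not a kernel function. The sketch converts each unsigned product to long before subtracting so the result can go negative when the rmap overhead exceeds the memory saved.

```c
#define DEMO_PAGE_SIZE      4096UL	/* assumed page size */
#define DEMO_RMAP_ITEM_SIZE 64UL	/* assumed sizeof(struct ksm_rmap_item) */

/*
 * Bytes saved by merged pages minus bytes spent on rmap items.
 * Both counters are unsigned long, as in the quoted patch.
 */
static long demo_process_profit(unsigned long merging_pages,
				unsigned long rmap_items)
{
	long saved = (long)(merging_pages * DEMO_PAGE_SIZE);
	long overhead = (long)(rmap_items * DEMO_RMAP_ITEM_SIZE);

	return saved - overhead;
}
```

With these assumed sizes, 10 merged pages and 100 rmap items give 40960 - 6400 = 34560 bytes of profit, while 0 merged pages and 100 rmap items give -6400.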


kselftest/next build: 8 builds: 0 failed, 8 passed, 9 warnings (v6.3-rc1-17-g266679ffd867)

by kernelci.org bot

kselftest/next build: 8 builds: 0 failed, 8 passed, 9 warnings (v6.3-rc1-17-g266679ffd867) Full Build Summary: https://kernelci.org/build/kselftest/branch/next/kernel/v6.3-rc1-17-g266679… Tree: kselftest Branch: next Git Describe: v6.3-rc1-17-g266679ffd867 Git Commit: 266679ffd867cb247c36717ea4d7998e9304823b Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git Built: 4 unique architectures Warnings Detected: arm64: defconfig+kselftest (clang-16): 3 warnings defconfig+kselftest+arm64-chromebook (clang-16): 3 warnings arm: i386: x86_64: x86_64_defconfig+kselftest (clang-16): 3 warnings Warnings summary: 2 drivers/gpu/host1x/dev.c:520:6: warning: variable 'syncpt_irq' is uninitialized when used here [-Wuninitialized] 2 drivers/gpu/host1x/dev.c:490:16: note: initialize the variable 'syncpt_irq' to silence this warning 2 1 warning generated. 1 vmlinux.o: warning: objtool: set_ftrace_ops_ro+0x3e: relocation to !ENDBR: .text+0x1463d6 1 vmlinux.o: warning: objtool: set_ftrace_ops_ro+0x28: relocation to !ENDBR: .text+0x14654b 1 vmlinux.o: warning: objtool: lkdtm_UNSET_SMEP+0xca: relocation to !ENDBR: native_write_cr4+0x4 ================================================================================ Detailed per-defconfig build reports: -------------------------------------------------------------------------------- defconfig+kselftest (arm64, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- defconfig+kselftest (arm64, clang-16) — PASS, 0 errors, 3 warnings, 0 section mismatches Warnings: drivers/gpu/host1x/dev.c:520:6: warning: variable 'syncpt_irq' is uninitialized when used here [-Wuninitialized] drivers/gpu/host1x/dev.c:490:16: note: initialize the variable 'syncpt_irq' to silence this warning 1 warning generated. 
-------------------------------------------------------------------------------- defconfig+kselftest+arm64-chromebook (arm64, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- defconfig+kselftest+arm64-chromebook (arm64, clang-16) — PASS, 0 errors, 3 warnings, 0 section mismatches Warnings: drivers/gpu/host1x/dev.c:520:6: warning: variable 'syncpt_irq' is uninitialized when used here [-Wuninitialized] drivers/gpu/host1x/dev.c:490:16: note: initialize the variable 'syncpt_irq' to silence this warning 1 warning generated. -------------------------------------------------------------------------------- i386_defconfig+kselftest (i386, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- multi_v7_defconfig+kselftest (arm, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches -------------------------------------------------------------------------------- x86_64_defconfig+kselftest (x86_64, clang-16) — PASS, 0 errors, 3 warnings, 0 section mismatches Warnings: vmlinux.o: warning: objtool: set_ftrace_ops_ro+0x28: relocation to !ENDBR: .text+0x14654b vmlinux.o: warning: objtool: set_ftrace_ops_ro+0x3e: relocation to !ENDBR: .text+0x1463d6 vmlinux.o: warning: objtool: lkdtm_UNSET_SMEP+0xca: relocation to !ENDBR: native_write_cr4+0x4 -------------------------------------------------------------------------------- x86_64_defconfig+kselftest (x86_64, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches --- For more info write to <info(a)kernelci.org>


[PATCH v2 0/3] tools/nolibc: Support vprintf() so we can use kselftest.h with nolibc

by Mark Brown

At present the kselftest header can't be used with nolibc since it makes use of vprintf() which is not available in nolibc. Fortunately nolibc has a vfprintf() so we can just wrap that in order to allow kselftests to be built with nolibc and still use the standard kselftest headers with a small change to prevent the inclusion of the standard libc headers. This allows us to avoid open coding of KTAP output for selftests that need to use nolibc in order to test interfaces that are controlled by libc; we've got several open coded examples of this in the tree already.

As an example of using this, the existing za-fork test is converted to use kselftest.h. The changes to kselftest and nolibc don't have any interaction until they are used by a test so could be merged separately if desired.

Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v2:
- Turns out nolibc has a vfprintf() already which we can use so do that.
- Link to v1: https://lore.kernel.org/r/20230405-kselftest-nolibc-v1-0-63fbcd70b202@kerne…

---
Mark Brown (3):
      tools/nolibc/stdio: Implement vprintf()
      kselftest: Support nolibc
      kselftest/arm64: Convert za-fork to use kselftest.h

 tools/include/nolibc/stdio.h               |  6 ++
 tools/testing/selftests/arm64/fp/Makefile  |  2 +-
 tools/testing/selftests/arm64/fp/za-fork.c | 88 ++++++------------------------
 tools/testing/selftests/kselftest.h        |  2 +
 4 files changed, 25 insertions(+), 73 deletions(-)
---
base-commit: e8d018dd0257f744ca50a729e3d042cf2ec9da65
change-id: 20230405-kselftest-nolibc-cb2ce0446d09

Best regards,
-- 
Mark Brown <broonie(a)kernel.org>
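
The wrapping idea described in the cover letter can be sketched in ordinary hosted C. This is an illustration only, built against a regular libc rather than nolibc, and `demo_vprintf()` and `demo_print_msg()` are hypothetical names chosen to avoid clashing with the real vprintf(); the actual patch is not reproduced here.

```c
#include <stdarg.h>
#include <stdio.h>

/* A vprintf() equivalent built on vfprintf(), which is already available. */
static int demo_vprintf(const char *fmt, va_list args)
{
	return vfprintf(stdout, fmt, args);
}

/*
 * Variadic front end, similar in shape to a test-output helper that
 * forwards its arguments to the va_list variant.
 */
static int demo_print_msg(const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = demo_vprintf(fmt, args);
	va_end(args);

	return ret;
}
```

A caller such as `demo_print_msg("%d-%s", 42, "ok")` writes `42-ok` to stdout and returns the number of characters written, exactly as a direct vfprintf() call on stdout would.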


[PATCH v3 0/2] Fix failure to access u32* argument of tracked function

by Feng zhou

From: Feng Zhou <zhoufeng.zf(a)bytedance.com>

When a tracing program accesses an argument of the traced function whose type is u32 *, the bpf verifier fails. Because u32 is a typedef, the modifier needs to be skipped before the type is checked. Add btf_type_is_modifier in is_int_ptr. Add a selftest to check it.

Feng Zhou (2):
  bpf/btf: Fix is_int_ptr()
  selftests/bpf: Add test to access u32 ptr argument in tracing program

Changelog:
v2->v3: Addressed comments from jirka
- Fix an issue that caused other test items to fail
Details in here:
https://lore.kernel.org/all/20230407084608.62296-1-zhoufeng.zf@bytedance.co…

v1->v2: Addressed comments from Martin KaFai Lau
- Add a selftest.
- use btf_type_skip_modifiers.
Some details in here:
https://lore.kernel.org/all/20221012125815.76120-1-zhouchengming@bytedance.…

 kernel/bpf/btf.c                                    |  8 ++------
 net/bpf/test_run.c                                  |  8 +++++++-
 .../testing/selftests/bpf/verifier/btf_ctx_access.c | 13 +++++++++++++
 3 files changed, 22 insertions(+), 7 deletions(-)

-- 
2.20.1
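
The problem described above can be pictured with a toy type graph. The sketch below is not BTF and does not use any kernel API: `struct demo_type`, the `DEMO_*` kinds and every `demo_*` function are hypothetical names invented for this example. It only shows why a "pointer to int" check has to walk through typedef-like modifiers first.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_kind { DEMO_INT, DEMO_PTR, DEMO_TYPEDEF, DEMO_CONST, DEMO_STRUCT };

struct demo_type {
	enum demo_kind kind;
	const struct demo_type *ref;	/* referenced type for PTR/TYPEDEF/CONST */
};

static bool demo_is_modifier(const struct demo_type *t)
{
	return t->kind == DEMO_TYPEDEF || t->kind == DEMO_CONST;
}

/* Follow typedef/const links until a non-modifier type is reached. */
static const struct demo_type *demo_skip_modifiers(const struct demo_type *t)
{
	while (t && demo_is_modifier(t))
		t = t->ref;
	return t;
}

/* True if t is a pointer whose pointee, after skipping modifiers, is an int. */
static bool demo_is_int_ptr(const struct demo_type *t)
{
	t = demo_skip_modifiers(t);
	if (!t || t->kind != DEMO_PTR)
		return false;

	t = demo_skip_modifiers(t->ref);
	return t && t->kind == DEMO_INT;
}

/* unsigned int; typedef unsigned int u32; u32 *; struct foo; struct foo * */
static const struct demo_type demo_uint = { DEMO_INT, NULL };
static const struct demo_type demo_u32 = { DEMO_TYPEDEF, &demo_uint };
static const struct demo_type demo_u32_ptr = { DEMO_PTR, &demo_u32 };
static const struct demo_type demo_struct = { DEMO_STRUCT, NULL };
static const struct demo_type demo_struct_ptr = { DEMO_PTR, &demo_struct };
```

Without the second `demo_skip_modifiers()` call, the check on `demo_u32_ptr` would stop at the typedef node and report that the pointee is not an int, which mirrors the failure the series describes.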


selftests: # Warning: file test_ingress_egress_chaining.sh is not executable

by Naresh Kamboju

The kselftest net test test_ingress_egress_chaining.sh seems to be missing its file execute permission. Do you see this on your end?

# selftests: net: test_ingress_egress_chaining.sh
# Warning: file test_ingress_egress_chaining.sh is not executable

https://lkft.validation.linaro.org/scheduler/job/6339612#L8152
https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.3-rc6-16-…
https://qa-reports.linaro.org/lkft/linux-mainline-master/build/v6.3-rc6-16-…

--
Linaro LKFT
https://lkft.linaro.org


[PATCH v5 00/17] Add iommufd physical device operations for replace and alloc hwpt

by Jason Gunthorpe

This is the basic functionality for iommufd to support iommufd_device_replace() and IOMMU_HWPT_ALLOC for physical devices.

iommufd_device_replace() allows changing the HWPT associated with the device to a new IOAS or HWPT. Replace does this in a way that failure leaves things unchanged, and utilizes the iommu iommu_group_replace_domain() API to allow the iommu driver to perform an optional non-disruptive change.

IOMMU_HWPT_ALLOC allows HWPTs to be explicitly allocated by the user and used by attach or replace. At this point it isn't very useful since the HWPT is the same as the automatically managed HWPT from the IOAS. However a following series will allow userspace to customize the created HWPT.

The implementation is complicated because we have to introduce some per-iommu_group memory in iommufd and redo how we think about multi-device groups to be more explicit. This solves all the locking problems in the prior attempts.

This series is infrastructure work for the following series which:
 - Add replace for attach
 - Expose replace through VFIO APIs
 - Implement driver parameters for HWPT creation (nesting)

I'll update the linux-next branch with this version and keep it on a side branch and accumulate the following series when they are ready so we can have a stable base and make more incremental progress. When we have all the parts together to get a full implementation it can go to Linus.

This is on github: https://github.com/jgunthorpe/linux/commits/iommufd_hwpt

v5:
 - Got back to the v3 version of the code, keep the comment changes from v4. Syzkaller says the group lock change in v4 didn't work.
 - Adjust the fail_nth test to cover the path syzkaller found. We need to have an ioas with a mapped page installed to inject a failure during domain attachment.
v4: https://lore.kernel.org/r/0-v4-9cd79ad52ee8+13f5-iommufd_alloc_jgg@nvidia.c…
 - Refine comments and commit messages
 - Move the group lock into iommufd_hw_pagetable_attach()
 - Fix error unwind in iommufd_device_do_replace()
v3: https://lore.kernel.org/r/0-v3-61d41fd9e13e+1f5-iommufd_alloc_jgg@nvidia.com
 - Refine comments and commit messages
 - Adjust the flow in iommufd_device_auto_get_domain() so pt_id is only set on success
 - Reject replace on non-attached devices
 - Add missing __reserved check for IOMMU_HWPT_ALLOC
v2: https://lore.kernel.org/r/0-v2-51b9896e7862+8a8c-iommufd_alloc_jgg@nvidia.c…
 - Use WARN_ON for the igroup->group test and move that logic to a function iommufd_group_try_get()
 - Change igroup->devices to igroup->device list
   Replace will need to iterate over all attached idevs
 - Rename to iommufd_group_setup_msi()
 - New patch to export iommu_get_resv_regions()
 - New patch to use per-device reserved regions instead of per-group regions
 - Split out the reorganizing of iommufd_device_change_pt() from the replace patch
 - Replace uses the per-dev reserved regions
 - Use stdev_id in a few more places in the selftest
 - Fix error handling in IOMMU_HWPT_ALLOC
 - Clarify comments
 - Rebase on v6.3-rc1
v1: https://lore.kernel.org/all/0-v1-7612f88c19f5+2f21-iommufd_alloc_jgg@nvidia…

Jason Gunthorpe (15):
  iommufd: Move isolated msi enforcement to iommufd_device_bind()
  iommufd: Add iommufd_group
  iommufd: Replace the hwpt->devices list with iommufd_group
  iommu: Export iommu_get_resv_regions()
  iommufd: Keep track of each device's reserved regions instead of groups
  iommufd: Use the iommufd_group to avoid duplicate MSI setup
  iommufd: Make sw_msi_start a group global
  iommufd: Move putting a hwpt to a helper function
  iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc()
  iommufd: Reorganize iommufd_device_attach into iommufd_device_change_pt
  iommufd: Add iommufd_device_replace()
  iommufd: Make destroy_rwsem use a lock class per object type
  iommufd: Add IOMMU_HWPT_ALLOC
  iommufd/selftest: Return the real idev id from selftest mock_domain
  iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC

Nicolin Chen (2):
  iommu: Introduce a new iommu_group_replace_domain() API
  iommufd/selftest: Test iommufd_device_replace()

 drivers/iommu/iommu-priv.h                    |  10 +
 drivers/iommu/iommu.c                         |  41 +-
 drivers/iommu/iommufd/device.c                | 516 +++++++++++++-----
 drivers/iommu/iommufd/hw_pagetable.c          |  96 +++-
 drivers/iommu/iommufd/io_pagetable.c          |  27 +-
 drivers/iommu/iommufd/iommufd_private.h       |  51 +-
 drivers/iommu/iommufd/iommufd_test.h          |   6 +
 drivers/iommu/iommufd/main.c                  |  17 +-
 drivers/iommu/iommufd/selftest.c              |  40 ++
 include/linux/iommufd.h                       |   1 +
 include/uapi/linux/iommufd.h                  |  26 +
 tools/testing/selftests/iommu/iommufd.c       |  67 ++-
 .../selftests/iommu/iommufd_fail_nth.c        |  67 ++-
 tools/testing/selftests/iommu/iommufd_utils.h |  63 ++-
 14 files changed, 825 insertions(+), 203 deletions(-)
 create mode 100644 drivers/iommu/iommu-priv.h

base-commit: fd8c1a4aee973e87d890a5861e106625a33b2c4e
-- 
2.40.0


[PATCH 1/3] kunit: add parameter generation macro using description from array

by benjamin＠sipsolutions.net

From: Benjamin Berg <benjamin.berg(a)intel.com>

The existing KUNIT_ARRAY_PARAM macro requires a separate function to get the description. However, in a lot of cases the description can just be copied directly from the array. Add a second macro that avoids having to write a static function just for a single strscpy.

Signed-off-by: Benjamin Berg <benjamin.berg(a)intel.com>
---
 include/kunit/test.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/include/kunit/test.h b/include/kunit/test.h
index 08d3559dd703..519b90261c72 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -1414,6 +1414,25 @@ do { \
 		return NULL; \
 	}

+/**
+ * KUNIT_ARRAY_PARAM_DESC() - Define test parameter generator from an array.
+ * @name:  prefix for the test parameter generator function.
+ * @array: array of test parameters.
+ * @desc_member: structure member from array element to use as description
+ *
+ * Define function @name_gen_params which uses @array to generate parameters.
+ */
+#define KUNIT_ARRAY_PARAM_DESC(name, array, desc_member) \
+	static const void *name##_gen_params(const void *prev, char *desc) \
+	{ \
+		typeof((array)[0]) *__next = prev ? ((typeof(__next)) prev) + 1 : (array); \
+		if (__next - (array) < ARRAY_SIZE((array))) { \
+			strscpy(desc, __next->desc_member, KUNIT_PARAM_DESC_SIZE); \
+			return __next; \
+		} \
+		return NULL; \
+	}
+
 // TODO(dlatypov(a)google.com): consider eventually migrating users to explicitly
 // include resource.h themselves if they need it.
 #include <kunit/resource.h>
-- 
2.39.2
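
To make the generator pattern in this patch easier to follow, here is a userspace sketch of what one expansion of such a macro might look like. It is an approximation under stated assumptions: `DEMO_DESC_SIZE` stands in for KUNIT_PARAM_DESC_SIZE with an assumed value of 128, snprintf() replaces strscpy() (which is kernel-only), and `struct demo_case`, `demo_cases`, `demo_gen_params()` and `demo_count_params()` are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_DESC_SIZE 128	/* assumed stand-in for KUNIT_PARAM_DESC_SIZE */
#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct demo_case {
	int input;
	int expected;
	const char *name;	/* member used as the description */
};

static const struct demo_case demo_cases[] = {
	{ 1, 2, "one" },
	{ 2, 4, "two" },
	{ 3, 6, "three" },
};

/*
 * Return the element after prev (or the first one when prev is NULL) and
 * copy its name into desc; return NULL once the array is exhausted.
 */
static const void *demo_gen_params(const void *prev, char *desc)
{
	const struct demo_case *next =
		prev ? (const struct demo_case *)prev + 1 : demo_cases;

	if ((size_t)(next - demo_cases) < DEMO_ARRAY_SIZE(demo_cases)) {
		snprintf(desc, DEMO_DESC_SIZE, "%s", next->name);
		return next;
	}
	return NULL;
}

/* Walk the generator to the end and return how many parameters it produced. */
static int demo_count_params(char *last_desc)
{
	const void *p = NULL;
	int n = 0;

	while ((p = demo_gen_params(p, last_desc)) != NULL)
		n++;
	return n;
}
```

The caller passes the previous return value back in on each call, so the generator needs no state of its own; the description buffer simply keeps the name of the last parameter that was handed out.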


[PATCH] mm/huge_memory: conditionally call maybe_mkwrite() and drop pte_wrprotect() in __split_huge_pmd_locked()

by David Hildenbrand

No need to call maybe_mkwrite() to then wrprotect if the source PMD was not writable.

It's worth noting that this now allows for PTEs to be writable even if the source PMD was not writable: if vma->vm_page_prot includes write permissions.

As documented in commit 931298e103c2 ("mm/userfaultfd: rely on vma->vm_page_prot in uffd_wp_range()"), any mechanism that intends to have pages wrprotected (COW, writenotify, mprotect, uffd-wp, softdirty, ...) has to properly adjust vma->vm_page_prot upfront, to not include write permissions. If vma->vm_page_prot includes write permissions, the PTE/PMD can be writable by default.

This now mimics the handling in mm/migrate.c:remove_migration_pte() and in mm/huge_memory.c:remove_migration_pmd(), which has been in place for a long time (except that 96a9c287e25d ("mm/migrate: fix wrongly apply write bit after mkdirty on sparc64") temporarily changed it).

Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
 mm/huge_memory.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 6f3af65435c8..8332e16ac97b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2235,11 +2235,10 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 				entry = pte_swp_mkuffd_wp(entry);
 		} else {
 			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
-			entry = maybe_mkwrite(entry, vma);
+			if (write)
+				entry = maybe_mkwrite(entry, vma);
 			if (anon_exclusive)
 				SetPageAnonExclusive(page + i);
-			if (!write)
-				entry = pte_wrprotect(entry);
 			if (!young)
 				entry = pte_mkold(entry);
 			/* NOTE: this may set soft-dirty too on some archs */
-- 
2.39.2
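
The behavioural difference this patch describes can be modelled with plain bit flags. The sketch below is a toy model, not kernel code: `DEMO_PTE_WRITE`, `DEMO_VM_WRITE`, `struct demo_vma` and the `demo_*` functions are hypothetical, and a PTE is reduced to a single unsigned int carrying one "writable" bit.

```c
#include <stdbool.h>

#define DEMO_PTE_WRITE 0x1u	/* toy "writable" bit in a PTE */
#define DEMO_VM_WRITE  0x2u	/* toy stand-in for VM_WRITE */

struct demo_vma {
	unsigned int vm_flags;		/* DEMO_VM_WRITE if the mapping allows writes */
	unsigned int vm_page_prot;	/* default PTE bits for this VMA */
};

/* Models the idea of maybe_mkwrite(): set the write bit only for writable VMAs. */
static unsigned int demo_maybe_mkwrite(unsigned int pte,
				       const struct demo_vma *vma)
{
	if (vma->vm_flags & DEMO_VM_WRITE)
		pte |= DEMO_PTE_WRITE;
	return pte;
}

/* Old flow: always maybe_mkwrite(), then write-protect if the PMD was read-only. */
static unsigned int demo_split_old(const struct demo_vma *vma, bool write)
{
	unsigned int pte = vma->vm_page_prot;

	pte = demo_maybe_mkwrite(pte, vma);
	if (!write)
		pte &= ~DEMO_PTE_WRITE;
	return pte;
}

/* New flow: maybe_mkwrite() only if the PMD was writable; no wrprotect step. */
static unsigned int demo_split_new(const struct demo_vma *vma, bool write)
{
	unsigned int pte = vma->vm_page_prot;

	if (write)
		pte = demo_maybe_mkwrite(pte, vma);
	return pte;
}
```

In this model the two flows agree whenever vm_page_prot carries no write bit. They differ only when vm_page_prot already includes write permission and the source PMD was read-only: the old flow clears the bit, the new flow keeps it, which is the case the commit message calls out.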


[PATCH net-next v3 4/4] selftests: net: add SCM_PIDFD / SO_PEERPIDFD test

by Alexander Mikhalitsyn

Basic test to check consistency between: - SCM_CREDENTIALS and SCM_PIDFD - SO_PEERCRED and SO_PEERPIDFD Cc: "David S. Miller" <davem(a)davemloft.net> Cc: Eric Dumazet <edumazet(a)google.com> Cc: Jakub Kicinski <kuba(a)kernel.org> Cc: Paolo Abeni <pabeni(a)redhat.com> Cc: Leon Romanovsky <leon(a)kernel.org> Cc: David Ahern <dsahern(a)kernel.org> Cc: Arnd Bergmann <arnd(a)arndb.de> Cc: Kees Cook <keescook(a)chromium.org> Cc: Christian Brauner <brauner(a)kernel.org> Cc: Kuniyuki Iwashima <kuniyu(a)amazon.com> Cc: linux-kernel(a)vger.kernel.org Cc: netdev(a)vger.kernel.org Cc: linux-arch(a)vger.kernel.org Cc: linux-kselftest(a)vger.kernel.org Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn(a)canonical.com> --- v3: - started using kselftest lib (thanks to Kuniyuki Iwashima for suggestion/review) - now test covers abstract sockets too and SOCK_DGRAM sockets --- tools/testing/selftests/net/.gitignore | 1 + tools/testing/selftests/net/af_unix/Makefile | 2 +- .../testing/selftests/net/af_unix/scm_pidfd.c | 430 ++++++++++++++++++ 3 files changed, 432 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/net/af_unix/scm_pidfd.c diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore index 80f06aa62034..83fd1ebd34ec 100644 --- a/tools/testing/selftests/net/.gitignore +++ b/tools/testing/selftests/net/.gitignore @@ -26,6 +26,7 @@ reuseport_bpf_cpu reuseport_bpf_numa reuseport_dualstack rxtimestamp +scm_pidfd sk_bind_sendto_listen sk_connect_zero_addr socket diff --git a/tools/testing/selftests/net/af_unix/Makefile b/tools/testing/selftests/net/af_unix/Makefile index 1e4b397cece6..f5ca9da8c4d5 100644 --- a/tools/testing/selftests/net/af_unix/Makefile +++ b/tools/testing/selftests/net/af_unix/Makefile @@ -1,3 +1,3 @@ -TEST_GEN_PROGS := diag_uid test_unix_oob unix_connect +TEST_GEN_PROGS := diag_uid test_unix_oob unix_connect scm_pidfd include ../../lib.mk diff --git 
a/tools/testing/selftests/net/af_unix/scm_pidfd.c b/tools/testing/selftests/net/af_unix/scm_pidfd.c new file mode 100644 index 000000000000..a86222143d79 --- /dev/null +++ b/tools/testing/selftests/net/af_unix/scm_pidfd.c @@ -0,0 +1,430 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +#define _GNU_SOURCE +#include <error.h> +#include <limits.h> +#include <stddef.h> +#include <stdio.h> +#include <stdlib.h> +#include <sys/socket.h> +#include <linux/socket.h> +#include <unistd.h> +#include <string.h> +#include <errno.h> +#include <sys/un.h> +#include <sys/signal.h> +#include <sys/types.h> +#include <sys/wait.h> + +#include "../../kselftest_harness.h" + +#define clean_errno() (errno == 0 ? "None" : strerror(errno)) +#define log_err(MSG, ...) \ + fprintf(stderr, "(%s:%d: errno: %s) " MSG "\n", __FILE__, __LINE__, \ + clean_errno(), ##__VA_ARGS__) + +#ifndef SCM_PIDFD +#define SCM_PIDFD 0x04 +#endif + +static void child_die() +{ + exit(1); +} + +static int safe_int(const char *numstr, int *converted) +{ + char *err = NULL; + long sli; + + errno = 0; + sli = strtol(numstr, &err, 0); + if (errno == ERANGE && (sli == LONG_MAX || sli == LONG_MIN)) + return -ERANGE; + + if (errno != 0 && sli == 0) + return -EINVAL; + + if (err == numstr || *err != '\0') + return -EINVAL; + + if (sli > INT_MAX || sli < INT_MIN) + return -ERANGE; + + *converted = (int)sli; + return 0; +} + +static int char_left_gc(const char *buffer, size_t len) +{ + size_t i; + + for (i = 0; i < len; i++) { + if (buffer[i] == ' ' || buffer[i] == '\t') + continue; + + return i; + } + + return 0; +} + +static int char_right_gc(const char *buffer, size_t len) +{ + int i; + + for (i = len - 1; i >= 0; i--) { + if (buffer[i] == ' ' || buffer[i] == '\t' || + buffer[i] == '\n' || buffer[i] == '\0') + continue; + + return i + 1; + } + + return 0; +} + +static char *trim_whitespace_in_place(char *buffer) +{ + buffer += char_left_gc(buffer, strlen(buffer)); + buffer[char_right_gc(buffer, strlen(buffer))] = '\0'; + return 
buffer; +} + +/* borrowed (with all helpers) from pidfd/pidfd_open_test.c */ +static pid_t get_pid_from_fdinfo_file(int pidfd, const char *key, size_t keylen) +{ + int ret; + char path[512]; + FILE *f; + size_t n = 0; + pid_t result = -1; + char *line = NULL; + + snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd); + + f = fopen(path, "re"); + if (!f) + return -1; + + while (getline(&line, &n, f) != -1) { + char *numstr; + + if (strncmp(line, key, keylen)) + continue; + + numstr = trim_whitespace_in_place(line + 4); + ret = safe_int(numstr, &result); + if (ret < 0) + goto out; + + break; + } + +out: + free(line); + fclose(f); + return result; +} + +static int cmsg_check(int fd) +{ + struct msghdr msg = { 0 }; + struct cmsghdr *cmsg; + struct iovec iov; + struct ucred *ucred = NULL; + int data = 0; + char control[CMSG_SPACE(sizeof(struct ucred)) + + CMSG_SPACE(sizeof(int))] = { 0 }; + int *pidfd = NULL; + pid_t parent_pid; + int err; + + iov.iov_base = &data; + iov.iov_len = sizeof(data); + + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = control; + msg.msg_controllen = sizeof(control); + + err = recvmsg(fd, &msg, 0); + if (err < 0) { + log_err("recvmsg"); + return 1; + } + + if (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC)) { + log_err("recvmsg: truncated"); + return 1; + } + + for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL; + cmsg = CMSG_NXTHDR(&msg, cmsg)) { + if (cmsg->cmsg_level == SOL_SOCKET && + cmsg->cmsg_type == SCM_PIDFD) { + if (cmsg->cmsg_len < sizeof(*pidfd)) { + log_err("CMSG parse: SCM_PIDFD wrong len"); + return 1; + } + + pidfd = (void *)CMSG_DATA(cmsg); + } + + if (cmsg->cmsg_level == SOL_SOCKET && + cmsg->cmsg_type == SCM_CREDENTIALS) { + if (cmsg->cmsg_len < sizeof(*ucred)) { + log_err("CMSG parse: SCM_CREDENTIALS wrong len"); + return 1; + } + + ucred = (void *)CMSG_DATA(cmsg); + } + } + + /* send(pfd, "x", sizeof(char), 0) */ + if (data != 'x') { + log_err("recvmsg: data corruption"); + return 1; + } + + if (!pidfd) { + 
log_err("CMSG parse: SCM_PIDFD not found"); + return 1; + } + + if (!ucred) { + log_err("CMSG parse: SCM_CREDENTIALS not found"); + return 1; + } + + /* pidfd from SCM_PIDFD should point to the parent process PID */ + parent_pid = + get_pid_from_fdinfo_file(*pidfd, "Pid:", sizeof("Pid:") - 1); + if (parent_pid != getppid()) { + log_err("wrong SCM_PIDFD %d != %d", parent_pid, getppid()); + return 1; + } + + return 0; +} + +struct sock_addr { + char sock_name[32]; + struct sockaddr_un listen_addr; + socklen_t addrlen; +}; + +FIXTURE(scm_pidfd) +{ + int server; + pid_t client_pid; + int startup_pipe[2]; + struct sock_addr server_addr; + struct sock_addr *client_addr; +}; + +FIXTURE_VARIANT(scm_pidfd) +{ + int type; + bool abstract; +}; + +FIXTURE_VARIANT_ADD(scm_pidfd, stream_pathname) +{ + .type = SOCK_STREAM, + .abstract = 0, +}; + +FIXTURE_VARIANT_ADD(scm_pidfd, stream_abstract) +{ + .type = SOCK_STREAM, + .abstract = 1, +}; + +FIXTURE_VARIANT_ADD(scm_pidfd, dgram_pathname) +{ + .type = SOCK_DGRAM, + .abstract = 0, +}; + +FIXTURE_VARIANT_ADD(scm_pidfd, dgram_abstract) +{ + .type = SOCK_DGRAM, + .abstract = 1, +}; + +FIXTURE_SETUP(scm_pidfd) +{ + self->client_addr = mmap(NULL, sizeof(*self->client_addr), PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_ANONYMOUS, -1, 0); + ASSERT_NE(MAP_FAILED, self->client_addr); +} + +FIXTURE_TEARDOWN(scm_pidfd) +{ + close(self->server); + + kill(self->client_pid, SIGKILL); + waitpid(self->client_pid, NULL, 0); + + if (!variant->abstract) { + unlink(self->server_addr.sock_name); + unlink(self->client_addr->sock_name); + } +} + +static void fill_sockaddr(struct sock_addr *addr, bool abstract) +{ + char *sun_path_buf = (char *)&addr->listen_addr.sun_path; + + addr->listen_addr.sun_family = AF_UNIX; + addr->addrlen = offsetof(struct sockaddr_un, sun_path); + snprintf(addr->sock_name, sizeof(addr->sock_name), "scm_pidfd_%d", getpid()); + addr->addrlen += strlen(addr->sock_name); + if (abstract) { + *sun_path_buf = '\0'; + addr->addrlen++; + 
sun_path_buf++; + } else { + unlink(addr->sock_name); + } + memcpy(sun_path_buf, addr->sock_name, strlen(addr->sock_name)); +} + +static void client(FIXTURE_DATA(scm_pidfd) *self, + const FIXTURE_VARIANT(scm_pidfd) *variant) +{ + int err; + int cfd; + socklen_t len; + struct ucred peer_cred; + int peer_pidfd; + pid_t peer_pid; + int on = 0; + + cfd = socket(AF_UNIX, variant->type, 0); + if (cfd < 0) { + log_err("socket"); + child_die(); + } + + if (variant->type == SOCK_DGRAM) { + fill_sockaddr(self->client_addr, variant->abstract); + + if (bind(cfd, (struct sockaddr *)&self->client_addr->listen_addr, self->client_addr->addrlen)) { + log_err("bind"); + child_die(); + } + } + + if (connect(cfd, (struct sockaddr *)&self->server_addr.listen_addr, + self->server_addr.addrlen) != 0) { + log_err("connect"); + child_die(); + } + + on = 1; + if (setsockopt(cfd, SOL_SOCKET, SO_PASSCRED, &on, sizeof(on))) { + log_err("Failed to set SO_PASSCRED"); + child_die(); + } + + if (setsockopt(cfd, SOL_SOCKET, SO_PASSPIDFD, &on, sizeof(on))) { + log_err("Failed to set SO_PASSPIDFD"); + child_die(); + } + + close(self->startup_pipe[1]); + + if (cmsg_check(cfd)) { + log_err("cmsg_check failed"); + child_die(); + } + + /* skip further for SOCK_DGRAM as it's not applicable */ + if (variant->type == SOCK_DGRAM) + return; + + len = sizeof(peer_cred); + if (getsockopt(cfd, SOL_SOCKET, SO_PEERCRED, &peer_cred, &len)) { + log_err("Failed to get SO_PEERCRED"); + child_die(); + } + + len = sizeof(peer_pidfd); + if (getsockopt(cfd, SOL_SOCKET, SO_PEERPIDFD, &peer_pidfd, &len)) { + log_err("Failed to get SO_PEERPIDFD"); + child_die(); + } + + /* pid from SO_PEERCRED should point to the parent process PID */ + if (peer_cred.pid != getppid()) { + log_err("peer_cred.pid != getppid(): %d != %d", peer_cred.pid, getppid()); + child_die(); + } + + peer_pid = get_pid_from_fdinfo_file(peer_pidfd, + "Pid:", sizeof("Pid:") - 1); + if (peer_pid != peer_cred.pid) { + log_err("peer_pid != peer_cred.pid: %d != 
%d", peer_pid, peer_cred.pid); + child_die(); + } +} + +TEST_F(scm_pidfd, test) +{ + int err; + int pfd; + int child_status = 0; + + self->server = socket(AF_UNIX, variant->type, 0); + ASSERT_NE(-1, self->server); + + fill_sockaddr(&self->server_addr, variant->abstract); + + err = bind(self->server, (struct sockaddr *)&self->server_addr.listen_addr, self->server_addr.addrlen); + ASSERT_EQ(0, err); + + if (variant->type == SOCK_STREAM) { + err = listen(self->server, 1); + ASSERT_EQ(0, err); + } + + err = pipe(self->startup_pipe); + ASSERT_NE(-1, err); + + self->client_pid = fork(); + ASSERT_NE(-1, self->client_pid); + if (self->client_pid == 0) { + close(self->server); + close(self->startup_pipe[0]); + client(self, variant); + exit(0); + } + close(self->startup_pipe[1]); + + if (variant->type == SOCK_STREAM) { + pfd = accept(self->server, NULL, NULL); + ASSERT_NE(-1, pfd); + } else { + pfd = self->server; + } + + /* wait until the child arrives at checkpoint */ + read(self->startup_pipe[0], &err, sizeof(int)); + close(self->startup_pipe[0]); + + if (variant->type == SOCK_DGRAM) { + err = sendto(pfd, "x", sizeof(char), 0, (struct sockaddr *)&self->client_addr->listen_addr, self->client_addr->addrlen); + ASSERT_NE(-1, err); + } else { + err = send(pfd, "x", sizeof(char), 0); + ASSERT_NE(-1, err); + } + + close(pfd); + waitpid(self->client_pid, &child_status, 0); + ASSERT_EQ(0, WIFEXITED(child_status) ? WEXITSTATUS(child_status) : 1); +} + +TEST_HARNESS_MAIN -- 2.34.1
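The fdinfo lookup this test relies on (get_pid_from_fdinfo_file() and its helpers) can be tried in isolation. The sketch below is not the patch's code: demo_parse_fdinfo_int() is a hypothetical name, and it assumes the fdinfo line has the form "Pid:<whitespace><decimal>". It offsets by the key length instead of the fixed "line + 4" used in the test.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: parse one fdinfo-style
 * line such as "Pid:\t1234\n". Returns 0 and stores the value in *out
 * on success, or -1 if the key does not match or the value is not a
 * clean decimal integer that fits in an int.
 */
int demo_parse_fdinfo_int(const char *line, const char *key, int *out)
{
	size_t keylen;
	char *end = NULL;
	long val;

	assert(line != NULL && key != NULL && out != NULL);

	keylen = strlen(key);
	if (strncmp(line, key, keylen) != 0)
		return -1;

	errno = 0;
	/* strtol() skips the leading blanks/tabs that follow the key. */
	val = strtol(line + keylen, &end, 10);
	if (errno != 0 || end == line + keylen)
		return -1;

	/* Accept only trailing whitespace after the number. */
	while (*end == ' ' || *end == '\t' || *end == '\n')
		end++;
	if (*end != '\0' || val > INT_MAX || val < INT_MIN)
		return -1;

	*out = (int)val;
	return 0;
}
```

A caller would read /proc/self/fdinfo/<fd> line by line and pass each line with the key "Pid:"; lines with other keys (for example "NSpid:") are rejected by the prefix check.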

2 years, 2 months


[PATCH v6 0/4] Add IO page table replacement support

by Nicolin Chen

[ This series depends on the VFIO device cdev series ]

Changelog
v6:
 * Rebased on top of cdev v8 series
   https://lore.kernel.org/kvm/20230327094047.47215-1-yi.l.liu@intel.com/
 * Added "Reviewed-by" from Kevin to PATCH-4
 * Squashed access->ioas updating lines into iommufd_access_change_pt(),
   and changed function return type accordingly for simplification.
v5: https://lore.kernel.org/linux-iommu/cover.1679559476.git.nicolinc@nvidia.co…
 * Kept the cmd->id in the iommufd_test_create_access() so the access can
   be created with an ioas by default. Then, renamed the previous ioctl
   IOMMU_TEST_OP_ACCESS_SET_IOAS to IOMMU_TEST_OP_ACCESS_REPLACE_IOAS, so
   it would be used to replace an access->ioas pointer.
 * Added iommufd_access_replace() API after the introductions of the
   other two APIs iommufd_access_attach() and iommufd_access_detach().
 * Since vdev->iommufd_attached is also set in the emulated pathway, call
   iommufd_access_update(), similar to the physical pathway.
v4: https://lore.kernel.org/linux-iommu/cover.1678284812.git.nicolinc@nvidia.co…
 * Rebased on top of Jason's series adding replace() and hwpt_alloc()
   https://lore.kernel.org/linux-iommu/0-v2-51b9896e7862+8a8c-iommufd_alloc_jg…
 * Rebased on top of cdev series v6
   https://lore.kernel.org/kvm/20230308132903.465159-1-yi.l.liu@intel.com/
 * Dropped the patch that's moved to cdev series.
 * Added unmap function pointer sanity before calling it.
 * Added "Reviewed-by" from Kevin and Yi.
 * Added back the VFIO change updating the ATTACH uAPI.
v3: https://lore.kernel.org/linux-iommu/cover.1677288789.git.nicolinc@nvidia.co…
 * Rebased on top of Jason's iommufd_hwpt branch:
   https://lore.kernel.org/linux-iommu/0-v2-406f7ac07936+6a-iommufd_hwpt_jgg@n…
 * Dropped patches from this series accordingly. There were a couple of
   VFIO patches that will be submitted after the VFIO cdev series. Also,
   renamed the series to be "emulated".
 * Moved dma_unmap sanity patch to the first in the series.
 * Moved dma_unmap sanity to cover both VFIO and IOMMUFD pathways.
 * Added Kevin's "Reviewed-by" to two of the patches.
 * Fixed a NULL pointer bug in vfio_iommufd_emulated_bind().
 * Moved unmap() call to the common place in iommufd_access_set_ioas().
v2: https://lore.kernel.org/linux-iommu/cover.1675802050.git.nicolinc@nvidia.co…
 * Rebased on top of vfio_device cdev v2 series.
 * Update the kdoc and commit message of iommu_group_replace_domain().
 * Dropped revert-to-core-domain part in iommu_group_replace_domain().
 * Dropped !ops->dma_unmap check in vfio_iommufd_emulated_attach_ioas().
 * Added missing rc value in vfio_iommufd_emulated_attach_ioas() from the
   iommufd_access_set_ioas() call.
 * Added a new patch in vfio_main to deny vfio_pin/unpin_pages() calls if
   vdev->ops->dma_unmap is not implemented.
 * Added a __iommmufd_device_detach helper and let the replace routine do
   a partial detach().
 * Added restriction on auto_domains to use the replace feature.
 * Added the patch "iommufd/device: Make hwpt_list list_add/del symmetric"
   from the has_group removal series.
v1: https://lore.kernel.org/linux-iommu/cover.1675320212.git.nicolinc@nvidia.co…

Hi all,

The existing IOMMU APIs provide a pair of functions: iommu_attach_group()
for callers to attach a device from the default_domain (NULL if not
supported) to a given iommu domain, and iommu_detach_group() for callers
to detach a device from a given domain to the default_domain. Internally,
the detach_dev op is deprecated for the newer drivers with default_domain.
This means that those drivers likely can switch an attaching domain to
another one, without staging the device at a blocking or default domain,
for use cases such as:
1) vPASID mode, when a guest wants to replace a single pasid (PASID=0)
   table with a larger table (PASID=N)
2) Nesting mode, when switching the attaching device from an S2 domain
   to an S1 domain, or when switching between relevant S1 domains.

This series is rebased on top of Jason Gunthorpe's series that introduces
iommu_group_replace_domain API and IOMMUFD infrastructure for the IOMMUFD
"physical" devices. The IOMMUFD "emulated" devices will need some extra
steps to replace the access->ioas object and its iopt pointer.

You can also find this series on Github:
https://github.com/nicolinc/iommufd/commits/iommu_group_replace_domain-v6

Thank you
Nicolin Chen

Nicolin Chen (4):
  vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
  iommufd: Add iommufd_access_replace() API
  iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
  vfio: Support IO page table replacement

 drivers/iommu/iommufd/device.c                | 53 ++++++++++++++-----
 drivers/iommu/iommufd/iommufd_test.h          |  4 ++
 drivers/iommu/iommufd/selftest.c              | 19 +++++++
 drivers/vfio/iommufd.c                        | 11 ++--
 drivers/vfio/vfio_main.c                      |  4 ++
 include/linux/iommufd.h                       |  1 +
 include/uapi/linux/vfio.h                     |  6 +++
 tools/testing/selftests/iommu/iommfd*.c       |  0
 tools/testing/selftests/iommu/iommufd.c       | 29 +++++++++-
 tools/testing/selftests/iommu/iommufd_utils.h | 19 +++++++
 10 files changed, 127 insertions(+), 19 deletions(-)
 create mode 100644 tools/testing/selftests/iommu/iommfd*.c

-- 
2.40.0
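The difference between detach-then-attach and a direct replace, as described in the cover letter, can be pictured with a toy model. This is only a sketch: the toy_* types and functions are hypothetical stand-ins, not the kernel's iommu_attach_group(), iommu_detach_group() or iommu_group_replace_domain(), and the rule that attach fails unless the group sits at its default domain is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's types or functions. */
struct toy_domain {
	const char *name;
};

struct toy_group {
	struct toy_domain *default_domain; /* may be NULL */
	struct toy_domain *domain;         /* domain currently in use */
	int default_visits;                /* times the group fell back to default_domain */
};

/* In this model, attach only succeeds while the group sits at its default domain. */
int toy_attach(struct toy_group *g, struct toy_domain *d)
{
	assert(g != NULL && d != NULL);
	if (g->domain != g->default_domain)
		return -1; /* attached elsewhere: caller must detach first */
	g->domain = d;
	return 0;
}

/* Detach always returns the group to its default domain. */
void toy_detach(struct toy_group *g)
{
	assert(g != NULL);
	g->domain = g->default_domain;
	g->default_visits++;
}

/* Replace swaps the domain directly, with no stop at the default domain. */
int toy_replace(struct toy_group *g, struct toy_domain *d)
{
	assert(g != NULL && d != NULL);
	g->domain = d;
	return 0;
}
```

In this model, moving a group from an S2 domain to an S1 domain via toy_detach() plus toy_attach() increments default_visits once, while toy_replace() leaves it at zero, which is the staging step the series aims to avoid.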

2 years, 2 months


kselftest/next kselftest-seccomp: 7 runs, 4 regressions (v6.3-rc1-14-g5874a6a187f2)

by kernelci.org bot

kselftest/next kselftest-seccomp: 7 runs, 4 regressions (v6.3-rc1-14-g5874a6a187f2)

Regressions Summary
-------------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite              | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1
meson-gxl-s905x-libretech-cc | arm64 | lab-broonie   | gcc-10   | defconfig+kselftest          | 1
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1
mt8183-kukui-...uniper-sku16 | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:  https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.3-rc1-14-g587…

  Test:     kselftest-seccomp
  Tree:     kselftest
  Branch:   next
  Describe: v6.3-rc1-14-g5874a6a187f2
  URL:      https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
  SHA:      5874a6a187f2e814542d7fdf918fd29f79ff25c3

Test Regressions
----------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite              | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1

  Details:     https://kernelci.org/test/plan/id/64347bb1c164166bc52e8608

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: multi_v7_defconfig+kselftest
  Compiler:    gcc-10 (arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm/…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm/…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-seccomp.login: https://kernelci.org/test/case/id/64347bb1c164166bc52e8609
        failing since 104 days (last pass: v6.1-rc1-24-gd5ba85d6d8be, first fail: v6.2-rc1)

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
meson-gxl-s905x-libretech-cc | arm64 | lab-broonie   | gcc-10   | defconfig+kselftest          | 1

  Details:     https://kernelci.org/test/plan/id/64348b1d05a44e02062e8611

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-seccomp.login: https://kernelci.org/test/case/id/64348b1d05a44e02062e8612
        failing since 172 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1-1-gde3ee3f63400a)

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/643480b506d514599c2e8606

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-seccomp.login: https://kernelci.org/test/case/id/643480b506d514599c2e8607
        failing since 174 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
mt8183-kukui-...uniper-sku16 | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/6434878c95927633232e8691

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-seccomp.login: https://kernelci.org/test/case/id/6434878c95927633232e8692
        failing since 174 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)

2 years, 2 months


kselftest/next kselftest-lkdtm: 4 runs, 3 regressions (v6.3-rc1-14-g5874a6a187f2)

by kernelci.org bot

kselftest/next kselftest-lkdtm: 4 runs, 3 regressions (v6.3-rc1-14-g5874a6a187f2)

Regressions Summary
-------------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite              | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1
mt8183-kukui-...uniper-sku16 | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:  https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.3-rc1-14-g587…

  Test:     kselftest-lkdtm
  Tree:     kselftest
  Branch:   next
  Describe: v6.3-rc1-14-g5874a6a187f2
  URL:      https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
  SHA:      5874a6a187f2e814542d7fdf918fd29f79ff25c3

Test Regressions
----------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite              | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1

  Details:     https://kernelci.org/test/plan/id/64347be74c89b6021d2e85f0

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: multi_v7_defconfig+kselftest
  Compiler:    gcc-10 (arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm/…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm/…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-lkdtm.login: https://kernelci.org/test/case/id/64347be74c89b6021d2e85f1
        failing since 131 days (last pass: v6.1-rc1-23-g00dd59519141, first fail: v6.1-rc1-23-g8008d88e6d16)

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/6434812f32b8898c522e85f1

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-lkdtm.login: https://kernelci.org/test/case/id/6434812f32b8898c522e85f2
        failing since 174 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
mt8183-kukui-...uniper-sku16 | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/6434863a382be783322e860d

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-lkdtm.login: https://kernelci.org/test/case/id/6434863a382be783322e860e
        failing since 174 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)

2 years, 2 months


kselftest/next kselftest-livepatch: 2 runs, 2 regressions (v6.3-rc1-14-g5874a6a187f2)

by kernelci.org bot

kselftest/next kselftest-livepatch: 2 runs, 2 regressions (v6.3-rc1-14-g5874a6a187f2)

Regressions Summary
-------------------

platform        | arch  | lab           | compiler | defconfig                    | regressions
----------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1
mt8173-elm-hana | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:  https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.3-rc1-14-g587…

  Test:     kselftest-livepatch
  Tree:     kselftest
  Branch:   next
  Describe: v6.3-rc1-14-g5874a6a187f2
  URL:      https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
  SHA:      5874a6a187f2e814542d7fdf918fd29f79ff25c3

Test Regressions
----------------

platform        | arch  | lab           | compiler | defconfig                    | regressions
----------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1

  Details:     https://kernelci.org/test/plan/id/64347be64c89b6021d2e85ed

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: multi_v7_defconfig+kselftest
  Compiler:    gcc-10 (arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm/…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm/…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-livepatch.login: https://kernelci.org/test/case/id/64347be64c89b6021d2e85ee
        failing since 104 days (last pass: v6.1-rc1-24-gd5ba85d6d8be, first fail: v6.2-rc1)

platform        | arch  | lab           | compiler | defconfig                    | regressions
----------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/64348231291f5811772e85ed

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-livepatch.login: https://kernelci.org/test/case/id/64348231291f5811772e85ee
        failing since 174 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)

2 years, 2 months


kselftest/next kselftest-lib: 8 runs, 4 regressions (v6.3-rc1-14-g5874a6a187f2)

by kernelci.org bot

kselftest/next kselftest-lib: 8 runs, 4 regressions (v6.3-rc1-14-g5874a6a187f2)

Regressions Summary
-------------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite              | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1
meson-g12b-a311d-khadas-vim3 | arm64 | lab-collabora | gcc-10   | defconfig+kselftest          | 1
meson-gxl-s905x-libretech-cc | arm64 | lab-broonie   | gcc-10   | defconfig+kselftest          | 1
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:  https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.3-rc1-14-g587…

  Test:     kselftest-lib
  Tree:     kselftest
  Branch:   next
  Describe: v6.3-rc1-14-g5874a6a187f2
  URL:      https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
  SHA:      5874a6a187f2e814542d7fdf918fd29f79ff25c3

Test Regressions
----------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
imx6q-sabrelite              | arm   | lab-collabora | gcc-10   | multi_v7_defconfig+kselftest | 1

  Details:     https://kernelci.org/test/plan/id/64347baff9764bbe142e85f1

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: multi_v7_defconfig+kselftest
  Compiler:    gcc-10 (arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm/…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm/…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-lib.login: https://kernelci.org/test/case/id/64347baff9764bbe142e85f2
        failing since 104 days (last pass: v6.1-rc1-24-gd5ba85d6d8be, first fail: v6.2-rc1)

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
meson-g12b-a311d-khadas-vim3 | arm64 | lab-collabora | gcc-10   | defconfig+kselftest          | 1

  Details:     https://kernelci.org/test/plan/id/643484e97a3f9d9ae22e860a

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-lib.login: https://kernelci.org/test/case/id/643484e97a3f9d9ae22e860b
        failing since 174 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
meson-gxl-s905x-libretech-cc | arm64 | lab-broonie   | gcc-10   | defconfig+kselftest          | 1

  Details:     https://kernelci.org/test/plan/id/64348ae2cfcb1b8f342e85ea

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-lib.login: https://kernelci.org/test/case/id/64348ae2cfcb1b8f342e85eb
        failing since 172 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1-1-gde3ee3f63400a)

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/6434819b365b6ccede2e85e9

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-lib.login: https://kernelci.org/test/case/id/6434819b365b6ccede2e85ea
        failing since 174 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)

2 years, 2 months


kselftest/next kselftest-cpufreq: 7 runs, 2 regressions (v6.3-rc1-14-g5874a6a187f2)

by kernelci.org bot

kselftest/next kselftest-cpufreq: 7 runs, 2 regressions (v6.3-rc1-14-g5874a6a187f2)

Regressions Summary
-------------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
meson-gxl-s905x-libretech-cc | arm64 | lab-broonie   | gcc-10   | defconfig+kselftest          | 1
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:  https://kernelci.org/test/job/kselftest/branch/next/kernel/v6.3-rc1-14-g587…

  Test:     kselftest-cpufreq
  Tree:     kselftest
  Branch:   next
  Describe: v6.3-rc1-14-g5874a6a187f2
  URL:      https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
  SHA:      5874a6a187f2e814542d7fdf918fd29f79ff25c3

Test Regressions
----------------

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
meson-gxl-s905x-libretech-cc | arm64 | lab-broonie   | gcc-10   | defconfig+kselftest          | 1

  Details:     https://kernelci.org/test/plan/id/64348b1f24aaf7600b2e85e6

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-cpufreq.login: https://kernelci.org/test/case/id/64348b1f24aaf7600b2e85e7
        failing since 172 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1-1-gde3ee3f63400a)

platform                     | arch  | lab           | compiler | defconfig                    | regressions
-----------------------------+-------+---------------+----------+------------------------------+------------
mt8173-elm-hana              | arm64 | lab-collabora | gcc-10   | defconfig+kse...4-chromebook | 1

  Details:     https://kernelci.org/test/plan/id/64348210e3ffd7435f2e87a3

  Results:     0 PASS, 1 FAIL, 0 SKIP
  Full config: defconfig+kselftest+arm64-chromebook
  Compiler:    gcc-10 (aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110)
  Plain log:   https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  HTML log:    https://storage.kernelci.org//kselftest/next/v6.3-rc1-14-g5874a6a187f2/arm6…
  Rootfs:      http://storage.kernelci.org/images/rootfs/debian/bullseye-kselftest/2023040…

  * kselftest-cpufreq.login: https://kernelci.org/test/case/id/64348210e3ffd7435f2e87a4
        failing since 174 days (last pass: linux-kselftest-next-6.0-rc2-11-g144eeb2fc761, first fail: v6.1-rc1)

2 years, 2 months


kselftest/next build: 6 builds: 0 failed, 6 passed, 3 warnings (v6.3-rc1-14-g5874a6a187f2)

by kernelci.org bot

kselftest/next build: 6 builds: 0 failed, 6 passed, 3 warnings (v6.3-rc1-14-g5874a6a187f2)

Full Build Summary: https://kernelci.org/build/kselftest/branch/next/kernel/v6.3-rc1-14-g5874a6…

Tree: kselftest
Branch: next
Git Describe: v6.3-rc1-14-g5874a6a187f2
Git Commit: 5874a6a187f2e814542d7fdf918fd29f79ff25c3
Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
Built: 4 unique architectures

Warnings Detected:

arm64:
    defconfig+kselftest (clang-16): 3 warnings

arm:

i386:

x86_64:


Warnings summary:

    1    drivers/gpu/host1x/dev.c:520:6: warning: variable 'syncpt_irq' is uninitialized when used here [-Wuninitialized]
    1    drivers/gpu/host1x/dev.c:490:16: note: initialize the variable 'syncpt_irq' to silence this warning
    1    1 warning generated.

================================================================================

Detailed per-defconfig build reports:

--------------------------------------------------------------------------------
defconfig+kselftest (arm64, clang-16) — PASS, 0 errors, 3 warnings, 0 section mismatches

Warnings:
    drivers/gpu/host1x/dev.c:520:6: warning: variable 'syncpt_irq' is uninitialized when used here [-Wuninitialized]
    drivers/gpu/host1x/dev.c:490:16: note: initialize the variable 'syncpt_irq' to silence this warning
    1 warning generated.

--------------------------------------------------------------------------------
defconfig+kselftest (arm64, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches

--------------------------------------------------------------------------------
defconfig+kselftest+arm64-chromebook (arm64, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches

--------------------------------------------------------------------------------
i386_defconfig+kselftest (i386, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches

--------------------------------------------------------------------------------
multi_v7_defconfig+kselftest (arm, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches

--------------------------------------------------------------------------------
x86_64_defconfig+kselftest (x86_64, gcc-10) — PASS, 0 errors, 0 warnings, 0 section mismatches

---
For more info write to <info(a)kernelci.org>

2 years, 2 months


[PATCH bpf-next] selftests/bpf: trace_helpers.c: Fix segfault

by Rong Tao

From: Rong Tao <rongtao(a)cestc.cn>

When the number of symbols is greater than MAX_SYMS (300000), accesses to
the array struct ksym syms[MAX_SYMS] go out of bounds, which results in a
segfault.

Resolve this issue by checking the index against the maximum and exiting
the loop, and by increasing the default size appropriately.
(6.2.9 = 329839, see below)

$ cat /proc/kallsyms | wc -l
329839

GDB debugging:
$ cd linux/samples/bpf
$ sudo gdb ./sampleip
...
(gdb) r
...
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7e2debf in malloc () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install
elfutils-libelf-0.189-1.fc37.x86_64 glibc-2.36-9.fc37.x86_64
libzstd-1.5.4-1.fc37.x86_64 zlib-1.2.12-5.fc37.x86_64
(gdb) bt
#0 0x00007ffff7e2debf in malloc () from /lib64/libc.so.6
#1 0x00007ffff7e33f8e in strdup () from /lib64/libc.so.6
#2 0x0000000000403fb0 in load_kallsyms_refresh() from trace_helpers.c
#3 0x00000000004038b2 in main ()

Signed-off-by: Rong Tao <rongtao(a)cestc.cn>
---
 tools/testing/selftests/bpf/trace_helpers.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
index 09a16a77bae4..a9d589c560d2 100644
--- a/tools/testing/selftests/bpf/trace_helpers.c
+++ b/tools/testing/selftests/bpf/trace_helpers.c
@@ -14,7 +14,7 @@
 
 #define DEBUGFS "/sys/kernel/debug/tracing/"
 
-#define MAX_SYMS 300000
+#define MAX_SYMS 400000
 static struct ksym syms[MAX_SYMS];
 static int sym_cnt;
 
@@ -44,7 +44,8 @@ int load_kallsyms_refresh(void)
 			continue;
 		syms[i].addr = (long) addr;
 		syms[i].name = strdup(func);
-		i++;
+		if (++i >= MAX_SYMS)
+			break;
 	}
 	fclose(f);
 	sym_cnt = i;
-- 
2.39.2

2 years, 3 months


[PATCH bpf-next v5 4/4] selftests: xsk: Add tests for 8K and 9K frame sizes

by Kal Conley

Add tests:
- RUN_TO_COMPLETION_8K_FRAME_SIZE: frame_size=8192 (aligned)
- UNALIGNED_9K_FRAME_SIZE: frame_size=9000 (unaligned)

Signed-off-by: Kal Conley <kal.conley(a)dectris.com>
---
 tools/testing/selftests/bpf/xskxceiver.c | 25 ++++++++++++++++++++++++
 tools/testing/selftests/bpf/xskxceiver.h |  2 ++
 2 files changed, 27 insertions(+)

diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
index 7eccf57a0ccc..86797de7fc50 100644
--- a/tools/testing/selftests/bpf/xskxceiver.c
+++ b/tools/testing/selftests/bpf/xskxceiver.c
@@ -1841,6 +1841,17 @@ static void run_pkt_test(struct test_spec *test, enum test_mode mode, enum test_
 		pkt_stream_replace(test, DEFAULT_PKT_CNT, PKT_SIZE);
 		testapp_validate_traffic(test);
 		break;
+	case TEST_TYPE_RUN_TO_COMPLETION_8K_FRAME:
+		if (!hugepages_present(test->ifobj_tx)) {
+			ksft_test_result_skip("No 2M huge pages present.\n");
+			return;
+		}
+		test_spec_set_name(test, "RUN_TO_COMPLETION_8K_FRAME_SIZE");
+		test->ifobj_tx->umem->frame_size = 8192;
+		test->ifobj_rx->umem->frame_size = 8192;
+		pkt_stream_replace(test, DEFAULT_PKT_CNT, PKT_SIZE);
+		testapp_validate_traffic(test);
+		break;
 	case TEST_TYPE_RX_POLL:
 		test->ifobj_rx->use_poll = true;
 		test_spec_set_name(test, "POLL_RX");
@@ -1904,6 +1915,20 @@ static void run_pkt_test(struct test_spec *test, enum test_mode mode, enum test_
 		if (!testapp_unaligned(test))
 			return;
 		break;
+	case TEST_TYPE_UNALIGNED_9K_FRAME:
+		if (!hugepages_present(test->ifobj_tx)) {
+			ksft_test_result_skip("No 2M huge pages present.\n");
+			return;
+		}
+		test_spec_set_name(test, "UNALIGNED_9K_FRAME_SIZE");
+		test->ifobj_tx->umem->frame_size = 9000;
+		test->ifobj_rx->umem->frame_size = 9000;
+		test->ifobj_tx->umem->unaligned_mode = true;
+		test->ifobj_rx->umem->unaligned_mode = true;
+		pkt_stream_replace(test, DEFAULT_PKT_CNT, PKT_SIZE);
+		test->ifobj_rx->pkt_stream->use_addr_for_fill = true;
+		testapp_validate_traffic(test);
+		break;
 	case TEST_TYPE_HEADROOM:
 		testapp_headroom(test);
 		break;
diff --git a/tools/testing/selftests/bpf/xskxceiver.h b/tools/testing/selftests/bpf/xskxceiver.h
index 919327807a4e..7f52f737f5e9 100644
--- a/tools/testing/selftests/bpf/xskxceiver.h
+++ b/tools/testing/selftests/bpf/xskxceiver.h
@@ -69,12 +69,14 @@ enum test_mode {
 enum test_type {
 	TEST_TYPE_RUN_TO_COMPLETION,
 	TEST_TYPE_RUN_TO_COMPLETION_2K_FRAME,
+	TEST_TYPE_RUN_TO_COMPLETION_8K_FRAME,
 	TEST_TYPE_RUN_TO_COMPLETION_SINGLE_PKT,
 	TEST_TYPE_RX_POLL,
 	TEST_TYPE_TX_POLL,
 	TEST_TYPE_POLL_RXQ_TMOUT,
 	TEST_TYPE_POLL_TXQ_TMOUT,
 	TEST_TYPE_UNALIGNED,
+	TEST_TYPE_UNALIGNED_9K_FRAME,
 	TEST_TYPE_ALIGNED_INV_DESC,
 	TEST_TYPE_ALIGNED_INV_DESC_2K_FRAME,
 	TEST_TYPE_UNALIGNED_INV_DESC,
-- 
2.39.2

2 years, 3 months


[PATCH bpf-next v5 3/4] selftests: xsk: Use hugepages when umem->frame_size > PAGE_SIZE

by Kal Conley

HugeTLB UMEMs now support chunk_size > PAGE_SIZE. Set MAP_HUGETLB when
frame_size > PAGE_SIZE for future tests.

Signed-off-by: Kal Conley <kal.conley(a)dectris.com>
---
 tools/testing/selftests/bpf/xskxceiver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
index 5a9691e942de..7eccf57a0ccc 100644
--- a/tools/testing/selftests/bpf/xskxceiver.c
+++ b/tools/testing/selftests/bpf/xskxceiver.c
@@ -1289,7 +1289,7 @@ static void thread_common_ops(struct test_spec *test, struct ifobject *ifobject)
 	void *bufs;
 	int ret;
 
-	if (ifobject->umem->unaligned_mode)
+	if (ifobject->umem->frame_size > sysconf(_SC_PAGESIZE) || ifobject->umem->unaligned_mode)
 		mmap_flags |= MAP_HUGETLB;
 
 	if (ifobject->shared_umem)
-- 
2.39.2
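The flag selection in this one-line change can be shown as a standalone helper. This is only a sketch under stated assumptions: `demo_umem_mmap_flags` is a hypothetical name, and the base flags `MAP_PRIVATE | MAP_ANONYMOUS` are assumed here for illustration rather than taken from xskxceiver.c.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* makes MAP_HUGETLB and MAP_ANONYMOUS visible on glibc */
#endif
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Return mmap() flags for a UMEM buffer: request huge pages when the
 * frame does not fit in one base page, or when unaligned mode is used.
 */
static int demo_umem_mmap_flags(size_t frame_size, bool unaligned_mode)
{
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;	/* assumed base flags */
	long page_size = sysconf(_SC_PAGESIZE);

	if (unaligned_mode ||
	    (page_size > 0 && frame_size > (size_t)page_size))
		flags |= MAP_HUGETLB;

	return flags;
}
```

Querying the page size at run time instead of hard-coding 4096 keeps the check correct on configurations with larger base pages.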

2 years, 3 months


[PATCH][next] KVM: selftests: Fix spelling mistake "KVM_HYPERCAL_EXIT_SMC" -> "KVM_HYPERCALL_EXIT_SMC"

by Colin Ian King

There is a spelling mistake in a test assert message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king(a)gmail.com>
---
 tools/testing/selftests/kvm/aarch64/smccc_filter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/aarch64/smccc_filter.c b/tools/testing/selftests/kvm/aarch64/smccc_filter.c
index 0f9db0641847..82650313451a 100644
--- a/tools/testing/selftests/kvm/aarch64/smccc_filter.c
+++ b/tools/testing/selftests/kvm/aarch64/smccc_filter.c
@@ -211,7 +211,7 @@ static void expect_call_fwd_to_user(struct kvm_vcpu *vcpu, uint32_t func_id,
 			    "KVM_HYPERCALL_EXIT_SMC is not set");
 	else
 		TEST_ASSERT(!(run->hypercall.flags & KVM_HYPERCALL_EXIT_SMC),
-			    "KVM_HYPERCAL_EXIT_SMC is set");
+			    "KVM_HYPERCALL_EXIT_SMC is set");
 }
 
 /* SMCCC calls forwarded to userspace cause KVM_EXIT_HYPERCALL exits */
-- 
2.30.2

2 years, 3 months


Re: Re: [PATCH v2 0/2] Fix failure to access u32* argument of tracked function

by Feng Zhou

On 2023/4/7 18:49, Feng Zhou wrote:
> On 2023/4/7 17:33, Jiri Olsa wrote:
>> On Fri, Apr 07, 2023 at 04:46:06PM +0800, Feng zhou wrote:
>>> From: Feng Zhou<zhoufeng.zf(a)bytedance.com>
>>>
>>> When access traced function arguments with type is u32*, bpf
>>> verifier failed.
>>> Because u32 have typedef, needs to skip modifier. Add
>>> btf_type_is_modifier in
>>> is_int_ptr. Add a selftest to check it.
>>>
>>> Feng Zhou (2):
>>>    bpf/btf: Fix is_int_ptr()
>>>    selftests/bpf: Add test to access u32 ptr argument in tracing
>>> program
>> hi,
>> it breaks several tests in test_progs suite:
>>
>> #11/36 bpf_iter/link-iter:FAIL
>> #11 bpf_iter:FAIL
>> test_dummy_st_ops_attach:FAIL:dummy_st_ops_load unexpected error: -13
>> #63/1 dummy_st_ops/dummy_st_ops_attach:FAIL
>> test_dummy_init_ret_value:FAIL:dummy_st_ops_load unexpected error: -13
>> #63/2 dummy_st_ops/dummy_init_ret_value:FAIL
>> test_dummy_init_ptr_arg:FAIL:dummy_st_ops_load unexpected error: -13
>> #63/3 dummy_st_ops/dummy_init_ptr_arg:FAIL
>> test_dummy_multiple_args:FAIL:dummy_st_ops_load unexpected error: -13
>> #63/4 dummy_st_ops/dummy_multiple_args:FAIL
>> test_dummy_sleepable:FAIL:dummy_st_ops_load unexpected error: -13
>> #63/5 dummy_st_ops/dummy_sleepable:FAIL
>> #63 dummy_st_ops:FAIL
>> test_fentry_fexit:FAIL:fentry_skel_load unexpected error: -13
>> #69 fentry_fexit:FAIL
>> test_fentry_test:FAIL:fentry_skel_load unexpected error: -13
>> #70 fentry_test:FAIL
>>
>> jirka
>>
>
> I tried it, and it did cause the test to fail. Bpfverify reported an
> error,
> 'R1 invalid mem access'scalar', let me confirm the reason.

I used btf_type_skip_modifiers, but did not delete the previous
"t = btf_type_by_id(btf, t->type);", which caused some test cases to fail.
I will send a v3 next week. Thank you for your suggestion.

>>> Changelog:
>>> v1->v2: Addressed comments from Martin KaFai Lau
>>> - Add a selftest.
>>> - use btf_type_skip_modifiers.
>>> Some details in here:
>>> https://lore.kernel.org/all/20221012125815.76120-1-zhouchengming@bytedance.…
>>>
>>>
>>> kernel/bpf/btf.c | 5 ++---
>>> net/bpf/test_run.c | 8 +++++++-
>>> .../testing/selftests/bpf/verifier/btf_ctx_access.c | 13
>>> +++++++++++++
>>> 3 files changed, 22 insertions(+), 4 deletions(-)
>>>
>>> --
>>> 2.20.1
>>>
>

2 years, 3 months


[PATCH v2 0/2] Fix failure to access u32* argument of tracked function

by Feng zhou

From: Feng Zhou <zhoufeng.zf(a)bytedance.com>

When accessing a traced function argument of type u32*, the bpf verifier
fails. Because u32 is a typedef, the modifier needs to be skipped. Add
btf_type_is_modifier in is_int_ptr. Add a selftest to check it.

Feng Zhou (2):
  bpf/btf: Fix is_int_ptr()
  selftests/bpf: Add test to access u32 ptr argument in tracing program

Changelog:
v1->v2: Addressed comments from Martin KaFai Lau
- Add a selftest.
- use btf_type_skip_modifiers.
Some details in here:
https://lore.kernel.org/all/20221012125815.76120-1-zhouchengming@bytedance.…

 kernel/bpf/btf.c                                    |  5 ++---
 net/bpf/test_run.c                                  |  8 +++++++-
 .../testing/selftests/bpf/verifier/btf_ctx_access.c | 13 +++++++++++++
 3 files changed, 22 insertions(+), 4 deletions(-)

-- 
2.20.1
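To make the "skip the modifier" step concrete, here is a toy model of a type chain. It is not the kernel's BTF code: the `toy_` types and functions are hypothetical stand-ins, and only typedef and const are modelled as modifiers. It shows why a check that looks only one level behind the pointer rejects `u32 *`, while a check that first walks past modifiers accepts it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type kinds (not the real BTF_KIND_* values). */
enum toy_kind { TOY_INT, TOY_PTR, TOY_TYPEDEF, TOY_CONST, TOY_STRUCT };

struct toy_type {
	enum toy_kind kind;
	const struct toy_type *ref;	/* target of PTR/TYPEDEF/CONST, else NULL */
};

static bool toy_is_modifier(const struct toy_type *t)
{
	return t->kind == TOY_TYPEDEF || t->kind == TOY_CONST;
}

/* Follow the chain until the first type that is not a modifier. */
static const struct toy_type *toy_skip_modifiers(const struct toy_type *t)
{
	while (t && toy_is_modifier(t))
		t = t->ref;
	return t;
}

/* True if t is a pointer whose target, after modifiers, is an integer. */
static bool toy_is_int_ptr(const struct toy_type *t)
{
	t = toy_skip_modifiers(t);
	if (!t || t->kind != TOY_PTR)
		return false;
	t = toy_skip_modifiers(t->ref);
	return t && t->kind == TOY_INT;
}
```

In this model a `u32 *` argument is the chain PTR -> TYPEDEF -> INT, so the integer is only reached once the typedef has been skipped.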

2 years, 3 months


Re: [PATCH v5 1/3] mm: add new api to enable ksm per process

by Andrew Morton

On Thu, 6 Apr 2023 09:53:37 -0700 Stefan Roesch <shr(a)devkernel.io> wrote:

> So far KSM can only be enabled by calling madvise for memory regions. To
> be able to use KSM for more workloads, KSM needs to have the ability to be
> enabled / disabled at the process / cgroup level.
>
> ...
>
> @@ -53,6 +62,18 @@ void folio_migrate_ksm(struct folio *newfolio, struct folio *folio);
>
> #else /* !CONFIG_KSM */
>
> +static inline int ksm_add_mm(struct mm_struct *mm)
> +{
> +}

The compiler doesn't like the lack of a return value. I queued up a
patch to simply delete the above function - seems that ksm_add_mm() has
no callers if CONFIG_KSM=n.

The same might be true of the ksm_add_vma()...ksm_exit() stubs also.
Perhaps some kind soul could take a look at whether we can simply clean
those out.
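For context, the complaint is about a function declared to return int whose body falls off the end without a return statement (GCC and Clang report this under -Wreturn-type). The fix queued above is to delete the stub. If such a stub did have callers, the other conventional shape would be to return a neutral value. The sketch below illustrates only that alternative, with hypothetical `demo_`/`DEMO_` names standing in for the kernel symbols; it is not the patch that was queued.

```c
struct demo_mm;	/* opaque stand-in for the kernel's struct mm_struct */

#ifdef DEMO_CONFIG_KSM
/* Feature enabled: the real implementation would live elsewhere. */
int demo_ksm_add_mm(struct demo_mm *mm);
#else
/* Feature disabled: nothing to do, so report success to any caller. */
static inline int demo_ksm_add_mm(struct demo_mm *mm)
{
	(void)mm;	/* unused in the stub */
	return 0;
}
#endif
```

Deleting the stub, as done in the thread, is simpler when no caller exists in the feature-disabled configuration.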

2 years, 3 months


[PATCH bpf-next v2 0/3] Add FOU support for externally controlled ipip devices

by Christian Ehrig

This patch set adds support for using FOU or GUE encapsulation with
an ipip device operating in collect-metadata mode and a set of kfuncs
for controlling encap parameters exposed to a BPF tc-hook.

BPF tc-hooks allow us to read tunnel metadata (like remote IP addresses)
in the ingress path of an externally controlled tunnel interface via
the bpf_skb_get_tunnel_{key,opt} bpf-helpers. Packets can then be
redirected to the same or a different externally controlled tunnel
interface by overwriting metadata via the bpf_skb_set_tunnel_{key,opt}
helpers and a call to bpf_redirect. This enables us to redirect packets
between tunnel interfaces - and potentially change the encapsulation
type - using only a single BPF program.

Today this approach works fine for a couple of tunnel combinations.
For example: redirecting packets between Geneve and GRE interfaces or
GRE and plain ipip interfaces. However, redirecting using FOU or GUE is
not supported today. The ip_tunnel module does not allow us to egress
packets using additional UDP encapsulation from an ipip device in
collect-metadata mode.

Patch 1 lifts this restriction by adding a struct ip_tunnel_encap to
the tunnel metadata. It can be filled by a new BPF kfunc introduced
in Patch 2 and evaluated by the ip_tunnel egress path. This will allow
us to use FOU and GUE encap with externally controlled ipip devices.

Patch 2 introduces two new BPF kfuncs: bpf_skb_{set,get}_fou_encap.
These helpers can be used to set and get UDP encap parameters from the
BPF tc-hook doing the packet redirect.

Patch 3 adds BPF tunnel selftests using the two kfuncs.

---
v2:
 - Fixes for checkpatch.pl
 - Fixes for kernel test robot

Christian Ehrig (3):
  ipip,ip_tunnel,sit: Add FOU support for externally controlled ipip devices
  bpf,fou: Add bpf_skb_{set,get}_fou_encap kfuncs
  selftests/bpf: Test FOU kfuncs for externally controlled ipip devices

 include/net/fou.h                             |   2 +
 include/net/ip_tunnels.h                      |  28 +++--
 net/ipv4/Makefile                             |   2 +-
 net/ipv4/fou_bpf.c                            | 119 ++++++++++++++++++
 net/ipv4/fou_core.c                           |   5 +
 net/ipv4/ip_tunnel.c                          |  22 +++-
 net/ipv4/ipip.c                               |   1 +
 net/ipv6/sit.c                                |   2 +-
 .../selftests/bpf/progs/test_tunnel_kern.c    | 117 +++++++++++++++++
 tools/testing/selftests/bpf/test_tunnel.sh    |  81 ++++++++++++
 10 files changed, 362 insertions(+), 17 deletions(-)
 create mode 100644 net/ipv4/fou_bpf.c

-- 
2.39.2

2 years, 3 months

