- Linux-kselftest-mirror - lists.linaro.org

[PATCH v2 0/5] introduce VM_MAYBE_GUARD and make it sticky

by Lorenzo Stoakes

Currently, guard regions are not visible to users except through /proc/$pid/pagemap, with no explicit visibility at the VMA level.

This makes the feature less useful, as it isn't entirely apparent which VMAs may have these entries present, especially when performing actions which walk through memory regions such as those performed by CRIU.

This series addresses this issue by introducing the VM_MAYBE_GUARD flag which fulfils this role, updating the smaps logic to display an entry for these.

The semantics of this flag are that a guard region MAY be present if set (we cannot be sure, as we can't efficiently track whether an MADV_GUARD_REMOVE finally removes all the guard regions in a VMA) - but if not set the VMA definitely does NOT have any guard regions present.

It's problematic to establish this flag without further action, because that means that VMAs with guard regions in them become non-mergeable with adjacent VMAs for no especially good reason.

To work around this, this series also introduces the concept of 'sticky' VMA flags - that is flags which:

a. if set in one VMA and not in another still permit those VMAs to be merged (if otherwise compatible).

b. When they are merged, the resultant VMA must have the flag set.

The VMA logic is updated to propagate these flags correctly.

Additionally, VM_MAYBE_GUARD being an explicit VMA flag allows us to solve an issue with file-backed guard regions - previously these established an anon_vma object for file-backed mappings solely to have vma_needs_copy() correctly propagate guard region mappings to child processes.

We introduce a new flag alias VM_COPY_ON_FORK (which currently only specifies VM_MAYBE_GUARD) and update vma_needs_copy() to check explicitly for this flag and to copy page tables if it is present, which resolves this issue.

Additionally, we add the ability for allow-listed VMA flags to be atomically writable with only mmap/VMA read locks held. The only flag we allow so far is VM_MAYBE_GUARD, which we carefully ensure does not cause any races by being allowed to do so.

This allows us to maintain guard region installation as a read-locked operation and not endure the overhead of obtaining a write lock here.

Finally we introduce extensive VMA userland tests to assert that the sticky VMA logic behaves correctly as well as guard region self tests to assert that smaps visibility is correctly implemented.

v2:
* Separated out userland VMA tests for sticky behaviour as per Suren.
* Added the concept of atomic writable VMA flags as per Pedro and Vlastimil.
* Made VM_MAYBE_GUARD an atomic writable flag so we don't have to take a VMA write lock in madvise() as per Pedro and Vlastimil.

v1:
https://lore.kernel.org/all/cover.1761756437.git.lorenzo.stoakes@oracle.com/

Lorenzo Stoakes (5):
  mm: introduce VM_MAYBE_GUARD and make visible in /proc/$pid/smaps
  mm: add atomic VMA flags, use VM_MAYBE_GUARD as such
  mm: implement sticky, copy on fork VMA flags
  tools/testing/vma: add VMA sticky userland tests
  selftests/mm/guard-regions: add smaps visibility test

 Documentation/filesystems/proc.rst | 1 +
 fs/proc/task_mmu.c | 1 +
 include/linux/mm.h | 58 ++++++++++
 include/trace/events/mmflags.h | 1 +
 mm/madvise.c | 22 ++--
 mm/memory.c | 3 +
 mm/vma.c | 22 ++--
 tools/testing/selftests/mm/guard-regions.c | 120 +++++++++++++++++++++
 tools/testing/selftests/mm/vm_util.c | 5 +
 tools/testing/selftests/mm/vm_util.h | 1 +
 tools/testing/vma/vma.c | 89 +++++++++++++--
 tools/testing/vma/vma_internal.h | 35 ++++++
 12 files changed, 330 insertions(+), 28 deletions(-)

--
2.51.0
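
The sticky and copy-on-fork rules described in this cover letter can be sketched in userland C. This is only an illustrative model, not the code from the series: the SK_* flag values and the sk_* helper names are hypothetical, and the real kernel merge logic checks much more than the flag word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define SK_VM_READ        (UINT32_C(1) << 0)
#define SK_VM_WRITE       (UINT32_C(1) << 1)
#define SK_VM_MAYBE_GUARD (UINT32_C(1) << 2)

/* Flags that never block a merge and always survive one. */
#define SK_VM_STICKY       SK_VM_MAYBE_GUARD
/* Flags that force page tables to be copied on fork. */
#define SK_VM_COPY_ON_FORK SK_VM_MAYBE_GUARD

/* Rule (a): flag sets are compatible if they differ only in sticky bits. */
static bool sk_flags_mergeable(uint32_t a, uint32_t b)
{
	return ((a ^ b) & ~SK_VM_STICKY) == 0;
}

/* Rule (b): the merged VMA carries every sticky bit set in either input. */
static uint32_t sk_merged_flags(uint32_t a, uint32_t b)
{
	assert(sk_flags_mergeable(a, b));
	return a | (b & SK_VM_STICKY);
}

/* Models the fork-time check: any copy-on-fork flag forces a copy. */
static bool sk_needs_copy_on_fork(uint32_t flags)
{
	return (flags & SK_VM_COPY_ON_FORK) != 0;
}
```

In this model a VMA with SK_VM_READ | SK_VM_MAYBE_GUARD merges with a plain SK_VM_READ neighbour and the result keeps SK_VM_MAYBE_GUARD, which matches the "MAY be present" semantics: the flag can over-report guard regions but never under-report them.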

11 hours, 54 minutes


[PATCH nf-next v9 0/3] Add IPIP flowtable SW acceleration

by Lorenzo Bianconi

Introduce SW acceleration for IPIP tunnels in the netfilter flowtable infrastructure. This series introduces basic infrastructure to accelerate other tunnel types (e.g. IP6IP6).

---
Changes in v9:
- Fixed IPIP tunnel offloading when VLAN encapsulation is enabled.
- Add IPIP tunnel over vlan self-test
- Remove wrong field from flow_offload_tuple key
- Cosmetics
- Link to v8: https://lore.kernel.org/r/20251023-nf-flowtable-ipip-v8-0-5d5d8595c730@kern…

Changes in v8:
- Rebase on top of the following series (not yet applied)
  https://patchwork.ozlabs.org/project/netfilter-devel/list/?series=477081
- Link to v7: https://lore.kernel.org/r/20251021-nf-flowtable-ipip-v7-0-a45214896106@kern…

Changes in v7:
- Introduce sw acceleration for tx path of IPIP tunnels
- Rely on exact match during flowtable entry lookup
- Fix typos
- Link to v6: https://lore.kernel.org/r/20250818-nf-flowtable-ipip-v6-0-eda90442739c@kern…

Changes in v6:
- Rebase on top of nf-next main branch
- Link to v5: https://lore.kernel.org/r/20250721-nf-flowtable-ipip-v5-0-0865af9e58c6@kern…

Changes in v5:
- Rely on __ipv4_addr_hash() to compute the hash used as encap ID
- Remove unnecessary pskb_may_pull() in nf_flow_tuple_encap()
- Add nf_flow_ip4_ecanp_pop utility routine
- Link to v4: https://lore.kernel.org/r/20250718-nf-flowtable-ipip-v4-0-f8bb1c18b986@kern…

Changes in v4:
- Use the hash value of the saddr, daddr and protocol of outer IP header as encapsulation id.
- Link to v3: https://lore.kernel.org/r/20250703-nf-flowtable-ipip-v3-0-880afd319b9f@kern…

Changes in v3:
- Add outer IP header sanity checks
- target nf-next tree instead of net-next
- Link to v2: https://lore.kernel.org/r/20250627-nf-flowtable-ipip-v2-0-c713003ce75b@kern…

Changes in v2:
- Introduce IPIP flowtable selftest
- Link to v1: https://lore.kernel.org/r/20250623-nf-flowtable-ipip-v1-1-2853596e3941@kern…

---
Lorenzo Bianconi (3):
  net: netfilter: Add IPIP flowtable rx sw acceleration
  net: netfilter: Add IPIP flowtable tx sw acceleration
  selftests: netfilter: nft_flowtable.sh: Add IPIP flowtable selftest

 include/linux/netdevice.h | 13 ++
 include/net/netfilter/nf_flow_table.h | 18 +++
 net/ipv4/ipip.c | 25 ++++
 net/netfilter/nf_flow_table_core.c | 3 +
 net/netfilter/nf_flow_table_ip.c | 135 +++++++++++++++++++--
 net/netfilter/nf_flow_table_path.c | 84 +++++++++++--
 .../selftests/net/netfilter/nft_flowtable.sh | 69 +++++++++++
 7 files changed, 328 insertions(+), 19 deletions(-)
---
base-commit: 32e4b1bf1bbfe63e52e2fff7ade0aaeb805defe3
change-id: 20250623-nf-flowtable-ipip-1b3d7b08d067

Best regards,
--
Lorenzo Bianconi <lorenzo(a)kernel.org>

13 hours, 20 minutes


[RFC 0/2] xdp: Delegate fast path return decision to page_pool

by Dragos Tatulea

This small series proposes the removal of the BPF_RI_F_RF_NO_DIRECT XDP flag in favour of page_pool's internal page_pool_napi_local() check which can override a non-direct recycle into a direct one if the right conditions are met. This was discussed on the mailing list on several occasions [1][2].

The first patch adds additional benchmarking code to the page_pool benchmark.

The second patch has the actual change with a proper explanation and measurements. It remains to be debated if the whole BPF_RI_F_RF_NO_DIRECT mechanism should be deleted or only its use in xdp_return_frame_rx_napi().

There is still the unresolved issue of drivers that don't support page_pool NAPI recycling. This series could be extended to add that support. Otherwise those drivers would end up with slow path recycling for XDP.

[1] https://lore.kernel.org/all/8d165026-1477-46cb-94d4-a01e1da40833@kernel.org/
[2] https://lore.kernel.org/all/20250918084823.372000-1-dtatulea@nvidia.com/

Dragos Tatulea (2):
  page_pool: add benchmarking for napi-based recycling
  xdp: Delegate fast path return decision to page_pool

 drivers/net/veth.c | 2 -
 include/linux/filter.h | 22 -----
 include/net/xdp.h | 2 +-
 kernel/bpf/cpumap.c | 2 -
 net/bpf/test_run.c | 2 -
 net/core/filter.c | 2 +-
 net/core/xdp.c | 24 ++---
 .../bench/page_pool/bench_page_pool_simple.c | 92 ++++++++++++++++++-
 8 files changed, 104 insertions(+), 44 deletions(-)

--
2.50.1
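
The idea of letting the pool upgrade a non-direct recycle into a direct one can be sketched as a small userland model. Everything here is an assumption made for illustration: the pp_model_* types and functions are hypothetical, and the locality condition (softirq context on the CPU that owns the pool's NAPI instance) is a simplification of what page_pool_napi_local() is understood to check, not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for kernel state; not the real structs. */
struct pp_model_ctx {
	bool in_softirq; /* caller is running in softirq (NAPI) context */
	int cpu;         /* CPU the caller is running on */
};

struct pp_model_pool {
	int napi_cpu;    /* CPU that owns the pool's NAPI instance, -1 if none */
};

/* Models the locality check: softirq context on the NAPI owner's CPU. */
static bool pp_model_napi_local(const struct pp_model_pool *pool,
				const struct pp_model_ctx *ctx)
{
	assert(pool != NULL && ctx != NULL);
	if (!ctx->in_softirq)
		return false;
	return pool->napi_cpu >= 0 && pool->napi_cpu == ctx->cpu;
}

/*
 * The caller's hint can only be upgraded, never downgraded: a non-direct
 * request becomes direct when the locality check passes.
 */
static bool pp_model_recycle_direct(const struct pp_model_pool *pool,
				    const struct pp_model_ctx *ctx,
				    bool caller_allows_direct)
{
	return caller_allows_direct || pp_model_napi_local(pool, ctx);
}
```

Under this model a caller no longer needs its own flag to say "direct recycling is safe here"; passing false is always safe and the pool decides, which is the delegation the subject line refers to.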

13 hours, 30 minutes


[PATCH v8 28/28] tracing: selftests: Add pKVM trace remote tests

by Vincent Donnefort

Run the trace remote selftests with the pKVM trace remote "hypervisor".

Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Cc: linux-kselftest(a)vger.kernel.org
Signed-off-by: Vincent Donnefort <vdonnefort(a)google.com>

diff --git a/tools/testing/selftests/ftrace/test.d/remotes/pkvm/buffer_size.tc b/tools/testing/selftests/ftrace/test.d/remotes/pkvm/buffer_size.tc
new file mode 100644
index 000000000000..2de07e4d72fe
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/remotes/pkvm/buffer_size.tc
@@ -0,0 +1,11 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test pkvm hypervisor trace buffer size
+# requires: remotes/hypervisor/write_event
+
+SOURCE_REMOTE_TEST=1
+. $TEST_DIR/remotes/buffer_size.tc
+
+set -e
+setup_remote "hypervisor"
+test_buffer_size
diff --git a/tools/testing/selftests/ftrace/test.d/remotes/pkvm/reset.tc b/tools/testing/selftests/ftrace/test.d/remotes/pkvm/reset.tc
new file mode 100644
index 000000000000..48afc51627e8
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/remotes/pkvm/reset.tc
@@ -0,0 +1,11 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test pkvm hypervisor trace buffer reset
+# requires: remotes/hypervisor/write_event
+
+SOURCE_REMOTE_TEST=1
+. $TEST_DIR/remotes/reset.tc
+
+set -e
+setup_remote "hypervisor"
+test_reset
diff --git a/tools/testing/selftests/ftrace/test.d/remotes/pkvm/trace.tc b/tools/testing/selftests/ftrace/test.d/remotes/pkvm/trace.tc
index 49dca7c3861a..00aed1c2e650 100644
--- a/tools/testing/selftests/ftrace/test.d/remotes/pkvm/trace.tc
+++ b/tools/testing/selftests/ftrace/test.d/remotes/pkvm/trace.tc
@@ -1,9 +1,10 @@
 #!/bin/sh
 # SPDX-License-Identifier: GPL-2.0
-# description: Test pkvm hypervisor tracing pipe
+# description: Test pkvm hypervisor non-consuming trace read
+# requires: remotes/hypervisor/write_event
 
 SOURCE_REMOTE_TEST=1
-. $TEST_DIR/remotes/trace_pipe.tc
+. $TEST_DIR/remotes/trace.tc
 
 set -e
 setup_remote "hypervisor"
diff --git a/tools/testing/selftests/ftrace/test.d/remotes/pkvm/trace_pipe.tc b/tools/testing/selftests/ftrace/test.d/remotes/pkvm/trace_pipe.tc
new file mode 100644
index 000000000000..b63339aca380
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/remotes/pkvm/trace_pipe.tc
@@ -0,0 +1,11 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test pkvm hypervisor consuming trace read
+# requires: remotes/hypervisor/write_event
+
+SOURCE_REMOTE_TEST=1
+. $TEST_DIR/remotes/trace_pipe.tc
+
+set -e
+setup_remote "hypervisor"
+test_trace_pipe
diff --git a/tools/testing/selftests/ftrace/test.d/remotes/pkvm/unloading.tc b/tools/testing/selftests/ftrace/test.d/remotes/pkvm/unloading.tc
new file mode 100644
index 000000000000..eb1640a927cc
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/remotes/pkvm/unloading.tc
@@ -0,0 +1,11 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test pkvm hypervisor trace buffer unloading
+# requires: remotes/hypervisor/write_event
+
+SOURCE_REMOTE_TEST=1
+. $TEST_DIR/remotes/unloading.tc
+
+set -e
+setup_remote "hypervisor"
+test_unloading
--
2.51.2.1041.gc1ab5b90ca-goog

14 hours, 56 minutes


[PATCH v8 15/28] tracing: selftests: Add trace remote tests

by Vincent Donnefort

Exercise the tracefs interface for trace remote with a set of tests to check:

  * loading/unloading (unloading.tc)
  * reset (reset.tc)
  * size changes (buffer_size.tc)
  * consuming read (trace_pipe.tc)
  * non-consuming read (trace.tc)

Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Cc: linux-kselftest(a)vger.kernel.org
Signed-off-by: Vincent Donnefort <vdonnefort(a)google.com>

diff --git a/tools/testing/selftests/ftrace/test.d/remotes/buffer_size.tc b/tools/testing/selftests/ftrace/test.d/remotes/buffer_size.tc
new file mode 100644
index 000000000000..1a43280ffa97
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/remotes/buffer_size.tc
@@ -0,0 +1,25 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test trace remote buffer size
+# requires: remotes/test
+
+. $TEST_DIR/remotes/functions
+
+test_buffer_size()
+{
+	echo 0 > tracing_on
+	assert_unloaded
+
+	echo 4096 > buffer_size_kb
+	echo 1 > tracing_on
+	assert_loaded
+
+	echo 0 > tracing_on
+	echo 7 > buffer_size_kb
+}
+
+if [ -z "$SOURCE_REMOTE_TEST" ]; then
+	set -e
+	setup_remote_test
+	test_buffer_size
+fi
diff --git a/tools/testing/selftests/ftrace/test.d/remotes/functions b/tools/testing/selftests/ftrace/test.d/remotes/functions
new file mode 100644
index 000000000000..97a09d564a34
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/remotes/functions
@@ -0,0 +1,88 @@
+# SPDX-License-Identifier: GPL-2.0
+
+setup_remote()
+{
+	local name=$1
+
+	[ -e $TRACING_DIR/remotes/$name/write_event ] || exit_unresolved
+
+	cd remotes/$name/
+	echo 0 > tracing_on
+	clear_trace
+	echo 7 > buffer_size_kb
+	echo 0 > events/enable
+	echo 1 > events/$name/selftest/enable
+	echo 1 > tracing_on
+}
+
+setup_remote_test()
+{
+	[ -d $TRACING_DIR/remotes/test/ ] || modprobe remote_test || exit_unresolved
+
+	setup_remote "test"
+}
+
+assert_loaded()
+{
+	grep -q "(loaded)" buffer_size_kb
+}
+
+assert_unloaded()
+{
+	grep -q "(unloaded)" buffer_size_kb
+}
+
+dump_trace_pipe()
+{
+	output=$(mktemp $TMPDIR/remote_test.XXXXXX)
+	cat trace_pipe > $output &
+	pid=$!
+	sleep 1
+	kill -1 $pid
+
+	echo $output
+}
+
+check_trace()
+{
+	start_id="$1"
+	end_id="$2"
+	file="$3"
+
+	# Ensure the file is not empty
+	test -n "$(head $file)"
+
+	prev_ts=0
+	id=0
+
+	# Only keep <timestamp> <id>
+	tmp=$(mktemp $TMPDIR/remote_test.XXXXXX)
+	sed -e 's/\[[0-9]*\]\s*\([0-9]*.[0-9]*\): [a-z]* id=\([0-9]*\)/\1 \2/' $file > $tmp
+
+	while IFS= read -r line; do
+		ts=$(echo $line | cut -d ' ' -f 1)
+		id=$(echo $line | cut -d ' ' -f 2)
+
+		test $(echo "$ts>$prev_ts" | bc) -eq 1
+		test $id -eq $start_id
+
+		prev_ts=$ts
+		start_id=$((start_id + 1))
+	done < $tmp
+
+	test $id -eq $end_id
+	rm $tmp
+}
+
+get_cpu_ids()
+{
+	sed -n 's/^processor\s*:\s*\([0-9]\+\).*/\1/p' /proc/cpuinfo
+}
+
+get_page_size() {
+	sed -ne 's/^.*data.*size:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_selftest_event_size() {
+	sed -ne 's/^.*field:.*;.*size:\([0-9][0-9]*\);.*/\1/p' events/*/selftest/format | awk '{s+=$1} END {print s}'
+}
diff --git a/tools/testing/selftests/ftrace/test.d/remotes/reset.tc b/tools/testing/selftests/ftrace/test.d/remotes/reset.tc
new file mode 100644
index 000000000000..4d176349b2bc
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/remotes/reset.tc
@@ -0,0 +1,90 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test trace remote reset
+# requires: remotes/test
+
+. $TEST_DIR/remotes/functions
+
+check_reset()
+{
+	write_event_path="write_event"
+	taskset=""
+
+	clear_trace
+
+	# Is the buffer empty?
+	output=$(dump_trace_pipe)
+	test $(wc -l $output | cut -d ' ' -f1) -eq 0
+
+	if $(echo $(pwd) | grep -q "per_cpu/cpu"); then
+		write_event_path="../../write_event"
+		cpu_id=$(echo $(pwd) | sed -e 's/.*per_cpu\/cpu//')
+		taskset="taskset -c $cpu_id"
+	fi
+	rm $output
+
+	# Can we properly write a new event?
+	$taskset echo 7890 > $write_event_path
+	output=$(dump_trace_pipe)
+	test $(wc -l $output | cut -d ' ' -f1) -eq 1
+	grep -q "id=7890" $output
+	rm $output
+}
+
+test_global_interface()
+{
+	output=$(mktemp $TMPDIR/remote_test.XXXXXX)
+
+	# Confidence check
+	echo 123456 > write_event
+	output=$(dump_trace_pipe)
+	grep -q "id=123456" $output
+	rm $output
+
+	# Reset single event
+	echo 1 > write_event
+	check_reset
+
+	# Reset lost events
+	for i in $(seq 1 10000); do
+		echo 1 > write_event
+	done
+	check_reset
+}
+
+test_percpu_interface()
+{
+	[ "$(get_cpu_ids | wc -l)" -ge 2 ] || return 0
+
+	for cpu in $(get_cpu_ids); do
+		taskset -c $cpu echo 1 > write_event
+	done
+
+	check_non_empty=0
+	for cpu in $(get_cpu_ids); do
+		cd per_cpu/cpu$cpu/
+
+		if [ $check_non_empty -eq 0 ]; then
+			check_reset
+			check_non_empty=1
+		else
+			# Check we have only reset 1 CPU
+			output=$(dump_trace_pipe)
+			test $(wc -l $output | cut -d ' ' -f1) -eq 1
+			rm $output
+		fi
+		cd -
+	done
+}
+
+test_reset()
+{
+	test_global_interface
+	test_percpu_interface
+}
+
+if [ -z "$SOURCE_REMOTE_TEST" ]; then
+	set -e
+	setup_remote_test
+	test_reset
+fi
diff --git a/tools/testing/selftests/ftrace/test.d/remotes/trace.tc b/tools/testing/selftests/ftrace/test.d/remotes/trace.tc
new file mode 100644
index 000000000000..081133ec45ff
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/remotes/trace.tc
@@ -0,0 +1,127 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test trace remote non-consuming read
+# requires: remotes/test
+
+. $TEST_DIR/remotes/functions
+
+test_trace()
+{
+	echo 0 > tracing_on
+	assert_unloaded
+
+	echo 7 > buffer_size_kb
+	echo 1 > tracing_on
+	assert_loaded
+
+	# Simple test: Emit few events and try to read them
+	for i in $(seq 1 8); do
+		echo $i > write_event
+	done
+
+	check_trace 1 8 trace
+
+	#
+	# Test interaction with consuming read
+	#
+
+	cat trace_pipe > /dev/null &
+	pid=$!
+
+	sleep 1
+	kill $pid
+
+	test $(wc -l < trace) -eq 0
+
+	for i in $(seq 16 32); do
+		echo $i > write_event
+	done
+
+	check_trace 16 32 trace
+
+	#
+	# Test interaction with reset
+	#
+
+	echo 0 > trace
+
+	test $(wc -l < trace) -eq 0
+
+	for i in $(seq 1 8); do
+		echo $i > write_event
+	done
+
+	check_trace 1 8 trace
+
+	#
+	# Test interaction with lost events
+	#
+
+	# Ensure the writer is not on the reader page by reloading the buffer
+	echo 0 > tracing_on
+	echo 0 > trace
+	assert_unloaded
+	echo 1 > tracing_on
+	assert_loaded
+
+	# Ensure ring-buffer overflow by emitting events from the same CPU
+	for cpu in $(get_cpu_ids); do
+		break
+	done
+
+	events_per_page=$(($(get_page_size) / $(get_selftest_event_size))) # Approx: does not take TS into account
+	nr_events=$(($events_per_page * 2))
+	for i in $(seq 1 $nr_events); do
+		taskset -c $cpu echo $i > write_event
+	done
+
+	id=$(sed -n -e '1s/\[[0-9]*\]\s*[0-9]*.[0-9]*: [a-z]* id=\([0-9]*\)/\1/p' trace)
+	test $id -ne 1
+
+	check_trace $id $nr_events trace
+
+	#
+	# Test per-CPU interface
+	#
+	echo 0 > trace
+
+	for cpu in $(get_cpu_ids) ; do
+		taskset -c $cpu echo $cpu > write_event
+	done
+
+	for cpu in $(get_cpu_ids); do
+		cd per_cpu/cpu$cpu/
+
+		check_trace $cpu $cpu trace
+
+		cd - > /dev/null
+	done
+
+	#
+	# Test with hotplug
+	#
+
+	[ "$(get_cpu_ids | wc -l)" -ge 2 ] || return 0
+
+	echo 0 > trace
+
+	for cpu in $(get_cpu_ids); do
+		echo 0 > /sys/devices/system/cpu/cpu$cpu/online
+		break
+	done
+
+	for i in $(seq 1 8); do
+		echo $i > write_event
+	done
+
+	check_trace 1 8 trace
+
+	echo 1 > /sys/devices/system/cpu/cpu$cpu/online
+}
+
+if [ -z "$SOURCE_REMOTE_TEST" ]; then
+	set -e
+
+	setup_remote_test
+	test_trace
+fi
diff --git a/tools/testing/selftests/ftrace/test.d/remotes/trace_pipe.tc b/tools/testing/selftests/ftrace/test.d/remotes/trace_pipe.tc
new file mode 100644
index 000000000000..d28eaee10c7c
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/remotes/trace_pipe.tc
@@ -0,0 +1,127 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test trace remote consuming read
+# requires: remotes/test
+
+. $TEST_DIR/remotes/functions
+
+test_trace_pipe()
+{
+	echo 0 > tracing_on
+	assert_unloaded
+
+	# Emit events from the same CPU
+	for cpu in $(get_cpu_ids); do
+		break
+	done
+
+	#
+	# Simple test: Emit enough events to fill few pages
+	#
+
+	echo 1024 > buffer_size_kb
+	echo 1 > tracing_on
+	assert_loaded
+
+	events_per_page=$(($(get_page_size) / $(get_selftest_event_size)))
+	nr_events=$(($events_per_page * 4))
+
+	output=$(mktemp $TMPDIR/remote_test.XXXXXX)
+
+	cat trace_pipe > $output &
+	pid=$!
+
+	for i in $(seq 1 $nr_events); do
+		taskset -c $cpu echo $i > write_event
+	done
+
+	echo 0 > tracing_on
+	sleep 1
+	kill $pid
+
+	check_trace 1 $nr_events $output
+
+	rm $output
+
+	#
+	# Test interaction with lost events
+	#
+
+	assert_unloaded
+	echo 7 > buffer_size_kb
+	echo 1 > tracing_on
+	assert_loaded
+
+	nr_events=$((events_per_page * 2))
+	for i in $(seq 1 $nr_events); do
+		taskset -c $cpu echo $i > write_event
+	done
+
+	output=$(dump_trace_pipe)
+
+	lost_events=$(sed -n -e '1s/CPU:.*\[LOST \([0-9]*\) EVENTS\]/\1/p' $output)
+	test -n "$lost_events"
+
+	id=$(sed -n -e '2s/\[[0-9]*\]\s*[0-9]*.[0-9]*: [a-z]* id=\([0-9]*\)/\1/p' $output)
+	test "$id" -eq $(($lost_events + 1))
+
+	# Drop [LOST EVENTS] line
+	sed -i '1d' $output
+
+	check_trace $id $nr_events $output
+
+	rm $output
+
+	#
+	# Test per-CPU interface
+	#
+
+	echo 0 > trace
+	echo 1 > tracing_on
+
+	for cpu in $(get_cpu_ids); do
+		taskset -c $cpu echo $cpu > write_event
+	done
+
+	for cpu in $(get_cpu_ids); do
+		cd per_cpu/cpu$cpu/
+		output=$(dump_trace_pipe)
+
+		check_trace $cpu $cpu $output
+
+		rm $output
+		cd - > /dev/null
+	done
+
+	#
+	# Test interaction with hotplug
+	#
+
+	[ "$(get_cpu_ids | wc -l)" -ge 2 ] || return 0
+
+	echo 0 > trace
+
+	for cpu in $(get_cpu_ids); do
+		echo 0 > /sys/devices/system/cpu/cpu$cpu/online
+		break
+	done
+
+	for i in $(seq 1 8); do
+		echo $i > write_event
+	done
+
+	output=$(dump_trace_pipe)
+
+	check_trace 1 8 $output
+
+	rm $output
+
+	echo 1 > /sys/devices/system/cpu/cpu$cpu/online
+}
+
+if [ -z "$SOURCE_REMOTE_TEST" ]; then
+	set -e
+
+	setup_remote_test
+	test_trace_pipe
+fi
diff --git a/tools/testing/selftests/ftrace/test.d/remotes/unloading.tc b/tools/testing/selftests/ftrace/test.d/remotes/unloading.tc
new file mode 100644
index 000000000000..cac2190183f6
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/remotes/unloading.tc
@@ -0,0 +1,41 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test trace remote unloading
+# requires: remotes/test
+
+. $TEST_DIR/remotes/functions
+
+test_unloading()
+{
+	# No reader, writing
+	assert_loaded
+
+	# No reader, no writing
+	echo 0 > tracing_on
+	assert_unloaded
+
+	# 1 reader, no writing
+	cat trace_pipe &
+	pid=$!
+	sleep 1
+	assert_loaded
+	kill $pid
+	assert_unloaded
+
+	# No reader, no writing, events
+	echo 1 > tracing_on
+	echo 1 > write_event
+	echo 0 > tracing_on
+	assert_loaded
+
+	# Test reset
+	clear_trace
+	assert_unloaded
+}
+
+if [ -z "$SOURCE_REMOTE_TEST" ]; then
+	set -e
+
+	setup_remote_test
+	test_unloading
+fi
--
2.51.2.1041.gc1ab5b90ca-goog

14 hours, 56 minutes


[PATCH v6 0/2] platform/chrome: Fix an UAF via revocable primitive APIs

by Tzung-Bi Shih

The series is separated from [1] to show its independence and to make potential use cases easier to compare. This use case uses the primitive revocable APIs directly. It relies on the revocable core part [2].

It tries to fix a UAF in the fops of cros_ec_chardev after the underlying protocol device has gone by using revocable.

The file operations make sure the resources are available when using them. Even though it has the finest grain for accessing the resources, it makes the user code verbose. Per feedback from the community, I'm looking for some subsystem level helpers so that user code can be simpler.

The 1st patch converts existing protocol devices to resource providers of cros_ec_device.

The 2nd patch converts cros_ec_chardev to a resource consumer of cros_ec_device to fix the UAF.

[1] https://lore.kernel.org/chrome-platform/20251016054204.1523139-1-tzungbi@ke…
[2] https://lore.kernel.org/chrome-platform/20251106152330.11733-1-tzungbi@kern…

v6:
- New, separated from an existing series.

Tzung-Bi Shih (2):
  platform/chrome: Protect cros_ec_device lifecycle with revocable
  platform/chrome: cros_ec_chardev: Consume cros_ec_device via revocable

 drivers/platform/chrome/cros_ec.c | 5 ++
 drivers/platform/chrome/cros_ec_chardev.c | 71 ++++++++++++++++-----
 include/linux/platform_data/cros_ec_proto.h | 4 ++
 3 files changed, 65 insertions(+), 15 deletions(-)

--
2.48.1
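
The provider/consumer split described above can be illustrated with a small single-threaded model. This is not the kernel's revocable API: the rv_model_* names are hypothetical, and the sketch leaves out the synchronization a real implementation needs so that the resource cannot disappear between the availability check and its use.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a revocable resource; not the kernel's revocable API. */
struct rv_model_provider {
	void *res; /* NULL once the resource has been revoked */
};

/* Provider side: called when the underlying device goes away. */
static void rv_model_revoke(struct rv_model_provider *p)
{
	p->res = NULL;
}

/* Consumer side: returns the resource, or NULL if it has been revoked. */
static void *rv_model_try_access(struct rv_model_provider *p)
{
	assert(p != NULL);
	return p->res;
}

/* A file-operation-like consumer that re-checks availability on every call. */
static int rv_model_read_value(struct rv_model_provider *p, int *out)
{
	int *dev = rv_model_try_access(p);

	if (!dev)
		return -ENODEV; /* device is gone: fail instead of touching freed memory */
	*out = *dev;
	return 0;
}
```

Every consumer entry point repeating the try-access check is what makes the user code verbose, which is the motivation given above for looking at subsystem level helpers.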

16 hours, 56 minutes


[PATCH v8 00/15] Consolidate iommu page table implementations (AMD)

by Jason Gunthorpe

[Joerg, can you put this and vtd in linux-next please. The vtd series is still good at v3 thanks]

Currently each of the iommu page table formats duplicates all of the logic to maintain the page table and perform map/unmap/etc operations. There are several different versions of the algorithms between all the different formats. The io-pgtable system provides an interface to help isolate the page table code from the iommu driver, but doesn't provide tools to implement the common algorithms.

This makes it very hard to improve the state of the pagetable code under the iommu domains as any proposed improvement needs to alter a large number of different driver code paths. Combined with a lack of software based testing this makes improvement in this area very hard.

iommufd wants several new page table operations:
 - More efficient map/unmap operations, using iommufd's batching logic
 - unmap that returns the physical addresses into a batch as it progresses
 - cut that allows splitting areas so large pages can have holes poked in them dynamically (ie guestmemfd hitless shared/private transitions)
 - More aggressive freeing of table memory to avoid waste
 - Fragmenting large pages so that dirty tracking can be more granular
 - Reassembling large pages so that VMs can run at full IO performance in migration/dirty tracking error flows
 - KHO integration for kernel live upgrade

Together these are algorithmically complex enough to be a very significant task to go and implement in all the page table formats we support. Just the "server" focused drivers use almost all the formats (ARMv8 S1&S2 / x86 PAE / AMDv1 / VT-d SS / RISCV)

Instead of doing the duplicated work, this series takes the first step to consolidate the algorithms into one place. In spirit it is similar to the work Christoph did a few years back to pull the redundant get_user_pages() implementations out of the arch code into core MM. This unlocked a great deal of improvement in that space in the following years. I would like to see the same benefit in iommu as well.

My first RFC showed a bigger picture with almost all formats and more algorithms. This series reorganizes that to be narrowly focused on just enough to convert the AMD driver to use the new mechanism.

kunit tests are provided that allow good testing of the algorithms and all formats on x86, nothing is arch specific.

AMD is one of the simpler options as the HW is quite uniform with few different options/bugs while still requiring the complicated contiguous pages support. The HW also has a very simple range based invalidation approach that is easy to implement.

The AMD v1 and AMD v2 page table formats are implemented bit for bit identical to the current code, tested using a compare kunit test that checks against the io-pgtable version (on github, see below).

Updating the AMD driver to replace the io-pgtable layer with the new stuff is fairly straightforward now. The layering is fixed up in the new version so that all the invalidation goes through function pointers.

Several small fixing patches have come out of this as I've been fixing the problems that the test suite uncovers in the current code, and implementing the fixed version in iommupt.

On performance, there is a quite wide variety of implementation designs across all the drivers. Looking at some key performance across the main formats:

iommu_map():
     pgsz ,avg new,old ns, min new,old ns, min % (+ve is better)
     2^12 ,      53,66   ,      51,63    ,  19.19 (AMDv1)
 256*2^12 ,     386,1909 ,     367,1795  ,  79.79
 256*2^21 ,     362,1633 ,     355,1556  ,  77.77
     2^12 ,      56,62   ,      52,59    ,  11.11 (AMDv2)
 256*2^12 ,     405,1355 ,     357,1292  ,  72.72
 256*2^21 ,     393,1160 ,     358,1114  ,  67.67
     2^12 ,      55,65   ,      53,62    ,  14.14 (VT-d second stage)
 256*2^12 ,     391,518  ,     332,512   ,  35.35
 256*2^21 ,     383,635  ,     336,624   ,  46.46
     2^12 ,      57,65   ,      55,63    ,  12.12 (ARM 64 bit)
 256*2^12 ,     380,389  ,     361,369   ,   2.02
 256*2^21 ,     358,419  ,     345,400   ,  13.13

iommu_unmap():
     pgsz ,avg new,old ns, min new,old ns, min % (+ve is better)
     2^12 ,      69,88   ,      65,85    ,  23.23 (AMDv1)
 256*2^12 ,     353,6498 ,     331,6029  ,  94.94
 256*2^21 ,     373,6014 ,     360,5706  ,  93.93
     2^12 ,      71,72   ,      66,69    ,   4.04 (AMDv2)
 256*2^12 ,     228,891  ,     206,871   ,  76.76
 256*2^21 ,     254,721  ,     245,711   ,  65.65
     2^12 ,      69,87   ,      65,82    ,  20.20 (VT-d second stage)
 256*2^12 ,     210,321  ,     200,315   ,  36.36
 256*2^21 ,     255,349  ,     238,342   ,  30.30
     2^12 ,      72,77   ,      68,74    ,   8.08 (ARM 64 bit)
 256*2^12 ,     521,357  ,     447,346   , -29.29
 256*2^21 ,     489,358  ,     433,345   , -25.25

 * Above numbers include additional patches to remove the iommu_pgsize() overheads. gcc 13.3.0, i7-12700

This version provides fairly consistent performance across formats. ARM unmap performance is quite different because this version supports contiguous pages and uses a very different algorithm for unmapping. Though why it is so much worse compared to AMDv1 I haven't figured out yet.

The per-format commits include a more detailed chart.

There is a second branch:
   https://github.com/jgunthorpe/linux/commits/iommu_pt_all
Containing supporting work and future steps:
 - ARM short descriptor (32 bit), ARM long descriptor (64 bit) formats
 - RISCV format and RISCV conversion
   https://github.com/jgunthorpe/linux/commits/iommu_pt_riscv
 - Support for a DMA incoherent HW page table walker
 - VT-d second stage format and VT-d conversion
   https://github.com/jgunthorpe/linux/commits/iommu_pt_vtd
 - DART v1 & v2 format
 - Draft of an iommufd 'cut' operation to break down huge pages
 - A compare test that checks the iommupt formats against the iopgtable interface, including updating AMD to have a working iopgtable and patches to make VT-d have an iopgtable for testing.
 - A performance test to micro-benchmark map and unmap against iopgtable

My strategy is to go one by one for the drivers:
 - AMD driver conversion
 - RISCV page table and driver
 - Intel VT-d driver and VTDSS page table
 - Flushing improvements for RISCV
 - ARM SMMUv3

And concurrently work on the algorithm side:
 - debugfs content dump, like VT-d has
 - Cut support
 - Increase/Decrease page size support
 - map/unmap batching
 - KHO

As we make more algorithm improvements the value to convert the drivers increases.

This is on github: https://github.com/jgunthorpe/linux/commits/iommu_pt

v8:
 - Remove unused to_amdv1pt/common_to_amdv1pt/to_x86_64_pt/common_to_x86_64_pt
 - Fix 32 bit udiv compile failure in the kunit
v7: https://patch.msgid.link/r/0-v7-ab019a8791e2+175b8-iommu_pt_jgg@nvidia.com
 - Rebase to v6.18-rc2
 - Improve comments and documentation
 - Add a few missed __sme_sets() for AMD CC
 - Rename
     pt_iommu_flush_ops -> pt_iommu_driver_ops
     VT-D -> VT-d
     pt_clear_entry -> pt_clear_entries
     pt_entry_write_is_dirty -> pt_entry_is_write_dirty
     pt_entry_set_write_clean -> pt_entry_make_write_clean
 - Tidy some of the map flow into a new function do_map()
 - Fix ffz64()
v6: https://patch.msgid.link/r/0-v6-0fb54a1d9850+36b-iommu_pt_jgg@nvidia.com
 - Improve comments and documentation
 - Rename
     pt_entry_oa_full -> pt_entry_oa_exact
     pt_has_system_page -> pt_has_system_page_size
     pt_max_output_address_lg2 -> pt_max_oa_lg2
     log2_f*() -> vaf* / oaf* / f*_t
     pt_item_fully_covered -> pt_entry_fully_covered
 - Fix missed constant propagation causing division
 - Consolidate debugging checks to pt_check_install_leaf_args()
 - Change collect->ignore_mapped to check_mapped
 - Shuffle some hunks around to more appropriate patches
 - Two new mini kunit tests
v5: https://patch.msgid.link/r/0-v5-116c4948af3d+68091-iommu_pt_jgg@nvidia.com
 - Text grammar updates and kdoc fixes
v4: https://patch.msgid.link/r/0-v4-0d6a6726a372+18959-iommu_pt_jgg@nvidia.com
 - Rebase on v6.16-rc3
 - Integrate the HATS/HATDis changes
 - Remove 'default n' from kconfig
 - Remove unused 'PT_FIXED_TOP_LEVEL'
 - Improve comments and documentation
 - Fix some compile warnings from kbuild robots
v3: https://patch.msgid.link/r/0-v3-a93aab628dbc+521-iommu_pt_jgg@nvidia.com
 - Rebase on v6.16-rc2
 - s/PT_ENTRY_WORD_SIZE/PT_ITEM_WORD_SIZE/s to follow the language better
 - Comment and documentation updates
 - Add PT_TOP_PHYS_MASK to help manage alignment restrictions on the top pointer
 - Add missed force_aperture = true
 - Make pt_iommu_deinit() take care of the not-yet-inited error case internally as AMD/RISCV/VTD all shared this logic
 - Change gather_range() into gather_range_pages() so it also deals with the page list. This makes the following cache flushing series simpler
 - Fix missed update of unmap->unmapped in some error cases
 - Change clear_contig() to order the gather more logically
 - Remove goto from the error handling in __map_range_leaf()
 - s/log2_/oalog2_/ in places where the argument is an oaddr_t
 - Pass the pts to pt_table_install64/32()
 - Do not use SIGN_EXTEND for the AMDv2 page table because of Vasant's information on how PASID 0 works.
v2: https://patch.msgid.link/r/0-v2-5c26bde5c22d+58b-iommu_pt_jgg@nvidia.com
 - AMD driver only, many code changes
RFC: https://lore.kernel.org/all/0-v1-01fa10580981+1d-iommu_pt_jgg@nvidia.com/

Cc: Michael Roth <michael.roth(a)amd.com>
Cc: Alexey Kardashevskiy <aik(a)amd.com>
Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Cc: James Gowans <jgowans(a)amazon.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>

Alejandro Jimenez (1):
  iommu/amd: Use the generic iommu page table

Jason Gunthorpe (14):
  genpt: Generic Page Table base API
  genpt: Add Documentation/ files
  iommupt: Add the basic structure of the iommu implementation
  iommupt: Add the AMD IOMMU v1 page table format
  iommupt: Add iova_to_phys op
  iommupt: Add unmap_pages op
  iommupt: Add map_pages op
  iommupt: Add read_and_clear_dirty op
  iommupt: Add a kunit test for Generic Page Table
  iommupt: Add a mock pagetable format for iommufd selftest to use
  iommufd: Change the selftest to use iommupt instead of xarray
  iommupt: Add the x86 64 bit page table format
  iommu/amd: Remove AMD io_pgtable support
  iommupt: Add a kunit test for the IOMMU implementation

 .clang-format | 1 +
 Documentation/driver-api/generic_pt.rst | 142 ++
 Documentation/driver-api/index.rst | 1 +
 drivers/iommu/Kconfig | 2 +
 drivers/iommu/Makefile | 1 +
 drivers/iommu/amd/Kconfig | 5 +-
 drivers/iommu/amd/Makefile | 2 +-
 drivers/iommu/amd/amd_iommu.h | 1 -
 drivers/iommu/amd/amd_iommu_types.h | 110 +-
 drivers/iommu/amd/io_pgtable.c | 577 --------
 drivers/iommu/amd/io_pgtable_v2.c | 370 ------
 drivers/iommu/amd/iommu.c | 538 ++++----
 drivers/iommu/generic_pt/.kunitconfig | 13 +
 drivers/iommu/generic_pt/Kconfig | 68 +
 drivers/iommu/generic_pt/fmt/Makefile | 26 +
 drivers/iommu/generic_pt/fmt/amdv1.h | 411 ++++++
 drivers/iommu/generic_pt/fmt/defs_amdv1.h | 21 +
 drivers/iommu/generic_pt/fmt/defs_x86_64.h | 21 +
 drivers/iommu/generic_pt/fmt/iommu_amdv1.c | 15 +
 drivers/iommu/generic_pt/fmt/iommu_mock.c | 10 +
 drivers/iommu/generic_pt/fmt/iommu_template.h | 48 +
 drivers/iommu/generic_pt/fmt/iommu_x86_64.c | 11 +
 drivers/iommu/generic_pt/fmt/x86_64.h | 255 ++++
 drivers/iommu/generic_pt/iommu_pt.h | 1162 +++++++++++++++++
 drivers/iommu/generic_pt/kunit_generic_pt.h | 713 ++++++++++
 drivers/iommu/generic_pt/kunit_iommu.h | 183 +++
 drivers/iommu/generic_pt/kunit_iommu_pt.h | 487 +++++++
 drivers/iommu/generic_pt/pt_common.h | 358 +++++
 drivers/iommu/generic_pt/pt_defs.h | 329 +++++
 drivers/iommu/generic_pt/pt_fmt_defaults.h | 233 ++++
 drivers/iommu/generic_pt/pt_iter.h | 636 +++++++++
 drivers/iommu/generic_pt/pt_log2.h | 122 ++
 drivers/iommu/io-pgtable.c | 4 -
 drivers/iommu/iommufd/Kconfig | 1 +
 drivers/iommu/iommufd/iommufd_test.h | 11 +-
 drivers/iommu/iommufd/selftest.c | 438 +++----
 include/linux/generic_pt/common.h | 167 +++
 include/linux/generic_pt/iommu.h | 271 ++++
 include/linux/io-pgtable.h | 2 -
 include/linux/irqchip/riscv-imsic.h | 3 +-
 tools/testing/selftests/iommu/iommufd.c | 60 +-
 tools/testing/selftests/iommu/iommufd_utils.h | 12 +
 42 files changed, 6229 insertions(+), 1612 deletions(-)
 create mode 100644 Documentation/driver-api/generic_pt.rst
 delete mode 100644 drivers/iommu/amd/io_pgtable.c
 delete mode 100644 drivers/iommu/amd/io_pgtable_v2.c
 create mode 100644 drivers/iommu/generic_pt/.kunitconfig
 create mode 100644 drivers/iommu/generic_pt/Kconfig
 create mode 100644 drivers/iommu/generic_pt/fmt/Makefile
 create
mode 100644 drivers/iommu/generic_pt/fmt/amdv1.h create mode 100644 drivers/iommu/generic_pt/fmt/defs_amdv1.h create mode 100644 drivers/iommu/generic_pt/fmt/defs_x86_64.h create mode 100644 drivers/iommu/generic_pt/fmt/iommu_amdv1.c create mode 100644 drivers/iommu/generic_pt/fmt/iommu_mock.c create mode 100644 drivers/iommu/generic_pt/fmt/iommu_template.h create mode 100644 drivers/iommu/generic_pt/fmt/iommu_x86_64.c create mode 100644 drivers/iommu/generic_pt/fmt/x86_64.h create mode 100644 drivers/iommu/generic_pt/iommu_pt.h create mode 100644 drivers/iommu/generic_pt/kunit_generic_pt.h create mode 100644 drivers/iommu/generic_pt/kunit_iommu.h create mode 100644 drivers/iommu/generic_pt/kunit_iommu_pt.h create mode 100644 drivers/iommu/generic_pt/pt_common.h create mode 100644 drivers/iommu/generic_pt/pt_defs.h create mode 100644 drivers/iommu/generic_pt/pt_fmt_defaults.h create mode 100644 drivers/iommu/generic_pt/pt_iter.h create mode 100644 drivers/iommu/generic_pt/pt_log2.h create mode 100644 include/linux/generic_pt/common.h create mode 100644 include/linux/generic_pt/iommu.h base-commit: 8440410283bb5533b676574211f31f030a18011b -- 2.43.0


[PATCH v2] selftests: harness: Support KCOV.

by Kuniyuki Iwashima

While writing a selftest with kselftest_harness.h, I often want to
check which paths are actually exercised.

Let's support generating KCOV coverage data.

We can specify the output directory via the KCOV_OUTPUT environment
variable, and the number of instructions to collect via the KCOV_SLOTS
environment variable.

  # KCOV_OUTPUT=$PWD/kcov KCOV_SLOTS=$((4096 * 2)) \
    ./tools/testing/selftests/net/af_unix/scm_inq

Both variables can also be specified as the make variable.

  # make -C tools/testing/selftests/ \
    KCOV_OUTPUT=$PWD/kcov KCOV_SLOTS=$((4096 * 4)) \
    kselftest_override_timeout=60 TARGETS=net/af_unix run_tests

The coverage data can be simply decoded with addr2line:

  $ cat kcov/* | sort | uniq | addr2line -e vmlinux | grep unix
  net/unix/af_unix.c:1056
  net/unix/af_unix.c:3138
  net/unix/af_unix.c:3834
  net/unix/af_unix.c:3838
  net/unix/af_unix.c:311 (discriminator 2)
  ...

or more nicely with a script embedded in vock [0]:

  $ cat kcov/* | sort | uniq > local.log
  $ python3 ~/kernel/tools/vock/report.py \
    --kernel-src ./ --vmlinux ./vmlinux \
    --mode local --local-log local.log --filter unix
  ...
  ------------------------------- Coverage Report --------------------------------
  📄 net/unix/af_unix.c (276 lines)
  ...
  942 | static int unix_setsockopt(struct socket *sock, int level, int optname,
  943 |                            sockptr_t optval, unsigned int optlen)
  944 | {
  ...
  961 |     switch (optname) {
  962 |     case SO_INQ:
  963 >         if (sk->sk_type != SOCK_STREAM)
  964 |             return -EINVAL;
  965 |
  966 >         if (val > 1 || val < 0)
  967 |             return -EINVAL;
  968 |
  969 >         WRITE_ONCE(u->recvmsg_inq, val);
  970 |         break;

Link: https://github.com/kzall0c/vock/blob/f3d97de9954f9df758c0ab287ca7e24e654288… #[0]
Signed-off-by: Kuniyuki Iwashima <kuniyu(a)google.com>
---
v2: Support TEST()
v1: https://lore.kernel.org/linux-kselftest/20251017084022.3721950-1-kuniyu@goo…
---
 Documentation/dev-tools/kselftest.rst       |  41 ++++++
 tools/testing/selftests/Makefile            |  14 ++-
 tools/testing/selftests/kselftest_harness.h | 133 +++++++++++++++++++-
 3 files changed, 178 insertions(+), 10 deletions(-)

diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
index 18c2da67fae4..5c2b92ac4a30 100644
--- a/Documentation/dev-tools/kselftest.rst
+++ b/Documentation/dev-tools/kselftest.rst
@@ -200,6 +200,47 @@ You can look at the TAP output to see if you ran into the timeout. Test
 runners which know a test must run under a specific time can then optionally
 treat these timeouts then as fatal.
 
+KCOV for selftests
+==================
+
+Selftests built with `kselftest_harness.h` natively support generating
+KCOV coverage data. See :doc:`KCOV: code coverage for fuzzing </dev-tools/kcov>`
+for prerequisites.
+
+You can specify the output directory with the `KCOV_OUTPUT` environment
+variable. Additionally, you can specify the number of instructions to
+collect with the `KCOV_SLOTS` environment variable ::
+
+  # KCOV_OUTPUT=$PWD/kcov KCOV_SLOTS=$((4096 * 2)) \
+    ./tools/testing/selftests/net/af_unix/scm_inq
+
+In the output directory, a coverage file is generated for each test
+case in the selftest ::
+
+  $ ls kcov/
+  scm_inq.dgram.basic scm_inq.seqpacket.basic scm_inq.stream.basic
+
+The default value of `KCOV_SLOTS` is `4096`, and `KCOV_SLOTS` multiplied
+by `sizeof(unsigned long)` must be multiple of `4096`, so the smallest
+value is `512`.
+
+Both `KCOV_OUTPUT` and `KCOV_SLOTS` can be specified as the variables
+on the `make` command line ::
+
+  # make -C tools/testing/selftests/ \
+    kselftest_override_timeout=60 \
+    KCOV_OUTPUT=$PWD/kcov KCOV_SLOTS=$((4096 * 4)) \
+    TARGETS=net/af_unix run_tests
+
+The collected data can be decoded with `addr2line` ::
+
+  $ cat kcov/* | sort | uniq | addr2line -e vmlinux | grep unix
+  net/unix/af_unix.c:1056
+  net/unix/af_unix.c:3138
+  net/unix/af_unix.c:3834
+  net/unix/af_unix.c:3838
+  ...
+
 Packaging selftests
 ===================
 
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index c46ebdb9b8ef..40e70fb1a347 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -218,12 +218,14 @@ all:
 	done; exit $$ret;
 
 run_tests: all
-	@for TARGET in $(TARGETS); do \
-		BUILD_TARGET=$$BUILD/$$TARGET; \
-		$(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET run_tests \
-				SRC_PATH=$(shell readlink -e $$(pwd)) \
-				OBJ_PATH=$(BUILD) \
-				O=$(abs_objtree); \
+	@for TARGET in $(TARGETS); do \
+		BUILD_TARGET=$$BUILD/$$TARGET; \
+		$(MAKE) OUTPUT=$$BUILD_TARGET \
+			KCOV_OUTPUT=$(abspath $(KCOV_OUTPUT)) \
+			-C $$TARGET run_tests \
+			SRC_PATH=$(shell readlink -e $$(pwd)) \
+			OBJ_PATH=$(BUILD) \
+			O=$(abs_objtree); \
 	done;
 
 hotplug:
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index 3f66e862e83e..5b7a01722981 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -56,6 +56,8 @@
 #include <asm/types.h>
 #include <ctype.h>
 #include <errno.h>
+#include <fcntl.h>
+#include <linux/kcov.h>
 #include <linux/unistd.h>
 #include <poll.h>
 #include <stdbool.h>
@@ -63,7 +65,9 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <sys/ioctl.h>
 #include <sys/mman.h>
+#include <sys/stat.h>
 #include <sys/types.h>
 #include <sys/wait.h>
 #include <unistd.h>
@@ -175,9 +179,12 @@
 		static void test_name(struct __test_metadata *_metadata); \
 	static void wrapper_##test_name( \
 		struct __test_metadata *_metadata, \
-		struct __fixture_variant_metadata __attribute__((unused)) *variant) \
+		struct __fixture_variant_metadata __attribute__((unused)) *variant, \
+		char *test_full_name) \
 	{ \
+		enable_kcov(_metadata); \
 		test_name(_metadata); \
+		disable_kcov(_metadata, test_full_name); \
 	} \
 	static struct __test_metadata _##test_name##_object = \
 		{ .name = #test_name, \
@@ -401,7 +408,8 @@
 		const FIXTURE_VARIANT(fixture_name) *variant); \
 	static void wrapper_##fixture_name##_##test_name( \
 		struct __test_metadata *_metadata, \
-		struct __fixture_variant_metadata *variant) \
+		struct __fixture_variant_metadata *variant, \
+		char *test_full_name) \
 	{ \
 		/* fixture data is alloced, setup, and torn down per call. */ \
 		FIXTURE_DATA(fixture_name) self_private, *self = NULL; \
@@ -430,7 +438,9 @@
 			if (_metadata->exit_code) \
 				_exit(0); \
 			*_metadata->no_teardown = false; \
+			enable_kcov(_metadata); \
 			fixture_name##_##test_name(_metadata, self, variant->data); \
+			disable_kcov(_metadata, test_full_name); \
 			_metadata->teardown_fn(false, _metadata, self, variant->data); \
 			_exit(0); \
 		} else if (child < 0 || child != waitpid(child, &status, 0)) { \
@@ -470,6 +480,8 @@
 		object->teardown_fn = &wrapper_##fixture_name##_##test_name##_teardown; \
 		object->termsig = signal; \
 		object->timeout = tmout; \
+		object->kcov_fd = -1; \
+		object->kcov_slots = -1; \
 		_##fixture_name##_##test_name##_object = object; \
 		__register_test(object); \
 	} \
@@ -908,7 +920,8 @@ __register_fixture_variant(struct __fixture_metadata *f,
 struct __test_metadata {
 	const char *name;
 	void (*fn)(struct __test_metadata *,
-		   struct __fixture_variant_metadata *);
+		   struct __fixture_variant_metadata *,
+		   char *test_name);
 	pid_t pid;	/* pid of test when being run */
 	struct __fixture_metadata *fixture;
 	void (*teardown_fn)(bool in_parent, struct __test_metadata *_metadata,
@@ -923,6 +936,10 @@ struct __test_metadata {
 	const void *variant;
 	struct __test_results *results;
 	struct __test_metadata *prev, *next;
+	int kcov_fd;
+	int kcov_slots;
+	char *kcov_dir;
+	unsigned long *kcov_mem;
 };
 
 static inline bool __test_passed(struct __test_metadata *metadata)
@@ -1185,6 +1202,114 @@ static bool test_enabled(int argc, char **argv,
 	return !has_positive;
 }
 
+#define KCOV_SLOTS 4096
+
+static void enable_kcov(struct __test_metadata *t)
+{
+	char *slots;
+	int err;
+
+	t->kcov_dir = getenv("KCOV_OUTPUT");
+	if (!t->kcov_dir || *t->kcov_dir == '\0')
+		return;
+
+	slots = getenv("KCOV_SLOTS");
+	if (slots && *slots != '\0')
+		sscanf(slots, "%d", &t->kcov_slots);
+	if (t->kcov_slots <= 0)
+		t->kcov_slots = KCOV_SLOTS;
+
+	t->kcov_fd = open("/sys/kernel/debug/kcov", O_RDWR);
+	if (t->kcov_fd < 0) {
+		ksft_print_msg("ERROR OPENING KCOV FD\n");
+		goto err;
+	}
+
+	err = ioctl(t->kcov_fd, KCOV_INIT_TRACE, t->kcov_slots);
+	if (err) {
+		ksft_print_msg("ERROR INITIALISING KCOV\n");
+		goto err;
+	}
+
+	t->kcov_mem = mmap(NULL, sizeof(unsigned long) * t->kcov_slots,
+			   PROT_READ | PROT_WRITE, MAP_SHARED, t->kcov_fd, 0);
+	if ((void *)t->kcov_mem == MAP_FAILED) {
+		ksft_print_msg("ERROR ALLOCATING MEMORY FOR KCOV\n");
+		goto err;
+	}
+
+	err = ioctl(t->kcov_fd, KCOV_ENABLE, KCOV_TRACE_PC);
+	if (err) {
+		ksft_print_msg("ERROR ENABLING KCOV\n");
+		goto err;
+	}
+
+	__atomic_store_n(&t->kcov_mem[0], 0, __ATOMIC_RELAXED);
+	return;
+err:
+	t->exit_code = KSFT_FAIL;
+	_exit(KSFT_FAIL);
+}
+
+static void disable_kcov(struct __test_metadata *t, char *test_name)
+{
+	int slots, err, dir, fd, i;
+
+	if (t->kcov_fd == -1)
+		return;
+
+	slots = __atomic_load_n(&t->kcov_mem[0], __ATOMIC_RELAXED);
+	if (slots == t->kcov_slots - 1)
+		ksft_print_msg("Set KCOV_SLOTS to a value greater than %d\n", t->kcov_slots);
+
+	err = ioctl(t->kcov_fd, KCOV_DISABLE, 0);
+	if (err) {
+		ksft_print_msg("ERROR DISABLING KCOV\n");
+		goto out;
+	}
+
+	err = mkdir(t->kcov_dir, 0755);
+	if (err == -1 && errno != EEXIST) {
+		ksft_print_msg("ERROR CREATING '%s'\n", t->kcov_dir);
+		goto out;
+	}
+	err = 0;
+
+	dir = open(t->kcov_dir, O_DIRECTORY);
+	if (dir < 0) {
+		ksft_print_msg("ERROR OPENING %s\n", t->kcov_dir);
+		err = dir;
+		goto out;
+	}
+
+	fd = openat(dir, test_name, O_RDWR | O_CREAT | O_TRUNC);
+
+	close(dir);
+
+	if (fd == -1) {
+		ksft_print_msg("ERROR CREATING '%s' at '%s'\n", test_name, t->kcov_dir);
+		err = fd;
+		goto out;
+	}
+
+	for (i = 0; i < slots; i++) {
+		char buf[64];
+		int size;
+
+		size = snprintf(buf, 64, "0x%lx\n", t->kcov_mem[i + 1]);
+		write(fd, buf, size);
+	}
+
+out:
+	munmap(t->kcov_mem, sizeof(t->kcov_mem[0]) * t->kcov_slots);
+	close(t->kcov_fd);
+
+	if (err) {
+		t->exit_code = KSFT_FAIL;
+		_exit(KSFT_FAIL);
+	}
+}
+
 static void __run_test(struct __fixture_metadata *f,
 		       struct __fixture_variant_metadata *variant,
 		       struct __test_metadata *t)
@@ -1216,7 +1341,7 @@ static void __run_test(struct __fixture_metadata *f,
 		t->exit_code = KSFT_FAIL;
 	} else if (child == 0) {
 		setpgrp();
-		t->fn(t, variant);
+		t->fn(t, variant, test_name);
 		_exit(t->exit_code);
 	} else {
 		t->pid = child;
-- 
2.51.1.838.g19442a804e-goog
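
The size rule stated in the documentation hunk of this patch (KCOV_SLOTS multiplied by sizeof(unsigned long) must be a multiple of 4096) can be checked with a few lines of arithmetic. The following is only a sketch for readers: the helper names kcov_slots_valid() and kcov_min_slots() and the SKETCH_PAGE_SIZE macro are hypothetical and not part of the patch, and the 4096 value is taken from the patch text rather than queried from the system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4096-byte granularity, the figure used in the patch's documentation. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical helper: true if `slots` words of `word_size` bytes fill whole pages. */
static bool kcov_slots_valid(long slots, size_t word_size)
{
	if (slots <= 0 || word_size == 0)
		return false;
	return ((unsigned long)slots * word_size) % SKETCH_PAGE_SIZE == 0;
}

/* Hypothetical helper: smallest slot count that satisfies the rule. */
static long kcov_min_slots(size_t word_size)
{
	assert(word_size > 0 && SKETCH_PAGE_SIZE % word_size == 0);
	return (long)(SKETCH_PAGE_SIZE / word_size);
}
```

With an 8-byte unsigned long, kcov_min_slots(sizeof(unsigned long)) yields 512, which matches the smallest value quoted in the documentation hunk.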


[PATCH v3] selftests: af_unix: Add tests for ECONNRESET and EOF semantics

by Sunday Adelodun

Add selftests to verify and document Linux’s intended behaviour for
UNIX domain sockets (SOCK_STREAM and SOCK_DGRAM) when a peer closes.
The tests verify that:

 1. SOCK_STREAM returns EOF when the peer closes normally.
 2. SOCK_STREAM returns ECONNRESET if the peer closes with unread data.
 3. SOCK_SEQPACKET returns EOF when the peer closes normally.
 4. SOCK_SEQPACKET returns ECONNRESET if the peer closes with unread data.
 5. SOCK_DGRAM does not return ECONNRESET when the peer closes.

This follows up on review feedback suggesting a selftest to clarify
Linux’s semantics.

Suggested-by: Kuniyuki Iwashima <kuniyu(a)google.com>
Signed-off-by: Sunday Adelodun <adelodunolaoluwa(a)yahoo.com>
---
 tools/testing/selftests/net/af_unix/Makefile  |   1 +
 .../selftests/net/af_unix/unix_connreset.c    | 179 ++++++++++++++++++
 2 files changed, 180 insertions(+)
 create mode 100644 tools/testing/selftests/net/af_unix/unix_connreset.c

diff --git a/tools/testing/selftests/net/af_unix/Makefile b/tools/testing/selftests/net/af_unix/Makefile
index de805cbbdf69..5826a8372451 100644
--- a/tools/testing/selftests/net/af_unix/Makefile
+++ b/tools/testing/selftests/net/af_unix/Makefile
@@ -7,6 +7,7 @@ TEST_GEN_PROGS := \
 	scm_pidfd \
 	scm_rights \
 	unix_connect \
+	unix_connreset \
 # end of TEST_GEN_PROGS
 
 include ../../lib.mk
diff --git a/tools/testing/selftests/net/af_unix/unix_connreset.c b/tools/testing/selftests/net/af_unix/unix_connreset.c
new file mode 100644
index 000000000000..6f43435d96e2
--- /dev/null
+++ b/tools/testing/selftests/net/af_unix/unix_connreset.c
@@ -0,0 +1,179 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Selftest for AF_UNIX socket close and ECONNRESET behaviour.
+ *
+ * This test verifies:
+ * 1. SOCK_STREAM returns EOF when the peer closes normally.
+ * 2. SOCK_STREAM returns ECONNRESET if peer closes with unread data.
+ * 3. SOCK_SEQPACKET returns EOF when the peer closes normally.
+ * 4. SOCK_SEQPACKET returns ECONNRESET if the peer closes with unread data.
+ * 5. SOCK_DGRAM does not return ECONNRESET when the peer closes.
+ *
+ * These tests document the intended Linux behaviour.
+ *
+ */
+
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <errno.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include "../../kselftest_harness.h"
+
+#define SOCK_PATH "/tmp/af_unix_connreset.sock"
+
+static void remove_socket_file(void)
+{
+	unlink(SOCK_PATH);
+}
+
+FIXTURE(unix_sock)
+{
+	int server;
+	int client;
+	int child;
+};
+
+FIXTURE_VARIANT(unix_sock)
+{
+	int socket_type;
+	const char *name;
+};
+
+/* Define variants: stream and datagram */
+FIXTURE_VARIANT_ADD(unix_sock, stream) {
+	.socket_type = SOCK_STREAM,
+	.name = "SOCK_STREAM",
+};
+
+FIXTURE_VARIANT_ADD(unix_sock, dgram) {
+	.socket_type = SOCK_DGRAM,
+	.name = "SOCK_DGRAM",
+};
+
+FIXTURE_VARIANT_ADD(unix_sock, seqpacket) {
+	.socket_type = SOCK_SEQPACKET,
+	.name = "SOCK_SEQPACKET",
+};
+
+FIXTURE_SETUP(unix_sock)
+{
+	struct sockaddr_un addr = {};
+	int err;
+
+	addr.sun_family = AF_UNIX;
+	strcpy(addr.sun_path, SOCK_PATH);
+	remove_socket_file();
+
+	self->server = socket(AF_UNIX, variant->socket_type, 0);
+	ASSERT_LT(-1, self->server);
+
+	err = bind(self->server, (struct sockaddr *)&addr, sizeof(addr));
+	ASSERT_EQ(0, err);
+
+	if (variant->socket_type == SOCK_STREAM ||
+	    variant->socket_type == SOCK_SEQPACKET) {
+		err = listen(self->server, 1);
+		ASSERT_EQ(0, err);
+
+		self->client = socket(AF_UNIX, variant->socket_type, 0);
+		ASSERT_LT(-1, self->client);
+
+		err = connect(self->client, (struct sockaddr *)&addr, sizeof(addr));
+		ASSERT_EQ(0, err);
+
+		self->child = accept(self->server, NULL, NULL);
+		ASSERT_LT(-1, self->child);
+	} else {
+		/* Datagram: bind and connect only */
+		self->client = socket(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0);
+		ASSERT_LT(-1, self->client);
+
+		err = connect(self->client, (struct sockaddr *)&addr, sizeof(addr));
+		ASSERT_EQ(0, err);
+	}
+}
+
+FIXTURE_TEARDOWN(unix_sock)
+{
+	if (variant->socket_type == SOCK_STREAM ||
+	    variant->socket_type == SOCK_SEQPACKET)
+		close(self->child);
+
+	close(self->client);
+	close(self->server);
+	remove_socket_file();
+}
+
+/* Test 1: peer closes normally */
+TEST_F(unix_sock, eof)
+{
+	char buf[16] = {};
+	ssize_t n;
+
+	/* Peer closes normally */
+	if (variant->socket_type == SOCK_STREAM ||
+	    variant->socket_type == SOCK_SEQPACKET)
+		close(self->child);
+	else
+		close(self->server);
+
+	n = recv(self->client, buf, sizeof(buf), 0);
+	TH_LOG("%s: recv=%zd errno=%d (%s)", variant->name, n, errno, strerror(errno));
+	if (variant->socket_type == SOCK_STREAM ||
+	    variant->socket_type == SOCK_SEQPACKET) {
+		ASSERT_EQ(0, n);
+	} else {
+		ASSERT_EQ(-1, n);
+		ASSERT_EQ(EAGAIN, errno);
+	}
+}
+
+/* Test 2: peer closes with unread data */
+TEST_F(unix_sock, reset_unread)
+{
+	char buf[16] = {};
+	ssize_t n;
+
+	/* Send data that will remain unread by client */
+	send(self->client, "hello", 5, 0);
+	close(self->child);
+
+	n = recv(self->client, buf, sizeof(buf), 0);
+	TH_LOG("%s: recv=%zd errno=%d (%s)", variant->name, n, errno, strerror(errno));
+	if (variant->socket_type == SOCK_STREAM ||
+	    variant->socket_type == SOCK_SEQPACKET) {
+		ASSERT_EQ(-1, n);
+		ASSERT_EQ(ECONNRESET, errno);
+	} else {
+		ASSERT_EQ(-1, n);
+		ASSERT_EQ(EAGAIN, errno);
+	}
+}
+
+/* Test 3: SOCK_DGRAM peer close */
+TEST_F(unix_sock, dgram_reset)
+{
+	char buf[16] = {};
+	ssize_t n;
+
+	send(self->client, "hello", 5, 0);
+	close(self->server);
+
+	n = recv(self->client, buf, sizeof(buf), 0);
+	TH_LOG("%s: recv=%zd errno=%d (%s)", variant->name, n, errno, strerror(errno));
+	if (variant->socket_type == SOCK_STREAM ||
+	    variant->socket_type == SOCK_SEQPACKET) {
+		ASSERT_EQ(-1, n);
+		ASSERT_EQ(ECONNRESET, errno);
+	} else {
+		ASSERT_EQ(-1, n);
+		ASSERT_EQ(EAGAIN, errno);
+	}
+}
+
+TEST_HARNESS_MAIN
+
-- 
2.43.0
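
For readers who want to see the EOF versus ECONNRESET distinction outside the selftest harness, here is a minimal sketch using socketpair() instead of the bind/listen/accept sequence in the patch. The helper name recv_after_peer_close() is hypothetical and not part of the patch. It is intended for connection-oriented types (SOCK_STREAM or SOCK_SEQPACKET) only: a blocking recv() on a SOCK_DGRAM pair would not be expected to return after the peer closes, so that case is deliberately left out.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the patch.
 * Creates a connected AF_UNIX pair, optionally queues data on the peer
 * that the peer never reads, closes the peer, then calls recv() on the
 * surviving end.
 * Returns the recv() result (0 means EOF) or a negative errno value.
 */
static int recv_after_peer_close(int type, int leave_unread)
{
	int sv[2];
	char buf[16];
	ssize_t n;
	int ret;

	if (socketpair(AF_UNIX, type, 0, sv) < 0)
		return -errno;

	/* Queue "hello" in sv[1]'s receive queue; sv[1] never reads it. */
	if (leave_unread && send(sv[0], "hello", 5, 0) != 5) {
		ret = -errno;
		close(sv[0]);
		close(sv[1]);
		return ret;
	}

	/* The peer goes away, possibly with unread data still queued. */
	close(sv[1]);

	n = recv(sv[0], buf, sizeof(buf), 0);
	ret = n < 0 ? -errno : (int)n;
	close(sv[0]);
	return ret;
}
```

On Linux, recv_after_peer_close(SOCK_STREAM, 0) is expected to return 0 (EOF), while recv_after_peer_close(SOCK_STREAM, 1) is expected to return -ECONNRESET, which is the behaviour cases 1 and 2 of the commit message describe.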


[PATCH RESEND 0/5] release of KTAP version 2

by Rae Moar

Hi all! I wanted to resend out this series to respark the discussion on
KTAP version 2. Many of the features proposed are already in use by
KUnit. This would add these features to the KTAP documentation. Note
that all the features of KTAP v2 are backwards compatible. Also, today
is my last day at Google so I will be responding with my personal email
afterwards.

--

This patch series represents the final release of KTAP version 2.

There have been open discussions on version 2 for just over 2 years.
This patch series marks the end of KTAP version 2 development and
beginning of the KTAP version 3 development.

The largest component of KTAP version 2 release is the addition of test
metadata to the specification. KTAP metadata could include any test
information that is pertinent for user interaction before or after the
running of the test. For example, the test file path or the test speed.

Example of KTAP Metadata:

 KTAP version 2
 #:ktap_test: main
 #:ktap_arch: uml
 1..1
     KTAP version 2
     #:ktap_test: suite_1
     #:ktap_subsystem: example
     #:ktap_test_file: lib/test.c
     1..2
     ok 1 test_1
     #:ktap_test: test_2
     #:ktap_speed: very_slow
     # test_2 has begun
     #:custom_is_flaky: true
     ok 2 test_2
 # suite_1 has passed
 ok 1 suite_1

The release also includes some formatting fixes and changes to update
the specification to version 2.

Frank Rowand (2):
  ktap_v2: change version to 2-rc in KTAP specification
  ktap_v2: change "version 1" to "version 2" in examples

Rae Moar (3):
  ktap_v2: add test metadata
  ktap_v2: formatting fixes to ktap spec
  ktap_v2: change version to 2 in KTAP specification

 Documentation/dev-tools/ktap.rst | 273 +++++++++++++++++++++++++++++--
 1 file changed, 257 insertions(+), 16 deletions(-)

base-commit: 9de5f847ef8fa205f4fd704a381d32ecb5b66da9
-- 
2.51.2.1041.gc1ab5b90ca-goog
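
To illustrate how a consumer might pick the metadata lines out of the example above, here is a minimal sketch of a line parser. It assumes only what the example shows: a metadata line is optional indentation, then "#:", a key, a colon and a space, and a value. The function name ktap_parse_metadata() is hypothetical, and the sketch is not taken from the KTAP specification or from any existing parser.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split a line such as "    #:ktap_speed: very_slow"
 * into key ("ktap_speed") and value ("very_slow").
 * Returns false for lines that do not start with "#:" after indentation,
 * such as ordinary "# ..." diagnostics or "ok 1 test_1" result lines,
 * and for keys or values that do not fit in the caller's buffers.
 */
static bool ktap_parse_metadata(const char *line, char *key, size_t key_len,
				char *val, size_t val_len)
{
	const char *sep;
	size_t klen, vlen;

	/* Nested tests are indented, so skip leading whitespace. */
	while (*line == ' ' || *line == '\t')
		line++;

	if (strncmp(line, "#:", 2) != 0)
		return false;
	line += 2;

	/* The key ends at the first ": " separator. */
	sep = strstr(line, ": ");
	if (!sep || sep == line)
		return false;

	klen = (size_t)(sep - line);
	sep += 2;
	vlen = strcspn(sep, "\r\n");	/* value runs to end of line */

	if (klen >= key_len || vlen >= val_len)
		return false;

	memcpy(key, line, klen);
	key[klen] = '\0';
	memcpy(val, sep, vlen);
	val[vlen] = '\0';
	return true;
}
```

Because only the "#:" prefix is matched, the same routine accepts both the ktap_ prefixed keys and custom ones such as custom_is_flaky from the example, while plain "# test_2 has begun" diagnostics are rejected.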

