This patch adds commas to clarify sentence structure:
- "To confirm look for" --> "To confirm, look for"
- "If you do remove this file" --> "If you do, remove this file"
Signed-off-by: Brian Ochoa <brianeochoa(a)gmail.com>
---
tools/testing/selftests/firmware/fw_fallback.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/firmware/fw_fallback.sh b/tools/testing/selftests/firmware/fw_fallback.sh
index 70d18be46af5..cd1ff88feb28 100755
--- a/tools/testing/selftests/firmware/fw_fallback.sh
+++ b/tools/testing/selftests/firmware/fw_fallback.sh
@@ -173,13 +173,13 @@ test_syfs_timeout()
echo ""
echo "This might be a distribution udev rule setup by your distribution"
echo "to immediately cancel all fallback requests, this must be"
- echo "removed before running these tests. To confirm look for"
+ echo "removed before running these tests. To confirm, look for"
echo "a firmware rule like /lib/udev/rules.d/50-firmware.rules"
echo "and see if you have something like this:"
echo ""
echo "SUBSYSTEM==\"firmware\", ACTION==\"add\", ATTR{loading}=\"-1\""
echo ""
- echo "If you do remove this file or comment out this line before"
+ echo "If you do, remove this file or comment out this line before"
echo "proceeding with these tests."
exit 1
fi
--
2.34.1
Fix few spelling mistakes in net selftests
Signed-off-by: Chandra Mohan Sundar <chandru.dav(a)gmail.com>
---
tools/testing/selftests/net/fcnal-test.sh | 4 ++--
tools/testing/selftests/net/fdb_flush.sh | 2 +-
tools/testing/selftests/net/fib_nexthops.sh | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/net/fcnal-test.sh b/tools/testing/selftests/net/fcnal-test.sh
index 899dbad0104b..4fcc38907e48 100755
--- a/tools/testing/selftests/net/fcnal-test.sh
+++ b/tools/testing/selftests/net/fcnal-test.sh
@@ -3667,7 +3667,7 @@ ipv6_addr_bind_novrf()
# when it really should not
a=${NSA_LO_IP6}
log_start
- show_hint "Tecnically should fail since address is not on device but kernel allows"
+ show_hint "Technically should fail since address is not on device but kernel allows"
run_cmd nettest -6 -s -l ${a} -I ${NSA_DEV} -t1 -b
log_test_addr ${a} $? 0 "TCP socket bind to out of scope local address"
}
@@ -3724,7 +3724,7 @@ ipv6_addr_bind_vrf()
# passes when it really should not
a=${VRF_IP6}
log_start
- show_hint "Tecnically should fail since address is not on device but kernel allows"
+ show_hint "Technically should fail since address is not on device but kernel allows"
run_cmd nettest -6 -s -l ${a} -I ${NSA_DEV} -t1 -b
log_test_addr ${a} $? 0 "TCP socket bind to VRF address with device bind"
diff --git a/tools/testing/selftests/net/fdb_flush.sh b/tools/testing/selftests/net/fdb_flush.sh
index d5e3abb8658c..9931a1e36e3d 100755
--- a/tools/testing/selftests/net/fdb_flush.sh
+++ b/tools/testing/selftests/net/fdb_flush.sh
@@ -583,7 +583,7 @@ vxlan_test_flush_by_remote_attributes()
$IP link del dev vx10
$IP link add name vx10 type vxlan dstport "$VXPORT" external
- # For multicat FDB entries, the VXLAN driver stores a linked list of
+ # For multicast FDB entries, the VXLAN driver stores a linked list of
# remotes for a given key. Verify that only the expected remotes are
# flushed.
multicast_fdb_entries_add
diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
index 77c83d9508d3..bea1282e0281 100755
--- a/tools/testing/selftests/net/fib_nexthops.sh
+++ b/tools/testing/selftests/net/fib_nexthops.sh
@@ -741,7 +741,7 @@ ipv6_fcnal()
run_cmd "$IP nexthop add id 52 via 2001:db8:92::3"
log_test $? 2 "Create nexthop - gw only"
- # gw is not reachable throught given dev
+ # gw is not reachable through given dev
run_cmd "$IP nexthop add id 53 via 2001:db8:3::3 dev veth1"
log_test $? 2 "Create nexthop - invalid gw+dev combination"
--
2.43.0
The generic_map_lookup_batch currently returns EINTR if it fails with
ENOENT and retries several times on bpf_map_copy_value. The next batch
would start from the same location, presuming it's a transient issue.
This is incorrect if a map can actually have "holes", i.e.
"get_next_key" can return a key that does not point to a valid value. At
least the array of maps type may contain such holes legitly. Right now
these holes show up, generic batch lookup cannot proceed any more. It
will always fail with EINTR errors.
This patch fixes this behavior by skipping the non-existing key, and
does not return EINTR any more.
V2->V3: deleted a unused macro
V1->V2: split the fix and selftests; fixed a few selftests issues.
V2: https://lore.kernel.org/bpf/cover.1738905497.git.yan@cloudflare.com/
V1: https://lore.kernel.org/bpf/Z6OYbS4WqQnmzi2z@debian.debian/
Yan Zhai (2):
bpf: skip non exist keys in generic_map_lookup_batch
selftests: bpf: test batch lookup on array of maps with holes
kernel/bpf/syscall.c | 18 ++----
.../bpf/map_tests/map_in_map_batch_ops.c | 62 +++++++++++++------
2 files changed, 49 insertions(+), 31 deletions(-)
--
2.39.5
Hi all,
This patch series continues the work to migrate the *.sh tests into
prog_tests framework.
test_xdp_redirect_multi.sh tests the XDP redirections done through
bpf_redirect_map().
This is already partly covered by test_xdp_veth.c that already tests
map redirections at XDP level. What isn't covered yet by test_xdp_veth is
the use of the broadcast flags (BPF_F_BROADCAST or BPF_F_EXCLUDE_INGRESS)
and XDP egress programs.
Hence, this patch series add test cases to test_xdp_veth.c to get rid of
the test_xdp_redirect_multi.sh:
- PATCH 1 & 2 Rework test_xdp_veth.c to avoid using the root namespace
- PATCH 3 and 4 cover the broadcast flags
- PATCH 5 covers the XDP egress programs
NOTE: While working on this iteration I ran into a memory leak in
net/core/rtnetlink.c that leads to oom-kill when running ./test_progs in
a loop. This leak has been fixed by commit 1438f5d07b9a ("rtnetlink:
fix netns leak with rtnl_setlink()") in the net tree.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
---
Changes in v5:
- Remove the patches that were applied from previous iteration
- Add PATCH 1 & 2 to avoid using the root namespace so the veth indexes
don't get incremented on every ./test_progs call
- PATCH 3: Remove unnecessary <linux/ip.h> header
- Link to v4: https://lore.kernel.org/r/20250131-redirect-multi-v4-0-970b33678512@bootlin…
Changes in v4:
- Remove the NO_IP #define
- append_tid() takes string's size as input to ensure there is enough
space to fit the thread ID at the end
- Fix PATCH 12's commit log
- Link to v3: https://lore.kernel.org/r/20250128-redirect-multi-v3-0-c1ce69997c01@bootlin…
Changes in v3:
- Add append_tid() helper and use unique names to allow parallel testing
- Check create_network()'s return value through ASSERT_OK()
- Remove check_ping() and unused defines
- Change next_veth type (from string to int)
- Link to v2: https://lore.kernel.org/r/20250121-redirect-multi-v2-0-fc9cacabc6b2@bootlin…
Changes in v2:
- Use serial_test_* to avoid conflict between tests
- Link to v1: https://lore.kernel.org/r/20250121-redirect-multi-v1-0-b215e35ff505@bootlin…
---
Bastien Curutchet (eBPF Foundation) (6):
selftests/bpf: test_xdp_veth: Create struct net_configuration
selftests/bpf: test_xdp_veth: Use a dedicated namespace
selftests/bpf: Optionally select broadcasting flags
selftests/bpf: test_xdp_veth: Add XDP broadcast redirection tests
selftests/bpf: test_xdp_veth: Add XDP program on egress test
selftests/bpf: Remove test_xdp_redirect_multi.sh
tools/testing/selftests/bpf/Makefile | 2 -
.../selftests/bpf/prog_tests/test_xdp_veth.c | 435 ++++++++++++++++++---
.../testing/selftests/bpf/progs/xdp_redirect_map.c | 88 +++++
.../selftests/bpf/progs/xdp_redirect_multi_kern.c | 41 +-
.../selftests/bpf/test_xdp_redirect_multi.sh | 214 ----------
tools/testing/selftests/bpf/xdp_redirect_multi.c | 226 -----------
6 files changed, 491 insertions(+), 515 deletions(-)
---
base-commit: 36ab3d3de536753a4b9b2b4c4ce26af41917a378
change-id: 20250103-redirect-multi-245d6eafb5d1
Best regards,
--
Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
I've removed the RFC tag from this version of the series, but the items
that I'm looking for feedback on remains the same:
- The userspace ABI, in particular:
- The vector length used for the SVE registers, access to the SVE
registers and access to ZA and (if available) ZT0 depending on
the current state of PSTATE.{SM,ZA}.
- The use of a single finalisation for both SVE and SME.
- The addition of control for enabling fine grained traps in a similar
manner to FGU but without the UNDEF, I'm not clear if this is desired
at all and at present this requires symmetric read and write traps like
FGU. That seemed like it might be desired from an implementation
point of view but we already have one case where we enable an
asymmetric trap (for ARM64_WORKAROUND_AMPERE_AC03_CPU_38) and it
seems generally useful to enable asymmetrically.
This series implements support for SME use in non-protected KVM guests.
Much of this is very similar to SVE, the main additional challenge that
SME presents is that it introduces a new vector length similar to the
SVE vector length and two new controls which change the registers seen
by guests:
- PSTATE.ZA enables the ZA matrix register and, if SME2 is supported,
the ZT0 LUT register.
- PSTATE.SM enables streaming mode, a new floating point mode which
uses the SVE register set with the separately configured SME vector
length. In streaming mode implementation of the FFR register is
optional.
It is also permitted to build systems which support SME without SVE, in
this case when not in streaming mode no SVE registers or instructions
are available. Further, there is no requirement that there be any
overlap in the set of vector lengths supported by SVE and SME in a
system, this is expected to be a common situation in practical systems.
Since there is a new vector length to configure we introduce a new
feature parallel to the existing SVE one with a new pseudo register for
the streaming mode vector length. Due to the overlap with SVE caused by
streaming mode rather than finalising SME as a separate feature we use
the existing SVE finalisation to also finalise SME, a new define
KVM_ARM_VCPU_VEC is provided to help make user code clearer. Finalising
SVE and SME separately would introduce complication with register access
since finalising SVE makes the SVE regsiters writeable by userspace and
doing multiple finalisations results in an error being reported.
Dealing with a state where the SVE registers are writeable due to one of
SVE or SME being finalised but may have their VL changed by the other
being finalised seems like needless complexity with minimal practical
utility, it seems clearer to just express directly that only one
finalisation can be done in the ABI.
Access to the floating point registers follows the architecture:
- When both SVE and SME are present:
- If PSTATE.SM == 0 the vector length used for the Z and P registers
is the SVE vector length.
- If PSTATE.SM == 1 the vector length used for the Z and P registers
is the SME vector length.
- If only SME is present:
- If PSTATE.SM == 0 the Z and P registers are inaccessible and the
floating point state accessed via the encodings for the V registers.
- If PSTATE.SM == 1 the vector length used for the Z and P registers
- The SME specific ZA and ZT0 registers are only accessible if SVCR.ZA is 1.
The VMM must understand this, in particular when loading state SVCR
should be configured before other state.
There are a large number of subfeatures for SME, most of which only
offer additional instructions but some of which (SME2 and FA64) add
architectural state. These are configured via the ID registers as per
usual.
The new KVM_ARM_VCPU_VEC feature and ZA and ZT0 registers have not been
added to the get-reg-list selftest, the idea of supporting additional
features there without restructuring the program to generate all
possible feature combinations has been rejected. I will post a separate
series which does that restructuring.
No support is present for protected guests, this is expected to be added
separately, the series is already rather large and pKVM in general
offers a subset of features.
This series is based on Mark Rutland's fix series:
https://lore.kernel.org/r/20250210195226.1215254-1-mark.rutland@arm.com
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Changes in v4:
- Rebase onto v6.14-rc2 and Mark Rutland's fixes.
- Expose SME to nested guests.
- Additional cleanups and test fixes following on from the rebase.
- Link to v3: https://lore.kernel.org/r/20241220-kvm-arm64-sme-v3-0-05b018c1ffeb@kernel.o…
Changes in v3:
- Rebase onto v6.12-rc2.
- Link to v2: https://lore.kernel.org/r/20231222-kvm-arm64-sme-v2-0-da226cb180bb@kernel.o…
Changes in v2:
- Rebase onto v6.7-rc3.
- Configure subfeatures based on host system only.
- Complete nVHE support.
- There was some snafu with sending v1 out, it didn't make it to the
lists but in case it hit people's inboxes I'm sending as v2.
---
Mark Brown (27):
arm64/fpsimd: Update FA64 and ZT0 enables when loading SME state
arm64/fpsimd: Decide to save ZT0 and streaming mode FFR at bind time
arm64/fpsimd: Check enable bit for FA64 when saving EFI state
arm64/fpsimd: Determine maximum virtualisable SME vector length
KVM: arm64: Introduce non-UNDEF FGT control
KVM: arm64: Pay attention to FFR parameter in SVE save and load
KVM: arm64: Pull ctxt_has_ helpers to start of sysreg-sr.h
KVM: arm64: Move SVE state access macros after feature test macros
KVM: arm64: Rename SVE finalization constants to be more general
KVM: arm64: Document the KVM ABI for SME
KVM: arm64: Define internal features for SME
KVM: arm64: Rename sve_state_reg_region
KVM: arm64: Store vector lengths in an array
KVM: arm64: Implement SME vector length configuration
KVM: arm64: Support SME control registers
KVM: arm64: Support TPIDR2_EL0
KVM: arm64: Support SME identification registers for guests
KVM: arm64: Support SME priority registers
KVM: arm64: Provide assembly for SME state restore
KVM: arm64: Support userspace access to streaming mode Z and P registers
KVM: arm64: Expose SME specific state to userspace
KVM: arm64: Context switch SME state for normal guests
KVM: arm64: Handle SME exceptions
KVM: arm64: Expose SME to nested guests
KVM: arm64: Provide interface for configuring and enabling SME for guests
KVM: arm64: selftests: Add SME system registers to get-reg-list
KVM: arm64: selftests: Add SME to set_id_regs test
Documentation/virt/kvm/api.rst | 117 +++++++---
arch/arm64/include/asm/fpsimd.h | 22 ++
arch/arm64/include/asm/kvm_emulate.h | 12 +-
arch/arm64/include/asm/kvm_host.h | 143 ++++++++++---
arch/arm64/include/asm/kvm_hyp.h | 4 +-
arch/arm64/include/asm/kvm_pkvm.h | 2 +-
arch/arm64/include/asm/vncr_mapping.h | 2 +
arch/arm64/include/uapi/asm/kvm.h | 33 +++
arch/arm64/kernel/cpufeature.c | 2 -
arch/arm64/kernel/fpsimd.c | 86 ++++----
arch/arm64/kvm/arm.c | 10 +
arch/arm64/kvm/fpsimd.c | 19 +-
arch/arm64/kvm/guest.c | 262 ++++++++++++++++++++---
arch/arm64/kvm/handle_exit.c | 14 ++
arch/arm64/kvm/hyp/fpsimd.S | 18 +-
arch/arm64/kvm/hyp/include/hyp/switch.h | 141 ++++++++++--
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 77 ++++---
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 9 +-
arch/arm64/kvm/hyp/nvhe/pkvm.c | 4 +-
arch/arm64/kvm/hyp/nvhe/switch.c | 11 +-
arch/arm64/kvm/hyp/vhe/switch.c | 21 +-
arch/arm64/kvm/nested.c | 3 +-
arch/arm64/kvm/reset.c | 154 +++++++++----
arch/arm64/kvm/sys_regs.c | 118 +++++++++-
include/uapi/linux/kvm.h | 1 +
tools/testing/selftests/kvm/arm64/get-reg-list.c | 32 ++-
tools/testing/selftests/kvm/arm64/set_id_regs.c | 29 ++-
27 files changed, 1078 insertions(+), 268 deletions(-)
---
base-commit: 6a25088d268ce4c2163142ead7fe1975bb687cb7
change-id: 20230301-kvm-arm64-sme-06a1246d3636
prerequisite-message-id: 20250210195226.1215254-1-mark.rutland(a)arm.com
prerequisite-patch-id: 615ab9c526e9f6242bd5b8d7188e96fb66fb0e64
prerequisite-patch-id: e5c4f2ff9c9ba01a0f659dd1e8bf6396de46e197
prerequisite-patch-id: 0794d28526755180847841c045a6b7cb3d800c16
prerequisite-patch-id: 079d3a8a680f793b593268eeba000acc55a0b4ec
prerequisite-patch-id: a3428f67a5ee49f2b01208f30b57984d5409d8f5
prerequisite-patch-id: 26393e401e9eae7cff5bb1d3bdb18b4e29ffc8fe
prerequisite-patch-id: 64f9819f751da4a1c73b9d1b292ccee6afda89f6
prerequisite-patch-id: 0355baaa8ceb31dc85d015b56084c33416f78041
Best regards,
--
Mark Brown <broonie(a)kernel.org>
From: Tejun Heo <tj(a)kernel.org>
[ Upstream commit e9fe182772dcb2630964724fd93e9c90b68ea0fd ]
dsp_local_on has several incorrect assumptions, one of which is that
p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task
is scheduled out while migration is disabled - p->cpus_ptr is temporarily
overridden to the previous CPU while p->nr_cpus_allowed remains unchanged.
This led to sporadic test faliures when dsp_local_on_dispatch() tries to put
a migration disabled task to a different CPU. Fix it by keeping the previous
CPU when migration is disabled.
There are SCX schedulers that make use of p->nr_cpus_allowed. They should
also implement explicit handling for p->migration_disabled.
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Reported-by: Ihor Solodrai <ihor.solodrai(a)pm.me>
Cc: Andrea Righi <arighi(a)nvidia.com>
Cc: Changwoo Min <changwoo(a)igalia.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/sched_ext/dsp_local_on.bpf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
index c9a2da0575a0f..eea06decb6f59 100644
--- a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
+++ b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
@@ -43,7 +43,7 @@ void BPF_STRUCT_OPS(dsp_local_on_dispatch, s32 cpu, struct task_struct *prev)
if (!p)
return;
- if (p->nr_cpus_allowed == nr_cpus)
+ if (p->nr_cpus_allowed == nr_cpus && !p->migration_disabled)
target = bpf_get_prandom_u32() % nr_cpus;
else
target = scx_bpf_task_cpu(p);
--
2.39.5
From: Tejun Heo <tj(a)kernel.org>
[ Upstream commit e9fe182772dcb2630964724fd93e9c90b68ea0fd ]
dsp_local_on has several incorrect assumptions, one of which is that
p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task
is scheduled out while migration is disabled - p->cpus_ptr is temporarily
overridden to the previous CPU while p->nr_cpus_allowed remains unchanged.
This led to sporadic test faliures when dsp_local_on_dispatch() tries to put
a migration disabled task to a different CPU. Fix it by keeping the previous
CPU when migration is disabled.
There are SCX schedulers that make use of p->nr_cpus_allowed. They should
also implement explicit handling for p->migration_disabled.
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Reported-by: Ihor Solodrai <ihor.solodrai(a)pm.me>
Cc: Andrea Righi <arighi(a)nvidia.com>
Cc: Changwoo Min <changwoo(a)igalia.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/sched_ext/dsp_local_on.bpf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
index fbda6bf546712..758b479bd1ee1 100644
--- a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
+++ b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
@@ -43,7 +43,7 @@ void BPF_STRUCT_OPS(dsp_local_on_dispatch, s32 cpu, struct task_struct *prev)
if (!p)
return;
- if (p->nr_cpus_allowed == nr_cpus)
+ if (p->nr_cpus_allowed == nr_cpus && !p->migration_disabled)
target = bpf_get_prandom_u32() % nr_cpus;
else
target = scx_bpf_task_cpu(p);
--
2.39.5
The quiet infrastructure was moved out of Makefile.build to accomidate
the new syscall table generation scripts in perf. Syscall table
generation wanted to also be able to be quiet, so instead of again
copying the code to set the quiet variables, the code was moved into
Makefile.perf to be used globally. This was not the right solution. It
should have been moved even further upwards in the call chain.
Makefile.include is imported in many files so this seems like a proper
place to put it.
To:
Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com>
---
Charlie Jenkins (2):
tools: Unify top-level quiet infrastructure
tools: Remove redundant quiet setup
tools/arch/arm64/tools/Makefile | 6 -----
tools/bpf/Makefile | 6 -----
tools/bpf/bpftool/Documentation/Makefile | 6 -----
tools/bpf/bpftool/Makefile | 6 -----
tools/bpf/resolve_btfids/Makefile | 2 --
tools/bpf/runqslower/Makefile | 5 +---
tools/build/Makefile | 8 +-----
tools/lib/bpf/Makefile | 13 ----------
tools/lib/perf/Makefile | 13 ----------
tools/lib/thermal/Makefile | 13 ----------
tools/objtool/Makefile | 6 -----
tools/perf/Makefile.perf | 41 -------------------------------
tools/scripts/Makefile.include | 31 ++++++++++++++++++++++-
tools/testing/selftests/bpf/Makefile.docs | 6 -----
tools/testing/selftests/hid/Makefile | 2 --
tools/thermal/lib/Makefile | 13 ----------
tools/tracing/latency/Makefile | 6 -----
tools/tracing/rtla/Makefile | 6 -----
tools/verification/rv/Makefile | 6 -----
19 files changed, 32 insertions(+), 163 deletions(-)
---
base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b
change-id: 20250203-quiet_tools-9a6ea9d65a19
--
- Charlie
Hi Matthew,
Can you please take a look at Patch 1 and let me know if the new xarray
function looks good to you? I will send __filemap_add_folio() and
shmem_split_large_entry() changes separately.
Hi all,
This patchset adds a new buddy allocator like (or non-uniform) large folio
split from a order-n folio to order-m with m < n. It reduces
1. the total number of after-split folios from 2^(n-m) to n-m+1;
2. the amount of memory needed for multi-index xarray split from 2^(n/6-m/6) to
n/6-m/6, assuming XA_CHUNK_SHIFT=6;
3. keep more large folios after a split from all order-m folios to
order-(n-1) to order-m folios.
For example, to split an order-9 to order-0, folio split generates 10
(or 11 for anonymous memory) folios instead of 512, allocates 1 xa_node
instead of 8, and leaves 1 order-8, 1 order-7, ..., 1 order-1 and 2 order-0
folios (or 4 order-0 for anonymous memory) instead of 512 order-0 folios.
It is on top of mm-everything-2025-02-07-05-27 with V6 reverted. It is ready to
be merged.
Instead of duplicating existing split_huge_page*() code, __folio_split()
is introduced as the shared backend code for both
split_huge_page_to_list_to_order() and folio_split(). __folio_split()
can support both uniform split and buddy allocator like (or non-uniform) split.
All existing split_huge_page*() users can be gradually converted to use
folio_split() if possible. In this patchset, I converted
truncate_inode_partial_folio() to use folio_split().
xfstests quick group passed for both tmpfs and xfs.
Changelog
===
From V6[8]:
1. Added an xarray function xas_try_split() to support iterative folio split,
removing the need of using xas_split_alloc() and xas_split(). The
function guarantees that at most one xa_node is allocated for each
call.
2. Added concrete numbers of after-split folios and xa_node savings to
cover letter, commit log. (per Andrew)
From V5[7]:
1. Split shmem to any lower order patches are in mm tree, so dropped
from this series.
2. Rename split_folio_at() to try_folio_split() to clarify that
non-uniform split will not be used if it is not supported.
From V4[6]:
1. Enabled shmem support in both uniform and buddy allocator like split
and added selftests for it.
2. Added functions to check if uniform split and buddy allocator like
split are supported for the given folio and order.
3. Made truncate fall back to uniform split if buddy allocator split is
not supported (CONFIG_READ_ONLY_THP_FOR_FS and FS without large folio).
4. Added the missing folio_clear_has_hwpoisoned() to
__split_unmapped_folio().
From V3[5]:
1. Used xas_split_alloc(GFP_NOWAIT) instead of xas_nomem(), since extra
operations inside xas_split_alloc() are needed for correctness.
2. Enabled folio_split() for shmem and no issue was found with xfstests
quick test group.
3. Split both ends of a truncate range in truncate_inode_partial_folio()
to avoid wasting memory in shmem truncate (per David Hildenbrand).
4. Removed page_in_folio_offset() since page_folio() does the same
thing.
5. Finished truncate related tests from xfstests quick test group on XFS and
tmpfs without issues.
6. Disabled buddy allocator like split on CONFIG_READ_ONLY_THP_FOR_FS
and FS without large folio. This check was missed in the prior
versions.
From V2[3]:
1. Incorporated all the feedback from Kirill[4].
2. Used GFP_NOWAIT for xas_nomem().
3. Tested the code path when xas_nomem() fails.
4. Added selftests for folio_split().
5. Fixed no THP config build error.
From V1[2]:
1. Split the original patch 1 into multiple ones for easy review (per
Kirill).
2. Added xas_destroy() to avoid memory leak.
3. Fixed nr_dropped not used error (per kernel test robot).
4. Added proper error handling when xas_nomem() fails to allocate memory
for xas_split() during buddy allocator like split.
From RFC[1]:
1. Merged backend code of split_huge_page_to_list_to_order() and
folio_split(). The same code is used for both uniform split and buddy
allocator like split.
2. Use xas_nomem() instead of xas_split_alloc() for folio_split().
3. folio_split() now leaves the first after-split folio unlocked,
instead of the one containing the given page, since
the caller of truncate_inode_partial_folio() locks and unlocks the
first folio.
4. Extended split_huge_page debugfs to use folio_split().
5. Added truncate_inode_partial_folio() as first user of folio_split().
Design
===
folio_split() splits a large folio in the same way as buddy allocator
splits a large free page for allocation. The purpose is to minimize the
number of folios after the split. For example, if user wants to free the
3rd subpage in a order-9 folio, folio_split() will split the order-9 folio
as:
O-0, O-0, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8 if it is anon
O-1, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-9 if it is pagecache
Since anon folio does not support order-1 yet.
The split process is similar to existing approach:
1. Unmap all page mappings (split PMD mappings if exist);
2. Split meta data like memcg, page owner, page alloc tag;
3. Copy meta data in struct folio to sub pages, but instead of spliting
the whole folio into multiple smaller ones with the same order in a
shot, this approach splits the folio iteratively. Taking the example
above, this approach first splits the original order-9 into two order-8,
then splits left part of order-8 to two order-7 and so on;
4. Post-process split folios, like write mapping->i_pages for pagecache,
adjust folio refcounts, add split folios to corresponding list;
5. Remap split folios
6. Unlock split folios.
__split_unmapped_folio() and __split_folio_to_order() replace
__split_huge_page() and __split_huge_page_tail() respectively.
__split_unmapped_folio() uses different approaches to perform
uniform split and buddy allocator like split:
1. uniform split: one single call to __split_folio_to_order() is used to
uniformly split the given folio. All resulting folios are put back to
the list after split. The folio containing the given page is left to
caller to unlock and others are unlocked.
2. buddy allocator like (or non-uniform) split: (old_order - new_order) calls
to __split_folio_to_order() are used to split the given folio at order N to
order N-1. After each call, the target folio is changed to the one
containing the page, which is given as a folio_split() parameter.
After each call, folios not containing the page are put back to the list.
The folio containing the page is put back to the list when its order
is new_order. All folios are unlocked except the first folio, which
is left to caller to unlock.
Patch Overview
===
1. Patch 1 added a new xarray function xas_try_split() to perform
iterative xarray split.
2. Patch 2 added __split_unmapped_folio() and __split_folio_to_order() to
prepare for moving to new backend split code.
3. Patch 3 moved common code in split_huge_page_to_list_to_order() to
__folio_split().
4. Patch 4 added new folio_split() and made
split_huge_page_to_list_to_order() share the new
__split_unmapped_folio() with folio_split().
5. Patch 5 removed no longer used __split_huge_page() and
__split_huge_page_tail().
6. Patch 6 added a new in_folio_offset to split_huge_page debugfs for
folio_split() test.
7. Patch 7 used try_folio_split() for truncate operation.
8. Patch 8 added folio_split() tests.
Any comments and/or suggestions are welcome. Thanks.
[1] https://lore.kernel.org/linux-mm/20241008223748.555845-1-ziy@nvidia.com/
[2] https://lore.kernel.org/linux-mm/20241028180932.1319265-1-ziy@nvidia.com/
[3] https://lore.kernel.org/linux-mm/20241101150357.1752726-1-ziy@nvidia.com/
[4] https://lore.kernel.org/linux-mm/e6ppwz5t4p4kvir6eqzoto4y5fmdjdxdyvxvtw43nc…
[5] https://lore.kernel.org/linux-mm/20241205001839.2582020-1-ziy@nvidia.com/
[6] https://lore.kernel.org/linux-mm/20250106165513.104899-1-ziy@nvidia.com/
[7] https://lore.kernel.org/linux-mm/20250116211042.741543-1-ziy@nvidia.com/
[8] https://lore.kernel.org/linux-mm/20250205031417.1771278-1-ziy@nvidia.com/
Zi Yan (8):
xarray: add xas_try_split() to split a multi-index entry.
mm/huge_memory: add two new (not yet used) functions for folio_split()
mm/huge_memory: move folio split common code to __folio_split()
mm/huge_memory: add buddy allocator like (non-uniform) folio_split()
mm/huge_memory: remove the old, unused __split_huge_page()
mm/huge_memory: add folio_split() to debugfs testing interface.
mm/truncate: use buddy allocator like folio split for truncate
operation.
selftests/mm: add tests for folio_split(), buddy allocator like split.
Documentation/core-api/xarray.rst | 14 +-
include/linux/huge_mm.h | 36 +
include/linux/xarray.h | 7 +
lib/test_xarray.c | 47 ++
lib/xarray.c | 136 +++-
mm/huge_memory.c | 751 ++++++++++++------
mm/truncate.c | 31 +-
tools/testing/radix-tree/Makefile | 1 +
.../selftests/mm/split_huge_page_test.c | 34 +-
9 files changed, 772 insertions(+), 285 deletions(-)
--
2.47.2
Hi all,
After much delay, v6 of the KUnit/Rust integration patchset is here.
This change incorporates most of Miguels suggestions from v5 (save for
some of the copyright headers I wasn't comfortable unilaterally
changing). This means the documentation is much improved, and it should
work more cleanly on Rust 1.83 and 1.84, no longer requiring
static_mut_refs or const_mut_refs. (I'm not 100% sure I understand all
of the details of this, but I'm comfortable enough with how it's ended
up.)
This has been rebased against 6.14-rc1/rust-next, and should be able to
comfortably go in via either the KUnit or Rust trees. My suspicion is
that there's more likely to be conflicts with the Rust work (due to the
changes in rust/macros/lib.rs) than with KUnit, where there are no
current patches which would break the API, so maybe it makes the most
sense for it to go in via Rust for 6.15.
This series was originally written by José Expósito, and has been
modified and updated by Matt Gilbride, Miguel Ojeda, and myself. The
original version can be found here:
https://github.com/Rust-for-Linux/linux/pull/950
Add support for writing KUnit tests in Rust. While Rust doctests are
already converted to KUnit tests and run, they're really better suited
for examples, rather than as first-class unit tests.
This series implements a series of direct Rust bindings for KUnit tests,
as well as a new macro which allows KUnit tests to be written using a
close variant of normal Rust unit test syntax. The only change required
is replacing '#[cfg(test)]' with '#[kunit_tests(kunit_test_suite_name)]'
An example test would look like:
#[kunit_tests(rust_kernel_hid_driver)]
mod tests {
use super::*;
use crate::{c_str, driver, hid, prelude::*};
use core::ptr;
struct SimpleTestDriver;
impl Driver for SimpleTestDriver {
type Data = ();
}
#[test]
fn rust_test_hid_driver_adapter() {
let mut hid = bindings::hid_driver::default();
let name = c_str!("SimpleTestDriver");
static MODULE: ThisModule = unsafe { ThisModule::from_ptr(ptr::null_mut()) };
let res = unsafe {
<hid::Adapter<SimpleTestDriver> as driver::DriverOps>::register(&mut hid, name, &MODULE)
};
assert_eq!(res, Err(ENODEV)); // The mock returns -19
}
}
Please give this a go, and make sure I haven't broken it! There's almost
certainly a lot of improvements which can be made -- and there's a fair
case to be made for replacing some of this with generated C code which
can use the C macros -- but this is hopefully an adequate implementation
for now, and the interface can (with luck) remain the same even if the
implementation changes.
A few small notable missing features:
- Attributes (like the speed of a test) are hardcoded to the default
value.
- Similarly, the module name attribute is hardcoded to NULL. In C, we
use the KBUILD_MODNAME macro, but I couldn't find a way to use this
from Rust which wasn't more ugly than just disabling it.
- Assertions are not automatically rewritten to use KUnit assertions.
---
Changes since v5:
https://lore.kernel.org/all/20241213081035.2069066-1-davidgow@google.com/
- Rebased against 6.14-rc1
- Fixed a bunch of warnings / clippy lints introduced in Rust 1.83 and
1.84.
- No longer needs static_mut_refs / const_mut_refs, and is much cleaned
up as a result. (Thanks, Miguel)
- Major documentation and example fixes. (Thanks, Miguel)
Changes since v4:
https://lore.kernel.org/linux-kselftest/20241101064505.3820737-1-davidgow@g…
- Rebased against 6.13-rc1
- Allowed an unused_unsafe warning after the behaviour of addr_of_mut!()
changed in Rust 1.82. (Thanks Boqun, Miguel)
- "Expect" that the sample assert_eq!(1+1, 2) produces a clippy warning
due to a redundant assertion. (Thanks Boqun, Miguel)
- Fix some missing safety comments, and remove some unneeded 'unsafe'
blocks. (Thanks Boqun)
- Fix a couple of minor rustfmt issues which were triggering checkpatch
warnings.
Changes since v3:
https://lore.kernel.org/linux-kselftest/20241030045719.3085147-2-davidgow@g…
- The kunit_unsafe_test_suite!() macro now panic!s if the suite name is
too long, triggering a compile error. (Thanks, Alice!)
- The #[kunit_tests()] macro now preserves span information, so
errors can be better reported. (Thanks, Boqun!)
- The example tests have been updated to no longer use assert_eq!() with
a constant bool argument (which triggered a clippy warning now we
have the span info).
Changes since v2:
https://lore.kernel.org/linux-kselftest/20241029092422.2884505-1-davidgow@g…
- Include missing rust/macros/kunit.rs file from v2. (Thanks Boqun!)
- The kunit_unsafe_test_suite!() macro will truncate the name of the
suite if it is too long. (Thanks Alice!)
- The proc macro now emits an error if the suite name is too long.
- We no longer needlessly use UnsafeCell<> in
kunit_unsafe_test_suite!(). (Thanks Alice!)
Changes since v1:
https://lore.kernel.org/lkml/20230720-rustbind-v1-0-c80db349e3b5@google.com…
- Rebase on top of the latest rust-next (commit 718c4069896c)
- Make kunit_case a const fn, rather than a macro (Thanks Boqun)
- As a result, the null terminator is now created with
kernel::kunit::kunit_case_null()
- Use the C kunit_get_current_test() function to implement
in_kunit_test(), rather than re-implementing it (less efficiently)
ourselves.
Changes since the GitHub PR:
- Rebased on top of kselftest/kunit
- Add const_mut_refs feature
This may conflict with https://lore.kernel.org/lkml/20230503090708.2524310-6-nmi@metaspace.dk/
- Add rust/macros/kunit.rs to the KUnit MAINTAINERS entry
---
José Expósito (3):
rust: kunit: add KUnit case and suite macros
rust: macros: add macro to easily run KUnit tests
rust: kunit: allow to know if we are in a test
MAINTAINERS | 1 +
rust/kernel/kunit.rs | 199 +++++++++++++++++++++++++++++++++++++++++++
rust/macros/kunit.rs | 161 ++++++++++++++++++++++++++++++++++
rust/macros/lib.rs | 29 +++++++
4 files changed, 390 insertions(+)
create mode 100644 rust/macros/kunit.rs
--
2.48.1.601.g30ceb7b040-goog
Series deals with one more case of vsock surprising BPF/sockmap by being
inconsistency about (having an) assigned transport.
KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127]
CPU: 7 UID: 0 PID: 56 Comm: kworker/7:0 Not tainted 6.14.0-rc1+
Workqueue: vsock-loopback vsock_loopback_work
RIP: 0010:vsock_read_skb+0x4b/0x90
Call Trace:
sk_psock_verdict_data_ready+0xa4/0x2e0
virtio_transport_recv_pkt+0x1ca8/0x2acc
vsock_loopback_work+0x27d/0x3f0
process_one_work+0x846/0x1420
worker_thread+0x5b3/0xf80
kthread+0x35a/0x700
ret_from_fork+0x2d/0x70
ret_from_fork_asm+0x1a/0x30
This bug, similarly to commit f6abafcd32f9 ("vsock/bpf: return early if
transport is not assigned"), could be fixed with a single NULL check. But
instead, let's explore another approach: take a hint from
vsock_bpf_update_proto() and teach sockmap to accept only vsocks that are
already connected (no risk of transport being dropped or reassigned). At
the same time straight reject the listeners (vsock listening sockets do not
carry any transport anyway). This way BPF does not have to worry about
vsk->transport becoming NULL.
Signed-off-by: Michal Luczaj <mhal(a)rbox.co>
---
Michal Luczaj (4):
sockmap, vsock: For connectible sockets allow only connected
vsock/bpf: Warn on socket without transport
selftest/bpf: Adapt vsock_delete_on_close to sockmap rejecting unconnected
selftest/bpf: Add vsock test for sockmap rejecting unconnected
net/core/sock_map.c | 3 +
net/vmw_vsock/af_vsock.c | 3 +
net/vmw_vsock/vsock_bpf.c | 2 +-
.../selftests/bpf/prog_tests/sockmap_basic.c | 70 ++++++++++++++++------
4 files changed, 59 insertions(+), 19 deletions(-)
---
base-commit: 9c01a177c2e4b55d2bcce8a1f6bdd1d46a8320e3
change-id: 20250210-vsock-listen-sockmap-nullptr-e6e82ca79611
Best regards,
--
Michal Luczaj <mhal(a)rbox.co>
Add PKEY_UNRESTRICTED macro to mman.h and use it in selftests.
For context, this change will also allow for more consistent update of the
Glibc manual which in turn will help with introducing memory protection
keys on AArch64 targets.
Applies to 5bc55a333a2f (tag: v6.13-rc7).
Note that I couldn't build ppc tests so I would appreciate if someone
could check the 3rd patch. Thank you!
Signed-off-by: Yury Khrustalev <yury.khrustalev(a)arm.com>
---
Changes in v4:
- Removed change to tools/include/uapi/asm-generic/mman-common.h as it is not
necessary.
Link to v3: https://lore.kernel.org/all/20241028090715.509527-1-yury.khrustalev@arm.com/
Changes in v3:
- Replaced previously missed 0-s tools/testing/selftests/mm/mseal_test.c
- Replaced previously missed 0-s in tools/testing/selftests/mm/mseal_test.c
Link to v2: https://lore.kernel.org/linux-arch/20241027170006.464252-2-yury.khrustalev@…
Changes in v2:
- Update tools/include/uapi/asm-generic/mman-common.h as well
- Add usages of the new macro to selftests.
Link to v1: https://lore.kernel.org/linux-arch/20241022120128.359652-1-yury.khrustalev@…
---
Yury Khrustalev (3):
mm/pkey: Add PKEY_UNRESTRICTED macro
selftests/mm: Use PKEY_UNRESTRICTED macro
selftests/powerpc: Use PKEY_UNRESTRICTED macro
include/uapi/asm-generic/mman-common.h | 1 +
tools/testing/selftests/mm/mseal_test.c | 6 +++---
tools/testing/selftests/mm/pkey-helpers.h | 3 ++-
tools/testing/selftests/mm/pkey_sighandler_tests.c | 4 ++--
tools/testing/selftests/mm/protection_keys.c | 2 +-
tools/testing/selftests/powerpc/include/pkeys.h | 2 +-
tools/testing/selftests/powerpc/mm/pkey_exec_prot.c | 2 +-
tools/testing/selftests/powerpc/mm/pkey_siginfo.c | 2 +-
tools/testing/selftests/powerpc/ptrace/core-pkey.c | 6 +++---
tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c | 6 +++---
10 files changed, 18 insertions(+), 16 deletions(-)
--
2.39.5
kunit_skip() and kunit_mark_skipped() can only be passed a pointer
to a struct kunit, not struct kunit_suite (only kunit_log() actually
supports both). Rename their first argument accordingly.
Signed-off-by: Kevin Brodsky <kevin.brodsky(a)arm.com>
---
Cc: Brendan Higgins <brendan.higgins(a)linux.dev>
Cc: David Gow <davidgow(a)google.com>
Cc: Rae Moar <rmoar(a)google.com>
Cc: linux-kselftest(a)vger.kernel.org
---
include/kunit/test.h | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/include/kunit/test.h b/include/kunit/test.h
index 58dbab60f853..0ffb97c78566 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -553,9 +553,9 @@ void kunit_cleanup(struct kunit *test);
void __printf(2, 3) kunit_log_append(struct string_stream *log, const char *fmt, ...);
/**
- * kunit_mark_skipped() - Marks @test_or_suite as skipped
+ * kunit_mark_skipped() - Marks @test as skipped
*
- * @test_or_suite: The test context object.
+ * @test: The test context object.
* @fmt: A printk() style format string.
*
* Marks the test as skipped. @fmt is given output as the test status
@@ -563,18 +563,18 @@ void __printf(2, 3) kunit_log_append(struct string_stream *log, const char *fmt,
*
* Test execution continues after kunit_mark_skipped() is called.
*/
-#define kunit_mark_skipped(test_or_suite, fmt, ...) \
+#define kunit_mark_skipped(test, fmt, ...) \
do { \
- WRITE_ONCE((test_or_suite)->status, KUNIT_SKIPPED); \
- scnprintf((test_or_suite)->status_comment, \
+ WRITE_ONCE((test)->status, KUNIT_SKIPPED); \
+ scnprintf((test)->status_comment, \
KUNIT_STATUS_COMMENT_SIZE, \
fmt, ##__VA_ARGS__); \
} while (0)
/**
- * kunit_skip() - Marks @test_or_suite as skipped
+ * kunit_skip() - Marks @test as skipped
*
- * @test_or_suite: The test context object.
+ * @test: The test context object.
* @fmt: A printk() style format string.
*
* Skips the test. @fmt is given output as the test status
@@ -582,10 +582,10 @@ void __printf(2, 3) kunit_log_append(struct string_stream *log, const char *fmt,
*
* Test execution is halted after kunit_skip() is called.
*/
-#define kunit_skip(test_or_suite, fmt, ...) \
+#define kunit_skip(test, fmt, ...) \
do { \
- kunit_mark_skipped((test_or_suite), fmt, ##__VA_ARGS__);\
- kunit_try_catch_throw(&((test_or_suite)->try_catch)); \
+ kunit_mark_skipped((test), fmt, ##__VA_ARGS__); \
+ kunit_try_catch_throw(&((test)->try_catch)); \
} while (0)
/*
base-commit: 0ad2507d5d93f39619fc42372c347d6006b64319
--
2.47.0
Following a similar rationale as commit e4835f1da425f ("kunit: tool:
Build compile_commands.json"), make a common developer tool available by
default for KUnit users.
Compared to compile_commands.json, there is a little more work to be
done to build the GDB scripts. Is it enough to affect development cycle
duration? Unscientific evaluation:
rm -rf .kunit; time tools/testing/kunit/kunit.py build --kunitconfig ./lib/kunit/.kunitconfig --jobs 96
Without this patch it took 14.77s, with this patch it took 14.83. So,
although `make scripts_gdb` is pretty slow, presumably most of that is
just the overhead of running Kbuild at all, actually building the
scripts is approximately free.
Note also, to actually get the GDB scripts the user needs to enable
CONFIG_SCRIPTS_GDB, but building the scripts_gdb target without that is
still harmless.
Signed-off-by: Brendan Jackman <jackmanb(a)google.com>
---
tools/testing/kunit/kunit_kernel.py | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index e76d7894b6c5195ece49f0d8c7ac35130df428a9..33b5f7351cbb5d0be240cb52db2bc1fa94aeb75e 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -72,8 +72,8 @@ class LinuxSourceTreeOperations:
raise ConfigError(e.output.decode())
def make(self, jobs: int, build_dir: str, make_options: Optional[List[str]]) -> None:
- command = ['make', 'all', 'compile_commands.json', 'ARCH=' + self._linux_arch,
- 'O=' + build_dir, '--jobs=' + str(jobs)]
+ command = ['make', 'all', 'compile_commands.json', 'scripts_gdb',
+ 'ARCH=' + self._linux_arch, 'O=' + build_dir, '--jobs=' + str(jobs)]
if make_options:
command.extend(make_options)
if self._cross_compile:
---
base-commit: 521d60e196ecb215f425e04e9ab33e02beaffbe3
change-id: 20250121-kunit-gdb-b27315b4f2d8
Best regards,
--
Brendan Jackman <jackmanb(a)google.com>
Greetings:
Welcome to v8. Minor change, see changelog below. Re-tested on my mlx5
system both with and without CONFIG_XDP_SOCKETS enabled and both with
and without NETIF set.
This is an attempt to followup on something Jakub asked me about [1],
adding an xsk attribute to queues and more clearly documenting which
queues are linked to NAPIs...
After the RFC [2], Jakub suggested creating an empty nest for queues
which have a pool, so I've adjusted this version to work that way.
The nest can be extended in the future to express attributes about XSK
as needed. Queues which are not used for AF_XDP do not have the xsk
attribute present.
I've run the included test on:
- my mlx5 machine (via NETIF=)
- without setting NETIF
And the test seems to pass in both cases.
Thanks,
Joe
[1]: https://lore.kernel.org/netdev/20250113143109.60afa59a@kernel.org/
[2]: https://lore.kernel.org/netdev/20250129172431.65773-1-jdamato@fastly.com/
v8:
- Update the Makefile in patch 3 to use TEST_GEN_FILES instead of
TEST_GET_PROGS.
- Fix a codespell complaint in xdp_helper.c.
v7: https://lore.kernel.org/netdev/20250213192336.42156-1-jdamato@fastly.com/
- Added CONFIG_XDP_SOCKETS to selftests/driver/net/config as suggested
by Stanislav.
- Updated xdp_helper.c to return -1 for AF_XDP non-existence, but 1
for other failures.
- Updated queues.py to mark test as skipped if AF_XDP does not exist.
v6: https://lore.kernel.org/bpf/20250210193903.16235-1-jdamato@fastly.com/
- Added ifdefs for CONFIG_XDP_SOCKETS in patch 2 as Stanislav
suggested.
v5: https://lore.kernel.org/bpf/20250208041248.111118-1-jdamato@fastly.com/
- Removed unused ret variable from patch 2 as Simon suggested.
v4: https://lore.kernel.org/lkml/20250207030916.32751-1-jdamato@fastly.com/
- Add patch 1, as suggested by Jakub, which adds an empty nest helper.
- Use the helper in patch 2, which makes the code cleaner and prevents
a possible bug.
v3: https://lore.kernel.org/netdev/20250204191108.161046-1-jdamato@fastly.com/
- Change comment format in patch 2 to avoid kdoc warnings. No other
changes.
v2: https://lore.kernel.org/all/20250203185828.19334-1-jdamato@fastly.com/
- Switched from RFC to actual submission now that net-next is open
- Adjusted patch 1 to include an empty nest as suggested by Jakub
- Adjusted patch 2 to update the test based on changes to patch 1, and
to incorporate some Python feedback from Jakub :)
rfc: https://lore.kernel.org/netdev/20250129172431.65773-1-jdamato@fastly.com/
Joe Damato (3):
netlink: Add nla_put_empty_nest helper
netdev-genl: Add an XSK attribute to queues
selftests: drv-net: Test queue xsk attribute
Documentation/netlink/specs/netdev.yaml | 13 ++-
include/net/netlink.h | 15 +++
include/uapi/linux/netdev.h | 6 ++
net/core/netdev-genl.c | 12 +++
tools/include/uapi/linux/netdev.h | 6 ++
.../testing/selftests/drivers/net/.gitignore | 2 +
tools/testing/selftests/drivers/net/Makefile | 3 +
tools/testing/selftests/drivers/net/config | 1 +
tools/testing/selftests/drivers/net/queues.py | 42 +++++++-
.../selftests/drivers/net/xdp_helper.c | 98 +++++++++++++++++++
10 files changed, 194 insertions(+), 4 deletions(-)
create mode 100644 tools/testing/selftests/drivers/net/.gitignore
create mode 100644 tools/testing/selftests/drivers/net/xdp_helper.c
base-commit: 7a7e0197133d18cfd9931e7d3a842d0f5730223f
--
2.43.0
Currently the mm selftests refuse to run if we don't have huge page
support but there are plenty of tests that don't depend on this feature,
relax this requirement to allow coverage on relevant systems (eg, most
32 bit arm ones).
While doing this I noticed a bug with an existing check if we're running
THP tests, the fix overlaps with the above change so is sent as part of
a series.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
Mark Brown (2):
selftests/mm: Fix check for running THP tests
selftests/mm: Allow tests to run with no huge pages support
tools/testing/selftests/mm/run_vmtests.sh | 68 +++++++++++++++++++------------
1 file changed, 43 insertions(+), 25 deletions(-)
---
base-commit: a64dcfb451e254085a7daee5fe51bf22959d52d3
change-id: 20250211-kselftest-mm-no-hugepages-ee5917a170eb
Best regards,
--
Mark Brown <broonie(a)kernel.org>
A task in the kernel (task_mm_cid_work) runs somewhat periodically to
compact the mm_cid for each process. Add a test to validate that it runs
correctly and timely.
The test spawns 1 thread pinned to each CPU, then each thread, including
the main one, runs in short bursts for some time. During this period, the
mm_cids should be spanning all numbers between 0 and nproc.
At the end of this phase, a thread with high enough mm_cid (>= nproc/2)
is selected to be the new leader, all other threads terminate.
After some time, the only running thread should see 0 as mm_cid, if that
doesn't happen, the compaction mechanism didn't work and the test fails.
Since mm_cid compaction is less likely for tasks running in short
bursts, we increase the likelihood by just running a busy loop at every
iteration. This compaction is a best effort work and this behaviour is
currently acceptable.
The test never fails if only 1 core is available, in which case, we
cannot test anything as the only available mm_cid is 0.
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Signed-off-by: Gabriele Monaco <gmonaco(a)redhat.com>
---
tools/testing/selftests/rseq/.gitignore | 1 +
tools/testing/selftests/rseq/Makefile | 2 +-
.../selftests/rseq/mm_cid_compaction_test.c | 208 ++++++++++++++++++
3 files changed, 210 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/rseq/mm_cid_compaction_test.c
diff --git a/tools/testing/selftests/rseq/.gitignore b/tools/testing/selftests/rseq/.gitignore
index 16496de5f6ce4..2c89f97e4f737 100644
--- a/tools/testing/selftests/rseq/.gitignore
+++ b/tools/testing/selftests/rseq/.gitignore
@@ -3,6 +3,7 @@ basic_percpu_ops_test
basic_percpu_ops_mm_cid_test
basic_test
basic_rseq_op_test
+mm_cid_compaction_test
param_test
param_test_benchmark
param_test_compare_twice
diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile
index 5a3432fceb586..ce1b38f46a355 100644
--- a/tools/testing/selftests/rseq/Makefile
+++ b/tools/testing/selftests/rseq/Makefile
@@ -16,7 +16,7 @@ OVERRIDE_TARGETS = 1
TEST_GEN_PROGS = basic_test basic_percpu_ops_test basic_percpu_ops_mm_cid_test param_test \
param_test_benchmark param_test_compare_twice param_test_mm_cid \
- param_test_mm_cid_benchmark param_test_mm_cid_compare_twice
+ param_test_mm_cid_benchmark param_test_mm_cid_compare_twice mm_cid_compaction_test
TEST_GEN_PROGS_EXTENDED = librseq.so
diff --git a/tools/testing/selftests/rseq/mm_cid_compaction_test.c b/tools/testing/selftests/rseq/mm_cid_compaction_test.c
new file mode 100644
index 0000000000000..8808500466d02
--- /dev/null
+++ b/tools/testing/selftests/rseq/mm_cid_compaction_test.c
@@ -0,0 +1,208 @@
+// SPDX-License-Identifier: LGPL-2.1
+#define _GNU_SOURCE
+#include <assert.h>
+#include <pthread.h>
+#include <sched.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stddef.h>
+
+#include "../kselftest.h"
+#include "rseq.h"
+
+#define VERBOSE 0
+#define printf_verbose(fmt, ...) \
+ do { \
+ if (VERBOSE) \
+ printf(fmt, ##__VA_ARGS__); \
+ } while (0)
+
+/* 0.5 s */
+#define RUNNER_PERIOD 500000
+/* Number of runs before we terminate or get the token */
+#define THREAD_RUNS 5
+
+/*
+ * Number of times we check that the mm_cid were compacted.
+ * Checks are repeated every RUNNER_PERIOD.
+ */
+#define MM_CID_COMPACT_TIMEOUT 10
+
+struct thread_args {
+ int cpu;
+ int num_cpus;
+ pthread_mutex_t *token;
+ pthread_barrier_t *barrier;
+ pthread_t *tinfo;
+ struct thread_args *args_head;
+};
+
+static void __noreturn *thread_runner(void *arg)
+{
+ struct thread_args *args = arg;
+ int i, ret, curr_mm_cid;
+ cpu_set_t cpumask;
+
+ CPU_ZERO(&cpumask);
+ CPU_SET(args->cpu, &cpumask);
+ ret = pthread_setaffinity_np(pthread_self(), sizeof(cpumask), &cpumask);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to set affinity");
+ abort();
+ }
+ pthread_barrier_wait(args->barrier);
+
+ for (i = 0; i < THREAD_RUNS; i++)
+ usleep(RUNNER_PERIOD);
+ curr_mm_cid = rseq_current_mm_cid();
+ /*
+ * We select one thread with high enough mm_cid to be the new leader.
+ * All other threads (including the main thread) will terminate.
+ * After some time, the mm_cid of the only remaining thread should
+ * converge to 0, if not, the test fails.
+ */
+ if (curr_mm_cid >= args->num_cpus / 2 &&
+ !pthread_mutex_trylock(args->token)) {
+ printf_verbose(
+ "cpu%d has mm_cid=%d and will be the new leader.\n",
+ sched_getcpu(), curr_mm_cid);
+ for (i = 0; i < args->num_cpus; i++) {
+ if (args->tinfo[i] == pthread_self())
+ continue;
+ ret = pthread_join(args->tinfo[i], NULL);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to join thread");
+ abort();
+ }
+ }
+ pthread_barrier_destroy(args->barrier);
+ free(args->tinfo);
+ free(args->token);
+ free(args->barrier);
+ free(args->args_head);
+
+ for (i = 0; i < MM_CID_COMPACT_TIMEOUT; i++) {
+ curr_mm_cid = rseq_current_mm_cid();
+ printf_verbose("run %d: mm_cid=%d on cpu%d.\n", i,
+ curr_mm_cid, sched_getcpu());
+ if (curr_mm_cid == 0)
+ exit(EXIT_SUCCESS);
+ /*
+ * Currently mm_cid compaction is less likely for tasks
+ * running in short bursts: increase likelihood by just
+ * running for some time doing nothing.
+ */
+ for (int j = 0; j < 0xffff; j++)
+ for (int k = 0; k < 0xffff; k++)
+ asm("");
+ usleep(RUNNER_PERIOD);
+ }
+ exit(EXIT_FAILURE);
+ }
+ printf_verbose("cpu%d has mm_cid=%d and is going to terminate.\n",
+ sched_getcpu(), curr_mm_cid);
+ pthread_exit(NULL);
+}
+
+int test_mm_cid_compaction(void)
+{
+ cpu_set_t affinity;
+ int i, j, ret = 0, num_threads;
+ pthread_t *tinfo;
+ pthread_mutex_t *token;
+ pthread_barrier_t *barrier;
+ struct thread_args *args;
+
+ sched_getaffinity(0, sizeof(affinity), &affinity);
+ num_threads = CPU_COUNT(&affinity);
+ tinfo = calloc(num_threads, sizeof(*tinfo));
+ if (!tinfo) {
+ perror("Error: failed to allocate tinfo");
+ return -1;
+ }
+ args = calloc(num_threads, sizeof(*args));
+ if (!args) {
+ perror("Error: failed to allocate args");
+ ret = -1;
+ goto out_free_tinfo;
+ }
+ token = malloc(sizeof(*token));
+ if (!token) {
+ perror("Error: failed to allocate token");
+ ret = -1;
+ goto out_free_args;
+ }
+ barrier = malloc(sizeof(*barrier));
+ if (!barrier) {
+ perror("Error: failed to allocate barrier");
+ ret = -1;
+ goto out_free_token;
+ }
+ if (num_threads == 1) {
+ fprintf(stderr, "Cannot test on a single cpu. "
+ "Skipping mm_cid_compaction test.\n");
+ /* only skipping the test, this is not a failure */
+ goto out_free_barrier;
+ }
+ pthread_mutex_init(token, NULL);
+ ret = pthread_barrier_init(barrier, NULL, num_threads);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to initialise barrier");
+ goto out_free_barrier;
+ }
+ for (i = 0, j = 0; i < CPU_SETSIZE && j < num_threads; i++) {
+ if (!CPU_ISSET(i, &affinity))
+ continue;
+ args[j].num_cpus = num_threads;
+ args[j].tinfo = tinfo;
+ args[j].token = token;
+ args[j].barrier = barrier;
+ args[j].cpu = i;
+ args[j].args_head = args;
+ if (!j) {
+ /* The first thread is the main one */
+ tinfo[0] = pthread_self();
+ ++j;
+ continue;
+ }
+ ret = pthread_create(&tinfo[j], NULL, thread_runner, &args[j]);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to create thread");
+ abort();
+ }
+ ++j;
+ }
+ printf_verbose("Started %d threads.\n", num_threads);
+
+ /* Also main thread will terminate if it is not selected as leader */
+ thread_runner(&args[0]);
+
+ /* only reached in case of errors */
+out_free_barrier:
+ free(barrier);
+out_free_token:
+ free(token);
+out_free_args:
+ free(args);
+out_free_tinfo:
+ free(tinfo);
+
+ return ret;
+}
+
+int main(int argc, char **argv)
+{
+ if (!rseq_mm_cid_available()) {
+ fprintf(stderr, "Error: rseq_mm_cid unavailable\n");
+ return -1;
+ }
+ if (test_mm_cid_compaction())
+ return -1;
+ return 0;
+}
--
2.48.1
This series is a follow-up to [1], which adds mTHP support to khugepaged.
mTHP khugepaged support was necessary for the global="defer" and
mTHP="inherit" case (and others) to make sense.
We've seen cases were customers switching from RHEL7 to RHEL8 see a
significant increase in the memory footprint for the same workloads.
Through our investigations we found that a large contributing factor to
the increase in RSS was an increase in THP usage.
For workloads like MySQL, or when using allocators like jemalloc, it is
often recommended to set /transparent_hugepages/enabled=never. This is
in part due to performance degradations and increased memory waste.
This series introduces enabled=defer, this setting acts as a middle
ground between always and madvise. If the mapping is MADV_HUGEPAGE, the
page fault handler will act normally, making a hugepage if possible. If
the allocation is not MADV_HUGEPAGE, then the page fault handler will
default to the base size allocation. The caveat is that khugepaged can
still operate on pages thats not MADV_HUGEPAGE.
This allows for two things... one, applications specifically designed to
use hugepages will get them, and two, applications that don't use
hugepages can still benefit from them without aggressively inserting
THPs at every possible chance. This curbs the memory waste, and defers
the use of hugepages to khugepaged. Khugepaged can then scan the memory
for eligible collapsing.
Admins may want to lower max_ptes_none, if not, khugepaged may
aggressively collapse single allocations into hugepages.
TESTING:
- Built for x86_64, aarch64, ppc64le, and s390x
- selftests mm
- In [1] I provided a script [2] that has multiple access patterns
- lots of general use. These changes have been running in my VM for some time
- redis testing. This test was my original case for the defer mode. What I was
able to prove was that THP=always leads to increased max_latency cases; hence
why it is recommended to disable THPs for redis servers. However with 'defer'
we dont have the max_latency spikes and can still get the system to utilize
THPs. I further tested this with the mTHP defer setting and found that redis
(and probably other jmalloc users) can utilize THPs via defer (+mTHP defer)
without a large latency penalty and some potential gains.
I uploaded some mmtest results here [3] which compares:
stock+thp=never
stock+(m)thp=always
khugepaged-mthp + defer (max_ptes_none=64)
The results show that (m)THPs can cause some throughput regression in some
cases, but also has gains in other cases. The mTHP+defer results have more
gains and less losses over the (m)THP=always case.
V2 Changes:
- base changes on mTHP khugepaged support
- Fix selftests parsing issue
- add mTHP defer option
- add mTHP defer Documentation
[1] - https://lkml.org/lkml/2025/2/10/1982
[2] - https://gitlab.com/npache/khugepaged_mthp_test
[3] - https://people.redhat.com/npache/mthp_khugepaged_defer/testoutput2/output.h…
Nico Pache (5):
mm: defer THP insertion to khugepaged
mm: document transparent_hugepage=defer usage
selftests: mm: add defer to thp setting parser
khugepaged: add defer option to mTHP options
mm: document mTHP defer setting
Documentation/admin-guide/mm/transhuge.rst | 40 ++++++++++---
include/linux/huge_mm.h | 18 +++++-
mm/huge_memory.c | 69 +++++++++++++++++++---
mm/khugepaged.c | 10 ++--
tools/testing/selftests/mm/thp_settings.c | 1 +
tools/testing/selftests/mm/thp_settings.h | 1 +
6 files changed, 115 insertions(+), 24 deletions(-)
--
2.48.1
damos_quota_goal.py selftest see if DAMOS quota goals tuning feature
increases or reduces the effective size quota for given score as
expected. The tuning feature sets the minimum quota size as one byte,
so if the effective size quota is already one, we cannot expect it
further be reduced. However the test is not aware of the edge case, and
fails since it shown no expected change of the effective quota. Handle
the case by updating the failure logic for no change to see if it was
the case, and simply skips to next test input.
Fixes: f1c07c0a1662b ("selftests/damon: add a test for DAMOS quota goal")
Cc: <stable(a)vger.kernel.org> # 6.10.x
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Closes: https://lore.kernel.org/oe-lkp/202502171423.b28a918d-lkp@intel.com
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
tools/testing/selftests/damon/damos_quota_goal.py | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/damon/damos_quota_goal.py b/tools/testing/selftests/damon/damos_quota_goal.py
index 18246f3b62f7..f76e0412b564 100755
--- a/tools/testing/selftests/damon/damos_quota_goal.py
+++ b/tools/testing/selftests/damon/damos_quota_goal.py
@@ -63,6 +63,9 @@ def main():
if last_effective_bytes != 0 else -1.0))
if last_effective_bytes == goal.effective_bytes:
+ # effective quota was already minimum that cannot be more reduced
+ if expect_increase is False and last_effective_bytes == 1:
+ continue
print('efective bytes not changed: %d' % goal.effective_bytes)
exit(1)
base-commit: 20017459916819f8ae15ca3840e71fbf0ea8354e
--
2.39.5
Here is a couple of patches to fix issues. I think mount_options.tc's one
is a real bug(I'm not sure how it worked), but another one is an enhancement
for (my) execution environment.
Anyway, those should go through kselftests tree.
Thank you,
---
Masami Hiramatsu (Google) (2):
selftests/ftrace: Fix to use remount when testing mount GID option
selftests/ftrace: Make uprobe test more robust against binary name
.../ftrace/test.d/00basic/mount_options.tc | 8 ++++----
.../ftrace/test.d/dynevent/add_remove_uprobe.tc | 4 +++-
2 files changed, 7 insertions(+), 5 deletions(-)
--
Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
On Sat, Feb 15, 2025 at 02:52:22PM -0500, Tamir Duberstein wrote:
> On Sat, Feb 15, 2025 at 1:51 PM kernel test robot <lkp(a)intel.com> wrote:
> I am not able to reproduce these warnings with clang 19.1.7. They also
> don't obviously make sense to me.
Please, when reply, remove boielrplate stuff!
I have just wasted a couple of minutes to understand what's going on in the
message that is 2700 lines of text as the reply to the bot message which was
~700 lines.
--
With Best Regards,
Andy Shevchenko
PTRACE_SET_SYSCALL_INFO is a generic ptrace API that complements
PTRACE_GET_SYSCALL_INFO by letting the ptracer modify details of
system calls the tracee is blocked in.
This API allows ptracers to obtain and modify system call details in a
straightforward and architecture-agnostic way, providing a consistent way
of manipulating the system call number and arguments across architectures.
As in case of PTRACE_GET_SYSCALL_INFO, PTRACE_SET_SYSCALL_INFO also
does not aim to address numerous architecture-specific system call ABI
peculiarities, like differences in the number of system call arguments
for such system calls as pread64 and preadv.
The current implementation supports changing only those bits of system call
information that are used by strace system call tampering, namely, syscall
number, syscall arguments, and syscall return value.
Support of changing additional details returned by PTRACE_GET_SYSCALL_INFO,
such as instruction pointer and stack pointer, could be added later if
needed, by using struct ptrace_syscall_info.flags to specify the additional
details that should be set. Currently, "flags" and "reserved" fields of
struct ptrace_syscall_info must be initialized with zeroes; "arch",
"instruction_pointer", and "stack_pointer" fields are currently ignored.
PTRACE_SET_SYSCALL_INFO currently supports only PTRACE_SYSCALL_INFO_ENTRY,
PTRACE_SYSCALL_INFO_EXIT, and PTRACE_SYSCALL_INFO_SECCOMP operations.
Other operations could be added later if needed.
Ideally, PTRACE_SET_SYSCALL_INFO should have been introduced along with
PTRACE_GET_SYSCALL_INFO, but it didn't happen. The last straw that
convinced me to implement PTRACE_SET_SYSCALL_INFO was apparent failure
to provide an API of changing the first system call argument on riscv
architecture [1].
ptrace(2) man page:
long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data);
...
PTRACE_SET_SYSCALL_INFO
Modify information about the system call that caused the stop.
The "data" argument is a pointer to struct ptrace_syscall_info
that specifies the system call information to be set.
The "addr" argument should be set to sizeof(struct ptrace_syscall_info)).
[1] https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
Notes:
v6:
* mips: Submit mips_get_syscall_arg() o32 fix via mips tree
to get it merged into v6.14-rc3
* Rebase to v6.14-rc3
* v5: https://lore.kernel.org/all/20250210113336.GA887@strace.io/
v5:
* ptrace: Extend the commit message to say that the new API does not aim
to address numerous architecture-specific syscall ABI peculiarities
* selftests: Add a workaround for s390 16-bit syscall numbers
* Add more Acked-by
* v4: https://lore.kernel.org/all/20250203065849.GA14120@strace.io/
v4:
* Split out syscall_set_return_value() for hexagon into a separate patch
* s390: Change the style of syscall_set_arguments() implementation as
requested
* Add more Reviewed-by
* v3: https://lore.kernel.org/all/20250128091445.GA8257@strace.io/
v3:
* powerpc: Submit syscall_set_return_value() fix for "sc" case separately
* mips: Do not introduce erroneous argument truncation on mips n32,
add a detailed description to the commit message of the
mips_get_syscall_arg() change
* ptrace: Add explicit padding to the end of struct ptrace_syscall_info,
simplify obtaining of user ptrace_syscall_info,
do not introduce PTRACE_SYSCALL_INFO_SIZE_VER0
* ptrace: Change the return type of ptrace_set_syscall_info_* functions
from "unsigned long" to "int"
* ptrace: Add -ERANGE check to ptrace_set_syscall_info_exit(),
add comments to -ERANGE checks
* ptrace: Update comments about supported syscall stops
* selftests: Extend set_syscall_info test, fix for mips n32
* Add Tested-by and Reviewed-by
v2:
* Add patch to fix syscall_set_return_value() on powerpc
* Add patch to fix mips_get_syscall_arg() on mips
* Add syscall_set_return_value() implementation on hexagon
* Add syscall_set_return_value() invocation to syscall_set_nr()
on arm and arm64.
* Fix syscall_set_nr() and mips_set_syscall_arg() on mips
* Add a comment to syscall_set_nr() on arc, powerpc, s390, sh,
and sparc
* Remove redundant ptrace_syscall_info.op assignments in
ptrace_get_syscall_info_*
* Minor style tweaks in ptrace_get_syscall_info_op()
* Remove syscall_set_return_value() invocation from
ptrace_set_syscall_info_entry()
* Skip syscall_set_arguments() invocation in case of syscall number -1
in ptrace_set_syscall_info_entry()
* Split ptrace_syscall_info.reserved into ptrace_syscall_info.reserved
and ptrace_syscall_info.flags
* Use __kernel_ulong_t instead of unsigned long in set_syscall_info test
Dmitry V. Levin (6):
hexagon: add syscall_set_return_value()
syscall.h: add syscall_set_arguments()
syscall.h: introduce syscall_set_nr()
ptrace_get_syscall_info: factor out ptrace_get_syscall_info_op
ptrace: introduce PTRACE_SET_SYSCALL_INFO request
selftests/ptrace: add a test case for PTRACE_SET_SYSCALL_INFO
arch/arc/include/asm/syscall.h | 25 +
arch/arm/include/asm/syscall.h | 37 ++
arch/arm64/include/asm/syscall.h | 29 +
arch/csky/include/asm/syscall.h | 13 +
arch/hexagon/include/asm/syscall.h | 21 +
arch/loongarch/include/asm/syscall.h | 15 +
arch/m68k/include/asm/syscall.h | 7 +
arch/microblaze/include/asm/syscall.h | 7 +
arch/mips/include/asm/syscall.h | 46 ++
arch/nios2/include/asm/syscall.h | 16 +
arch/openrisc/include/asm/syscall.h | 13 +
arch/parisc/include/asm/syscall.h | 19 +
arch/powerpc/include/asm/syscall.h | 20 +
arch/riscv/include/asm/syscall.h | 16 +
arch/s390/include/asm/syscall.h | 21 +
arch/sh/include/asm/syscall_32.h | 24 +
arch/sparc/include/asm/syscall.h | 22 +
arch/um/include/asm/syscall-generic.h | 19 +
arch/x86/include/asm/syscall.h | 43 ++
arch/xtensa/include/asm/syscall.h | 18 +
include/asm-generic/syscall.h | 30 +
include/uapi/linux/ptrace.h | 7 +-
kernel/ptrace.c | 179 +++++-
tools/testing/selftests/ptrace/Makefile | 2 +-
.../selftests/ptrace/set_syscall_info.c | 519 ++++++++++++++++++
25 files changed, 1141 insertions(+), 27 deletions(-)
create mode 100644 tools/testing/selftests/ptrace/set_syscall_info.c
base-commit: 0ad2507d5d93f39619fc42372c347d6006b64319
--
ldv
If you wish to utilise a pidfd interface to refer to the current process or
thread it is rather cumbersome, requiring something like:
int pidfd = pidfd_open(getpid(), 0 or PIDFD_THREAD);
...
close(pidfd);
Or the equivalent call opening /proc/self. It is more convenient to use a
sentinel value to indicate to an interface that accepts a pidfd that we
simply wish to refer to the current process thread.
This series introduces sentinels for this purposes which can be passed as
the pidfd in this instance rather than having to establish a dummy fd for
this purpose.
It is useful to refer to both the current thread from the userland's
perspective for which we use PIDFD_SELF, and the current process from the
userland's perspective, for which we use PIDFD_SELF_PROCESS.
There is unfortunately some confusion between the kernel and userland as to
what constitutes a process - a thread from the userland perspective is a
process in userland, and a userland process is a thread group (more
specifically the thread group leader from the kernel perspective). We
therefore alias things thusly:
* PIDFD_SELF_THREAD aliased by PIDFD_SELF - use PIDTYPE_PID.
* PIDFD_SELF_THREAD_GROUP alised by PIDFD_SELF_PROCESS - use PIDTYPE_TGID.
In all of the kernel code we refer to PIDFD_SELF_THREAD and
PIDFD_SELF_THREAD_GROUP. However we expect users to use PIDFD_SELF and
PIDFD_SELF_PROCESS.
This matters for cases where, for instance, a user unshare()'s FDs or does
thread-specific signal handling and where the user would be hugely confused
if the FDs referenced or signal processed referred to the thread group
leader rather than the individual thread.
For now we only adjust pidfd_get_task() and the pidfd_send_signal() system
call with specific handling for this, implementing this functionality for
process_madvise(), process_mrelease() (albeit, using it here wouldn't
really make sense) and pidfd_send_signal().
We defer making further changes, as this would require a significant rework
of the pidfd mechanism.
The motivating case here is to support PIDFD_SELF in process_madvise(), so
this suffices for immediate uses. Moving forward, this can be further
expanded to other uses.
v7:
* Reworked implementation according to Christian's requirements. We now
only support process_madvise() and pidfd_send_signal() system calls with
PIDFD_SELF as specified.
* Updated tests to account for broken pidfd_open_test.c implementation.
* Fixed missing includes in pidfd self tests.
* Removed tests relating to functionality no longer supported.
* Update guard pages test to use PIDFD_SELF.
v6:
* Avoid static inline in UAPI header as suggested by Pedro.
* Place PIDFD_SELF values out of range of errors and any other sentinel as
suggested by Pedro.
https://lore.kernel.org/linux-mm/cover.1729926229.git.lorenzo.stoakes@oracl…
v5:
* Fixup self test dependencies on pidfd/pidfd.h.
https://lore.kernel.org/linux-mm/cover.1729848252.git.lorenzo.stoakes@oracl…
v4:
* Avoid returning an fd in the __pidfd_get_pid() function as pointed out by
Christian, instead simply always pin the pid and maintain fd scope in the
helper alone.
* Add wrapper header file in tools/include/linux to allow for import of
UAPI pidfd.h header without encountering the collision between system
fcntl.h and linux/fcntl.h as discussed with Shuah and John.
* Fixup tests to import the UAPI pidfd.h header working around conflicts
between system fcntl.h and linux/fcntl.h which the UAPI pidfd.h imports,
as reported by Shuah.
* Use an int for pidfd_is_self_sentinel() to avoid any dependency on
stdbool.h in userland.
https://lore.kernel.org/linux-mm/cover.1729198898.git.lorenzo.stoakes@oracl…
v3:
* Do not fput() an invalid fd as reported by kernel test bot.
* Fix unintended churn from moving variable declaration.
https://lore.kernel.org/linux-mm/cover.1729073310.git.lorenzo.stoakes@oracl…
v2:
* Fix tests as reported by Shuah.
* Correct RFC version lore link.
https://lore.kernel.org/linux-mm/cover.1728643714.git.lorenzo.stoakes@oracl…
Non-RFC v1:
* Removed RFC tag - there seems to be general consensus that this change is
a good idea, but perhaps some debate to be had on implementation. It
seems sensible then to move forward with the RFC flag removed.
* Introduced PIDFD_SELF_THREAD, PIDFD_SELF_THREAD_GROUP and their aliases
PIDFD_SELF and PIDFD_SELF_PROCESS respectively.
* Updated testing accordingly.
https://lore.kernel.org/linux-mm/cover.1728578231.git.lorenzo.stoakes@oracl…
RFC version:
https://lore.kernel.org/linux-mm/cover.1727644404.git.lorenzo.stoakes@oracl…
Lorenzo Stoakes (6):
pidfd: add PIDFD_SELF* sentinels to refer to own thread/process
selftests/pidfd: add missing system header imcludes to pidfd tests
tools: testing: separate out wait_for_pid() into helper header
selftests: pidfd: add pidfd.h UAPI wrapper
selftests: pidfd: add tests for PIDFD_SELF_*
selftests/mm: use PIDFD_SELF in guard pages test
include/uapi/linux/pidfd.h | 24 ++++
kernel/pid.c | 24 +++-
kernel/signal.c | 106 +++++++++++-------
tools/include/linux/pidfd.h | 14 +++
tools/testing/selftests/cgroup/test_kill.c | 2 +-
tools/testing/selftests/mm/Makefile | 4 +
tools/testing/selftests/mm/guard-pages.c | 15 +--
.../pid_namespace/regression_enomem.c | 2 +-
tools/testing/selftests/pidfd/Makefile | 3 +-
tools/testing/selftests/pidfd/pidfd.h | 28 +----
.../selftests/pidfd/pidfd_fdinfo_test.c | 1 +
tools/testing/selftests/pidfd/pidfd_helpers.h | 39 +++++++
.../testing/selftests/pidfd/pidfd_open_test.c | 6 +-
.../selftests/pidfd/pidfd_setns_test.c | 1 +
tools/testing/selftests/pidfd/pidfd_test.c | 76 +++++++++++--
15 files changed, 242 insertions(+), 103 deletions(-)
create mode 100644 tools/include/linux/pidfd.h
create mode 100644 tools/testing/selftests/pidfd/pidfd_helpers.h
--
2.48.1
Hello,
kernel test robot noticed "kernel-selftests.lib.prime_numbers.sh.fail" on:
commit: 313b38a6ecb46db4002925af91b64df4f2b76d1f ("lib/prime_numbers: convert self-test to KUnit")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
[test failed on linux-next/master 0ae0fa3bf0b44c8611d114a9f69985bf451010c3]
in testcase: kernel-selftests
version: kernel-selftests-x86_64-78a632a2086c-1_20250215
with following parameters:
group: lib
config: x86_64-rhel-9.4-kselftests
compiler: gcc-12
test machine: 4 threads Intel(R) Xeon(R) CPU E3-1225 v5 @ 3.30GHz (Skylake) with 16G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang(a)intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202502171110.708d965a-lkp@intel.com
# timeout set to 300
# selftests: lib: prime_numbers.sh
# Warning: file prime_numbers.sh is missing!
not ok 3 selftests: lib: prime_numbers.sh
(missed the change for tools/testing/selftests/lib/Makefile?)
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250217/202502171110.708d965a-lkp@…
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Hi,
This series carries forward the effort to add Kselftest for PCI Endpoint
Subsystem started by Aman Gupta [1] a while ago. I reworked the initial version
based on another patch that fixes the return values of IOCTLs in
pci_endpoint_test driver and did many cleanups. Since the resulting work
modified the initial version substantially, I took over the authorship.
This series also incorporates the review comment by Shuah Khan [2] to move the
existing tests from 'tools/pci' to 'tools/testing/kselftest/pci_endpoint' before
migrating to Kselftest framework. I made sure that the tests are executable in
each commit and updated documentation accordingly.
- Mani
[1] https://lore.kernel.org/linux-pci/20221007053934.5188-1-aman1.gupta@samsung…
[2] https://lore.kernel.org/linux-pci/b2a5db97-dc59-33ab-71cd-f591e0b1b34d@linu…
Changes in v6:
* Fixed the documentation to pass max MSI and MSI-X count to configfs
* Collected tags
Changes in v5:
* Incorporated comments from Niklas
* Added a patch to fix the DMA MEMCPY check in pci-epf-test driver
* Collected tags
* Rebased on top of pci/next 0333f56dbbf7ef6bb46d2906766c3e1b2a04a94d
Changes in v4:
* Dropped the BAR fix patches and submitted them separately:
https://lore.kernel.org/linux-pci/20241231130224.38206-1-manivannan.sadhasi…
* Rebased on top of pci/next 9e1b45d7a5bc0ad20f6b5267992da422884b916e
Changes in v3:
* Collected tags.
* Added a note about failing testcase 10 and command to skip it in
documentation.
* Removed Aman Gupta and Padmanabhan Rajanbabu from CC as their addresses are
bouncing.
Changes in v2:
* Added a patch that fixes return values of IOCTL in pci_endpoint_test driver
* Moved the existing tests to new location before migrating
* Added a fix for BARs on Qcom devices
* Updated documentation and also added fixture variants for memcpy & DMA modes
Manivannan Sadhasivam (4):
PCI: endpoint: pci-epf-test: Fix the check for DMA MEMCPY test
misc: pci_endpoint_test: Fix the return value of IOCTL
selftests: Move PCI Endpoint tests from tools/pci to Kselftests
selftests: pci_endpoint: Migrate to Kselftest framework
Documentation/PCI/endpoint/pci-test-howto.rst | 174 +++++-------
MAINTAINERS | 2 +-
drivers/misc/pci_endpoint_test.c | 255 +++++++++--------
drivers/pci/endpoint/functions/pci-epf-test.c | 4 +-
tools/pci/Build | 1 -
tools/pci/Makefile | 58 ----
tools/pci/pcitest.c | 264 ------------------
tools/pci/pcitest.sh | 73 -----
tools/testing/selftests/Makefile | 1 +
.../testing/selftests/pci_endpoint/.gitignore | 2 +
tools/testing/selftests/pci_endpoint/Makefile | 7 +
tools/testing/selftests/pci_endpoint/config | 4 +
.../pci_endpoint/pci_endpoint_test.c | 221 +++++++++++++++
13 files changed, 437 insertions(+), 629 deletions(-)
delete mode 100644 tools/pci/Build
delete mode 100644 tools/pci/Makefile
delete mode 100644 tools/pci/pcitest.c
delete mode 100644 tools/pci/pcitest.sh
create mode 100644 tools/testing/selftests/pci_endpoint/.gitignore
create mode 100644 tools/testing/selftests/pci_endpoint/Makefile
create mode 100644 tools/testing/selftests/pci_endpoint/config
create mode 100644 tools/testing/selftests/pci_endpoint/pci_endpoint_test.c
--
2.25.1
tun used to simply advance iov_iter when it needs to pad virtio header,
which leaves the garbage in the buffer as is. This is especially
problematic when tun starts to allow enabling the hash reporting
feature; even if the feature is enabled, the packet may lack a hash
value and may contain a hole in the virtio header because the packet
arrived before the feature gets enabled or does not contain the
header fields to be hashed. If the hole is not filled with zero, it is
impossible to tell if the packet lacks a hash value.
In theory, a user of tun can fill the buffer with zero before calling
read() to avoid such a problem, but leaving the garbage in the buffer is
awkward anyway so fill the buffer in tun.
The specification also says the device MUST set num_buffers to 1 when
the field is present so set it when the specified header size is big
enough to contain the field.
Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com>
---
drivers/net/tap.c | 2 +-
drivers/net/tun.c | 6 ++++--
drivers/net/tun_vnet.h | 14 +++++++++-----
3 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 1287e241f4454fb8ec4975bbaded5fbaa88e3cc8..d96009153c316f669e626d95002e5fe8add3a1b2 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -711,7 +711,7 @@ static ssize_t tap_put_user(struct tap_queue *q,
int total;
if (q->flags & IFF_VNET_HDR) {
- struct virtio_net_hdr vnet_hdr;
+ struct virtio_net_hdr_v1 vnet_hdr;
vnet_hdr_len = READ_ONCE(q->vnet_hdr_sz);
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index b14231a743915c2851eaae49d757b763ec4a8841..a3aed7e42c63d8b8f523c0141c7d970ab185178c 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1987,7 +1987,9 @@ static ssize_t tun_put_user_xdp(struct tun_struct *tun,
ssize_t ret;
if (tun->flags & IFF_VNET_HDR) {
- struct virtio_net_hdr gso = { 0 };
+ struct virtio_net_hdr_v1 gso = {
+ .num_buffers = cpu_to_tun_vnet16(tun->flags, 1)
+ };
vnet_hdr_sz = READ_ONCE(tun->vnet_hdr_sz);
ret = tun_vnet_hdr_put(vnet_hdr_sz, iter, &gso);
@@ -2040,7 +2042,7 @@ static ssize_t tun_put_user(struct tun_struct *tun,
}
if (vnet_hdr_sz) {
- struct virtio_net_hdr gso;
+ struct virtio_net_hdr_v1 gso;
ret = tun_vnet_hdr_from_skb(tun->flags, tun->dev, skb, &gso);
if (ret)
diff --git a/drivers/net/tun_vnet.h b/drivers/net/tun_vnet.h
index fd7411c4447ffb180e032fe3e22f6709c30da8e9..b4f406f522728f92266898969831c26a87930f6a 100644
--- a/drivers/net/tun_vnet.h
+++ b/drivers/net/tun_vnet.h
@@ -135,15 +135,17 @@ static inline int tun_vnet_hdr_get(int sz, unsigned int flags,
}
static inline int tun_vnet_hdr_put(int sz, struct iov_iter *iter,
- const struct virtio_net_hdr *hdr)
+ const struct virtio_net_hdr_v1 *hdr)
{
+ int content_sz = MIN(sizeof(*hdr), sz);
+
if (unlikely(iov_iter_count(iter) < sz))
return -EINVAL;
- if (unlikely(copy_to_iter(hdr, sizeof(*hdr), iter) != sizeof(*hdr)))
+ if (unlikely(copy_to_iter(hdr, content_sz, iter) != content_sz))
return -EFAULT;
- iov_iter_advance(iter, sz - sizeof(*hdr));
+ iov_iter_zero(sz - content_sz, iter);
return 0;
}
@@ -157,11 +159,11 @@ static inline int tun_vnet_hdr_to_skb(unsigned int flags, struct sk_buff *skb,
static inline int tun_vnet_hdr_from_skb(unsigned int flags,
const struct net_device *dev,
const struct sk_buff *skb,
- struct virtio_net_hdr *hdr)
+ struct virtio_net_hdr_v1 *hdr)
{
int vlan_hlen = skb_vlan_tag_present(skb) ? VLAN_HLEN : 0;
- if (virtio_net_hdr_from_skb(skb, hdr,
+ if (virtio_net_hdr_from_skb(skb, (struct virtio_net_hdr *)hdr,
tun_vnet_is_little_endian(flags), true,
vlan_hlen)) {
struct skb_shared_info *sinfo = skb_shinfo(skb);
@@ -179,6 +181,8 @@ static inline int tun_vnet_hdr_from_skb(unsigned int flags,
return -EINVAL;
}
+ hdr->num_buffers = cpu_to_tun_vnet16(flags, 1);
+
return 0;
}
---
base-commit: f54eab84fc17ef79b701e29364b7d08ca3a1d2f6
change-id: 20250116-buffers-96e14bf023fc
prerequisite-change-id: 20241230-tun-66e10a49b0c7:v6
prerequisite-patch-id: 871dc5f146fb6b0e3ec8612971a8e8190472c0fb
prerequisite-patch-id: 2797ed249d32590321f088373d4055ff3f430a0e
prerequisite-patch-id: ea3370c72d4904e2f0536ec76ba5d26784c0cede
prerequisite-patch-id: 837e4cf5d6b451424f9b1639455e83a260c4440d
prerequisite-patch-id: ea701076f57819e844f5a35efe5cbc5712d3080d
prerequisite-patch-id: 701646fb43ad04cc64dd2bf13c150ccbe6f828ce
prerequisite-patch-id: 53176dae0c003f5b6c114d43f936cf7140d31bb5
Best regards,
--
Akihiko Odaki <akihiko.odaki(a)daynix.com>
Fixes a minor grammatical issue in a comment in close_range_test.c
where "and and" was mistakenly repeated.
Signed-off-by: Imanol <imvalient(a)protonmail.com>
---
tools/testing/selftests/core/close_range_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/core/close_range_test.c b/tools/testing/selftests/core/close_range_test.c
index e0d9851fe1c9..c19e8d037211 100644
--- a/tools/testing/selftests/core/close_range_test.c
+++ b/tools/testing/selftests/core/close_range_test.c
@@ -506,7 +506,7 @@ TEST(close_range_cloexec_unshare_syzbot)
/*
* Create a huge gap in the fd table. When we now call
- * CLOSE_RANGE_UNSHARE with a shared fd table and and with ~0U as upper
+ * CLOSE_RANGE_UNSHARE with a shared fd table and with ~0U as upper
* bound the kernel will only copy up to fd1 file descriptors into the
* new fd table. If the kernel is buggy and doesn't handle
* CLOSE_RANGE_CLOEXEC correctly it will not have copied all file
--
2.43.0
While taking a look at '[PATCH net] pktgen: Avoid out-of-range in
get_imix_entries' ([1]) and '[PATCH net v2] pktgen: Avoid out-of-bounds
access in get_imix_entries' ([2], [3]) and doing some tests and code review
I detected that the /proc/net/pktgen/... parsing logic does not honour the
user given buffer bounds (resulting in out-of-bounds access).
This can be observed e.g. by the following simple test (sometimes the
old/'longer' previous value is re-read from the buffer):
$ echo add_device lo@0 > /proc/net/pktgen/kpktgend_0
$ echo "min_pkt_size 12345" > /proc/net/pktgen/lo\@0 && grep min_pkt_size /proc/net/pktgen/lo\@0
Params: count 1000 min_pkt_size: 12345 max_pkt_size: 0
Result: OK: min_pkt_size=12345
$ echo -n "min_pkt_size 123" > /proc/net/pktgen/lo\@0 && grep min_pkt_size /proc/net/pktgen/lo\@0
Params: count 1000 min_pkt_size: 12345 max_pkt_size: 0
Result: OK: min_pkt_size=12345
$ echo "min_pkt_size 123" > /proc/net/pktgen/lo\@0 && grep min_pkt_size /proc/net/pktgen/lo\@0
Params: count 1000 min_pkt_size: 123 max_pkt_size: 0
Result: OK: min_pkt_size=123
So fix the out-of-bounds access (and some minor findings) and add a simple
proc_net_pktgen selftest...
Patch set splited into part I
- net: pktgen: replace ENOTSUPP with EOPNOTSUPP
- net: pktgen: enable 'param=value' parsing
- net: pktgen: fix hex32_arg parsing for short reads
- net: pktgen: fix 'rate 0' error handling (return -EINVAL)
- net: pktgen: fix 'ratep 0' error handling (return -EINVAL)
- net: pktgen: fix ctrl interface command parsing
- net: pktgen: fix access outside of user given buffer in pktgen_thread_write()
And part II (this one):
- net: pktgen: use defines for the various dec/hex number parsing digits lengths
- net: pktgen: fix mix of int/long
- net: pktgen: remove extra tmp variable (re-use len instead)
- net: pktgen: remove some superfluous variable initializing
- net: pktgen: fix mpls maximum labels list parsing
- net: pktgen: fix access outside of user given buffer in pktgen_if_write()
- net: pktgen: fix mpls reset parsing
- net: pktgen: remove all superfluous index assignements
- selftest: net: add proc_net_pktgen
Regards,
Peter
Changes v4 -> v5:
- split up patchset into part i/ii (suggested by Simon Horman)
- add rev-by Simon Horman
- net: pktgen: align some variable declarations to the most common pattern
-> net: pktgen: fix mix of int/long
- instead of align to most common pattern (int) adjust all usages to
size_t for i and max and ssize_t for len and adjust function signatures
of hex32_arg(), count_trail_chars(), num_arg() and strn_len() accordingly
- respect reverse xmas tree order for local variable declarations (where
possible without too much code churn)
- update subject line and patch description
- dropped net: pktgen: hex32_arg/num_arg error out in case no characters are
available
- keep empty hex/num arg is implicit assumed as zero value
- dropped net: pktgen: num_arg error out in case no valid character is parsed
- keep empty hex/num arg is implicit assumed as zero value
- Change patch description ('Fixes:' -> 'Addresses the following:',
suggested by Simon Horman)
- net: pktgen: remove all superfluous index assignements
- new patch (suggested by Simon Horman)
- selftest: net: add proc_net_pktgen
- addapt to dropped patch 'net: pktgen: hex32_arg/num_arg error out in case
no characters are available', empty hex/num arg is now implicit assumed as
zero value (instead of failure)
Changes v3 -> v4:
- add rev-by Simon Horman
- new patch 'net: pktgen: use defines for the various dec/hex number parsing
digits lengths' (suggested by Simon Horman)
- replace C99 comment (suggested by Paolo Abeni)
- drop available characters check in strn_len() (suggested by Paolo Abeni)
- factored out patch 'net: pktgen: align some variable declarations to the
most common pattern' (suggested by Paolo Abeni)
- factored out patch 'net: pktgen: remove extra tmp variable (re-use len
instead)' (suggested by Paolo Abeni)
- factored out patch 'net: pktgen: remove some superfluous variable
initializing' (suggested by Paolo Abeni)
- factored out patch 'net: pktgen: fix mpls maximum labels list parsing'
(suggested by Paolo Abeni)
- factored out 'net: pktgen: hex32_arg/num_arg error out in case no
characters are available' (suggested by Paolo Abeni)
- factored out 'net: pktgen: num_arg error out in case no valid character
is parsed' (suggested by Paolo Abeni)
Changes v2 -> v3:
- new patch: 'net: pktgen: fix ctrl interface command parsing'
- new patch: 'net: pktgen: fix mpls reset parsing'
- tools/testing/selftests/net/proc_net_pktgen.c:
- fix typo in change description ('v1 -> v1' and tyop)
- rename some vars to better match usage
add_loopback_0 -> thr_cmd_add_loopback_0
rm_loopback_0 -> thr_cmd_rm_loopback_0
wrong_ctrl_cmd -> wrong_thr_cmd
legacy_ctrl_cmd -> legacy_thr_cmd
ctrl_fd -> thr_fd
- add ctrl interface tests
Changes v1 -> v2:
- new patch: 'net: pktgen: fix hex32_arg parsing for short reads'
- new patch: 'net: pktgen: fix 'rate 0' error handling (return -EINVAL)'
- new patch: 'net: pktgen: fix 'ratep 0' error handling (return -EINVAL)'
- net/core/pktgen.c: additional fix get_imix_entries() and get_labels()
- tools/testing/selftests/net/proc_net_pktgen.c:
- fix tyop not vs. nod (suggested by Jakub Kicinski)
- fix misaligned line (suggested by Jakub Kicinski)
- enable fomerly commented out CONFIG_XFRM dependent test (command spi),
as CONFIG_XFRM is enabled via tools/testing/selftests/net/config
CONFIG_XFRM_INTERFACE/CONFIG_XFRM_USER (suggestex by Jakub Kicinski)
- add CONFIG_NET_PKTGEN=m to tools/testing/selftests/net/config
(suggested by Jakub Kicinski)
- add modprobe pktgen to FIXTURE_SETUP() (suggested by Jakub Kicinski)
- fix some checkpatch warnings (Missing a blank line after declarations)
- shrink line length by re-naming some variables (command -> cmd,
device -> dev)
- add 'rate 0' testcase
- add 'ratep 0' testcase
[1] https://lore.kernel.org/netdev/20241006221221.3744995-1-artem.chernyshev@re…
[2] https://lore.kernel.org/netdev/20250109083039.14004-1-pchelkin@ispras.ru/
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
Peter Seiderer (8):
net: pktgen: fix mix of int/long
net: pktgen: remove extra tmp variable (re-use len instead)
net: pktgen: remove some superfluous variable initializing
net: pktgen: fix mpls maximum labels list parsing
net: pktgen: fix access outside of user given buffer in
pktgen_if_write()
net: pktgen: fix mpls reset parsing
net: pktgen: remove all superfluous index assignements
selftest: net: add proc_net_pktgen
net/core/pktgen.c | 288 ++++----
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/config | 1 +
tools/testing/selftests/net/proc_net_pktgen.c | 646 ++++++++++++++++++
4 files changed, 806 insertions(+), 130 deletions(-)
create mode 100644 tools/testing/selftests/net/proc_net_pktgen.c
--
2.48.1
This series expands the XDP TX metadata framework to allow user
applications to pass per packet 64-bit launch time directly to the kernel
driver, requesting launch time hardware offload support. The XDP TX
metadata framework will not perform any clock conversion or packet
reordering.
Please note that the role of Tx metadata is just to pass the launch time,
not to enable the offload feature. Users will need to enable the launch
time hardware offload feature of the device by using the respective
command, such as the tc-etf command.
Although some devices use the tc-etf command to enable their launch time
hardware offload feature, xsk packets will not go through the etf qdisc.
Therefore, in my opinion, the launch time should always be based on the PTP
Hardware Clock (PHC). Thus, i did not include a clock ID to indicate the
clock source.
To simplify the test steps, I modified the xdp_hw_metadata bpf self-test
tool in such a way that it will set the launch time based on the offset
provided by the user and the value of the Receive Hardware Timestamp, which
is against the PHC. This will eliminate the need to discipline System Clock
with the PHC and then use clock_gettime() to get the time.
Please note that AF_XDP lacks a feedback mechanism to inform the
application if the requested launch time is invalid. So, users are expected
to familiar with the horizon of the launch time of the device they use and
not request a launch time that is beyond the horizon. Otherwise, the driver
might interpret the launch time incorrectly and react wrongly. For stmmac
and igc, where modulo computation is used, a launch time larger than the
horizon will cause the device to transmit the packet earlier that the
requested launch time.
Although there is no feedback mechanism for the launch time request
for now, user still can check whether the requested launch time is
working or not, by requesting the Transmit Completion Hardware Timestamp.
v11:
- regenerate netdev_xsk_flags based on latest netdev.yaml (Jakub)
v10: https://lore.kernel.org/netdev/20250207021943.814768-1-yoong.siang.song@int…
- use net_err_ratelimited(), instead of net_ratelimit() (Maciej)
- accumulate the amount of used descs in local variable and update the
igc_metadata_request::used_desc once (Maciej)
- Ensure reverse christmas tree rule (Maciej)
V9: https://lore.kernel.org/netdev/20250206060408.808325-1-yoong.siang.song@int…
- Remove the igc_desc_unused() checking (Maciej)
- Ensure that skb allocation and DMA mapping work before proceeding to
fill in igc_tx_buffer info, context desc, and data desc (Maciej)
- Rate limit the error messages (Maciej)
- Update the comment to indicate that the 2 descriptors needed by the
empty frame are already taken into consideration (Maciej)
- Handle the case where the insertion of an empty frame fails and
explain the reason behind (Maciej)
- put self SOB tag as last tag (Maciej)
V8: https://lore.kernel.org/netdev/20250205024116.798862-1-yoong.siang.song@int…
- check the number of used descriptor in xsk_tx_metadata_request()
by using used_desc of struct igc_metadata_request, and then decreases
the budget with it (Maciej)
- submit another bug fix patch to set the buffer type for empty frame (Maciej):
https://lore.kernel.org/netdev/20250205023603.798819-1-yoong.siang.song@int…
V7: https://lore.kernel.org/netdev/20250204004907.789330-1-yoong.siang.song@int…
- split the refactoring code of igc empty packet insertion into a separate
commit (Faizal)
- add explanation on why the value "4" is used as igc transmit budget
(Faizal)
- perform a stress test by sending 1000 packets with 10ms interval and
launch time set to 500us in the future (Faizal & Yong Liang)
V6: https://lore.kernel.org/netdev/20250116155350.555374-1-yoong.siang.song@int…
- fix selftest build errors by using asprintf() and realloc(), instead of
managing the buffer sizes manually (Daniel, Stanislav)
V5: https://lore.kernel.org/netdev/20250114152718.120588-1-yoong.siang.song@int…
- change netdev feature name from tx-launch-time to tx-launch-time-fifo
to explicitly state the FIFO behaviour (Stanislav)
- improve the looping of xdp_hw_metadata app to wait for packet tx
completion to be more readable by using clock_gettime() (Stanislav)
- add launch time setup steps into xdp_hw_metadata app (Stanislav)
V4: https://lore.kernel.org/netdev/20250106135506.9687-1-yoong.siang.song@intel…
- added XDP launch time support to the igc driver (Jesper & Florian)
- added per-driver launch time limitation on xsk-tx-metadata.rst (Jesper)
- added explanation on FIFO behavior on xsk-tx-metadata.rst (Jakub)
- added step to enable launch time in the commit message (Jesper & Willem)
- explicitly documented the type of launch_time and which clock source
it is against (Willem)
V3: https://lore.kernel.org/netdev/20231203165129.1740512-1-yoong.siang.song@in…
- renamed to use launch time (Jesper & Willem)
- changed the default launch time in xdp_hw_metadata apps from 1s to 0.1s
because some NICs do not support such a large future time.
V2: https://lore.kernel.org/netdev/20231201062421.1074768-1-yoong.siang.song@in…
- renamed to use Earliest TxTime First (Willem)
- renamed to use txtime (Willem)
V1: https://lore.kernel.org/netdev/20231130162028.852006-1-yoong.siang.song@int…
Song Yoong Siang (5):
xsk: Add launch time hardware offload support to XDP Tx metadata
selftests/bpf: Add launch time request to xdp_hw_metadata
net: stmmac: Add launch time support to XDP ZC
igc: Refactor empty frame insertion for launch time support
igc: Add launch time support to XDP ZC
Documentation/netlink/specs/netdev.yaml | 4 +
Documentation/networking/xsk-tx-metadata.rst | 62 +++++++
drivers/net/ethernet/intel/igc/igc.h | 1 +
drivers/net/ethernet/intel/igc/igc_main.c | 143 +++++++++++----
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 +
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 13 ++
include/net/xdp_sock.h | 10 ++
include/net/xdp_sock_drv.h | 1 +
include/uapi/linux/if_xdp.h | 10 ++
include/uapi/linux/netdev.h | 3 +
net/core/netdev-genl.c | 2 +
net/xdp/xsk.c | 3 +
tools/include/uapi/linux/if_xdp.h | 10 ++
tools/include/uapi/linux/netdev.h | 3 +
tools/testing/selftests/bpf/xdp_hw_metadata.c | 168 +++++++++++++++++-
15 files changed, 396 insertions(+), 39 deletions(-)
--
2.34.1
This series expands the XDP TX metadata framework to allow user
applications to pass per packet 64-bit launch time directly to the kernel
driver, requesting launch time hardware offload support. The XDP TX
metadata framework will not perform any clock conversion or packet
reordering.
Please note that the role of Tx metadata is just to pass the launch time,
not to enable the offload feature. Users will need to enable the launch
time hardware offload feature of the device by using the respective
command, such as the tc-etf command.
Although some devices use the tc-etf command to enable their launch time
hardware offload feature, xsk packets will not go through the etf qdisc.
Therefore, in my opinion, the launch time should always be based on the PTP
Hardware Clock (PHC). Thus, i did not include a clock ID to indicate the
clock source.
To simplify the test steps, I modified the xdp_hw_metadata bpf self-test
tool in such a way that it will set the launch time based on the offset
provided by the user and the value of the Receive Hardware Timestamp, which
is against the PHC. This will eliminate the need to discipline System Clock
with the PHC and then use clock_gettime() to get the time.
Please note that AF_XDP lacks a feedback mechanism to inform the
application if the requested launch time is invalid. So, users are expected
to familiar with the horizon of the launch time of the device they use and
not request a launch time that is beyond the horizon. Otherwise, the driver
might interpret the launch time incorrectly and react wrongly. For stmmac
and igc, where modulo computation is used, a launch time larger than the
horizon will cause the device to transmit the packet earlier that the
requested launch time.
Although there is no feedback mechanism for the launch time request
for now, user still can check whether the requested launch time is
working or not, by requesting the Transmit Completion Hardware Timestamp.
v10:
- use net_err_ratelimited(), instead of net_ratelimit() (Maciej)
- accumulate the amount of used descs in local variable and update the
igc_metadata_request::used_desc once (Maciej)
- Ensure reverse christmas tree rule (Maciej)
V9: https://lore.kernel.org/netdev/20250206060408.808325-1-yoong.siang.song@int…
- Remove the igc_desc_unused() checking (Maciej)
- Ensure that skb allocation and DMA mapping work before proceeding to
fill in igc_tx_buffer info, context desc, and data desc (Maciej)
- Rate limit the error messages (Maciej)
- Update the comment to indicate that the 2 descriptors needed by the
empty frame are already taken into consideration (Maciej)
- Handle the case where the insertion of an empty frame fails and
explain the reason behind (Maciej)
- put self SOB tag as last tag (Maciej)
V8: https://lore.kernel.org/netdev/20250205024116.798862-1-yoong.siang.song@int…
- check the number of used descriptor in xsk_tx_metadata_request()
by using used_desc of struct igc_metadata_request, and then decreases
the budget with it (Maciej)
- submit another bug fix patch to set the buffer type for empty frame (Maciej):
https://lore.kernel.org/netdev/20250205023603.798819-1-yoong.siang.song@int…
V7: https://lore.kernel.org/netdev/20250204004907.789330-1-yoong.siang.song@int…
- split the refactoring code of igc empty packet insertion into a separate
commit (Faizal)
- add explanation on why the value "4" is used as igc transmit budget
(Faizal)
- perform a stress test by sending 1000 packets with 10ms interval and
launch time set to 500us in the future (Faizal & Yong Liang)
V6: https://lore.kernel.org/netdev/20250116155350.555374-1-yoong.siang.song@int…
- fix selftest build errors by using asprintf() and realloc(), instead of
managing the buffer sizes manually (Daniel, Stanislav)
V5: https://lore.kernel.org/netdev/20250114152718.120588-1-yoong.siang.song@int…
- change netdev feature name from tx-launch-time to tx-launch-time-fifo
to explicitly state the FIFO behaviour (Stanislav)
- improve the looping of xdp_hw_metadata app to wait for packet tx
completion to be more readable by using clock_gettime() (Stanislav)
- add launch time setup steps into xdp_hw_metadata app (Stanislav)
V4: https://lore.kernel.org/netdev/20250106135506.9687-1-yoong.siang.song@intel…
- added XDP launch time support to the igc driver (Jesper & Florian)
- added per-driver launch time limitation on xsk-tx-metadata.rst (Jesper)
- added explanation on FIFO behavior on xsk-tx-metadata.rst (Jakub)
- added step to enable launch time in the commit message (Jesper & Willem)
- explicitly documented the type of launch_time and which clock source
it is against (Willem)
V3: https://lore.kernel.org/netdev/20231203165129.1740512-1-yoong.siang.song@in…
- renamed to use launch time (Jesper & Willem)
- changed the default launch time in xdp_hw_metadata apps from 1s to 0.1s
because some NICs do not support such a large future time.
V2: https://lore.kernel.org/netdev/20231201062421.1074768-1-yoong.siang.song@in…
- renamed to use Earliest TxTime First (Willem)
- renamed to use txtime (Willem)
V1: https://lore.kernel.org/netdev/20231130162028.852006-1-yoong.siang.song@int…
Song Yoong Siang (5):
xsk: Add launch time hardware offload support to XDP Tx metadata
selftests/bpf: Add launch time request to xdp_hw_metadata
net: stmmac: Add launch time support to XDP ZC
igc: Refactor empty frame insertion for launch time support
igc: Add launch time support to XDP ZC
Documentation/netlink/specs/netdev.yaml | 4 +
Documentation/networking/xsk-tx-metadata.rst | 62 +++++++
drivers/net/ethernet/intel/igc/igc.h | 1 +
drivers/net/ethernet/intel/igc/igc_main.c | 143 +++++++++++----
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 +
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 13 ++
include/net/xdp_sock.h | 10 ++
include/net/xdp_sock_drv.h | 1 +
include/uapi/linux/if_xdp.h | 10 ++
include/uapi/linux/netdev.h | 3 +
net/core/netdev-genl.c | 2 +
net/xdp/xsk.c | 3 +
tools/include/uapi/linux/if_xdp.h | 10 ++
tools/include/uapi/linux/netdev.h | 3 +
tools/testing/selftests/bpf/xdp_hw_metadata.c | 168 +++++++++++++++++-
15 files changed, 396 insertions(+), 39 deletions(-)
--
2.34.1
Correct the typo "adress" to "address" in a comment in conntrack_icmp_related.sh
to improve clarity.
Signed-off-by: Marcelo Moreira <marcelomoreira1905(a)gmail.com>
---
tools/testing/selftests/net/netfilter/conntrack_icmp_related.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/netfilter/conntrack_icmp_related.sh b/tools/testing/selftests/net/netfilter/conntrack_icmp_related.sh
index c63d840ead61..f63b7f12b36a 100755
--- a/tools/testing/selftests/net/netfilter/conntrack_icmp_related.sh
+++ b/tools/testing/selftests/net/netfilter/conntrack_icmp_related.sh
@@ -171,7 +171,7 @@ table inet filter {
}
EOF
-# make sure NAT core rewrites adress of icmp error if nat is used according to
+# make sure NAT core rewrites address of icmp error if nat is used according to
# conntrack nat information (icmp error will be directed at nsrouter1 address,
# but it needs to be routed to nsclient1 address).
ip netns exec "$nsrouter1" nft -f - <<EOF
--
2.48.1
Hi all,
This is v7 of the Rust/KUnit integration patch. I think all of the
suggestions have at least been responded to (even if there are a few I'm
leaving as either future projects or matters of taste). Hopefully this
is good-to-go for 6.15, so we can start using it concurrently with
making any additional improvements we may wish.
This series was originally written by José Expósito, and has been
modified and updated by Matt Gilbride, Miguel Ojeda, and myself. The
original version can be found here:
https://github.com/Rust-for-Linux/linux/pull/950
Add support for writing KUnit tests in Rust. While Rust doctests are
already converted to KUnit tests and run, they're really better suited
for examples, rather than as first-class unit tests.
This series implements a series of direct Rust bindings for KUnit tests,
as well as a new macro which allows KUnit tests to be written using a
close variant of normal Rust unit test syntax. The only change required
is replacing '#[cfg(test)]' with '#[kunit_tests(kunit_test_suite_name)]'
An example test would look like:
#[kunit_tests(rust_kernel_hid_driver)]
mod tests {
use super::*;
use crate::{c_str, driver, hid, prelude::*};
use core::ptr;
struct SimpleTestDriver;
impl Driver for SimpleTestDriver {
type Data = ();
}
#[test]
fn rust_test_hid_driver_adapter() {
let mut hid = bindings::hid_driver::default();
let name = c_str!("SimpleTestDriver");
static MODULE: ThisModule = unsafe { ThisModule::from_ptr(ptr::null_mut()) };
let res = unsafe {
<hid::Adapter<SimpleTestDriver> as driver::DriverOps>::register(&mut hid, name, &MODULE)
};
assert_eq!(res, Err(ENODEV)); // The mock returns -19
}
}
Please give this a go, and make sure I haven't broken it! There's almost
certainly a lot of improvements which can be made -- and there's a fair
case to be made for replacing some of this with generated C code which
can use the C macros -- but this is hopefully an adequate implementation
for now, and the interface can (with luck) remain the same even if the
implementation changes.
A few small notable missing features:
- Attributes (like the speed of a test) are hardcoded to the default
value.
- Similarly, the module name attribute is hardcoded to NULL. In C, we
use the KBUILD_MODNAME macro, but I couldn't find a way to use this
from Rust which wasn't more ugly than just disabling it.
- Assertions are not automatically rewritten to use KUnit assertions.
---
Changes since v6:
https://lore.kernel.org/rust-for-linux/20250214074051.1619256-1-davidgow@go…
- Fixed an [allow(unused_unsafe)] which ended up in patch 2 instead of
patch 1. (Thanks, Tamir!)
- Doc comments now have several useful links. (Thanks, Tamir!)
- Fix a potential compile error under macos. (Thanks, Tamir!)
- Several small tidy-ups to limit unsafe usage. (Thanks, Tamir!)
Changes since v5:
https://lore.kernel.org/all/20241213081035.2069066-1-davidgow@google.com/
- Rebased against 6.14-rc1
- Fixed a bunch of warnings / clippy lints introduced in Rust 1.83 and
1.84.
- No longer needs static_mut_refs / const_mut_refs, and is much cleaned
up as a result. (Thanks, Miguel)
- Major documentation and example fixes. (Thanks, Miguel)
Changes since v4:
https://lore.kernel.org/linux-kselftest/20241101064505.3820737-1-davidgow@g…
- Rebased against 6.13-rc1
- Allowed an unused_unsafe warning after the behaviour of addr_of_mut!()
changed in Rust 1.82. (Thanks Boqun, Miguel)
- "Expect" that the sample assert_eq!(1+1, 2) produces a clippy warning
due to a redundant assertion. (Thanks Boqun, Miguel)
- Fix some missing safety comments, and remove some unneeded 'unsafe'
blocks. (Thanks Boqun)
- Fix a couple of minor rustfmt issues which were triggering checkpatch
warnings.
Changes since v3:
https://lore.kernel.org/linux-kselftest/20241030045719.3085147-2-davidgow@g…
- The kunit_unsafe_test_suite!() macro now panic!s if the suite name is
too long, triggering a compile error. (Thanks, Alice!)
- The #[kunit_tests()] macro now preserves span information, so
errors can be better reported. (Thanks, Boqun!)
- The example tests have been updated to no longer use assert_eq!() with
a constant bool argument (which triggered a clippy warning now we
have the span info).
Changes since v2:
https://lore.kernel.org/linux-kselftest/20241029092422.2884505-1-davidgow@g…
- Include missing rust/macros/kunit.rs file from v2. (Thanks Boqun!)
- The kunit_unsafe_test_suite!() macro will truncate the name of the
suite if it is too long. (Thanks Alice!)
- The proc macro now emits an error if the suite name is too long.
- We no longer needlessly use UnsafeCell<> in
kunit_unsafe_test_suite!(). (Thanks Alice!)
Changes since v1:
https://lore.kernel.org/lkml/20230720-rustbind-v1-0-c80db349e3b5@google.com…
- Rebase on top of the latest rust-next (commit 718c4069896c)
- Make kunit_case a const fn, rather than a macro (Thanks Boqun)
- As a result, the null terminator is now created with
kernel::kunit::kunit_case_null()
- Use the C kunit_get_current_test() function to implement
in_kunit_test(), rather than re-implementing it (less efficiently)
ourselves.
Changes since the GitHub PR:
- Rebased on top of kselftest/kunit
- Add const_mut_refs feature
This may conflict with https://lore.kernel.org/lkml/20230503090708.2524310-6-nmi@metaspace.dk/
- Add rust/macros/kunit.rs to the KUnit MAINTAINERS entry
---
José Expósito (3):
rust: kunit: add KUnit case and suite macros
rust: macros: add macro to easily run KUnit tests
rust: kunit: allow to know if we are in a test
MAINTAINERS | 1 +
rust/kernel/kunit.rs | 199 +++++++++++++++++++++++++++++++++++++++++++
rust/macros/kunit.rs | 161 ++++++++++++++++++++++++++++++++++
rust/macros/lib.rs | 29 +++++++
4 files changed, 390 insertions(+)
create mode 100644 rust/macros/kunit.rs
--
2.48.1.601.g30ceb7b040-goog
Fix misspellings flagged by codespell.
Signed-off-by: Ujwal Kundur <ujwal.kundur(a)gmail.com>
---
tools/testing/selftests/mm/uffd-common.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/mm/uffd-common.c b/tools/testing/selftests/mm/uffd-common.c
index 717539eddf98..254409972ddb 100644
--- a/tools/testing/selftests/mm/uffd-common.c
+++ b/tools/testing/selftests/mm/uffd-common.c
@@ -323,7 +323,7 @@ int uffd_test_ctx_init(uint64_t features, const char **errmsg)
ret = userfaultfd_open(&features);
if (ret) {
if (errmsg)
- *errmsg = "possible lack of priviledge";
+ *errmsg = "possible lack of privilege";
return ret;
}
@@ -348,7 +348,7 @@ int uffd_test_ctx_init(uint64_t features, const char **errmsg)
/*
* After initialization of area_src, we must explicitly release pages
* for area_dst to make sure it's fully empty. Otherwise we could have
- * some area_dst pages be errornously initialized with zero pages,
+ * some area_dst pages be erroneously initialized with zero pages,
* hence we could hit memory corruption later in the test.
*
* One example is when THP is globally enabled, above allocate_area()
--
2.20.1
The driver for the 8250 console is not used, as no port is found.
Instead the prom0 bootconsole is used the whole time.
The prom driver translates '\n' to '\r\n' before handing of the message
off to the firmware. The firmware performs the same translation again.
In the final output produced by QEMU each line ends with '\r\r\n'.
This breaks the kunit parser, which can only handle '\r\n' and '\n'.
Use the Zilog console instead. It works correctly, is the one documented
by the QEMU manual and also saves a bit of codesize:
Before=4051011, After=4023326, chg -0.68%
Observed on QEMU 9.2.0.
Fixes: 87c9c1631788 ("kunit: tool: add support for QEMU")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
tools/testing/kunit/qemu_configs/sparc.py | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/testing/kunit/qemu_configs/sparc.py b/tools/testing/kunit/qemu_configs/sparc.py
index e975c4331a7c2a74f8ade61c3f31ff0d37314545..256d9573b44646533d1a6f768976628adc87921e 100644
--- a/tools/testing/kunit/qemu_configs/sparc.py
+++ b/tools/testing/kunit/qemu_configs/sparc.py
@@ -2,8 +2,9 @@ from ..qemu_config import QemuArchParams
QEMU_ARCH = QemuArchParams(linux_arch='sparc',
kconfig='''
-CONFIG_SERIAL_8250=y
-CONFIG_SERIAL_8250_CONSOLE=y''',
+CONFIG_SERIAL_SUNZILOG=y
+CONFIG_SERIAL_SUNZILOG_CONSOLE=y
+''',
qemu_arch='sparc',
kernel_path='arch/sparc/boot/zImage',
kernel_command_line='console=ttyS0 mem=256M',
---
base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b
change-id: 20250214-kunit-qemu-sparc-console-73ece282d867
Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
Syzbot caught an array out-of-bounds bug [1]. It turns out that when the
BPF program runs through do_misc_fixups(), it allocates an extra 8 bytes
on the call stack, which eventually causes stack_depth to exceed 512.
I was able to reproduce this issue probabilistically by enabling
CONFIG_UBSAN=y and disabling CONFIG_BPF_JIT_ALWAYS_ON with the selfttest
I provide in second patch(although it doesn't happen every time - I didn't
dig deeper into why UBSAN behaves this way).
Furthermore, if I set /proc/sys/net/core/bpf_jit_enable to 0 to disable
the jit, a panic occurs, and the reason is the same, that bpf_func is
assigned an incorrect address.
[---[ end trace ]---
[Oops: general protection fault, probably for non-canonical address
0x100f0e0e0d090808: 0000 [#1] PREEMPT SMP NOPTI
[Tainted: [W]=WARN, [O]=OOT_MODULE
[RIP: 0010:bpf_test_run+0x1d2/0x360
[RSP: 0018:ffffafc7955178a0 EFLAGS: 00010246
[RAX: 100f0e0e0d090808 RBX: ffff8e9fdb2c4100 RCX: 0000000000000018
[RDX: 00000000002b5b18 RSI: ffffafc780497048 RDI: ffff8ea04d601700
[RBP: ffffafc780497000 R08: ffffafc795517a0c R09: 0000000000000000
[R10: 0000000000000000 R11: fefefefefefefeff R12: ffff8ea04d601700
[R13: ffffafc795517928 R14: ffffafc795517928 R15: 0000000000000000
[CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[CR2: 00007f181c064648 CR3: 00000001aa2be003 CR4: 0000000000770ef0
[DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[PKRU: 55555554
[Call Trace:
[ <TASK>
[ ? die_addr+0x36/0x90
[ ? exc_general_protection+0x237/0x430
[ ? asm_exc_general_protection+0x26/0x30
[ ? bpf_test_run+0x1d2/0x360
[ ? bpf_test_run+0x10d/0x360
[ ? __link_object+0x12a/0x1e0
[ ? slab_build_skb+0x23/0x130
[ ? kmem_cache_alloc_noprof+0x2ea/0x3f0
[ ? sk_prot_alloc+0xc2/0x120
[ bpf_prog_test_run_skb+0x21b/0x590
[ __sys_bpf+0x340/0xa80
[ __x64_sys_bpf+0x1e/0x30
---
v2 -> v3:
Optimized some code naming and conditional judgment logic.
https://lore.kernel.org/bpf/20250213131214.164982-1-mrpre@163.com/T/
v1 -> v2:
Directly reject loading programs with a stack size greater than 512 when
jit disabled.(Suggested by Alexei Starovoitov)
https://lore.kernel.org/bpf/20250212135251.85487-1-mrpre@163.com/T/
Jiayuan Chen (3):
bpf: Fix array bounds error with may_goto
selftests/bpf: Introduce __load_if_JITed annotation for tests
selftests/bpf: Add selftest for may_goto
kernel/bpf/core.c | 19 +++++--
kernel/bpf/verifier.c | 7 +++
tools/testing/selftests/bpf/progs/bpf_misc.h | 2 +
.../selftests/bpf/progs/verifier_stack_ptr.c | 52 +++++++++++++++++++
tools/testing/selftests/bpf/test_loader.c | 26 ++++++++++
5 files changed, 102 insertions(+), 4 deletions(-)
--
2.47.1
Update error message in mptcp_connect.sh to use a more formal and
consistent wording. The phrase "not able to" was replaced with "Unable to"
for better clarity.
Signed-off-by: Marcelo Moreira <marcelomoreira1905(a)gmail.com>
---
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
index 5e3c56253274..bdc9c22af6be 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
@@ -261,8 +261,8 @@ check_mptcp_disabled()
# mainly to cover more code
if ! ip netns exec ${disabled_ns} sysctl net.mptcp >/dev/null; then
- mptcp_lib_pr_fail "not able to list net.mptcp sysctl knobs"
- mptcp_lib_result_fail "not able to list net.mptcp sysctl knobs"
+ mptcp_lib_pr_fail "Unable to list net.mptcp sysctl knobs"
+ mptcp_lib_result_fail "Unable to list net.mptcp sysctl knobs"
ret=${KSFT_FAIL}
return 1
fi
--
2.48.1
This is one of just 3 remaining "Test Module" kselftests (the others
being bitmap and scanf), the rest having been converted to KUnit.
I tested this using:
$ tools/testing/kunit/kunit.py run --arch arm64 --make_options LLVM=1 printf
I have also sent out a series converting scanf[0].
Link: https://lore.kernel.org/all/20250204-scanf-kunit-convert-v3-0-386d7c3ee714@… [0]
Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Changes in v3:
- Remove extraneous trailing newlines from failure messages.
- Replace `pr_warn` with `kunit_warn`.
- Drop arch changes.
- Remove KUnit boilerplate from CONFIG_PRINTF_KUNIT_TEST help text.
- Restore `total_tests` counting.
- Remove tc_fail macro in last patch.
- Link to v2: https://lore.kernel.org/r/20250207-printf-kunit-convert-v2-0-057b23860823@g…
Changes in v2:
- Incorporate code review from prior work[0] by Arpitha Raghunandan.
- Link to v1: https://lore.kernel.org/r/20250204-printf-kunit-convert-v1-0-ecf1b846a4de@g…
Link: https://lore.kernel.org/lkml/20200817043028.76502-1-98.arpi@gmail.com/t/#u [0]
---
Tamir Duberstein (2):
printf: convert self-test to KUnit
printf: break kunit into test cases
Documentation/core-api/printk-formats.rst | 2 +-
MAINTAINERS | 2 +-
lib/Kconfig.debug | 12 +-
lib/Makefile | 2 +-
lib/{test_printf.c => printf_kunit.c} | 429 +++++++++++++-----------------
tools/testing/selftests/lib/config | 1 -
tools/testing/selftests/lib/printf.sh | 4 -
7 files changed, 192 insertions(+), 260 deletions(-)
---
base-commit: a64dcfb451e254085a7daee5fe51bf22959d52d3
change-id: 20250131-printf-kunit-convert-fd4012aa2ec6
Best regards,
--
Tamir Duberstein <tamird(a)gmail.com>
The current code returns -ENOMEM if test->bar[barno] is NULL.
There can be two reasons why test->bar[barno] is NULL:
1) The pci_ioremap_bar() call in pci_endpoint_test_probe() failed.
2) The BAR was skipped, because it is disabled by the endpoint.
Many PCI endpoint controller drivers will disable all BARs in their
init function. A disabled BAR will have a size of 0.
A PCI endpoint function driver will be able to enable any BAR that
is not marked as BAR_RESERVED (which means that the BAR should not
be touched by the EPF driver).
Thus, perform check if the size is 0, before checking if
test->bar[barno] is NULL, such that we can return different errors.
This will allow the selftests to return SKIP instead of FAIL for
disabled BARs.
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
Hello PCI maintainers.
This patch might give a trivial conflict with:
https://lore.kernel.org/linux-pci/20250123095906.3578241-2-cassel@kernel.or…
because the context lines (lines that haven't been changed)
might be different. If there is a conflict, simply look at
this patch by itself, and resolution should be trivial.
drivers/misc/pci_endpoint_test.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
index d5ac71a49386..b95980b29eb9 100644
--- a/drivers/misc/pci_endpoint_test.c
+++ b/drivers/misc/pci_endpoint_test.c
@@ -292,11 +292,13 @@ static int pci_endpoint_test_bar(struct pci_endpoint_test *test,
void *read_buf __free(kfree) = NULL;
struct pci_dev *pdev = test->pdev;
+ bar_size = pci_resource_len(pdev, barno);
+ if (!bar_size)
+ return -ENODATA;
+
if (!test->bar[barno])
return -ENOMEM;
- bar_size = pci_resource_len(pdev, barno);
-
if (barno == test->test_reg_bar)
bar_size = 0x4;
--
2.48.1
Instead of inlining equivalents, use lib.sh-provided primitives.
Use defer to manage vx lifetime.
This will make it easier to extend the test in the next patch.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor(a)blackwall.org>
---
Notes:
CC: Simon Horman <horms(a)kernel.org>
CC: Shuah Khan <shuah(a)kernel.org>
CC: linux-kselftest(a)vger.kernel.org
.../net/test_vxlan_fdb_changelink.sh | 39 ++++++++++++-------
1 file changed, 24 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/net/test_vxlan_fdb_changelink.sh b/tools/testing/selftests/net/test_vxlan_fdb_changelink.sh
index 2d442cdab11e..6f2bca4b346c 100755
--- a/tools/testing/selftests/net/test_vxlan_fdb_changelink.sh
+++ b/tools/testing/selftests/net/test_vxlan_fdb_changelink.sh
@@ -1,29 +1,38 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
-# Check FDB default-remote handling across "ip link set".
+ALL_TESTS="
+ test_set_remote
+"
+source lib.sh
check_remotes()
{
local what=$1; shift
local N=$(bridge fdb sh dev vx | grep 00:00:00:00:00:00 | wc -l)
- echo -ne "expected two remotes after $what\t"
- if [[ $N != 2 ]]; then
- echo "[FAIL]"
- EXIT_STATUS=1
- else
- echo "[ OK ]"
- fi
+ ((N == 2))
+ check_err $? "expected 2 remotes after $what, got $N"
}
-ip link add name vx up type vxlan id 2000 dstport 4789
-bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
-bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
-check_remotes "fdb append"
+# Check FDB default-remote handling across "ip link set".
+test_set_remote()
+{
+ RET=0
-ip link set dev vx type vxlan remote 192.0.2.30
-check_remotes "link set"
+ ip_link_add vx up type vxlan id 2000 dstport 4789
+ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
+ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
+ check_remotes "fdb append"
+
+ ip link set dev vx type vxlan remote 192.0.2.30
+ check_remotes "link set"
+
+ log_test 'FDB default-remote handling across "ip link set"'
+}
+
+trap defer_scopes_cleanup EXIT
+
+tests_run
-ip link del dev vx
exit $EXIT_STATUS
--
2.47.0
This helper could be useful to more than just forwarding tests.
Move it upstairs and port over to log_test_skip().
Split the function into two parts: the bit that actually checks and
reports skip, which is in a new function check_command(). And a bit
that exits the test script if the check fails. This allows users
consistent checking behavior while giving an option to bail out from
a single test without bailing out of the whole script.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor(a)blackwall.org>
---
Notes:
CC: Simon Horman <horms(a)kernel.org>
CC: Shuah Khan <shuah(a)kernel.org>
CC: linux-kselftest(a)vger.kernel.org
tools/testing/selftests/net/forwarding/lib.sh | 10 ----------
tools/testing/selftests/net/lib.sh | 19 +++++++++++++++++++
2 files changed, 19 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 8de80acf249e..508f3c700d71 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -291,16 +291,6 @@ if [[ "$CHECK_TC" = "yes" ]]; then
check_tc_version
fi
-require_command()
-{
- local cmd=$1; shift
-
- if [[ ! -x "$(command -v "$cmd")" ]]; then
- echo "SKIP: $cmd not installed"
- exit $ksft_skip
- fi
-}
-
# IPv6 support was added in v3.0
check_mtools_version()
{
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
index 0bd9a038a1f0..975be4fdbcdb 100644
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -450,6 +450,25 @@ kill_process()
{ kill $pid && wait $pid; } 2>/dev/null
}
+check_command()
+{
+ local cmd=$1; shift
+
+ if [[ ! -x "$(command -v "$cmd")" ]]; then
+ log_test_skip "$cmd not installed"
+ return $EXIT_STATUS
+ fi
+}
+
+require_command()
+{
+ local cmd=$1; shift
+
+ if ! check_command "$cmd"; then
+ exit $EXIT_STATUS
+ fi
+}
+
ip_link_add()
{
local name=$1; shift
--
2.47.0