The following test regressions were noticed while running the selftests/mm
gup_longterm test cases on Dragonboard-845c, Dragonboard-410c, rock-pi-4,
qemu-arm64 and qemu-x86_64. This build has the required selftests/mm config
fragment included, and the toolchain is clang nightly.
Regressions found on Dragonboard-845c, Dragonboard-410c, rock-pi-4,
qemu-arm64 and qemu-x86_64
- selftests mm gup_longterm fails
Regression Analysis:
- New regression? Yes
- Reproducibility? Yes
Test regression: selftests mm gup_longterm error while loading shared
libraries liburing.so.2 cannot open shared object file No such file or
directory
Test regression: selftests mm cow error while loading shared libraries
liburing.so.2 cannot open shared object file No such file or directory
Test regression: selftests mm mlock-random-test exit=139
Test regression: selftests mm pagemap_ioctl exit=1
Test regression: selftests mm guard_regions file hole_punch
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
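The exit statuses quoted above follow the usual shell conventions, which may
help with triage: a status above 128 normally means the process was killed by
a signal (status minus 128), and 127 is what the shell and the dynamic loader
report when a command or a shared library cannot be found. A minimal Python
sketch of that decoding (the helper name describe_exit is made up for
illustration and is not part of kselftest):

```python
import signal


def describe_exit(status: int) -> str:
    """Describe a shell-style exit status as printed in '# exit=N'."""
    if status == 0:
        return "success"
    if status > 128:
        # Shell convention: 128 + signal number.
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    if status == 127:
        return "command not found or loader failure"
    return f"exited with error {status}"


for status in (127, 139, 1):
    print(status, describe_exit(status))
```

Read this way, exit=139 for mlock-random-test corresponds to signal 11, which
is consistent with the general protection fault in the kernel log.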
## Test log
Linux version 6.15.0-next-20250606 (tuxmake@tuxmake) (Debian clang
version 21.0.0 (++20250602112323+c5a56f74fef7-1~exp1~20250602112342.1487),
Debian LLD 21.0.0) #1 SMP PREEMPT @1749190532
running ./gup_longterm
----------------------
./gup_longterm: error while loading shared libraries: liburing.so.2:
cannot open shared object file: No such file or directory
[FAIL]
not ok 14 gup_longterm # exit=127
./cow: error while loading shared libraries: liburing.so.2: cannot
open shared object file: No such file or directory
[FAIL]
not ok 50 cow # exit=127
running ./mlock-random-test
---------------------------
TAP version 13
1..2
[ 311.408456] traps: mlock-random-te[21661] general protection fault
ip:7f63210dbf0f sp:7ffdff6fca28 error:0 in
libc.so.6[adf0f,7f6321056000+165000]
[FAIL]
not ok 23 mlock-random-test # exit=139
running ./pagemap_ioctl
...
ok 53 Huge page testing: only two middle pages dirty
ok 54 # SKIP Hugetlb shmem testing: all new pages must not be written (dirty)
ok 55 # SKIP Hugetlb shmem testing: all pages must be written (dirty)
ok 56 # SKIP Hugetlb shmem testing: all pages dirty other than first
and the last one
ok 57 # SKIP Hugetlb shmem testing: PM_SCAN_WP_MATCHING | PM_SCAN_CHECK_WPASYNC
ok 58 # SKIP Hugetlb shmem testing: only middle page dirty
ok 59 # SKIP Hugetlb shmem testing: only two middle pages dirty
ok 60 # SKIP Hugetlb mem testing: all new pages must not be written (dirty)
ok 61 # SKIP Hugetlb mem testing: all pages must be written (dirty)
ok 62 # SKIP Hugetlb mem testing: all pages dirty other than first and
the last one
ok 63 # SKIP Hugetlb mem testing: PM_SCAN_WP_MATCHING | PM_SCAN_CHECK_WPASYNC
[ 241.731600] run_vmtests.sh (456): drop_caches: 3
ok 64 # SKIP Hugetlb mem testing: only middle page dirty
ok 65 # SKIP Hugetlb mem testing: only two middle pages dirty
Bail out! uffd-test creation failed 12 Cannot allocate memory
12 skipped test(s) detected. Consider enabling relevant config options
to improve coverage.
Planned tests != run tests (115 != 65)
Totals: pass:53 fail:0 xfail:0 xpass:0 skip:12 error:0
[FAIL]
# not ok 48 pagemap_ioctl # exit=1
running ./guard-regions
...
RUN guard_regions.file.hole_punch ...
guard-regions.c:1905:hole_punch:Expected madvise(&ptr[3 * page_size],
4 * page_size, MADV_REMOVE) (-1) == 0 (0)
hole_punch: Test terminated by assertion
FAIL guard_regions.file.hole_punch
not ok 80 guard_regions.file.hole_punch
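The "Planned tests != run tests (115 != 65)" line in the pagemap_ioctl output
comes from comparing the TAP plan with the results actually emitted. A rough
Python sketch of that bookkeeping, assuming plain "1..N" plan lines and
"ok"/"not ok" result lines as seen in the log above (the function name
summarize_tap is hypothetical, and nested TAP streams are not handled
separately):

```python
import re

RESULT_RE = re.compile(r"^(?:#\s*)?(ok|not ok)\s+(\d+)\s*(.*)$")
PLAN_RE = re.compile(r"^1\.\.(\d+)$")


def summarize_tap(lines):
    """Return (planned, run, counts) for a flat TAP stream."""
    planned = None
    counts = {"pass": 0, "fail": 0, "skip": 0}
    for raw in lines:
        line = raw.strip()
        plan = PLAN_RE.match(line)
        if plan:
            planned = int(plan.group(1))
            continue
        match = RESULT_RE.match(line)
        if not match:
            continue  # kernel messages, diagnostics, "Bail out!" etc.
        status, _number, rest = match.groups()
        if "# SKIP" in rest:
            counts["skip"] += 1
        elif status == "ok":
            counts["pass"] += 1
        else:
            counts["fail"] += 1
    return planned, sum(counts.values()), counts


sample = [
    "1..4",
    "ok 1 first test",
    "ok 2 # SKIP Hugetlb shmem testing: only middle page dirty",
    "not ok 3 third test # exit=1",
    "Bail out! uffd-test creation failed 12 Cannot allocate memory",
]
planned, run, counts = summarize_tap(sample)
if planned != run:
    print(f"Planned tests != run tests ({planned} != {run})")
print(counts)
```

A bail-out before the plan is exhausted shows up as run < planned, which is
the situation reported for pagemap_ioctl.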
## Source
* Kernel version: 6.16.0-rc2
* Git tree: https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
* Git sha: 050f8ad7b58d9079455af171ac279c4b9b828c11
* Git describe: next-20250616
* Project details:
https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250616/
* Architectures: arm64, x86_64
* Test environments: Dragonboard-845c, Dragonboard-410c, rock-pi-4,
qemu-arm64, qemu-x86_64 and x86
* Toolchains: clang nightly
* Kconfigs: selftest/mm/config+defconfig+
## Test
* Test log: https://qa-reports.linaro.org/api/testruns/28766026/log_file/
* Test log 2: https://qa-reports.linaro.org/api/testruns/28743077/log_file/
* Build details:
https://regressions.linaro.org/lkft/linux-next-master/next-20250616/kselfte…
* Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/2ya0viPHafKAe0u89drI…
* Kernel config:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2ya0viPHafKAe0u89drI…
## Steps to reproduce
- tuxrun \
--runtime podman \
--device qemu-x86_64 \
--boot-args rw \
--kernel https://storage.tuxsuite.com/public/linaro/lkft/builds/2ya0wmVl0eHb9koWyQYC… \
--rootfs https://storage.tuxboot.com/debian/20250605/trixie/amd64/rootfs.ext4.xz \
--modules https://storage.tuxsuite.com/public/linaro/lkft/builds/2ya0wmVl0eHb9koWyQYC… /usr/ \
--parameters MODULES_PATH=/usr/ \
--parameters SQUAD_URL=https://qa-reports.linaro.org//api/submit/lkft/linux-next-master/… \
--parameters SKIPFILE=skipfile-lkft.yaml \
--parameters KSELFTEST=https://storage.tuxsuite.com/public/linaro/lkft/builds/2ya0wmVl0e… \
--image docker.io/linaro/tuxrun-dispatcher:v1.2.2 \
--tests kselftest-mm \
--timeouts boot=15
--
Linaro LKFT
https://lkft.linaro.org
Add basic support to run various MIPS variants via kunit_tool using the
virtualized malta platform.
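For readers unfamiliar with kunit_tool's per-architecture files, the shape of
such a configuration can be sketched as below. This is only an illustration:
the dataclass is a stand-in for the QemuArchParams class in
tools/testing/kunit/qemu_config.py so that the snippet runs on its own, and
the Kconfig fragment and QEMU arguments are assumed values, not copied from
this series.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class QemuArchParams:
    """Stand-in mirroring the fields kunit_tool expects (assumption)."""
    linux_arch: str
    kconfig: str
    qemu_arch: str
    kernel_path: str
    kernel_command_line: str
    extra_qemu_params: List[str] = field(default_factory=list)


# Illustrative big-endian 32-bit MIPS setup on the malta machine.
QEMU_ARCH = QemuArchParams(
    linux_arch="mips",
    kconfig="""
CONFIG_32BIT=y
CONFIG_CPU_BIG_ENDIAN=y
CONFIG_MIPS_MALTA=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
""",
    qemu_arch="mips",
    kernel_path="vmlinuz",
    kernel_command_line="console=ttyS0",
    extra_qemu_params=["-M", "malta"],
)

print(QEMU_ARCH.qemu_arch, QEMU_ARCH.extra_qemu_params)
```

With a file of this shape in place, a run would typically look like
"./tools/testing/kunit/kunit.py run --arch=mips --cross_compile=<prefix>",
where the cross-compiler prefix depends on the local toolchain.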
Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
Changes in v4:
- Rebase on v6.16-rc1
- Pick up reviews from David
- Clarify that GIC page is linked to vDSO
- Link to v3: https://lore.kernel.org/r/20250415-kunit-mips-v3-0-4ec2461b5a7e@linutronix.…
Changes in v3:
- Also skip VDSO_RANDOMIZE_SIZE adjustment for kthreads
- Link to v2: https://lore.kernel.org/r/20250414-kunit-mips-v2-0-4cf01e1a29e6@linutronix.…
Changes in v2:
- Fix usercopy kunit test by handling ABI-less tasks in stack_top()
- Drop change to mm initialization.
The broken test is not built by default anymore.
- Link to v1: https://lore.kernel.org/r/20250212-kunit-mips-v1-0-eb49c9d76615@linutronix.…
---
Thomas Weißschuh (2):
MIPS: Don't crash in stack_top() for tasks without ABI or vDSO
kunit: qemu_configs: Add MIPS configurations
arch/mips/kernel/process.c | 16 +++++++++-------
tools/testing/kunit/qemu_configs/mips.py | 18 ++++++++++++++++++
tools/testing/kunit/qemu_configs/mips64.py | 19 +++++++++++++++++++
tools/testing/kunit/qemu_configs/mips64el.py | 19 +++++++++++++++++++
tools/testing/kunit/qemu_configs/mipsel.py | 18 ++++++++++++++++++
5 files changed, 83 insertions(+), 7 deletions(-)
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20241014-kunit-mips-e4fe1c265ed7
Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
Tests may wish to add other interfaces to listen on. Notably locally
generated traffic uses dummy interfaces. The multicast daemon needs to know
about these so that it allows forming rules that involve these interfaces,
and so that net.ipv4.conf.X.mc_forwarding is set for the interfaces.
To that end, allow passing in a list of interfaces to configure in addition
to all the physical ones.
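The effect on the generated daemon configuration can be sketched in Python as
follows. The "phyint <name> enable" line format is taken from the diff below;
the function name and the interface names (swp1, swp2, dummy1) are made up for
illustration.

```python
def mcd_config_lines(physical_ifs, extra_ifs=()):
    """Return one 'phyint' line per physical and per extra interface."""
    return [f"phyint {name} enable" for name in (*physical_ifs, *extra_ifs)]


# Physical ports first, then any interfaces the test passed in explicitly.
print("\n".join(mcd_config_lines(["swp1", "swp2"], ["dummy1"])))
```

Calling it without extra interfaces yields the same configuration as before
the change, which is why existing callers are unaffected.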
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor(a)blackwall.org>
---
Notes:
v2:
- Adjust as per shellcheck citations
- Retain Nik's R-b, the changes were very minor.
---
CC: Shuah Khan <shuah(a)kernel.org>
CC: linux-kselftest(a)vger.kernel.org
tools/testing/selftests/net/forwarding/lib.sh | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 253847372062..83ee6a07e072 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -1760,9 +1760,12 @@ mc_send()
 
 adf_mcd_start()
 {
+	local ifs=("$@")
+
 	local table_name="$MCD_TABLE_NAME"
 	local smcroutedir
 	local pid
+	local if
 	local i
 
 	check_command "$MCD" || return 1
@@ -1776,6 +1779,16 @@ adf_mcd_start()
 			"$smcroutedir/$table_name.conf"
 	done
 
+	for if in "${ifs[@]}"; do
+		if ! ip_link_has_flag "$if" MULTICAST; then
+			ip link set dev "$if" multicast on
+			defer ip link set dev "$if" multicast off
+		fi
+
+		echo "phyint $if enable" >> \
+			"$smcroutedir/$table_name.conf"
+	done
+
 	"$MCD" -N -I "$table_name" -f "$smcroutedir/$table_name.conf" \
 		-P "$smcroutedir/$table_name.pid"
 	busywait "$BUSYWAIT_TIMEOUT" test -e "$smcroutedir/$table_name.pid"
--
2.49.0
Initially netpoll and netconsole were created together, and some
functions ended up in the wrong file. Separate the netconsole-only functions
into netconsole, avoiding exports.
1. Expose netpoll logging macros in the public header to enable consistent
log formatting across netpoll consumers.
2. Relocate netconsole-specific functions from netpoll to the netconsole
module where they are actually used, reducing unnecessary coupling.
3. Remove unnecessary function exports
4. Rename netpoll parsing functions in netconsole to better reflect their
specific usage.
5. Create a test to check that cmdline works fine. This has been on my todo
list since [1], and this was a good time to add it to make sure this
patchset doesn't cause regressions.
PS: The code was split in a way that makes it easy to review. When copying
the functions from netpoll to netconsole, I do not change them other
than adding `static`. This will make checkpatch unhappy, but further
patches will address the issues. It is done this way to make it easy for
reviewers.
Link: https://lore.kernel.org/netdev/Z36TlACdNMwFD7wv@dev-ushankar.dev.purestorag… [1]
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Changes in v3:
- The cleanup on the netcons_cmdline.sh test was not cleaning the
netdevsim. Clean it at the end of the test. (Jakub)
- Link to v2: https://lore.kernel.org/r/20250611-rework-v2-0-ab1d92b458ca@debian.org
Changes in v2:
- No change in the code. Just rebased the patches onto netnext/main
- Link to v1: https://lore.kernel.org/r/20250610-rework-v1-0-7cfde283f246@debian.org
---
Breno Leitao (8):
netpoll: remove __netpoll_cleanup from exported API
netpoll: expose netpoll logging macros in public header
netpoll: relocate netconsole-specific functions to netconsole module
netpoll: move netpoll_print_options to netconsole
netconsole: rename functions to better reflect their purpose
netconsole: improve code style in parser function
selftests: net: Refactor cleanup logic in lib_netcons.sh
selftests: net: add netconsole test for cmdline configuration
drivers/net/netconsole.c | 137 ++++++++++++++++++++-
include/linux/netpoll.h | 10 +-
net/core/netpoll.c | 136 +-------------------
tools/testing/selftests/drivers/net/Makefile | 1 +
.../selftests/drivers/net/lib/sh/lib_netcons.sh | 59 ++++++---
.../selftests/drivers/net/netcons_cmdline.sh | 52 ++++++++
6 files changed, 240 insertions(+), 155 deletions(-)
---
base-commit: 6d4e01d29d87356924f1521ca6df7a364e948f13
change-id: 20250603-rework-c175cad8d22e
Best regards,
--
Breno Leitao <leitao(a)debian.org>
The following build warnings were noticed while building selftests/mm
with clang nightly toolchain for arm64 and x86_64 architectures.
Regressions found on arm64 and x86_64
- Build/clang-nightly-lkftconfig-kselftest
Regression Analysis:
- New regression? Yes
- Reproducibility? Yes
Build regression: selftests mm pkey_sighandler_tests.c warning
duplicate 'inline' declaration specifier [-Wduplicate-decl-specifier]
Build regression: selftests mm mremap_test.c warning pointer
comparison always evaluates to false [-Wtautological-compare]
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
## Build log
make[4]: Entering directory '/builds/linux/tools/testing/selftests/mm'
/bin/sh ./check_config.sh clang --target=aarch64-linux-gnu
-fintegrated-as -Werror=unknown-warning-option
-Werror=ignored-optimization-argument -Werror=option-ignored
-Werror=unused-command-line-argument --target=aarch64-linux-gnu
-fintegrated-as
CC cow
CC compaction_test
CC gup_longterm
CC gup_test
CC hmm-tests
CC hugetlb-madvise
CC hugetlb-read-hwpoison
CC hugetlb-soft-offline
CC hugepage-mmap
CC hugepage-mremap
CC hugepage-shm
CC hugepage-vmemmap
CC khugepaged
CC madv_populate
CC map_fixed_noreplace
CC map_hugetlb
CC map_populate
CC memfd_secret
CC migration
CC mkdirty
CC mlock-random-test
CC mlock2-tests
CC mrelease_test
CC mremap_dontunmap
CC mremap_test
mremap_test.c:425:31: warning: pointer comparison always evaluates to
false [-Wtautological-compare]
425 | if (addr + c.dest_alignment < addr) {
| ^
1 warning generated.
CC mseal_test
CC on-fault-limit
CC pagemap_ioctl
CC pfnmap
CC thuge-gen
CC transhuge-stress
CC uffd-stress
CC uffd-unit-tests
CC uffd-wp-mremap
CC split_huge_page_test
CC ksm_tests
CC ksm_functional_tests
CC mdwe_test
CC hugetlb_fault_after_madv
CC hugetlb_madv_vs_map
CC hugetlb_dio
CC droppable
CC guard-regions
CC merge
CC protection_keys
CC pkey_sighandler_tests
pkey_sighandler_tests.c:44:15: warning: duplicate 'inline' declaration
specifier [-Wduplicate-decl-specifier]
44 | static inline __always_inline
| ^
/usr/lib/gcc-cross/aarch64-linux-gnu/12/../../../../aarch64-linux-gnu/include/sys/cdefs.h:424:26:
note: expanded from macro '__always_inline'
424 | # define __always_inline __inline __attribute__ ((__always_inline__))
| ^
1 warning generated.
CC va_high_addr_switch
CC virtual_address_range
CC write_to_hugetlbfs
Warning: missing Module.symvers, please have the kernel built first.
page_frag test will be skipped.
make[4]: Leaving directory '/builds/linux/tools/testing/selftests/mm'
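On the mremap_test.c warning: in C, pointer arithmetic that leaves the
underlying object is undefined, so clang may assume that
"addr + c.dest_alignment < addr" never holds, and reports the comparison as
always false. One common way to keep such a wrap-around check meaningful is to
do it on unsigned integers instead of pointers. The Python sketch below only
models that arithmetic under the assumption of a 64-bit address space; it is
not necessarily the fix the selftest maintainers would choose, and the helper
names are made up.

```python
UINTPTR_MAX = (1 << 64) - 1  # assumption: 64-bit addresses


def add_would_wrap(addr: int, offset: int) -> bool:
    """True if addr + offset does not fit in an unsigned 64-bit integer."""
    return offset > UINTPTR_MAX - addr


def wrapped_sum(addr: int, offset: int) -> int:
    """Model of unsigned 64-bit addition, which wraps on overflow."""
    return (addr + offset) & UINTPTR_MAX


addr = UINTPTR_MAX - 4095   # last page of the address space
align = 8192
# For unsigned integers both formulations agree; for C pointers only the
# integer-based one is well defined.
print(add_would_wrap(addr, align), wrapped_sum(addr, align) < addr)
```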
## Source
* Kernel version: 6.16.0-rc2
* Git tree: https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
* Git sha: 050f8ad7b58d9079455af171ac279c4b9b828c11
* Git describe: next-20250616
* Project details:
https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250616/
* Architectures: arm64, x86_64
* Toolchains: clang nightly
* Kconfigs: selftest/mm/config+defconfig+
## Build arm64
* Build log: https://qa-reports.linaro.org/api/testruns/28765515/log_file/
* Build details:
https://regressions.linaro.org/lkft/linux-next-master/next-20250616/log-par…
* Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/2ya0viPHafKAe0u89drI…
* Kernel config:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2ya0viPHafKAe0u89drI…
## Steps to reproduce on arm64
- tuxmake --runtime podman --target-arch arm64 --toolchain clang-20 \
--kconfig defconfig \
--kconfig-add https://gitlab.com/Linaro/lkft/kernel-fragments/-/raw/main/netdev.config \
--kconfig-add https://gitlab.com/Linaro/lkft/kernel-fragments/-/raw/main/systemd.config \
--kconfig-add CONFIG_SYN_COOKIES=y \
--kconfig-add CONFIG_SCHEDSTATS=y LLVM=1 LLVM_IAS=1 debugkernel \
dtbs dtbs-legacy headers kernel kselftest modules
--
Linaro LKFT
https://lkft.linaro.org
This patch series introduces a new feature to netconsole which allows
appending a message ID to the userdata dictionary.
If the msgid feature is enabled, the message ID is built from a per-target
32-bit counter that is incremented and appended to every message sent to the
target.
Example::
  echo 1 > "/sys/kernel/config/netconsole/cmdline0/userdata/msgid_enabled"
  echo "This is message #1" > /dev/kmsg
  echo "This is message #2" > /dev/kmsg
  13,434,54928466,-;This is message #1
   msgid=1
  13,435,54934019,-;This is message #2
   msgid=2
This feature can be used by the target to detect if messages were dropped or
reordered before reaching the target. This allows system administrators to
assess the reliability of their netconsole pipeline and detect loss of messages
due to network contention or temporary unavailability.
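As an illustration of how a receiver might use the field, here is a small
Python sketch that extracts msgid values and counts gaps and out-of-order
arrivals. It assumes the "msgid=N" line format from the example above and a
counter that wraps at 2^32; the function names and the "closer than half the
counter range" heuristic are choices made for this sketch, not part of the
series.

```python
import re

MSGID_RE = re.compile(r"^\s*msgid=(\d+)\s*$")
WRAP = 1 << 32  # the counter is described as 32 bits wide


def extract_msgids(lines):
    """Pull the msgid values out of received netconsole lines."""
    return [int(m.group(1)) for line in lines if (m := MSGID_RE.match(line))]


def check_sequence(msgids):
    """Return (dropped, reordered) estimates for a list of msgids."""
    dropped = 0
    reordered = 0
    expected = None
    for msgid in msgids:
        if expected is not None and msgid != expected:
            ahead = (msgid - expected) % WRAP
            if ahead < WRAP // 2:
                dropped += ahead      # jumped forward: messages missing so far
            else:
                reordered += 1        # older than expected: late arrival
                continue              # keep waiting for the same next id
        expected = (msgid + 1) % WRAP
    return dropped, reordered


log = [
    "13,434,54928466,-;This is message #1",
    " msgid=1",
    "13,435,54934019,-;This is message #2",
    " msgid=2",
]
print(extract_msgids(log), check_sequence(extract_msgids(log)))
```

Note that a message counted as dropped may still arrive later and then be
counted as reordered, so the two numbers are estimates rather than exact
figures.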
Suggested-by: Breno Leitao <leitao(a)debian.org>
Signed-off-by: Gustavo Luiz Duarte <gustavold(a)gmail.com>
---
Changes in v2:
- Use wrapping_assign_add() to avoid warnings in UBSAN and friends.
- Improve documentation to clarify wrapping and distinguish msgid from sequnum.
- Rebase and fix conflict in prepare_extradata().
- Link to v1: https://lore.kernel.org/r/20250611-netconsole-msgid-v1-0-1784a51feb1e@gmail…
---
Gustavo Luiz Duarte (5):
netconsole: introduce 'msgid' as a new sysdata field
netconsole: implement configfs for msgid_enabled
netconsole: append msgid to sysdata
selftests: netconsole: Add tests for 'msgid' feature in sysdata
docs: netconsole: document msgid feature
Documentation/networking/netconsole.rst | 32 +++++++++++
drivers/net/netconsole.c | 65 ++++++++++++++++++++++
.../selftests/drivers/net/netcons_sysdata.sh | 30 ++++++++++
3 files changed, 127 insertions(+)
---
base-commit: 535de528015b56e34a40a8e1eb1629fadf809a84
change-id: 20250609-netconsole-msgid-b93c6f8e9c60
Best regards,
--
Gustavo Luiz Duarte <gustavold(a)gmail.com>
From: Chia-Yu Chang <chia-yu.chang(a)nokia-bell-labs.com>
Hello,
Please find the DualPI2 patch v19.
This patch series adds DualPI Improved with a Square (DualPI2) with the following features:
* Supports congestion controls that comply with the Prague requirements in RFC9331 (e.g. TCP-Prague)
* Coupled dual-queue that separates the L4S traffic into a low-latency queue (L-queue), without harming the remaining traffic that is scheduled in the classic queue (C-queue), thanks to congestion coupling using PI2 as defined in RFC9332
* Configurable overload strategies
* Use of sojourn time to reliably estimate queue delay
* Supports the ECN L4S identifier (IP.ECN==0b*1) to classify traffic into the respective queues
For more details on DualPI2, please refer to IETF RFC9332 (https://datatracker.ietf.org/doc/html/rfc9332).
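To make the classification and coupling ideas above concrete, here is a
simplified Python sketch. The ECN match follows the 0b*1 rule from the feature
list, and the probability relations follow RFC9332 (p_C = p'^2, p_CL = k * p',
p_L = max(p'_L, p_CL)). The function names, the default coupling factor k = 2
and the clamping to 1.0 are assumptions of this sketch; it does not reproduce
the fixed-point arithmetic or the code structure of sch_dualpi2.c.

```python
def is_l4s(ecn: int, ecn_mask: int = 0b01) -> bool:
    """ECT(1) = 0b01 and CE = 0b11 match the L4S identifier 0b*1."""
    return (ecn & ecn_mask) != 0


def coupled_probabilities(p_base: float, k: float = 2.0,
                          p_l_native: float = 0.0):
    """Return (p_C, p_L) derived from the base PI probability p'."""
    p_c = p_base ** 2                 # classic queue: squared probability
    p_cl = min(k * p_base, 1.0)       # coupled probability for the L-queue
    p_l = max(p_l_native, p_cl)       # L-queue also has its own native AQM
    return p_c, p_l


for ecn in (0b00, 0b01, 0b10, 0b11):
    print(f"ECN={ecn:02b} -> {'L-queue' if is_l4s(ecn) else 'C-queue'}")
print(coupled_probabilities(0.1))
```

The squared classic probability is what lets a classic flow and an L4S flow
see comparable throughput from one shared base probability.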
Best regards,
Chia-Yu
---
v19 (14-Jun-2025)
- Fix one typo in the comment of #1 (ALOK TIWARI <alok.a.tiwari(a)oracle.com>)
- Update commit message of #4 (ALOK TIWARI <alok.a.tiwari(a)oracle.com>)
- Wrap long lines of Documentation/netlink/specs/tc.yaml to within 80 characters (Jakub Kicinski <kuba(a)kernel.org>)
v18 (13-Jun-2025)
- Add the num of enum used by DualPI2 and fix name and name-prefix of DualPI2 enum and attribute
- Replace from_timer() with timer_container_of() (Pedro Tammela <pctammela(a)mojatatu.com>)
v17 (25-May-2025, Resent at 11-Jun-2025)
- Replace 0xffffffff with U32_MAX (Paolo Abeni <pabeni(a)redhat.com>)
- Use helper function qdisc_dequeue_internal() and add new helper function skb_apply_step() (Paolo Abeni <pabeni(a)redhat.com>)
- Add s64 casting when calculating the delta of the PI controller (Paolo Abeni <pabeni(a)redhat.com>)
- Change the drop reason into SKB_DROP_REASON_QDISC_CONGESTED for drop_early (Paolo Abeni <pabeni(a)redhat.com>)
- Modify the condition to remove the original skb when enqueuing multiple GSO segments (Paolo Abeni <pabeni(a)redhat.com>)
- Add READ_ONCE() in dualpi2_dump_stat() (Paolo Abeni <pabeni(a)redhat.com>)
- Add comments and brackets for readability (Paolo Abeni <pabeni(a)redhat.com>)
v16 (16-May-2025)
- Add qdisc_lock() to dualpi2_timer() (Paolo Abeni <pabeni(a)redhat.com>)
- Introduce convert_us_to_nsec() to convert usec to nsec without overflow in #1 (Paolo Abeni <pabeni(a)redhat.com>)
- Update convert_ns_to_usec() to convert nsec to usec without overflow in #2 (Paolo Abeni <pabeni(a)redhat.com>)
- Add more descriptions with respect to DualPI2 in the cover letter and add changelog in each patch (Paolo Abeni <pabeni(a)redhat.com>)
v15 (09-May-2025)
- Add enum of TCA_DUALPI2_ECN_MASK_CLA_ECT to remove potential leakage in #1 (Simon Horman <horms(a)kernel.org>)
- Fix one typo in comment of #2
- Update tc.yaml in #5 to align with the updated enum of pkt_sched.h
v14 (05-May-2025)
- Modify tc.yaml: (1) Replace flags with enum and remove enum-as-flags, (2) Remove credit-queue in xstats, and (3) Change attribute types (Donald Hunter <donald.hun
- Add enum and fix the ordering of variables in pkt_sched.h to align with the modified tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Add validators for DROP_OVERLOAD, DROP_EARLY, ECN_MASK, and SPLIT_GSO in sch_dualpi2.c (Donald Hunter <donald.hunter(a)gmail.com>)
- Update dualpi2.json to align with the updated variable order in pkt_sched.h
- Reorder patches (Donald Hunter <donald.hunter(a)gmail.com>)
v13 (26-Apr-2025)
- Use dashes in member names to follow YNL conventions in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Define enumerations separately for flags of drop-early, drop-overload, ecn-mask, credit-queue in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Change the types of split-gso and step-packets into flag in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Revert to u32/u8 types for tc-dualpi2-xstats members in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Add new test cases in tc-tests/qdiscs/dualpi2.json to cover all dualpi2 parameters (Donald Hunter <donald.hunter(a)gmail.com>)
- Change the type of TCA_DUALPI2_STEP_PACKETS into NLA_FLAG (Donald Hunter <donald.hunter(a)gmail.com>)
v12 (22-Apr-2025)
- Remove anonymous struct in sch_dualpi2.c (Paolo Abeni <pabeni(a)redhat.com>)
- Replace u32/u8 with uint and s32 with int in tc spec document (Paolo Abeni <pabeni(a)redhat.com>)
- Introduce get_memory_limit function to handle potential overflow when multiplying limit with MTU (Paolo Abeni <pabeni(a)redhat.com>)
- Double the packet length to further include packet overhead in memory_limit (Paolo Abeni <pabeni(a)redhat.com>)
- Remove the check of qdisc_qlen(sch) when calling qdisc_tree_reduce_backlog (Paolo Abeni <pabeni(a)redhat.com>)
v11 (15-Apr-2025)
- Replace hrtimer_init with hrtimer_setup in sch_dualpi2.c
v10 (25-Mar-2025)
- Remove leftover include in include/linux/netdevice.h and anonymous struct in sch_dualpi2.c (Paolo Abeni <pabeni(a)redhat.com>)
- Use kfree_skb_reason() and add SKB_DROP_REASON_DUALPI2_STEP_DROP drop reason (Paolo Abeni <pabeni(a)redhat.com>)
- Split sch_dualpi2.c into 3 patches (and overall 5 patches): Struct definition & parsing, Dump stats & configuration, Enqueue/Dequeue (Paolo Abeni <pabeni(a)redhat.com>)
v9 (16-Mar-2025)
- Fix mem_usage error in previous version
- Add min_qlen_step to the dualpi2 attribute as the minimum queue length in number of packets in the L-queue to start step threshold marking.
In previous versions, this value was fixed to 2, so the step threshold was applied to mark packets in the L queue only when the queue length of the L queue was greater than or equal to 2 packets.
This will cause larger queuing delays for L4S traffic at low rates (<20Mbps). So we parameterize it and change the default value to 0.
Comparison of tcp_1down run 'HTB 20Mbit + DUALPI2 + 10ms base delay'
Old versions:
                        avg   median         # data pts
Ping (ms) ICMP   :    11.55    11.70 ms             350
TCP upload avg   :    18.96      N/A Mbits/s        350
TCP upload sum   :    18.96      N/A Mbits/s        350
New version (v9):
                        avg   median         # data pts
Ping (ms) ICMP   :    10.81    10.70 ms             350
TCP upload avg   :    18.91      N/A Mbits/s        350
TCP upload sum   :    18.91      N/A Mbits/s        350
Comparison of tcp_1down run 'HTB 10Mbit + DUALPI2 + 10ms base delay'
Old versions:
                        avg   median         # data pts
Ping (ms) ICMP   :    12.61    12.80 ms             350
TCP upload avg   :     9.48      N/A Mbits/s        350
TCP upload sum   :     9.48      N/A Mbits/s        350
New version (v9):
                        avg   median         # data pts
Ping (ms) ICMP   :    11.06    10.80 ms             350
TCP upload avg   :     9.43      N/A Mbits/s        350
TCP upload sum   :     9.43      N/A Mbits/s        350
Comparison of tcp_1down run 'HTB 10Mbit + DUALPI2 + 10ms base delay'
Old versions:
                        avg   median         # data pts
Ping (ms) ICMP   :    40.86    37.45 ms             350
TCP upload avg   :     0.88      N/A Mbits/s        350
TCP upload sum   :     0.88      N/A Mbits/s        350
TCP upload::1    :     0.88     0.97 Mbits/s        350
New version (v9):
                        avg   median         # data pts
Ping (ms) ICMP   :    11.07    10.40 ms             350
TCP upload avg   :     0.55      N/A Mbits/s        350
TCP upload sum   :     0.55      N/A Mbits/s        350
TCP upload::1    :     0.55     0.59 Mbits/s        350
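The effect of min_qlen_step described in the v9 notes can be sketched as a
simple predicate. This is only an illustration of the stated behaviour (step
marking applies once the L-queue holds at least min_qlen_step packets); the
function name, the strict comparison against the threshold and the 1000 us
placeholder threshold are assumptions of the sketch, not taken from
sch_dualpi2.c.

```python
def step_mark(sojourn_us: int, qlen_pkts: int,
              step_thresh_us: int = 1000, min_qlen_step: int = 0) -> bool:
    """Decide whether an L-queue packet gets a step (threshold) mark."""
    return qlen_pkts >= min_qlen_step and sojourn_us > step_thresh_us


# One packet queued for 1.5 ms on a slow link:
print(step_mark(1500, qlen_pkts=1, min_qlen_step=2))  # previous fixed value
print(step_mark(1500, qlen_pkts=1, min_qlen_step=0))  # new default
```

At low rates a single packet can already sit in the queue for longer than the
threshold, so with the old fixed value of 2 it went unmarked, which matches
the higher ping times in the "Old versions" rows.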
v8 (11-Mar-2025)
- Fix warning messages in v7
v7 (07-Mar-2025)
- Separate into 3 patches to avoid mixing changes of documentation, selftest, and code. (Cong Wang <xiyou.wangcong(a)gmail.com>)
v6 (04-Mar-2025)
- Add modprobe for dualpi2 in tc-testing script tc-testing/tdc.sh (Jakub Kicinski <kuba(a)kernel.org>)
- Update test cases in dualpi2.json
- Update commit message
v5 (22-Feb-2025)
- A comparison was done between MQ + DUALPI2, MQ + FQ_PIE, MQ + FQ_CODEL:
Unshaped 1gigE with 4 download streams test:
- Summary of tcp_4down run 'MQ + FQ_CODEL':
                        avg   median         # data pts
Ping (ms) ICMP   :     1.19     1.34 ms             349
TCP download avg :   235.42      N/A Mbits/s        349
TCP download sum :   941.68      N/A Mbits/s        349
TCP download::1  :   235.19   235.39 Mbits/s        349
TCP download::2  :   235.03   235.35 Mbits/s        349
TCP download::3  :   236.89   235.44 Mbits/s        349
TCP download::4  :   234.57   235.19 Mbits/s        349
- Summary of tcp_4down run 'MQ + FQ_PIE':
                        avg   median         # data pts
Ping (ms) ICMP   :     1.21     1.37 ms             350
TCP download avg :   235.42      N/A Mbits/s        350
TCP download sum :   941.61      N/A Mbits/s        350
TCP download::1  :   232.54   233.13 Mbits/s        350
TCP download::2  :   232.52   232.80 Mbits/s        350
TCP download::3  :   233.14   233.78 Mbits/s        350
TCP download::4  :   243.41   241.48 Mbits/s        350
- Summary of tcp_4down run 'MQ + DUALPI2':
                        avg   median         # data pts
Ping (ms) ICMP   :     1.19     1.34 ms             349
TCP download avg :   235.42      N/A Mbits/s        349
TCP download sum :   941.68      N/A Mbits/s        349
TCP download::1  :   235.19   235.39 Mbits/s        349
TCP download::2  :   235.03   235.35 Mbits/s        349
TCP download::3  :   236.89   235.44 Mbits/s        349
TCP download::4  :   234.57   235.19 Mbits/s        349
Unshaped 1gigE with 128 download streams test:
- Summary of tcp_128down run 'MQ + FQ_CODEL':
                        avg   median         # data pts
Ping (ms) ICMP   :     1.88     1.86 ms             350
TCP download avg :     7.39      N/A Mbits/s        350
TCP download sum :   946.47      N/A Mbits/s        350
- Summary of tcp_128down run 'MQ + FQ_PIE':
                        avg   median         # data pts
Ping (ms) ICMP   :     1.88     1.86 ms             350
TCP download avg :     7.39      N/A Mbits/s        350
TCP download sum :   946.47      N/A Mbits/s        350
- Summary of tcp_128down run 'MQ + DUALPI2':
                        avg   median         # data pts
Ping (ms) ICMP   :     1.88     1.86 ms             350
TCP download avg :     7.39      N/A Mbits/s        350
TCP download sum :   946.47      N/A Mbits/s        350
Unshaped 10gigE with 4 download streams test:
- Summary of tcp_4down run 'MQ + FQ_CODEL':
                        avg   median         # data pts
Ping (ms) ICMP   :     0.22     0.23 ms             350
TCP download avg :  2354.08      N/A Mbits/s        350
TCP download sum :  9416.31      N/A Mbits/s        350
TCP download::1  :  2353.65  2352.81 Mbits/s        350
TCP download::2  :  2354.54  2354.21 Mbits/s        350
TCP download::3  :  2353.56  2353.78 Mbits/s        350
TCP download::4  :  2354.56  2354.45 Mbits/s        350
- Summary of tcp_4down run 'MQ + FQ_PIE':
                        avg   median         # data pts
Ping (ms) ICMP   :     0.20     0.19 ms             350
TCP download avg :  2354.76      N/A Mbits/s        350
TCP download sum :  9419.04      N/A Mbits/s        350
TCP download::1  :  2354.77  2353.89 Mbits/s        350
TCP download::2  :  2353.41  2354.29 Mbits/s        350
TCP download::3  :  2356.18  2354.19 Mbits/s        350
TCP download::4  :  2354.68  2353.15 Mbits/s        350
- Summary of tcp_4down run 'MQ + DUALPI2':
                        avg   median         # data pts
Ping (ms) ICMP   :     0.24     0.24 ms             350
TCP download avg :  2354.11      N/A Mbits/s        350
TCP download sum :  9416.43      N/A Mbits/s        350
TCP download::1  :  2354.75  2353.93 Mbits/s        350
TCP download::2  :  2353.15  2353.75 Mbits/s        350
TCP download::3  :  2353.49  2353.72 Mbits/s        350
TCP download::4  :  2355.04  2353.73 Mbits/s        350
Unshaped 10gigE with 128 download streams test:
- Summary of tcp_128down run 'MQ + FQ_CODEL':
                        avg   median         # data pts
Ping (ms) ICMP   :     7.57     8.69 ms             350
TCP download avg :    73.97      N/A Mbits/s        350
TCP download sum :  9467.82      N/A Mbits/s        350
- Summary of tcp_128down run 'MQ + FQ_PIE':
                        avg   median         # data pts
Ping (ms) ICMP   :     7.82     8.91 ms             350
TCP download avg :    73.97      N/A Mbits/s        350
TCP download sum :  9468.42      N/A Mbits/s        350
- Summary of tcp_128down run 'MQ + DUALPI2':
                        avg   median         # data pts
Ping (ms) ICMP   :     6.87     7.93 ms             350
TCP download avg :    73.95      N/A Mbits/s        350
TCP download sum :  9465.87      N/A Mbits/s        350
From the results shown above, we see small differences between combinations.
- Update commit message to include results of no_split_gso and split_gso (Dave Taht <dave.taht(a)gmail.com> and Paolo Abeni <pabeni(a)redhat.com>)
- Add memlimit in the dualpi2 attribute, and add memory_used, max_memory_used, memory_limit in dualpi2 stats (Dave Taht <dave.taht(a)gmail.com>)
- Update note in sch_dualpi2.c related to BBRv3 status (Dave Taht <dave.taht(a)gmail.com>)
- Update license identifier (Dave Taht <dave.taht(a)gmail.com>)
- Add selftest in tools/testing/selftests/tc-testing (Cong Wang <xiyou.wangcong(a)gmail.com>)
- Use netlink policies for parameter checks (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Modify texts & fix typos in Documentation/netlink/specs/tc.yaml (Dave Taht <dave.taht(a)gmail.com>)
- Add descriptions of packet counter statistics and the reset function of sch_dualpi2.c
- Fix step_thresh in packets
- Update code comments in sch_dualpi2.c
v4 (22-Oct-2024)
- Update statement in Kconfig for DualPI2 (Stephen Hemminger <stephen(a)networkplumber.org>)
- Put a blank line after #define in sch_dualpi2.c (Stephen Hemminger <stephen(a)networkplumber.org>)
- Fix line length warning.
v3 (19-Oct-2024)
- Fix compilation error
- Update Documentation/netlink/specs/tc.yaml (Jakub Kicinski <kuba(a)kernel.org>)
v2 (18-Oct-2024)
- Add Documentation/netlink/specs/tc.yaml (Jakub Kicinski <kuba(a)kernel.org>)
- Use dualpi2 instead of skb prefix (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Replace nla_parse_nested_deprecated with nla_parse_nested (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Fix line length warning
---
Chia-Yu Chang (4):
sched: Struct definition and parsing of dualpi2 qdisc
sched: Dump configuration and statistics of dualpi2 qdisc
selftests/tc-testing: Add selftests for qdisc DualPI2
Documentation: netlink: specs: tc: Add DualPI2 specification
Koen De Schepper (1):
sched: Add enqueue/dequeue of dualpi2 qdisc
Documentation/netlink/specs/tc.yaml | 166 +++
include/net/dropreason-core.h | 6 +
include/uapi/linux/pkt_sched.h | 70 +-
net/sched/Kconfig | 12 +
net/sched/Makefile | 1 +
net/sched/sch_dualpi2.c | 1146 +++++++++++++++++
tools/testing/selftests/tc-testing/config | 1 +
.../tc-testing/tc-tests/qdiscs/dualpi2.json | 254 ++++
tools/testing/selftests/tc-testing/tdc.sh | 1 +
9 files changed, 1656 insertions(+), 1 deletion(-)
create mode 100644 net/sched/sch_dualpi2.c
create mode 100644 tools/testing/selftests/tc-testing/tc-tests/qdiscs/dualpi2.json
--
2.34.1
Check hugetlbfs support before starting tests in run_hugetlbfs_test.sh.
Otherwise on a system that does not support hugetlbfs the free huge
pages availability check will fail with:
./run_hugetlbfs_test.sh: line 47: [: -lt: unary operator expected
./run_hugetlbfs_test.sh: line 60: 12577 Aborted (core dumped) ./memfd_test hugetlbfs
Aborted (core dumped)
It will also leave a fuse_mnt process behind, which may cause some
unexpected issues.
Patch tested with a kernel that does not have hugetlbfs support enabled
and the test was skipped as expected.
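The "unary operator expected" error above is what the shell prints when the
value being compared is empty, which is presumably what happens when the
kernel exposes no HugePages_Free entry. A rough Python sketch of the kind of
up-front check described here, working on the text of /proc/filesystems and
/proc/meminfo (the function names are hypothetical, and the actual patch may
detect support differently):

```python
def hugetlbfs_supported(proc_filesystems: str) -> bool:
    """True if 'hugetlbfs' is listed among the registered filesystems."""
    return any(line.split()[-1] == "hugetlbfs"
               for line in proc_filesystems.splitlines() if line.strip())


def free_hugepages(proc_meminfo: str):
    """Return HugePages_Free as an int, or None if the field is absent."""
    for line in proc_meminfo.splitlines():
        if line.startswith("HugePages_Free:"):
            return int(line.split()[1])
    return None


with_support = "nodev\tsysfs\nnodev\thugetlbfs\n\text4\n"
without_support = "nodev\tsysfs\n\text4\n"
for text in (with_support, without_support):
    if hugetlbfs_supported(text):
        print("hugetlbfs available, run the test")
    else:
        print("hugetlbfs not supported, skip the test")
```

Returning None for a missing field makes the "not supported" case explicit
instead of letting an empty string reach a numeric comparison.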
Po-Hsu Lin (1):
selftests/memfd: skip hugetlbfs test if not supported
tools/testing/selftests/memfd/run_hugetlbfs_test.sh | 5 +++++
1 file changed, 5 insertions(+)
--
2.34.1
This started with a patch that enabled `clippy::ptr_as_ptr`. Benno
Lossin suggested I also look into `clippy::ptr_cast_constness` and I
discovered `clippy::as_ptr_cast_mut`. This series now enables all 3
lints. It also enables `clippy::as_underscore` which ensures other
pointer casts weren't missed.
As a later addition, `clippy::cast_lossless` and `clippy::ref_as_ptr`
are also enabled.
This series depends on "rust: retain pointer mut-ness in
`container_of!`"[1].
Link: https://lore.kernel.org/all/20250409-container-of-mutness-v1-1-64f472b94534… [1]
Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Changes in v11:
- Rebase on v6.16-rc1.
- Replace some `as <integer>` with `as bindings::T` and others with `as
ffi::T`. (Miguel Ojeda)
- Revert explicit `ffi::c_void` import which is in the prelude. (Miguel Ojeda)
- Link to v10: https://lore.kernel.org/r/20250418-ptr-as-ptr-v10-0-3d63d27907aa@gmail.com
Changes in v10:
- Move fragment from "rust: enable `clippy::ptr_cast_constness` lint" to
"rust: enable `clippy::ptr_as_ptr` lint". (Boqun Feng)
- Replace `(...).into()` with `T::from(...)` where the destination type
isn't obvious in "rust: enable `clippy::cast_lossless` lint". (Boqun
Feng)
- Link to v9: https://lore.kernel.org/r/20250416-ptr-as-ptr-v9-0-18ec29b1b1f3@gmail.com
Changes in v9:
- Replace ref-to-ptr coercion using `let` bindings with
`core::ptr::from_{ref,mut}`. (Boqun Feng).
- Link to v8: https://lore.kernel.org/r/20250409-ptr-as-ptr-v8-0-3738061534ef@gmail.com
Changes in v8:
- Use coercion to go ref -> ptr.
- rustfmt.
- Rebase on v6.15-rc1.
- Extract first commit to its own series as it is shared with other
series.
- Link to v7: https://lore.kernel.org/r/20250325-ptr-as-ptr-v7-0-87ab452147b9@gmail.com
Changes in v7:
- Add patch to enable `clippy::ref_as_ptr`.
- Link to v6: https://lore.kernel.org/r/20250324-ptr-as-ptr-v6-0-49d1b7fd4290@gmail.com
Changes in v6:
- Drop strict provenance patch.
- Fix URLs in doc comments.
- Add patch to enable `clippy::cast_lossless`.
- Rebase on rust-next.
- Link to v5: https://lore.kernel.org/r/20250317-ptr-as-ptr-v5-0-5b5f21fa230a@gmail.com
Changes in v5:
- Use `pointer::addr` in OF. (Boqun Feng)
- Add documentation on stubs. (Benno Lossin)
- Mark stubs `#[inline]`.
- Pick up Alice's RB on a shared commit from
https://lore.kernel.org/all/Z9f-3Aj3_FWBZRrm@google.com/.
- Link to v4: https://lore.kernel.org/r/20250315-ptr-as-ptr-v4-0-b2d72c14dc26@gmail.com
Changes in v4:
- Add missing SoB. (Benno Lossin)
- Use `without_provenance_mut` in alloc. (Boqun Feng)
- Limit strict provenance lints to the `kernel` crate to avoid complex
logic in the build system. This can be revisited on MSRV >= 1.84.0.
- Rebase on rust-next.
- Link to v3: https://lore.kernel.org/r/20250314-ptr-as-ptr-v3-0-e7ba61048f4a@gmail.com
Changes in v3:
- Fixed clippy warning in rust/kernel/firmware.rs. (kernel test robot)
Link: https://lore.kernel.org/all/202503120332.YTCpFEvv-lkp@intel.com/
- s/as u64/as bindings::phys_addr_t/g. (Benno Lossin)
- Use strict provenance APIs and enable lints. (Benno Lossin)
- Link to v2: https://lore.kernel.org/r/20250309-ptr-as-ptr-v2-0-25d60ad922b7@gmail.com
Changes in v2:
- Fixed typo in first commit message.
- Added additional patches, converted to series.
- Link to v1: https://lore.kernel.org/r/20250307-ptr-as-ptr-v1-1-582d06514c98@gmail.com
---
Tamir Duberstein (6):
rust: enable `clippy::ptr_as_ptr` lint
rust: enable `clippy::ptr_cast_constness` lint
rust: enable `clippy::as_ptr_cast_mut` lint
rust: enable `clippy::as_underscore` lint
rust: enable `clippy::cast_lossless` lint
rust: enable `clippy::ref_as_ptr` lint
Makefile | 6 ++++
drivers/gpu/drm/drm_panic_qr.rs | 4 +--
rust/bindings/lib.rs | 3 ++
rust/kernel/alloc/allocator_test.rs | 2 +-
rust/kernel/alloc/kvec.rs | 4 +--
rust/kernel/block/mq/operations.rs | 2 +-
rust/kernel/block/mq/request.rs | 11 +++++--
rust/kernel/device.rs | 4 +--
rust/kernel/device_id.rs | 4 +--
rust/kernel/devres.rs | 17 +++++------
rust/kernel/dma.rs | 6 ++--
rust/kernel/drm/device.rs | 6 ++--
rust/kernel/error.rs | 2 +-
rust/kernel/firmware.rs | 3 +-
rust/kernel/fs/file.rs | 2 +-
rust/kernel/io.rs | 18 ++++++------
rust/kernel/kunit.rs | 11 ++++---
rust/kernel/list/impl_list_item_mod.rs | 2 +-
rust/kernel/miscdevice.rs | 2 +-
rust/kernel/mm/virt.rs | 52 +++++++++++++++++-----------------
rust/kernel/net/phy.rs | 4 +--
rust/kernel/of.rs | 6 ++--
rust/kernel/pci.rs | 11 ++++---
rust/kernel/platform.rs | 4 ++-
rust/kernel/print.rs | 6 ++--
rust/kernel/seq_file.rs | 2 +-
rust/kernel/str.rs | 14 ++++-----
rust/kernel/sync/poll.rs | 2 +-
rust/kernel/time/hrtimer/pin.rs | 2 +-
rust/kernel/time/hrtimer/pin_mut.rs | 2 +-
rust/kernel/uaccess.rs | 4 +--
rust/kernel/workqueue.rs | 8 +++---
rust/uapi/lib.rs | 3 ++
33 files changed, 128 insertions(+), 101 deletions(-)
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250307-ptr-as-ptr-21b1867fc4d4
Best regards,
--
Tamir Duberstein <tamird(a)gmail.com>
Hello,
this series follows some discussions started in [1] around limitations
of bpf trampolines in specific cases. When a trampoline is
generated for a target function involving many arguments, it has to
properly find and save the arguments that have been passed on the stack.
While this is doable with basic types (e.g. scalars), it brings more
uncertainty when dealing with specific types like structs (many ABIs
allow passing structures by value if they fit in a register or a pair of
registers). The issue is that the layout of those structures and their
location on the stack can be altered (e.g. with attributes like packed
or aligned(x)), and this kind of alteration is not encoded in DWARF or
BTF, making the trampolines clueless about the needed adjustments. Rather
than trying to support this specific case, as agreed in [2], this series
aims to properly deny it.
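To illustrate the layout problem described above, here is a small
user-space sketch. It is not taken from the series: the struct names are
hypothetical, and it assumes a GCC- or Clang-compatible compiler for the
`__attribute__` syntax. The three structs declare the same members, yet
their size, alignment and member offsets differ:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structs: identical members, different attributes. */
struct plain_arg {
	char tag;
	long value;
};

struct packed_arg {
	char tag;
	long value;
} __attribute__((packed));

struct aligned_arg {
	char tag;
	long value;
} __attribute__((aligned(16)));

/* packed removes the padding between tag and value. */
static_assert(offsetof(struct packed_arg, value) == 1,
	      "packed struct has no padding after tag");
/* aligned(16) raises the alignment the caller must honour. */
static_assert(_Alignof(struct aligned_arg) == 16,
	      "aligned struct is 16-byte aligned");

/* Print size, alignment and offset of 'value' for each variant. */
void print_layouts(void)
{
	printf("plain:   size=%zu align=%zu value@%zu\n",
	       sizeof(struct plain_arg), _Alignof(struct plain_arg),
	       offsetof(struct plain_arg, value));
	printf("packed:  size=%zu align=%zu value@%zu\n",
	       sizeof(struct packed_arg), _Alignof(struct packed_arg),
	       offsetof(struct packed_arg, value));
	printf("aligned: size=%zu align=%zu value@%zu\n",
	       sizeof(struct aligned_arg), _Alignof(struct aligned_arg),
	       offsetof(struct aligned_arg, value));
}
```

Calling print_layouts() from a test program shows the three layouts side
by side. A consumer that only knows the member list cannot tell which of
them the caller used.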
It targets all the architectures currently implementing
arch_prepare_bpf_trampoline (except aarch64, since it has been handled
while adding the support for many args):
- x86
- s390
- riscv
- powerpc
A small validation function is added in the JIT compiler for each of
those architectures, ensuring that no argument passed on stack is a
struct. If one is, trampoline creation is cancelled. Any argument check
already implemented in a JIT compiler has been moved into this new function.
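The kind of check described above could look roughly like the following
user-space sketch. Everything here is an assumption made for
illustration: `struct func_model`, `validate_stack_args()`, the constants
and the error code are hypothetical and are not the names used in the
patches, and the register accounting is deliberately simplified (one
8-byte slot per register, six argument registers as on x86-64).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_REG_ARGS 6   /* assumption: six argument registers, as on x86-64 */
#define MAX_ARGS     12  /* hypothetical upper bound for this sketch */
#define REG_SIZE     8   /* bytes held by one argument register */

/* Hypothetical, simplified description of a target function's arguments. */
struct func_model {
	int nr_args;
	unsigned char arg_size[MAX_ARGS];  /* size of each argument in bytes */
	bool arg_is_struct[MAX_ARGS];      /* true if passed by value as a struct */
};

/*
 * Return 0 if every struct argument fits in registers, or -ENOTSUP if
 * at least one struct argument would have to be passed on the stack.
 */
int validate_stack_args(const struct func_model *m)
{
	int used_regs = 0;

	assert(m->nr_args >= 0 && m->nr_args <= MAX_ARGS);

	for (int i = 0; i < m->nr_args; i++) {
		int regs = (m->arg_size[i] + REG_SIZE - 1) / REG_SIZE;

		if (used_regs + regs <= MAX_REG_ARGS) {
			used_regs += regs;  /* argument fits in registers */
			continue;
		}
		/* This argument is passed on the stack. */
		if (m->arg_is_struct[i])
			return -ENOTSUP;
	}
	return 0;
}
```

In this sketch, scalars that spill to the stack are still accepted; only
a struct that spills causes the model to be rejected.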
On top of that, it updates the tracing_struct_many_args test, which
now merely checks that this case is indeed denied.
[1] https://lore.kernel.org/bpf/20250411-many_args_arm64-v1-0-0a32fe72339e@boot…
[2] https://lore.kernel.org/bpf/CAADnVQKr3ftNt1uQVrXBE0a2o37ZYRo2PHqCoHUnw6PE5T…
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore(a)bootlin.com>
---
Alexis Lothoré (eBPF Foundation) (7):
bpf/x86: use define for max regs count used for arguments
bpf/x86: prevent trampoline attachment when args location on stack is uncertain
bpf/riscv: prevent trampoline attachment when args location on stack is uncertain
bpf/s390: prevent trampoline attachment when args location on stack is uncertain
bpf/powerpc64: use define for max regs count used for arguments
bpf/powerpc64: prevent trampoline attachment when args location on stack is uncertain
selftests/bpf: ensure that functions passing structs on stack can not be hooked
arch/powerpc/net/bpf_jit_comp.c | 38 ++++++++++--
arch/riscv/net/bpf_jit_comp64.c | 26 +++++++-
arch/s390/net/bpf_jit_comp.c | 33 ++++++++--
arch/x86/net/bpf_jit_comp.c | 50 ++++++++++++----
.../selftests/bpf/prog_tests/tracing_struct.c | 37 +-----------
.../selftests/bpf/progs/tracing_struct_many_args.c | 70 ----------------------
.../testing/selftests/bpf/test_kmods/bpf_testmod.c | 43 ++-----------
7 files changed, 129 insertions(+), 168 deletions(-)
---
base-commit: c4f4f8da70044d8b28fccf73016b4119f3e2fd50
change-id: 20250609-deny_trampoline_structs_on_stack-5bbc7bc20dd1
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
skb_ensure_writable actually makes sure that the header of the skb is
writable, and doesn't touch the payload. It doesn't need an
skb_frags_readable check.
Removing this check restores DSCP functionality with unreadable skbs as
it's called from dscp_tg.
Fixes: 65249feb6b3d ("net: add support for skbs with unreadable frags")
Signed-off-by: Mina Almasry <almasrymina(a)google.com>
---
net/core/skbuff.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 85fc82f72d26..d6420b74ea9c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6261,9 +6261,6 @@ int skb_ensure_writable(struct sk_buff *skb, unsigned int write_len)
if (!pskb_may_pull(skb, write_len))
return -ENOMEM;
- if (!skb_frags_readable(skb))
- return -EFAULT;
-
if (!skb_cloned(skb) || skb_clone_writable(skb, write_len))
return 0;
base-commit: 6d4e01d29d87356924f1521ca6df7a364e948f13
--
2.50.0.rc1.591.g9c95f17f64-goog
From: Chia-Yu Chang <chia-yu.chang(a)nokia-bell-labs.com>
Hello,
Please find the DualPI2 patch v18.
This patch series adds DualPI Improved with a Square (DualPI2) with the following features:
* Supports congestion controls that comply with the Prague requirements in RFC9331 (e.g. TCP-Prague)
* Coupled dual-queue that separates the L4S traffic in a low latency queue (L-queue), without harming remaining traffic that is scheduled in classic queue (C-queue) due to congestion-coupling using PI2 as defined in RFC9332
* Configurable overload strategies
* Use of sojourn time to reliably estimate queue delay
* Supports ECN L4S-identifier (IP.ECN==0b*1) to classify traffic into respective queues
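As an illustration of the last item, the sketch below classifies a packet
from its ECN field using the IP.ECN==0b*1 rule. It is a user-space
approximation, not code from the series: the enum and function names are
hypothetical. The codepoint values follow RFC 3168, where the ECN field
is the two low-order bits of the IPv4 TOS / IPv6 traffic class byte.

```c
#include <assert.h>
#include <stdint.h>

/* ECN codepoints (RFC 3168): two low-order bits of the TOS byte. */
#define ECN_MASK    0x03
#define ECN_NOT_ECT 0x00
#define ECN_ECT_1   0x01
#define ECN_ECT_0   0x02
#define ECN_CE      0x03

/* The 0b*1 rule selects exactly ECT(1) and CE. */
static_assert((ECN_ECT_1 & 0x01) && (ECN_CE & 0x01), "L4S codepoints");
static_assert(!(ECN_NOT_ECT & 0x01) && !(ECN_ECT_0 & 0x01), "classic codepoints");

/* Hypothetical queue identifiers for this sketch. */
enum dualq_queue { QUEUE_CLASSIC, QUEUE_L4S };

/* Classify a packet by its TOS / traffic class byte. */
enum dualq_queue classify_by_ecn(uint8_t tos)
{
	uint8_t ecn = tos & ECN_MASK;

	return (ecn & 0x01) ? QUEUE_L4S : QUEUE_CLASSIC;
}
```

With this rule, ECT(1) and CE packets go to the L-queue, while Not-ECT
and ECT(0) packets stay in the C-queue.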
For more details of DualPI2, please refer to IETF RFC9332 (https://datatracker.ietf.org/doc/html/rfc9332).
Best regards,
Chia-Yu
---
v18 (13-Jun-2025)
- Add the num of enum used by DualPI2 and fix name and name-prefix of DualPI2 enum and attribute
- Replace from_timer() with timer_container_of() (Pedro Tammela <pctammela(a)mojatatu.com>)
v17 (25-May-2025, Resent at 11-Jun-2025)
- Replace 0xffffffff with U32_MAX (Paolo Abeni <pabeni(a)redhat.com>)
- Use helper function qdisc_dequeue_internal() and add new helper function skb_apply_step() (Paolo Abeni <pabeni(a)redhat.com>)
- Add s64 casting when calculating the delta of the PI controller (Paolo Abeni <pabeni(a)redhat.com>)
- Change the drop reason into SKB_DROP_REASON_QDISC_CONGESTED for drop_early (Paolo Abeni <pabeni(a)redhat.com>)
- Modify the condition to remove the original skb when enqueuing multiple GSO segments (Paolo Abeni <pabeni(a)redhat.com>)
- Add READ_ONCE() in dualpi2_dump_stat() (Paolo Abeni <pabeni(a)redhat.com>)
- Add comments and brackets for readability (Paolo Abeni <pabeni(a)redhat.com>)
v16 (16-May-2025)
- Add qdisc_lock() in dualpi2_timer() (Paolo Abeni <pabeni(a)redhat.com>)
- Introduce convert_us_to_nsec() to convert usec to nsec without overflow in #1 (Paolo Abeni <pabeni(a)redhat.com>)
- Update convert_ns_to_usec() to convert nsec to usec without overflow in #2 (Paolo Abeni <pabeni(a)redhat.com>)
- Add more descriptions with respect to DualPI2 in the cover letter and add changelog in each patch (Paolo Abeni <pabeni(a)redhat.com>)
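For readers unfamiliar with the overflow concern behind the usec/nsec
conversion helpers in the v16 notes, here is a minimal user-space sketch.
The function names and the saturating behaviour are assumptions made for
illustration; the helpers in the patches may differ.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL

/* The largest 32-bit usec value still fits in 64 bits once scaled. */
static_assert((uint64_t)UINT32_MAX * NSEC_PER_USEC / NSEC_PER_USEC == UINT32_MAX,
	      "usec -> nsec cannot overflow 64 bits");

/* usec -> nsec: widen to 64 bits before multiplying. */
uint64_t us_to_ns(uint32_t us)
{
	return (uint64_t)us * NSEC_PER_USEC;
}

/* nsec -> usec: saturate at UINT32_MAX instead of silently truncating. */
uint32_t ns_to_us_saturating(uint64_t ns)
{
	uint64_t us = ns / NSEC_PER_USEC;

	return us > UINT32_MAX ? UINT32_MAX : (uint32_t)us;
}
```

The widening cast is what avoids the overflow in one direction; the clamp
is what avoids a wrapped result in the other.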
v15 (09-May-2025)
- Add enum of TCA_DUALPI2_ECN_MASK_CLA_ECT to remove potential leakage in #1 (Simon Horman <horms(a)kernel.org>)
- Fix one typo in comment of #2
- Update tc.yaml in #5 to align with the updated enum of pkt_sched.h
v14 (05-May-2025)
- Modify tc.yaml: (1) Replace flags with enum and remove enum-as-flags, (2) Remove credit-queue in xstats, and (3) Change attribute types (Donald Hunter <donald.hun
- Add enum and fix the ordering of variables in pkt_sched.h to align with the modified tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Add validators for DROP_OVERLOAD, DROP_EARLY, ECN_MASK, and SPLIT_GSO in sch_dualpi2.c (Donald Hunter <donald.hunter(a)gmail.com>)
- Update dualpi2.json to align with the updated variable order in pkt_sched.h
- Reorder patches (Donald Hunter <donald.hunter(a)gmail.com>)
v13 (26-Apr-2025)
- Use dashes in member names to follow YNL conventions in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Define enumerations separately for flags of drop-early, drop-overload, ecn-mask, credit-queue in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Change the types of split-gso and step-packets into flag in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Revert to u32/u8 types for tc-dualpi2-xstats members in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Add new test cases in tc-tests/qdiscs/dualpi2.json to cover all dualpi2 parameters (Donald Hunter <donald.hunter(a)gmail.com>)
- Change the type of TCA_DUALPI2_STEP_PACKETS into NLA_FLAG (Donald Hunter <donald.hunter(a)gmail.com>)
v12 (22-Apr-2025)
- Remove anonymous struct in sch_dualpi2.c (Paolo Abeni <pabeni(a)redhat.com>)
- Replace u32/u8 with uint and s32 with int in tc spec document (Paolo Abeni <pabeni(a)redhat.com>)
- Introduce get_memory_limit function to handle potential overflow when multiplying limit with MTU (Paolo Abeni <pabeni(a)redhat.com>)
- Double the packet length to further include packet overhead in memory_limit (Paolo Abeni <pabeni(a)redhat.com>)
- Remove the check of qdisc_qlen(sch) when calling qdisc_tree_reduce_backlog (Paolo Abeni <pabeni(a)redhat.com>)
v11 (15-Apr-2025)
- Replace hrtimer_init with hrtimer_setup in sch_dualpi2.c
v10 (25-Mar-2025)
- Remove leftover include in include/linux/netdevice.h and anonymous struct in sch_dualpi2.c (Paolo Abeni <pabeni(a)redhat.com>)
- Use kfree_skb_reason() and add SKB_DROP_REASON_DUALPI2_STEP_DROP drop reason (Paolo Abeni <pabeni(a)redhat.com>)
- Split sch_dualpi2.c into 3 patches (and overall 5 patches): Struct definition & parsing, Dump stats & configuration, Enqueue/Dequeue (Paolo Abeni <pabeni(a)redhat.com>)
v9 (16-Mar-2025)
- Fix mem_usage error in previous version
- Add min_qlen_step to the dualpi2 attribute as the minimum queue length in number of packets in the L-queue to start step threshold marking.
In previous versions, this value was fixed to 2, so the step threshold was applied to mark packets in the L queue only when the queue length of the L queue was greater than or equal to 2 packets.
This will cause larger queuing delays for L4S traffic at low rates (<20Mbps). So we parameterize it and change the default value to 0.
Comparison of tcp_1down run 'HTB 20Mbit + DUALPI2 + 10ms base delay'
Old versions:
avg median # data pts
Ping (ms) ICMP : 11.55 11.70 ms 350
TCP upload avg : 18.96 N/A Mbits/s 350
TCP upload sum : 18.96 N/A Mbits/s 350
New version (v9):
avg median # data pts
Ping (ms) ICMP : 10.81 10.70 ms 350
TCP upload avg : 18.91 N/A Mbits/s 350
TCP upload sum : 18.91 N/A Mbits/s 350
Comparison of tcp_1down run 'HTB 10Mbit + DUALPI2 + 10ms base delay'
Old versions:
avg median # data pts
Ping (ms) ICMP : 12.61 12.80 ms 350
TCP upload avg : 9.48 N/A Mbits/s 350
TCP upload sum : 9.48 N/A Mbits/s 350
New version (v9):
avg median # data pts
Ping (ms) ICMP : 11.06 10.80 ms 350
TCP upload avg : 9.43 N/A Mbits/s 350
TCP upload sum : 9.43 N/A Mbits/s 350
Comparison of tcp_1down run 'HTB 1Mbit + DUALPI2 + 10ms base delay'
Old versions:
avg median # data pts
Ping (ms) ICMP : 40.86 37.45 ms 350
TCP upload avg : 0.88 N/A Mbits/s 350
TCP upload sum : 0.88 N/A Mbits/s 350
TCP upload::1 : 0.88 0.97 Mbits/s 350
New version (v9):
avg median # data pts
Ping (ms) ICMP : 11.07 10.40 ms 350
TCP upload avg : 0.55 N/A Mbits/s 350
TCP upload sum : 0.55 N/A Mbits/s 350
TCP upload::1 : 0.55 0.59 Mbits/s 350
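The effect of min_qlen_step described in the v9 notes can be sketched as
follows. This is a simplified model, not the qdisc code: the struct, the
function name and the strict ">" comparison against the threshold are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical parameter bundle for this sketch. */
struct step_params {
	uint64_t step_thresh_ns;  /* sojourn-time threshold for step marking */
	uint32_t min_qlen_step;   /* minimum L-queue length (packets) for step marking */
};

/*
 * Decide whether a packet leaving the L-queue gets a step mark.
 * Marking is only considered once the queue holds at least
 * min_qlen_step packets; with min_qlen_step == 0 it always is.
 */
bool should_step_mark(const struct step_params *p, uint64_t sojourn_ns,
		      uint32_t l_qlen)
{
	assert(p);

	if (l_qlen < p->min_qlen_step)
		return false;
	return sojourn_ns > p->step_thresh_ns;
}
```

With the old fixed value of 2, a single queued packet is never marked no
matter how long it waited, which is consistent with the higher ping times
reported for the old versions at low rates.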
v8 (11-Mar-2025)
- Fix warning messages in v7
v7 (07-Mar-2025)
- Separate into 3 patches to avoid mixing changes of documentation, selftest, and code. (Cong Wang <xiyou.wangcong(a)gmail.com>)
v6 (04-Mar-2025)
- Add modprobe for dualpi2 in tc-testing script tc-testing/tdc.sh (Jakub Kicinski <kuba(a)kernel.org>)
- Update test cases in dualpi2.json
- Update commit message
v5 (22-Feb-2025)
- A comparison was done between MQ + DUALPI2, MQ + FQ_PIE, MQ + FQ_CODEL:
Unshaped 1gigE with 4 download streams test:
- Summary of tcp_4down run 'MQ + FQ_CODEL':
avg median # data pts
Ping (ms) ICMP : 1.19 1.34 ms 349
TCP download avg : 235.42 N/A Mbits/s 349
TCP download sum : 941.68 N/A Mbits/s 349
TCP download::1 : 235.19 235.39 Mbits/s 349
TCP download::2 : 235.03 235.35 Mbits/s 349
TCP download::3 : 236.89 235.44 Mbits/s 349
TCP download::4 : 234.57 235.19 Mbits/s 349
- Summary of tcp_4down run 'MQ + FQ_PIE'
avg median # data pts
Ping (ms) ICMP : 1.21 1.37 ms 350
TCP download avg : 235.42 N/A Mbits/s 350
TCP download sum : 941.61 N/A Mbits/s 350
TCP download::1 : 232.54 233.13 Mbits/s 350
TCP download::2 : 232.52 232.80 Mbits/s 350
TCP download::3 : 233.14 233.78 Mbits/s 350
TCP download::4 : 243.41 241.48 Mbits/s 350
- Summary of tcp_4down run 'MQ + DUALPI2'
avg median # data pts
Ping (ms) ICMP : 1.19 1.34 ms 349
TCP download avg : 235.42 N/A Mbits/s 349
TCP download sum : 941.68 N/A Mbits/s 349
TCP download::1 : 235.19 235.39 Mbits/s 349
TCP download::2 : 235.03 235.35 Mbits/s 349
TCP download::3 : 236.89 235.44 Mbits/s 349
TCP download::4 : 234.57 235.19 Mbits/s 349
Unshaped 1gigE with 128 download streams test:
- Summary of tcp_128down run 'MQ + FQ_CODEL':
avg median # data pts
Ping (ms) ICMP : 1.88 1.86 ms 350
TCP download avg : 7.39 N/A Mbits/s 350
TCP download sum : 946.47 N/A Mbits/s 350
- Summary of tcp_128down run 'MQ + FQ_PIE':
avg median # data pts
Ping (ms) ICMP : 1.88 1.86 ms 350
TCP download avg : 7.39 N/A Mbits/s 350
TCP download sum : 946.47 N/A Mbits/s 350
- Summary of tcp_128down run 'MQ + DUALPI2':
avg median # data pts
Ping (ms) ICMP : 1.88 1.86 ms 350
TCP download avg : 7.39 N/A Mbits/s 350
TCP download sum : 946.47 N/A Mbits/s 350
Unshaped 10gigE with 4 download streams test:
- Summary of tcp_4down run 'MQ + FQ_CODEL':
avg median # data pts
Ping (ms) ICMP : 0.22 0.23 ms 350
TCP download avg : 2354.08 N/A Mbits/s 350
TCP download sum : 9416.31 N/A Mbits/s 350
TCP download::1 : 2353.65 2352.81 Mbits/s 350
TCP download::2 : 2354.54 2354.21 Mbits/s 350
TCP download::3 : 2353.56 2353.78 Mbits/s 350
TCP download::4 : 2354.56 2354.45 Mbits/s 350
- Summary of tcp_4down run 'MQ + FQ_PIE':
avg median # data pts
Ping (ms) ICMP : 0.20 0.19 ms 350
TCP download avg : 2354.76 N/A Mbits/s 350
TCP download sum : 9419.04 N/A Mbits/s 350
TCP download::1 : 2354.77 2353.89 Mbits/s 350
TCP download::2 : 2353.41 2354.29 Mbits/s 350
TCP download::3 : 2356.18 2354.19 Mbits/s 350
TCP download::4 : 2354.68 2353.15 Mbits/s 350
- Summary of tcp_4down run 'MQ + DUALPI2':
avg median # data pts
Ping (ms) ICMP : 0.24 0.24 ms 350
TCP download avg : 2354.11 N/A Mbits/s 350
TCP download sum : 9416.43 N/A Mbits/s 350
TCP download::1 : 2354.75 2353.93 Mbits/s 350
TCP download::2 : 2353.15 2353.75 Mbits/s 350
TCP download::3 : 2353.49 2353.72 Mbits/s 350
TCP download::4 : 2355.04 2353.73 Mbits/s 350
Unshaped 10gigE with 128 download streams test:
- Summary of tcp_128down run 'MQ + FQ_CODEL':
avg median # data pts
Ping (ms) ICMP : 7.57 8.69 ms 350
TCP download avg : 73.97 N/A Mbits/s 350
TCP download sum : 9467.82 N/A Mbits/s 350
- Summary of tcp_128down run 'MQ + FQ_PIE':
avg median # data pts
Ping (ms) ICMP : 7.82 8.91 ms 350
TCP download avg : 73.97 N/A Mbits/s 350
TCP download sum : 9468.42 N/A Mbits/s 350
- Summary of tcp_128down run 'MQ + DUALPI2':
avg median # data pts
Ping (ms) ICMP : 6.87 7.93 ms 350
TCP download avg : 73.95 N/A Mbits/s 350
TCP download sum : 9465.87 N/A Mbits/s 350
From the results shown above, we see small differences between combinations.
- Update commit message to include results of no_split_gso and split_gso (Dave Taht <dave.taht(a)gmail.com> and Paolo Abeni <pabeni(a)redhat.com>)
- Add memlimit in the dualpi2 attribute, and add memory_used, max_memory_used, memory_limit in dualpi2 stats (Dave Taht <dave.taht(a)gmail.com>)
- Update note in sch_dualpi2.c related to BBRv3 status (Dave Taht <dave.taht(a)gmail.com>)
- Update license identifier (Dave Taht <dave.taht(a)gmail.com>)
- Add selftest in tools/testing/selftests/tc-testing (Cong Wang <xiyou.wangcong(a)gmail.com>)
- Use netlink policies for parameter checks (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Modify texts & fix typos in Documentation/netlink/specs/tc.yaml (Dave Taht <dave.taht(a)gmail.com>)
- Add descriptions of packet counter statistics and the reset function of sch_dualpi2.c
- Fix step_thresh in packets
- Update code comments in sch_dualpi2.c
v4 (22-Oct-2024)
- Update statement in Kconfig for DualPI2 (Stephen Hemminger <stephen(a)networkplumber.org>)
- Put a blank line after #define in sch_dualpi2.c (Stephen Hemminger <stephen(a)networkplumber.org>)
- Fix line length warning.
v3 (19-Oct-2024)
- Fix compilation error
- Update Documentation/netlink/specs/tc.yaml (Jakub Kicinski <kuba(a)kernel.org>)
v2 (18-Oct-2024)
- Add Documentation/netlink/specs/tc.yaml (Jakub Kicinski <kuba(a)kernel.org>)
- Use dualpi2 instead of skb prefix (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Replace nla_parse_nested_deprecated with nla_parse_nested (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Fix line length warning
---
Chia-Yu Chang (4):
sched: Struct definition and parsing of dualpi2 qdisc
sched: Dump configuration and statistics of dualpi2 qdisc
selftests/tc-testing: Add selftests for qdisc DualPI2
Documentation: netlink: specs: tc: Add DualPI2 specification
Koen De Schepper (1):
sched: Add enqueue/dequeue of dualpi2 qdisc
Documentation/netlink/specs/tc.yaml | 161 +++
include/net/dropreason-core.h | 6 +
include/uapi/linux/pkt_sched.h | 70 +-
net/sched/Kconfig | 12 +
net/sched/Makefile | 1 +
net/sched/sch_dualpi2.c | 1146 +++++++++++++++++
tools/testing/selftests/tc-testing/config | 1 +
.../tc-testing/tc-tests/qdiscs/dualpi2.json | 254 ++++
tools/testing/selftests/tc-testing/tdc.sh | 1 +
9 files changed, 1651 insertions(+), 1 deletion(-)
create mode 100644 net/sched/sch_dualpi2.c
create mode 100644 tools/testing/selftests/tc-testing/tc-tests/qdiscs/dualpi2.json
--
2.34.1
From: Chia-Yu Chang <chia-yu.chang(a)nokia-bell-labs.com>
Hello,
Please find the DualPI2 patch v18.
This patch serise adds DualPI Improved with a Square (DualPI2) with following features:
* Supports congestion controls that comply with the Prague requirements in RFC9331 (e.g. TCP-Prague)
* Coupled dual-queue that separates the L4S traffic in a low latency queue (L-queue), without harming remaining traffic that is scheduled in classic queue (C-queue) due to congestion-coupling using PI2 as defined in RFC9332
* Configurable overload strategies
* Use of sojourn time to reliably estimate queue delay
* Supports ECN L4S-identifier (IP.ECN==0b*1) to classify traffic into respective queues
For more details of DualPI2, please refer IETF RFC9332 (https://datatracker.ietf.org/doc/html/rfc9332).
Best regards,
Chia-Yu
---
v18 (13-Jun-2025)
- Add the num of enum used by DualPI2 and fix name and name-prefix of DualPI2 enum and attribute
- Replace from_timer() with timer_container_of() (Pedro Tammela <pctammela(a)mojatatu.com>)
v17 (25-May-2025, Resent at 11-Jun-2025)
- Replace 0xffffffff with U32_MAX (Paolo Abeni <pabeni(a)redhat.com>)
- Use helper function qdisc_dequeue_internal() and add new helper function skb_apply_step() (Paolo Abeni <pabeni(a)redhat.com>)
- Add s64 casting when calculating the delta of the PI controller (Paolo Abeni <pabeni(a)redhat.com>)
- Change the drop reason into SKB_DROP_REASON_QDISC_CONGESTED for drop_early (Paolo Abeni <pabeni(a)redhat.com>)
- Modify the condition to remove the original skb when enqueuing multiple GSO segments (Paolo Abeni <pabeni(a)redhat.com>)
- Add READ_ONCE() in dualpi2_dump_stat() (Paolo Abeni <pabeni(a)redhat.com>)
- Add comments, brackets, and brackets for readability (Paolo Abeni <pabeni(a)redhat.com>)
v16 (16-MAy-2025)
- Add qdisc_lock() to dualpi2_timer() in dualpi2_timer (Paolo Abeni <pabeni(a)redhat.com>)
- Introduce convert_ns_to_usec() to convert usec to nsec without overflow in #1 (Paolo Abeni <pabeni(a)redhat.com>)
- Update convert_us_tonsec() to convert nsec to usec without overflow in #2 (Paolo Abeni <pabeni(a)redhat.com>)
- Add more descriptions with respect to DualPI2 in the cover ltter and add changelog in each patch (Paolo Abeni <pabeni(a)redhat.com>)
v15 (09-May-2025)
- Add enum of TCA_DUALPI2_ECN_MASK_CLA_ECT to remove potential leakeage in #1 (Simon Horman <horms(a)kernel.org>)
- Fix one typo in comment of #2
- Update tc.yaml in #5 to aligh with the updated enum of pkt_sched.h
v14 (05-May-2025)
- Modify tc.yaml: (1) Replace flags with enum and remove enum-as-flags, (2) Remove credit-queue in xstats, and (3) Change attribute types (Donald Hunter <donald.hun
- Add enum and fix the ordering of variables in pkt_sched.h to align with the modified tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Add validators for DROP_OVERLOAD, DROP_EARLY, ECN_MASK, and SPLIT_GSO in sch_dualpi2.c (Donald Hunter <donald.hunter(a)gmail.com>)
- Update dualpi2.json to align with the updated variable order in pkt_sched.h
- Reorder patches (Donald Hunter <donald.hunter(a)gmail.com>)
v13 (26-Apr-2025)
- Use dashes in member names to follow YNL conventions in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Define enumerations separately for flags of drop-early, drop-overload, ecn-mask, credit-queue in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Change the types of split-gso and step-packets into flag in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Revert to u32/u8 types for tc-dualpi2-xstats members in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Add new test cases in tc-tests/qdiscs/dualpi2.json to cover all dualpi2 parameters (Donald Hunter <donald.hunter(a)gmail.com>)
- Change the type of TCA_DUALPI2_STEP_PACKETS into NLA_FLAG (Donald Hunter <donald.hunter(a)gmail.com>)
v12 (22-Apr-2025)
- Remove anonymous struct in sch_dualpi2.c (Paolo Abeni <pabeni(a)redhat.com>)
- Replace u32/u8 with uint and s32 with int in tc spec document (Paolo Abeni <pabeni(a)redhat.com>)
- Introduce get_memory_limit function to handle potential overflow when multipling limit with MTU (Paolo Abeni <pabeni(a)redhat.com>)
- Double the packet length to further include packet overhead in memory_limit (Paolo Abeni <pabeni(a)redhat.com>)
- Remove the check of qdisc_qlen(sch) when calling qdisc_tree_reduce_backlog (Paolo Abeni <pabeni(a)redhat.com>)
v11 (15-Apr-2025)
- Replace hstimer_init with hstimer_setup in sch_dualpi2.c
v10 (25-Mar-2025)
- Remove leftover include in include/linux/netdevice.h and anonymous struct in sch_dualpi2.c (Paolo Abeni <pabeni(a)redhat.com>)
- Use kfree_skb_reason() and add SKB_DROP_REASON_DUALPI2_STEP_DROP drop reason (Paolo Abeni <pabeni(a)redhat.com>)
- Split sch_dualpi2.c into 3 patches (and overall 5 patches): Struct definition & parsing, Dump stats & configuration, Enqueue/Dequeue (Paolo Abeni <pabeni(a)redhat.com>)
v9 (16-Mar-2025)
- Fix mem_usage error in previous version
- Add min_qlen_step to the dualpi2 attribute as the minimum queue length in number of packets in the L-queue to start step threshold marking.
In previous versions, this value was fixed to 2, so the step threshold was applied to mark packets in the L queue only when the queue length of the L queue was greater than or equal to 2 packets.
This will cause larger queuing delays for L4S traffic at low rates (<20Mbps). So we parameterize it and change the default value to 0.
Comparison of tcp_1down run 'HTB 20Mbit + DUALPI2 + 10ms base delay'
Old versions:
avg median # data pts
Ping (ms) ICMP : 11.55 11.70 ms 350
TCP upload avg : 18.96 N/A Mbits/s 350
TCP upload sum : 18.96 N/A Mbits/s 350
New version (v9):
avg median # data pts
Ping (ms) ICMP : 10.81 10.70 ms 350
TCP upload avg : 18.91 N/A Mbits/s 350
TCP upload sum : 18.91 N/A Mbits/s 350
Comparison of tcp_1down run 'HTB 10Mbit + DUALPI2 + 10ms base delay'
Old versions:
avg median # data pts
Ping (ms) ICMP : 12.61 12.80 ms 350
TCP upload avg : 9.48 N/A Mbits/s 350
TCP upload sum : 9.48 N/A Mbits/s 350
New version (v9):
avg median # data pts
Ping (ms) ICMP : 11.06 10.80 ms 350
TCP upload avg : 9.43 N/A Mbits/s 350
TCP upload sum : 9.43 N/A Mbits/s 350
Comparison of tcp_1down run 'HTB 10Mbit + DUALPI2 + 10ms base delay'
Old versions:
avg median # data pts
Ping (ms) ICMP : 40.86 37.45 ms 350
TCP upload avg : 0.88 N/A Mbits/s 350
TCP upload sum : 0.88 N/A Mbits/s 350
TCP upload::1 : 0.88 0.97 Mbits/s 350
New version (v9):
avg median # data pts
Ping (ms) ICMP : 11.07 10.40 ms 350
TCP upload avg : 0.55 N/A Mbits/s 350
TCP upload sum : 0.55 N/A Mbits/s 350
TCP upload::1 : 0.55 0.59 Mbits/s 350
v8 (11-Mar-2025)
- Fix warning messages in v7
v7 (07-Mar-2025)
- Separate into 3 patches to avoid mixing changes of documentation, selftest, and code. (Cong Wang <xiyou.wangcong(a)gmail.com>)
v6 (04-Mar-2025)
- Add modprobe for dulapi2 in tc-testing script tc-testing/tdc.sh (Jakub Kicinski <kuba(a)kernel.org>)
- Update test cases in dualpi2.json
- Update commit message
v5 (22-Feb-2025)
- A comparison was done between MQ + DUALPI2, MQ + FQ_PIE, MQ + FQ_CODEL:
Unshaped 1gigE with 4 download streams test:
- Summary of tcp_4down run 'MQ + FQ_CODEL':
avg median # data pts
Ping (ms) ICMP : 1.19 1.34 ms 349
TCP download avg : 235.42 N/A Mbits/s 349
TCP download sum : 941.68 N/A Mbits/s 349
TCP download::1 : 235.19 235.39 Mbits/s 349
TCP download::2 : 235.03 235.35 Mbits/s 349
TCP download::3 : 236.89 235.44 Mbits/s 349
TCP download::4 : 234.57 235.19 Mbits/s 349
- Summary of tcp_4down run 'MQ + FQ_PIE'
avg median # data pts
Ping (ms) ICMP : 1.21 1.37 ms 350
TCP download avg : 235.42 N/A Mbits/s 350
TCP download sum : 941.61 N/A Mbits/s 350
TCP download::1 : 232.54 233.13 Mbits/s 350
TCP download::2 : 232.52 232.80 Mbits/s 350
TCP download::3 : 233.14 233.78 Mbits/s 350
TCP download::4 : 243.41 241.48 Mbits/s 350
- Summary of tcp_4down run 'MQ + DUALPI2'
avg median # data pts
Ping (ms) ICMP : 1.19 1.34 ms 349
TCP download avg : 235.42 N/A Mbits/s 349
TCP download sum : 941.68 N/A Mbits/s 349
TCP download::1 : 235.19 235.39 Mbits/s 349
TCP download::2 : 235.03 235.35 Mbits/s 349
TCP download::3 : 236.89 235.44 Mbits/s 349
TCP download::4 : 234.57 235.19 Mbits/s 349
Unshaped 1gigE with 128 download streams test:
- Summary of tcp_128down run 'MQ + FQ_CODEL':
avg median # data pts
Ping (ms) ICMP : 1.88 1.86 ms 350
TCP download avg : 7.39 N/A Mbits/s 350
TCP download sum : 946.47 N/A Mbits/s 350
- Summary of tcp_128down run 'MQ + FQ_PIE':
avg median # data pts
Ping (ms) ICMP : 1.88 1.86 ms 350
TCP download avg : 7.39 N/A Mbits/s 350
TCP download sum : 946.47 N/A Mbits/s 350
- Summary of tcp_128down run 'MQ + DUALPI2':
avg median # data pts
Ping (ms) ICMP : 1.88 1.86 ms 350
TCP download avg : 7.39 N/A Mbits/s 350
TCP download sum : 946.47 N/A Mbits/s 350
Unshaped 10gigE with 4 download streams test:
- Summary of tcp_4down run 'MQ + FQ_CODEL':
avg median # data pts
Ping (ms) ICMP : 0.22 0.23 ms 350
TCP download avg : 2354.08 N/A Mbits/s 350
TCP download sum : 9416.31 N/A Mbits/s 350
TCP download::1 : 2353.65 2352.81 Mbits/s 350
TCP download::2 : 2354.54 2354.21 Mbits/s 350
TCP download::3 : 2353.56 2353.78 Mbits/s 350
TCP download::4 : 2354.56 2354.45 Mbits/s 350
- Summary of tcp_4down run 'MQ + FQ_PIE':
avg median # data pts
Ping (ms) ICMP : 0.20 0.19 ms 350
TCP download avg : 2354.76 N/A Mbits/s 350
TCP download sum : 9419.04 N/A Mbits/s 350
TCP download::1 : 2354.77 2353.89 Mbits/s 350
TCP download::2 : 2353.41 2354.29 Mbits/s 350
TCP download::3 : 2356.18 2354.19 Mbits/s 350
TCP download::4 : 2354.68 2353.15 Mbits/s 350
- Summary of tcp_4down run 'MQ + DUALPI2':
avg median # data pts
Ping (ms) ICMP : 0.24 0.24 ms 350
TCP download avg : 2354.11 N/A Mbits/s 350
TCP download sum : 9416.43 N/A Mbits/s 350
TCP download::1 : 2354.75 2353.93 Mbits/s 350
TCP download::2 : 2353.15 2353.75 Mbits/s 350
TCP download::3 : 2353.49 2353.72 Mbits/s 350
TCP download::4 : 2355.04 2353.73 Mbits/s 350
Unshaped 10gigE with 128 download streams test:
- Summary of tcp_128down run 'MQ + FQ_CODEL':
avg median # data pts
Ping (ms) ICMP : 7.57 8.69 ms 350
TCP download avg : 73.97 N/A Mbits/s 350
TCP download sum : 9467.82 N/A Mbits/s 350
- Summary of tcp_128down run 'MQ + FQ_PIE':
avg median # data pts
Ping (ms) ICMP : 7.82 8.91 ms 350
TCP download avg : 73.97 N/A Mbits/s 350
TCP download sum : 9468.42 N/A Mbits/s 350
- Summary of tcp_128down run 'MQ + DUALPI2':
avg median # data pts
Ping (ms) ICMP : 6.87 7.93 ms 350
TCP download avg : 73.95 N/A Mbits/s 350
TCP download sum : 9465.87 N/A Mbits/s 350
From the results shown above, we see small differences between combinations.
- Update commit message to include results of no_split_gso and split_gso (Dave Taht <dave.taht(a)gmail.com> and Paolo Abeni <pabeni(a)redhat.com>)
- Add memlimit in the dualpi2 attribute, and add memory_used, max_memory_used, memory_limit in dualpi2 stats (Dave Taht <dave.taht(a)gmail.com>)
- Update note in sch_dualpi2.c related to BBRv3 status (Dave Taht <dave.taht(a)gmail.com>)
- Update license identifier (Dave Taht <dave.taht(a)gmail.com>)
- Add selftest in tools/testing/selftests/tc-testing (Cong Wang <xiyou.wangcong(a)gmail.com>)
- Use netlink policies for parameter checks (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Modify texts & fix typos in Documentation/netlink/specs/tc.yaml (Dave Taht <dave.taht(a)gmail.com>)
- Add descriptions of packet counter statistics and the reset function of sch_dualpi2.c
- Fix step_thresh in packets
- Update code comments in sch_dualpi2.c
v4 (22-Oct-2024)
- Update statement in Kconfig for DualPI2 (Stephen Hemminger <stephen(a)networkplumber.org>)
- Put a blank line after #define in sch_dualpi2.c (Stephen Hemminger <stephen(a)networkplumber.org>)
- Fix line length warning.
v3 (19-Oct-2024)
- Fix compilation error
- Update Documentation/netlink/specs/tc.yaml (Jakub Kicinski <kuba(a)kernel.org>)
v2 (18-Oct-2024)
- Add Documentation/netlink/specs/tc.yaml (Jakub Kicinski <kuba(a)kernel.org>)
- Use dualpi2 instead of skb prefix (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Replace nla_parse_nested_deprecated with nla_parse_nested (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Fix line length warning
---
Chia-Yu Chang (4):
sched: Struct definition and parsing of dualpi2 qdisc
sched: Dump configuration and statistics of dualpi2 qdisc
selftests/tc-testing: Add selftests for qdisc DualPI2
Documentation: netlink: specs: tc: Add DualPI2 specification
Koen De Schepper (1):
sched: Add enqueue/dequeue of dualpi2 qdisc
Documentation/netlink/specs/tc.yaml | 161 +++
include/net/dropreason-core.h | 6 +
include/uapi/linux/pkt_sched.h | 70 +-
net/sched/Kconfig | 12 +
net/sched/Makefile | 1 +
net/sched/sch_dualpi2.c | 1146 +++++++++++++++++
tools/testing/selftests/tc-testing/config | 1 +
.../tc-testing/tc-tests/qdiscs/dualpi2.json | 254 ++++
tools/testing/selftests/tc-testing/tdc.sh | 1 +
9 files changed, 1651 insertions(+), 1 deletion(-)
create mode 100644 net/sched/sch_dualpi2.c
create mode 100644 tools/testing/selftests/tc-testing/tc-tests/qdiscs/dualpi2.json
--
2.34.1
This commit adds a new kernel selftest to verify RTNLGRP_IPV4_MCADDR
and RTNLGRP_IPV6_MCADDR notifications. The test works by adding and
removing a dummy interface and then confirming that the system
correctly receives join and removal notifications for the 224.0.0.1
and ff02::1 multicast addresses.
The test requires iproute2 version 6.13 or later.
Tested by the following command:
$ vng -v --user root --cpus 16 -- \
make -C tools/testing/selftests TARGETS=net \
TEST_PROGS=rtnetlink_notification.sh \
TEST_GEN_PROGS="" run_tests
Cc: Maciej Żenczykowski <maze(a)google.com>
Cc: Lorenzo Colitti <lorenzo(a)google.com>
Signed-off-by: Yuyang Huang <yuyanghuang(a)google.com>
---
Changelog since v2:
- Move the test case to a separate file.
Changelog since v1:
- Skip the test if the iproute2 is too old.
tools/testing/selftests/net/Makefile | 1 +
.../selftests/net/rtnetlink_notification.sh | 159 ++++++++++++++++++
2 files changed, 160 insertions(+)
create mode 100755 tools/testing/selftests/net/rtnetlink_notification.sh
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 70a38f485d4d..ad258b25bc9d 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -40,6 +40,7 @@ TEST_PROGS += netns-name.sh
TEST_PROGS += link_netns.py
TEST_PROGS += nl_netdev.py
TEST_PROGS += rtnetlink.py
+TEST_PROGS += rtnetlink_notification.sh
TEST_PROGS += srv6_end_dt46_l3vpn_test.sh
TEST_PROGS += srv6_end_dt4_l3vpn_test.sh
TEST_PROGS += srv6_end_dt6_l3vpn_test.sh
diff --git a/tools/testing/selftests/net/rtnetlink_notification.sh b/tools/testing/selftests/net/rtnetlink_notification.sh
new file mode 100755
index 000000000000..a2c1afed5023
--- /dev/null
+++ b/tools/testing/selftests/net/rtnetlink_notification.sh
@@ -0,0 +1,159 @@
+#!/bin/bash
+#
+# This test is for checking rtnetlink notification callpaths, and get as much
+# coverage as possible.
+#
+# set -e
+
+ALL_TESTS="
+ kci_test_mcast_addr_notification
+"
+
+VERBOSE=0
+PAUSE=no
+PAUSE_ON_FAIL=no
+
+source lib.sh
+
+# set global exit status, but never reset nonzero one.
+check_err()
+{
+ if [ $ret -eq 0 ]; then
+ ret=$1
+ fi
+ [ -n "$2" ] && echo "$2"
+}
+
+run_cmd_common()
+{
+ local cmd="$*"
+ local out
+ if [ "$VERBOSE" = "1" ]; then
+ echo "COMMAND: ${cmd}"
+ fi
+ out=$($cmd 2>&1)
+ rc=$?
+ if [ "$VERBOSE" = "1" -a -n "$out" ]; then
+ echo " $out"
+ fi
+ return $rc
+}
+
+run_cmd() {
+ run_cmd_common "$@"
+ rc=$?
+ check_err $rc
+ return $rc
+}
+
+end_test()
+{
+ echo "$*"
+ [ "${VERBOSE}" = "1" ] && echo
+
+ if [[ $ret -ne 0 ]] && [[ "${PAUSE_ON_FAIL}" = "yes" ]]; then
+ echo "Hit enter to continue"
+ read a
+ fi;
+
+ if [ "${PAUSE}" = "yes" ]; then
+ echo "Hit enter to continue"
+ read a
+ fi
+
+}
+
+kci_test_mcast_addr_notification()
+{
+ local tmpfile
+ local monitor_pid
+ local match_result
+
+ tmpfile=$(mktemp)
+
+ ip monitor maddr > $tmpfile &
+ monitor_pid=$!
+ sleep 1
+ if [ ! -e "/proc/$monitor_pid" ]; then
+ end_test "SKIP: mcast addr notification: iproute2 too old"
+ rm $tmpfile
+ return $ksft_skip
+ fi
+
+ run_cmd ip link add name test-dummy1 type dummy
+ run_cmd ip link set test-dummy1 up
+ run_cmd ip link del dev test-dummy1
+ sleep 1
+
+ match_result=$(grep -cE "test-dummy1.*(224.0.0.1|ff02::1)" $tmpfile)
+
+ kill $monitor_pid
+ rm $tmpfile
+ # There should be 4 line matches as follows.
+ # 13: test-dummy1 inet6 mcast ff02::1 scope global
+ # 13: test-dummy1 inet mcast 224.0.0.1 scope global
+ # Deleted 13: test-dummy1 inet mcast 224.0.0.1 scope global
+ # Deleted 13: test-dummy1 inet6 mcast ff02::1 scope global
+ if [ $match_result -ne 4 ];then
+ end_test "FAIL: mcast addr notification"
+ return 1
+ fi
+ end_test "PASS: mcast addr notification"
+}
+
+kci_test_rtnl()
+{
+ local current_test
+ local ret=0
+
+ for current_test in ${TESTS:-$ALL_TESTS}; do
+ $current_test
+ check_err $?
+ done
+
+ return $ret
+}
+
+usage()
+{
+ cat <<EOF
+usage: ${0##*/} OPTS
+
+ -t <test> Test(s) to run (default: all)
+ (options: $(echo $ALL_TESTS))
+ -v Verbose mode (show commands and output)
+ -P Pause after every test
+ -p Pause after every failing test before cleanup (for debugging)
+EOF
+}
+
+#check for needed privileges
+if [ "$(id -u)" -ne 0 ];then
+ end_test "SKIP: Need root privileges"
+ exit $ksft_skip
+fi
+
+for x in ip;do
+ $x -Version 2>/dev/null >/dev/null
+ if [ $? -ne 0 ];then
+ end_test "SKIP: Could not run test without the $x tool"
+ exit $ksft_skip
+ fi
+done
+
+while getopts t:hvpP o; do
+ case $o in
+ t) TESTS=$OPTARG;;
+ v) VERBOSE=1;;
+ p) PAUSE_ON_FAIL=yes;;
+ P) PAUSE=yes;;
+ h) usage; exit 0;;
+ *) usage; exit 1;;
+ esac
+done
+
+[ $PAUSE = "yes" ] && PAUSE_ON_FAIL="no"
+
+kci_test_rtnl
+
+exit $?
--
2.50.0.rc1.591.g9c95f17f64-goog
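The pass criterion of the test above (four `ip monitor maddr` lines matching the grep pattern) can also be illustrated outside the shell. The following is a minimal C sketch and not part of the patch: `count_matching_lines()`, `MCADDR_PATTERN` and `sample_monitor_output` are hypothetical names, and the sample lines are copied from the comment in the script.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* The same extended regex that the script passes to grep -cE. */
#define MCADDR_PATTERN "test-dummy1.*(224.0.0.1|ff02::1)"

/* Expected monitor output, copied from the comment in the test script. */
static const char *const sample_monitor_output[] = {
	"13: test-dummy1 inet6 mcast ff02::1 scope global",
	"13: test-dummy1 inet mcast 224.0.0.1 scope global",
	"Deleted 13: test-dummy1 inet mcast 224.0.0.1 scope global",
	"Deleted 13: test-dummy1 inet6 mcast ff02::1 scope global",
};

/* Count the lines matching pattern; return -1 if it does not compile. */
static int count_matching_lines(const char *pattern,
				const char *const *lines, size_t nr_lines)
{
	regex_t re;
	size_t i;
	int count = 0;

	assert(pattern != NULL && lines != NULL);

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;

	for (i = 0; i < nr_lines; i++) {
		if (regexec(&re, lines[i], 0, NULL, 0) == 0)
			count++;
	}

	regfree(&re);
	return count;
}
```

Calling `count_matching_lines(MCADDR_PATTERN, sample_monitor_output, 4)` counts one match per join and per removal notification. As in the script, the dots in the pattern are unescaped and therefore match any character.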
Reading /proc/pid/maps requires read-locking mmap_lock which prevents any
other task from concurrently modifying the address space. This guarantees
coherent reporting of virtual address ranges, however it can block
important updates from happening. Oftentimes /proc/pid/maps readers are
low priority monitoring tasks and them blocking high priority tasks
results in priority inversion.
Locking the entire address space is required to present a fully coherent
picture of the address space; however, even the current implementation does
not strictly guarantee that, because it outputs vmas in page-size chunks and
drops mmap_lock between chunks. Address space modifications are
possible while mmap_lock is dropped and userspace reading the content is
expected to deal with possible concurrent address space modifications.
Considering these relaxed rules, holding mmap_lock is not strictly needed
as long as we can guarantee that a concurrently modified vma is reported
either in its original form or after it was modified.
This patchset switches from holding mmap_lock while reading /proc/pid/maps
to taking per-vma locks as we walk the vma tree. This reduces the
contention with tasks modifying the address space because they would have
to contend for the same vma as opposed to the entire address space. The same
is done for the PROCMAP_QUERY ioctl, which locks only the vma that falls into
the requested range instead of the entire address space. The previous version
of this patchset [1] tried to perform /proc/pid/maps reading under RCU;
however, its implementation is quite complex and the results are worse than
the new version because it still relied on mmap_lock speculation, which
retries if any part of the address space gets modified. The new implementation
is both simpler and results in less contention. Note that a similar approach
would not work for /proc/pid/smaps reading as it also walks the page table
and that's not RCU-safe.
Paul McKenney designed a test [2] to measure mmap/munmap latencies while
concurrently reading /proc/pid/maps. The test has a pair of processes
scanning /proc/PID/maps, and another process unmapping and remapping 4K
pages from a 128MB range of anonymous memory. At the end of each 10
second run, the latency of each mmap() or munmap() operation is measured,
and for each run the maximum and mean latency is printed. The map/unmap
process is started first, its PID is passed to the scanners, and then the
map/unmap process waits until both scanners are running before starting
its timed test. The scanners keep scanning until the specified
/proc/PID/maps file disappears. This test registered a nearly 10x
improvement in update latencies:
Before the change:
./run-proc-vs-map.sh --nsamples 100 --rawdata -- --busyduration 2
0.011 0.008 0.455
0.011 0.008 0.472
0.011 0.008 0.535
0.011 0.009 0.545
...
0.011 0.014 2.875
0.011 0.014 2.913
0.011 0.014 3.007
0.011 0.015 3.018
After the change:
./run-proc-vs-map.sh --nsamples 100 --rawdata -- --busyduration 2
0.006 0.005 0.036
0.006 0.005 0.039
0.006 0.005 0.039
0.006 0.005 0.039
...
0.006 0.006 0.403
0.006 0.006 0.474
0.006 0.006 0.479
0.006 0.006 0.498
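The map/unmap side of the measurement described above can be sketched as follows. This is a simplified illustration, not the code of the test in [2]: it runs for a fixed number of iterations instead of 10 seconds, has no concurrent /proc/PID/maps scanners, and `measure_map_unmap()` and `struct lat_stats` are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define RANGE_SIZE (128UL << 20)	/* 128MB of anonymous memory */

struct lat_stats {
	double max_us;		/* worst single mmap()/munmap() latency */
	double mean_us;		/* mean latency over all samples */
	unsigned long samples;	/* number of timed calls */
};

static double now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

static void record(struct lat_stats *st, double *total, double lat)
{
	if (lat > st->max_us)
		st->max_us = lat;
	*total += lat;
	st->samples++;
}

/* Unmap and remap single pages of one anonymous range, timing each call. */
static int measure_map_unmap(unsigned long iterations, struct lat_stats *st)
{
	const int prot = PROT_READ | PROT_WRITE;
	const int flags = MAP_PRIVATE | MAP_ANONYMOUS;
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t nr_pages = RANGE_SIZE / page;
	double total = 0.0;
	unsigned long i;
	char *base;
	int ret = 0;

	assert(st != NULL);
	st->max_us = 0.0;
	st->mean_us = 0.0;
	st->samples = 0;

	base = mmap(NULL, RANGE_SIZE, prot, flags, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	for (i = 0; i < iterations; i++) {
		char *addr = base + (i % nr_pages) * page;
		double t0 = now_us();
		double t1;

		if (munmap(addr, page) != 0) {
			ret = -1;
			break;
		}
		t1 = now_us();
		record(st, &total, t1 - t0);

		if (mmap(addr, page, prot, flags | MAP_FIXED, -1, 0) == MAP_FAILED) {
			ret = -1;
			break;
		}
		record(st, &total, now_us() - t1);
	}

	if (st->samples)
		st->mean_us = total / st->samples;
	munmap(base, RANGE_SIZE);
	return ret;
}
```

Each iteration contributes two samples, one for munmap() and one for mmap(). The contention that the patchset targets only appears when other processes read /proc/PID/maps of this process at the same time.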
The patchset also adds a number of tests to check for /proc/pid/maps data
coherency. They are designed to detect any unexpected data tearing while
performing some common address space modifications (vma split, resize and
remap). Even before these changes, reading /proc/pid/maps might return
inconsistent data because the file is read page-by-page with mmap_lock
being dropped between the pages. An example of user-visible inconsistency
is the same vma being printed twice: once before it was modified and
then after the modifications. For example, if a vma was extended, it might be
found and reported twice. What is not expected is to see a gap where there
should have been a vma both before and after modification. This patchset
increases the chances of such tearing, therefore it's even more important
now to test for unexpected inconsistencies.
[1] https://lore.kernel.org/all/20250418174959.1431962-1-surenb@google.com/
[2] https://github.com/paulmckrcu/proc-mmap_sem-test
Suren Baghdasaryan (7):
selftests/proc: add /proc/pid/maps tearing from vma split test
selftests/proc: extend /proc/pid/maps tearing test to include vma
resizing
selftests/proc: extend /proc/pid/maps tearing test to include vma
remapping
selftests/proc: test PROCMAP_QUERY ioctl while vma is concurrently
modified
selftests/proc: add verbose more for tests to facilitate debugging
mm/maps: read proc/pid/maps under per-vma lock
mm/maps: execute PROCMAP_QUERY ioctl under per-vma locks
fs/proc/internal.h | 6 +
fs/proc/task_mmu.c | 233 +++++-
tools/testing/selftests/proc/proc-pid-vm.c | 793 ++++++++++++++++++++-
3 files changed, 1011 insertions(+), 21 deletions(-)
base-commit: 2d0c297637e7d59771c1533847c666cdddc19884
--
2.49.0.1266.g31b7d2e469-goog
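As a rough illustration of what a userspace reader of /proc/pid/maps can observe, the sketch below scans a maps-format file and counts entries that start below the end of the previous entry, which is how a vma reported twice would show up. It is not taken from the selftests in this series; `scan_maps()` and `struct maps_report` are hypothetical names. Detecting an unexpected gap additionally requires knowing the expected layout, which only a dedicated test has.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct maps_report {
	unsigned long entries;	/* parsed address ranges */
	unsigned long overlaps;	/* ranges starting below the previous end */
};

/* Parse the "start-end" prefix of each line; return -1 if path cannot be opened. */
static int scan_maps(const char *path, struct maps_report *rep)
{
	unsigned long start, end, prev_end = 0;
	int at_line_start = 1;
	char line[512];
	FILE *f;

	assert(path != NULL && rep != NULL);
	rep->entries = 0;
	rep->overlaps = 0;

	f = fopen(path, "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		int parse = at_line_start;

		/* A chunk without '\n' means the line continues in the next read. */
		at_line_start = strchr(line, '\n') != NULL;
		if (!parse || sscanf(line, "%lx-%lx", &start, &end) != 2)
			continue;

		if (start < prev_end)
			rep->overlaps++;
		prev_end = end;
		rep->entries++;
	}

	fclose(f);
	return 0;
}
```

For a process whose address space is not being modified, `scan_maps("/proc/self/maps", &rep)` is expected to report zero overlaps. A non-zero count while another thread splits, resizes or remaps vmas is the tolerated kind of inconsistency described in the cover letter.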
The netdevsim driver previously lacked RX statistics support, which
prevented its use with the GenerateTraffic() test framework, as this
framework verifies traffic flow by checking RX byte counts.
This patch migrates netdevsim from its custom statistics collection to
the NETDEV_PCPU_STAT_DSTATS framework, as suggested by Jakub. This
change not only standardizes the statistics handling but also adds the
necessary RX statistics support required by the test framework.
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Changes in v2:
- Changed the RX collection place from nsim_napi_rx() to nsim_rcv (Joe
Damato)
- Collect RX dropped packets statistic in nsim_queue_free() (Jakub)
- Added a helper in dstat to add values to RX dropped packets
- Link to v1: https://lore.kernel.org/r/20250611-netdevsim_stat-v1-0-c11b657d96bf@debian.…
---
Breno Leitao (4):
netdevsim: migrate to dstats stats collection
netdevsim: collect statistics at RX side
net: add dev_dstats_rx_dropped_add() helper
netdevsim: account dropped packet length in stats on queue free
drivers/net/netdevsim/netdev.c | 48 ++++++++++++++++-----------------------
drivers/net/netdevsim/netdevsim.h | 5 ----
include/linux/netdevice.h | 10 ++++++++
3 files changed, 29 insertions(+), 34 deletions(-)
---
base-commit: 6d4e01d29d87356924f1521ca6df7a364e948f13
change-id: 20250610-netdevsim_stat-95995921e03e
Best regards,
--
Breno Leitao <leitao(a)debian.org>
From: Chia-Yu Chang <chia-yu.chang(a)nokia-bell-labs.com>
Hello,
Please find DUALPI2 iproute2 patch v9.
For more details of DualPI2, please refer to IETF RFC9332
(https://datatracker.ietf.org/doc/html/rfc9332).
Best Regards,
Chia-Yu
---
v9 (13-Jun-25)
- Fix space issue and typos (ALOK TIWARI <alok.a.tiwari(a)oracle.com>)
- Change 'rtt_typical' to 'typical_rtt' in tc/q_dualpi2.c (ALOK TIWARI <alok.a.tiwari(a)oracle.com>)
- Add the num of enum used by DualPI2 in pkt_sched.h
v8 (09-May-25)
- Update pkt_sched.h with the one in net-next
- Correct a typo in the comment within pkt_sched.h (ALOK TIWARI <alok.a.tiwari(a)oracle.com>)
- Update manual content in man/man8/tc-dualpi2.8 (ALOK TIWARI <alok.a.tiwari(a)oracle.com>)
- Update tc/q_dualpi2.c to fix missing blank lines and add missing case (ALOK TIWARI <alok.a.tiwari(a)oracle.com>)
v7 (05-May-25)
- Align pkt_sched.h with the v14 version of net-next due to spec modification in tc.yaml
- Reorganize dualpi2_print_opt() to match the order in tc.yaml
- Remove credit-queue in PRINT_JSON
v6 (26-Apr-25)
- Update JSON file output due to spec modification in tc.yaml of net-next
v5 (25-Mar-25)
- Use matches() to replace current strcmp() (Stephen Hemminger <stephen(a)networkplumber.org>)
- Use general parse_percent() for handling scaled percentage values (Stephen Hemminger <stephen(a)networkplumber.org>)
- Add print function for JSON of dualpi2 stats (Stephen Hemminger <stephen(a)networkplumber.org>)
v4 (16-Mar-25)
- Add min_qlen_step to the dualpi2 attribute as the minimum queue length in number of packets in the L-queue to start step marking.
v3 (21-Feb-25)
- Add memlimit to the dualpi2 attribute, and add memory_used, max_memory_used, and memory_limit in dualpi2 stats (Dave Taht <dave.taht(a)gmail.com>)
- Update the manual to align with the latest implementation and clarify the queue naming and default unit
- Use common "get_scaled_alpha_beta" and clean print_opt for Dualpi2
v2 (23-Oct-24)
- Rename get_float in dualpi2 to get_float_min_max in utils.c
- Move get_float from iplink_can.c in utils.c (Stephen Hemminger <stephen(a)networkplumber.org>)
- Add print function for JSON of dualpi2 (Stephen Hemminger <stephen(a)networkplumber.org>)
---
Chia-Yu Chang (1):
tc: add dualpi2 scheduler module
bash-completion/tc | 11 +-
include/uapi/linux/pkt_sched.h | 70 ++++-
include/utils.h | 2 +
ip/iplink_can.c | 14 -
lib/utils.c | 30 ++
man/man8/tc-dualpi2.8 | 249 +++++++++++++++
tc/Makefile | 1 +
tc/q_dualpi2.c | 534 +++++++++++++++++++++++++++++++++
8 files changed, 895 insertions(+), 16 deletions(-)
create mode 100644 man/man8/tc-dualpi2.8
create mode 100644 tc/q_dualpi2.c
--
2.34.1
When CONFIG_SBI_CONSOLE is enabled and there is no uart defined in the
device tree, kvm-unit-tests fails to start.
Only check whether a uart exists in the device tree when CONFIG_SBI_CONSOLE
is not set.
Signed-off-by: Jesse Taube <jesse(a)rivosinc.com>
---
lib/riscv/io.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/lib/riscv/io.c b/lib/riscv/io.c
index fb40adb7..96a3c048 100644
--- a/lib/riscv/io.c
+++ b/lib/riscv/io.c
@@ -104,6 +104,7 @@ static void uart0_init_acpi(void)
void io_init(void)
{
+#ifndef CONFIG_SBI_CONSOLE
if (dt_available())
uart0_init_fdt();
else
@@ -114,6 +115,7 @@ void io_init(void)
"Found uart at %p, but early base is %p.\n",
uart0_base, UART_EARLY_BASE);
}
+#endif
}
#ifdef CONFIG_SBI_CONSOLE
--
2.43.0
If CONFIG_UPROBES is not set, a merge subtest fails:
Failure log:
7151 12:46:54.627936 # # # RUN merge.handle_uprobe_upon_merged_vma ...
7152 12:46:54.639014 # # f /sys/bus/event_source/devices/uprobe/type
7153 12:46:54.639306 # # fopen: No such file or directory
7154 12:46:54.650451 # # # merge.c:473:handle_uprobe_upon_merged_vma:Expected read_sysfs("/sys/bus/event_source/devices/uprobe/type", &type) (1) == 0 (0)
7155 12:46:54.650730 # # # handle_uprobe_upon_merged_vma: Test terminated by assertion
7156 12:46:54.661750 # # # FAIL merge.handle_uprobe_upon_merged_vma
7157 12:46:54.662030 # # not ok 8 merge.handle_uprobe_upon_merged_vma
CONFIG_UPROBES is enabled by CONFIG_UPROBE_EVENTS, which gets enabled by
CONFIG_FTRACE. Therefore add these configs to selftests/mm/config so that
CI systems can include this config in the kernel build. To be completely
safe, add CONFIG_PROFILING too, to enable the dependency chain
PROFILING -> PERF_EVENTS -> UPROBE_EVENTS -> UPROBES.
Fixes: efe99fabeb11b ("selftests/mm: add test about uprobe pte be orphan during vma merge")
Reported-by: Aishwarya <aishwarya.tcv(a)arm.com>
Closes: https://lore.kernel.org/all/20250610103729.72440-1-aishwarya.tcv@arm.com/
Tested-by: Aishwarya TCV <aishwarya.tcv(a)arm.com>
Tested-by: Donet Tom <donettom(a)linux.ibm.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
---
v1->v2:
- Add CONFIG_UPROBES (Mark Brown)
- Add CONFIG_PROFILING (Lorenzo)
tools/testing/selftests/mm/config | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/mm/config b/tools/testing/selftests/mm/config
index a28baa536332..deba93379c80 100644
--- a/tools/testing/selftests/mm/config
+++ b/tools/testing/selftests/mm/config
@@ -8,3 +8,6 @@ CONFIG_GUP_TEST=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_MEM_SOFT_DIRTY=y
CONFIG_ANON_VMA_NAME=y
+CONFIG_FTRACE=y
+CONFIG_PROFILING=y
+CONFIG_UPROBES=y
--
2.30.2
Currently gup_longterm assumes that filesystems support fallocate() and uses
it to allocate space in files. However, fallocate() is an optional feature; in
particular it is not implemented by NFSv3, which is commonly used in CI
systems, leading to spurious failures. Check for lack of support and report a
skip instead for that case.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
tools/testing/selftests/mm/gup_longterm.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c
index 8a97ac5176a4..0e99494268ed 100644
--- a/tools/testing/selftests/mm/gup_longterm.c
+++ b/tools/testing/selftests/mm/gup_longterm.c
@@ -114,7 +114,15 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
}
if (fallocate(fd, 0, 0, size)) {
- if (size == pagesize) {
+ /*
+ * Some filesystems (eg, NFSv3) don't support
+ * fallocate(), report this as a skip rather than a
+ * test failure.
+ */
+ if (errno == EOPNOTSUPP) {
+ ksft_print_msg("fallocate() not supported by filesystem\n");
+ result = KSFT_SKIP;
+ } else if (size == pagesize) {
ksft_print_msg("fallocate() failed (%s)\n", strerror(errno));
result = KSFT_FAIL;
} else {
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250610-selftest-mm-gup-longterm-fallocate-nfs-21ef54627ef2
Best regards,
--
Mark Brown <broonie(a)kernel.org>
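The error handling introduced by the patch above can be sketched in isolation. This is a minimal illustration rather than the selftest code: `enum alloc_result`, `classify_fallocate_errno()` and `try_fallocate()` are hypothetical names, and the sketch leaves out the page-size distinction that the selftest makes for other errors.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

enum alloc_result { ALLOC_OK, ALLOC_SKIP, ALLOC_FAIL };

/* Map an errno value from fallocate() to a test outcome. */
static enum alloc_result classify_fallocate_errno(int err)
{
	if (err == 0)
		return ALLOC_OK;
	/* The filesystem does not implement fallocate(): skip, do not fail. */
	if (err == EOPNOTSUPP)
		return ALLOC_SKIP;
	return ALLOC_FAIL;
}

/* Try to allocate size bytes at offset 0 of fd and classify the result. */
static enum alloc_result try_fallocate(int fd, off_t size)
{
	assert(size > 0);

	if (fallocate(fd, 0, 0, size) == 0)
		return ALLOC_OK;
	return classify_fallocate_errno(errno);
}
```

A caller would treat `ALLOC_SKIP` the way the patch treats `KSFT_SKIP`, so that only errors other than missing filesystem support count as test failures.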
Initially netpoll and netconsole were created together, and some
functions are in the wrong file. Move netconsole-only functions
into netconsole, avoiding exports.
1. Expose netpoll logging macros in the public header to enable consistent
log formatting across netpoll consumers.
2. Relocate netconsole-specific functions from netpoll to the netconsole
module where they are actually used, reducing unnecessary coupling.
3. Remove unnecessary function exports
4. Rename netpoll parsing functions in netconsole to better reflect their
specific usage.
5. Create a test to check that cmdline works fine. This has been on my todo
list since [1], and this was a good time to add it to make sure this
patchset doesn't regress.
PS: The code was split in a way that makes it easy to review. When copying
the functions from netpoll to netconsole, I do not change anything other
than adding `static`. This will make checkpatch unhappy, but further
patches will address the issues. It is done this way to make it easy for
reviewers.
Link: https://lore.kernel.org/netdev/Z36TlACdNMwFD7wv@dev-ushankar.dev.purestorag… [1]
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Changes in v2:
- No change in the code. Just rebased the patches onto netnext/main
- Link to v1: https://lore.kernel.org/r/20250610-rework-v1-0-7cfde283f246@debian.org
---
Breno Leitao (7):
netpoll: remove __netpoll_cleanup from exported API
netpoll: expose netpoll logging macros in public header
netpoll: relocate netconsole-specific functions to netconsole module
netpoll: move netpoll_print_options to netconsole
netconsole: rename functions to better reflect their purpose
netconsole: improve code style in parser function
selftest: netconsole: add test for cmdline configuration
drivers/net/netconsole.c | 137 ++++++++++++++++++++-
include/linux/netpoll.h | 10 +-
net/core/netpoll.c | 136 +-------------------
tools/testing/selftests/drivers/net/Makefile | 1 +
.../selftests/drivers/net/lib/sh/lib_netcons.sh | 39 +++++-
.../selftests/drivers/net/netcons_cmdline.sh | 52 ++++++++
6 files changed, 228 insertions(+), 147 deletions(-)
---
base-commit: 0097c4195b1d0ca57d15979626c769c74747b5a0
change-id: 20250603-rework-c175cad8d22e
Best regards,
--
Breno Leitao <leitao(a)debian.org>
When running the khugepaged selftest for shmem (./khugepaged all:shmem),
I encountered the following test failures:
"
Run test: collapse_full (khugepaged:shmem)
Collapse multiple fully populated PTE table.... Fail
...
Run test: collapse_single_pte_entry (khugepaged:shmem)
Collapse PTE table with single PTE entry present.... Fail
...
Run test: collapse_full_of_compound (khugepaged:shmem)
Allocate huge page... OK
Split huge page leaving single PTE page table full of compound pages... OK
Collapse PTE table full of compound pages.... Fail
"
The failure occurs because wait_for_scan() sets MADV_NOHUGEPAGE once
khugepaged finishes scanning, to prevent khugepaged from continuing to scan
the shmem VMA. Moreover, shmem requires a refault to establish PMD mappings.
However, after commit 2b0f922323cc ("mm: don't install PMD mappings when
THPs are disabled by the hw/process/vma"), PMD mappings are prevented if the
VMA is set with MADV_NOHUGEPAGE flag, so shmem cannot establish PMD mappings
during refault.
One way to fix this issue is to move the MADV_NOHUGEPAGE setting after the
shmem refault. However, after the shmem refault and the huge page check, the
test case unmaps the shmem immediately, so setting MADV_NOHUGEPAGE seems
unnecessary. We can therefore simply drop the MADV_NOHUGEPAGE setting, after
which all khugepaged test cases pass.
Fixes: 2b0f922323cc ("mm: don't install PMD mappings when THPs are disabled by the hw/process/vma")
Reviewed-by: Zi Yan <ziy(a)nvidia.com>
Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
---
Changes from v1:
- Add reviewed tag from Zi. Thanks.
- Drop the MADV_NOHUGEPAGE setting, per David.
---
tools/testing/selftests/mm/khugepaged.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c
index 8a4d34cce36b..4341ce6b3b38 100644
--- a/tools/testing/selftests/mm/khugepaged.c
+++ b/tools/testing/selftests/mm/khugepaged.c
@@ -561,8 +561,6 @@ static bool wait_for_scan(const char *msg, char *p, int nr_hpages,
usleep(TICK);
}
- madvise(p, nr_hpages * hpage_pmd_size, MADV_NOHUGEPAGE);
-
return timeout == -1;
}
--
2.43.5
Nolibc is useful for selftests as the test programs can be very small,
and compiled with just a kernel crosscompiler, without userspace support.
Currently nolibc is only usable with kselftest.h, not the more
convenient kselftest_harness.h.
This series provides this compatibility by removing the usage of problematic
libc features from the harness.
Based on nolibc/for-next.
The series is meant to be merged through the nolibc tree.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
Changes in v4:
- Drop patches for nolibc which were already applied
- Preserve signatures of test functions for tests making assumptions about them
drop 'selftests: harness: Always provide "self" and "variant"'
add 'selftests: harness: Add "variant" and "self" to test metadata'
adapt 'selftests: harness: Stop using setjmp()/longjmp()'
- Validate test function signatures in harness selftest
- Link to v3: https://lore.kernel.org/r/20250411-nolibc-kselftest-harness-v3-0-4d9c029589…
Changes in v3:
- Send patches to correct kselftest harness maintainers
- Move harness selftest to dedicated directory
- Add harness selftest to MAINTAINERS
- Integrate harness selftest cleanup with the selftest framework
- Consistently use "kselftest harness" in commit messages
- Properly propagate kselftest harness failure
- Link to v2: https://lore.kernel.org/r/20250407-nolibc-kselftest-harness-v2-0-f8812f76e9…
Changes in v2:
- Rebase onto v6.15-rc1
- Rename internal nolibc symbols
- Handle edge case of waitpid(INT_MIN) == ESRCH
- Fix arm configurations for final testing patch
- Clean up global getopt.h variable declarations
- Add Acks from Willy
- Link to v1: https://lore.kernel.org/r/20250304-nolibc-kselftest-harness-v1-0-adca7cd231…
---
Thomas Weißschuh (14):
selftests: harness: Add kselftest harness selftest
selftests: harness: Use C89 comment style
selftests: harness: Ignore unused variant argument warning
selftests: harness: Mark functions without prototypes static
selftests: harness: Remove inline qualifier for wrappers
selftests: harness: Remove dependency on libatomic
selftests: harness: Implement test timeouts through pidfd
selftests: harness: Don't set setup_completed for fixtureless tests
selftests: harness: Move teardown conditional into test metadata
selftests: harness: Add teardown callback to test metadata
selftests: harness: Add "variant" and "self" to test metadata
selftests: harness: Stop using setjmp()/longjmp()
selftests: harness: Guard includes on nolibc
HACK: selftests/nolibc: demonstrate usage of the kselftest harness
MAINTAINERS | 1 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/kselftest_harness.h | 175 +-
.../testing/selftests/kselftest_harness/.gitignore | 2 +
tools/testing/selftests/kselftest_harness/Makefile | 7 +
.../selftests/kselftest_harness/harness-selftest.c | 138 ++
.../kselftest_harness/harness-selftest.expected | 64 +
.../kselftest_harness/harness-selftest.sh | 13 +
tools/testing/selftests/nolibc/Makefile | 15 +-
tools/testing/selftests/nolibc/harness-selftest.c | 1 +
tools/testing/selftests/nolibc/nolibc-test.c | 1715 +-------------------
tools/testing/selftests/nolibc/run-tests.sh | 2 +-
12 files changed, 313 insertions(+), 1821 deletions(-)
---
base-commit: 2051d3b830c0889ae55e37e9e8ff0d43a4acd482
change-id: 20250130-nolibc-kselftest-harness-8b2c8cac43bf
Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
Spelling fix:
conneciton --> connection
This is a non-functional change aimed at improving code clarity.
Signed-off-by: Ankit Chauhan <ankitchauhan2065(a)gmail.com>
---
tools/testing/selftests/net/tcp_ao/seq-ext.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/tcp_ao/seq-ext.c b/tools/testing/selftests/net/tcp_ao/seq-ext.c
index f00245263b20..6478da6a71c3 100644
--- a/tools/testing/selftests/net/tcp_ao/seq-ext.c
+++ b/tools/testing/selftests/net/tcp_ao/seq-ext.c
@@ -1,7 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
/* Check that after SEQ number wrap-around:
* 1. SEQ-extension has upper bytes set
- * 2. TCP conneciton is alive and no TCPAOBad segments
+ * 2. TCP connection is alive and no TCPAOBad segments
* In order to test (2), the test doesn't just adjust seq number for a queue
* on a connected socket, but migrates it to another sk+port number, so
* that there won't be any delayed packets that will fail to verify
--
2.34.1
From: Chia-Yu Chang <chia-yu.chang(a)nokia-bell-labs.com>
Hello,
Please find the DualPI2 patch v17.
This patch series adds DualPI Improved with a Square (DualPI2) with the following features:
* Supports congestion controls that comply with the Prague requirements in RFC9331 (e.g. TCP-Prague)
* Coupled dual-queue that separates the L4S traffic in a low latency queue (L-queue), without harming remaining traffic that is scheduled in classic queue (C-queue) due to congestion-coupling using PI2 as defined in RFC9332
* Configurable overload strategies
* Use of sojourn time to reliably estimate queue delay
* Supports ECN L4S-identifier (IP.ECN==0b*1) to classify traffic into respective queues
For more details of DualPI2, please refer to IETF RFC9332 (https://datatracker.ietf.org/doc/html/rfc9332).
Best regards,
Chia-Yu
---
v17 (25-May-2025, Resent at 11-Jun-2025)
- Replace 0xffffffff with U32_MAX (Paolo Abeni <pabeni(a)redhat.com>)
- Use helper function qdisc_dequeue_internal() and add new helper function skb_apply_step() (Paolo Abeni <pabeni(a)redhat.com>)
- Add s64 casting when calculating the delta of the PI controller (Paolo Abeni <pabeni(a)redhat.com>)
- Change the drop reason into SKB_DROP_REASON_QDISC_CONGESTED for drop_early (Paolo Abeni <pabeni(a)redhat.com>)
- Modify the condition to remove the original skb when enqueuing multiple GSO segments (Paolo Abeni <pabeni(a)redhat.com>)
- Add READ_ONCE() in dualpi2_dump_stat() (Paolo Abeni <pabeni(a)redhat.com>)
- Add comments and brackets for readability (Paolo Abeni <pabeni(a)redhat.com>)
v16 (16-May-2025)
- Add qdisc_lock() in dualpi2_timer() (Paolo Abeni <pabeni(a)redhat.com>)
- Introduce convert_ns_to_usec() to convert usec to nsec without overflow in #1 (Paolo Abeni <pabeni(a)redhat.com>)
- Update convert_us_tonsec() to convert nsec to usec without overflow in #2 (Paolo Abeni <pabeni(a)redhat.com>)
- Add more descriptions with respect to DualPI2 in the cover letter and add changelog in each patch (Paolo Abeni <pabeni(a)redhat.com>)
v15 (09-May-2025)
- Add enum of TCA_DUALPI2_ECN_MASK_CLA_ECT to remove potential leakage in #1 (Simon Horman <horms(a)kernel.org>)
- Fix one typo in comment of #2
- Update tc.yaml in #5 to align with the updated enum of pkt_sched.h
v14 (05-May-2025)
- Modify tc.yaml: (1) Replace flags with enum and remove enum-as-flags, (2) Remove credit-queue in xstats, and (3) Change attribute types (Donald Hunter <donald.hun
- Add enum and fix the ordering of variables in pkt_sched.h to align with the modified tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Add validators for DROP_OVERLOAD, DROP_EARLY, ECN_MASK, and SPLIT_GSO in sch_dualpi2.c (Donald Hunter <donald.hunter(a)gmail.com>)
- Update dualpi2.json to align with the updated variable order in pkt_sched.h
- Reorder patches (Donald Hunter <donald.hunter(a)gmail.com>)
v13 (26-Apr-2025)
- Use dashes in member names to follow YNL conventions in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Define enumerations separately for flags of drop-early, drop-overload, ecn-mask, credit-queue in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Change the types of split-gso and step-packets into flag in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Revert to u32/u8 types for tc-dualpi2-xstats members in tc.yaml (Donald Hunter <donald.hunter(a)gmail.com>)
- Add new test cases in tc-tests/qdiscs/dualpi2.json to cover all dualpi2 parameters (Donald Hunter <donald.hunter(a)gmail.com>)
- Change the type of TCA_DUALPI2_STEP_PACKETS into NLA_FLAG (Donald Hunter <donald.hunter(a)gmail.com>)
v12 (22-Apr-2025)
- Remove anonymous struct in sch_dualpi2.c (Paolo Abeni <pabeni(a)redhat.com>)
- Replace u32/u8 with uint and s32 with int in tc spec document (Paolo Abeni <pabeni(a)redhat.com>)
- Introduce get_memory_limit function to handle potential overflow when multiplying limit with MTU (Paolo Abeni <pabeni(a)redhat.com>)
- Double the packet length to further include packet overhead in memory_limit (Paolo Abeni <pabeni(a)redhat.com>)
- Remove the check of qdisc_qlen(sch) when calling qdisc_tree_reduce_backlog (Paolo Abeni <pabeni(a)redhat.com>)
v11 (15-Apr-2025)
- Replace hrtimer_init with hrtimer_setup in sch_dualpi2.c
v10 (25-Mar-2025)
- Remove leftover include in include/linux/netdevice.h and anonymous struct in sch_dualpi2.c (Paolo Abeni <pabeni(a)redhat.com>)
- Use kfree_skb_reason() and add SKB_DROP_REASON_DUALPI2_STEP_DROP drop reason (Paolo Abeni <pabeni(a)redhat.com>)
- Split sch_dualpi2.c into 3 patches (and overall 5 patches): Struct definition & parsing, Dump stats & configuration, Enqueue/Dequeue (Paolo Abeni <pabeni(a)redhat.com>)
v9 (16-Mar-2025)
- Fix mem_usage error in previous version
- Add min_qlen_step to the dualpi2 attributes: the minimum L-queue length, in packets, at which step threshold marking starts.
In previous versions this value was fixed at 2, so the step threshold marked packets in the L-queue only when the L-queue held at least 2 packets.
This caused larger queuing delays for L4S traffic at low rates (<20Mbps), so the value is now a parameter with a default of 0.
Comparison of tcp_1down run 'HTB 20Mbit + DUALPI2 + 10ms base delay'
Old versions:
                         avg    median     # data pts
Ping (ms) ICMP    :    11.55     11.70 ms         350
TCP upload avg    :    18.96       N/A Mbits/s    350
TCP upload sum    :    18.96       N/A Mbits/s    350
New version (v9):
                         avg    median     # data pts
Ping (ms) ICMP    :    10.81     10.70 ms         350
TCP upload avg    :    18.91       N/A Mbits/s    350
TCP upload sum    :    18.91       N/A Mbits/s    350
Comparison of tcp_1down run 'HTB 10Mbit + DUALPI2 + 10ms base delay'
Old versions:
                         avg    median     # data pts
Ping (ms) ICMP    :    12.61     12.80 ms         350
TCP upload avg    :     9.48       N/A Mbits/s    350
TCP upload sum    :     9.48       N/A Mbits/s    350
New version (v9):
                         avg    median     # data pts
Ping (ms) ICMP    :    11.06     10.80 ms         350
TCP upload avg    :     9.43       N/A Mbits/s    350
TCP upload sum    :     9.43       N/A Mbits/s    350
Comparison of tcp_1down run 'HTB 10Mbit + DUALPI2 + 10ms base delay'
Old versions:
                         avg    median     # data pts
Ping (ms) ICMP    :    40.86     37.45 ms         350
TCP upload avg    :     0.88       N/A Mbits/s    350
TCP upload sum    :     0.88       N/A Mbits/s    350
TCP upload::1     :     0.88      0.97 Mbits/s    350
New version (v9):
                         avg    median     # data pts
Ping (ms) ICMP    :    11.07     10.40 ms         350
TCP upload avg    :     0.55       N/A Mbits/s    350
TCP upload sum    :     0.55       N/A Mbits/s    350
TCP upload::1     :     0.55      0.59 Mbits/s    350
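The min_qlen_step gate described in the v9 notes above can be sketched in a few lines of Python. This is a simplified model, not the qdisc code: the function name, the microsecond units and the comparison of sojourn time against the step threshold are assumptions made for illustration. Only the queue-length gate (previously fixed at 2, now defaulting to 0) comes from the changelog.

```python
def step_marking_applies(l_queue_len: int,
                         sojourn_us: int,
                         step_thresh_us: int,
                         min_qlen_step: int = 0) -> bool:
    """Decide whether step threshold marking applies to an L-queue packet."""
    # Gate from the changelog: no step marking while the L-queue holds
    # fewer than min_qlen_step packets.
    if l_queue_len < min_qlen_step:
        return False
    # Assumption: the step threshold is compared with the sojourn time.
    return sojourn_us >= step_thresh_us


if __name__ == "__main__":
    # One packet in the L-queue, delay above the threshold.
    print(step_marking_applies(1, 2000, 1000, min_qlen_step=2))  # old gate
    print(step_marking_applies(1, 2000, 1000))                   # new default
```

With the old gate of 2, a single delayed packet is never step-marked, which matches the larger ping times reported for the old versions at low rates.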
v8 (11-Mar-2025)
- Fix warning messages in v7
v7 (07-Mar-2025)
- Separate into 3 patches to avoid mixing changes of documentation, selftest, and code. (Cong Wang <xiyou.wangcong(a)gmail.com>)
v6 (04-Mar-2025)
- Add modprobe for dualpi2 in tc-testing script tc-testing/tdc.sh (Jakub Kicinski <kuba(a)kernel.org>)
- Update test cases in dualpi2.json
- Update commit message
v5 (22-Feb-2025)
- A comparison was done between MQ + DUALPI2, MQ + FQ_PIE, MQ + FQ_CODEL:
Unshaped 1gigE with 4 download streams test:
- Summary of tcp_4down run 'MQ + FQ_CODEL':
                         avg    median     # data pts
Ping (ms) ICMP    :     1.19      1.34 ms         349
TCP download avg  :   235.42       N/A Mbits/s    349
TCP download sum  :   941.68       N/A Mbits/s    349
TCP download::1   :   235.19    235.39 Mbits/s    349
TCP download::2   :   235.03    235.35 Mbits/s    349
TCP download::3   :   236.89    235.44 Mbits/s    349
TCP download::4   :   234.57    235.19 Mbits/s    349
- Summary of tcp_4down run 'MQ + FQ_PIE':
                         avg    median     # data pts
Ping (ms) ICMP    :     1.21      1.37 ms         350
TCP download avg  :   235.42       N/A Mbits/s    350
TCP download sum  :   941.61       N/A Mbits/s    350
TCP download::1   :   232.54    233.13 Mbits/s    350
TCP download::2   :   232.52    232.80 Mbits/s    350
TCP download::3   :   233.14    233.78 Mbits/s    350
TCP download::4   :   243.41    241.48 Mbits/s    350
- Summary of tcp_4down run 'MQ + DUALPI2':
                         avg    median     # data pts
Ping (ms) ICMP    :     1.19      1.34 ms         349
TCP download avg  :   235.42       N/A Mbits/s    349
TCP download sum  :   941.68       N/A Mbits/s    349
TCP download::1   :   235.19    235.39 Mbits/s    349
TCP download::2   :   235.03    235.35 Mbits/s    349
TCP download::3   :   236.89    235.44 Mbits/s    349
TCP download::4   :   234.57    235.19 Mbits/s    349
Unshaped 1gigE with 128 download streams test:
- Summary of tcp_128down run 'MQ + FQ_CODEL':
                         avg    median     # data pts
Ping (ms) ICMP    :     1.88      1.86 ms         350
TCP download avg  :     7.39       N/A Mbits/s    350
TCP download sum  :   946.47       N/A Mbits/s    350
- Summary of tcp_128down run 'MQ + FQ_PIE':
                         avg    median     # data pts
Ping (ms) ICMP    :     1.88      1.86 ms         350
TCP download avg  :     7.39       N/A Mbits/s    350
TCP download sum  :   946.47       N/A Mbits/s    350
- Summary of tcp_128down run 'MQ + DUALPI2':
                         avg    median     # data pts
Ping (ms) ICMP    :     1.88      1.86 ms         350
TCP download avg  :     7.39       N/A Mbits/s    350
TCP download sum  :   946.47       N/A Mbits/s    350
Unshaped 10gigE with 4 download streams test:
- Summary of tcp_4down run 'MQ + FQ_CODEL':
                         avg    median     # data pts
Ping (ms) ICMP    :     0.22      0.23 ms         350
TCP download avg  :  2354.08       N/A Mbits/s    350
TCP download sum  :  9416.31       N/A Mbits/s    350
TCP download::1   :  2353.65   2352.81 Mbits/s    350
TCP download::2   :  2354.54   2354.21 Mbits/s    350
TCP download::3   :  2353.56   2353.78 Mbits/s    350
TCP download::4   :  2354.56   2354.45 Mbits/s    350
- Summary of tcp_4down run 'MQ + FQ_PIE':
                         avg    median     # data pts
Ping (ms) ICMP    :     0.20      0.19 ms         350
TCP download avg  :  2354.76       N/A Mbits/s    350
TCP download sum  :  9419.04       N/A Mbits/s    350
TCP download::1   :  2354.77   2353.89 Mbits/s    350
TCP download::2   :  2353.41   2354.29 Mbits/s    350
TCP download::3   :  2356.18   2354.19 Mbits/s    350
TCP download::4   :  2354.68   2353.15 Mbits/s    350
- Summary of tcp_4down run 'MQ + DUALPI2':
                         avg    median     # data pts
Ping (ms) ICMP    :     0.24      0.24 ms         350
TCP download avg  :  2354.11       N/A Mbits/s    350
TCP download sum  :  9416.43       N/A Mbits/s    350
TCP download::1   :  2354.75   2353.93 Mbits/s    350
TCP download::2   :  2353.15   2353.75 Mbits/s    350
TCP download::3   :  2353.49   2353.72 Mbits/s    350
TCP download::4   :  2355.04   2353.73 Mbits/s    350
Unshaped 10gigE with 128 download streams test:
- Summary of tcp_128down run 'MQ + FQ_CODEL':
                         avg    median     # data pts
Ping (ms) ICMP    :     7.57      8.69 ms         350
TCP download avg  :    73.97       N/A Mbits/s    350
TCP download sum  :  9467.82       N/A Mbits/s    350
- Summary of tcp_128down run 'MQ + FQ_PIE':
                         avg    median     # data pts
Ping (ms) ICMP    :     7.82      8.91 ms         350
TCP download avg  :    73.97       N/A Mbits/s    350
TCP download sum  :  9468.42       N/A Mbits/s    350
- Summary of tcp_128down run 'MQ + DUALPI2':
                         avg    median     # data pts
Ping (ms) ICMP    :     6.87      7.93 ms         350
TCP download avg  :    73.95       N/A Mbits/s    350
TCP download sum  :  9465.87       N/A Mbits/s    350
From the results shown above, we see small differences between combinations.
- Update commit message to include results of no_split_gso and split_gso (Dave Taht <dave.taht(a)gmail.com> and Paolo Abeni <pabeni(a)redhat.com>)
- Add memlimit in the dualpi2 attribute, and add memory_used, max_memory_used, memory_limit in dualpi2 stats (Dave Taht <dave.taht(a)gmail.com>)
- Update note in sch_dualpi2.c related to BBRv3 status (Dave Taht <dave.taht(a)gmail.com>)
- Update license identifier (Dave Taht <dave.taht(a)gmail.com>)
- Add selftest in tools/testing/selftests/tc-testing (Cong Wang <xiyou.wangcong(a)gmail.com>)
- Use netlink policies for parameter checks (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Modify texts & fix typos in Documentation/netlink/specs/tc.yaml (Dave Taht <dave.taht(a)gmail.com>)
- Add descriptions of packet counter statistics and the reset function of sch_dualpi2.c
- Fix step_thresh in packets
- Update code comments in sch_dualpi2.c
v4 (22-Oct-2024)
- Update statement in Kconfig for DualPI2 (Stephen Hemminger <stephen(a)networkplumber.org>)
- Put a blank line after #define in sch_dualpi2.c (Stephen Hemminger <stephen(a)networkplumber.org>)
- Fix line length warning.
v3 (19-Oct-2024)
- Fix compilation error
- Update Documentation/netlink/specs/tc.yaml (Jakub Kicinski <kuba(a)kernel.org>)
v2 (18-Oct-2024)
- Add Documentation/netlink/specs/tc.yaml (Jakub Kicinski <kuba(a)kernel.org>)
- Use dualpi2 instead of skb prefix (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Replace nla_parse_nested_deprecated with nla_parse_nested (Jamal Hadi Salim <jhs(a)mojatatu.com>)
- Fix line length warning
---
Chia-Yu Chang (4):
sched: Struct definition and parsing of dualpi2 qdisc
sched: Dump configuration and statistics of dualpi2 qdisc
selftests/tc-testing: Add selftests for qdisc DualPI2
Documentation: netlink: specs: tc: Add DualPI2 specification
Koen De Schepper (1):
sched: Add enqueue/dequeue of dualpi2 qdisc
Documentation/netlink/specs/tc.yaml | 156 +++
include/net/dropreason-core.h | 6 +
include/uapi/linux/pkt_sched.h | 68 +
net/sched/Kconfig | 12 +
net/sched/Makefile | 1 +
net/sched/sch_dualpi2.c | 1146 +++++++++++++++++
tools/testing/selftests/tc-testing/config | 1 +
.../tc-testing/tc-tests/qdiscs/dualpi2.json | 254 ++++
tools/testing/selftests/tc-testing/tdc.sh | 1 +
9 files changed, 1645 insertions(+)
create mode 100644 net/sched/sch_dualpi2.c
create mode 100644 tools/testing/selftests/tc-testing/tc-tests/qdiscs/dualpi2.json
--
2.34.1
Tests may wish to add other interfaces to listen on. Notably locally
generated traffic uses dummy interfaces. The multicast daemon needs to know
about these so that it allows forming rules that involve these interfaces,
and so that net.ipv4.conf.X.mc_forwarding is set for the interfaces.
To that end, allow passing in a list of interfaces to configure in addition
to all the physical ones.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor(a)blackwall.org>
---
Notes:
v2:
- Adjust as per shellcheck citations
- Retain Nik's R-b, the changes were very minor.
---
CC: Shuah Khan <shuah(a)kernel.org>
CC: linux-kselftest(a)vger.kernel.org
tools/testing/selftests/net/forwarding/lib.sh | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 253847372062..83ee6a07e072 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -1760,9 +1760,12 @@ mc_send()
adf_mcd_start()
{
+ local ifs=("$@")
+
local table_name="$MCD_TABLE_NAME"
local smcroutedir
local pid
+ local if
local i
check_command "$MCD" || return 1
@@ -1776,6 +1779,16 @@ adf_mcd_start()
"$smcroutedir/$table_name.conf"
done
+ for if in "${ifs[@]}"; do
+ if ! ip_link_has_flag "$if" MULTICAST; then
+ ip link set dev "$if" multicast on
+ defer ip link set dev "$if" multicast off
+ fi
+
+ echo "phyint $if enable" >> \
+ "$smcroutedir/$table_name.conf"
+ done
+
"$MCD" -N -I "$table_name" -f "$smcroutedir/$table_name.conf" \
-P "$smcroutedir/$table_name.pid"
busywait "$BUSYWAIT_TIMEOUT" test -e "$smcroutedir/$table_name.pid"
--
2.49.0
IDT event delivery has a debug hole in which it does not generate #DB
upon returning to userspace before the first userspace instruction is
executed if the Trap Flag (TF) is set.
FRED closes this hole by introducing a software event flag, i.e., bit
17 of the augmented SS: if the bit is set and ERETU would result in
RFLAGS.TF = 1, a single-step trap will be pending upon completion of
ERETU.
However, I overlooked properly setting and clearing the bit in different
situations. Thus when FRED is enabled, if the Trap Flag (TF) is set
without an external debugger attached, it can lead to an infinite loop
in the SIGTRAP handler. To avoid this, the software event flag in the
augmented SS must be cleared, ensuring that no single-step trap remains
pending when ERETU completes.
This patch set combines the fix [1] and its corresponding selftest [2]
(requested by Dave Hansen) into one patch set.
[1] https://lore.kernel.org/lkml/20250523050153.3308237-1-xin@zytor.com/
[2] https://lore.kernel.org/lkml/20250530230707.2528916-1-xin@zytor.com/
This patch set is based on tip/x86/urgent branch as of today.
Xin Li (Intel) (2):
x86/fred/signal: Prevent single-step upon ERETU completion
selftests/x86: Add a test to detect infinite sigtrap handler loop
arch/x86/include/asm/sighandling.h | 22 +++++
arch/x86/kernel/signal_32.c | 4 +
arch/x86/kernel/signal_64.c | 4 +
tools/testing/selftests/x86/Makefile | 2 +-
tools/testing/selftests/x86/sigtrap_loop.c | 97 ++++++++++++++++++++++
5 files changed, 128 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/x86/sigtrap_loop.c
base-commit: dd2922dcfaa3296846265e113309e5f7f138839f
--
2.49.0
This series addresses a regression in ethtool flow steering where rules
targeting the default RSS context (context 0) were incorrectly rejected.
The default RSS context always exists but is not stored in the rss_ctx
xarray like additional contexts. The current validation logic was
checking for the existence of context 0 in this array, causing valid
flow steering rules to be rejected.
This prevented configurations such as:
- High priority rules directing specific traffic to the default context
- Low priority catch-all rules directing remaining traffic to additional
contexts
Patch 1 fixes the validation logic to skip the existence check for
context 0.
Patch 2 adds a selftest that verifies this behavior.
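The validation change in patch 1 can be modelled with a short Python sketch. It is a simplified illustration, not the code in net/ethtool/ioctl.c: the rss_ctx xarray is represented by a plain dict, and both function names are hypothetical.

```python
def ctx_valid_before_fix(ctx_id: int, rss_ctx: dict) -> bool:
    """Old behaviour: every context, including 0, is looked up in the table."""
    return ctx_id in rss_ctx


def ctx_valid_after_fix(ctx_id: int, rss_ctx: dict) -> bool:
    """New behaviour: skip the existence check for the default context."""
    # Context 0 always exists but is never stored alongside the
    # additional contexts, so a lookup for it can only fail.
    if ctx_id == 0:
        return True
    return ctx_id in rss_ctx


if __name__ == "__main__":
    additional = {1: "additional context 1"}  # stand-in for the xarray
    for ctx in (0, 1, 2):
        print(ctx,
              ctx_valid_before_fix(ctx, additional),
              ctx_valid_after_fix(ctx, additional))
```

In this model a rule targeting context 0 is rejected by the old check and accepted by the new one, while a rule naming a context that was never created is still rejected.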
Changelog -
v2->v3: https://lore.kernel.org/all/20250609120250.1630125-1-gal@nvidia.com/
* Reworded commit message.
* Fix pylint warning.
v1->v2: https://lore.kernel.org/all/20250225071348.509432-1-gal@nvidia.com/
* Reworded commit message.
* Added a selftest.
Gal Pressman (2):
net: ethtool: Don't check if RSS context exists in case of context 0
selftests: drv-net: rss_ctx: Add test for ntuple rules targeting
default RSS context
net/ethtool/ioctl.c | 3 +-
.../selftests/drivers/net/hw/rss_ctx.py | 59 ++++++++++++++++++-
2 files changed, 60 insertions(+), 2 deletions(-)
--
2.40.1
The SBI Firmware Feature extension allows the S-mode to request some
specific features (either hardware or software) to be enabled. This
series uses this extension to request misaligned access exception
delegation to S-mode in order to let the kernel handle it. It also adds
support for the KVM FWFT SBI extension based on the misaligned access
handling infrastructure.
FWFT SBI extension is part of the SBI V3.0 specifications [1]. It can be
tested using the qemu provided at [2] which contains the series from
[3]. Upstream kvm-unit-tests can be used inside kvm to test the correct
delegation of misaligned exceptions. Upstream OpenSBI can be used.
Note: Since SBI V3.0 is not yet ratified, the FWFT extension API is split
into an interface-only part and an implementation part, so that the
interface, which does not have hard dependencies on SBI, can be picked alone.
The tests can be run using the kselftest from series [4].
$ qemu-system-riscv64 \
-cpu rv64,trap-misaligned-access=true,v=true \
-M virt \
-m 1024M \
-bios fw_dynamic.bin \
-kernel Image
...
# ./misaligned
TAP version 13
1..23
# Starting 23 tests from 1 test cases.
# RUN global.gp_load_lh ...
# OK global.gp_load_lh
ok 1 global.gp_load_lh
# RUN global.gp_load_lhu ...
# OK global.gp_load_lhu
ok 2 global.gp_load_lhu
# RUN global.gp_load_lw ...
# OK global.gp_load_lw
ok 3 global.gp_load_lw
# RUN global.gp_load_lwu ...
# OK global.gp_load_lwu
ok 4 global.gp_load_lwu
# RUN global.gp_load_ld ...
# OK global.gp_load_ld
ok 5 global.gp_load_ld
# RUN global.gp_load_c_lw ...
# OK global.gp_load_c_lw
ok 6 global.gp_load_c_lw
# RUN global.gp_load_c_ld ...
# OK global.gp_load_c_ld
ok 7 global.gp_load_c_ld
# RUN global.gp_load_c_ldsp ...
# OK global.gp_load_c_ldsp
ok 8 global.gp_load_c_ldsp
# RUN global.gp_load_sh ...
# OK global.gp_load_sh
ok 9 global.gp_load_sh
# RUN global.gp_load_sw ...
# OK global.gp_load_sw
ok 10 global.gp_load_sw
# RUN global.gp_load_sd ...
# OK global.gp_load_sd
ok 11 global.gp_load_sd
# RUN global.gp_load_c_sw ...
# OK global.gp_load_c_sw
ok 12 global.gp_load_c_sw
# RUN global.gp_load_c_sd ...
# OK global.gp_load_c_sd
ok 13 global.gp_load_c_sd
# RUN global.gp_load_c_sdsp ...
# OK global.gp_load_c_sdsp
ok 14 global.gp_load_c_sdsp
# RUN global.fpu_load_flw ...
# OK global.fpu_load_flw
ok 15 global.fpu_load_flw
# RUN global.fpu_load_fld ...
# OK global.fpu_load_fld
ok 16 global.fpu_load_fld
# RUN global.fpu_load_c_fld ...
# OK global.fpu_load_c_fld
ok 17 global.fpu_load_c_fld
# RUN global.fpu_load_c_fldsp ...
# OK global.fpu_load_c_fldsp
ok 18 global.fpu_load_c_fldsp
# RUN global.fpu_store_fsw ...
# OK global.fpu_store_fsw
ok 19 global.fpu_store_fsw
# RUN global.fpu_store_fsd ...
# OK global.fpu_store_fsd
ok 20 global.fpu_store_fsd
# RUN global.fpu_store_c_fsd ...
# OK global.fpu_store_c_fsd
ok 21 global.fpu_store_c_fsd
# RUN global.fpu_store_c_fsdsp ...
# OK global.fpu_store_c_fsdsp
ok 22 global.fpu_store_c_fsdsp
# RUN global.gen_sigbus ...
[12797.988647] misaligned[618]: unhandled signal 7 code 0x1 at 0x0000000000014dc0 in misaligned[4dc0,10000+76000]
[12797.988990] CPU: 0 UID: 0 PID: 618 Comm: misaligned Not tainted 6.13.0-rc6-00008-g4ec4468967c9-dirty #51
[12797.989169] Hardware name: riscv-virtio,qemu (DT)
[12797.989264] epc : 0000000000014dc0 ra : 0000000000014d00 sp : 00007fffe165d100
[12797.989407] gp : 000000000008f6e8 tp : 0000000000095760 t0 : 0000000000000008
[12797.989544] t1 : 00000000000965d8 t2 : 000000000008e830 s0 : 00007fffe165d160
[12797.989692] s1 : 000000000000001a a0 : 0000000000000000 a1 : 0000000000000002
[12797.989831] a2 : 0000000000000000 a3 : 0000000000000000 a4 : ffffffffdeadbeef
[12797.989964] a5 : 000000000008ef61 a6 : 626769735f6e0000 a7 : fffffffffffff000
[12797.990094] s2 : 0000000000000001 s3 : 00007fffe165d838 s4 : 00007fffe165d848
[12797.990238] s5 : 000000000000001a s6 : 0000000000010442 s7 : 0000000000010200
[12797.990391] s8 : 000000000000003a s9 : 0000000000094508 s10: 0000000000000000
[12797.990526] s11: 0000555567460668 t3 : 00007fffe165d070 t4 : 00000000000965d0
[12797.990656] t5 : fefefefefefefeff t6 : 0000000000000073
[12797.990756] status: 0000000200004020 badaddr: 000000000008ef61 cause: 0000000000000006
[12797.990911] Code: 8793 8791 3423 fcf4 3783 fc84 c737 dead 0713 eef7 (c398) 0001
# OK global.gen_sigbus
ok 23 global.gen_sigbus
# PASSED: 23 / 23 tests passed.
# Totals: pass:23 fail:0 xfail:0 xpass:0 skip:0 error:0
With kvm-tools:
# lkvm run -k sbi.flat -m 128
Info: # lkvm run -k sbi.flat -m 128 -c 1 --name guest-97
Info: Removed ghost socket file "/root/.lkvm//guest-97.sock".
##########################################################################
# kvm-unit-tests
##########################################################################
... [test messages elided]
PASS: sbi: fwft: FWFT extension probing no error
PASS: sbi: fwft: get/set reserved feature 0x6 error == SBI_ERR_DENIED
PASS: sbi: fwft: get/set reserved feature 0x3fffffff error == SBI_ERR_DENIED
PASS: sbi: fwft: get/set reserved feature 0x80000000 error == SBI_ERR_DENIED
PASS: sbi: fwft: get/set reserved feature 0xbfffffff error == SBI_ERR_DENIED
PASS: sbi: fwft: misaligned_deleg: Get misaligned deleg feature no error
PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature invalid value error
PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature invalid value error
PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value no error
PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value 0
PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value no error
PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value 1
PASS: sbi: fwft: misaligned_deleg: Verify misaligned load exception trap in supervisor
SUMMARY: 50 tests, 2 unexpected failures, 12 skipped
This series is available at [5].
Link: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/vv3.0-rc2/… [1]
Link: https://github.com/rivosinc/qemu/tree/dev/cleger/misaligned [2]
Link: https://lore.kernel.org/all/20241211211933.198792-3-fkonrad@amd.com/T/ [3]
Link: https://lore.kernel.org/linux-riscv/20250414123543.1615478-1-cleger@rivosin… [4]
Link: https://github.com/rivosinc/linux/tree/dev/cleger/fwft [5]
---
V8:
- Move misaligned_access_speed under CONFIG_RISCV_MISALIGNED and add a
separate commit for that.
V7:
- Fix ifdefery build problems
- Move sbi_fwft_is_supported with fwft_set_req struct
- Added Atish Reviewed-by
- Updated KVM vcpu cfg hedeleg value in set_delegation
- Changed SBI ETIME error mapping to ETIMEDOUT
- Fixed a few typos reported by Alok
V6:
- Rename FWFT interface to remove "_local"
- Fix test for MEDELEG values in KVM FWFT support
- Add __init for unaligned_access_init()
- Rebased on master
V5:
- Return ERANGE as mapping for SBI_ERR_BAD_RANGE
- Removed unused sbi_fwft_get()
- Fix kernel for sbi_fwft_local_set_cpumask()
- Fix indentation for sbi_fwft_local_set()
- Remove spurious space in kvm_sbi_fwft_ops.
- Rebased on origin/master
- Remove fixes commits and sent them as a separate series [4]
V4:
- Check SBI version 3.0 instead of 2.0 for FWFT presence
- Use long for kvm_sbi_fwft operation return value
- Init KVM sbi extension even if default_disabled
- Remove revert_on_fail parameter for sbi_fwft_feature_set().
- Fix comments for sbi_fwft_set/get()
- Only handle local features (there are no globals yet in the spec)
- Add new SBI errors to sbi_err_map_linux_errno()
V3:
- Added comment about kvm sbi fwft supported/set/get callback
requirements
- Move struct kvm_sbi_fwft_feature in kvm_sbi_fwft.c
- Add a FWFT interface
V2:
- Added Kselftest for misaligned testing
- Added get_user() usage instead of __get_user()
- Reenable interrupt when possible in misaligned access handling
- Document that riscv supports unaligned-traps
- Fix KVM extension state when an init function is present
- Rework SBI misaligned accesses trap delegation code
- Added support for CPU hotplugging
- Added KVM SBI reset callback
- Added reset for KVM SBI FWFT lock
- Return SBI_ERR_DENIED_LOCKED when LOCK flag is set
Clément Léger (14):
riscv: sbi: add Firmware Feature (FWFT) SBI extensions definitions
riscv: sbi: remove useless parenthesis
riscv: sbi: add new SBI error mappings
riscv: sbi: add FWFT extension interface
riscv: sbi: add SBI FWFT extension calls
riscv: misaligned: request misaligned exception from SBI
riscv: misaligned: use on_each_cpu() for scalar misaligned access
probing
riscv: misaligned: declare misaligned_access_speed under
CONFIG_RISCV_MISALIGNED
riscv: misaligned: move emulated access uniformity check in a function
riscv: misaligned: add a function to check misalign trap delegability
RISC-V: KVM: add SBI extension init()/deinit() functions
RISC-V: KVM: add SBI extension reset callback
RISC-V: KVM: add support for FWFT SBI extension
RISC-V: KVM: add support for SBI_FWFT_MISALIGNED_DELEG
arch/riscv/include/asm/cpufeature.h | 14 +-
arch/riscv/include/asm/kvm_host.h | 5 +-
arch/riscv/include/asm/kvm_vcpu_sbi.h | 12 +
arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h | 29 +++
arch/riscv/include/asm/sbi.h | 60 +++++
arch/riscv/include/uapi/asm/kvm.h | 1 +
arch/riscv/kernel/sbi.c | 81 ++++++-
arch/riscv/kernel/traps_misaligned.c | 112 ++++++++-
arch/riscv/kernel/unaligned_access_speed.c | 8 +-
arch/riscv/kvm/Makefile | 1 +
arch/riscv/kvm/vcpu.c | 4 +-
arch/riscv/kvm/vcpu_sbi.c | 54 +++++
arch/riscv/kvm/vcpu_sbi_fwft.c | 257 +++++++++++++++++++++
arch/riscv/kvm/vcpu_sbi_sta.c | 3 +-
14 files changed, 620 insertions(+), 21 deletions(-)
create mode 100644 arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h
create mode 100644 arch/riscv/kvm/vcpu_sbi_fwft.c
--
2.49.0
This patch series introduces a new feature to netconsole which allows
appending a message ID to the userdata dictionary.
If the msgid feature is enabled, the message ID is built from a per-target
32-bit counter that is incremented and appended to every message sent to the
target.
Example::
echo 1 > "/sys/kernel/config/netconsole/cmdline0/userdata/msgid_enabled"
echo "This is message #1" > /dev/kmsg
echo "This is message #2" > /dev/kmsg
13,434,54928466,-;This is message #1
msgid=1
13,435,54934019,-;This is message #2
msgid=2
This feature can be used by the target to detect if messages were dropped or
reordered before reaching the target. This allows system administrators to
assess the reliability of their netconsole pipeline and detect loss of messages
due to network contention or temporary unavailability.
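On the receiving side, a check along these lines could flag gaps in the msgid sequence. This is a minimal Python sketch that assumes the receiver sees the payload as text lines in the format of the example above. The function name, the anomaly labels and the half-range heuristic for telling a forward gap from a late message are assumptions made for illustration, not part of netconsole.

```python
import re

MSGID_RE = re.compile(r"^\s*msgid=(\d+)\s*$")
U32_MOD = 1 << 32  # the counter is 32 bits wide, so it wraps at 2**32


def check_msgids(lines):
    """Return a list of (kind, expected, got) tuples for sequence anomalies."""
    anomalies = []
    last = None
    for line in lines:
        match = MSGID_RE.match(line)
        if not match:
            continue  # not a msgid line (for example the message text)
        msgid = int(match.group(1))
        if last is None:
            last = msgid
            continue
        expected = (last + 1) % U32_MOD
        if msgid == expected:
            last = msgid
            continue
        # Distance measured modulo 2**32 so that wraparound is not a gap.
        delta = (msgid - expected) % U32_MOD
        if delta < U32_MOD // 2:
            anomalies.append(("dropped", expected, msgid))
            last = msgid  # resynchronise on the newer message
        else:
            # An older ID arrived late; keep the newest ID as reference.
            anomalies.append(("reordered", expected, msgid))
    return anomalies


if __name__ == "__main__":
    received = [
        "13,434,54928466,-;This is message #1",
        " msgid=1",
        "13,435,54934019,-;This is message #2",
        " msgid=2",
    ]
    print(check_msgids(received))
```

Because the comparison is done modulo 2**32, a counter that rolls over from 4294967295 to 0 is treated as a normal increment.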
Suggested-by: Breno Leitao <leitao(a)debian.org>
Signed-off-by: Gustavo Luiz Duarte <gustavold(a)gmail.com>
Note to maintainer:
This will conflict with a fix I sent recently to net:
c85bf1975108 netconsole: fix appending sysdata when sysdata_fields ==
SYSDATA_RELEASE
Please let me know if I should rebase at some point and send a v2.
---
Gustavo Luiz Duarte (5):
netconsole: introduce 'msgid' as a new sysdata field
netconsole: implement configfs for msgid_enabled
netconsole: append msgid to sysdata
selftests: netconsole: Add tests for 'msgid' feature in sysdata
docs: netconsole: document msgid feature
Documentation/networking/netconsole.rst | 22 +++++++
drivers/net/netconsole.c | 67 +++++++++++++++++++++-
.../selftests/drivers/net/netcons_sysdata.sh | 30 ++++++++++
3 files changed, 118 insertions(+), 1 deletion(-)
---
base-commit: 0097c4195b1d0ca57d15979626c769c74747b5a0
change-id: 20250609-netconsole-msgid-b93c6f8e9c60
Best regards,
--
Gustavo Luiz Duarte <gustavold(a)gmail.com>
Tests may wish to add other interfaces to listen on. Notably locally
generated traffic uses dummy interfaces. The multicast daemon needs to know
about these so that it allows forming rules that involve these interfaces,
and so that net.ipv4.conf.X.mc_forwarding is set for the interfaces.
To that end, allow passing in a list of interfaces to configure in addition
to all the physical ones.
Signed-off-by: Petr Machata <petrm(a)nvidia.com>
---
Notes:
CC: Shuah Khan <shuah(a)kernel.org>
CC: linux-kselftest(a)vger.kernel.org
tools/testing/selftests/net/forwarding/lib.sh | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index 88e63562f5c5..5f144d75167a 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -1760,6 +1760,8 @@ mc_send()
adf_mcd_start()
{
+ local ifs=("$@")
+ local if
local i
check_command $MCD || return 1
@@ -1775,6 +1777,16 @@ adf_mcd_start()
$smcroutedir/$table_name.conf
done
+ for if in ${ifs[@]}; do
+ if ! ip_link_has_flag "$if" MULTICAST; then
+ ip link set dev "$if" multicast on
+ defer ip link set dev "$if" multicast off
+ fi
+
+ echo "phyint $if enable" >> \
+ $smcroutedir/$table_name.conf
+ done
+
$MCD -N -I $table_name -f $smcroutedir/$table_name.conf \
-P $smcroutedir/$table_name.pid
busywait "$BUSYWAIT_TIMEOUT" test -e $smcroutedir/$table_name.pid
--
2.49.0