Note: it's net/ only bits and doesn't include changes, which shoulf be
merged separately and are posted separately. The full branch for
convenience is at [1], and the patch is here:
https://lore.kernel.org/io-uring/7486ab32e99be1f614b3ef8d0e9bc77015b173f7.1…
Many modern NICs support configurable receive buffer lengths, and zcrx and
memory providers can use buffers larger than 4K/PAGE_SIZE on x86 to improve
performance. When paired with hw-gro larger rx buffer sizes can drastically
reduce the number of buffers traversing the stack and save a lot of processing
time. It also allows to give to users larger contiguous chunks of data. The
idea was first floated around by Saeed during netdev conf 2024 and was
asked about by a few folks.
Single stream benchmarks showed up to ~30% CPU util improvement.
E.g. comparison for 4K vs 32K buffers using a 200Gbit NIC:
packets=23987040 (MB=2745098), rps=199559 (MB/s=22837)
CPU %usr %nice %sys %iowait %irq %soft %idle
0 1.53 0.00 27.78 2.72 1.31 66.45 0.22
packets=24078368 (MB=2755550), rps=200319 (MB/s=22924)
CPU %usr %nice %sys %iowait %irq %soft %idle
0 0.69 0.00 8.26 31.65 1.83 57.00 0.57
This series adds net infrastructure for memory providers configuring
the size and implements it for bnxt. It's an opt-in feature for drivers,
they should advertise support for the parameter in the qops and must check
if the hardware supports the given size. It's limited to memory providers
as it drastically simplifies implementation. It doesn't affect the fast
path zcrx uAPI, and the sizes is defined in zcrx terms, which allows it
to be flexible and adjusted in the future, see Patch 8 for details.
A liburing example can be found at [2]
full branch:
[1] https://github.com/isilence/linux.git zcrx/large-buffers-v7
Liburing example:
[2] https://github.com/isilence/liburing.git zcrx/rx-buf-len
v7: - Add xa_destroy
- Rebase
v6: - Update docs and add a selftest
v5: https://lore.kernel.org/netdev/cover.1760440268.git.asml.silence@gmail.com/
- Remove all unnecessary bits like configuration via netlink, and
multi-stage queue configuration.
v4: https://lore.kernel.org/all/cover.1760364551.git.asml.silence@gmail.com/
- Update fbnic qops
- Propagate max buf len for hns3
- Use configured buf size in __bnxt_alloc_rx_netmem
- Minor stylistic changes
v3: https://lore.kernel.org/all/cover.1755499375.git.asml.silence@gmail.com/
- Rebased, excluded zcrx specific patches
- Set agg_size_fac to 1 on warning
v2: https://lore.kernel.org/all/cover.1754657711.git.asml.silence@gmail.com/
- Add MAX_PAGE_ORDER check on pp init
- Applied comments rewording
- Adjust pp.max_len based on order
- Patch up mlx5 queue callbacks after rebase
- Minor ->queue_mgmt_ops refactoring
- Rebased to account for both fill level and agg_size_fac
- Pass providers buf length in struct pp_memory_provider_params and
apply it in __netdev_queue_confi().
- Use ->supported_ring_params to validate drivers support of set
qcfg parameters.
Jakub Kicinski (1):
eth: bnxt: adjust the fill level of agg queues with larger buffers
Pavel Begunkov (8):
net: page pool: xa init with destroy on pp init
net: page_pool: sanitise allocation order
net: memzero mp params when closing a queue
net: let pp memory provider to specify rx buf len
eth: bnxt: store rx buffer size per queue
eth: bnxt: allow providers to set rx buf size
io_uring/zcrx: document area chunking parameter
selftests: iou-zcrx: test large chunk sizes
Documentation/networking/iou-zcrx.rst | 20 +++
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 118 ++++++++++++++----
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 +
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 6 +-
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h | 2 +-
include/net/netdev_queues.h | 9 ++
include/net/page_pool/types.h | 1 +
net/core/netdev_rx_queue.c | 14 ++-
net/core/page_pool.c | 4 +
.../selftests/drivers/net/hw/iou-zcrx.c | 72 +++++++++--
.../selftests/drivers/net/hw/iou-zcrx.py | 37 ++++++
11 files changed, 236 insertions(+), 49 deletions(-)
--
2.52.0
This patch series introduces a new configfs attribute that enables sending
messages directly through netconsole without going through the kernel's logging
infrastructure.
This feature allows users to send custom messages, alerts, or status updates
directly to netconsole receivers by writing to
/sys/kernel/config/netconsole/<target>/send_msg, without poluting kernel
buffers, and sending msgs to the serial, which could be slow.
At Meta this is currently used in two cases right now (through printk by
now):
a) When a new workload enters or leave the machine.
b) From time to time, as a "ping" to make sure the netconsole/machine
is alive.
The implementation reuses the existing message transmission functions
(send_msg_udp() and send_ext_msg_udp()) to handle both basic and extended
message formats.
Regarding code organization, this version uses forward declarations for
send_msg_udp() and send_ext_msg_udp() functions rather than relocating them
within the file. While forward declarations do add a small amount of
redundancy, they avoid the larger churn that would result from moving entire
function definitions.
---
Breno Leitao (4):
netconsole: extract message fragmentation into send_msg_udp()
netconsole: Add configfs attribute for direct message sending
selftests/netconsole: Switch to configfs send_msg interface
Documentation: netconsole: Document send_msg configfs attribute
Documentation/networking/netconsole.rst | 40 +++++++++++++++
drivers/net/netconsole.c | 59 ++++++++++++++++++----
.../selftests/drivers/net/netcons_sysdata.sh | 2 +-
3 files changed, 91 insertions(+), 10 deletions(-)
---
base-commit: ab084f0b8d6d2ee4b1c6a28f39a2a7430bdfa7f0
change-id: 20251127-netconsole_send_msg-89813956dc23
Best regards,
--
Breno Leitao <leitao(a)debian.org>
In commit 0297cdc12a87 ("KVM: selftests: Add option to rseq test to
override /dev/cpu_dma_latency"), a 'break' is missed before the option
'l' in the argument parsing loop, which leads to an unexpected core
dump in atoi_paranoid(). It tries to get the latency from non-existent
argument.
host$ ./rseq_test -u
Random seed: 0x6b8b4567
Segmentation fault (core dumped)
Add a 'break' before the option 'l' in the argument parsing loop to avoid
the unexpected core dump.
Fixes: 0297cdc12a87 ("KVM: selftests: Add option to rseq test to override /dev/cpu_dma_latency")
Cc: stable(a)vger.kernel.org # v6.15+
Signed-off-by: Gavin Shan <gshan(a)redhat.com>
---
tools/testing/selftests/kvm/rseq_test.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c
index 1375fca80bcdb..f80ad6b47d16b 100644
--- a/tools/testing/selftests/kvm/rseq_test.c
+++ b/tools/testing/selftests/kvm/rseq_test.c
@@ -215,6 +215,7 @@ int main(int argc, char *argv[])
switch (opt) {
case 'u':
skip_sanity_check = true;
+ break;
case 'l':
latency = atoi_paranoid(optarg);
break;
--
2.51.1
Dear all,
This patchset is just a respin of my latest PR to net-next, including all
modifications requested by Jakub and Sabrina.
However, this time I am also adding patches targeting selftest/net/ovpn, as
they come in handy for testing the new features (originally I wanted
them to be a separate PR, but it doesn't indeed make a lot of sense).
This said, since these kselftest patches are quite invasive, I didn't
feel confident with sending them in a PR right away, but I rather wanted
some feedback from Sabrina and Shuah first, if possible.
So here we go.
Once I get some approval on this batch, I'll send then send them all
to net-next again as PRv2.
Thanks a lot!
Regards,
Antonio Quartulli (1):
selftests: ovpn: allow compiling ovpn-cli.c with mbedtls3
Qingfang Deng (1):
ovpn: pktid: use bitops.h API
Ralf Lici (10):
selftests: ovpn: add notification parsing and matching
ovpn: notify userspace on client float event
ovpn: add support for asymmetric peer IDs
selftests: ovpn: check asymmetric peer-id
selftests: ovpn: add test for the FW mark feature
ovpn: consolidate crypto allocations in one chunk
ovpn: use bound device in UDP when available
selftests: ovpn: add test for bound device
ovpn: use bound address in UDP when available
selftests: ovpn: add test for bound address
Sabrina Dubroca (1):
ovpn: use correct array size to parse nested attributes in
ovpn_nl_key_swap_doit
Documentation/netlink/specs/ovpn.yaml | 23 +-
drivers/net/ovpn/crypto_aead.c | 162 +++++++---
drivers/net/ovpn/io.c | 8 +-
drivers/net/ovpn/netlink-gen.c | 13 +-
drivers/net/ovpn/netlink-gen.h | 6 +-
drivers/net/ovpn/netlink.c | 98 +++++-
drivers/net/ovpn/netlink.h | 2 +
drivers/net/ovpn/peer.c | 6 +
drivers/net/ovpn/peer.h | 4 +-
drivers/net/ovpn/pktid.c | 11 +-
drivers/net/ovpn/pktid.h | 2 +-
drivers/net/ovpn/skb.h | 13 +-
drivers/net/ovpn/udp.c | 10 +-
include/uapi/linux/ovpn.h | 2 +
tools/testing/selftests/net/ovpn/Makefile | 17 +-
.../selftests/net/ovpn/check_requirements.py | 37 +++
tools/testing/selftests/net/ovpn/common.sh | 60 +++-
tools/testing/selftests/net/ovpn/data64.key | 6 +-
.../selftests/net/ovpn/json/peer0-float.json | 9 +
.../selftests/net/ovpn/json/peer0.json | 6 +
.../selftests/net/ovpn/json/peer1-float.json | 1 +
.../selftests/net/ovpn/json/peer1.json | 1 +
.../selftests/net/ovpn/json/peer2-float.json | 1 +
.../selftests/net/ovpn/json/peer2.json | 1 +
.../selftests/net/ovpn/json/peer3-float.json | 1 +
.../selftests/net/ovpn/json/peer3.json | 1 +
.../selftests/net/ovpn/json/peer4-float.json | 1 +
.../selftests/net/ovpn/json/peer4.json | 1 +
.../selftests/net/ovpn/json/peer5-float.json | 1 +
.../selftests/net/ovpn/json/peer5.json | 1 +
.../selftests/net/ovpn/json/peer6-float.json | 1 +
.../selftests/net/ovpn/json/peer6.json | 1 +
tools/testing/selftests/net/ovpn/ovpn-cli.c | 281 +++++++++++-------
.../selftests/net/ovpn/requirements.txt | 1 +
.../testing/selftests/net/ovpn/tcp_peers.txt | 11 +-
.../selftests/net/ovpn/test-bind-addr.sh | 10 +
tools/testing/selftests/net/ovpn/test-bind.sh | 117 ++++++++
.../selftests/net/ovpn/test-close-socket.sh | 2 +-
tools/testing/selftests/net/ovpn/test-mark.sh | 81 +++++
tools/testing/selftests/net/ovpn/test.sh | 57 +++-
.../testing/selftests/net/ovpn/udp_peers.txt | 12 +-
41 files changed, 855 insertions(+), 224 deletions(-)
create mode 100755 tools/testing/selftests/net/ovpn/check_requirements.py
create mode 100644 tools/testing/selftests/net/ovpn/json/peer0-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer0.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer1-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer1.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer2-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer2.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer3-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer3.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer4-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer4.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer5-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer5.json
create mode 120000 tools/testing/selftests/net/ovpn/json/peer6-float.json
create mode 100644 tools/testing/selftests/net/ovpn/json/peer6.json
create mode 120000 tools/testing/selftests/net/ovpn/requirements.txt
create mode 100755 tools/testing/selftests/net/ovpn/test-bind-addr.sh
create mode 100755 tools/testing/selftests/net/ovpn/test-bind.sh
create mode 100755 tools/testing/selftests/net/ovpn/test-mark.sh
--
2.51.2
We would like to add support for checkpoint/restoring file descriptors
open on these "unmounted" mounts to CRIU (Checkpoint/Restore in
Userspace) [1].
Currently, we have no way to get mount info for these "unmounted" mounts
since they do appear in /proc/<pid>/mountinfo and statmount does not
work on them, since they do not belong to any mount namespace.
This patch helps us by providing a way to get mountinfo for these
"unmounted" mounts by using a fd on the mount.
Changes from v6 [2] to v7:
* Add kselftests for STATMOUNT_BY_FD flag.
* Instead of renaming mnt_id_req.mnt_ns_fd to mnt_id_req.fd introduce a
union so struct mnt_id_req looks like this:
struct mnt_id_req {
__u32 size;
union {
__u32 mnt_ns_fd;
__u32 mnt_fd;
};
__u64 mnt_id;
__u64 param;
__u64 mnt_ns_id;
};
* In case of STATMOUNT_BY_FD grab mnt_ns inside of do_statmount(),
since we get mnt_ns from mnt, which should happen under namespace lock.
* Remove the modifications made to grab_requested_mnt_ns, those were
never needed.
Changes from v5 [3] to v6:
* Instead of returning "[unmounted]" as the mount point for "unmounted"
mounts, we unset the STATMOUNT_MNT_POINT flag in statmount.mask.
* Instead of returning 0 as the mnt_ns_id for "unmounted" mounts, we
unset the STATMOUNT_MNT_NS_ID flag in statmount.mask.
* Added comment in `do_statmount` clarifying that the caller sets s->mnt
in case of STATMOUNT_BY_FD.
* In `do_statmount` move the mnt_ns_id and mnt_ns_empty() check just
before lookup_mnt_in_ns().
* We took another look at the capability checks for getting information
for "unmounted" mounts using an fd and decided to remove them for the
following reasons:
- All fs related information is available via fstatfs() without any
capability check.
- Mount information is also available via /proc/pid/mountinfo (without
any capability check).
- Given that we have access to a fd on the mount which tells us that
we had access to the mount at some point (or someone that had access
gave us the fd). So, we should be able to access mount info.
Changes from v4 [4] to v5:
Check only for s->root.mnt to be NULL instead of checking for both
s->root.mnt and s->root.dentry (I did not find a case where only one of
them would be NULL).
* Only allow system root (CAP_SYS_ADMIN in init_user_ns) to call
statmount() on fd's on "unmounted" mounts. We (mostly Pavel) spent some
time thinking about how our previous approach (of checking the opener's
file credentials) caused problems.
Please take a look at the linked pictures they describe everything more
clearly.
Case 1: A fd is on a normal mount (Link to Picture: [5])
Consider, a situation where we have two processes P1 and P2 and a file
F1. F1 is opened on mount ns M1 by P1. P1 is nested inside user
namespace U1 and U2. P2 is also in U1. P2 is also in a pid namespace and
mount namespace separate from M1.
P1 sends F1 to P2 (using a unix socket). But, P2 is unable to call
statmount() on F1 because since it is a separate pid and mount
namespace. This is good and expected.
Case 2: A fd is on a "unmounted" mount (Link to Picture: [6])
Consider a similar situation as Case 1. But now F1 is on a mounted that
has been "unmounted". Now, since we used openers credentials to check
for permissions P2 ends up having the ability call statmount() and get
mount info for this "unmounted" mount.
Hence, It is better to restrict the ability to call statmount() on fds
on "unmounted" mounts to system root only (There could also be other
cases than the one described above).
Changes from v3 [7] to v4:
* Change the string returned when there is no mountpoint to be
"[unmounted]" instead of "[detached]".
* Remove the new DEFINE_FREE put_file and use the one already present in
include/linux/file.h (fput) [8].
* Inside listmount consistently pass 0 in flags to copy_mnt_id_req and
prepare_klistmount()->grab_requested_mnt_ns() and remove flags from the
prepare_klistmount prototype.
* If STATMOUNT_BY_FD is set, check for mnt_ns_id == 0 && mnt_id == 0.
Changes from v2 [9] to v3:
* Rename STATMOUNT_FD flag to STATMOUNT_BY_FD.
* Fixed UAF bug caused by the reference to fd_mount being bound by scope
of CLASS(fd_raw, f)(kreq.fd) by using fget_raw instead.
* Reused @spare parameter in mnt_id_req instead of adding new fields to
the struct.
Changes from v1 [10] to v2:
v1 of this patchset, took a different approach and introduced a new
umount_mnt_ns, to which "unmounted" mounts would be moved to (instead of
their namespace being NULL) thus allowing them to be still available via
statmount.
Introducing umount_mnt_ns complicated namespace locking and modified
performance sensitive code [11] and it was agreed upon that fd-based
statmount would be better.
This code is also available on github [12].
[1]: https://github.com/checkpoint-restore/criu/pull/2754
[2]: https://lore.kernel.org/all/20251118084836.2114503-1-b.sachdev1904@gmail.co…
[3]: https://lore.kernel.org/criu/20251109053921.1320977-2-b.sachdev1904@gmail.c…
[4]: https://lore.kernel.org/all/20251029052037.506273-2-b.sachdev1904@gmail.com/
[5]: https://github.com/bsach64/linux/blob/statmount-fd-v5/fd_on_normal_mount.png
[6]: https://github.com/bsach64/linux/blob/statmount-fd-v5/file_on_unmounted_mou…
[7]: https://lore.kernel.org/all/20251024181443.786363-1-b.sachdev1904@gmail.com/
[8]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/inc…
[9]: https://lore.kernel.org/linux-fsdevel/20251011124753.1820802-1-b.sachdev190…
[10]: https://lore.kernel.org/linux-fsdevel/20251002125422.203598-1-b.sachdev1904…
[11]: https://lore.kernel.org/linux-fsdevel/7e4d9eb5-6dde-4c59-8ee3-358233f082d0@…
[12]: https://github.com/bsach64/linux/tree/statmount-fd-v7
Bhavik Sachdev (3):
statmount: permission check should return EPERM
statmount: accept fd as a parameter
selftests: statmount: tests for STATMOUNT_BY_FD
fs/namespace.c | 102 ++++---
include/uapi/linux/mount.h | 10 +-
.../filesystems/statmount/statmount.h | 15 +-
.../filesystems/statmount/statmount_test.c | 261 +++++++++++++++++-
.../filesystems/statmount/statmount_test_ns.c | 101 ++++++-
5 files changed, 430 insertions(+), 59 deletions(-)
--
2.52.0
Dear friends,
this patch series adds support for nested seccomp listeners. It allows container
runtimes and other sandboxing software to install seccomp listeners on top of
existing ones, which is useful for nested LXC containers and other similar use-cases.
I decided to go with conservative approach and limit the maximum number of nested listeners
to 8 per seccomp filter chain (MAX_LISTENERS_PER_PATH). This is done to avoid dynamic memory
allocations in the very hot __seccomp_filter() function, where we use a preallocated static
array on the stack to track matched listeners. 8 nested listeners should be enough for
almost any practical scenarios.
Expecting potential discussions around this patch series, I'm going to present a talk
at LPC 2025 about the design and implementation details of this feature [1].
Git tree (based on for-next/seccomp):
v2: https://github.com/mihalicyn/linux/commits/seccomp.mult.listeners.v2
current: https://github.com/mihalicyn/linux/commits/seccomp.mult.listeners
Changelog for version 2:
- add some explanatory comments
- add RWB tags from Tycho Andersen (thanks, Tycho! ;) )
- CC-ed Aleksa as he might be interested in this stuff too
Links to previous versions:
v1: https://lore.kernel.org/all/20251201122406.105045-1-aleksandr.mikhalitsyn@c…
tree: https://github.com/mihalicyn/linux/commits/seccomp.mult.listeners.v1
Link: https://lpc.events/event/19/contributions/2241/ [1]
Cc: linux-doc(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org
Cc: bpf(a)vger.kernel.org
Cc: Kees Cook <kees(a)kernel.org>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Will Drewry <wad(a)chromium.org>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Aleksa Sarai <cyphar(a)cyphar.com>
Cc: Tycho Andersen <tycho(a)tycho.pizza>
Cc: Andrei Vagin <avagin(a)gmail.com>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Stéphane Graber <stgraber(a)stgraber.org>
Alexander Mikhalitsyn (6):
seccomp: remove unused argument from seccomp_do_user_notification
seccomp: prepare seccomp_run_filters() to support more than one
listener
seccomp: limit number of listeners in seccomp tree
seccomp: handle multiple listeners case
seccomp: relax has_duplicate_listeners check
tools/testing/selftests/seccomp: test nested listeners
.../userspace-api/seccomp_filter.rst | 6 +
include/linux/seccomp.h | 3 +-
include/uapi/linux/seccomp.h | 13 +-
kernel/seccomp.c | 116 +++++++++++--
tools/include/uapi/linux/seccomp.h | 13 +-
tools/testing/selftests/seccomp/seccomp_bpf.c | 162 ++++++++++++++++++
6 files changed, 286 insertions(+), 27 deletions(-)
--
2.43.0
This patchset introduces target resume capability to netconsole allowing
it to recover targets when underlying low-level interface comes back
online.
The patchset starts by refactoring netconsole state representation in
order to allow representing deactivated targets (targets that are
disabled due to interfaces unregister).
It then modifies netconsole to handle NETDEV_REGISTER events for such
targets, setups netpoll and forces the device UP. Targets are matched with
incoming interfaces depending on how they were bound in netconsole
(by mac or interface name). For these reasons, we also attempt resuming
on NETDEV_CHANGENAME.
The patchset includes a selftest that validates netconsole target state
transitions and that target is functional after resumed.
Signed-off-by: Andre Carvalho <asantostc(a)gmail.com>
---
Changes in v8:
- Handle NETDEV_REGISTER/CHANGENAME instead of NETDEV_UP (and force the device
UP), to increase the chances of succesfully resuming a target. This
requires using a workqueue instead of inline in the event notifier as
we can't UP the device otherwise.
- Link to v7: https://lore.kernel.org/r/20251126-netcons-retrigger-v7-0-1d86dba83b1c@gmai…
Changes in v7:
- selftest: use ${EXIT_STATUS} instead of ${ksft_pass} to avoid
shellcheck warning
- Link to v6: https://lore.kernel.org/r/20251121-netcons-retrigger-v6-0-9c03f5a2bd6f@gmai…
Changes in v6:
- Rebase on top of net-next to resolve conflicts, no functional changes.
- Link to v5: https://lore.kernel.org/r/20251119-netcons-retrigger-v5-0-2c7dda6055d6@gmai…
Changes in v5:
- patch 3: Set (de)enslaved target as DISABLED instead of DEACTIVATED to prevent
resuming it.
- selftest: Fix test cleanup by moving trap line to outside of loop and remove
unneeded 'local' keyword
- Rename maybe_resume_target to resume_target, add netconsole_ prefix to
process_resumable_targets.
- Hold device reference before calling __netpoll_setup.
- Link to v4: https://lore.kernel.org/r/20251116-netcons-retrigger-v4-0-5290b5f140c2@gmai…
Changes in v4:
- Simplify selftest cleanup, removing trap setup in loop.
- Drop netpoll helper (__setup_netpoll_hold) and manage reference inside
netconsole.
- Move resume_list processing logic to separate function.
- Link to v3: https://lore.kernel.org/r/20251109-netcons-retrigger-v3-0-1654c280bbe6@gmai…
Changes in v3:
- Resume by mac or interface name depending on how target was created.
- Attempt to resume target without holding target list lock, by moving
the target to a temporary list. This is required as netpoll may
attempt to allocate memory.
- Link to v2: https://lore.kernel.org/r/20250921-netcons-retrigger-v2-0-a0e84006237f@gmai…
Changes in v2:
- Attempt to resume target in the same thread, instead of using
workqueue .
- Add wrapper around __netpoll_setup (patch 4).
- Renamed resume_target to maybe_resume_target and moved conditionals to
inside its implementation, keeping code more clear.
- Verify that device addr matches target mac address when target was
setup using mac.
- Update selftest to cover targets bound by mac and interface name.
- Fix typo in selftest comment and sort tests alphabetically in
Makefile.
- Link to v1:
https://lore.kernel.org/r/20250909-netcons-retrigger-v1-0-3aea904926cf@gmai…
---
Andre Carvalho (3):
netconsole: convert 'enabled' flag to enum for clearer state management
netconsole: resume previously deactivated target
selftests: netconsole: validate target resume
Breno Leitao (2):
netconsole: add target_state enum
netconsole: add STATE_DEACTIVATED to track targets disabled by low level
drivers/net/netconsole.c | 171 +++++++++++++++++----
tools/testing/selftests/drivers/net/Makefile | 1 +
.../selftests/drivers/net/lib/sh/lib_netcons.sh | 35 ++++-
.../selftests/drivers/net/netcons_resume.sh | 97 ++++++++++++
4 files changed, 270 insertions(+), 34 deletions(-)
---
base-commit: ed01d2069e8b40eb283050b7119c25a67542a585
change-id: 20250816-netcons-retrigger-a4f547bfc867
Best regards,
--
Andre Carvalho <asantostc(a)gmail.com>
When replying to a ICMPv6 echo request that comes from localhost address
the right output ifindex is 1 (lo) and not rt6i_idev dev index. Use the
skb device ifindex instead. This fixes pinging to a local address from
localhost source address.
$ ping6 -I ::1 2001:1:1::2 -c 3
PING 2001:1:1::2 (2001:1:1::2) from ::1 : 56 data bytes
64 bytes from 2001:1:1::2: icmp_seq=1 ttl=64 time=0.037 ms
64 bytes from 2001:1:1::2: icmp_seq=2 ttl=64 time=0.069 ms
64 bytes from 2001:1:1::2: icmp_seq=3 ttl=64 time=0.122 ms
2001:1:1::2 ping statistics
3 packets transmitted, 3 received, 0% packet loss, time 2032ms
rtt min/avg/max/mdev = 0.037/0.076/0.122/0.035 ms
Fixes: 1b70d792cf67 ("ipv6: Use rt6i_idev index for echo replies to a local address")
Signed-off-by: Fernando Fernandez Mancera <fmancera(a)suse.de>
---
net/ipv6/icmp.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 5d2f90babaa5..5de254043133 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -965,7 +965,9 @@ static enum skb_drop_reason icmpv6_echo_reply(struct sk_buff *skb)
fl6.daddr = ipv6_hdr(skb)->saddr;
if (saddr)
fl6.saddr = *saddr;
- fl6.flowi6_oif = icmp6_iif(skb);
+ fl6.flowi6_oif = ipv6_addr_type(&fl6.daddr) & IPV6_ADDR_LOOPBACK ?
+ skb->dev->ifindex :
+ icmp6_iif(skb);
fl6.fl6_icmp_type = type;
fl6.flowi6_mark = mark;
fl6.flowi6_uid = sock_net_uid(net, NULL);
--
2.51.1