Linux-kselftest-mirror - lists.linaro.org

[PATCH] mm/huge_memory: drop beyond-EOF folios with the right number of refs.

by Zi Yan

When an after-split folio is large and needs to be dropped due to EOF,
folio_put_refs(folio, folio_nr_pages(folio)) should be used to drop all
page cache refs. Otherwise, the folio will not be freed, causing a memory
leak.

This leak happens on a filesystem with blocksize > page_size when a
truncate is performed: the blocksize makes folios split to >0 order ones,
so the truncated folios are never freed.

Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages")
Reported-by: Hugh Dickins <hughd(a)google.com>
Closes: https://lore.kernel.org/all/fcbadb7f-dd3e-21df-f9a7-2853b53183c4@google.com/
Cc: stable(a)vger.kernel.org
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
---
 mm/huge_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3d3ebdc002d5..373781b21e5c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3304,7 +3304,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 				folio_account_cleaned(tail,
 					inode_to_wb(folio->mapping->host));
 			__filemap_remove_folio(tail, NULL);
-			folio_put(tail);
+			folio_put_refs(tail, folio_nr_pages(tail));
 		} else if (!folio_test_anon(folio)) {
 			__xa_store(&folio->mapping->i_pages, tail->index,
 					tail, 0);
-- 
2.47.2
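The reference arithmetic behind this fix can be illustrated with a toy model. This is only a sketch, not kernel code: it assumes that a page-cache folio carries one reference per page (as the commit message implies) plus one reference held by the split caller, and the function name remaining_refs() is hypothetical.

```python
def remaining_refs(order, refs_dropped_at_eof):
    """Toy model of a beyond-EOF after-split folio's refcount.

    Assumption (not taken from kernel source): the folio starts with
    nr_pages page-cache references plus one reference owned by the
    split caller, which the caller drops later.
    """
    nr_pages = 1 << order
    refs = nr_pages + 1            # page-cache refs + caller ref
    refs -= refs_dropped_at_eof    # the put done in the beyond-EOF branch
    refs -= 1                      # caller's final put
    return refs


# Order-0 folio: dropping a single ref is enough.
print(remaining_refs(order=0, refs_dropped_at_eof=1))
# Order-2 folio with the old folio_put(): refs stay above zero (leak).
print(remaining_refs(order=2, refs_dropped_at_eof=1))
# Order-2 folio with folio_put_refs(tail, nr_pages): refcount reaches zero.
print(remaining_refs(order=2, refs_dropped_at_eof=4))
```

Under these assumptions the old code only balances for order-0 folios, which matches the report that the leak needs after-split folios of order > 0.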

5 months, 2 weeks


[PATCH v9 0/8] Buddy allocator like (or non-uniform) folio split

by Zi Yan

Hi all,

This patchset adds a new buddy allocator like (or non-uniform) large folio
split from an order-n folio to order-m with m < n. It:

1. reduces the total number of after-split folios from 2^(n-m) to n-m+1;
2. reduces the amount of memory needed for multi-index xarray split from
   2^(n/6-m/6) to n/6-m/6, assuming XA_CHUNK_SHIFT=6;
3. keeps more large folios after a split: instead of all order-m folios,
   it leaves folios of order-(n-1) down to order-m.

For example, to split an order-9 to order-0, folio split generates 10 (or
11 for anonymous memory) folios instead of 512, allocates 1 xa_node
instead of 8, and leaves 1 order-8, 1 order-7, ..., 1 order-1 and 2
order-0 folios (or 4 order-0 for anonymous memory) instead of 512 order-0
folios.

Instead of duplicating existing split_huge_page*() code, __folio_split()
is introduced as the shared backend code for both
split_huge_page_to_list_to_order() and folio_split(). __folio_split() can
support both uniform split and buddy allocator like (or non-uniform)
split. All existing split_huge_page*() users can be gradually converted to
use folio_split() if possible. In this patchset, I converted
truncate_inode_partial_folio() to use folio_split().

xfstests quick group passed for both tmpfs and xfs.

It is on top of mm-everything-2025-02-26-03-56 with V8 reverted. It is
ready to be merged.

Changelog
===
From V8[11]:
1. Removed gfp parameter from xas_try_split() and GFP_NOWAIT is used all
   the time. (per Baolin Wang)
2. Used __xas_init_node_for_split() instead of
   __xas_alloc_node_for_split() and moved node allocation out. It fixed a
   bug when xa_node is pre-allocated by xas_nomem() before xas_try_split()
   is called without being initialized for split.

From V7[9]:
1. Fixed a wrong function name in lib/test_xarray.c.
2. Made __split_folio_to_order() never fail, since the old order check is
   already done in __folio_split(). (per David Hildenbrand)
3. Fixed an issue reported by syzbot[10] by not dropping the original
   folio during truncate.
4. Fixed a WARNING when READ_ONLY_THP_FOR_FS is enabled. (Thank David
   Hildenbrand for reporting the issue)
5. Used two separate struct page* parameters, split_at and lock_at, to
   specify at which subpage the non-uniform split happens and which
   subpage to keep locked after the split, respectively. It improves code
   readability.

From V6[8]:
1. Added an xarray function xas_try_split() to support iterative folio
   split, removing the need of using xas_split_alloc() and xas_split().
   The function guarantees that at most one xa_node is allocated for each
   call.
2. Added concrete numbers of after-split folios and xa_node savings to
   cover letter, commit log. (per Andrew)

From V5[7]:
1. Split shmem to any lower order patches are in mm tree, so dropped from
   this series.
2. Renamed split_folio_at() to try_folio_split() to clarify that
   non-uniform split will not be used if it is not supported.

From V4[6]:
1. Enabled shmem support in both uniform and buddy allocator like split
   and added selftests for it.
2. Added functions to check if uniform split and buddy allocator like
   split are supported for the given folio and order.
3. Made truncate fall back to uniform split if buddy allocator split is
   not supported (CONFIG_READ_ONLY_THP_FOR_FS and FS without large folio).
4. Added the missing folio_clear_has_hwpoisoned() to
   __split_unmapped_folio().

From V3[5]:
1. Used xas_split_alloc(GFP_NOWAIT) instead of xas_nomem(), since extra
   operations inside xas_split_alloc() are needed for correctness.
2. Enabled folio_split() for shmem and no issue was found with xfstests
   quick test group.
3. Split both ends of a truncate range in truncate_inode_partial_folio()
   to avoid wasting memory in shmem truncate (per David Hildenbrand).
4. Removed page_in_folio_offset() since page_folio() does the same thing.
5. Finished truncate related tests from xfstests quick test group on XFS
   and tmpfs without issues.
6. Disabled buddy allocator like split on CONFIG_READ_ONLY_THP_FOR_FS and
   FS without large folio. This check was missed in the prior versions.

From V2[3]:
1. Incorporated all the feedback from Kirill[4].
2. Used GFP_NOWAIT for xas_nomem().
3. Tested the code path when xas_nomem() fails.
4. Added selftests for folio_split().
5. Fixed no THP config build error.

From V1[2]:
1. Split the original patch 1 into multiple ones for easy review (per
   Kirill).
2. Added xas_destroy() to avoid memory leak.
3. Fixed nr_dropped not used error (per kernel test robot).
4. Added proper error handling when xas_nomem() fails to allocate memory
   for xas_split() during buddy allocator like split.

From RFC[1]:
1. Merged backend code of split_huge_page_to_list_to_order() and
   folio_split(). The same code is used for both uniform split and buddy
   allocator like split.
2. Used xas_nomem() instead of xas_split_alloc() for folio_split().
3. folio_split() now leaves the first after-split folio unlocked, instead
   of the one containing the given page, since the caller of
   truncate_inode_partial_folio() locks and unlocks the first folio.
4. Extended split_huge_page debugfs to use folio_split().
5. Added truncate_inode_partial_folio() as first user of folio_split().

Design
===

folio_split() splits a large folio in the same way as buddy allocator
splits a large free page for allocation. The purpose is to minimize the
number of folios after the split. For example, if user wants to free the
3rd subpage in an order-9 folio, folio_split() will split the order-9
folio as:

O-0, O-0, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8 if it is anon
O-1, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8      if it is pagecache

The two layouts differ because anon folio does not support order-1 yet.

The split process is similar to existing approach:
1. Unmap all page mappings (split PMD mappings if exist);
2. Split meta data like memcg, page owner, page alloc tag;
3. Copy meta data in struct folio to sub pages, but instead of splitting
   the whole folio into multiple smaller ones with the same order in a
   shot, this approach splits the folio iteratively. Taking the example
   above, this approach first splits the original order-9 into two
   order-8, then splits left part of order-8 to two order-7 and so on;
4. Post-process split folios, like write mapping->i_pages for pagecache,
   adjust folio refcounts, add split folios to corresponding list;
5. Remap split folios;
6. Unlock split folios.

__split_unmapped_folio() and __split_folio_to_order() replace
__split_huge_page() and __split_huge_page_tail() respectively.
__split_unmapped_folio() uses different approaches to perform uniform
split and buddy allocator like split:
1. uniform split: one single call to __split_folio_to_order() is used to
   uniformly split the given folio. All resulting folios are put back to
   the list after split. The folio containing the given page is left to
   caller to unlock and others are unlocked.
2. buddy allocator like (or non-uniform) split: (old_order - new_order)
   calls to __split_folio_to_order() are used to split the given folio at
   order N to order N-1. After each call, the target folio is changed to
   the one containing the page, which is given as a folio_split()
   parameter. After each call, folios not containing the page are put back
   to the list. The folio containing the page is put back to the list when
   its order is new_order. All folios are unlocked except the first folio,
   which is left to caller to unlock.

Patch Overview
===
1. Patch 1 added a new xarray function xas_try_split() to perform
   iterative xarray split.
2. Patch 2 added __split_unmapped_folio() and __split_folio_to_order() to
   prepare for moving to new backend split code.
3. Patch 3 moved common code in split_huge_page_to_list_to_order() to
   __folio_split().
4. Patch 4 added new folio_split() and made
   split_huge_page_to_list_to_order() share the new
   __split_unmapped_folio() with folio_split().
5. Patch 5 removed no longer used __split_huge_page() and
   __split_huge_page_tail().
6. Patch 6 added a new in_folio_offset to split_huge_page debugfs for
   folio_split() test.
7. Patch 7 used try_folio_split() for truncate operation.
8. Patch 8 added folio_split() tests.

Any comments and/or suggestions are welcome. Thanks.

[1] https://lore.kernel.org/linux-mm/20241008223748.555845-1-ziy@nvidia.com/
[2] https://lore.kernel.org/linux-mm/20241028180932.1319265-1-ziy@nvidia.com/
[3] https://lore.kernel.org/linux-mm/20241101150357.1752726-1-ziy@nvidia.com/
[4] https://lore.kernel.org/linux-mm/e6ppwz5t4p4kvir6eqzoto4y5fmdjdxdyvxvtw43nc…
[5] https://lore.kernel.org/linux-mm/20241205001839.2582020-1-ziy@nvidia.com/
[6] https://lore.kernel.org/linux-mm/20250106165513.104899-1-ziy@nvidia.com/
[7] https://lore.kernel.org/linux-mm/20250116211042.741543-1-ziy@nvidia.com/
[8] https://lore.kernel.org/linux-mm/20250205031417.1771278-1-ziy@nvidia.com/
[9] https://lore.kernel.org/linux-mm/20250211155034.268962-1-ziy@nvidia.com/
[10] https://lore.kernel.org/all/67af65cb.050a0220.21dd3.004a.GAE@google.com/
[11] https://lore.kernel.org/linux-mm/20250218235012.1542225-1-ziy@nvidia.com/

Zi Yan (8):
  xarray: add xas_try_split() to split a multi-index entry
  mm/huge_memory: add two new (not yet used) functions for folio_split()
  mm/huge_memory: move folio split common code to __folio_split()
  mm/huge_memory: add buddy allocator like (non-uniform) folio_split()
  mm/huge_memory: remove the old, unused __split_huge_page()
  mm/huge_memory: add folio_split() to debugfs testing interface
  mm/truncate: use buddy allocator like folio split for truncate operation
  selftests/mm: add tests for folio_split(), buddy allocator like split

 Documentation/core-api/xarray.rst             |  14 +-
 include/linux/huge_mm.h                       |  36 +
 include/linux/xarray.h                        |   6 +
 lib/test_xarray.c                             |  52 ++
 lib/xarray.c                                  | 131 ++-
 mm/huge_memory.c                              | 755 ++++++++++++------
 mm/truncate.c                                 |  31 +-
 tools/testing/radix-tree/Makefile             |   1 +
 .../selftests/mm/split_huge_page_test.c       |  34 +-
 9 files changed, 783 insertions(+), 277 deletions(-)

-- 
2.47.2
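The iterative halving described in the Design section can be sketched in a few lines of Python. This is an illustrative model only, not the kernel implementation: nonuniform_split() is a hypothetical helper, folios are represented as (start_page, order) tuples, and the anonymous-memory case is modelled by assuming an order-2 folio is split straight into order-0 pieces because order-1 anon folios are not supported.

```python
def nonuniform_split(old_order, new_order, page_index, anon=False):
    """Model a buddy-allocator-like split; returns sorted (start, order) pairs."""
    if not 0 <= new_order < old_order:
        raise ValueError("new_order must be in [0, old_order)")
    if not 0 <= page_index < (1 << old_order):
        raise ValueError("page_index outside the folio")
    if anon and new_order == 1:
        raise ValueError("order-1 anon folios are assumed unsupported")

    pieces = []
    start, order = 0, old_order
    while order > new_order:
        target = order - 1
        if anon and target == 1:
            target = 0          # skip order-1 for anon: go from order-2 to order-0
        size = 1 << target
        for s in range(start, start + (1 << order), size):
            if s <= page_index < s + size:
                keep = s        # keep splitting the piece that holds the page
            else:
                pieces.append((s, target))
        start, order = keep, target
    pieces.append((start, order))
    return sorted(pieces)


# "3rd subpage" taken as page index 2 (an assumption about the indexing).
pagecache = [o for _, o in nonuniform_split(9, 0, 2)]
anon = [o for _, o in nonuniform_split(9, 0, 2, anon=True)]
print(pagecache)
print(anon)
```

With these inputs the model yields 10 folios for pagecache and 11 for anon, and the sizes of the pieces add up to the original 512 pages in both cases, consistent with the counts quoted in the cover letter.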

5 months, 2 weeks


[PATCH bpf-next v2 0/3] bpf: Fix use-after-free of sockmap

by Jiayuan Chen

1. Issue
Syzkaller reported this issue [1].

2. Reproduce
We can reproduce this issue by using the
test_sockmap_with_close_on_write() test I provided in selftest. You also
need to apply the following patch to ensure 100% reproducibility (sleep
after checking sock):

'''
static void sk_psock_verdict_data_ready(struct sock *sk)
{
	.......
	if (unlikely(!sock))
		return;
+	if (!strcmp("test_progs", current->comm)) {
+		printk("sleep 2s to wait socket freed\n");
+		mdelay(2000);
+		printk("sleep end\n");
+	}
	ops = READ_ONCE(sock->ops);
	if (!ops || !ops->read_skb)
		return;
}
'''

Then run './test_progs -v sockmap_basic'. If the kernel has KASAN
enabled [2], you will see the following warning:

'''
BUG: KASAN: slab-use-after-free in sk_psock_verdict_data_ready+0x29b/0x2d0
Read of size 8 at addr ffff88813a777020 by task test_progs/47055

Tainted: [O]=OOT_MODULE
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 print_address_description.constprop.0+0x30/0x420
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 print_report+0xb7/0x270
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 ? kasan_addr_to_slab+0xd/0xa0
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 kasan_report+0xca/0x100
 ? sk_psock_verdict_data_ready+0x29b/0x2d0
 sk_psock_verdict_data_ready+0x29b/0x2d0
 unix_stream_sendmsg+0x4a6/0xa40
 ? __pfx_unix_stream_sendmsg+0x10/0x10
 ? fdget+0x2c1/0x3a0
 __sys_sendto+0x39c/0x410
'''

3. Reason
'''
CPU0                                     CPU1
unix_stream_sendmsg(sk):
  other = unix_peer(sk)
  other->sk_data_ready(other):
    socket *sock = sk->sk_socket
    if (unlikely(!sock))
        return;
                                         close(other):
                                         ...
                                         other->close()
                                         free(socket)
    READ_ONCE(sock->ops)
    ^
    use 'sock' after free
'''

For TCP, UDP, or other protocols, we have already performed
rcu_read_lock() when the network stack receives packets in ip_input.c:
'''
ip_local_deliver_finish():
    rcu_read_lock()
    ip_protocol_deliver_rcu()
        xxx_rcv
    rcu_read_unlock()
'''

However, for Unix sockets, sk_data_ready is called directly from the
process context without rcu_read_lock() protection.

4. Solution
Based on the fact that the 'struct socket' is released using call_rcu(),
we add rcu_read_{un}lock() at the entrance and exit of our sk_data_ready.
It will not increase performance overhead, at least for TCP and UDP, which
are already in a relatively large critical section.

Of course, we can also add a custom callback for Unix sockets and call
rcu_read_lock() before calling _verdict_data_ready like this:
'''
if (sk_is_unix(sk))
    sk->sk_data_ready = sk_psock_verdict_data_ready_rcu;
else
    sk->sk_data_ready = sk_psock_verdict_data_ready;

sk_psock_verdict_data_ready_rcu():
    rcu_read_lock()
    sk_psock_verdict_data_ready()
    rcu_read_unlock()
'''
However, this will cause too many branches, and it's not suitable to
distinguish network protocols in skmsg.c.

[1] https://syzkaller.appspot.com/bug?extid=dd90a702f518e0eac072
[2] https://syzkaller.appspot.com/text?tag=KernelConfig&x=1362a5aee630ff34

---
v1 -> v2:
1. Add Fixes tag.
2. Extend selftest of edge case for TCP/UDP sockets.
3. Add Reviewed-by and Acked-by tag.
https://lore.kernel.org/bpf/20250226132242.52663-1-jiayuan.chen@linux.dev/T…

Jiayuan Chen (3):
  bpf, sockmap: avoid using sk_socket after free
  selftests/bpf: Add socketpair to create_pair to support unix socket
  selftests/bpf: Add edge case tests for sockmap

 net/core/skmsg.c                              | 18 ++++--
 .../selftests/bpf/prog_tests/socket_helpers.h | 13 +++-
 .../selftests/bpf/prog_tests/sockmap_basic.c  | 59 +++++++++++++++++++
 3 files changed, 84 insertions(+), 6 deletions(-)

-- 
2.47.1

5 months, 2 weeks


[PATCH bpf-next v12 0/5] xsk: TX metadata Launch Time support

by Song Yoong Siang

This series expands the XDP TX metadata framework to allow user
applications to pass per packet 64-bit launch time directly to the kernel
driver, requesting launch time hardware offload support. The XDP TX
metadata framework will not perform any clock conversion or packet
reordering.

Please note that the role of Tx metadata is just to pass the launch time,
not to enable the offload feature. Users will need to enable the launch
time hardware offload feature of the device by using the respective
command, such as the tc-etf command.

Although some devices use the tc-etf command to enable their launch time
hardware offload feature, xsk packets will not go through the etf qdisc.
Therefore, in my opinion, the launch time should always be based on the
PTP Hardware Clock (PHC). Thus, I did not include a clock ID to indicate
the clock source.

To simplify the test steps, I modified the xdp_hw_metadata bpf self-test
tool in such a way that it will set the launch time based on the offset
provided by the user and the value of the Receive Hardware Timestamp,
which is against the PHC. This will eliminate the need to discipline
System Clock with the PHC and then use clock_gettime() to get the time.

Please note that AF_XDP lacks a feedback mechanism to inform the
application if the requested launch time is invalid. So, users are
expected to be familiar with the horizon of the launch time of the device
they use and not request a launch time that is beyond the horizon.
Otherwise, the driver might interpret the launch time incorrectly and
react wrongly. For stmmac and igc, where modulo computation is used, a
launch time larger than the horizon will cause the device to transmit the
packet earlier than the requested launch time.

Although there is no feedback mechanism for the launch time request for
now, users can still check whether the requested launch time is working
or not by requesting the Transmit Completion Hardware Timestamp.

v12:
  - Fix the comment in include/uapi/linux/if_xdp.h to align with what is
    generated by ./tools/net/ynl/ynl-regen.sh to avoid dirty tree error in
    the netdev/ynl checks.

v11: https://lore.kernel.org/netdev/20250216074302.956937-1-yoong.siang.song@int…
  - regenerate netdev_xsk_flags based on latest netdev.yaml (Jakub)

v10: https://lore.kernel.org/netdev/20250207021943.814768-1-yoong.siang.song@int…
  - use net_err_ratelimited(), instead of net_ratelimit() (Maciej)
  - accumulate the amount of used descs in local variable and update the
    igc_metadata_request::used_desc once (Maciej)
  - Ensure reverse christmas tree rule (Maciej)

V9: https://lore.kernel.org/netdev/20250206060408.808325-1-yoong.siang.song@int…
  - Remove the igc_desc_unused() checking (Maciej)
  - Ensure that skb allocation and DMA mapping work before proceeding to
    fill in igc_tx_buffer info, context desc, and data desc (Maciej)
  - Rate limit the error messages (Maciej)
  - Update the comment to indicate that the 2 descriptors needed by the
    empty frame are already taken into consideration (Maciej)
  - Handle the case where the insertion of an empty frame fails and
    explain the reason behind (Maciej)
  - put self SOB tag as last tag (Maciej)

V8: https://lore.kernel.org/netdev/20250205024116.798862-1-yoong.siang.song@int…
  - check the number of used descriptor in xsk_tx_metadata_request() by
    using used_desc of struct igc_metadata_request, and then decreases the
    budget with it (Maciej)
  - submit another bug fix patch to set the buffer type for empty frame
    (Maciej):
    https://lore.kernel.org/netdev/20250205023603.798819-1-yoong.siang.song@int…

V7: https://lore.kernel.org/netdev/20250204004907.789330-1-yoong.siang.song@int…
  - split the refactoring code of igc empty packet insertion into a
    separate commit (Faizal)
  - add explanation on why the value "4" is used as igc transmit budget
    (Faizal)
  - perform a stress test by sending 1000 packets with 10ms interval and
    launch time set to 500us in the future (Faizal & Yong Liang)

V6: https://lore.kernel.org/netdev/20250116155350.555374-1-yoong.siang.song@int…
  - fix selftest build errors by using asprintf() and realloc(), instead
    of managing the buffer sizes manually (Daniel, Stanislav)

V5: https://lore.kernel.org/netdev/20250114152718.120588-1-yoong.siang.song@int…
  - change netdev feature name from tx-launch-time to tx-launch-time-fifo
    to explicitly state the FIFO behaviour (Stanislav)
  - improve the looping of xdp_hw_metadata app to wait for packet tx
    completion to be more readable by using clock_gettime() (Stanislav)
  - add launch time setup steps into xdp_hw_metadata app (Stanislav)

V4: https://lore.kernel.org/netdev/20250106135506.9687-1-yoong.siang.song@intel…
  - added XDP launch time support to the igc driver (Jesper & Florian)
  - added per-driver launch time limitation on xsk-tx-metadata.rst
    (Jesper)
  - added explanation on FIFO behavior on xsk-tx-metadata.rst (Jakub)
  - added step to enable launch time in the commit message (Jesper &
    Willem)
  - explicitly documented the type of launch_time and which clock source
    it is against (Willem)

V3: https://lore.kernel.org/netdev/20231203165129.1740512-1-yoong.siang.song@in…
  - renamed to use launch time (Jesper & Willem)
  - changed the default launch time in xdp_hw_metadata apps from 1s to
    0.1s because some NICs do not support such a large future time.

V2: https://lore.kernel.org/netdev/20231201062421.1074768-1-yoong.siang.song@in…
  - renamed to use Earliest TxTime First (Willem)
  - renamed to use txtime (Willem)

V1: https://lore.kernel.org/netdev/20231130162028.852006-1-yoong.siang.song@int…

Song Yoong Siang (5):
  xsk: Add launch time hardware offload support to XDP Tx metadata
  selftests/bpf: Add launch time request to xdp_hw_metadata
  net: stmmac: Add launch time support to XDP ZC
  igc: Refactor empty frame insertion for launch time support
  igc: Add launch time support to XDP ZC

 Documentation/netlink/specs/netdev.yaml       |   4 +
 Documentation/networking/xsk-tx-metadata.rst  |  62 +++++++
 drivers/net/ethernet/intel/igc/igc.h          |   1 +
 drivers/net/ethernet/intel/igc/igc_main.c     | 143 +++++++++++----
 drivers/net/ethernet/stmicro/stmmac/stmmac.h  |   2 +
 .../net/ethernet/stmicro/stmmac/stmmac_main.c |  13 ++
 include/net/xdp_sock.h                        |  10 ++
 include/net/xdp_sock_drv.h                    |   1 +
 include/uapi/linux/if_xdp.h                   |  10 ++
 include/uapi/linux/netdev.h                   |   3 +
 net/core/netdev-genl.c                        |   2 +
 net/xdp/xsk.c                                 |   3 +
 tools/include/uapi/linux/if_xdp.h             |  10 ++
 tools/include/uapi/linux/netdev.h             |   3 +
 tools/testing/selftests/bpf/xdp_hw_metadata.c | 168 +++++++++++++++++-
 15 files changed, 396 insertions(+), 39 deletions(-)

-- 
2.34.1
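The horizon caveat in the cover letter can be made concrete with a small numeric sketch. It is a toy model, not the stmmac or igc computation: the 1-second horizon, the function name effective_launch_time(), and the idea that the device keeps only the delay modulo the horizon are all assumptions chosen for illustration.

```python
HORIZON_NS = 1_000_000_000  # hypothetical device horizon: 1 second


def effective_launch_time(now_ns, launch_ns, horizon_ns=HORIZON_NS):
    """Toy model: the device keeps only (delay mod horizon)."""
    delay = launch_ns - now_ns
    if delay <= 0:
        return now_ns                     # already due: send immediately
    return now_ns + (delay % horizon_ns)  # wraps when delay >= horizon


now = 0
within = effective_launch_time(now, 500_000_000)    # 0.5 s ahead
beyond = effective_launch_time(now, 1_500_000_000)  # 1.5 s ahead
print(within)
print(beyond)
```

In this model a request 1.5 s ahead wraps around and is sent at the 0.5 s mark, that is, earlier than requested. Since AF_XDP gives no error for such a request, comparing the Transmit Completion Hardware Timestamp with the requested launch time is the practical way to notice it.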

5 months, 2 weeks


[PATCH net-next 3/4] tools/testing/selftests/cgroup/cgroup_util: add cg_get_id helper

by Alexander Mikhalitsyn

Cc: linux-kselftest(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: netdev(a)vger.kernel.org
Cc: cgroups(a)vger.kernel.org
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Willem de Bruijn <willemb(a)google.com>
Cc: Leon Romanovsky <leon(a)kernel.org>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Kuniyuki Iwashima <kuniyu(a)amazon.com>
Cc: Lennart Poettering <mzxreary(a)0pointer.de>
Cc: Luca Boccassi <bluca(a)debian.org>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: "Michal Koutný" <mkoutny(a)suse.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn(a)canonical.com>
---
 tools/testing/selftests/cgroup/cgroup_util.c | 15 +++++++++++++++
 tools/testing/selftests/cgroup/cgroup_util.h |  2 ++
 2 files changed, 17 insertions(+)

diff --git a/tools/testing/selftests/cgroup/cgroup_util.c b/tools/testing/selftests/cgroup/cgroup_util.c
index 1e2d46636a0c..b60e0e1433f4 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@ -205,6 +205,21 @@ int cg_open(const char *cgroup, const char *control, int flags)
 	return open(path, flags);
 }
 
+/*
+ * Returns cgroup id on success, or -1 on failure.
+ */
+uint64_t cg_get_id(const char *cgroup)
+{
+	struct stat st;
+	int ret;
+
+	ret = stat(cgroup, &st);
+	if (ret)
+		return -1;
+
+	return st.st_ino;
+}
+
 int cg_write_numeric(const char *cgroup, const char *control, long value)
 {
 	char buf[64];
diff --git a/tools/testing/selftests/cgroup/cgroup_util.h b/tools/testing/selftests/cgroup/cgroup_util.h
index 19b131ee7707..3f2d9676ceda 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.h
+++ b/tools/testing/selftests/cgroup/cgroup_util.h
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <stdbool.h>
+#include <stdint.h>
 #include <stdlib.h>
 
 #include "../kselftest.h"
@@ -39,6 +40,7 @@ long cg_read_key_long(const char *cgroup, const char *control, const char *key);
 extern long cg_read_lc(const char *cgroup, const char *control);
 extern int cg_write(const char *cgroup, const char *control, char *buf);
 extern int cg_open(const char *cgroup, const char *control, int flags);
+extern uint64_t cg_get_id(const char *cgroup);
 int cg_write_numeric(const char *cgroup, const char *control, long value);
 extern int cg_run(const char *cgroup,
 		  int (*fn)(const char *cgroup, void *arg),
-- 
2.43.0
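The helper added by this patch derives the cgroup id from the inode number of the cgroup directory. A rough userspace equivalent in Python is sketched below for illustration; it is not part of the patch, the Python function merely mirrors the C helper's name, and returning None on failure (rather than -1 cast to uint64_t) is a choice made here.

```python
import os


def cg_get_id(cgroup_path):
    """Return the inode number of a cgroup directory, or None on failure.

    Assumption carried over from the patch: the directory's st_ino is
    what the selftests treat as the cgroup id.
    """
    try:
        return os.stat(cgroup_path).st_ino
    except OSError:
        return None


# Any existing directory works for a dry run; on a real system the
# argument would be a path under the cgroup mount point.
print(cg_get_id("."))
print(cg_get_id("/nonexistent/cgroup/path"))
```

Returning None keeps the failure case distinguishable from every valid id, whereas the C helper's -1 becomes UINT64_MAX once converted to uint64_t, so callers of the C version would need to compare against that value.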

5 months, 2 weeks


[PATCH v3 0/2] scanf: convert self-test to KUnit

by Tamir Duberstein

This is one of just 3 remaining "Test Module" kselftests (the others
being bitmap and printf), the rest having been converted to KUnit. In
addition to the enclosed patch, please consider this an RFC on the
removal of the "Test Module" kselftest machinery.

I tested this using:

$ tools/testing/kunit/kunit.py run --arch arm64 --make_options LLVM=1 scanf

Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Changes in v3:
- Reduce diff noise in lib/Makefile. (Petr Mladek)
- Split `scanf_test` into a few test cases. New output:
  : =================== scanf (10 subtests) ====================
  : [PASSED] numbers_simple
  : ====================== numbers_list =======================
  : [PASSED] delim=" "
  : [PASSED] delim=":"
  : [PASSED] delim=","
  : [PASSED] delim="-"
  : [PASSED] delim="/"
  : ================== [PASSED] numbers_list ===================
  : ============ numbers_list_field_width_typemax =============
  : [PASSED] delim=" "
  : [PASSED] delim=":"
  : [PASSED] delim=","
  : [PASSED] delim="-"
  : [PASSED] delim="/"
  : ======== [PASSED] numbers_list_field_width_typemax =========
  : =========== numbers_list_field_width_val_width ============
  : [PASSED] delim=" "
  : [PASSED] delim=":"
  : [PASSED] delim=","
  : [PASSED] delim="-"
  : [PASSED] delim="/"
  : ======= [PASSED] numbers_list_field_width_val_width ========
  : [PASSED] numbers_slice
  : [PASSED] numbers_prefix_overflow
  : [PASSED] test_simple_strtoull
  : [PASSED] test_simple_strtoll
  : [PASSED] test_simple_strtoul
  : [PASSED] test_simple_strtol
  : ====================== [PASSED] scanf ======================
  : ============================================================
  : Testing complete. Ran 22 tests: passed: 22
  : Elapsed time: 5.517s total, 0.001s configuring, 5.440s building, 0.067s running
- Link to v2: https://lore.kernel.org/r/20250203-scanf-kunit-convert-v2-1-277a618d804e@gm…

Changes in v2:
- Rename lib/{test_scanf.c => scanf_kunit.c}. (Andy Shevchenko)
- Link to v1: https://lore.kernel.org/r/20250131-scanf-kunit-convert-v1-1-0976524f0eba@gm…

---
Tamir Duberstein (2):
      scanf: convert self-test to KUnit
      scanf: break kunit into test cases

 MAINTAINERS                          |   2 +-
 arch/m68k/configs/amiga_defconfig    |   1 -
 arch/m68k/configs/apollo_defconfig   |   1 -
 arch/m68k/configs/atari_defconfig    |   1 -
 arch/m68k/configs/bvme6000_defconfig |   1 -
 arch/m68k/configs/hp300_defconfig    |   1 -
 arch/m68k/configs/mac_defconfig      |   1 -
 arch/m68k/configs/multi_defconfig    |   1 -
 arch/m68k/configs/mvme147_defconfig  |   1 -
 arch/m68k/configs/mvme16x_defconfig  |   1 -
 arch/m68k/configs/q40_defconfig      |   1 -
 arch/m68k/configs/sun3_defconfig     |   1 -
 arch/m68k/configs/sun3x_defconfig    |   1 -
 arch/powerpc/configs/ppc64_defconfig |   1 -
 lib/Kconfig.debug                    |  20 +-
 lib/Makefile                         |   2 +-
 lib/scanf_kunit.c                    | 800 ++++++++++++++++++++++++++++++++++
 lib/test_scanf.c                     | 814 -----------------------------------
 tools/testing/selftests/lib/Makefile |   2 +-
 tools/testing/selftests/lib/config   |   1 -
 tools/testing/selftests/lib/scanf.sh |   4 -
 21 files changed, 820 insertions(+), 838 deletions(-)
---
base-commit: a86bf2283d2c9769205407e2b54777c03d012939
change-id: 20250131-scanf-kunit-convert-f70dc33bb34c

Best regards,
-- 
Tamir Duberstein <tamird(a)gmail.com>

5 months, 2 weeks


[PATCH bpf-next 0/2] selftests/bpf: Move test_lwt_seg6local to test_progs

by Bastien Curutchet (eBPF Foundation)

Hi all,

This patch series continues the work to migrate the script tests into
prog_tests.

test_lwt_seg6local.sh tests some bpf_lwt_* helpers. It contains only one
test, and that test uses a network topology quite different from the ones
found in the other prog_tests/lwt_*.c files, so I add a new
prog_tests/lwt_seg6local.c file.

While working on the migration I noticed that some routes present in the
script weren't needed, so PATCH 1 deletes them and then PATCH 2 migrates
the test into the test_progs framework.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
---
Bastien Curutchet (eBPF Foundation) (2):
      selftests/bpf: lwt_seg6local: Remove unused routes
      selftests/bpf: lwt_seg6local: Move test to test_progs

 tools/testing/selftests/bpf/Makefile               |   1 -
 .../selftests/bpf/prog_tests/lwt_seg6local.c       | 176 +++++++++++++++++++++
 tools/testing/selftests/bpf/test_lwt_seg6local.sh  | 156 ------------------
 3 files changed, 176 insertions(+), 157 deletions(-)
---
base-commit: 86eb3a47230a41c6ccf5cdae8ee0a7e7292aa29d
change-id: 20250214-seg6local-64bcde44b66e

Best regards,
-- 
Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>

5 months, 2 weeks


[PATCH v2] kunit: tool: Fix bug in parsing test plan

by Rae Moar

A bug was identified where the KTAP below caused an infinite loop:

 TAP version 13
 ok 4 test_case
 1..4

The infinite loop was caused by the parser not parsing a test plan when
it follows a test result line.

Fix this bug so that the test plan line is parsed correctly.

Signed-off-by: Rae Moar <rmoar(a)google.com>
---
Changes since v1:
- Remove error reported when test plan is missing.

 tools/testing/kunit/kunit_parser.py | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
index 29fc27e8949b..da53a709773a 100644
--- a/tools/testing/kunit/kunit_parser.py
+++ b/tools/testing/kunit/kunit_parser.py
@@ -759,7 +759,7 @@ def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest:
 		# If parsing the main/top-level test, parse KTAP version line and
 		# test plan
 		test.name = "main"
-		ktap_line = parse_ktap_header(lines, test, printer)
+		parse_ktap_header(lines, test, printer)
 		test.log.extend(parse_diagnostic(lines))
 		parse_test_plan(lines, test)
 		parent_test = True
@@ -768,13 +768,12 @@ def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest:
 		# the KTAP version line and/or subtest header line
 		ktap_line = parse_ktap_header(lines, test, printer)
 		subtest_line = parse_test_header(lines, test)
+		test.log.extend(parse_diagnostic(lines))
+		parse_test_plan(lines, test)
 		parent_test = (ktap_line or subtest_line)
 		if parent_test:
-			# If KTAP version line and/or subtest header is found, attempt
-			# to parse test plan and print test header
-			test.log.extend(parse_diagnostic(lines))
-			parse_test_plan(lines, test)
 			print_test_header(test, printer)
+
 	expected_count = test.expected_count
 	subtests = []
 	test_num = 1

base-commit: 0619a4868fc1b32b07fb9ed6c69adc5e5cf4e4b2
-- 
2.49.0.rc0.332.g42c0ae87b1-goog
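The failure mode is easier to see with a deliberately tiny line-based parser. The sketch below is not kunit_parser.py and shares no code with it; parse_ktap() and both regular expressions are hypothetical. It only illustrates the property the fix restores: a plan line such as "1..4" is recognised wherever it appears, and every input line is consumed exactly once, so the loop always terminates.

```python
import re

# Hypothetical patterns for result and plan lines (simplified KTAP subset).
RESULT = re.compile(r"^(ok|not ok) (\d+)(?: (?:- )?(.*))?$")
PLAN = re.compile(r"^1\.\.(\d+)$")


def parse_ktap(text):
    """Return (expected_count, [(passed, number, name), ...])."""
    expected = None
    results = []
    for line in text.splitlines():   # each line is visited exactly once
        line = line.strip()
        plan = PLAN.match(line)
        if plan:
            expected = int(plan.group(1))
            continue
        result = RESULT.match(line)
        if result:
            passed = result.group(1) == "ok"
            results.append((passed, int(result.group(2)), result.group(3) or ""))
        # Anything else (version line, diagnostics) is skipped.
    return expected, results


sample = "TAP version 13\nok 4 test_case\n1..4\n"
print(parse_ktap(sample))
```

Feeding it the three-line KTAP from the commit message returns a plan of 4 and one passing result numbered 4, rather than stalling on the plan line that follows the result.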

5 months, 2 weeks


[PATCH v5 bpf-next 0/2] security: Propagate caller information in bpf hooks

by Blaise Boscaccy

Hello,

While trying to implement an eBPF gatekeeper program, we ran into an
issue where the LSM hooks are missing some relevant data.

Certain subcommands passed to the bpf() syscall can be invoked from
either the kernel or userspace. Additionally, some fields in the bpf_attr
struct contain pointers, and depending on where the subcommand was
invoked, they could point to either user or kernel memory. One example of
this is the bpf_prog_load subcommand and its fd_array. This data is made
available and used by the verifier but not made available to the LSM
subsystem. This patchset simply exposes that information to applicable
LSM hooks.

Change list:
- v4 -> v5
  - merge v4 selftest breakout patch back into a single patch
  - change "is_kernel" to "kernel"
  - add selftest using new kernel flag
- v3 -> v4
  - split out selftest changes into a separate patch
- v2 -> v3
  - reorder params so that the new boolean flag is the last param
  - fixup function signatures in bpf selftests
- v1 -> v2
  - Pass a boolean flag in lieu of bpfptr_t

Revisions:
- v4
  https://lore.kernel.org/bpf/20250304203123.3935371-1-bboscaccy@linux.micros…
- v3
  https://lore.kernel.org/bpf/20250303222416.3909228-1-bboscaccy@linux.micros…
- v2
  https://lore.kernel.org/bpf/20250228165322.3121535-1-bboscaccy@linux.micros…
- v1
  https://lore.kernel.org/bpf/20250226003055.1654837-1-bboscaccy@linux.micros…

Blaise Boscaccy (2):
  security: Propagate caller information in bpf hooks
  selftests/bpf: Add a kernel flag test for LSM bpf hook

 include/linux/lsm_hook_defs.h                 |  6 +--
 include/linux/security.h                      | 12 +++---
 kernel/bpf/syscall.c                          | 10 ++---
 security/security.c                           | 15 ++++---
 security/selinux/hooks.c                      |  6 +--
 .../selftests/bpf/prog_tests/kernel_flag.c    | 43 +++++++++++++++++++
 .../selftests/bpf/progs/rcu_read_lock.c       |  3 +-
 .../bpf/progs/test_cgroup1_hierarchy.c        |  4 +-
 .../selftests/bpf/progs/test_kernel_flag.c    | 32 ++++++++++++++
 .../bpf/progs/test_kfunc_dynptr_param.c       |  6 +--
 .../selftests/bpf/progs/test_lookup_key.c     |  2 +-
 .../selftests/bpf/progs/test_ptr_untrusted.c  |  2 +-
 .../bpf/progs/test_task_under_cgroup.c        |  2 +-
 .../bpf/progs/test_verify_pkcs7_sig.c         |  2 +-
 14 files changed, 112 insertions(+), 33 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/kernel_flag.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_kernel_flag.c

-- 
2.48.1

5 months, 2 weeks


[PATCH v5 0/3] printf: convert self-test to KUnit

by Tamir Duberstein

This is one of just 3 remaining "Test Module" kselftests (the others being bitmap and scanf), the rest having been converted to KUnit. I tested this using: $ tools/testing/kunit/kunit.py run --arch arm64 --make_options LLVM=1 printf I have also sent out a series converting scanf[0]. Link: https://lore.kernel.org/all/20250204-scanf-kunit-convert-v3-0-386d7c3ee714@… [0] Signed-off-by: Tamir Duberstein <tamird(a)gmail.com> --- Changes in v5: - Update `do_test` `__printf` annotation (Rasmus Villemoes). - Link to v4: https://lore.kernel.org/r/20250214-printf-kunit-convert-v4-0-c254572f1565@g… Changes in v4: - Add patch "implicate test line in failure messages". - Rebase on linux-next, move scanf_kunit.c into lib/tests/. - Link to v3: https://lore.kernel.org/r/20250210-printf-kunit-convert-v3-0-ee6ac5500f5e@g… Changes in v3: - Remove extraneous trailing newlines from failure messages. - Replace `pr_warn` with `kunit_warn`. - Drop arch changes. - Remove KUnit boilerplate from CONFIG_PRINTF_KUNIT_TEST help text. - Restore `total_tests` counting. - Remove tc_fail macro in last patch. - Link to v2: https://lore.kernel.org/r/20250207-printf-kunit-convert-v2-0-057b23860823@g… Changes in v2: - Incorporate code review from prior work[0] by Arpitha Raghunandan. 
- Link to v1: https://lore.kernel.org/r/20250204-printf-kunit-convert-v1-0-ecf1b846a4de@g… Link: https://lore.kernel.org/lkml/20200817043028.76502-1-98.arpi@gmail.com/t/#u [0] --- Tamir Duberstein (3): printf: convert self-test to KUnit printf: break kunit into test cases printf: implicate test line in failure messages Documentation/core-api/printk-formats.rst | 4 +- MAINTAINERS | 2 +- lib/Kconfig.debug | 12 +- lib/Makefile | 1 - lib/tests/Makefile | 1 + lib/{test_printf.c => tests/printf_kunit.c} | 437 ++++++++++++---------------- tools/testing/selftests/lib/config | 1 - tools/testing/selftests/lib/printf.sh | 4 - 8 files changed, 200 insertions(+), 262 deletions(-) --- base-commit: d4b0fd87ff0d4338b259dc79b2b3c6f7e70e8afa change-id: 20250131-printf-kunit-convert-fd4012aa2ec6 Best regards, -- Tamir Duberstein <tamird(a)gmail.com>

5 months, 2 weeks


[PATCH net-next v8 0/6] tun: Introduce virtio-net hashing feature

by Akihiko Odaki

virtio-net has two uses for hashes: one is RSS and the other is hash reporting. Conventionally the hash calculation was done by the VMM. However, computing the hash after the queue was chosen defeats the purpose of RSS. Another approach is to use an eBPF steering program. This approach has another downside: it cannot report the calculated hash due to the restrictive nature of eBPF. Introduce code to compute hashes in the kernel in order to overcome these challenges. An alternative solution is to extend the eBPF steering program so that it will be able to report to the userspace, but it is based on context rewrites, which are in feature freeze. We can adopt kfuncs, but they will not be UAPIs. We opt for ioctl to align with other relevant UAPIs (KVM and vhost_net). The patches for QEMU to use this new feature were submitted as an RFC and are available at: https://patchew.org/QEMU/20240915-hash-v3-0-79cb08d28647@daynix.com/ This work was presented at LPC 2024: https://lpc.events/event/18/contributions/1963/ V1 -> V2: Changed to introduce a new BPF program type. Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com> --- Changes in v8: - Disabled IPv6 to eliminate noises in tests. - Added a branch in tap to avoid unnecessary dissection when hash reporting is disabled. - Removed unnecessary rtnl_lock(). - Extracted code to handle new ioctls into separate functions to avoid adding extra NULL checks to the code handling other ioctls. - Introduced variable named "fd" to __tun_chr_ioctl(). - s/-/=/g in a patch message to avoid confusing Git. - Link to v7: https://lore.kernel.org/r/20250228-rss-v7-0-844205cbbdd6@daynix.com Changes in v7: - Ensured to set hash_report to VIRTIO_NET_HASH_REPORT_NONE for VHOST_NET_F_VIRTIO_NET_HDR. - s/4/sizeof(u32)/ in patch "virtio_net: Add functions for hashing". - Added tap_skb_cb type. - Rebased. 
- Link to v6: https://lore.kernel.org/r/20250109-rss-v6-0-b1c90ad708f6@daynix.com Changes in v6: - Extracted changes to fill vnet header holes into another series. - Squashed patches "skbuff: Introduce SKB_EXT_TUN_VNET_HASH", "tun: Introduce virtio-net hash reporting feature", and "tun: Introduce virtio-net RSS" into patch "tun: Introduce virtio-net hash feature". - Dropped the RFC tag. - Link to v5: https://lore.kernel.org/r/20241008-rss-v5-0-f3cf68df005d@daynix.com Changes in v5: - Fixed a compilation error with CONFIG_TUN_VNET_CROSS_LE. - Optimized the calculation of the hash value according to: https://git.dpdk.org/dpdk/commit/?id=3fb1ea032bd6ff8317af5dac9af901f1f324ca… - Added patch "tun: Unify vnet implementation". - Dropped patch "tap: Pad virtio header with zero". - Added patch "selftest: tun: Test vnet ioctls without device". - Reworked selftests to skip for older kernels. - Documented the case when the underlying device is deleted and packets have queue_mapping set by TC. - Reordered test harness arguments. - Added code to handle fragmented packets. - Link to v4: https://lore.kernel.org/r/20240924-rss-v4-0-84e932ec0e6c@daynix.com Changes in v4: - Moved tun_vnet_hash_ext to if_tun.h. - Renamed virtio_net_toeplitz() to virtio_net_toeplitz_calc(). - Replaced htons() with cpu_to_be16(). - Changed virtio_net_hash_rss() to return void. - Reordered variable declarations in virtio_net_hash_rss(). - Removed virtio_net_hdr_v1_hash_from_skb(). - Updated messages of "tap: Pad virtio header with zero" and "tun: Pad virtio header with zero". - Fixed vnet_hash allocation size. - Ensured to free vnet_hash when destructing tun_struct. - Link to v3: https://lore.kernel.org/r/20240915-rss-v3-0-c630015db082@daynix.com Changes in v3: - Reverted back to add ioctl. - Split patch "tun: Introduce virtio-net hashing feature" into "tun: Introduce virtio-net hash reporting feature" and "tun: Introduce virtio-net RSS". 
- Changed to reuse hash values computed for automq instead of performing RSS hashing when hash reporting is requested but RSS is not. - Extracted relevant data from struct tun_struct to keep it minimal. - Added kernel-doc. - Changed to allow calling TUNGETVNETHASHCAP before TUNSETIFF. - Initialized num_buffers with 1. - Added a test case for unclassified packets. - Fixed error handling in tests. - Changed tests to verify that the queue index will not overflow. - Rebased. - Link to v2: https://lore.kernel.org/r/20231015141644.260646-1-akihiko.odaki@daynix.com --- Akihiko Odaki (6): virtio_net: Add functions for hashing net: flow_dissector: Export flow_keys_dissector_symmetric tun: Introduce virtio-net hash feature selftest: tun: Test vnet ioctls without device selftest: tun: Add tests for virtio-net hashing vhost/net: Support VIRTIO_NET_F_HASH_REPORT Documentation/networking/tuntap.rst | 7 + drivers/net/Kconfig | 1 + drivers/net/tap.c | 67 +++- drivers/net/tun.c | 98 +++++- drivers/net/tun_vnet.h | 159 ++++++++- drivers/vhost/net.c | 49 +-- include/linux/if_tap.h | 2 + include/linux/skbuff.h | 3 + include/linux/virtio_net.h | 188 ++++++++++ include/net/flow_dissector.h | 1 + include/uapi/linux/if_tun.h | 75 ++++ net/core/flow_dissector.c | 3 +- net/core/skbuff.c | 4 + tools/testing/selftests/net/Makefile | 2 +- tools/testing/selftests/net/tun.c | 656 ++++++++++++++++++++++++++++++++++- 15 files changed, 1254 insertions(+), 61 deletions(-) --- base-commit: dd83757f6e686a2188997cb58b5975f744bb7786 change-id: 20240403-rss-e737d89efa77 prerequisite-change-id: 20241230-tun-66e10a49b0c7:v6 prerequisite-patch-id: 871dc5f146fb6b0e3ec8612971a8e8190472c0fb prerequisite-patch-id: 2797ed249d32590321f088373d4055ff3f430a0e prerequisite-patch-id: ea3370c72d4904e2f0536ec76ba5d26784c0cede prerequisite-patch-id: 837e4cf5d6b451424f9b1639455e83a260c4440d prerequisite-patch-id: ea701076f57819e844f5a35efe5cbc5712d3080d prerequisite-patch-id: 701646fb43ad04cc64dd2bf13c150ccbe6f828ce 
prerequisite-patch-id: 53176dae0c003f5b6c114d43f936cf7140d31bb5 prerequisite-change-id: 20250116-buffers-96e14bf023fc:v2 prerequisite-patch-id: 25fd4f99d4236a05a5ef16ab79f3e85ee57e21cc Best regards, -- Akihiko Odaki <akihiko.odaki(a)daynix.com>
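
For readers unfamiliar with the hash involved: the changelog above mentions virtio_net_toeplitz_calc(), and RSS hashing of this kind is a Toeplitz hash over selected packet fields. The sketch below shows only the general Toeplitz construction and is not the code from the series; the function name `toeplitz_hash`, the key `DEMO_KEY`, and the sample addresses are assumptions made up for this illustration.

```python
def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    if len(key) < len(data) + 4:
        raise ValueError("key must be at least 4 bytes longer than data")
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        # Walk the input bits from most significant to least significant.
        if data[i // 8] & (0x80 >> (i % 8)):
            # XOR in the 32-bit window of the key that starts at bit i.
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


# Arbitrary 40-byte demo key (hypothetical, not a key from the patch).
DEMO_KEY = bytes(range(1, 41))

if __name__ == "__main__":
    # 12 bytes standing in for IPv4 source, IPv4 destination, and two ports.
    tuple_bytes = bytes([192, 0, 2, 1, 198, 51, 100, 2, 0x04, 0xD2, 0x00, 0x50])
    print(hex(toeplitz_hash(DEMO_KEY, tuple_bytes)))
```

Because the hash is a XOR of key windows, it is linear over XOR of inputs, which is one reason it is cheap to compute per packet.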

5 months, 2 weeks


[PATCH v2 0/4] RISC-V KVM PMU fix and selftest improvement

by Atish Patra

This series adds a fix for the KVM PMU code and improves the pmu selftest by allowing it to generate a precise number of interrupts. It also provides an additional option to the overflow test that allows the user to generate a custom number of LCOFI interrupts.

Signed-off-by: Atish Patra <atishp(a)rivosinc.com>
---
Changes in v2:
- Initialized the local overflow irq variable to 0 to indicate that it is not an allowed value.
- Moved the introduction of argument option `n` to the last patch.
- Link to v1: https://lore.kernel.org/r/20250226-kvm_pmu_improve-v1-0-74c058c2bf6d@rivosi…
---
Atish Patra (4):
  RISC-V: KVM: Disable the kernel perf counter during configure
  KVM: riscv: selftests: Do not start the counter in the overflow handler
  KVM: riscv: selftests: Change command line option
  KVM: riscv: selftests: Allow number of interrupts to be configurable

 arch/riscv/kvm/vcpu_pmu.c | 1 +
 tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 81 ++++++++++++++++--------
 2 files changed, 57 insertions(+), 25 deletions(-)
---
base-commit: 0ad2507d5d93f39619fc42372c347d6006b64319
change-id: 20250225-kvm_pmu_improve-fffd038b2404
--
Regards,
Atish patra

5 months, 2 weeks


[PATCH v2 net 0/6] eth: bnxt: fix several bugs in the bnxt module

by Taehee Yoo

The first patch fixes an incorrect skb->truesize. When an xdp-mb prog returns XDP_PASS, an skb is allocated and initialized. Currently, the truesize is calculated as BNXT_RX_PAGE_SIZE * sinfo->nr_frags, but sinfo->nr_frags is flushed by napi_build_skb(). So the patch stores sinfo before calling napi_build_skb() and then uses it to calculate truesize.

The second patch fixes a kernel panic in bnxt_queue_mem_alloc(). bnxt_queue_mem_alloc() accesses the rx ring descriptors. The rx ring descriptors are allocated when the interface is brought up and freed when it is brought down. So, if bnxt_queue_mem_alloc() is called while the interface is down, a kernel panic occurs. This patch makes bnxt_queue_mem_alloc() return -ENETDOWN if the rx ring descriptors are not allocated.

The third patch fixes a kernel panic in bnxt_queue_{start | stop}(). When a queue is restarted, bnxt_queue_{start | stop}() are called. These functions set MRU to 0 to stop packet flow and then set up the remaining things. The MRU variable is a member of vnic_info[]; the first vnic_info is the default one and the second is for ntuple. The first vnic_info is always allocated when the interface is up, but the second is allocated only when ntuple is enabled (ethtool -K eth0 ntuple <on | off>). Currently, bnxt_queue_{start | stop}() access vnic_info[BNXT_VNIC_NTUPLE] regardless of whether ntuple is enabled, so a kernel panic occurs. This patch makes bnxt_queue_{start | stop}() use bp->nr_vnics instead of BNXT_VNIC_NTUPLE.

The fourth patch fixes a warning due to checksum state. bnxt_rx_pkt() checks whether skb->ip_summed is not CHECKSUM_NONE before updating ip_summed. If ip_summed is not CHECKSUM_NONE, it WARNs about it. However, bnxt_xdp_build_skb() is called in the XDP-MB-PASS path and it updates ip_summed earlier than bnxt_rx_pkt(). So, in the XDP-MB-PASS path, bnxt_rx_pkt() always warns about the checksum. Updating ip_summed in bnxt_xdp_build_skb() is unnecessary and duplicated, so it is removed.

The fifth patch makes net_devmem_unbind_dmabuf() ignore -ENETDOWN. When devmem socket is closed, net_devmem_unbind_dmabuf() is called to unbind/release resources. If interface is down, the driver returns -ENETDOWN. The -ENETDOWN return value is not an actual error, because the interface will release resources when the interface is down. So, net_devmem_unbind_dmabuf() needs to ignore -ENETDOWN. The last patch adds XDP testcases to tools/testing/selftests/drivers/net/ping.py. v2: - Do not use num_frags in the bnxt_xdp_build_skb(). (1/6) - Add Review tags from Somnath and Jakub. (2/6) - Add new patch for fixing checksum warning. (4/6) - Add new patch for fixing warning in net_devmem_unbind_dmabuf(). (5/6) - Add new XDP testcases to ping.py (6/6) Taehee Yoo (6): eth: bnxt: fix truesize for mb-xdp-pass case eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc() eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic eth: bnxt: do not update checksum in bnxt_xdp_build_skb() net: devmem: do not WARN conditionally after netdev_rx_queue_restart() selftests: drv-net: add xdp cases for ping.py drivers/net/ethernet/broadcom/bnxt/bnxt.c | 36 ++-- drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 18 +- drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h | 6 +- net/core/devmem.c | 4 +- tools/testing/selftests/drivers/net/ping.py | 200 ++++++++++++++++-- .../testing/selftests/net/lib/xdp_dummy.bpf.c | 6 + 6 files changed, 221 insertions(+), 49 deletions(-) -- 2.34.1

5 months, 2 weeks


[PATCHv4 net 0/3] bond: fix xfrm offload issues

by Hangbin Liu

The first patch fixes incorrect lock usage in the bonding driver. The second patch fixes the xfrm offload feature setup in active-backup mode. The third patch adds an IPsec offload test.

v4: hold xs->lock for bond_ipsec_{del, add}_sa_all (Cosmin Ratiu)
    use the defer helpers in lib.sh for selftest (Petr Machata)
v3: move the ipsec deletion to bond_ipsec_free_sa (Cosmin Ratiu)
v2: do not turn carrier on if bond change link failed (Nikolay Aleksandrov)
    move the mutex lock to a work queue (Cosmin Ratiu)

Hangbin Liu (3):
  bonding: move IPsec deletion to bond_ipsec_free_sa
  bonding: fix xfrm offload feature setup on active-backup mode
  selftests: bonding: add ipsec offload test

 drivers/net/bonding/bond_main.c | 55 +++++--
 drivers/net/bonding/bond_netlink.c | 16 +-
 include/net/bonding.h | 1 +
 .../selftests/drivers/net/bonding/Makefile | 3 +-
 .../drivers/net/bonding/bond_ipsec_offload.sh | 154 ++++++++++++++++++
 .../selftests/drivers/net/bonding/config | 4 +
 6 files changed, 208 insertions(+), 25 deletions(-)
 create mode 100755 tools/testing/selftests/drivers/net/bonding/bond_ipsec_offload.sh

--
2.46.0

5 months, 2 weeks


[PATCH net-next v2 1/2] selftests: drv-net: add path helper for net/lib

by Jakub Kicinski

Looks like a lot of users of recently added env.rpath() actually want to access stuff under net/lib. Add another helper. Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> --- tools/testing/selftests/drivers/net/hds.py | 2 +- tools/testing/selftests/drivers/net/hw/csum.py | 2 +- tools/testing/selftests/drivers/net/lib/py/env.py | 7 +++++++ 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/drivers/net/hds.py b/tools/testing/selftests/drivers/net/hds.py index 7cc74faed743..def44c10349a 100755 --- a/tools/testing/selftests/drivers/net/hds.py +++ b/tools/testing/selftests/drivers/net/hds.py @@ -20,7 +20,7 @@ from lib.py import defer, ethtool, ip def _xdp_onoff(cfg): - prog = cfg.rpath("../../net/lib/xdp_dummy.bpf.o") + prog = cfg.lpath("xdp_dummy.bpf.o") ip("link set dev %s xdp obj %s sec xdp" % (cfg.ifname, prog)) ip("link set dev %s xdp off" % cfg.ifname) diff --git a/tools/testing/selftests/drivers/net/hw/csum.py b/tools/testing/selftests/drivers/net/hw/csum.py index 701aca1361e0..49ec98aef579 100755 --- a/tools/testing/selftests/drivers/net/hw/csum.py +++ b/tools/testing/selftests/drivers/net/hw/csum.py @@ -88,7 +88,7 @@ from lib.py import bkg, cmd, wait_port_listen with NetDrvEpEnv(__file__, nsim_test=False) as cfg: check_nic_features(cfg) - cfg.bin_local = cfg.rpath("../../../net/lib/csum") + cfg.bin_local = cfg.lpath("csum") cfg.bin_remote = cfg.remote.deploy(cfg.bin_local) cases = [] diff --git a/tools/testing/selftests/drivers/net/lib/py/env.py b/tools/testing/selftests/drivers/net/lib/py/env.py index fd4d674e6c72..2a1f8bd0ec19 100644 --- a/tools/testing/selftests/drivers/net/lib/py/env.py +++ b/tools/testing/selftests/drivers/net/lib/py/env.py @@ -30,6 +30,13 @@ from .remote import Remote src_dir = Path(self.src_path).parent.resolve() return (src_dir / path).as_posix() + def lpath(self, path): + """ + Similar to rpath, but for files in net/lib TARGET. 
+ """ + lib_dir = (Path(__file__).parent / "../../../../net/lib").resolve() + return (lib_dir / path).as_posix() + def _load_env_file(self): env = os.environ.copy() -- 2.48.1
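
To make the difference between the two helpers concrete, here is a small standalone sketch of the same path arithmetic. It is not the selftest library itself: `DemoEnv`, the `env_file` argument (standing in for `__file__` of lib/py/env.py), and the checkout location `/srv/linux` are assumptions for illustration only.

```python
from pathlib import Path

# Hypothetical checkout location, used only for this illustration.
SELFTESTS = Path("/srv/linux/tools/testing/selftests")


class DemoEnv:
    """Stand-in for the selftest env class; not the real implementation."""

    def __init__(self, src_path, env_file):
        self.src_path = src_path  # path of the test script being run
        self.env_file = env_file  # plays the role of __file__ in lib/py/env.py

    def rpath(self, path):
        # Resolve relative to the directory holding the test script.
        src_dir = Path(self.src_path).parent.resolve()
        return (src_dir / path).as_posix()

    def lpath(self, path):
        # Resolve relative to net/lib, reached by going four levels up from lib/py/.
        lib_dir = (Path(self.env_file).parent / "../../../../net/lib").resolve()
        return (lib_dir / path).as_posix()


env = DemoEnv(SELFTESTS / "drivers/net/hds.py",
              SELFTESTS / "drivers/net/lib/py/env.py")

if __name__ == "__main__":
    print(env.rpath("../../net/lib/xdp_dummy.bpf.o"))
    print(env.lpath("xdp_dummy.bpf.o"))
```

With rpath() every caller has to count "../" components from its own directory (two for hds.py, three for hw/csum.py), while lpath() anchors on the library file, so the caller passes only the file name.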

5 months, 2 weeks


[PATCH net-next] selftests: openvswitch: don't hardcode the drop reason subsys

by Jakub Kicinski

WiFi removed one of their subsys entries from drop reasons, in commit 286e69677065 ("wifi: mac80211: Drop cooked monitor support") SKB_DROP_REASON_SUBSYS_OPENVSWITCH is now 2 not 3. The drop reasons are not uAPI, read the correct value from debug info. We need to enable vmlinux BTF, otherwise pahole needs a few GB of memory to decode the enum name. Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> --- CC: shuah(a)kernel.org CC: pshelar(a)ovn.org CC: aconole(a)redhat.com CC: amorenoz(a)redhat.com CC: linux-kselftest(a)vger.kernel.org CC: dev(a)openvswitch.org --- tools/testing/selftests/net/config | 2 ++ .../testing/selftests/net/openvswitch/openvswitch.sh | 11 ++++++++--- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config index 5b9baf708950..3365bcc35304 100644 --- a/tools/testing/selftests/net/config +++ b/tools/testing/selftests/net/config @@ -18,6 +18,8 @@ CONFIG_DUMMY=y CONFIG_BRIDGE_VLAN_FILTERING=y CONFIG_BRIDGE=y CONFIG_CRYPTO_CHACHA20POLY1305=m +CONFIG_DEBUG_INFO_BTF=y +CONFIG_DEBUG_INFO_BTF_MODULES=n CONFIG_VLAN_8021Q=y CONFIG_GENEVE=m CONFIG_IFB=y diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh index 960e1ab4dd04..3c8d3455d8e7 100755 --- a/tools/testing/selftests/net/openvswitch/openvswitch.sh +++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh @@ -330,6 +330,11 @@ test_psample() { # - drop packets and verify the right drop reason is reported test_drop_reason() { which perf >/dev/null 2>&1 || return $ksft_skip + which pahole >/dev/null 2>&1 || return $ksft_skip + + ovs_drop_subsys=$(pahole -C skb_drop_reason_subsys | + awk '/OPENVSWITCH/ { print $3; }' | + tr -d ,) sbx_add "test_drop_reason" || return $? 
@@ -373,7 +378,7 @@ test_drop_reason() { "in_port(2),eth(),eth_type(0x0800),ipv4(src=172.31.110.20,proto=1),icmp()" 'drop' ovs_drop_record_and_run "test_drop_reason" ip netns exec client ping -c 2 172.31.110.20 - ovs_drop_reason_count 0x30001 # OVS_DROP_FLOW_ACTION + ovs_drop_reason_count 0x${ovs_drop_subsys}0001 # OVS_DROP_FLOW_ACTION if [[ "$?" -ne "2" ]]; then info "Did not detect expected drops: $?" return 1 @@ -390,7 +395,7 @@ test_drop_reason() { ovs_drop_record_and_run \ "test_drop_reason" ip netns exec client nc -i 1 -zuv 172.31.110.20 6000 - ovs_drop_reason_count 0x30004 # OVS_DROP_EXPLICIT_ACTION_ERROR + ovs_drop_reason_count 0x${ovs_drop_subsys}0004 # OVS_DROP_EXPLICIT_ACTION_ERROR if [[ "$?" -ne "1" ]]; then info "Did not detect expected explicit error drops: $?" return 1 @@ -398,7 +403,7 @@ test_drop_reason() { ovs_drop_record_and_run \ "test_drop_reason" ip netns exec client nc -i 1 -zuv 172.31.110.20 7000 - ovs_drop_reason_count 0x30003 # OVS_DROP_EXPLICIT_ACTION + ovs_drop_reason_count 0x${ovs_drop_subsys}0003 # OVS_DROP_EXPLICIT_ACTION if [[ "$?" -ne "1" ]]; then info "Did not detect expected explicit drops: $?" return 1 -- 2.48.1
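
The old constants such as 0x30001 pack a subsystem number above a per-subsystem reason, which is why the script can now build the value from the subsystem number it reads out of the debug info. A minimal sketch of that arithmetic follows, assuming the reason occupies the low 16 bits (consistent with 0x30001 for subsystem 3, reason 1) and assuming pahole prints the enumerator in decimal; both helper names are hypothetical.

```python
SUBSYS_SHIFT = 16  # assumption: the reason occupies the low 16 bits


def drop_reason_code(subsys: int, reason: int) -> int:
    """Combine a subsystem number and a per-subsystem reason numerically."""
    return (subsys << SUBSYS_SHIFT) | reason


def drop_reason_from_concat(subsys: int, reason: int) -> int:
    """Mimic the script's string form, e.g. "0x" + "2" + "0001"."""
    return int("0x%d%04x" % (subsys, reason), 16)


if __name__ == "__main__":
    print(hex(drop_reason_code(3, 1)))         # value hard-coded before the patch
    print(hex(drop_reason_from_concat(2, 1)))  # value after the subsys renumbering
```

The string form and the numeric form agree only while the subsystem number is a single decimal digit; for a value of 10 or more the concatenation would need to format the number as hex first.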

5 months, 2 weeks


[PATCH net-next 1/2] selftests: net: fix error message in bpf_offload

by Jakub Kicinski

We hit the following exception on timeout, because nmaps is never set:

Test bpftool bound info reporting (own ns)...
Traceback (most recent call last):
  File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 1128, in <module>
    check_dev_info(False, "")
  File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 583, in check_dev_info
    maps = bpftool_map_list_wait(expected=2, ns=ns)
  File "/home/virtme/testing-1/tools/testing/selftests/net/./bpf_offload.py", line 215, in bpftool_map_list_wait
    raise Exception("Time out waiting for map counts to stabilize want %d, have %d" % (expected, nmaps))
NameError: name 'nmaps' is not defined

Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
CC: shuah(a)kernel.org
CC: linux-kselftest(a)vger.kernel.org
CC: bpf(a)vger.kernel.org
---
 tools/testing/selftests/net/bpf_offload.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/bpf_offload.py b/tools/testing/selftests/net/bpf_offload.py
index fd0d959914e4..4a9be8c49561 100755
--- a/tools/testing/selftests/net/bpf_offload.py
+++ b/tools/testing/selftests/net/bpf_offload.py
@@ -207,9 +207,11 @@ netns = [] # net namespaces to be removed
     raise Exception("Time out waiting for program counts to stabilize want %d, have %d" % (expected, nprogs))

 def bpftool_map_list_wait(expected=0, n_retry=20, ns=""):
+    nmaps = None
     for i in range(n_retry):
         maps = bpftool_map_list(ns=ns)
-        if len(maps) == expected:
+        nmaps = len(maps)
+        if nmaps == expected:
             return maps
         time.sleep(0.05)
     raise Exception("Time out waiting for map counts to stabilize want %d, have %d" % (expected, nmaps))
--
2.48.1
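
The pattern behind this fix can be tried in isolation. The sketch below is a generic retry helper written for illustration, not code from bpf_offload.py: `wait_for_count` and `list_fn` are hypothetical names, and the error message uses %s for the observed count so that it still formats if the loop never runs.

```python
import time


def wait_for_count(list_fn, expected=0, n_retry=20, delay=0.05):
    """Poll list_fn() until it returns `expected` items, else raise."""
    count = None  # bound before the loop so the error path can always use it
    for _ in range(n_retry):
        items = list_fn()
        count = len(items)
        if count == expected:
            return items
        time.sleep(delay)
    # %s rather than %d so that count=None (n_retry == 0) still formats.
    raise TimeoutError("Time out waiting for counts to stabilize want %d, have %s"
                       % (expected, count))


if __name__ == "__main__":
    try:
        wait_for_count(lambda: ["map0"], expected=2, n_retry=3, delay=0)
    except TimeoutError as err:
        print(err)
```

The point is that any name used in the failure message must be assigned on every path that reaches the raise; otherwise the timeout is masked by a NameError, as in the traceback above.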

5 months, 2 weeks


[PATCH net-next v6 0/8] Device memory TCP TX

by Mina Almasry

v6: https://lore.kernel.org/netdev/20250222191517.743530-1-almasrymina@google.c… === v6 has no major changes. Addressed a few issues from Paolo and David, and collected Acks from Stan. Thank you everyone for the review! Changes: - retain behavior to process MSG_FASTOPEN even if the provided cmsg is invalid (Paolo). - Rework the freeing of tx_vec slightly (it now has its own err label). (Paolo). - Squash the commit that makes dmabuf unbinding scheduled work into the same one which implements the TX path so we don't run into future errors on bisecting (Paolo). - Fix/add comments to explain how dmabuf binding refcounting works (David). v5: https://lore.kernel.org/netdev/20250220020914.895431-1-almasrymina@google.c… === v5 has no major changes; it clears up the relatively minor issues pointed out in v4, and rebases the series on top of net-next to resolve the conflict with a patch that raced to the tree. It also collects the review tags from v4. Changes: - Rebase to net-next - Fix issues in selftest (Stan). - Address comments in the devmem and netmem driver docs (Stan and Bagas) - Fix zerocopy_fill_skb_from_devmem return error code (Stan). v4: https://lore.kernel.org/netdev/20250203223916.1064540-1-almasrymina@google.… === v4 mainly addresses the critical driver support issue surfaced in v3 by Paolo and Stan. Drivers aiming to support netmem_tx should make sure not to pass the netmem dma-addrs to the dma-mapping APIs, as these dma-addrs may come from dma-bufs. Additionally other feedback from v3 is addressed. Major changes: - Add helpers to handle netmem dma-addrs. Add GVE support for netmem_tx. - Fix binding->tx_vec not being freed on error paths during the tx binding. - Add a minimal devmem_tx test to devmem.py. - Clean up everything obsolete from the cover letter (Paolo). v3: https://patchwork.kernel.org/project/netdevbpf/list/?series=929401&state=* === Address minor comments from RFCv2 and fix a few build warnings and ynl-regen issues. No major changes. 
RFC v2: https://patchwork.kernel.org/project/netdevbpf/list/?series=920056&state=* ======= RFC v2 addresses much of the feedback from RFC v1. I plan on sending something close to this as net-next reopens, sending it slightly early to get feedback if any. Major changes: -------------- - much improved UAPI as suggested by Stan. We now interpret the iov_base of the passed in iov from userspace as the offset into the dmabuf to send from. This removes the need to set iov.iov_base = NULL which may be confusing to users, and enables us to send multiple iovs in the same sendmsg() call. ncdevmem and the docs show a sample use of that. - Removed the duplicate dmabuf iov_iter in binding->iov_iter. I think this is a good improvement as it was confusing to keep track of 2 iterators for the same sendmsg, and mistracking both iterators caused a couple of bugs reported in the last iteration that are now resolved with this streamlining. - Improved test coverage in ncdevmem. Now multiple sendmsg() are tested, and sending multiple iovs in the same sendmsg() is tested. - Fixed issue where dmabuf unmapping was happening in invalid context (Stan). ==================================================================== The TX path had been dropped from the Device Memory TCP patch series post RFCv1 [1], to make that series slightly easier to review. This series rebases the implementation of the TX path on top of the net_iov/netmem framework agreed upon and merged. The motivation for the feature is thoroughly described in the docs & cover letter of the original proposal, so I don't repeat the lengthy descriptions here, but they are available in [1]. Full outline on usage of the TX path is detailed in the documentation included with this series. Test example is available via the kselftest included in the series as well. The series is relatively small, as the TX path for this feature largely piggybacks on the existing MSG_ZEROCOPY implementation. Patch Overview: --------------- 1. 
Documentation & tests to give high level overview of the feature being added. 1. Add netmem refcounting needed for the TX path. 2. Devmem TX netlink API. 3. Devmem TX net stack implementation. 4. Make dma-buf unbinding scheduled work to handle TX cases where it gets freed from contexts where we can't sleep. 5. Add devmem TX documentation. 6. Add scaffolding enabling driver support for netmem_tx. Add helpers, driver feature flag, and docs to enable drivers to declare netmem_tx support. 7. Guard netmem_tx against being enabled against drivers that don't support it. 8. Add devmem_tx selftests. Add TX path to ncdevmem and add a test to devmem.py. Testing: -------- Testing is very similar to the devmem TCP RX path. The ncdevmem test used for the RX path is now augmented with client functionality to test the TX path. * Test Setup: Kernel: net-next with this RFC and memory provider API cherry-picked locally. Hardware: Google Cloud A3 VMs. NIC: GVE with header split & RSS & flow steering support. Performance results are not included with this version, unfortunately. I'm having issues running the dma-buf exporter driver against the upstream kernel on my test setup. The issues are specific to that dma-buf exporter and do not affect this patch series. I plan to follow up this series with perf fixes if the tests point to issues once they're up and running. Special thanks to Stan who took a stab at rebasing the TX implementation on top of the netmem/net_iov framework merged. Parts of his proposal [2] that are reused as-is are forked off into their own patches to give full credit. 
[1] https://lore.kernel.org/netdev/20240909054318.1809580-1-almasrymina@google.… [2] https://lore.kernel.org/netdev/20240913150913.1280238-2-sdf@fomichev.me/T/#… Cc: sdf(a)fomichev.me Cc: asml.silence(a)gmail.com Cc: dw(a)davidwei.uk Cc: Jamal Hadi Salim <jhs(a)mojatatu.com> Cc: Victor Nogueira <victor(a)mojatatu.com> Cc: Pedro Tammela <pctammela(a)mojatatu.com> Cc: Samiullah Khawaja <skhawaja(a)google.com> Mina Almasry (7): net: add get_netmem/put_netmem support net: devmem: Implement TX path net: add devmem TCP TX documentation net: enable driver support for netmem TX gve: add netmem TX support to GVE DQO-RDA mode net: check for driver support in netmem TX selftests: ncdevmem: Implement devmem TCP TX Stanislav Fomichev (1): net: devmem: TCP tx netlink api Documentation/netlink/specs/netdev.yaml | 12 + Documentation/networking/devmem.rst | 150 ++++++++- .../networking/net_cachelines/net_device.rst | 1 + Documentation/networking/netdev-features.rst | 5 + Documentation/networking/netmem.rst | 23 +- drivers/net/ethernet/google/gve/gve_main.c | 4 + drivers/net/ethernet/google/gve/gve_tx_dqo.c | 8 +- include/linux/netdevice.h | 2 + include/linux/skbuff.h | 17 +- include/linux/skbuff_ref.h | 4 +- include/net/netmem.h | 23 ++ include/net/sock.h | 1 + include/uapi/linux/netdev.h | 1 + net/core/datagram.c | 48 ++- net/core/dev.c | 3 + net/core/devmem.c | 115 ++++++- net/core/devmem.h | 77 ++++- net/core/netdev-genl-gen.c | 13 + net/core/netdev-genl-gen.h | 1 + net/core/netdev-genl.c | 73 ++++- net/core/skbuff.c | 48 ++- net/core/sock.c | 6 + net/ipv4/ip_output.c | 3 +- net/ipv4/tcp.c | 50 ++- net/ipv6/ip6_output.c | 3 +- net/vmw_vsock/virtio_transport_common.c | 5 +- tools/include/uapi/linux/netdev.h | 1 + .../selftests/drivers/net/hw/devmem.py | 26 +- .../selftests/drivers/net/hw/ncdevmem.c | 300 +++++++++++++++++- 29 files changed, 950 insertions(+), 73 deletions(-) base-commit: 80c4a0015ce249cf0917a04dbb3cc652a6811079 -- 2.48.1.658.g4767266eb4-goog

5 months, 2 weeks


[PATCH] selftests/bpf: Move test_lwt_ip_encap to test_progs

by Bastien Curutchet (eBPF Foundation)

test_lwt_ip_encap.sh isn't used by the BPF CI. Add a new file in the test_progs framework to migrate the tests done by test_lwt_ip_encap.sh. It uses the same network topology and the same BPF programs located in progs/test_lwt_ip_encap.c. Rework the GSO part to avoid using nc and dd. Remove test_lwt_ip_encap.sh and its Makefile entry. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com> --- tools/testing/selftests/bpf/Makefile | 3 +- .../selftests/bpf/prog_tests/lwt_ip_encap.c | 540 +++++++++++++++++++++ tools/testing/selftests/bpf/test_lwt_ip_encap.sh | 476 ------------------ 3 files changed, 541 insertions(+), 478 deletions(-) diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index e6a02d5b87d123cef7e6b41bfbc96d34083f38e1..df4814b5200a5a0e732b19ab3a5975957fb7cbc9 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -95,7 +95,7 @@ TEST_GEN_PROGS += test_progs-cpuv4 TEST_INST_SUBDIRS += cpuv4 endif -TEST_GEN_FILES = test_lwt_ip_encap.bpf.o test_tc_edt.bpf.o +TEST_GEN_FILES = test_tc_edt.bpf.o TEST_FILES = xsk_prereqs.sh $(wildcard progs/btf_dump_test_case_*.c) # Order correspond to 'make run_tests' order @@ -104,7 +104,6 @@ TEST_PROGS := test_kmod.sh \ test_lirc_mode2.sh \ test_xdp_vlan_mode_generic.sh \ test_xdp_vlan_mode_native.sh \ - test_lwt_ip_encap.sh \ test_tc_tunnel.sh \ test_tc_edt.sh \ test_xdping.sh \ diff --git a/tools/testing/selftests/bpf/prog_tests/lwt_ip_encap.c b/tools/testing/selftests/bpf/prog_tests/lwt_ip_encap.c new file mode 100644 index 0000000000000000000000000000000000000000..61fcded43b46cab7775237c6d85de07b5df7d87e --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/lwt_ip_encap.c @@ -0,0 +1,540 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <netinet/in.h> + +#include "network_helpers.h" +#include "test_progs.h" + +#define BPF_FILE "test_lwt_ip_encap.bpf.o" + +#define NETNS_NAME_SIZE 32 +#define NETNS_BASE 
"ns-lwt-ip-encap" + +#define IP4_ADDR_1 "172.16.1.100" +#define IP4_ADDR_2 "172.16.2.100" +#define IP4_ADDR_3 "172.16.3.100" +#define IP4_ADDR_4 "172.16.4.100" +#define IP4_ADDR_5 "172.16.5.100" +#define IP4_ADDR_6 "172.16.6.100" +#define IP4_ADDR_7 "172.16.7.100" +#define IP4_ADDR_8 "172.16.8.100" +#define IP4_ADDR_GRE "172.16.16.100" + +#define IP4_ADDR_SRC IP4_ADDR_1 +#define IP4_ADDR_DST IP4_ADDR_4 + +#define IP6_ADDR_1 "fb01::1" +#define IP6_ADDR_2 "fb02::1" +#define IP6_ADDR_3 "fb03::1" +#define IP6_ADDR_4 "fb04::1" +#define IP6_ADDR_5 "fb05::1" +#define IP6_ADDR_6 "fb06::1" +#define IP6_ADDR_7 "fb07::1" +#define IP6_ADDR_8 "fb08::1" +#define IP6_ADDR_GRE "fb10::1" + +#define IP6_ADDR_SRC IP6_ADDR_1 +#define IP6_ADDR_DST IP6_ADDR_4 + +/* Setup/topology: + * + * NS1 NS2 NS3 + * veth1 <---> veth2 veth3 <---> veth4 (the top route) + * veth5 <---> veth6 veth7 <---> veth8 (the bottom route) + * + * Each vethN gets IP[4|6]_ADDR_N address. + * + * IP*_ADDR_SRC = IP*_ADDR_1 + * IP*_ADDR_DST = IP*_ADDR_4 + * + * All tests test pings from IP*_ADDR__SRC to IP*_ADDR_DST. + * + * By default, routes are configured to allow packets to go + * IP*_ADDR_1 <=> IP*_ADDR_2 <=> IP*_ADDR_3 <=> IP*_ADDR_4 (the top route). + * + * A GRE device is installed in NS3 with IP*_ADDR_GRE, and + * NS1/NS2 are configured to route packets to IP*_ADDR_GRE via IP*_ADDR_8 + * (the bottom route). + * + * Tests: + * + * 1. Routes NS2->IP*_ADDR_DST are brought down, so the only way a ping + * from IP*_ADDR_SRC to IP*_ADDR_DST can work is via IP*_ADDR_GRE. + * + * 2a. In an egress test, a bpf LWT_XMIT program is installed on veth1 + * that encaps the packets with an IP/GRE header to route to IP*_ADDR_GRE. + * + * ping: SRC->[encap at veth1:egress]->GRE:decap->DST + * ping replies go DST->SRC directly + * + * 2b. In an ingress test, a bpf LWT_IN program is installed on veth2 + * that encaps the packets with an IP/GRE header to route to IP*_ADDR_GRE. 
+ * + * ping: SRC->[encap at veth2:ingress]->GRE:decap->DST + * ping replies go DST->SRC directly + */ + +static int create_ns(char *name, size_t name_sz) +{ + if (!name) + goto fail; + + if (!ASSERT_OK(append_tid(name, name_sz), "append TID")) + goto fail; + + SYS(fail, "ip netns add %s", name); + + /* rp_filter gets confused by what these tests are doing, so disable it */ + SYS(fail, "ip netns exec %s sysctl -wq net.ipv4.conf.all.rp_filter=0", name); + SYS(fail, "ip netns exec %s sysctl -wq net.ipv4.conf.default.rp_filter=0", name); + /* Disable IPv6 DAD because it sometimes takes too long and fails tests */ + SYS(fail, "ip netns exec %s sysctl -wq net.ipv6.conf.all.accept_dad=0", name); + SYS(fail, "ip netns exec %s sysctl -wq net.ipv6.conf.default.accept_dad=0", name); + + return 0; +fail: + return -1; +} + +static int set_top_addr(const char *ns1, const char *ns2, const char *ns3) +{ + SYS(fail, "ip -n %s a add %s/24 dev veth1", ns1, IP4_ADDR_1); + SYS(fail, "ip -n %s a add %s/24 dev veth2", ns2, IP4_ADDR_2); + SYS(fail, "ip -n %s a add %s/24 dev veth3", ns2, IP4_ADDR_3); + SYS(fail, "ip -n %s a add %s/24 dev veth4", ns3, IP4_ADDR_4); + SYS(fail, "ip -n %s -6 a add %s/128 dev veth1", ns1, IP6_ADDR_1); + SYS(fail, "ip -n %s -6 a add %s/128 dev veth2", ns2, IP6_ADDR_2); + SYS(fail, "ip -n %s -6 a add %s/128 dev veth3", ns2, IP6_ADDR_3); + SYS(fail, "ip -n %s -6 a add %s/128 dev veth4", ns3, IP6_ADDR_4); + + SYS(fail, "ip -n %s link set dev veth1 up", ns1); + SYS(fail, "ip -n %s link set dev veth2 up", ns2); + SYS(fail, "ip -n %s link set dev veth3 up", ns2); + SYS(fail, "ip -n %s link set dev veth4 up", ns3); + + return 0; +fail: + return 1; +} + +static int set_bottom_addr(const char *ns1, const char *ns2, const char *ns3) +{ + SYS(fail, "ip -n %s a add %s/24 dev veth5", ns1, IP4_ADDR_5); + SYS(fail, "ip -n %s a add %s/24 dev veth6", ns2, IP4_ADDR_6); + SYS(fail, "ip -n %s a add %s/24 dev veth7", ns2, IP4_ADDR_7); + SYS(fail, "ip -n %s a add %s/24 dev veth8", 
ns3, IP4_ADDR_8); + SYS(fail, "ip -n %s -6 a add %s/128 dev veth5", ns1, IP6_ADDR_5); + SYS(fail, "ip -n %s -6 a add %s/128 dev veth6", ns2, IP6_ADDR_6); + SYS(fail, "ip -n %s -6 a add %s/128 dev veth7", ns2, IP6_ADDR_7); + SYS(fail, "ip -n %s -6 a add %s/128 dev veth8", ns3, IP6_ADDR_8); + + SYS(fail, "ip -n %s link set dev veth5 up", ns1); + SYS(fail, "ip -n %s link set dev veth6 up", ns2); + SYS(fail, "ip -n %s link set dev veth7 up", ns2); + SYS(fail, "ip -n %s link set dev veth8 up", ns3); + + return 0; +fail: + return 1; +} + +static int configure_vrf(const char *ns1, const char *ns2) +{ + if (!ns1 || !ns2) + goto fail; + + SYS(fail, "ip -n %s link add red type vrf table 1001", ns1); + SYS(fail, "ip -n %s link set red up", ns1); + SYS(fail, "ip -n %s route add table 1001 unreachable default metric 8192", ns1); + SYS(fail, "ip -n %s -6 route add table 1001 unreachable default metric 8192", ns1); + SYS(fail, "ip -n %s link set veth1 vrf red", ns1); + SYS(fail, "ip -n %s link set veth5 vrf red", ns1); + + SYS(fail, "ip -n %s link add red type vrf table 1001", ns2); + SYS(fail, "ip -n %s link set red up", ns2); + SYS(fail, "ip -n %s route add table 1001 unreachable default metric 8192", ns2); + SYS(fail, "ip -n %s -6 route add table 1001 unreachable default metric 8192", ns2); + SYS(fail, "ip -n %s link set veth2 vrf red", ns2); + SYS(fail, "ip -n %s link set veth3 vrf red", ns2); + SYS(fail, "ip -n %s link set veth6 vrf red", ns2); + SYS(fail, "ip -n %s link set veth7 vrf red", ns2); + + return 0; +fail: + return -1; +} + +static int configure_ns1(const char *ns1, const char *vrf) +{ + struct nstoken *nstoken = NULL; + + if (!ns1 || !vrf) + goto fail; + + nstoken = open_netns(ns1); + if (!ASSERT_OK_PTR(nstoken, "open ns1")) + goto fail; + + /* Top route */ + SYS(fail, "ip route add %s/32 dev veth1 %s", IP4_ADDR_2, vrf); + SYS(fail, "ip route add default dev veth1 via %s %s", IP4_ADDR_2, vrf); + SYS(fail, "ip -6 route add %s/128 dev veth1 %s", IP6_ADDR_2, vrf); + 
SYS(fail, "ip -6 route add default dev veth1 via %s %s", IP6_ADDR_2, vrf); + /* Bottom route */ + SYS(fail, "ip route add %s/32 dev veth5 %s", IP4_ADDR_6, vrf); + SYS(fail, "ip route add %s/32 dev veth5 via %s %s", IP4_ADDR_7, IP4_ADDR_6, vrf); + SYS(fail, "ip route add %s/32 dev veth5 via %s %s", IP4_ADDR_8, IP4_ADDR_6, vrf); + SYS(fail, "ip -6 route add %s/128 dev veth5 %s", IP6_ADDR_6, vrf); + SYS(fail, "ip -6 route add %s/128 dev veth5 via %s %s", IP6_ADDR_7, IP6_ADDR_6, vrf); + SYS(fail, "ip -6 route add %s/128 dev veth5 via %s %s", IP6_ADDR_8, IP6_ADDR_6, vrf); + + close_netns(nstoken); + return 0; +fail: + close_netns(nstoken); + return -1; +} + +static int configure_ns2(const char *ns2, const char *vrf) +{ + struct nstoken *nstoken = NULL; + + if (!ns2 || !vrf) + goto fail; + + nstoken = open_netns(ns2); + if (!ASSERT_OK_PTR(nstoken, "open ns2")) + goto fail; + + SYS(fail, "ip netns exec %s sysctl -wq net.ipv4.ip_forward=1", ns2); + SYS(fail, "ip netns exec %s sysctl -wq net.ipv6.conf.all.forwarding=1", ns2); + + /* Top route */ + SYS(fail, "ip route add %s/32 dev veth2 %s", IP4_ADDR_1, vrf); + SYS(fail, "ip route add %s/32 dev veth3 %s", IP4_ADDR_4, vrf); + SYS(fail, "ip -6 route add %s/128 dev veth2 %s", IP6_ADDR_1, vrf); + SYS(fail, "ip -6 route add %s/128 dev veth3 %s", IP6_ADDR_4, vrf); + /* Bottom route */ + SYS(fail, "ip route add %s/32 dev veth6 %s", IP4_ADDR_5, vrf); + SYS(fail, "ip route add %s/32 dev veth7 %s", IP4_ADDR_8, vrf); + SYS(fail, "ip -6 route add %s/128 dev veth6 %s", IP6_ADDR_5, vrf); + SYS(fail, "ip -6 route add %s/128 dev veth7 %s", IP6_ADDR_8, vrf); + + close_netns(nstoken); + return 0; +fail: + close_netns(nstoken); + return -1; +} + +static int configure_ns3(const char *ns3) +{ + struct nstoken *nstoken = NULL; + + if (!ns3) + goto fail; + + nstoken = open_netns(ns3); + if (!ASSERT_OK_PTR(nstoken, "open ns3")) + goto fail; + + /* Top route */ + SYS(fail, "ip route add %s/32 dev veth4", IP4_ADDR_3); + SYS(fail, "ip route add %s/32 
dev veth4 via %s", IP4_ADDR_1, IP4_ADDR_3); + SYS(fail, "ip route add %s/32 dev veth4 via %s", IP4_ADDR_2, IP4_ADDR_3); + SYS(fail, "ip -6 route add %s/128 dev veth4", IP6_ADDR_3); + SYS(fail, "ip -6 route add %s/128 dev veth4 via %s", IP6_ADDR_1, IP6_ADDR_3); + SYS(fail, "ip -6 route add %s/128 dev veth4 via %s", IP6_ADDR_2, IP6_ADDR_3); + /* Bottom route */ + SYS(fail, "ip route add %s/32 dev veth8", IP4_ADDR_7); + SYS(fail, "ip route add %s/32 dev veth8 via %s", IP4_ADDR_5, IP4_ADDR_7); + SYS(fail, "ip route add %s/32 dev veth8 via %s", IP4_ADDR_6, IP4_ADDR_7); + SYS(fail, "ip -6 route add %s/128 dev veth8", IP6_ADDR_7); + SYS(fail, "ip -6 route add %s/128 dev veth8 via %s", IP6_ADDR_5, IP6_ADDR_7); + SYS(fail, "ip -6 route add %s/128 dev veth8 via %s", IP6_ADDR_6, IP6_ADDR_7); + + /* Configure IPv4 GRE device */ + SYS(fail, "ip tunnel add gre_dev mode gre remote %s local %s ttl 255", + IP4_ADDR_1, IP4_ADDR_GRE); + SYS(fail, "ip link set gre_dev up"); + SYS(fail, "ip a add %s dev gre_dev", IP4_ADDR_GRE); + + /* Configure IPv6 GRE device */ + SYS(fail, "ip tunnel add gre6_dev mode ip6gre remote %s local %s ttl 255", + IP6_ADDR_1, IP6_ADDR_GRE); + SYS(fail, "ip link set gre6_dev up"); + SYS(fail, "ip a add %s dev gre6_dev", IP6_ADDR_GRE); + + close_netns(nstoken); + return 0; +fail: + close_netns(nstoken); + return -1; +} + +static int setup_network(char *ns1, char *ns2, char *ns3, const char *vrf) +{ + if (!ns1 || !ns2 || !ns3 || !vrf) + goto fail; + + SYS(fail, "ip -n %s link add veth1 type veth peer name veth2 netns %s", ns1, ns2); + SYS(fail, "ip -n %s link add veth3 type veth peer name veth4 netns %s", ns2, ns3); + SYS(fail, "ip -n %s link add veth5 type veth peer name veth6 netns %s", ns1, ns2); + SYS(fail, "ip -n %s link add veth7 type veth peer name veth8 netns %s", ns2, ns3); + + if (vrf[0]) { + if (!ASSERT_OK(configure_vrf(ns1, ns2), "configure vrf")) + goto fail; + } + if (!ASSERT_OK(set_top_addr(ns1, ns2, ns3), "set top addresses")) + goto fail; + + if 
(!ASSERT_OK(set_bottom_addr(ns1, ns2, ns3), "set bottom addresses")) + goto fail; + + if (!ASSERT_OK(configure_ns1(ns1, vrf), "configure ns1 routes")) + goto fail; + + if (!ASSERT_OK(configure_ns2(ns2, vrf), "configure ns2 routes")) + goto fail; + + if (!ASSERT_OK(configure_ns3(ns3), "configure ns3 routes")) + goto fail; + + /* Link bottom route to the GRE tunnels */ + SYS(fail, "ip -n %s route add %s/32 dev veth5 via %s %s", + ns1, IP4_ADDR_GRE, IP4_ADDR_6, vrf); + SYS(fail, "ip -n %s route add %s/32 dev veth7 via %s %s", + ns2, IP4_ADDR_GRE, IP4_ADDR_8, vrf); + SYS(fail, "ip -n %s -6 route add %s/128 dev veth5 via %s %s", + ns1, IP6_ADDR_GRE, IP6_ADDR_6, vrf); + SYS(fail, "ip -n %s -6 route add %s/128 dev veth7 via %s %s", + ns2, IP6_ADDR_GRE, IP6_ADDR_8, vrf); + + return 0; +fail: + return -1; +} + +int remove_routes_to_gredev(const char *ns1, const char *ns2, const char *vrf) +{ + SYS(fail, "ip -n %s route del %s dev veth5 %s", ns1, IP4_ADDR_GRE, vrf); + SYS(fail, "ip -n %s route del %s dev veth7 %s", ns2, IP4_ADDR_GRE, vrf); + SYS(fail, "ip -n %s -6 route del %s/128 dev veth5 %s", ns1, IP6_ADDR_GRE, vrf); + SYS(fail, "ip -n %s -6 route del %s/128 dev veth7 %s", ns2, IP6_ADDR_GRE, vrf); + + return 0; +fail: + return -1; +} + +int add_unreachable_routes_to_gredev(const char *ns1, const char *ns2, const char *vrf) +{ + SYS(fail, "ip -n %s route add unreachable %s/32 %s", ns1, IP4_ADDR_GRE, vrf); + SYS(fail, "ip -n %s route add unreachable %s/32 %s", ns2, IP4_ADDR_GRE, vrf); + SYS(fail, "ip -n %s -6 route add unreachable %s/128 %s", ns1, IP6_ADDR_GRE, vrf); + SYS(fail, "ip -n %s -6 route add unreachable %s/128 %s", ns2, IP6_ADDR_GRE, vrf); + + return 0; +fail: + return -1; +} + +#define GSO_SIZE 5000 +#define GSO_TCP_PORT 9000 +/* This tests the fix from commit ea0371f78799 ("net: fix GSO in bpf_lwt_push_ip_encap") */ +static int test_gso_fix(const char *ns1, const char *ns3, int family) +{ + const char *ip_addr = family == AF_INET ? 
IP4_ADDR_DST : IP6_ADDR_DST; + char gso_packet[GSO_SIZE] = {}; + struct nstoken *nstoken = NULL; + int sfd, cfd, afd; + ssize_t bytes; + int ret = -1; + + if (!ns1 || !ns3) + return ret; + + nstoken = open_netns(ns3); + if (!ASSERT_OK_PTR(nstoken, "open ns3")) + return ret; + + sfd = start_server_str(family, SOCK_STREAM, ip_addr, GSO_TCP_PORT, NULL); + if (!ASSERT_OK_FD(sfd, "start server")) + goto close_netns; + + close_netns(nstoken); + + nstoken = open_netns(ns1); + if (!ASSERT_OK_PTR(nstoken, "open ns1")) + goto close_server; + + cfd = connect_to_addr_str(family, SOCK_STREAM, ip_addr, GSO_TCP_PORT, NULL); + if (!ASSERT_OK_FD(cfd, "connect to server")) + goto close_server; + + close_netns(nstoken); + nstoken = NULL; + + afd = accept(sfd, NULL, NULL); + if (!ASSERT_OK_FD(afd, "accept")) + goto close_client; + + /* Send a packet larger than MTU */ + bytes = send(cfd, gso_packet, GSO_SIZE, 0); + if (!ASSERT_EQ(bytes, GSO_SIZE, "send packet")) + goto close_accept; + + /* Verify we received all expected bytes */ + bytes = read(afd, gso_packet, GSO_SIZE); + if (!ASSERT_EQ(bytes, GSO_SIZE, "receive packet")) + goto close_accept; + + ret = 0; + +close_accept: + close(afd); +close_client: + close(cfd); +close_server: + close(sfd); +close_netns: + close_netns(nstoken); + + return ret; +} + +static int check_ping_ok(const char *ns1) +{ + SYS(fail, "ip netns exec %s ping -c 1 -W1 -I veth1 %s > /dev/null", ns1, IP4_ADDR_DST); + SYS(fail, "ip netns exec %s ping6 -c 1 -W1 -I veth1 %s > /dev/null", ns1, IP6_ADDR_DST); + return 0; +fail: + return -1; +} + +static int check_ping_fails(const char *ns1) +{ + int ret; + + ret = SYS_NOFAIL("ip netns exec %s ping -c 1 -W1 -I veth1 %s", ns1, IP4_ADDR_DST); + if (!ret) + return -1; + + ret = SYS_NOFAIL("ip netns exec %s ping6 -c 1 -W1 -I veth1 %s", ns1, IP6_ADDR_DST); + if (!ret) + return -1; + + return 0; +} + +#define EGRESS true +#define INGRESS false +#define IPV4_ENCAP true +#define IPV6_ENCAP false +static void lwt_ip_encap(bool 
ipv4_encap, bool egress, const char *vrf) +{ + char ns1[NETNS_NAME_SIZE] = NETNS_BASE "-1-"; + char ns2[NETNS_NAME_SIZE] = NETNS_BASE "-2-"; + char ns3[NETNS_NAME_SIZE] = NETNS_BASE "-3-"; + char *sec = ipv4_encap ? "encap_gre" : "encap_gre6"; + + if (!vrf) + return; + + if (!ASSERT_OK(create_ns(ns1, NETNS_NAME_SIZE), "create ns1")) + goto out; + if (!ASSERT_OK(create_ns(ns2, NETNS_NAME_SIZE), "create ns2")) + goto out; + if (!ASSERT_OK(create_ns(ns3, NETNS_NAME_SIZE), "create ns3")) + goto out; + + if (!ASSERT_OK(setup_network(ns1, ns2, ns3, vrf), "setup network")) + goto out; + + /* By default, pings work */ + if (!ASSERT_OK(check_ping_ok(ns1), "ping OK")) + goto out; + + /* Remove NS2->DST routes, ping fails */ + SYS(out, "ip -n %s route del %s/32 dev veth3 %s", ns2, IP4_ADDR_DST, vrf); + SYS(out, "ip -n %s -6 route del %s/128 dev veth3 %s", ns2, IP6_ADDR_DST, vrf); + if (!ASSERT_OK(check_ping_fails(ns1), "ping expected fail")) + goto out; + + /* Install replacement routes (LWT/eBPF), pings succeed */ + if (egress) { + SYS(out, "ip -n %s route add %s encap bpf xmit obj %s sec %s dev veth1 %s", + ns1, IP4_ADDR_DST, BPF_FILE, sec, vrf); + SYS(out, "ip -n %s -6 route add %s encap bpf xmit obj %s sec %s dev veth1 %s", + ns1, IP6_ADDR_DST, BPF_FILE, sec, vrf); + } else { + SYS(out, "ip -n %s route add %s encap bpf in obj %s sec %s dev veth2 %s", + ns2, IP4_ADDR_DST, BPF_FILE, sec, vrf); + SYS(out, "ip -n %s -6 route add %s encap bpf in obj %s sec %s dev veth2 %s", + ns2, IP6_ADDR_DST, BPF_FILE, sec, vrf); + } + + if (!ASSERT_OK(check_ping_ok(ns1), "ping OK")) + goto out; + + /* Skip GSO tests with VRF: VRF routing needs properly assigned + * source IP/device, which is easy to do with ping but hard with TCP. 
+ */ + if (egress && !vrf[0]) { + if (!ASSERT_OK(test_gso_fix(ns1, ns3, AF_INET), "test GSO")) + goto out; + } + + /* Negative test: remove routes to GRE devices: ping fails */ + if (!ASSERT_OK(remove_routes_to_gredev(ns1, ns2, vrf), "remove routes to gredev")) + goto out; + if (!ASSERT_OK(check_ping_fails(ns1), "ping expected fail")) + goto out; + + /* Another negative test */ + if (!ASSERT_OK(add_unreachable_routes_to_gredev(ns1, ns2, vrf), + "add unreachable routes")) + goto out; + ASSERT_OK(check_ping_fails(ns1), "ping expected fail"); + +out: + SYS_NOFAIL("ip netns del %s", ns1); + SYS_NOFAIL("ip netns del %s", ns2); + SYS_NOFAIL("ip netns del %s", ns3); +} + +void test_lwt_ip_encap_vrf_ipv6(void) +{ + if (test__start_subtest("egress")) + lwt_ip_encap(IPV6_ENCAP, EGRESS, "vrf red"); + + if (test__start_subtest("ingress")) + lwt_ip_encap(IPV6_ENCAP, INGRESS, "vrf red"); +} + +void test_lwt_ip_encap_vrf_ipv4(void) +{ + if (test__start_subtest("egress")) + lwt_ip_encap(IPV4_ENCAP, EGRESS, "vrf red"); + + if (test__start_subtest("ingress")) + lwt_ip_encap(IPV4_ENCAP, INGRESS, "vrf red"); +} + +void test_lwt_ip_encap_ipv6(void) +{ + if (test__start_subtest("egress")) + lwt_ip_encap(IPV6_ENCAP, EGRESS, ""); + + if (test__start_subtest("ingress")) + lwt_ip_encap(IPV6_ENCAP, INGRESS, ""); +} + +void test_lwt_ip_encap_ipv4(void) +{ + if (test__start_subtest("egress")) + lwt_ip_encap(IPV4_ENCAP, EGRESS, ""); + + if (test__start_subtest("ingress")) + lwt_ip_encap(IPV4_ENCAP, INGRESS, ""); +} diff --git a/tools/testing/selftests/bpf/test_lwt_ip_encap.sh b/tools/testing/selftests/bpf/test_lwt_ip_encap.sh deleted file mode 100755 index 1e565f47aca940d8dc7235d823c48537d7a708b8..0000000000000000000000000000000000000000 --- a/tools/testing/selftests/bpf/test_lwt_ip_encap.sh +++ /dev/null @@ -1,476 +0,0 @@ -#!/bin/bash -# SPDX-License-Identifier: GPL-2.0 -# -# Setup/topology: -# -# NS1 NS2 NS3 -# veth1 <---> veth2 veth3 <---> veth4 (the top route) -# veth5 <---> veth6 veth7 
<---> veth8 (the bottom route) -# -# each vethN gets IPv[4|6]_N address -# -# IPv*_SRC = IPv*_1 -# IPv*_DST = IPv*_4 -# -# all tests test pings from IPv*_SRC to IPv*_DST -# -# by default, routes are configured to allow packets to go -# IP*_1 <=> IP*_2 <=> IP*_3 <=> IP*_4 (the top route) -# -# a GRE device is installed in NS3 with IPv*_GRE, and -# NS1/NS2 are configured to route packets to IPv*_GRE via IP*_8 -# (the bottom route) -# -# Tests: -# -# 1. routes NS2->IPv*_DST are brought down, so the only way a ping -# from IP*_SRC to IP*_DST can work is via IPv*_GRE -# -# 2a. in an egress test, a bpf LWT_XMIT program is installed on veth1 -# that encaps the packets with an IP/GRE header to route to IPv*_GRE -# -# ping: SRC->[encap at veth1:egress]->GRE:decap->DST -# ping replies go DST->SRC directly -# -# 2b. in an ingress test, a bpf LWT_IN program is installed on veth2 -# that encaps the packets with an IP/GRE header to route to IPv*_GRE -# -# ping: SRC->[encap at veth2:ingress]->GRE:decap->DST -# ping replies go DST->SRC directly - -BPF_FILE="test_lwt_ip_encap.bpf.o" -if [[ $EUID -ne 0 ]]; then - echo "This script must be run as root" - echo "FAIL" - exit 1 -fi - -readonly NS1="ns1-$(mktemp -u XXXXXX)" -readonly NS2="ns2-$(mktemp -u XXXXXX)" -readonly NS3="ns3-$(mktemp -u XXXXXX)" - -readonly IPv4_1="172.16.1.100" -readonly IPv4_2="172.16.2.100" -readonly IPv4_3="172.16.3.100" -readonly IPv4_4="172.16.4.100" -readonly IPv4_5="172.16.5.100" -readonly IPv4_6="172.16.6.100" -readonly IPv4_7="172.16.7.100" -readonly IPv4_8="172.16.8.100" -readonly IPv4_GRE="172.16.16.100" - -readonly IPv4_SRC=$IPv4_1 -readonly IPv4_DST=$IPv4_4 - -readonly IPv6_1="fb01::1" -readonly IPv6_2="fb02::1" -readonly IPv6_3="fb03::1" -readonly IPv6_4="fb04::1" -readonly IPv6_5="fb05::1" -readonly IPv6_6="fb06::1" -readonly IPv6_7="fb07::1" -readonly IPv6_8="fb08::1" -readonly IPv6_GRE="fb10::1" - -readonly IPv6_SRC=$IPv6_1 -readonly IPv6_DST=$IPv6_4 - -TEST_STATUS=0 -TESTS_SUCCEEDED=0 
-TESTS_FAILED=0 - -TMPFILE="" - -process_test_results() -{ - if [[ "${TEST_STATUS}" -eq 0 ]] ; then - echo "PASS" - TESTS_SUCCEEDED=$((TESTS_SUCCEEDED+1)) - else - echo "FAIL" - TESTS_FAILED=$((TESTS_FAILED+1)) - fi -} - -print_test_summary_and_exit() -{ - echo "passed tests: ${TESTS_SUCCEEDED}" - echo "failed tests: ${TESTS_FAILED}" - if [ "${TESTS_FAILED}" -eq "0" ] ; then - exit 0 - else - exit 1 - fi -} - -setup() -{ - set -e # exit on error - TEST_STATUS=0 - - # create devices and namespaces - ip netns add "${NS1}" - ip netns add "${NS2}" - ip netns add "${NS3}" - - # rp_filter gets confused by what these tests are doing, so disable it - ip netns exec ${NS1} sysctl -wq net.ipv4.conf.all.rp_filter=0 - ip netns exec ${NS2} sysctl -wq net.ipv4.conf.all.rp_filter=0 - ip netns exec ${NS3} sysctl -wq net.ipv4.conf.all.rp_filter=0 - ip netns exec ${NS1} sysctl -wq net.ipv4.conf.default.rp_filter=0 - ip netns exec ${NS2} sysctl -wq net.ipv4.conf.default.rp_filter=0 - ip netns exec ${NS3} sysctl -wq net.ipv4.conf.default.rp_filter=0 - - # disable IPv6 DAD because it sometimes takes too long and fails tests - ip netns exec ${NS1} sysctl -wq net.ipv6.conf.all.accept_dad=0 - ip netns exec ${NS2} sysctl -wq net.ipv6.conf.all.accept_dad=0 - ip netns exec ${NS3} sysctl -wq net.ipv6.conf.all.accept_dad=0 - ip netns exec ${NS1} sysctl -wq net.ipv6.conf.default.accept_dad=0 - ip netns exec ${NS2} sysctl -wq net.ipv6.conf.default.accept_dad=0 - ip netns exec ${NS3} sysctl -wq net.ipv6.conf.default.accept_dad=0 - - ip link add veth1 type veth peer name veth2 - ip link add veth3 type veth peer name veth4 - ip link add veth5 type veth peer name veth6 - ip link add veth7 type veth peer name veth8 - - ip netns exec ${NS2} sysctl -wq net.ipv4.ip_forward=1 - ip netns exec ${NS2} sysctl -wq net.ipv6.conf.all.forwarding=1 - - ip link set veth1 netns ${NS1} - ip link set veth2 netns ${NS2} - ip link set veth3 netns ${NS2} - ip link set veth4 netns ${NS3} - ip link set veth5 netns ${NS1} - 
ip link set veth6 netns ${NS2} - ip link set veth7 netns ${NS2} - ip link set veth8 netns ${NS3} - - if [ ! -z "${VRF}" ] ; then - ip -netns ${NS1} link add red type vrf table 1001 - ip -netns ${NS1} link set red up - ip -netns ${NS1} route add table 1001 unreachable default metric 8192 - ip -netns ${NS1} -6 route add table 1001 unreachable default metric 8192 - ip -netns ${NS1} link set veth1 vrf red - ip -netns ${NS1} link set veth5 vrf red - - ip -netns ${NS2} link add red type vrf table 1001 - ip -netns ${NS2} link set red up - ip -netns ${NS2} route add table 1001 unreachable default metric 8192 - ip -netns ${NS2} -6 route add table 1001 unreachable default metric 8192 - ip -netns ${NS2} link set veth2 vrf red - ip -netns ${NS2} link set veth3 vrf red - ip -netns ${NS2} link set veth6 vrf red - ip -netns ${NS2} link set veth7 vrf red - fi - - # configure addesses: the top route (1-2-3-4) - ip -netns ${NS1} addr add ${IPv4_1}/24 dev veth1 - ip -netns ${NS2} addr add ${IPv4_2}/24 dev veth2 - ip -netns ${NS2} addr add ${IPv4_3}/24 dev veth3 - ip -netns ${NS3} addr add ${IPv4_4}/24 dev veth4 - ip -netns ${NS1} -6 addr add ${IPv6_1}/128 nodad dev veth1 - ip -netns ${NS2} -6 addr add ${IPv6_2}/128 nodad dev veth2 - ip -netns ${NS2} -6 addr add ${IPv6_3}/128 nodad dev veth3 - ip -netns ${NS3} -6 addr add ${IPv6_4}/128 nodad dev veth4 - - # configure addresses: the bottom route (5-6-7-8) - ip -netns ${NS1} addr add ${IPv4_5}/24 dev veth5 - ip -netns ${NS2} addr add ${IPv4_6}/24 dev veth6 - ip -netns ${NS2} addr add ${IPv4_7}/24 dev veth7 - ip -netns ${NS3} addr add ${IPv4_8}/24 dev veth8 - ip -netns ${NS1} -6 addr add ${IPv6_5}/128 nodad dev veth5 - ip -netns ${NS2} -6 addr add ${IPv6_6}/128 nodad dev veth6 - ip -netns ${NS2} -6 addr add ${IPv6_7}/128 nodad dev veth7 - ip -netns ${NS3} -6 addr add ${IPv6_8}/128 nodad dev veth8 - - ip -netns ${NS1} link set dev veth1 up - ip -netns ${NS2} link set dev veth2 up - ip -netns ${NS2} link set dev veth3 up - ip -netns ${NS3} 
link set dev veth4 up - ip -netns ${NS1} link set dev veth5 up - ip -netns ${NS2} link set dev veth6 up - ip -netns ${NS2} link set dev veth7 up - ip -netns ${NS3} link set dev veth8 up - - # configure routes: IP*_SRC -> veth1/IP*_2 (= top route) default; - # the bottom route to specific bottom addresses - - # NS1 - # top route - ip -netns ${NS1} route add ${IPv4_2}/32 dev veth1 ${VRF} - ip -netns ${NS1} route add default dev veth1 via ${IPv4_2} ${VRF} # go top by default - ip -netns ${NS1} -6 route add ${IPv6_2}/128 dev veth1 ${VRF} - ip -netns ${NS1} -6 route add default dev veth1 via ${IPv6_2} ${VRF} # go top by default - # bottom route - ip -netns ${NS1} route add ${IPv4_6}/32 dev veth5 ${VRF} - ip -netns ${NS1} route add ${IPv4_7}/32 dev veth5 via ${IPv4_6} ${VRF} - ip -netns ${NS1} route add ${IPv4_8}/32 dev veth5 via ${IPv4_6} ${VRF} - ip -netns ${NS1} -6 route add ${IPv6_6}/128 dev veth5 ${VRF} - ip -netns ${NS1} -6 route add ${IPv6_7}/128 dev veth5 via ${IPv6_6} ${VRF} - ip -netns ${NS1} -6 route add ${IPv6_8}/128 dev veth5 via ${IPv6_6} ${VRF} - - # NS2 - # top route - ip -netns ${NS2} route add ${IPv4_1}/32 dev veth2 ${VRF} - ip -netns ${NS2} route add ${IPv4_4}/32 dev veth3 ${VRF} - ip -netns ${NS2} -6 route add ${IPv6_1}/128 dev veth2 ${VRF} - ip -netns ${NS2} -6 route add ${IPv6_4}/128 dev veth3 ${VRF} - # bottom route - ip -netns ${NS2} route add ${IPv4_5}/32 dev veth6 ${VRF} - ip -netns ${NS2} route add ${IPv4_8}/32 dev veth7 ${VRF} - ip -netns ${NS2} -6 route add ${IPv6_5}/128 dev veth6 ${VRF} - ip -netns ${NS2} -6 route add ${IPv6_8}/128 dev veth7 ${VRF} - - # NS3 - # top route - ip -netns ${NS3} route add ${IPv4_3}/32 dev veth4 - ip -netns ${NS3} route add ${IPv4_1}/32 dev veth4 via ${IPv4_3} - ip -netns ${NS3} route add ${IPv4_2}/32 dev veth4 via ${IPv4_3} - ip -netns ${NS3} -6 route add ${IPv6_3}/128 dev veth4 - ip -netns ${NS3} -6 route add ${IPv6_1}/128 dev veth4 via ${IPv6_3} - ip -netns ${NS3} -6 route add ${IPv6_2}/128 dev veth4 via 
${IPv6_3} - # bottom route - ip -netns ${NS3} route add ${IPv4_7}/32 dev veth8 - ip -netns ${NS3} route add ${IPv4_5}/32 dev veth8 via ${IPv4_7} - ip -netns ${NS3} route add ${IPv4_6}/32 dev veth8 via ${IPv4_7} - ip -netns ${NS3} -6 route add ${IPv6_7}/128 dev veth8 - ip -netns ${NS3} -6 route add ${IPv6_5}/128 dev veth8 via ${IPv6_7} - ip -netns ${NS3} -6 route add ${IPv6_6}/128 dev veth8 via ${IPv6_7} - - # configure IPv4 GRE device in NS3, and a route to it via the "bottom" route - ip -netns ${NS3} tunnel add gre_dev mode gre remote ${IPv4_1} local ${IPv4_GRE} ttl 255 - ip -netns ${NS3} link set gre_dev up - ip -netns ${NS3} addr add ${IPv4_GRE} dev gre_dev - ip -netns ${NS1} route add ${IPv4_GRE}/32 dev veth5 via ${IPv4_6} ${VRF} - ip -netns ${NS2} route add ${IPv4_GRE}/32 dev veth7 via ${IPv4_8} ${VRF} - - - # configure IPv6 GRE device in NS3, and a route to it via the "bottom" route - ip -netns ${NS3} -6 tunnel add name gre6_dev mode ip6gre remote ${IPv6_1} local ${IPv6_GRE} ttl 255 - ip -netns ${NS3} link set gre6_dev up - ip -netns ${NS3} -6 addr add ${IPv6_GRE} nodad dev gre6_dev - ip -netns ${NS1} -6 route add ${IPv6_GRE}/128 dev veth5 via ${IPv6_6} ${VRF} - ip -netns ${NS2} -6 route add ${IPv6_GRE}/128 dev veth7 via ${IPv6_8} ${VRF} - - TMPFILE=$(mktemp /tmp/test_lwt_ip_encap.XXXXXX) - - sleep 1 # reduce flakiness - set +e -} - -cleanup() -{ - if [ -f ${TMPFILE} ] ; then - rm ${TMPFILE} - fi - - ip netns del ${NS1} 2> /dev/null - ip netns del ${NS2} 2> /dev/null - ip netns del ${NS3} 2> /dev/null -} - -trap cleanup EXIT - -remove_routes_to_gredev() -{ - ip -netns ${NS1} route del ${IPv4_GRE} dev veth5 ${VRF} - ip -netns ${NS2} route del ${IPv4_GRE} dev veth7 ${VRF} - ip -netns ${NS1} -6 route del ${IPv6_GRE}/128 dev veth5 ${VRF} - ip -netns ${NS2} -6 route del ${IPv6_GRE}/128 dev veth7 ${VRF} -} - -add_unreachable_routes_to_gredev() -{ - ip -netns ${NS1} route add unreachable ${IPv4_GRE}/32 ${VRF} - ip -netns ${NS2} route add unreachable ${IPv4_GRE}/32 
${VRF} - ip -netns ${NS1} -6 route add unreachable ${IPv6_GRE}/128 ${VRF} - ip -netns ${NS2} -6 route add unreachable ${IPv6_GRE}/128 ${VRF} -} - -test_ping() -{ - local readonly PROTO=$1 - local readonly EXPECTED=$2 - local RET=0 - - if [ "${PROTO}" == "IPv4" ] ; then - ip netns exec ${NS1} ping -c 1 -W 1 -I veth1 ${IPv4_DST} 2>&1 > /dev/null - RET=$? - elif [ "${PROTO}" == "IPv6" ] ; then - ip netns exec ${NS1} ping6 -c 1 -W 1 -I veth1 ${IPv6_DST} 2>&1 > /dev/null - RET=$? - else - echo " test_ping: unknown PROTO: ${PROTO}" - TEST_STATUS=1 - fi - - if [ "0" != "${RET}" ]; then - RET=1 - fi - - if [ "${EXPECTED}" != "${RET}" ] ; then - echo " test_ping failed: expected: ${EXPECTED}; got ${RET}" - TEST_STATUS=1 - fi -} - -test_gso() -{ - local readonly PROTO=$1 - local readonly PKT_SZ=5000 - local IP_DST="" - : > ${TMPFILE} # trim the capture file - - # check that nc is present - command -v nc >/dev/null 2>&1 || \ - { echo >&2 "nc is not available: skipping TSO tests"; return; } - - # listen on port 9000, capture TCP into $TMPFILE - if [ "${PROTO}" == "IPv4" ] ; then - IP_DST=${IPv4_DST} - ip netns exec ${NS3} bash -c \ - "nc -4 -l -p 9000 > ${TMPFILE} &" - elif [ "${PROTO}" == "IPv6" ] ; then - IP_DST=${IPv6_DST} - ip netns exec ${NS3} bash -c \ - "nc -6 -l -p 9000 > ${TMPFILE} &" - RET=$? 
- else - echo " test_gso: unknown PROTO: ${PROTO}" - TEST_STATUS=1 - fi - sleep 1 # let nc start listening - - # send a packet larger than MTU - ip netns exec ${NS1} bash -c \ - "dd if=/dev/zero bs=$PKT_SZ count=1 > /dev/tcp/${IP_DST}/9000 2>/dev/null" - sleep 2 # let the packet get delivered - - # verify we received all expected bytes - SZ=$(stat -c %s ${TMPFILE}) - if [ "$SZ" != "$PKT_SZ" ] ; then - echo " test_gso failed: ${PROTO}" - TEST_STATUS=1 - fi -} - -test_egress() -{ - local readonly ENCAP=$1 - echo "starting egress ${ENCAP} encap test ${VRF}" - setup - - # by default, pings work - test_ping IPv4 0 - test_ping IPv6 0 - - # remove NS2->DST routes, ping fails - ip -netns ${NS2} route del ${IPv4_DST}/32 dev veth3 ${VRF} - ip -netns ${NS2} -6 route del ${IPv6_DST}/128 dev veth3 ${VRF} - test_ping IPv4 1 - test_ping IPv6 1 - - # install replacement routes (LWT/eBPF), pings succeed - if [ "${ENCAP}" == "IPv4" ] ; then - ip -netns ${NS1} route add ${IPv4_DST} encap bpf xmit obj \ - ${BPF_FILE} sec encap_gre dev veth1 ${VRF} - ip -netns ${NS1} -6 route add ${IPv6_DST} encap bpf xmit obj \ - ${BPF_FILE} sec encap_gre dev veth1 ${VRF} - elif [ "${ENCAP}" == "IPv6" ] ; then - ip -netns ${NS1} route add ${IPv4_DST} encap bpf xmit obj \ - ${BPF_FILE} sec encap_gre6 dev veth1 ${VRF} - ip -netns ${NS1} -6 route add ${IPv6_DST} encap bpf xmit obj \ - ${BPF_FILE} sec encap_gre6 dev veth1 ${VRF} - else - echo " unknown encap ${ENCAP}" - TEST_STATUS=1 - fi - test_ping IPv4 0 - test_ping IPv6 0 - - # skip GSO tests with VRF: VRF routing needs properly assigned - # source IP/device, which is easy to do with ping and hard with dd/nc. 
- if [ -z "${VRF}" ] ; then - test_gso IPv4 - test_gso IPv6 - fi - - # a negative test: remove routes to GRE devices: ping fails - remove_routes_to_gredev - test_ping IPv4 1 - test_ping IPv6 1 - - # another negative test - add_unreachable_routes_to_gredev - test_ping IPv4 1 - test_ping IPv6 1 - - cleanup - process_test_results -} - -test_ingress() -{ - local readonly ENCAP=$1 - echo "starting ingress ${ENCAP} encap test ${VRF}" - setup - - # need to wait a bit for IPv6 to autoconf, otherwise - # ping6 sometimes fails with "unable to bind to address" - - # by default, pings work - test_ping IPv4 0 - test_ping IPv6 0 - - # remove NS2->DST routes, pings fail - ip -netns ${NS2} route del ${IPv4_DST}/32 dev veth3 ${VRF} - ip -netns ${NS2} -6 route del ${IPv6_DST}/128 dev veth3 ${VRF} - test_ping IPv4 1 - test_ping IPv6 1 - - # install replacement routes (LWT/eBPF), pings succeed - if [ "${ENCAP}" == "IPv4" ] ; then - ip -netns ${NS2} route add ${IPv4_DST} encap bpf in obj \ - ${BPF_FILE} sec encap_gre dev veth2 ${VRF} - ip -netns ${NS2} -6 route add ${IPv6_DST} encap bpf in obj \ - ${BPF_FILE} sec encap_gre dev veth2 ${VRF} - elif [ "${ENCAP}" == "IPv6" ] ; then - ip -netns ${NS2} route add ${IPv4_DST} encap bpf in obj \ - ${BPF_FILE} sec encap_gre6 dev veth2 ${VRF} - ip -netns ${NS2} -6 route add ${IPv6_DST} encap bpf in obj \ - ${BPF_FILE} sec encap_gre6 dev veth2 ${VRF} - else - echo "FAIL: unknown encap ${ENCAP}" - TEST_STATUS=1 - fi - test_ping IPv4 0 - test_ping IPv6 0 - - # a negative test: remove routes to GRE devices: ping fails - remove_routes_to_gredev - test_ping IPv4 1 - test_ping IPv6 1 - - # another negative test - add_unreachable_routes_to_gredev - test_ping IPv4 1 - test_ping IPv6 1 - - cleanup - process_test_results -} - -VRF="" -test_egress IPv4 -test_egress IPv6 -test_ingress IPv4 -test_ingress IPv6 - -VRF="vrf red" -test_egress IPv4 -test_egress IPv6 -test_ingress IPv4 -test_ingress IPv6 - -print_test_summary_and_exit --- base-commit: 
5fd21aaac37919abc5c5d0df1eb06a9f02518f27 change-id: 20250206-lwt_ip-b6a91d2787bf Best regards, -- Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
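The reworked GSO check above replaces nc and dd with a plain TCP exchange: start a server, connect, accept, send GSO_SIZE bytes, and confirm that the same number of bytes arrives. A minimal sketch of that exchange follows, under loud assumptions: it runs over 127.0.0.1 in a single namespace, so it exercises neither GSO nor the LWT encapsulation, and every sketch_* name is hypothetical and not part of the patch or of network_helpers.h. Since TCP is a byte stream, a single read() is not guaranteed in general to return the whole payload, so the sketch loops on both the send and the receive side.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_PAYLOAD_SIZE 5000 /* same value as GSO_SIZE in the patch */

/* Send exactly len bytes, retrying on short writes. Returns 0 or -1. */
static int sketch_send_all(int fd, const char *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = send(fd, buf + done, len - done, 0);

		if (n <= 0)
			return -1;
		done += (size_t)n;
	}
	return 0;
}

/* Receive until len bytes arrived or the peer closed. Returns bytes read or -1. */
static ssize_t sketch_recv_all(int fd, char *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = recv(fd, buf + done, len - done, 0);

		if (n < 0)
			return -1;
		if (n == 0)
			break;
		done += (size_t)n;
	}
	return (ssize_t)done;
}

/* Loopback round trip on an ephemeral port. Returns bytes received or -1. */
static ssize_t sketch_loopback_roundtrip(void)
{
	static char tx[SKETCH_PAYLOAD_SIZE], rx[SKETCH_PAYLOAD_SIZE];
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	int sfd = -1, cfd = -1, afd = -1;
	ssize_t ret = -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0; /* let the kernel pick a free port */

	sfd = socket(AF_INET, SOCK_STREAM, 0);
	if (sfd < 0)
		goto out;
	if (bind(sfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	if (listen(sfd, 1) < 0)
		goto out;
	/* Learn which port was assigned so the client can connect to it */
	if (getsockname(sfd, (struct sockaddr *)&addr, &alen) < 0)
		goto out;

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0)
		goto out;
	if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	afd = accept(sfd, NULL, NULL);
	if (afd < 0)
		goto out;

	if (sketch_send_all(cfd, tx, sizeof(tx)) < 0)
		goto out;
	ret = sketch_recv_all(afd, rx, sizeof(rx));
out:
	if (afd >= 0)
		close(afd);
	if (cfd >= 0)
		close(cfd);
	if (sfd >= 0)
		close(sfd);
	return ret;
}
```

Calling sketch_loopback_roundtrip() from a main() and comparing the result with SKETCH_PAYLOAD_SIZE gives the same pass/fail signal as the byte-count comparison in the patch. The single cleanup label with fd checks is a simplification of the patch's staged close_* labels.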

5 months, 2 weeks


[PATCH] selftests/nolibc: stop testing constructor order

by Thomas Weißschuh

The execution order of constructors is undefined and depends on the toolchain. While recent toolchains seem to have a stable order, it doesn't work for older ones and may also change at any time. Stop validating the order and instead only validate that all constructors are executed. Reported-by: Willy Tarreau <w(a)1wt.eu> Closes: https://lore.kernel.org/lkml/20250301110735.GA18621@1wt.eu/ Signed-off-by: Thomas Weißschuh <linux(a)weissschuh.net> --- tools/testing/selftests/nolibc/nolibc-test-linkage.c | 6 +++--- tools/testing/selftests/nolibc/nolibc-test.c | 8 ++++---- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/nolibc/nolibc-test-linkage.c b/tools/testing/selftests/nolibc/nolibc-test-linkage.c index 5ff4c8a1db2a46cf3f8cb55bdabaa5e8819b344c..a7ca8325863face9cd4134a717fe4c7761bdeb7f 100644 --- a/tools/testing/selftests/nolibc/nolibc-test-linkage.c +++ b/tools/testing/selftests/nolibc/nolibc-test-linkage.c @@ -11,16 +11,16 @@ void *linkage_test_errno_addr(void) return &errno; } -int linkage_test_constructor_test_value; +int linkage_test_constructor_test_value = 0; __attribute__((constructor)) static void constructor1(void) { - linkage_test_constructor_test_value = 2; + linkage_test_constructor_test_value |= 1 << 0; } __attribute__((constructor)) static void constructor2(void) { - linkage_test_constructor_test_value *= 3; + linkage_test_constructor_test_value |= 1 << 1; } diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index a5abf16dbfe0f2aed286964fdfc391bc6201ef3b..5884a891c491544050fc35b07322c73a1a9dbaf3 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -692,14 +692,14 @@ int expect_strtox(int llen, void *func, const char *input, int base, intmax_t ex __attribute__((constructor)) static void constructor1(void) { - constructor_test_value = 1; + constructor_test_value |= 1 << 0; } __attribute__((constructor)) 
static void constructor2(int argc, char **argv, char **envp) { if (argc && argv && envp) - constructor_test_value *= 2; + constructor_test_value |= 1 << 1; } int run_startup(int min, int max) @@ -738,9 +738,9 @@ int run_startup(int min, int max) CASE_TEST(environ_HOME); EXPECT_PTRNZ(1, getenv("HOME")); break; CASE_TEST(auxv_addr); EXPECT_PTRGT(test_auxv != (void *)-1, test_auxv, brk); break; CASE_TEST(auxv_AT_UID); EXPECT_EQ(1, getauxval(AT_UID), getuid()); break; - CASE_TEST(constructor); EXPECT_EQ(is_nolibc, constructor_test_value, 2); break; + CASE_TEST(constructor); EXPECT_EQ(is_nolibc, constructor_test_value, 0x3); break; CASE_TEST(linkage_errno); EXPECT_PTREQ(1, linkage_test_errno_addr(), &errno); break; - CASE_TEST(linkage_constr); EXPECT_EQ(is_nolibc, linkage_test_constructor_test_value, 6); break; + CASE_TEST(linkage_constr); EXPECT_EQ(1, linkage_test_constructor_test_value, 0x3); break; case __LINE__: return ret; /* must be last */ /* note: do not set any defaults so as to permit holes above */ --- base-commit: 6e406202a44a1a37176da0333cec10d5320c4b33 change-id: 20250306-nolibc-constructor-order-6921e8c93591 Best regards, -- Thomas Weißschuh <linux(a)weissschuh.net>
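
For readers skimming the archive, the order-independent check from the patch above can be sketched in isolation. This is only a minimal illustration, not the nolibc test itself: the names ctor_mask, ctor_a, ctor_b and all_ctors_ran are made up here, and __attribute__((constructor)) is assumed to be available as a GCC/Clang extension.

```c
/* Hypothetical standalone sketch; none of these names come from nolibc. */

/* Zero-initialized at load time, before any constructor runs. */
static int ctor_mask;

__attribute__((constructor))
static void ctor_a(void)
{
	/* Each constructor sets only its own bit, so ordering cannot matter. */
	ctor_mask |= 1 << 0;
}

__attribute__((constructor))
static void ctor_b(void)
{
	ctor_mask |= 1 << 1;
}

/* Returns 1 if both constructors ran, in whatever order the toolchain chose. */
static int all_ctors_ran(void)
{
	return ctor_mask == 0x3;
}
```

With the old scheme (one constructor assigning and the other multiplying) the final value depended on which ran first; OR-ing distinct bits is commutative, so a caller such as main() can simply check all_ctors_ran().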

5 months, 2 weeks


[PATCH v2] selftests: livepatch: test if ftrace can trace a livepatched function

by Filipe Xavier

This new test makes sure that ftrace can trace a function that was introduced by a livepatch. Signed-off-by: Filipe Xavier <felipeaggger(a)gmail.com> --- Changes in v2: - functions.sh: added reset tracing on push and pop_config. - test-ftrace.sh: enabled tracing_on before test init. - nitpick: added double quotations on filenames and fixed some wording. - Link to v1: https://lore.kernel.org/r/20250102-ftrace-selftest-livepatch-v1-1-84880baef… --- tools/testing/selftests/livepatch/functions.sh | 14 ++++++++++ tools/testing/selftests/livepatch/test-ftrace.sh | 33 ++++++++++++++++++++++++ 2 files changed, 47 insertions(+) diff --git a/tools/testing/selftests/livepatch/functions.sh b/tools/testing/selftests/livepatch/functions.sh index e5d06fb402335d85959bafe099087effc6ddce12..e6c13514002dae5f8d7461f90b8241ab43024ea4 100644 --- a/tools/testing/selftests/livepatch/functions.sh +++ b/tools/testing/selftests/livepatch/functions.sh @@ -62,6 +62,9 @@ function push_config() { awk -F'[: ]' '{print "file " $1 " line " $2 " " $4}') FTRACE_ENABLED=$(sysctl --values kernel.ftrace_enabled) KPROBE_ENABLED=$(cat "$SYSFS_KPROBES_DIR/enabled") + TRACING_ON=$(cat "$SYSFS_DEBUG_DIR/tracing/tracing_on") + CURRENT_TRACER=$(cat "$SYSFS_DEBUG_DIR/tracing/current_tracer") + FTRACE_FILTER=$(cat "$SYSFS_DEBUG_DIR/tracing/set_ftrace_filter") } function pop_config() { @@ -74,6 +77,17 @@ function pop_config() { if [[ -n "$KPROBE_ENABLED" ]]; then echo "$KPROBE_ENABLED" > "$SYSFS_KPROBES_DIR/enabled" fi + if [[ -n "$TRACING_ON" ]]; then + echo "$TRACING_ON" > "$SYSFS_DEBUG_DIR/tracing/tracing_on" + fi + if [[ -n "$CURRENT_TRACER" ]]; then + echo "$CURRENT_TRACER" > "$SYSFS_DEBUG_DIR/tracing/current_tracer" + fi + if [[ "$FTRACE_FILTER" == *"#"* ]]; then + echo > "$SYSFS_DEBUG_DIR/tracing/set_ftrace_filter" + elif [[ -n "$FTRACE_FILTER" ]]; then + echo "$FTRACE_FILTER" > "$SYSFS_DEBUG_DIR/tracing/set_ftrace_filter" + fi } function set_dynamic_debug() { diff --git 
a/tools/testing/selftests/livepatch/test-ftrace.sh b/tools/testing/selftests/livepatch/test-ftrace.sh index fe14f248913acbec46fb6c0fec38a2fc84209d39..66af5d726c52e48e5177804e182b4ff31784d5ac 100755 --- a/tools/testing/selftests/livepatch/test-ftrace.sh +++ b/tools/testing/selftests/livepatch/test-ftrace.sh @@ -61,4 +61,37 @@ livepatch: '$MOD_LIVEPATCH': unpatching complete % rmmod $MOD_LIVEPATCH" +# - verify livepatch can load +# - check if traces have a patched function +# - unload livepatch and reset trace + +start_test "trace livepatched function and check that the live patch remains in effect" + +TRACE_FILE="$SYSFS_DEBUG_DIR/tracing/trace" +FUNCTION_NAME="livepatch_cmdline_proc_show" + +load_lp $MOD_LIVEPATCH + +echo 1 > "$SYSFS_DEBUG_DIR/tracing/tracing_on" +echo $FUNCTION_NAME > "$SYSFS_DEBUG_DIR/tracing/set_ftrace_filter" +echo "function" > "$SYSFS_DEBUG_DIR/tracing/current_tracer" +echo "" > "$TRACE_FILE" + +if [[ "$(cat /proc/cmdline)" != "$MOD_LIVEPATCH: this has been live patched" ]] ; then + echo -e "FAIL\n\n" + die "livepatch kselftest(s) failed" +fi + +grep -q $FUNCTION_NAME "$TRACE_FILE" +FOUND=$? + +disable_lp $MOD_LIVEPATCH +unload_lp $MOD_LIVEPATCH + +if [ "$FOUND" -eq 1 ]; then + echo -e "FAIL\n\n" + die "livepatch kselftest(s) failed" +fi + + exit 0 --- base-commit: fc033cf25e612e840e545f8d5ad2edd6ba613ed5 change-id: 20250101-ftrace-selftest-livepatch-161fb77dbed8 Best regards, -- Filipe Xavier <felipeaggger(a)gmail.com>

5 months, 2 weeks


[PATCH v3 1/5] tools/nolibc: add support for openat(2)

by Louis Taylor

openat is useful to avoid needing to construct relative paths, so expose a
wrapper for using it directly.

Signed-off-by: Louis Taylor <louis(a)kragniz.eu>
---
 tools/include/nolibc/sys.h                   | 25 ++++++++++++++++++++
 tools/testing/selftests/nolibc/nolibc-test.c | 17 +++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h
index 8f44c33b1213..3cd938f9abda 100644
--- a/tools/include/nolibc/sys.h
+++ b/tools/include/nolibc/sys.h
@@ -765,6 +765,31 @@ int mount(const char *src, const char *tgt,
 	return __sysret(sys_mount(src, tgt, fst, flags, data));
 }
 
+/*
+ * int openat(int dirfd, const char *path, int flags[, mode_t mode]);
+ */
+
+static __attribute__((unused))
+int sys_openat(int dirfd, const char *path, int flags, mode_t mode)
+{
+	return my_syscall4(__NR_openat, dirfd, path, flags, mode);
+}
+
+static __attribute__((unused))
+int openat(int dirfd, const char *path, int flags, ...)
+{
+	mode_t mode = 0;
+
+	if (flags & O_CREAT) {
+		va_list args;
+
+		va_start(args, flags);
+		mode = va_arg(args, mode_t);
+		va_end(args);
+	}
+
+	return __sysret(sys_openat(dirfd, path, flags, mode));
+}
 
 /*
  * int open(const char *path, int flags[, mode_t mode]);
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index 79c3e6a845f3..e8faddcecf9d 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -1028,6 +1028,22 @@ int test_rlimit(void)
 	return 0;
 }
 
+int test_openat(void)
+{
+	int dev, null;
+
+	dev = openat(AT_FDCWD, "/dev", O_DIRECTORY);
+	if (dev < 0)
+		return -1;
+
+	null = openat(dev, "null", O_RDONLY);
+	close(dev);
+	if (null < 0)
+		return -1;
+
+	close(null);
+	return 0;
+}
 
 /* Run syscall tests between IDs <min> and <max>.
  * Return 0 on success, non-zero on failure.
@@ -1116,6 +1132,7 @@ int run_syscall(int min, int max)
 		CASE_TEST(mmap_munmap_good); EXPECT_SYSZR(1, test_mmap_munmap()); break;
 		CASE_TEST(open_tty); EXPECT_SYSNE(1, tmp = open("/dev/null", 0), -1); if (tmp != -1) close(tmp); break;
 		CASE_TEST(open_blah); EXPECT_SYSER(1, tmp = open("/proc/self/blah", 0), -1, ENOENT); if (tmp != -1) close(tmp); break;
+		CASE_TEST(openat_dir); EXPECT_SYSZR(1, test_openat()); break;
 		CASE_TEST(pipe); EXPECT_SYSZR(1, test_pipe()); break;
 		CASE_TEST(poll_null); EXPECT_SYSZR(1, poll(NULL, 0, 0)); break;
 		CASE_TEST(poll_stdout); EXPECT_SYSNE(1, ({ struct pollfd fds = { 1, POLLOUT, 0}; poll(&fds, 1, 0); }), -1); break;
-- 
2.45.2
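
The directory-relative lookup that the test in this patch exercises can be sketched against a regular libc. This is an illustration only, assuming a POSIX.1-2008 libc rather than nolibc; the helper name open_in_dir is hypothetical and does not appear in the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for O_DIRECTORY and openat() under strict -std modes */
#endif
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `name` relative to the directory `dir`
 * without building a combined path string.
 * Returns a file descriptor, or -1 on failure.
 */
static int open_in_dir(const char *dir, const char *name, int flags)
{
	int dirfd, fd;

	/* AT_FDCWD makes openat() resolve `dir` like plain open() would. */
	dirfd = openat(AT_FDCWD, dir, O_RDONLY | O_DIRECTORY);
	if (dirfd < 0)
		return -1;

	/* `name` is resolved relative to dirfd, not the working directory. */
	fd = openat(dirfd, name, flags);
	close(dirfd);
	return fd;
}
```

For example, open_in_dir("/dev", "null", O_RDONLY) follows the same two steps as test_openat() in the patch: open the directory first, then open the entry relative to it.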

5 months, 2 weeks


[PATCH bpf-next v5 0/6] XDP metadata support for tun driver

by Marcus Wichelmann

Hi all, this v5 of the patch series is very similar to v4, but rebased onto the bpf-next/net branch instead of bpf-next/master. Because the commit c047e0e0e435 ("selftests/bpf: Optionally open a dedicated namespace to run test in it") is not yet included in this branch, I changed the xdp_context_tuntap test to manually create a namespace to run the test in. Not so successful pipeline: https://github.com/kernel-patches/bpf/actions/runs/13682405154 The CI pipeline failed because of veristat changes in seemingly unrelated eBPF programs. I don't think this has to do with the changes from this patch series, but if it does, please let me know what I may have to do different to make the CI pass. --- v5: - rebase onto bpf-next/net - resolve rebase conflicts - change xdp_context_tuntap test to manually create and open a network namespace using netns_new v4: https://lore.kernel.org/bpf/20250227142330.1605996-1-marcus.wichelmann@hetz… - strip unrelated changes from the selftest patches - extend commit message for "selftests/bpf: refactor xdp_context_functional test and bpf program" - the NOARP flag was not effective to prevent other packets from interfering with the tests, add a filter to the XDP program instead - run xdp_context_tuntap in a separate namespace to avoid conflicts with other tests v3: https://lore.kernel.org/bpf/20250224152909.3911544-1-marcus.wichelmann@hetz… - change the condition to handle xdp_buffs without metadata support, as suggested by Willem de Bruijn <willemb(a)google.com> - add clarifying comment why that condition is needed - set NOARP flag in selftests to ensure that the kernel does not send packets on the test interfaces that may interfere with the tests v2: https://lore.kernel.org/bpf/20250217172308.3291739-1-marcus.wichelmann@hetz… - submit against bpf-next subtree - split commits and improved commit messages - remove redundant metasize check and add clarifying comment instead - use max() instead of ternary operator - add selftest for metadata 
support in the tun driver v1: https://lore.kernel.org/all/20250130171614.1657224-1-marcus.wichelmann@hetz… Marcus Wichelmann (6): net: tun: enable XDP metadata support net: tun: enable transfer of XDP metadata to skb selftests/bpf: move open_tuntap to network helpers selftests/bpf: refactor xdp_context_functional test and bpf program selftests/bpf: add test for XDP metadata support in tun driver selftests/bpf: fix file descriptor assertion in open_tuntap helper drivers/net/tun.c | 28 +++- tools/testing/selftests/bpf/network_helpers.c | 28 ++++ tools/testing/selftests/bpf/network_helpers.h | 3 + .../selftests/bpf/prog_tests/lwt_helpers.h | 29 ---- .../bpf/prog_tests/xdp_context_test_run.c | 145 +++++++++++++++++- .../selftests/bpf/progs/test_xdp_meta.c | 53 +++++-- 6 files changed, 230 insertions(+), 56 deletions(-) -- 2.43.0

5 months, 2 weeks


[PATCH v2 1/5] tools/nolibc: add support for openat(2)

by Louis Taylor

openat is useful to avoid needing to construct relative paths, so expose a wrapper for using it directly. Signed-off-by: Louis Taylor <louis(a)kragniz.eu> --- tools/include/nolibc/sys.h | 25 ++++++++++++++++++++ tools/testing/selftests/nolibc/nolibc-test.c | 21 ++++++++++++++++ 2 files changed, 46 insertions(+) diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h index 8f44c33b1213..3cd938f9abda 100644 --- a/tools/include/nolibc/sys.h +++ b/tools/include/nolibc/sys.h @@ -765,6 +765,31 @@ int mount(const char *src, const char *tgt, return __sysret(sys_mount(src, tgt, fst, flags, data)); } +/* + * int openat(int dirfd, const char *path, int flags[, mode_t mode]); + */ + +static __attribute__((unused)) +int sys_openat(int dirfd, const char *path, int flags, mode_t mode) +{ + return my_syscall4(__NR_openat, dirfd, path, flags, mode); +} + +static __attribute__((unused)) +int openat(int dirfd, const char *path, int flags, ...) +{ + mode_t mode = 0; + + if (flags & O_CREAT) { + va_list args; + + va_start(args, flags); + mode = va_arg(args, mode_t); + va_end(args); + } + + return __sysret(sys_openat(dirfd, path, flags, mode)); +} /* * int open(const char *path, int flags[, mode_t mode]); diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c index 79c3e6a845f3..2a1629938dd6 100644 --- a/tools/testing/selftests/nolibc/nolibc-test.c +++ b/tools/testing/selftests/nolibc/nolibc-test.c @@ -1028,6 +1028,26 @@ int test_rlimit(void) return 0; } +static int test_openat(void) +{ + int dev; + int null; + + dev = openat(AT_FDCWD, "/dev", O_DIRECTORY); + if (dev < 0) + return -1; + + null = openat(dev, "null", 0); + if (null < 0) { + close(dev); + return -1; + } + + close(dev); + close(null); + + return 0; +} /* Run syscall tests between IDs <min> and <max>. * Return 0 on success, non-zero on failure. 
@@ -1116,6 +1136,7 @@ int run_syscall(int min, int max) CASE_TEST(mmap_munmap_good); EXPECT_SYSZR(1, test_mmap_munmap()); break; CASE_TEST(open_tty); EXPECT_SYSNE(1, tmp = open("/dev/null", 0), -1); if (tmp != -1) close(tmp); break; CASE_TEST(open_blah); EXPECT_SYSER(1, tmp = open("/proc/self/blah", 0), -1, ENOENT); if (tmp != -1) close(tmp); break; + CASE_TEST(openat_dir); EXPECT_SYSNE(1, test_openat(), -1); break; CASE_TEST(pipe); EXPECT_SYSZR(1, test_pipe()); break; CASE_TEST(poll_null); EXPECT_SYSZR(1, poll(NULL, 0, 0)); break; CASE_TEST(poll_stdout); EXPECT_SYSNE(1, ({ struct pollfd fds = { 1, POLLOUT, 0}; poll(&fds, 1, 0); }), -1); break; -- 2.45.2

5 months, 2 weeks


[PATCH] kunit: tool: Fix bug in parsing test plan

by Rae Moar

A bug was identified where the KTAP below caused an infinite loop: TAP version 13 ok 4 test_case 1..4 The infinite loop was caused by the parser not parsing a test plan if following a test result line. Fix bug to correctly parse test plan and add error if test plan is missing. Signed-off-by: Rae Moar <rmoar(a)google.com> --- tools/testing/kunit/kunit_parser.py | 12 +++++++----- tools/testing/kunit/kunit_tool_test.py | 5 ++--- 2 files changed, 9 insertions(+), 8 deletions(-) diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py index 29fc27e8949b..5dcbc670e1dc 100644 --- a/tools/testing/kunit/kunit_parser.py +++ b/tools/testing/kunit/kunit_parser.py @@ -761,20 +761,22 @@ def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest: test.name = "main" ktap_line = parse_ktap_header(lines, test, printer) test.log.extend(parse_diagnostic(lines)) - parse_test_plan(lines, test) + plan_line = parse_test_plan(lines, test) parent_test = True else: # If not the main test, attempt to parse a test header containing # the KTAP version line and/or subtest header line ktap_line = parse_ktap_header(lines, test, printer) subtest_line = parse_test_header(lines, test) + test.log.extend(parse_diagnostic(lines)) + plan_line = parse_test_plan(lines, test) parent_test = (ktap_line or subtest_line) if parent_test: - # If KTAP version line and/or subtest header is found, attempt - # to parse test plan and print test header - test.log.extend(parse_diagnostic(lines)) - parse_test_plan(lines, test) print_test_header(test, printer) + + if parent_test and not plan_line: + test.add_error(printer, 'missing test plan!') + expected_count = test.expected_count subtests = [] test_num = 1 diff --git a/tools/testing/kunit/kunit_tool_test.py b/tools/testing/kunit/kunit_tool_test.py index 0bcb0cc002f8..e1e142c1a850 100755 --- a/tools/testing/kunit/kunit_tool_test.py +++ b/tools/testing/kunit/kunit_tool_test.py @@ -181,8 +181,7 @@ class 
KUnitParserTest(unittest.TestCase): result = kunit_parser.parse_run_tests( kunit_parser.extract_tap_lines( file.readlines()), stdout) - # A missing test plan is not an error. - self.assertEqual(result.counts, kunit_parser.TestCounts(passed=10, errors=0)) + self.assertEqual(result.counts, kunit_parser.TestCounts(passed=10, errors=2)) self.assertEqual(kunit_parser.TestStatus.SUCCESS, result.status) def test_no_tests(self): @@ -203,7 +202,7 @@ class KUnitParserTest(unittest.TestCase): self.assertEqual( kunit_parser.TestStatus.NO_TESTS, result.subtests[0].subtests[0].status) - self.assertEqual(result.counts, kunit_parser.TestCounts(passed=1, errors=1)) + self.assertEqual(result.counts, kunit_parser.TestCounts(passed=1, errors=2)) def test_no_kunit_output(self): base-commit: 0619a4868fc1b32b07fb9ed6c69adc5e5cf4e4b2 -- 2.48.1.711.g2feabab25a-goog

5 months, 2 weeks


[PATCH v8 0/4] scanf: convert self-test to KUnit

by Tamir Duberstein

This is one of just 3 remaining "Test Module" kselftests (the others being bitmap and printf), the rest having been converted to KUnit. In addition to the enclosed patch, please consider this an RFC on the removal of the "Test Module" kselftest machinery. I tested this using: $ tools/testing/kunit/kunit.py run --arch arm64 --make_options LLVM=1 scanf Failure output before this series: [ 383.100048] test_scanf: vsscanf("1574 9 64ca 935b 7 142d ff58 0", "%4hx %1hx %4hx %4hx %1hx %4hx %4hx %1hx", ...) expected 2472240330 got 1690959881 [ 383.102843] test_scanf: vsscanf("f12:2:d:2:c166:1:36b:1906", "%3hx:%1hx:%1hx:%1hx:%4hx:%1hx:%3hx:%4hx", ...) expected 131085 got 851970 [ 383.105376] test_scanf: vsscanf("4,b2fe,3,593,6,0,3bde,0", "%1hx,%4hx,%1hx,%3hx,%1hx,%1hx,%4hx,%1hx", ...) expected 93519875 got 242430 [ 383.105659] test_scanf: vsscanf("6-1-2-1-d9e6-f-93e-e567", "%1hx-%1hx-%1hx-%1hx-%4hx-%1hx-%3hx-%4hx", ...) expected 65538 got 131073 [ 383.106127] test_scanf: vsscanf("72d6/35/e88d/1/0/6c8c/7/1", "%4hx/%2hx/%4hx/%1hx/%1hx/%4hx/%1hx/%1hx", ...) expected 125069 got 3901554741 [ 383.106235] test_scanf: vsscanf("c9bea1b8122113e9a168df573", "%4hx%4hx%1hx%4hx%4hx%1hx%4hx%3hx", ...) expected 571539457 got 106936 ... [ 383.106398] test_scanf: failed 6 out of 2545 tests Failure output after this series: # numbers_list_field_width_val_width: ASSERTION FAILED at lib/scanf_kunit.c:94 lib/scanf_kunit.c:555: vsscanf("0 1e 3e43 31f0 0 0 5797 9c70", "%1hx %2hx %4hx %4hx %1hx %1hx %4hx %4hx", ...) expected 837828163 got 1044578334 not ok 1 " " # numbers_list_field_width_val_width: ASSERTION FAILED at lib/scanf_kunit.c:94 lib/scanf_kunit.c:555: vsscanf("dc2:1c:0:3531:2621:5172:1:7", "%3hx:%2hx:%1hx:%4hx:%4hx:%4hx:%1hx:%1hx", ...) expected 892403712 got 28 not ok 2 ":" # numbers_list_field_width_val_width: ASSERTION FAILED at lib/scanf_kunit.c:94 lib/scanf_kunit.c:555: vsscanf("e083,8f6e,b,70ca,1,1,aab1,10e4", "%4hx,%4hx,%1hx,%4hx,%1hx,%1hx,%4hx,%4hx", ...) 
expected 1892286475 got 757614 not ok 3 "," # numbers_list_field_width_val_width: ASSERTION FAILED at lib/scanf_kunit.c:94 lib/scanf_kunit.c:555: vsscanf("2e72-8435-1-2fc-7cbd-c2f1-7158-2b41", "%4hx-%4hx-%1hx-%3hx-%4hx-%4hx-%4hx-%4hx", ...) expected 50069505 got 99381 not ok 4 "-" # numbers_list_field_width_val_width: ASSERTION FAILED at lib/scanf_kunit.c:94 lib/scanf_kunit.c:555: vsscanf("403/0/17/1/11e7/1/1fe8/34ba", "%3hx/%1hx/%2hx/%1hx/%4hx/%1hx/%4hx/%4hx", ...) expected 65559 got 1507328 not ok 5 "/" Signed-off-by: Tamir Duberstein <tamird(a)gmail.com> --- Changes in v8: - Expand "scanf: remove redundant debug logs" commit message. (Andy Shevchenko) - Add patch "implicate test line in failure messages". - Rebase on linux-next, move scanf_kunit.c into lib/tests/. - Link to v7: https://lore.kernel.org/r/20250211-scanf-kunit-convert-v7-0-c057f0a3d9d8@gm… Changes in v7: - Remove redundant debug logs. (Petr Mladek) - Drop Petr's Acked-by. - Use original test assertions as KUNIT_*_EQ_MSG produces hard-to-parse messages. The new failure output is: - Link to v6: https://lore.kernel.org/r/20250210-scanf-kunit-convert-v6-0-4d583d07f92d@gm… Changes in v6: - s/at boot/at runtime/ for consistency with the printf series. - Go back to kmalloc. (Geert Uytterhoeven) - Link to v5: https://lore.kernel.org/r/20250210-scanf-kunit-convert-v5-0-8e64f3a7de99@gm… Changes in v5: - Remove extraneous trailing newlines from failure messages. - Replace `pr_debug` with `kunit_printk`. - Use static char arrays instead of kmalloc. - Drop KUnit boilerplate from CONFIG_SCANF_KUNIT_TEST help text. - Drop arch changes. - Link to v4: https://lore.kernel.org/r/20250207-scanf-kunit-convert-v4-0-a23e2afaede8@gm… Changes in v4: - Bake `test` into various macros, greatly reducing diff noise. - Revert control flow changes. - Link to v3: https://lore.kernel.org/r/20250204-scanf-kunit-convert-v3-0-386d7c3ee714@gm… Changes in v3: - Reduce diff noise in lib/Makefile. 
(Petr Mladek) - Split `scanf_test` into a few test cases. New output: : =================== scanf (10 subtests) ==================== : [PASSED] numbers_simple : ====================== numbers_list ======================= : [PASSED] delim=" " : [PASSED] delim=":" : [PASSED] delim="," : [PASSED] delim="-" : [PASSED] delim="/" : ================== [PASSED] numbers_list =================== : ============ numbers_list_field_width_typemax ============= : [PASSED] delim=" " : [PASSED] delim=":" : [PASSED] delim="," : [PASSED] delim="-" : [PASSED] delim="/" : ======== [PASSED] numbers_list_field_width_typemax ========= : =========== numbers_list_field_width_val_width ============ : [PASSED] delim=" " : [PASSED] delim=":" : [PASSED] delim="," : [PASSED] delim="-" : [PASSED] delim="/" : ======= [PASSED] numbers_list_field_width_val_width ======== : [PASSED] numbers_slice : [PASSED] numbers_prefix_overflow : [PASSED] test_simple_strtoull : [PASSED] test_simple_strtoll : [PASSED] test_simple_strtoul : [PASSED] test_simple_strtol : ====================== [PASSED] scanf ====================== : ============================================================ : Testing complete. Ran 22 tests: passed: 22 : Elapsed time: 5.517s total, 0.001s configuring, 5.440s building, 0.067s running - Link to v2: https://lore.kernel.org/r/20250203-scanf-kunit-convert-v2-1-277a618d804e@gm… Changes in v2: - Rename lib/{test_scanf.c => scanf_kunit.c}. 
(Andy Shevchenko) - Link to v1: https://lore.kernel.org/r/20250131-scanf-kunit-convert-v1-1-0976524f0eba@gm… --- Tamir Duberstein (4): scanf: implicate test line in failure messages scanf: remove redundant debug logs scanf: convert self-test to KUnit scanf: break kunit into test cases MAINTAINERS | 2 +- lib/Kconfig.debug | 12 +- lib/Makefile | 1 - lib/tests/Makefile | 1 + lib/{test_scanf.c => tests/scanf_kunit.c} | 299 +++++++++++++++--------------- tools/testing/selftests/lib/Makefile | 2 +- tools/testing/selftests/lib/config | 1 - tools/testing/selftests/lib/scanf.sh | 4 - 8 files changed, 160 insertions(+), 162 deletions(-) --- base-commit: 7b7a883c7f4de1ee5040bd1c32aabaafde54d209 change-id: 20250131-scanf-kunit-convert-f70dc33bb34c Best regards, -- Tamir Duberstein <tamird(a)gmail.com>

5 months, 2 weeks


[PATCH net-next v7 0/6] tun: Introduce virtio-net hashing feature

by Akihiko Odaki

virtio-net has two uses for hashes: one is RSS and the other is hash
reporting. Conventionally the hash calculation was done by the VMM.
However, computing the hash after the queue was chosen defeats the
purpose of RSS.

Another approach is to use an eBPF steering program. This approach has
another downside: it cannot report the calculated hash due to the
restrictive nature of eBPF.

Introduce the code to compute hashes in the kernel in order to overcome
these challenges.

An alternative solution is to extend the eBPF steering program so that it
will be able to report to the userspace, but it is based on context
rewrites, which is in feature freeze. We can adopt kfuncs, but they will
not be UAPIs. We opt for ioctl to align with other relevant UAPIs (KVM
and vhost_net).

The patches for QEMU to use this new feature were submitted as an RFC and
are available at:
https://patchew.org/QEMU/20240915-hash-v3-0-79cb08d28647@daynix.com/

This work was presented at LPC 2024:
https://lpc.events/event/18/contributions/1963/

V1 -> V2:
  Changed to introduce a new BPF program type.

Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com>
---
Changes in v7:
- Ensured to set hash_report to VIRTIO_NET_HASH_REPORT_NONE for
  VHOST_NET_F_VIRTIO_NET_HDR.
- s/4/sizeof(u32)/ in patch "virtio_net: Add functions for hashing".
- Added tap_skb_cb type.
- Rebased.
- Link to v6: https://lore.kernel.org/r/20250109-rss-v6-0-b1c90ad708f6@daynix.com

Changes in v6:
- Extracted changes to fill vnet header holes into another series.
- Squashed patches "skbuff: Introduce SKB_EXT_TUN_VNET_HASH", "tun:
  Introduce virtio-net hash reporting feature", and "tun: Introduce
  virtio-net RSS" into patch "tun: Introduce virtio-net hash feature".
- Dropped the RFC tag.
- Link to v5: https://lore.kernel.org/r/20241008-rss-v5-0-f3cf68df005d@daynix.com

Changes in v5:
- Fixed a compilation error with CONFIG_TUN_VNET_CROSS_LE.
- Optimized the calculation of the hash value according to: https://git.dpdk.org/dpdk/commit/?id=3fb1ea032bd6ff8317af5dac9af901f1f324ca… - Added patch "tun: Unify vnet implementation". - Dropped patch "tap: Pad virtio header with zero". - Added patch "selftest: tun: Test vnet ioctls without device". - Reworked selftests to skip for older kernels. - Documented the case when the underlying device is deleted and packets have queue_mapping set by TC. - Reordered test harness arguments. - Added code to handle fragmented packets. - Link to v4: https://lore.kernel.org/r/20240924-rss-v4-0-84e932ec0e6c@daynix.com Changes in v4: - Moved tun_vnet_hash_ext to if_tun.h. - Renamed virtio_net_toeplitz() to virtio_net_toeplitz_calc(). - Replaced htons() with cpu_to_be16(). - Changed virtio_net_hash_rss() to return void. - Reordered variable declarations in virtio_net_hash_rss(). - Removed virtio_net_hdr_v1_hash_from_skb(). - Updated messages of "tap: Pad virtio header with zero" and "tun: Pad virtio header with zero". - Fixed vnet_hash allocation size. - Ensured to free vnet_hash when destructing tun_struct. - Link to v3: https://lore.kernel.org/r/20240915-rss-v3-0-c630015db082@daynix.com Changes in v3: - Reverted back to add ioctl. - Split patch "tun: Introduce virtio-net hashing feature" into "tun: Introduce virtio-net hash reporting feature" and "tun: Introduce virtio-net RSS". - Changed to reuse hash values computed for automq instead of performing RSS hashing when hash reporting is requested but RSS is not. - Extracted relevant data from struct tun_struct to keep it minimal. - Added kernel-doc. - Changed to allow calling TUNGETVNETHASHCAP before TUNSETIFF. - Initialized num_buffers with 1. - Added a test case for unclassified packets. - Fixed error handling in tests. - Changed tests to verify that the queue index will not overflow. - Rebased. 
- Link to v2: https://lore.kernel.org/r/20231015141644.260646-1-akihiko.odaki@daynix.com --- Akihiko Odaki (6): virtio_net: Add functions for hashing net: flow_dissector: Export flow_keys_dissector_symmetric tun: Introduce virtio-net hash feature selftest: tun: Test vnet ioctls without device selftest: tun: Add tests for virtio-net hashing vhost/net: Support VIRTIO_NET_F_HASH_REPORT Documentation/networking/tuntap.rst | 7 + drivers/net/Kconfig | 1 + drivers/net/tap.c | 62 +++- drivers/net/tun.c | 89 ++++- drivers/net/tun_vnet.h | 180 +++++++++- drivers/vhost/net.c | 49 +-- include/linux/if_tap.h | 2 + include/linux/skbuff.h | 3 + include/linux/virtio_net.h | 188 +++++++++++ include/net/flow_dissector.h | 1 + include/uapi/linux/if_tun.h | 75 +++++ net/core/flow_dissector.c | 3 +- net/core/skbuff.c | 4 + tools/testing/selftests/net/Makefile | 2 +- tools/testing/selftests/net/tun.c | 627 ++++++++++++++++++++++++++++++++++- 15 files changed, 1231 insertions(+), 62 deletions(-) --- base-commit: dd83757f6e686a2188997cb58b5975f744bb7786 change-id: 20240403-rss-e737d89efa77 prerequisite-change-id: 20241230-tun-66e10a49b0c7:v6 prerequisite-patch-id: 871dc5f146fb6b0e3ec8612971a8e8190472c0fb prerequisite-patch-id: 2797ed249d32590321f088373d4055ff3f430a0e prerequisite-patch-id: ea3370c72d4904e2f0536ec76ba5d26784c0cede prerequisite-patch-id: 837e4cf5d6b451424f9b1639455e83a260c4440d prerequisite-patch-id: ea701076f57819e844f5a35efe5cbc5712d3080d prerequisite-patch-id: 701646fb43ad04cc64dd2bf13c150ccbe6f828ce prerequisite-patch-id: 53176dae0c003f5b6c114d43f936cf7140d31bb5 prerequisite-change-id: 20250116-buffers-96e14bf023fc:v2 prerequisite-patch-id: 25fd4f99d4236a05a5ef16ab79f3e85ee57e21cc Best regards, -- Akihiko Odaki <akihiko.odaki(a)daynix.com>
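
To make the cover letter's point about hashing before queue selection concrete, here is a rough userspace sketch of a Toeplitz-style hash followed by a table lookup. It is not the code from this series: toeplitz_hash and pick_queue are hypothetical names, the modulo-based table lookup is an assumption for illustration, and the kernel's virtio_net_toeplitz_calc() may be structured quite differently.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative Toeplitz hash (hypothetical helper, not the kernel's).
 * For every set bit of the input, MSB first, XOR in the current 32-bit
 * window of the key, then slide the window one bit to the left.
 * Key bits beyond key_len are treated as zero.
 */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
			      const uint8_t *data, size_t len)
{
	uint32_t hash = 0;
	uint32_t window;
	size_t i;
	int bit;

	if (key_len < 4)
		return 0;

	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Shift in the next key bit from the right. */
			window <<= 1;
			if (i + 4 < key_len && (key[i + 4] & (1u << bit)))
				window |= 1;
		}
	}

	return hash;
}

/*
 * Hypothetical queue selection: the hash indexes an indirection table,
 * which is why it has to be known before a queue is chosen.
 * table_len must be non-zero.
 */
static uint16_t pick_queue(uint32_t hash, const uint16_t *table,
			   size_t table_len)
{
	return table[hash % table_len];
}
```

The hash is linear over XOR (hashing a ^ b gives hash(a) ^ hash(b)), which is a handy sanity check for any implementation of this scheme.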

5 months, 2 weeks


Re: [brauner-github:vfs.all 205/231] WARNING: modpost: vmlinux: section mismatch in reference: initramfs_test_cases+0x0 (section: .data) -> initramfs_test_extract (section: .init.text)

by David Disseldorp

[cc'ing linux-kselftest and kunit-dev]

Hi,

On Wed, 5 Mar 2025 01:47:55 +0800, kernel test robot wrote:

> tree: https://github.com/brauner/linux.git vfs.all
> head: ea47e99a3a234837d5fea0d1a20bb2ad1eaa6dd4
> commit: b6736cfccb582b7c016cba6cd484fbcf30d499af [205/231] initramfs_test: kunit tests for initramfs unpacking
> config: x86_64-buildonly-randconfig-002-20250304 (https://download.01.org/0day-ci/archive/20250305/202503050109.t5Ab93hX-lkp@…)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250305/202503050109.t5Ab93hX-lkp@…)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp(a)intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202503050109.t5Ab93hX-lkp@intel.com/
>
> All warnings (new ones prefixed by >>, old ones prefixed by <<):
>
> >> WARNING: modpost: vmlinux: section mismatch in reference: initramfs_test_cases+0x0 (section: .data) -> initramfs_test_extract (section: .init.text)
> >> WARNING: modpost: vmlinux: section mismatch in reference: initramfs_test_cases+0x30 (section: .data) -> initramfs_test_fname_overrun (section: .init.text)
> >> WARNING: modpost: vmlinux: section mismatch in reference: initramfs_test_cases+0x60 (section: .data) -> initramfs_test_data (section: .init.text)
> >> WARNING: modpost: vmlinux: section mismatch in reference: initramfs_test_cases+0x90 (section: .data) -> initramfs_test_csum (section: .init.text)
> >> WARNING: modpost: vmlinux: section mismatch in reference: initramfs_test_cases+0xc0 (section: .data) -> initramfs_test_hardlink (section: .init.text)
> >> WARNING: modpost: vmlinux: section mismatch in reference: initramfs_test_cases+0xf0 (section: .data) -> initramfs_test_many (section: .init.text)

These new warnings are covered in the commit message. The
kunit_test_init_section_suites() registered tests aren't in the .init
section as debugfs entries are retained for results reporting (without
an ability to rerun them).

IIUC, the __kunit_init_test_suites->CONCATENATE(..., _probe) suffix is
intended to suppress the modpost warning - @kunit-dev: any ideas why
this isn't working as intended?

Thanks, David

5 months, 3 weeks


[PATCH] selftests: riscv: fix v_exec_initval_nolibc.c

by Ignacio Encinas

Vector registers are zero initialized by the kernel. Stop accepting
"all ones" as a clean value.

Note that this was not working as expected given that
value == 0xff
can be assumed to be always false by the compiler as value's range is
[-128, 127]. Both GCC (-Wtype-limits) and clang
(-Wtautological-constant-out-of-range-compare) warn about this.

Signed-off-by: Ignacio Encinas <ignacio(a)iencinas.com>
---
I tried looking into why "all ones" was previously deemed a "clean" value
but couldn't find any information. It looks like the kernel always
zero-initializes the vector registers.

If "all ones" is still acceptable for any reason, my intention is to spin a v2
changing the types of `value` and `prev_value` to unsigned char.
---
 tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c b/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c
index 35c0812e32de0c82a54f84bd52c4272507121e35..b712c4d258a6cb045aa96de4a75299714866f5e6 100644
--- a/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c
+++ b/tools/testing/selftests/riscv/vector/v_exec_initval_nolibc.c
@@ -6,7 +6,7 @@
  * the values. To further ensure consistency, this file is compiled without
  * libc and without auto-vectorization.
  *
- * To be "clean" all values must be either all ones or all zeroes.
+ * To be "clean" all values must be all zeroes.
  */
 
 #define __stringify_1(x...) #x
@@ -46,7 +46,7 @@ int main(int argc, char **argv)
 			: "=r" (value));				\
 		if (first) {						\
 			first = 0;					\
-		} else if (value != prev_value || !(value == 0x00 || value == 0xff)) { \
+		} else if (value != prev_value || value != 0x00) {	\
 			printf("Register " __stringify(register)	\
 				" values not clean! value: %u\n", value); \
 			exit(-1);					\

---
base-commit: 03d38806a902b36bf364cae8de6f1183c0a35a67
change-id: 20250301-fix-v_exec_initval_nolibc-498d976c372d

Best regards,
-- 
Ignacio Encinas <ignacio(a)iencinas.com>
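
The type-range issue described in this commit message can be reproduced with a small sketch. The helper names below are hypothetical, and the signed variant goes through an explicit int on purpose so that the comparison compiles without the warnings mentioned above; the effect is the same as comparing the signed value directly.

```c
/* Hypothetical helpers; they only illustrate the integer promotion involved. */

/*
 * A signed 8-bit value promotes to an int in [-128, 127], while 0xff is
 * the int 255, so this comparison can never be true. A register holding
 * "all ones" reads back as -1 here, not 255.
 */
static int equals_0xff_signed(signed char value)
{
	int promoted = value;

	return promoted == 0xff;
}

/* With an unsigned type the range is [0, 255] and the check can succeed. */
static int equals_0xff_unsigned(unsigned char value)
{
	return value == 0xff;
}
```

This is why the old "all ones" branch was effectively dead code with a signed `value`, and why the alternative floated in the notes (switching `value` and `prev_value` to unsigned char) would have made it reachable.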

5 months, 3 weeks


[PATCH v4 0/8] initramfs: kunit tests and cleanups

by David Disseldorp

This patchset adds basic kunit test coverage for initramfs unpacking and cleans up some minor buffer handling issues / inefficiencies.

Changes since v3:
- Drop shared unpack buffer changes
  + rework into initramfs: allocate heap buffers together (patch 5/8)
  + extra review complexity wasn't worth the tiny boot-time heap saving
- move hardlink hash leak repro into first initramfs_test patch
- add note regarding kunit section=.data -> section=.init.text warning

Changes since v2 (patch 2 only):
- fix !CONFIG_INITRAMFS_PRESERVE_MTIME kunit test checks
- add test MODULE_DESCRIPTION(), as suggested by Jeff Johnson
- add some missing headers, reported by kernel test robot

Changes since v1 (RFC):
- rebase atop v6.12-rc6 and filename field overrun fix from
  https://lore.kernel.org/r/20241030035509.20194-2-ddiss@suse.de
- add unit test coverage (new patches 1 and 2)
- add patch: fix hardlink hash leak without TRAILER
- rework patch: avoid static buffer for error message
  + drop unnecessary message propagation
- drop patch: cpio_buf reuse for built-in and bootloader initramfs
  + no good justification for the change

Feedback appreciated.

David Disseldorp (8):
  init: add initramfs_internal.h
  initramfs_test: kunit tests for initramfs unpacking
  vsprintf: add simple_strntoul
  initramfs: avoid memcpy for hex header fields
  initramfs: allocate heap buffers together
  initramfs: reuse name_len for dir mtime tracking
  initramfs: fix hardlink hash leak without TRAILER
  initramfs: avoid static buffer for error message

 include/linux/kstrtox.h   |   1 +
 init/.kunitconfig         |   3 +
 init/Kconfig              |   7 +
 init/Makefile             |   1 +
 init/initramfs.c          |  66 ++++----
 init/initramfs_internal.h |   8 +
 init/initramfs_test.c     | 407 ++++++++++++++++++++++++++++++++++++++++++++++
 lib/vsprintf.c            |   7 +
 8 files changed, 472 insertions(+), 28 deletions(-)
 create mode 100644 init/.kunitconfig
 create mode 100644 init/initramfs_internal.h
 create mode 100644 init/initramfs_test.c

5 months, 3 weeks


[PATCH] selftests: Override command line in lib.mk

by Akihiko Odaki

Documentation/dev-tools/kselftest.rst says you can use the "TARGETS" variable on the make command line to run only tests targeted for a single subsystem:

  $ make TARGETS="size timers" kselftest

A natural way to narrow down further to a particular test in a subsystem is to specify e.g., TEST_GEN_PROGS:

  $ make TARGETS=net TEST_PROGS= TEST_GEN_PROGS=tun kselftest

However, this does not work well because the following statement in tools/testing/selftests/lib.mk gets ignored:

  TEST_GEN_PROGS := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS))

Add the override directive so that this statement and similar ones take effect even when TEST_GEN_PROGS and similar variables are specified on the command line.

Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com>
---
 tools/testing/selftests/lib.mk | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index d6edcfcb5be832ddee4c3d34b5ad221e9295f878..68116e51f97d62376c63f727ba3fd1f616c67562 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -93,9 +93,9 @@ TOOLS_INCLUDES := -isystem $(top_srcdir)/tools/include/uapi
 # TEST_PROGS are for test shell scripts.
 # TEST_CUSTOM_PROGS and TEST_PROGS will be run by common run_tests
 # and install targets. Common clean doesn't touch them.
-TEST_GEN_PROGS := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS))
-TEST_GEN_PROGS_EXTENDED := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS_EXTENDED))
-TEST_GEN_FILES := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_FILES))
+override TEST_GEN_PROGS := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS))
+override TEST_GEN_PROGS_EXTENDED := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS_EXTENDED))
+override TEST_GEN_FILES := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_FILES))
 
 all: $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) \
 	$(if $(TEST_GEN_MODS_DIR),gen_mods_dir)

---
base-commit: dd83757f6e686a2188997cb58b5975f744bb7786
change-id: 20250306-lib-4ac9711c10a2

Best regards,
--
Akihiko Odaki <akihiko.odaki(a)daynix.com>

5 months, 3 weeks


[PATCHv3 net 0/2] bonding: fix incorrect mac address setting

by Hangbin Liu

The MAC address on the backup slave should be converted from the Solicited-Node Multicast address, not from the bonding unicast target address.

v3: also fix the mac setting for slave_set_ns_maddr. (Jay)
    Add function description for slave_set_ns_maddr/slave_set_ns_maddrs (Jay)
v2: fix patch 01's subject

Hangbin Liu (2):
  bonding: fix incorrect MAC address setting to receive NS messages
  selftests: bonding: fix incorrect mac address

 drivers/net/bonding/bond_options.c                 | 55 ++++++++++++++++---
 .../drivers/net/bonding/bond_options.sh            |  4 +-
 2 files changed, 49 insertions(+), 10 deletions(-)

--
2.46.0

5 months, 3 weeks


[PATCH bpf-next v3] selftests/Makefile: override the srctree for out-of-tree builds

by Li Zhijian

Fixes an issue where out-of-tree kselftest builds fail when building the BPF and bpftools components. The failure occurs because the top-level Makefile passes a relative srctree path to its sub-Makefiles, which leads to errors in locating necessary files.

For example, the following error is encountered:

```
$ make V=1 O=$build/ TARGETS=hid kselftest-all
...
make -C ../tools/testing/selftests all
make[4]: Entering directory '/path/to/linux/tools/testing/selftests/hid'
make -C /path/to/linux/tools/testing/selftests/../../../tools/lib/bpf OUTPUT=/path/to/linux/O/kselftest/hid/tools/build/libbpf/ \
    EXTRA_CFLAGS='-g -O0' \
    DESTDIR=/path/to/linux/O/kselftest/hid/tools prefix= all install_headers
make[5]: Entering directory '/path/to/linux/tools/lib/bpf'
...
make[5]: Entering directory '/path/to/linux/tools/bpf/bpftool'
Makefile:127: ../tools/build/Makefile.feature: No such file or directory
make[5]: *** No rule to make target '../tools/build/Makefile.feature'. Stop.
```

To resolve this, override the srctree in the kselftests's top Makefile when performing an out-of-tree build. This ensures that all sub-Makefiles have the correct path to the source tree, preventing directory resolution errors.

Cc: Andrii Nakryiko <andrii.nakryiko(a)gmail.com>
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
Tested-by: Quentin Monnet <qmo(a)kernel.org>
---
Cc: Masahiro Yamada <masahiroy(a)kernel.org>

V3: collected Tested-by and rebased on bpf-next
V2:
- handle srctree in selftests itself rather than the linux' top Makefile # Masahiro Yamada <masahiroy(a)kernel.org>
V1: https://lore.kernel.org/lkml/20241217031052.69744-1-lizhijian@fujitsu.com/
---
 tools/testing/selftests/Makefile | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 2401e973c359..f04a3b0003f6 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -154,15 +154,19 @@ override LDFLAGS =
 override MAKEFLAGS =
 endif
 
+top_srcdir ?= ../../..
+
 # Append kselftest to KBUILD_OUTPUT and O to avoid cluttering
 # KBUILD_OUTPUT with selftest objects and headers installed
 # by selftests Makefile or lib.mk.
+# Override the `srctree` variable to ensure it is correctly resolved in
+# sub-Makefiles, such as those within `bpf`, when managing targets like
+# `net` and `hid`.
 ifdef building_out_of_srctree
 override LDFLAGS =
+override srctree := $(top_srcdir)
 endif
 
-top_srcdir ?= ../../..
-
 ifeq ("$(origin O)", "command line")
   KBUILD_OUTPUT := $(O)
 endif
--
2.44.0

5 months, 3 weeks


[PATCH v9 0/7] mseal system mappings

by jeffxu＠chromium.org

From: Jeff Xu <jeffxu(a)chromium.org>

This is V9 version, addressing comments from V8, without code logic change.

-------------------------------------------------------------------

As discussed during mseal() upstream process [1], mseal() protects the VMAs of a given virtual memory range against modifications, such as the read/write (RW) and no-execute (NX) bits. For complete descriptions of memory sealing, please see mseal.rst [2].

The mseal() is useful to mitigate memory corruption issues where a corrupted pointer is passed to a memory management system. For example, such an attacker primitive can break control-flow integrity guarantees since read-only memory that is supposed to be trusted can become writable or .text pages can get remapped.

The system mappings are read-only, and memory sealing can protect them from ever being changed to writable or being unmapped/remapped with different attributes.

System mappings such as vdso, vvar, vvar_vclock, vectors (arm compat-mode), sigpage (arm compat-mode), are created by the kernel during program initialization, and could be sealed after creation.

Unlike the aforementioned mappings, the uprobe mapping is not established during program startup. However, its lifetime is the same as the process's lifetime [3]. It could be sealed from creation.

The vsyscall on x86-64 uses a special address (0xffffffffff600000), which is outside the mm managed range. This means mprotect, munmap, and mremap won't work on the vsyscall. Since sealing doesn't enhance the vsyscall's security, it is skipped in this patch. If we ever seal the vsyscall, it is probably only for decorative purpose, i.e. showing the 'sl' flag in the /proc/pid/smaps. For this patch, it is ignored.

It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may alter the system mappings during restore operations. UML (User Mode Linux), gVisor, and rr are also known to change the vdso/vvar mappings. Consequently, this feature cannot be universally enabled across all systems.
As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default.

To support mseal of system mappings, architectures must define CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS and update their special mappings calls to pass mseal flag. Additionally, architectures must confirm they do not unmap/remap system mappings during the process lifetime. The existence of this flag for an architecture implies that it does not require the remapping of these system mappings during process lifetime, so sealing these mappings is safe from a kernel perspective.

This version covers the x86-64 and arm64 architectures as a minimum viable feature.

While no specific CPU hardware features are required to enable this feature on an architecture, memory sealing requires a 64-bit kernel. Other architectures can choose whether or not to adopt this feature. Currently, I'm not aware of any instances in the kernel code that actively munmap/mremap a system mapping without a request from userspace. The PPC does call munmap when _install_special_mapping fails for vdso; however, it's uncertain if this will ever fail for PPC - this needs to be investigated by PPC in the future [4]. The UML kernel can add this support when KUnit tests require it [5].

In this version, we've improved the handling of system mapping sealing from previous versions. Instead of modifying the _install_special_mapping function itself, which would affect all architectures, we now call _install_special_mapping with a sealing flag only within the specific architecture that requires it. This targeted approach offers two key advantages: 1) It limits the code change's impact to the necessary architectures, and 2) It aligns with the software architecture by keeping the core memory management within the mm layer, while delegating the decision of sealing system mappings to the individual architecture, which is particularly relevant since 32-bit architectures never require sealing.
Prior to this patch series, we explored sealing special mappings from userspace using glibc's dynamic linker. This approach revealed several issues:
- The PT_LOAD header may report an incorrect length for vdso (smaller than its actual size). The dynamic linker, which relies on PT_LOAD information to determine mapping size, would then split and partially seal the vdso mapping. Since each architecture has its own vdso/vvar code, fixing this in the kernel would require going through each architecture. Our initial goal was to enable sealing readonly mappings, e.g. .text, across all architectures; sealing vdso from the kernel at creation appears to be simpler than sealing vdso in glibc.
- The [vvar] mapping header only contains address information, not length information. Similar issues might exist for other special mappings.
- Mappings like uprobe are not covered by the dynamic linker, and there is no effective solution for them.

This feature's security enhancements will benefit ChromeOS, Android, and other high security systems.

Testing:
This feature was tested on ChromeOS and Android for both x86-64 and ARM64.
- Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly, i.e. "sl" shown in the smaps for those mappings, and mremap is blocked.
- Passing various automation tests (e.g. pre-checkin) on ChromeOS and Android to ensure the sealing doesn't affect the functionality of Chromebook and Android phone.

I also tested the feature on Ubuntu on x86-64:
- With config disabled, vdso/vvar is not sealed,
- with config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK, normal operations such as browsing the web, open/edit doc are OK.
Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1]
Link: Documentation/userspace-api/mseal.rst [2]
Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZx… [3]
Link: https://lore.kernel.org/all/CABi2SkV6JJwJeviDLsq9N4ONvQ=EFANsiWkgiEOjyT9TQS… [4]
Link: https://lore.kernel.org/all/202502251035.239B85A93@keescook/ [5]

-------------------------------------------
History:

V9:
- Add negative test in selftest (Kees Cook)
- fix typos in text (Kees Cook)

V8:
- Change ARCH_SUPPORTS_MSEAL_X to ARCH_SUPPORTS_MSEAL_X (Liam R. Howlett)
- Update comments in Kconfig and mseal.rst (Lorenzo Stoakes, Liam R. Howlett)
- Change patch header prefix to "mseal sysmap" (Lorenzo Stoakes)
- Remove "vm_flags =" (Kees Cook, Liam R. Howlett, Oleg Nesterov)
- Drop uml architecture (Lorenzo Stoakes, Kees Cook)
- Add a selftest to verify system mappings are sealed (Lorenzo Stoakes)

V7:
https://lore.kernel.org/all/20250224225246.3712295-1-jeffxu@google.com/
- Remove cover letter from the first patch (Liam R. Howlett)
- Change macro name to VM_SEALED_SYSMAP (Liam R. Howlett)
- logging and fclose() in selftest (Liam R. Howlett)

V6:
https://lore.kernel.org/all/20250224174513.3600914-1-jeffxu@google.com/
- mseal.rst: fix a typo (Randy Dunlap)
- security/Kconfig: add rr into note (Liam R. Howlett)
- remove mseal_system_mappings() and use macro instead (Liam R. Howlett)
- mseal.rst: add incompatible userland software (Lorenzo Stoakes)
- remove RFC from title (Kees Cook)

V5:
https://lore.kernel.org/all/20250212032155.1276806-1-jeffxu@google.com/
- Remove kernel cmd line (Lorenzo Stoakes)
- Add test info (Lorenzo Stoakes)
- Add threat model info (Lorenzo Stoakes)
- Fix x86 selftest: test_mremap_vdso
- Restrict code change to ARM64/x86-64/UM arch only.
- Add userprocess.h to include seal_system_mapping().
- Remove sealing vsyscall.
- Split the patch.

V4:
https://lore.kernel.org/all/20241125202021.3684919-1-jeffxu@google.com/
- ARCH_HAS_SEAL_SYSTEM_MAPPINGS (Lorenzo Stoakes)
- test info (Lorenzo Stoakes)
- Update mseal.rst (Liam R. Howlett)
- Update test_mremap_vdso.c (Liam R. Howlett)
- Misc. style, comments, doc update (Liam R. Howlett)

V3:
https://lore.kernel.org/all/20241113191602.3541870-1-jeffxu@google.com/
- Revert uprobe to v1 logic (Oleg Nesterov)
- use CONFIG_SEAL_SYSTEM_MAPPINGS instead of _ALWAYS/_NEVER (Kees Cook)
- Move kernel cmd line from fs/exec.c to mm/mseal.c and misc. (Liam R. Howlett)

V2:
https://lore.kernel.org/all/20241014215022.68530-1-jeffxu@google.com/
- Seal uprobe always (Oleg Nesterov)
- Update comments and description (Randy Dunlap, Liam R. Howlett, Oleg Nesterov)
- Rebase to linux_main

V1:
- https://lore.kernel.org/all/20241004163155.3493183-1-jeffxu@google.com/

--------------------------------------------------

Jeff Xu (7):
  mseal sysmap: kernel config and header change
  selftests: x86: test_mremap_vdso: skip if vdso is msealed
  mseal sysmap: enable x86-64
  mseal sysmap: enable arm64
  mseal sysmap: uprobe mapping
  mseal sysmap: update mseal.rst
  selftest: test system mappings are sealed.

 Documentation/userspace-api/mseal.rst         |  20 +++
 arch/arm64/Kconfig                            |   1 +
 arch/arm64/kernel/vdso.c                      |  12 +-
 arch/x86/Kconfig                              |   1 +
 arch/x86/entry/vdso/vma.c                     |   7 +-
 include/linux/mm.h                            |  10 ++
 init/Kconfig                                  |  22 ++++
 kernel/events/uprobes.c                       |   3 +-
 security/Kconfig                              |  21 ++++
 tools/testing/selftests/Makefile              |   1 +
 .../mseal_system_mappings/.gitignore          |   2 +
 .../selftests/mseal_system_mappings/Makefile  |   6 +
 .../selftests/mseal_system_mappings/config    |   1 +
 .../mseal_system_mappings/sysmap_is_sealed.c  | 119 ++++++++++++++++++
 .../testing/selftests/x86/test_mremap_vdso.c  |  43 +++++++
 15 files changed, 261 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/mseal_system_mappings/.gitignore
 create mode 100644 tools/testing/selftests/mseal_system_mappings/Makefile
 create mode 100644 tools/testing/selftests/mseal_system_mappings/config
 create mode 100644 tools/testing/selftests/mseal_system_mappings/sysmap_is_sealed.c

--
2.48.1.711.g2feabab25a-goog

5 months, 3 weeks


[PATCH 0/2] kselftest/arm64: mte: Minor fixes to the MTE hugetlb test

by Catalin Marinas

The first patch makes use of the correct terminology for synchronous and asynchronous errors. The second patch checks whether PROT_MTE is supported on hugetlb mappings before continuing with the tests. Such support was added in 6.13 but people tend to use current kselftests on older kernels. Avoid the failure reporting on such kernels, just skip the tests.

Catalin Marinas (2):
  kselftest/arm64: mte: Use the correct naming for tag check modes in check_hugetlb_options.c
  kselftest/arm64: mte: Skip the hugetlb tests if MTE not supported on such mappings

 .../arm64/mte/check_hugetlb_options.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

5 months, 3 weeks


Re: [PATCH v8 3/4] scanf: convert self-test to KUnit

by Petr Mladek

On Sat 2025-02-15 14:52:22, Tamir Duberstein wrote:
> On Sat, Feb 15, 2025 at 1:51 PM kernel test robot <lkp(a)intel.com> wrote:
> >
> > Hi Tamir,
> >
> > kernel test robot noticed the following build warnings:
> >
> > [auto build test WARNING on 7b7a883c7f4de1ee5040bd1c32aabaafde54d209]
> >
> > url: https://github.com/intel-lab-lkp/linux/commits/Tamir-Duberstein/scanf-impli…
> > base: 7b7a883c7f4de1ee5040bd1c32aabaafde54d209
> > patch link: https://lore.kernel.org/r/20250214-scanf-kunit-convert-v8-3-5ea50f95f83c%40…
> > patch subject: [PATCH v8 3/4] scanf: convert self-test to KUnit
> > config: sh-randconfig-002-20250216 (https://download.01.org/0day-ci/archive/20250216/202502160245.KUrryBJR-lkp@…)
> > compiler: sh4-linux-gcc (GCC) 14.2.0
> > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250216/202502160245.KUrryBJR-lkp@…)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <lkp(a)intel.com>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/202502160245.KUrryBJR-lkp@intel.com/
> >
> > All warnings (new ones prefixed by >>):
> >
> > In file included from <command-line>:
> > lib/tests/scanf_kunit.c: In function 'numbers_list_ll':
> > >> include/linux/compiler.h:197:61: warning: function 'numbers_list_ll' might be a candidate for 'gnu_scanf' format attribute [-Wsuggest-attribute=format]
>
> I am not able to reproduce these warnings with clang 19.1.7. They also
> don't obviously make sense to me.

I have reproduced the problem with gcc:

$> gcc --version
gcc (SUSE Linux) 14.2.1 20250220 [revision 9ffecde121af883b60bbe60d00425036bc873048]

$> make W=1 lib/test_scanf.ko
  CALL    scripts/checksyscalls.sh
  DESCEND objtool
  INSTALL libsubcmd_headers
  CC [M]  lib/test_scanf.o
In file included from <command-line>:
lib/test_scanf.c: In function ‘numbers_list_ll’:
./include/linux/compiler.h:197:61: warning: function ‘numbers_list_ll’ might be a candidate for ‘gnu_scanf’ format attribute [-Wsuggest-attribute=format]
  197 | #define __BUILD_BUG_ON_ZERO_MSG(e, msg) ((int)sizeof(struct {_Static_assert(!(e), msg);}))
      |                                                             ^
[...]

It seems that it is a regression introduced by the first patch of this patch set. And the fix is:

diff --git a/lib/test_scanf.c b/lib/test_scanf.c
index d1664e0d0138..e65b10c3dc11 100644
--- a/lib/test_scanf.c
+++ b/lib/test_scanf.c
@@ -27,7 +27,7 @@ static struct rnd_state rnd_state __initdata;
 typedef int (*check_fn)(const char *file, const int line, const void *check_data,
 			const char *string, const char *fmt, int n_args, va_list ap);
 
-static void __scanf(6, 0) __init
+static void __scanf(6, 8) __init
 _test(const char *file, const int line, check_fn fn, const void *check_data,
 	const char *string, const char *fmt, int n_args, ...)
 {

Best Regards,
Petr

5 months, 3 weeks


[PATCH 00/32] kselftest harness and nolibc compatibility

by Thomas Weißschuh

Nolibc is useful for selftests as the test programs can be very small, and compiled with just a kernel crosscompiler, without userspace support. Currently nolibc is only usable with kselftest.h, not the more convenient to use kselftest_harness.h. This series provides this compatibility by adding new features to nolibc and removing the usage of problematic features from the harness.

The first half of the series are changes to the harness, the second one are for nolibc. Both parts are very independent and can go through different trees. The last patch is not meant to be applied and serves as test that everything works correctly.

Based on the next branch of the nolibc tree:
https://web.git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc.git…

Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
---
Thomas Weißschuh (32):
  selftests: harness: Add harness selftest
  selftests: harness: Use C89 comment style
  selftests: harness: Ignore unused variant argument warning
  selftests: harness: Mark functions without prototypes static
  selftests: harness: Remove inline qualifier for wrappers
  selftests: harness: Guard includes on nolibc
  selftests: harness: Remove dependency on libatomic
  selftests: harness: Implement test timeouts through pidfd
  selftests: harness: Don't set setup_completed for fixtureless tests
  selftests: harness: Always provide "self" and "variant"
  selftests: harness: Move teardown conditional into test metadata
  selftests: harness: Add teardown callback to test metadata
  selftests: harness: Stop using setjmp()/longjmp()
  tools/nolibc: handle intmax_t/uintmax_t in printf
  tools/nolibc: use intmax definitions from compiler
  tools/nolibc: use pselect6_time64 if available
  tools/nolibc: use ppoll_time64 if available
  tools/nolibc: add tolower() and toupper()
  tools/nolibc: add _exit()
  tools/nolibc: add setpgrp()
  tools/nolibc: implement waitpid() in terms of waitid()
  Revert "selftests/nolibc: use waitid() over waitpid()"
  tools/nolibc: add dprintf() and vdprintf()
  tools/nolibc: add getopt()
  tools/nolibc: allow different write callbacks in printf
  tools/nolibc: allow limiting of printf destination size
  tools/nolibc: add snprintf() and friends
  selftests/nolibc: use snprintf() for printf tests
  selftests/nolibc: rename vfprintf test suite
  selftests/nolibc: add test for snprintf() truncation
  tools/nolibc: implement width padding in printf()
  HACK: selftests/nolibc: demonstrate usage of the kselftest harness

 tools/include/nolibc/Makefile                      |    1 +
 tools/include/nolibc/getopt.h                      |  105 ++
 tools/include/nolibc/nolibc.h                      |    1 +
 tools/include/nolibc/stdint.h                      |    4 +-
 tools/include/nolibc/stdio.h                       |  127 +-
 tools/include/nolibc/string.h                      |   17 +
 tools/include/nolibc/sys.h                         |  102 +-
 tools/testing/selftests/Makefile                   |    1 +
 tools/testing/selftests/kselftest/.gitignore       |    1 +
 tools/testing/selftests/kselftest/Makefile         |    6 +
 .../testing/selftests/kselftest/harness-selftest.c |  129 ++
 .../selftests/kselftest/harness-selftest.expected  |   62 +
 .../selftests/kselftest/harness-selftest.sh        |   14 +
 tools/testing/selftests/kselftest_harness.h        |  188 +--
 tools/testing/selftests/nolibc/Makefile            |   17 +-
 tools/testing/selftests/nolibc/harness-selftest.c  |    1 +
 tools/testing/selftests/nolibc/nolibc-test.c       | 1712 +-------------------
 tools/testing/selftests/nolibc/run-tests.sh        |    2 +-
 18 files changed, 639 insertions(+), 1851 deletions(-)
---
base-commit: cb839e0cc881b4abd4a2e64cd06c2e313987a189
change-id: 20250130-nolibc-kselftest-harness-8b2c8cac43bf

Best regards,
--
Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>

5 months, 3 weeks


[PATCH rcu 00/10] Miscellaneous RCU changes for v6.15

by Boqun Feng

Hi,

Please find the upcoming miscellaneous RCU changes. The changes can also be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux.git misc.2025.03.04a

Regards,
Boqun

Paul E. McKenney (6):
  rcu: Split rcu_report_exp_cpu_mult() mask parameter and use for tracing
  rcu: Fix get_state_synchronize_rcu_full() GP-start detection
  rcu-tasks: Move RCU Tasks self-tests to core_initcall()
  rcu/nocb: Print segment lengths in show_rcu_nocb_gp_state()
  context_tracking: Make RCU watch ct_kernel_exit_state() warning
  Flush console log from kernel_power_off()

Uladzislau Rezki (Sony) (3):
  rcutorture: Allow a negative value for nfakewriters
  rcu: Update TREE05.boot to test normal synchronize_rcu()
  rcu: Use _full() API to debug synchronize_rcu()

Zilin Guan (1):
  rcu: Remove READ_ONCE() for rdp->gpwrap access in __note_gp_changes()

 include/linux/printk.h                        |  6 ++++
 include/linux/rcupdate.h                      |  6 ----
 include/linux/rcupdate_wait.h                 |  3 ++
 init/main.c                                   |  1 -
 kernel/context_tracking.c                     |  9 +++---
 kernel/printk/printk.c                        |  4 +--
 kernel/rcu/rcu.h                              |  2 +-
 kernel/rcu/rcutorture.c                       | 22 ++++++++++----
 kernel/rcu/tasks.h                            |  5 +++-
 kernel/rcu/tree.c                             | 29 +++++++++++--------
 kernel/rcu/tree_exp.h                         |  6 ++--
 kernel/rcu/tree_nocb.h                        | 20 +++++++++----
 kernel/reboot.c                               |  1 +
 .../rcutorture/configs/rcu/TREE05.boot        |  6 ++++
 14 files changed, 78 insertions(+), 42 deletions(-)

--
2.48.1

5 months, 3 weeks


[PATCH rcu 00/11] Lazy Preempt changes for v6.15

by Boqun Feng

Hi,

Please find the upcoming changes for CONFIG_PREEMPT_LAZY in RCU. The changes can also be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux.git lazypreempt.2025.02.24a

Paul & Ankur, I put patches #7 and #8 (bug fixes in rcutorture) before patch #9 (which is the one that enables non-preemptible RCU in preemptible kernels), because I want to avoid introducing a bug in the middle of the series. I would appreciate it if you could double-check this. Thanks!

Regards,
Boqun

Ankur Arora (7):
  rcu: fix header guard for rcu_all_qs()
  rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
  sched: update __cond_resched comment about RCU quiescent states
  rcu: handle unstable rdp in rcu_read_unlock_strict()
  rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
  osnoise: provide quiescent states
  rcu: limit PREEMPT_RCU configurations

Boqun Feng (1):
  rcutorture: Update ->extendables check for lazy preemption

Paul E. McKenney (3):
  rcutorture: Update rcutorture_one_extend_check() for lazy preemption
  rcutorture: Make scenario TREE10 build CONFIG_PREEMPT_LAZY=y
  rcutorture: Make scenario TREE07 build CONFIG_PREEMPT_LAZY=y

 include/linux/rcupdate.h                      |  2 +-
 include/linux/rcutree.h                       |  2 +-
 include/linux/srcutiny.h                      |  2 +-
 kernel/rcu/Kconfig                            |  4 +--
 kernel/rcu/rcutorture.c                       | 26 ++++++++++++---
 kernel/rcu/srcutiny.c                         | 14 ++++----
 kernel/rcu/tree_plugin.h                      | 22 ++++++++++---
 kernel/sched/core.c                           |  4 ++-
 kernel/trace/trace_osnoise.c                  | 32 +++++++++----------
 .../selftests/rcutorture/configs/rcu/TREE07   |  3 +-
 .../selftests/rcutorture/configs/rcu/TREE10   |  3 +-
 11 files changed, 73 insertions(+), 41 deletions(-)

--
2.39.5 (Apple Git-154)

5 months, 3 weeks


[PATCH net-next 1/2] selftests: drv-net: use env.rpath in the HDS test

by Jakub Kicinski

Commit 29b036be1b0b ("selftests: drv-net: test XDP, HDS auto and the ioctl path") added a new test case in the net tree. Now that this code has made its way to net-next, convert it to use the env.rpath() helper instead of manually computing the relative path.

Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
 tools/testing/selftests/drivers/net/hds.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/hds.py b/tools/testing/selftests/drivers/net/hds.py
index 873f5219e41d..7cc74faed743 100755
--- a/tools/testing/selftests/drivers/net/hds.py
+++ b/tools/testing/selftests/drivers/net/hds.py
@@ -20,8 +20,7 @@ from lib.py import defer, ethtool, ip
 
 
 def _xdp_onoff(cfg):
-    test_dir = os.path.dirname(os.path.realpath(__file__))
-    prog = test_dir + "/../../net/lib/xdp_dummy.bpf.o"
+    prog = cfg.rpath("../../net/lib/xdp_dummy.bpf.o")
     ip("link set dev %s xdp obj %s sec xdp" % (cfg.ifname, prog))
     ip("link set dev %s xdp off" % cfg.ifname)
 
--
2.48.1

5 months, 3 weeks


[PATCH net-next 0/5] mptcp: improve code coverage and small optimisations

by Matthieu Baerts (NGI0)

This small series has various unrelated patches:

- Patch 1 and 2: improve code coverage by validating mptcp_diag_dump_one thanks to a new tool displaying MPTCP info for a specific token.
- Patch 3: a fix for a commit which is only in net-next.
- Patch 4: reduce parameters for one in-kernel PM helper.
- Patch 5: exit early when processing an ADD_ADDR echo to avoid unneeded operations.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Gang Yan (2):
  selftests: mptcp: Add a tool to get specific msk_info
  selftests: mptcp: add a test for mptcp_diag_dump_one

Geliang Tang (2):
  mptcp: pm: in-kernel: avoid access entry without lock
  mptcp: pm: in-kernel: reduce parameters of set_flags

Matthieu Baerts (NGI0) (1):
  mptcp: pm: exit early with ADD_ADDR echo if possible

 net/mptcp/pm.c                                 |   3 +
 net/mptcp/pm_netlink.c                         |  15 +-
 tools/testing/selftests/net/mptcp/Makefile     |   2 +-
 tools/testing/selftests/net/mptcp/diag.sh      |  27 +++
 tools/testing/selftests/net/mptcp/mptcp_diag.c | 272 +++++++++++++++++++++++++
 5 files changed, 311 insertions(+), 8 deletions(-)
---
base-commit: 56794b5862c5a9aefcf2b703257c6fb93f76573e
change-id: 20250228-net-next-mptcp-coverage-small-opti-70d8dc1d329d

Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>

5 months, 3 weeks


[PATCH] selftests/pcie_bwctrl: Add "set_pcie_speed.sh" to TEST_PROGS

by Yi Lai

The test shell script "set_pcie_speed.sh" is not installed in INSTALL_PATH. Attempting to execute set_pcie_cooling_state.sh shows a warning:

"
./set_pcie_cooling_state.sh: line 119: ./set_pcie_speed.sh: No such file or directory
"

Add "set_pcie_speed.sh" to TEST_PROGS.

Fixes: 838f12c3d551 ("selftests/pcie_bwctrl: Create selftests")
Signed-off-by: Yi Lai <yi1.lai(a)intel.com>
---
 tools/testing/selftests/pcie_bwctrl/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/pcie_bwctrl/Makefile b/tools/testing/selftests/pcie_bwctrl/Makefile
index 3e84e26341d1..48ec048f47af 100644
--- a/tools/testing/selftests/pcie_bwctrl/Makefile
+++ b/tools/testing/selftests/pcie_bwctrl/Makefile
@@ -1,2 +1,2 @@
-TEST_PROGS = set_pcie_cooling_state.sh
+TEST_PROGS = set_pcie_cooling_state.sh set_pcie_speed.sh
 include ../lib.mk
--
2.39.2

5 months, 3 weeks


[PATCH v8 0/7] mseal system mappings

by jeffxu＠chromium.org

From: Jeff Xu <jeffxu(a)google.com>

This is V8 version, addressing comments from V7, without code logic change.

-------------------------------------------------------------------

As discussed during mseal() upstream process [1], mseal() protects the VMAs of a given virtual memory range against modifications, such as the read/write (RW) and no-execute (NX) bits. For complete descriptions of memory sealing, please see mseal.rst [2].

The mseal() is useful to mitigate memory corruption issues where a corrupted pointer is passed to a memory management system. For example, such an attacker primitive can break control-flow integrity guarantees since read-only memory that is supposed to be trusted can become writable or .text pages can get remapped.

The system mappings are read-only, and memory sealing can protect them from ever being changed to writable or being unmapped/remapped with different attributes.

System mappings such as vdso, vvar, vvar_vclock, vectors (arm compat-mode), sigpage (arm compat-mode), are created by the kernel during program initialization, and could be sealed after creation.

Unlike the aforementioned mappings, the uprobe mapping is not established during program startup. However, its lifetime is the same as the process's lifetime [3]. It could be sealed from creation.

The vsyscall on x86-64 uses a special address (0xffffffffff600000), which is outside the mm managed range. This means mprotect, munmap, and mremap won't work on the vsyscall. Since sealing doesn't enhance the vsyscall's security, it is skipped in this patch. If we ever seal the vsyscall, it is probably only for decorative purpose, i.e. showing the 'sl' flag in the /proc/pid/smaps. For this patch, it is ignored.

It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may alter the system mappings during restore operations. UML (User Mode Linux), gVisor, and rr are also known to change the vdso/vvar mappings. Consequently, this feature cannot be universally enabled across all systems.
As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default. To support mseal of system mappings, architectures must define CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS and update their special mappings calls to pass the mseal flag. Additionally, architectures must confirm they do not unmap/remap system mappings during the process lifetime. The existence of this flag for an architecture implies that it does not require the remapping of these system mappings during process lifetime, so sealing these mappings is safe from a kernel perspective. This version covers the x86-64 and arm64 architectures as a minimum viable feature. While no specific CPU hardware features are required to enable this feature on an architecture, memory sealing requires a 64-bit kernel. Other architectures can choose whether or not to adopt this feature. Currently, I'm not aware of any instances in the kernel code that actively munmap/mremap a system mapping without a request from userspace. The PPC does call munmap when _install_special_mapping fails for vdso; however, it's uncertain if this will ever fail for PPC - this needs to be investigated by PPC in the future [4]. The UML kernel can add this support when KUnit tests require it [5]. In this version, we've improved the handling of system mapping sealing from previous versions: instead of modifying the _install_special_mapping function itself, which would affect all architectures, we now call _install_special_mapping with a sealing flag only within the specific architecture that requires it. This targeted approach offers two key advantages: 1) It limits the code change's impact to the necessary architectures, and 2) It aligns with the software architecture by keeping the core memory management within the mm layer, while delegating the decision of sealing system mappings to the individual architecture, which is particularly relevant since 32-bit architectures never require sealing.
Prior to this patch series, we explored sealing special mappings from userspace using glibc's dynamic linker. This approach revealed several issues: - The PT_LOAD header may report an incorrect length for vdso (smaller than its actual size). The dynamic linker, which relies on PT_LOAD information to determine mapping size, would then split and partially seal the vdso mapping. Since each architecture has its own vdso/vvar code, fixing this in the kernel would require going through each architecture. Our initial goal was to enable sealing read-only mappings, e.g. .text, across all architectures; sealing vdso from the kernel at creation appears to be simpler than sealing vdso in glibc. - The [vvar] mapping header only contains address information, not length information. Similar issues might exist for other special mappings. - Mappings like uprobe are not covered by the dynamic linker, and there is no effective solution for them. This feature's security enhancements will benefit ChromeOS, Android, and other high security systems. Testing: This feature was tested on ChromeOS and Android for both x86-64 and ARM64. - Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly, i.e. "sl" shown in the smaps for those mappings, and mremap is blocked. - Passing various automation tests (e.g. pre-checkin) on ChromeOS and Android to ensure the sealing doesn't affect the functionality of Chromebook and Android phone. I also tested the feature on Ubuntu on x86-64: - With config disabled, vdso/vvar is not sealed, - with config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK, normal operations such as browsing the web, open/edit doc are OK.
Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1] Link: Documentation/userspace-api/mseal.rst [2] Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZx… [3] Link: https://lore.kernel.org/all/CABi2SkV6JJwJeviDLsq9N4ONvQ=EFANsiWkgiEOjyT9TQS… [4] Link: https://lore.kernel.org/all/202502251035.239B85A93@keescook/ [5] ------------------------------------------- History: V8: - Change ARCH_SUPPORTS_MSEAL_X to ARCH_SUPPORTS_MSEAL_X (Liam R. Howlett) - Update comments in Kconfig and mseal.rst (Lorenzo Stoakes, Liam R. Howlett) - Change patch header prefix to "mseal sysmap" (Lorenzo Stoakes) - Remove "vm_flags =" (Kees Cook, Liam R. Howlett, Oleg Nesterov) - Drop uml architecture (Lorenzo Stoakes, Kees Cook) - Add a selftest to verify system mappings are sealed (Lorenzo Stoakes) V7: https://lore.kernel.org/all/20250224225246.3712295-1-jeffxu@google.com/ - Remove cover letter from the first patch (Liam R. Howlett) - Change macro name to VM_SEALED_SYSMAP (Liam R. Howlett) - logging and fclose() in selftest (Liam R. Howlett) V6: https://lore.kernel.org/all/20250224174513.3600914-1-jeffxu@google.com/ - mseal.rst: fix a typo (Randy Dunlap) - security/Kconfig: add rr into note (Liam R. Howlett) - remove mseal_system_mappings() and use macro instead (Liam R. Howlett) - mseal.rst: add incompatible userland software (Lorenzo Stoakes) - remove RFC from title (Kees Cook) V5: https://lore.kernel.org/all/20250212032155.1276806-1-jeffxu@google.com/ - Remove kernel cmd line (Lorenzo Stoakes) - Add test info (Lorenzo Stoakes) - Add threat model info (Lorenzo Stoakes) - Fix x86 selftest: test_mremap_vdso - Restrict code change to ARM64/x86-64/UM arch only. - Add userprocess.h to include seal_system_mapping(). - Remove sealing vsyscall. - Split the patch.
V4: https://lore.kernel.org/all/20241125202021.3684919-1-jeffxu@google.com/ - ARCH_HAS_SEAL_SYSTEM_MAPPINGS (Lorenzo Stoakes) - test info (Lorenzo Stoakes) - Update mseal.rst (Liam R. Howlett) - Update test_mremap_vdso.c (Liam R. Howlett) - Misc. style, comments, doc update (Liam R. Howlett) V3: https://lore.kernel.org/all/20241113191602.3541870-1-jeffxu@google.com/ - Revert uprobe to v1 logic (Oleg Nesterov) - use CONFIG_SEAL_SYSTEM_MAPPINGS instead of _ALWAYS/_NEVER (Kees Cook) - Move kernel cmd line from fs/exec.c to mm/mseal.c and misc. (Liam R. Howlett) V2: https://lore.kernel.org/all/20241014215022.68530-1-jeffxu@google.com/ - Seal uprobe always (Oleg Nesterov) - Update comments and description (Randy Dunlap, Liam R. Howlett, Oleg Nesterov) - Rebase to linux_main V1: https://lore.kernel.org/all/20241004163155.3493183-1-jeffxu@google.com/ -------------------------------------------------- Jeff Xu (7): mseal sysmap: kernel config and header change selftests: x86: test_mremap_vdso: skip if vdso is msealed mseal sysmap: enable x86-64 mseal sysmap: enable arm64 mseal sysmap: uprobe mapping mseal sysmap: update mseal.rst selftest: test system mappings are sealed. 
Documentation/userspace-api/mseal.rst | 20 ++++ arch/arm64/Kconfig | 1 + arch/arm64/kernel/vdso.c | 12 +- arch/x86/Kconfig | 1 + arch/x86/entry/vdso/vma.c | 7 +- include/linux/mm.h | 10 ++ init/Kconfig | 22 ++++ kernel/events/uprobes.c | 3 +- security/Kconfig | 21 ++++ .../mseal_system_mappings/.gitignore | 2 + .../selftests/mseal_system_mappings/Makefile | 6 + .../selftests/mseal_system_mappings/config | 1 + .../mseal_system_mappings/sysmap_is_sealed.c | 113 ++++++++++++++++++ .../testing/selftests/x86/test_mremap_vdso.c | 43 +++++++ 14 files changed, 254 insertions(+), 8 deletions(-) create mode 100644 tools/testing/selftests/mseal_system_mappings/.gitignore create mode 100644 tools/testing/selftests/mseal_system_mappings/Makefile create mode 100644 tools/testing/selftests/mseal_system_mappings/config create mode 100644 tools/testing/selftests/mseal_system_mappings/sysmap_is_sealed.c -- 2.48.1.711.g2feabab25a-goog
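
The testing notes in the cover letter above check that "sl" appears in smaps for the sealed system mappings. A minimal userspace sketch of that check follows. It is not the code from the series' sysmap_is_sealed.c selftest; the helper names `ends_with` and `mapping_is_sealed` are hypothetical, and the only assumptions about the smaps layout are that each mapping starts with a `start-end` header line and carries a `VmFlags:` line of space-separated two-letter flags.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if string s ends with suffix. */
static bool ends_with(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);

	return n >= m && strcmp(s + n - m, suffix) == 0;
}

/*
 * Hypothetical helper: scan smaps-formatted text from fp and report whether
 * the mapping whose header line ends with name (e.g. "[vdso]") has the "sl"
 * flag on its VmFlags line.
 *
 * Returns 1 if sealed, 0 if not sealed, -1 if the mapping was not found.
 */
int mapping_is_sealed(FILE *fp, const char *name)
{
	char line[512];
	bool in_mapping = false;

	while (fgets(line, sizeof(line), fp)) {
		unsigned long start, end;

		line[strcspn(line, "\n")] = '\0';

		/* A header line begins with "<hex start>-<hex end>". */
		if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
			in_mapping = ends_with(line, name);
			continue;
		}

		if (in_mapping && strncmp(line, "VmFlags:", 8) == 0) {
			/* Flags are space-separated two-letter codes. */
			char *tok = strtok(line + 8, " ");

			while (tok) {
				if (strcmp(tok, "sl") == 0)
					return 1;
				tok = strtok(NULL, " ");
			}
			return 0;
		}
	}
	return -1;
}
```

To try it against a live process, one could pass `fopen("/proc/self/smaps", "r")` and `"[vdso]"`; on a kernel without the config enabled the expected result would be 0.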

5 months, 3 weeks


[PATCH] selftests: proofreading bpf module

by Armin

Fixed multiple spelling issues in the kselftests bpf modules. Signed-off-by: Armin <Armin.Mahdilou(a)gmail.com> --- tools/testing/selftests/bpf/Makefile | 2 +- tools/testing/selftests/bpf/bench.c | 2 +- tools/testing/selftests/bpf/prog_tests/btf_dump.c | 2 +- tools/testing/selftests/bpf/prog_tests/fd_array.c | 2 +- tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c | 2 +- tools/testing/selftests/bpf/prog_tests/reg_bounds.c | 4 ++-- tools/testing/selftests/bpf/progs/bpf_cc_cubic.c | 2 +- tools/testing/selftests/bpf/progs/bpf_dctcp.c | 2 +- .../testing/selftests/bpf/progs/freplace_unreliable_prog.c | 2 +- tools/testing/selftests/bpf/progs/iters_state_safety.c | 2 +- .../testing/selftests/bpf/progs/test_cls_redirect_dynptr.c | 2 +- tools/testing/selftests/bpf/progs/test_tc_dtime.c | 2 +- tools/testing/selftests/bpf/progs/uprobe_multi_verifier.c | 6 +++--- tools/testing/selftests/bpf/progs/uretprobe_stack.c | 2 +- tools/testing/selftests/bpf/progs/verifier_loops1.c | 2 +- tools/testing/selftests/bpf/progs/verifier_scalar_ids.c | 2 +- tools/testing/selftests/bpf/test_lru_map.c | 4 ++-- tools/testing/selftests/bpf/test_lwt_ip_encap.sh | 2 +- tools/testing/selftests/bpf/test_sockmap.c | 2 +- tools/testing/selftests/bpf/verifier/calls.c | 6 +++--- tools/testing/selftests/bpf/xdping.c | 2 +- tools/testing/selftests/bpf/xsk.h | 2 +- 22 files changed, 28 insertions(+), 28 deletions(-) diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 87551628e112..a66d9173609d 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -392,7 +392,7 @@ $(HOST_BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \ DESTDIR=$(HOST_SCRATCH_DIR)/ prefix= all install_headers endif -# vmlinux.h is first dumped to a temprorary file and then compared to +# vmlinux.h is first dumped to a temporary file and then compared to # the previous version. 
This helps to avoid unnecessary re-builds of # $(TRUNNER_BPF_OBJS) $(INCLUDE_DIR)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) | $(INCLUDE_DIR) diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c index 1bd403a5ef7b..b25db4142126 100644 --- a/tools/testing/selftests/bpf/bench.c +++ b/tools/testing/selftests/bpf/bench.c @@ -497,7 +497,7 @@ extern const struct bench bench_rename_rawtp; extern const struct bench bench_rename_fentry; extern const struct bench bench_rename_fexit; -/* pure counting benchmarks to establish theoretical lmits */ +/* pure counting benchmarks to establish theoretical limits */ extern const struct bench bench_trig_usermode_count; extern const struct bench bench_trig_syscall_count; extern const struct bench bench_trig_kernel_count; diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dump.c b/tools/testing/selftests/bpf/prog_tests/btf_dump.c index b293b8501fd6..2b434038d5af 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_dump.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_dump.c @@ -63,7 +63,7 @@ static int test_btf_dump_case(int n, struct btf_dump_test_case *t) /* tests with t->known_ptr_sz have no "long" or "unsigned long" type, * so it's impossible to determine correct pointer size; but if they - * do, it should be 8 regardless of host architecture, becaues BPF + * do, it should be 8 regardless of host architecture, because BPF * target is always 64-bit */ if (!t->known_ptr_sz) { diff --git a/tools/testing/selftests/bpf/prog_tests/fd_array.c b/tools/testing/selftests/bpf/prog_tests/fd_array.c index a1d52e73fb16..ab0fe7add9dc 100644 --- a/tools/testing/selftests/bpf/prog_tests/fd_array.c +++ b/tools/testing/selftests/bpf/prog_tests/fd_array.c @@ -293,7 +293,7 @@ static int get_btf_id_by_fd(int btf_fd, __u32 *id) * 1) Create a new btf, it's referenced only by a file descriptor, so refcnt=1 * 2) Load a BPF prog with fd_array[0] = btf_fd; now btf's refcnt=2 * 3) Close the btf_fd, now refcnt=1 - * Wait and 
check that BTF stil exists. + * Wait and check that BTF still exists. */ static void check_fd_array_cnt__referenced_btfs(void) { diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c index e19ef509ebf8..f377bea0b82d 100644 --- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c +++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c @@ -463,7 +463,7 @@ static bool skip_entry(char *name) return false; } -/* Do comparision by ignoring '.llvm.<hash>' suffixes. */ +/* Do comparison by ignoring '.llvm.<hash>' suffixes. */ static int compare_name(const char *name1, const char *name2) { const char *res1, *res2; diff --git a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c index 39d42271cc46..de302ecd5f58 100644 --- a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c +++ b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c @@ -609,7 +609,7 @@ static void range_cond(enum num_t t, struct range x, struct range y, *newx = range(t, x.a, x.b); *newy = range(t, y.a + 1, y.b); } else if (x.a == x.b && x.b == y.b) { - /* X is a constant matching rigth side of Y */ + /* X is a constant matching right side of Y */ *newx = range(t, x.a, x.b); *newy = range(t, y.a, y.b - 1); } else if (y.a == y.b && x.a == y.a) { @@ -617,7 +617,7 @@ static void range_cond(enum num_t t, struct range x, struct range y, *newx = range(t, x.a + 1, x.b); *newy = range(t, y.a, y.b); } else if (y.a == y.b && x.b == y.b) { - /* Y is a constant matching rigth side of X */ + /* Y is a constant matching right side of X */ *newx = range(t, x.a, x.b - 1); *newy = range(t, y.a, y.b); } else { diff --git a/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c b/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c index 1654a530aa3d..4e51785e7606 100644 --- a/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c +++ b/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c @@ -101,7 
+101,7 @@ static void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, tp->snd_cwnd = pkts_in_flight + sndcnt; } -/* Decide wheather to run the increase function of congestion control. */ +/* Decide whether to run the increase function of congestion control. */ static bool tcp_may_raise_cwnd(const struct sock *sk, const int flag) { if (tcp_sk(sk)->reordering > TCP_REORDERING) diff --git a/tools/testing/selftests/bpf/progs/bpf_dctcp.c b/tools/testing/selftests/bpf/progs/bpf_dctcp.c index 7cd73e75f52a..32c511bcd60b 100644 --- a/tools/testing/selftests/bpf/progs/bpf_dctcp.c +++ b/tools/testing/selftests/bpf/progs/bpf_dctcp.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2019 Facebook */ -/* WARNING: This implemenation is not necessarily the same +/* WARNING: This implementation is not necessarily the same * as the tcp_dctcp.c. The purpose is mainly for testing * the kernel BPF logic. */ diff --git a/tools/testing/selftests/bpf/progs/freplace_unreliable_prog.c b/tools/testing/selftests/bpf/progs/freplace_unreliable_prog.c index 624078abf3de..d7e30ee576c9 100644 --- a/tools/testing/selftests/bpf/progs/freplace_unreliable_prog.c +++ b/tools/testing/selftests/bpf/progs/freplace_unreliable_prog.c @@ -7,7 +7,7 @@ SEC("freplace/btf_unreliable_kprobe") /* context type is what BPF verifier expects for kprobe context, but target - * program has `stuct whatever *ctx` argument, so freplace operation will be + * program has `struct whatever *ctx` argument, so freplace operation will be * rejected with the following message: * * arg0 replace_btf_unreliable_kprobe(struct pt_regs *) doesn't match btf_unreliable_kprobe(struct whatever *) diff --git a/tools/testing/selftests/bpf/progs/iters_state_safety.c b/tools/testing/selftests/bpf/progs/iters_state_safety.c index f41257eadbb2..b381ac0c736c 100644 --- a/tools/testing/selftests/bpf/progs/iters_state_safety.c +++ b/tools/testing/selftests/bpf/progs/iters_state_safety.c @@ -345,7 +345,7 @@ int __naked 
read_from_iter_slot_fail(void) "r3 = 1000;" "call %[bpf_iter_num_new];" - /* attemp to leak bpf_iter_num state */ + /* attempt to leak bpf_iter_num state */ "r7 = *(u64 *)(r6 + 0);" "r8 = *(u64 *)(r6 + 8);" diff --git a/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c b/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c index d0f7670351e5..dfd4a2710391 100644 --- a/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c +++ b/tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c @@ -494,7 +494,7 @@ static ret_t get_next_hop(struct bpf_dynptr *dynptr, __u64 *offset, encap_header *offset += sizeof(*next_hop); - /* Skip the remainig next hops (may be zero). */ + /* Skip the remaining next hops (may be zero). */ return skip_next_hops(offset, encap->unigue.hop_count - encap->unigue.next_hop - 1); } diff --git a/tools/testing/selftests/bpf/progs/test_tc_dtime.c b/tools/testing/selftests/bpf/progs/test_tc_dtime.c index ca8e8734d901..f0c58519cab1 100644 --- a/tools/testing/selftests/bpf/progs/test_tc_dtime.c +++ b/tools/testing/selftests/bpf/progs/test_tc_dtime.c @@ -21,7 +21,7 @@ * ns_src | ns_fwd | ns_dst * * ns_src and ns_dst: ENDHOST namespace - * ns_fwd: Fowarding namespace + * ns_fwd: Forwarding namespace */ #define ctx_ptr(field) (void *)(long)(field) diff --git a/tools/testing/selftests/bpf/progs/uprobe_multi_verifier.c b/tools/testing/selftests/bpf/progs/uprobe_multi_verifier.c index fe49f2cb5360..d15f49ed93f1 100644 --- a/tools/testing/selftests/bpf/progs/uprobe_multi_verifier.c +++ b/tools/testing/selftests/bpf/progs/uprobe_multi_verifier.c @@ -10,14 +10,14 @@ char _license[] SEC("license") = "GPL"; SEC("uprobe.session") __success -int uprobe_sesison_return_0(struct pt_regs *ctx) +int uprobe_session_return_0(struct pt_regs *ctx) { return 0; } SEC("uprobe.session") __success -int uprobe_sesison_return_1(struct pt_regs *ctx) +int uprobe_session_return_1(struct pt_regs *ctx) { return 1; } @@ -25,7 +25,7 @@ int 
uprobe_sesison_return_1(struct pt_regs *ctx) SEC("uprobe.session") __failure __msg("At program exit the register R0 has smin=2 smax=2 should have been in [0, 1]") -int uprobe_sesison_return_2(struct pt_regs *ctx) +int uprobe_session_return_2(struct pt_regs *ctx) { return 2; } diff --git a/tools/testing/selftests/bpf/progs/uretprobe_stack.c b/tools/testing/selftests/bpf/progs/uretprobe_stack.c index 9fdcf396b8f4..cbc428a80744 100644 --- a/tools/testing/selftests/bpf/progs/uretprobe_stack.c +++ b/tools/testing/selftests/bpf/progs/uretprobe_stack.c @@ -27,7 +27,7 @@ SEC("uprobe//proc/self/exe:target_1") int BPF_UPROBE(uprobe_1) { /* target_1 is recursive wit depth of 2, so we capture two separate - * stack traces, depending on which occurence it is + * stack traces, depending on which occurrence it is */ static bool recur = false; diff --git a/tools/testing/selftests/bpf/progs/verifier_loops1.c b/tools/testing/selftests/bpf/progs/verifier_loops1.c index e07b43b78fd2..cd4ee4d38cf6 100644 --- a/tools/testing/selftests/bpf/progs/verifier_loops1.c +++ b/tools/testing/selftests/bpf/progs/verifier_loops1.c @@ -261,7 +261,7 @@ l0_%=: r2 += r1; \ SEC("xdp") __success -__naked void not_an_inifinite_loop(void) +__naked void not_an_infinite_loop(void) { asm volatile (" \ call %[bpf_get_prandom_u32]; \ diff --git a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c index 7c5e5e6d10eb..dba3ca728f6e 100644 --- a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c +++ b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c @@ -349,7 +349,7 @@ __naked void precision_two_ids(void) SEC("socket") __success __log_level(2) __flag(BPF_F_TEST_STATE_FREQ) -/* check thar r0 and r6 have different IDs after 'if', +/* check that r0 and r6 have different IDs after 'if', * collect_linked_regs() can't tie more than 6 registers for a single insn. 
*/ __msg("8: (25) if r0 > 0x7 goto pc+0 ; R0=scalar(id=1") diff --git a/tools/testing/selftests/bpf/test_lru_map.c b/tools/testing/selftests/bpf/test_lru_map.c index fda7589c5023..b80b3bc17575 100644 --- a/tools/testing/selftests/bpf/test_lru_map.c +++ b/tools/testing/selftests/bpf/test_lru_map.c @@ -306,7 +306,7 @@ static void test_lru_sanity1(int map_type, int map_flags, unsigned int tgt_free) * Update 1 to tgt_free/2 * => The original 1 to tgt_free/2 will be removed due to * the LRU shrink process - * Re-insert 1 to tgt_free/2 again and do a lookup immeidately + * Re-insert 1 to tgt_free/2 again and do a lookup immediately * Insert 1+tgt_free to tgt_free*3/2 * Insert 1+tgt_free*3/2 to tgt_free*5/2 * => Key 1+tgt_free to tgt_free*3/2 @@ -371,7 +371,7 @@ static void test_lru_sanity2(int map_type, int map_flags, unsigned int tgt_free) } /* Re-insert 1 to tgt_free/2 again and do a lookup - * immeidately. + * immediately. */ end_key = 1 + batch_size; value[0] = 4321; diff --git a/tools/testing/selftests/bpf/test_lwt_ip_encap.sh b/tools/testing/selftests/bpf/test_lwt_ip_encap.sh index 1e565f47aca9..37c31e53cc6e 100755 --- a/tools/testing/selftests/bpf/test_lwt_ip_encap.sh +++ b/tools/testing/selftests/bpf/test_lwt_ip_encap.sh @@ -164,7 +164,7 @@ setup() ip -netns ${NS2} link set veth7 vrf red fi - # configure addesses: the top route (1-2-3-4) + # configure addresses: the top route (1-2-3-4) ip -netns ${NS1} addr add ${IPv4_1}/24 dev veth1 ip -netns ${NS2} addr add ${IPv4_2}/24 dev veth2 ip -netns ${NS2} addr add ${IPv4_3}/24 dev veth3 diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c index fd2da2234cc9..76568db7a664 100644 --- a/tools/testing/selftests/bpf/test_sockmap.c +++ b/tools/testing/selftests/bpf/test_sockmap.c @@ -1372,7 +1372,7 @@ static int run_options(struct sockmap_options *options, int cg_fd, int test) } else fprintf(stderr, "unknown test\n"); out: - /* Detatch and zero all the maps */ + /* Detach and 
zero all the maps */ bpf_prog_detach2(bpf_program__fd(progs[3]), cg_fd, BPF_CGROUP_SOCK_OPS); for (i = 0; i < ARRAY_SIZE(links); i++) { diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c index 18596ae0b0c1..e0ce0a7ed774 100644 --- a/tools/testing/selftests/bpf/verifier/calls.c +++ b/tools/testing/selftests/bpf/verifier/calls.c @@ -1375,7 +1375,7 @@ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1), /* write into map value */ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0), - /* fetch secound map_value_ptr from the stack */ + /* fetch second map_value_ptr from the stack */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_10, -16), BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1), /* write into map value */ @@ -1439,7 +1439,7 @@ /* second time with fp-16 */ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4), BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 1, 2), - /* fetch secound map_value_ptr from the stack */ + /* fetch second map_value_ptr from the stack */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_7, 0), /* write into map value */ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0), @@ -1493,7 +1493,7 @@ /* second time with fp-16 */ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4), BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2), - /* fetch secound map_value_ptr from the stack */ + /* fetch second map_value_ptr from the stack */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_7, 0), /* write into map value */ BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0), diff --git a/tools/testing/selftests/bpf/xdping.c b/tools/testing/selftests/bpf/xdping.c index 1503a1d2faa0..9ed8c796645d 100644 --- a/tools/testing/selftests/bpf/xdping.c +++ b/tools/testing/selftests/bpf/xdping.c @@ -155,7 +155,7 @@ int main(int argc, char **argv) } if (!server) { - /* Only supports IPv4; see hints initiailization above. */ + /* Only supports IPv4; see hints initialization above. 
*/ if (getaddrinfo(argv[optind], NULL, &hints, &a) || !a) { fprintf(stderr, "Could not resolve %s\n", argv[optind]); return 1; diff --git a/tools/testing/selftests/bpf/xsk.h b/tools/testing/selftests/bpf/xsk.h index 93c2cc413cfc..73d01518bf84 100644 --- a/tools/testing/selftests/bpf/xsk.h +++ b/tools/testing/selftests/bpf/xsk.h @@ -94,7 +94,7 @@ static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb) * cached_cons is r->size bigger than the real consumer pointer so * that this addition can be avoided in the more frequently * executed code that computs free_entries in the beginning of - * this function. Without this optimization it whould have been + * this function. Without this optimization it would have been * free_entries = r->cached_prod - r->cached_cons + r->size. */ r->cached_cons = __atomic_load_n(r->consumer, __ATOMIC_ACQUIRE); -- 2.47.2

5 months, 3 weeks


[PATCH 00/12] kunit: Introduce UAPI testing framework

by Thomas Weißschuh

Currently, testing of userspace and in-kernel APIs uses two different frameworks: kselftests for the userspace ones and Kunit for the in-kernel ones. Besides their different scopes, both have different strengths and limitations: Kunit: * Tests are normal kernel code. * They use the regular kernel toolchain. * They can be packaged and distributed as modules conveniently. Kselftests: * Tests are normal userspace code. * They need a userspace toolchain. A kernel cross toolchain is likely not enough. * A fair amount of userland is required to run the tests, which means a full distro or handcrafted rootfs. * There is no way to conveniently package and run kselftests with a given kernel image. * The kselftests makefiles are not as powerful as regular kbuild. For example they are missing proper header dependency tracking or more complex compiler option modifications. Therefore kunit is much easier to run against different kernel configurations and architectures. This series aims to combine kselftests and kunit, avoiding both their limitations. It works by compiling the userspace kselftests as part of the regular kernel build, embedding them into the kunit kernel or module and executing them from there. If the kernel toolchain is not fit to produce userspace because of a missing libc, the kernel's own nolibc can be used instead. The structured TAP output from the kselftest is integrated into the kunit KTAP output transparently, so the kunit parser can parse the combined logs together. Further room for improvements: * Call each test in its completely dedicated namespace * Handle additional test files besides the test executable through archives. CPIO, cramfs, etc. * Compatibility with kselftest_harness.h (in progress) * Expose the blobs in debugfs * Provide some convenience wrappers around compat userprogs * Figure out a migration path/coexistence solution for kunit UAPI and tools/testing/selftests/ Output from the kunit example testcase, note the output of "example_uapi_tests". 
$ ./tools/testing/kunit/kunit.py run --kunitconfig lib/kunit example ... Running tests with: $ .kunit/linux kunit.filter_glob=example kunit.enable=1 mem=1G console=tty kunit_shutdown=halt [11:53:53] ================== example (10 subtests) =================== [11:53:53] [PASSED] example_simple_test [11:53:53] [SKIPPED] example_skip_test [11:53:53] [SKIPPED] example_mark_skipped_test [11:53:53] [PASSED] example_all_expect_macros_test [11:53:53] [PASSED] example_static_stub_test [11:53:53] [PASSED] example_static_stub_using_fn_ptr_test [11:53:53] [PASSED] example_priv_test [11:53:53] =================== example_params_test =================== [11:53:53] [SKIPPED] example value 3 [11:53:53] [PASSED] example value 2 [11:53:53] [PASSED] example value 1 [11:53:53] [SKIPPED] example value 0 [11:53:53] =============== [PASSED] example_params_test =============== [11:53:53] [PASSED] example_slow_test [11:53:53] ======================= (4 subtests) ======================= [11:53:53] [PASSED] procfs [11:53:53] [PASSED] userspace test 2 [11:53:53] [SKIPPED] userspace test 3: some reason [11:53:53] [PASSED] userspace test 4 [11:53:53] ================ [PASSED] example_uapi_test ================ [11:53:53] ===================== [PASSED] example ===================== [11:53:53] ============================================================ [11:53:53] Testing complete. Ran 16 tests: passed: 11, skipped: 5 [11:53:53] Elapsed time: 67.543s total, 1.823s configuring, 65.655s building, 0.058s running Based on v6.14-rc1 and the series "tools/nolibc: compatibility with -Wmissing-prototypes" [0]. For compatibility with LLVM/clang another series is needed [1]. 
[0] https://lore.kernel.org/lkml/20250123-nolibc-prototype-v1-0-e1afc5c1999a@we… [1] https://lore.kernel.org/lkml/20250213-kbuild-userprog-fixes-v1-0-f255fb477d… Signed-off-by: Thomas Weißschuh <thomas.weissschuh(a)linutronix.de> --- Thomas Weißschuh (12): kconfig: implement CONFIG_HEADERS_INSTALL for Usermode Linux kconfig: introduce CONFIG_ARCH_HAS_NOLIBC kbuild: userprogs: respect CONFIG_WERROR kbuild: userprogs: add nolibc support kbuild: introduce blob framework kunit: tool: Add test for nested test result reporting kunit: tool: Don't overwrite test status based on subtest counts kunit: tool: Parse skipped tests from kselftest.h kunit: Introduce UAPI testing framework kunit: uapi: Add example for UAPI tests kunit: uapi: Introduce preinit executable kunit: uapi: Validate usability of /proc Documentation/kbuild/makefiles.rst | 12 + Makefile | 5 +- include/kunit/uapi.h | 17 ++ include/linux/blob.h | 21 ++ init/Kconfig | 2 + lib/Kconfig.debug | 1 - lib/kunit/Kconfig | 9 + lib/kunit/Makefile | 17 +- lib/kunit/kunit-example-test.c | 17 ++ lib/kunit/kunit-uapi-example.c | 58 +++++ lib/kunit/uapi-preinit.c | 61 +++++ lib/kunit/uapi.c | 250 +++++++++++++++++++++ scripts/Makefile.blobs | 19 ++ scripts/Makefile.build | 6 + scripts/Makefile.clean | 2 +- scripts/Makefile.userprogs | 18 +- scripts/blob-wrap.c | 27 +++ tools/include/nolibc/Kconfig.nolibc | 18 ++ tools/testing/kunit/kunit_parser.py | 13 +- tools/testing/kunit/kunit_tool_test.py | 9 + .../test_is_test_passed-failure-nested.log | 10 + .../test_data/test_is_test_passed-kselftest.log | 3 +- 22 files changed, 584 insertions(+), 11 deletions(-) --- base-commit: 20e952894066214a80793404c9578d72ef89c5e0 change-id: 20241015-kunit-kselftests-56273bc40442 Best regards, -- Thomas Weißschuh <thomas.weissschuh(a)linutronix.de>
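
The cover letter above relies on the userspace test printing structured TAP result lines that the kunit parser can pick up. A minimal sketch of that line format follows, assuming the usual KTAP conventions (`ok`, `not ok`, and a `# SKIP` directive). The names `enum tap_status` and `tap_format_result` are hypothetical, chosen for illustration; they are not part of kselftest.h or of this series.

```c
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
enum tap_status { TAP_PASS, TAP_FAIL, TAP_SKIP };

/*
 * Hypothetical helper: format one TAP result line into buf.
 * A skipped test is reported as "ok" with a "# SKIP <reason>" directive.
 * Returns the snprintf() result, or -1 for an unknown status.
 */
int tap_format_result(char *buf, size_t size, unsigned int num,
		      const char *name, enum tap_status status,
		      const char *reason)
{
	switch (status) {
	case TAP_PASS:
		return snprintf(buf, size, "ok %u %s\n", num, name);
	case TAP_FAIL:
		return snprintf(buf, size, "not ok %u %s\n", num, name);
	case TAP_SKIP:
		return snprintf(buf, size, "ok %u %s # SKIP %s\n", num, name,
				reason ? reason : "");
	}
	return -1;
}
```

A test program would typically print a plan line such as `1..4` first and then one such line per test, which corresponds to the four "example_uapi_test" subtests shown in the log above.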

5 months, 3 weeks


[PATCH] selftests/ftrace: add 'poll' binary to gitignore

by Bharadwaj Raju

When building this test, a binary file 'poll' is generated and should be gitignore'd. Signed-off-by: Bharadwaj Raju <bharadwaj.raju777(a)gmail.com> --- tools/testing/selftests/ftrace/.gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/ftrace/.gitignore b/tools/testing/selftests/ftrace/.gitignore index 2659417cb2c7..4d7fcb828850 100644 --- a/tools/testing/selftests/ftrace/.gitignore +++ b/tools/testing/selftests/ftrace/.gitignore @@ -1,2 +1,3 @@ # SPDX-License-Identifier: GPL-2.0-only logs +poll -- 2.43.0

5 months, 3 weeks


[PATCH v3 0/2] tools: Unify top-level quiet infrastructure

by Charlie Jenkins

The quiet infrastructure was moved out of Makefile.build to accommodate the new syscall table generation scripts in perf. Syscall table generation wanted to also be able to be quiet, so instead of again copying the code to set the quiet variables, the code was moved into Makefile.perf to be used globally. This was not the right solution. It should have been moved even further upwards in the call chain. Makefile.include is imported in many files so this seems like a proper place to put it. To: Signed-off-by: Charlie Jenkins <charlie(a)rivosinc.com> --- Changes in v3: - Add back erroneously removed "silent=1" (Jiri) - Link to v2: https://lore.kernel.org/r/20250210-quiet_tools-v2-0-b2f18cbf72af@rivosinc.c… Changes in v2: - Fix spacing around Q= (Andrii) - Link to v1: https://lore.kernel.org/r/20250203-quiet_tools-v1-0-d25c8956e59a@rivosinc.c… --- Charlie Jenkins (2): tools: Unify top-level quiet infrastructure tools: Remove redundant quiet setup tools/arch/arm64/tools/Makefile | 6 ----- tools/bpf/Makefile | 6 ----- tools/bpf/bpftool/Documentation/Makefile | 6 ----- tools/bpf/bpftool/Makefile | 6 ----- tools/bpf/resolve_btfids/Makefile | 2 -- tools/bpf/runqslower/Makefile | 5 +--- tools/build/Makefile | 8 +----- tools/lib/bpf/Makefile | 13 ---------- tools/lib/perf/Makefile | 13 ---------- tools/lib/thermal/Makefile | 13 ---------- tools/objtool/Makefile | 6 ----- tools/perf/Makefile.perf | 41 ------------------------------- tools/scripts/Makefile.include | 30 ++++++++++++++++++++++ tools/testing/selftests/bpf/Makefile.docs | 6 ----- tools/testing/selftests/hid/Makefile | 2 -- tools/thermal/lib/Makefile | 13 ---------- tools/tracing/latency/Makefile | 6 ----- tools/tracing/rtla/Makefile | 6 ----- tools/verification/rv/Makefile | 6 ----- 19 files changed, 32 insertions(+), 162 deletions(-) --- base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b change-id: 20250203-quiet_tools-9a6ea9d65a19 -- - Charlie

5 months, 3 weeks


[PATCH net-next v2 0/8] netconsole: Add taskname sysdata support

by Breno Leitao

This patchset introduces a new feature to the netconsole extradata subsystem that enables the inclusion of the current task's name in the sysdata output of netconsole messages.

This enhancement is particularly valuable for large-scale deployments, such as Meta's, where netconsole collects messages from millions of servers and stores them in a data warehouse for analysis. Engineers often rely on these messages to investigate issues and assess kernel health.

One common challenge we face is determining the context in which a particular message was generated. By including the task name (task->comm) with each message, this feature provides a direct answer to the frequently asked question: "What was running when this message was generated?"

This added context will significantly improve our ability to diagnose and troubleshoot issues, making it easier to interpret netconsole output.

The patchset consists of eight patches that implement the following changes:

 * Refactor CPU number formatting into a separate function
 * Prefix CPU_NR sysdata feature with SYSDATA_
 * Convert a bitwise operation into a boolean
 * Add configfs controls for taskname sysdata feature
 * Add taskname to extradata entry count
 * Add support for including task name in netconsole's extra data output
 * Document the task name feature in Documentation/networking/netconsole.rst
 * Add test coverage for the task name feature to the existing sysdata selftest script

These changes allow users to enable or disable the task name feature via configfs and provide additional context for kernel messages by showing which task generated each console message.

I have tested these patches on some servers and they seem to work as expected.

Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Changes in v2:
- Add an extra patch to make the comparison more stable. (Paolo)
- Changed the argument of a function (Simon)
- Removed the warn on `current == NULL` since it shouldn't be the case. (Simon and Paolo)
- Link to v1: https://lore.kernel.org/r/20250221-netcons_current-v1-0-21c86ae8fc0d@debian…

---
Breno Leitao (8):
      netconsole: prefix CPU_NR sysdata feature with SYSDATA_
      netconsole: Make boolean comparison consistent
      netconsole: refactor CPU number formatting into separate function
      netconsole: add taskname to extradata entry count
      netconsole: add configfs controls for taskname sysdata feature
      netconsole: add task name to extra data fields
      netconsole: docs: document the task name feature
      netconsole: selftest: add task name append testing

 Documentation/networking/netconsole.rst            | 28 +++++++
 drivers/net/netconsole.c                           | 95 ++++++++++++++++++----
 .../selftests/drivers/net/netcons_sysdata.sh       | 51 ++++++++++--
 3 files changed, 153 insertions(+), 21 deletions(-)
---
base-commit: 56794b5862c5a9aefcf2b703257c6fb93f76573e
change-id: 20250217-netcons_current-2c635fa5beda
prerequisite-change-id: 20250212-netdevsim-258d2d628175:v3
prerequisite-patch-id: 4ecfdbc58dd599d2358655e4ad742cbb9dde39f3

Best regards,
-- 
Breno Leitao <leitao(a)debian.org>
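
For readers following the "boolean comparison" and "entry count" items in this series, here is a minimal userspace sketch of the idea, not the code from the patches. The enum values and the helper names sysdata_enabled(), sysdata_entry_count() and sysdata_would_change() are hypothetical; only the SYSDATA_ prefix comes from the cover letter. It shows why a raw bit test (which may evaluate to 2, 4, ...) is normalized to 0/1 before being compared against a bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; only the SYSDATA_ prefix is from the series. */
enum sysdata_feature {
	SYSDATA_CPU_NR   = 1u << 0,
	SYSDATA_TASKNAME = 1u << 1,
};

/* Normalize a bit test to a strict 0/1 value before any comparison. */
static bool sysdata_enabled(uint32_t fields, enum sysdata_feature f)
{
	assert(f != 0);
	return !!(fields & f);
}

/* Count how many sysdata entries the enabled features would contribute. */
static unsigned int sysdata_entry_count(uint32_t fields)
{
	unsigned int n = 0;

	if (sysdata_enabled(fields, SYSDATA_CPU_NR))
		n++;
	if (sysdata_enabled(fields, SYSDATA_TASKNAME))
		n++;
	return n;
}

/* True only when the requested state differs from the current one. */
static bool sysdata_would_change(uint32_t fields, enum sysdata_feature f,
				 bool want)
{
	return sysdata_enabled(fields, f) != want;
}
```

Without the normalization, comparing `want` (0 or 1) against `fields & SYSDATA_TASKNAME` (0 or 2) would report a change even when the feature is already enabled.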

5 months, 3 weeks


[PATCH net-next 0/7] netconsole: Add taskname sysdata support

by Breno Leitao

This patchset introduces a new feature to the netconsole extradata subsystem that enables the inclusion of the current task's name in the sysdata output of netconsole messages.

This enhancement is particularly valuable for large-scale deployments, such as Meta's, where netconsole collects messages from millions of servers and stores them in a data warehouse for analysis. Engineers often rely on these messages to investigate issues and assess kernel health.

One common challenge we face is determining the context in which a particular message was generated. By including the task name (task->comm) with each message, this feature provides a direct answer to the frequently asked question: "What was running when this message was generated?"

This added context will significantly improve our ability to diagnose and troubleshoot issues, making it easier to interpret output of netconsole.

The patchset consists of seven patches that implement the following changes:

 * Refactor CPU number formatting into a separate function
 * Prefix CPU_NR sysdata feature with SYSDATA_
 * Add configfs controls for taskname sysdata feature
 * Add taskname to extradata entry count
 * Add support for including task name in netconsole's extra data output
 * Document the task name feature in Documentation/networking/netconsole.rst
 * Add test coverage for the task name feature to the existing sysdata selftest script

These changes allow users to enable or disable the task name feature via configfs and provide additional context for kernel messages by showing which task generated each console message.

I have tested these patches on some servers and they seem to work as expected.

Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Breno Leitao (7):
      netconsole: prefix CPU_NR sysdata feature with SYSDATA_
      netconsole: refactor CPU number formatting into separate function
      netconsole: add taskname to extradata entry count
      netconsole: add configfs controls for taskname sysdata feature
      netconsole: add task name to extra data fields
      netconsole: docs: document the task name feature
      netconsole: selftest: add task name append testing

 Documentation/networking/netconsole.rst            | 28 +++++++
 drivers/net/netconsole.c                           | 98 ++++++++++++++++++----
 .../selftests/drivers/net/netcons_sysdata.sh       | 51 +++++++++--
 3 files changed, 156 insertions(+), 21 deletions(-)
---
base-commit: bb3bb6c92e5719c0f5d7adb9d34db7e76705ac33
change-id: 20250217-netcons_current-2c635fa5beda
prerequisite-change-id: 20250212-netdevsim-258d2d628175:v3
prerequisite-patch-id: 4ecfdbc58dd599d2358655e4ad742cbb9dde39f3

Best regards,
-- 
Breno Leitao <leitao(a)debian.org>
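
As a rough illustration of what "including the task name (task->comm) with each message" could look like, here is a userspace sketch that appends a task-name entry to an extra-data buffer. It is not the kernel code: append_taskname() is a hypothetical helper, and the " taskname=<comm>" line format is an assumption made for this example. TASK_COMM_LEN mirrors the kernel's 16-byte limit on task->comm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TASK_COMM_LEN 16 /* size of task->comm in the kernel, including NUL */

/*
 * Append " taskname=<comm>\n" to buf at offset without writing past size.
 * Returns the new offset; on truncation the offset is clamped to size - 1,
 * which is where the terminating NUL ends up.
 */
static size_t append_taskname(char *buf, size_t size, size_t offset,
			      const char *comm)
{
	int n;

	assert(buf != NULL && comm != NULL && offset < size);

	/* %.*s caps the name at what task->comm could actually hold. */
	n = snprintf(buf + offset, size - offset, " taskname=%.*s\n",
		     TASK_COMM_LEN - 1, comm);
	if (n < 0)
		return offset;		/* encoding error: leave offset alone */
	if ((size_t)n >= size - offset)
		return size - 1;	/* output was truncated */
	return offset + (size_t)n;
}
```

A caller would keep a running offset and call this after the other entries have been written, so that each enabled entry lands on its own line in the extra-data area.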

5 months, 3 weeks

