A bug was identified where the KTAP below caused an infinite loop:
TAP version 13
ok 4 test_case
1..4
The infinite loop was caused by the parser not parsing a test plan
line when it follows a test result line.
Fix this bug by parsing the test plan line in that case to avoid the
infinite loop.
Signed-off-by: Rae Moar <rmoar(a)google.com>
---
Changes since v2:
- None; a test is added in the second patch
tools/testing/kunit/kunit_parser.py | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
index 29fc27e8949b..da53a709773a 100644
--- a/tools/testing/kunit/kunit_parser.py
+++ b/tools/testing/kunit/kunit_parser.py
@@ -759,7 +759,7 @@ def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest:
# If parsing the main/top-level test, parse KTAP version line and
# test plan
test.name = "main"
- ktap_line = parse_ktap_header(lines, test, printer)
+ parse_ktap_header(lines, test, printer)
test.log.extend(parse_diagnostic(lines))
parse_test_plan(lines, test)
parent_test = True
@@ -768,13 +768,12 @@ def parse_test(lines: LineStream, expected_num: int, log: List[str], is_subtest:
# the KTAP version line and/or subtest header line
ktap_line = parse_ktap_header(lines, test, printer)
subtest_line = parse_test_header(lines, test)
+ test.log.extend(parse_diagnostic(lines))
+ parse_test_plan(lines, test)
parent_test = (ktap_line or subtest_line)
if parent_test:
- # If KTAP version line and/or subtest header is found, attempt
- # to parse test plan and print test header
- test.log.extend(parse_diagnostic(lines))
- parse_test_plan(lines, test)
print_test_header(test, printer)
+
expected_count = test.expected_count
subtests = []
test_num = 1
base-commit: 0619a4868fc1b32b07fb9ed6c69adc5e5cf4e4b2
--
2.49.0.rc1.451.g8f38331e32-goog
Here are a few cleanups, preparation work for the new PM ops, and sysctl
knobs.
- Patch 1: reorg: move generic NL code used by all PMs to pm_netlink.c.
- Patch 2: use kmemdup() instead of kmalloc + copy.
- Patch 3: small cleanup to use pm var instead of msk->pm.
- Patch 4: reorg: id_avail_bitmap is only used by the in-kernel PM.
- Patch 5: use struct_group to easily reset a subset of PM data vars.
- Patch 6: introduce the minimal skeleton for the new PM ops.
- Patch 7: register in-kernel and userspace PM ops.
- Patch 8: new net.mptcp.path_manager sysctl knob, deprecating pm_type.
- Patch 9: map the new path_manager sysctl knob with pm_type.
- Patch 10: map the old pm_type sysctl knob with path_manager.
- Patch 11: new net.mptcp.available_path_managers sysctl knob.
- Patch 12: new test to validate path_manager and pm_type mapping.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Geliang Tang (11):
mptcp: pm: in-kernel: use kmemdup helper
mptcp: pm: use pm variable instead of msk->pm
mptcp: pm: only fill id_avail_bitmap for in-kernel pm
mptcp: pm: add struct_group in mptcp_pm_data
mptcp: pm: define struct mptcp_pm_ops
mptcp: pm: register in-kernel and userspace PM
mptcp: sysctl: set path manager by name
mptcp: sysctl: map path_manager to pm_type
mptcp: sysctl: map pm_type to path_manager
mptcp: sysctl: add available_path_managers
selftests: mptcp: add pm sysctl mapping tests
Matthieu Baerts (NGI0) (1):
mptcp: pm: split netlink and in-kernel init
Documentation/networking/mptcp-sysctl.rst | 23 +++++
include/net/mptcp.h | 14 +++
net/mptcp/ctrl.c | 113 +++++++++++++++++++++-
net/mptcp/pm.c | 97 ++++++++++++++++---
net/mptcp/pm_kernel.c | 16 +--
net/mptcp/pm_netlink.c | 6 ++
net/mptcp/pm_userspace.c | 10 ++
net/mptcp/protocol.h | 17 ++++
tools/testing/selftests/net/mptcp/userspace_pm.sh | 30 +++++-
9 files changed, 301 insertions(+), 25 deletions(-)
---
base-commit: e016cf5f39e9c53e274a7b7122a949d8839b8782
change-id: 20250312-net-next-mptcp-pm-ops-intro-01510135cd5e
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
On powerpc platforms with a default page size greater than 4096,
there was an inconsistency in the fragment size calculation.
This caused the BPF selftest xdp_adjust_tail/xdp_adjust_frags_tail_grow
to fail on powerpc.
The issue occurred because the fragment buffer size in
bpf_prog_test_run_xdp() was set to 4096, while the actual data size in
the fragment within the shared skb was checked against PAGE_SIZE
(65536 on powerpc) in min_t, causing it to exceed 4096 and be set
accordingly. This discrepancy led to an overflow when
bpf_xdp_frags_increase_tail() checked for tailroom, as skb_frag_size(frag)
could be greater than rxq->frag_size (when PAGE_SIZE > 4096).
This change fixes:
1. test_run, by getting the correct arch-dependent PAGE_SIZE.
2. the selftest, by calculating the tailroom and getting the correct PAGE_SIZE.
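For illustration, a minimal userspace sketch (illustrative constants only,
not the kernel code) of how the per-fragment data size can exceed the
hardcoded 4096-byte frag_size on a 64K-page system:

#include <stdio.h>

int main(void)
{
	unsigned int page_size = 65536; /* powerpc 64K base pages */
	unsigned int frag_size = 4096;  /* hardcoded buffer size before the fix */
	unsigned int want = 9000;       /* data requested for one fragment */

	/* mirrors the min_t(u32, PAGE_SIZE, ...) bound: limited by PAGE_SIZE,
	 * not by the 4096-byte fragment buffer */
	unsigned int frag_data = want < page_size ? want : page_size;

	/* tailroom computed against rxq->frag_size goes negative */
	long tailroom = (long)frag_size - (long)frag_data;

	printf("frag_data=%u frag_size=%u tailroom=%ld\n",
	       frag_data, frag_size, tailroom);
	return 0;
}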
Changes:
v1 -> v2:
* Address comments from Alexander
* Use dynamic page size, cacheline size and size of
struct skb_shared_info to calculate parameters.
* Fixed both test_run and selftest.
v1: https://lore.kernel.org/all/20250122183720.1411176-1-skb99@linux.ibm.com/
Saket Kumar Bhaskar (2):
bpf, test_run: Replace hardcoded page size with dynamic PAGE_SIZE in
test_run
selftests/bpf: Refactor xdp_adjust_tail selftest with dynamic sizing
.../bpf/prog_tests/xdp_adjust_tail.c | 160 +++++++++++++-----
.../bpf/progs/test_xdp_adjust_tail_grow.c | 41 +++--
2 files changed, 149 insertions(+), 52 deletions(-)
--
2.43.5
Currently there is no means of determining whether a given page in a mapping
range is designated a guard region (as installed via madvise() using the
MADV_GUARD_INSTALL flag).
This is generally not an issue, but in some instances users may wish to
determine whether this is the case.
This series adds this ability via /proc/$pid/pagemap, updates the
documentation and adds a self test to assert that this functions correctly.
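For context, the check from userspace boils down to reading the 64-bit
pagemap entry for a virtual address and testing the new bit. A minimal
sketch, assuming the bit number used by this series (called PM_GUARD_REGION
below, taken as bit 58; see the pagemap.rst update in patch 1):

#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define PM_GUARD_REGION (1ULL << 58) /* assumed bit number, per this series */

/* Returns 1 if the page backing addr is a guard region, 0 if not, -1 on error. */
static int is_guard_region(void *addr)
{
	uint64_t entry = 0;
	long psize = sysconf(_SC_PAGESIZE);
	off_t off = ((uintptr_t)addr / psize) * sizeof(entry);
	int fd = open("/proc/self/pagemap", O_RDONLY);

	if (fd < 0)
		return -1;
	if (pread(fd, &entry, sizeof(entry), off) != sizeof(entry)) {
		close(fd);
		return -1;
	}
	close(fd);
	return !!(entry & PM_GUARD_REGION);
}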
Lorenzo Stoakes (2):
fs/proc/task_mmu: add guard region bit to pagemap
tools/selftests: add guard region test for /proc/$pid/pagemap
Documentation/admin-guide/mm/pagemap.rst | 3 +-
fs/proc/task_mmu.c | 6 ++-
tools/testing/selftests/mm/guard-regions.c | 47 ++++++++++++++++++++++
tools/testing/selftests/mm/vm_util.h | 1 +
4 files changed, 55 insertions(+), 2 deletions(-)
--
2.48.1
Hi all,
This patch series continues the work to migrate the script tests into
prog_tests.
test_xdp_vlan.sh tests the ability of an XDP program to modify the VLAN
ids on the fly. This isn't currently covered by any other test in the
test_progs framework, so I add a new file prog_tests/xdp_vlan.c that does
the exact same tests (same network topology, same BPF programs) and
remove the script.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
---
Bastien Curutchet (eBPF Foundation) (2):
selftests/bpf: test_xdp_vlan: Rename BPF sections
selftests/bpf: Migrate test_xdp_vlan.sh into test_progs
tools/testing/selftests/bpf/Makefile | 4 +-
tools/testing/selftests/bpf/prog_tests/xdp_vlan.c | 175 ++++++++++++++++
tools/testing/selftests/bpf/progs/test_xdp_vlan.c | 20 +-
tools/testing/selftests/bpf/test_xdp_vlan.sh | 233 ---------------------
.../selftests/bpf/test_xdp_vlan_mode_generic.sh | 9 -
.../selftests/bpf/test_xdp_vlan_mode_native.sh | 9 -
6 files changed, 186 insertions(+), 264 deletions(-)
---
base-commit: a814b9be27fb3c3f49343aee4b015b76f5875558
change-id: 20250130-xdp_vlan-e825cc4df14a
Best regards,
--
Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
When the rseq UAPI header is included, 'union rseq' clashes with 'struct
rseq'. This is not an issue in the rseq selftests themselves, but it does
break the KVM selftests that also include this file.
Rename 'union rseq' to 'union rseq_tls' to fix this.
Fixes: e6644c967d3c ("rseq/selftests: Ensure the rseq ABI TLS is actually 1024 bytes")
Reported-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Michael Jeanson <mjeanson(a)efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
---
tools/testing/selftests/rseq/rseq.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index 6d8997d815ef..663a9cef1952 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -75,13 +75,13 @@ static int rseq_ownership;
* Use a union to ensure we allocate a TLS area of 1024 bytes to accomodate an
* rseq registration that is larger than the current rseq ABI.
*/
-union rseq {
+union rseq_tls {
struct rseq_abi abi;
char dummy[RSEQ_THREAD_AREA_ALLOC_SIZE];
};
static
-__thread union rseq __rseq __attribute__((tls_model("initial-exec"))) = {
+__thread union rseq_tls __rseq __attribute__((tls_model("initial-exec"))) = {
.abi = {
.cpu_id = RSEQ_ABI_CPU_ID_UNINITIALIZED,
},
--
2.43.0
Conntrack bridge only tracks untagged and 802.1q-tagged packets.
To make the bridge-fastpath experience more similar to the
forward-fastpath experience, add double vlan, pppoe and pppoe-in-q
tagged packets to bridge conntrack and to bridge filter chain.
Split from patch-set: bridge-fastpath and related improvements v9
Eric Woudstra (3):
netfilter: bridge: Add conntrack double vlan and pppoe
netfilter: nft_chain_filter: Add bridge double vlan and pppoe
selftests: netfilter: Add conntrack_bridge.sh
net/bridge/netfilter/nf_conntrack_bridge.c | 83 +++++++--
net/netfilter/nft_chain_filter.c | 20 +-
.../testing/selftests/net/netfilter/Makefile | 1 +
.../net/netfilter/conntrack_bridge.sh | 176 ++++++++++++++++++
4 files changed, 267 insertions(+), 13 deletions(-)
create mode 100755 tools/testing/selftests/net/netfilter/conntrack_bridge.sh
--
2.47.1
Make sure the test cleans up after itself. The XDP off statements
at the end of the test may not be reached.
Fixes: 75cc19c8ff89 ("selftests: drv-net: add xdp cases for ping.py")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
CC: shuah(a)kernel.org
CC: ap420073(a)gmail.com
CC: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/drivers/net/ping.py | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/ping.py b/tools/testing/selftests/drivers/net/ping.py
index 93f4b411b378..fc69bfcc37c4 100755
--- a/tools/testing/selftests/drivers/net/ping.py
+++ b/tools/testing/selftests/drivers/net/ping.py
@@ -7,7 +7,7 @@ from lib.py import ksft_run, ksft_exit
from lib.py import ksft_eq, KsftSkipEx, KsftFailEx
from lib.py import EthtoolFamily, NetDrvEpEnv
from lib.py import bkg, cmd, wait_port_listen, rand_port
-from lib.py import ethtool, ip
+from lib.py import defer, ethtool, ip
remote_ifname=""
no_sleep=False
@@ -60,6 +60,7 @@ no_sleep=False
prog = test_dir + "/../../net/lib/xdp_dummy.bpf.o"
cmd(f"ip link set dev {remote_ifname} mtu 1500", shell=True, host=cfg.remote)
cmd(f"ip link set dev {cfg.ifname} mtu 1500 xdpgeneric obj {prog} sec xdp", shell=True)
+ defer(cmd, f"ip link set dev {cfg.ifname} xdpgeneric off")
if no_sleep != True:
time.sleep(10)
@@ -68,7 +69,9 @@ no_sleep=False
test_dir = os.path.dirname(os.path.realpath(__file__))
prog = test_dir + "/../../net/lib/xdp_dummy.bpf.o"
cmd(f"ip link set dev {remote_ifname} mtu 9000", shell=True, host=cfg.remote)
+ defer(ip, f"link set dev {remote_ifname} mtu 1500", host=cfg.remote)
ip("link set dev %s mtu 9000 xdpgeneric obj %s sec xdp.frags" % (cfg.ifname, prog))
+ defer(ip, f"link set dev {cfg.ifname} mtu 1500 xdpgeneric off")
if no_sleep != True:
time.sleep(10)
@@ -78,6 +81,7 @@ no_sleep=False
prog = test_dir + "/../../net/lib/xdp_dummy.bpf.o"
cmd(f"ip link set dev {remote_ifname} mtu 1500", shell=True, host=cfg.remote)
cmd(f"ip -j link set dev {cfg.ifname} mtu 1500 xdp obj {prog} sec xdp", shell=True)
+ defer(ip, f"link set dev {cfg.ifname} mtu 1500 xdp off")
xdp_info = ip("-d link show %s" % (cfg.ifname), json=True)[0]
if xdp_info['xdp']['mode'] != 1:
"""
@@ -94,10 +98,11 @@ no_sleep=False
test_dir = os.path.dirname(os.path.realpath(__file__))
prog = test_dir + "/../../net/lib/xdp_dummy.bpf.o"
cmd(f"ip link set dev {remote_ifname} mtu 9000", shell=True, host=cfg.remote)
+ defer(ip, f"link set dev {remote_ifname} mtu 1500", host=cfg.remote)
try:
cmd(f"ip link set dev {cfg.ifname} mtu 9000 xdp obj {prog} sec xdp.frags", shell=True)
+ defer(ip, f"link set dev {cfg.ifname} mtu 1500 xdp off")
except Exception as e:
- cmd(f"ip link set dev {remote_ifname} mtu 1500", shell=True, host=cfg.remote)
raise KsftSkipEx('device does not support native-multi-buffer XDP')
if no_sleep != True:
@@ -111,6 +116,7 @@ no_sleep=False
cmd(f"ip link set dev {cfg.ifname} xdpoffload obj {prog} sec xdp", shell=True)
except Exception as e:
raise KsftSkipEx('device does not support offloaded XDP')
+ defer(ip, f"link set dev {cfg.ifname} xdpoffload off")
cmd(f"ip link set dev {remote_ifname} mtu 1500", shell=True, host=cfg.remote)
if no_sleep != True:
@@ -157,7 +163,6 @@ no_sleep=False
_test_v4(cfg)
_test_v6(cfg)
_test_tcp(cfg)
- ip("link set dev %s xdpgeneric off" % cfg.ifname)
def test_xdp_generic_mb(cfg, netnl) -> None:
_set_xdp_generic_mb_on(cfg)
@@ -169,7 +174,6 @@ no_sleep=False
_test_v4(cfg)
_test_v6(cfg)
_test_tcp(cfg)
- ip("link set dev %s xdpgeneric off" % cfg.ifname)
def test_xdp_native_sb(cfg, netnl) -> None:
_set_xdp_native_sb_on(cfg)
@@ -181,7 +185,6 @@ no_sleep=False
_test_v4(cfg)
_test_v6(cfg)
_test_tcp(cfg)
- ip("link set dev %s xdp off" % cfg.ifname)
def test_xdp_native_mb(cfg, netnl) -> None:
_set_xdp_native_mb_on(cfg)
@@ -193,14 +196,12 @@ no_sleep=False
_test_v4(cfg)
_test_v6(cfg)
_test_tcp(cfg)
- ip("link set dev %s xdp off" % cfg.ifname)
def test_xdp_offload(cfg, netnl) -> None:
_set_xdp_offload_on(cfg)
_test_v4(cfg)
_test_v6(cfg)
_test_tcp(cfg)
- ip("link set dev %s xdpoffload off" % cfg.ifname)
def main() -> None:
with NetDrvEpEnv(__file__) as cfg:
@@ -213,7 +214,6 @@ no_sleep=False
test_xdp_native_mb,
test_xdp_offload],
args=(cfg, EthtoolFamily()))
- set_interface_init(cfg)
ksft_exit()
--
2.48.1
Unmapping virtual machine guest memory from the host kernel's direct map
is a successful mitigation against Spectre-style transient execution
issues: If the kernel page tables do not contain entries pointing to
guest memory, then any attempted speculative read through the direct map
will necessarily be blocked by the MMU before any observable
microarchitectural side-effects happen. This means that Spectre-gadgets
and similar cannot be used to target virtual machine memory. Roughly 60%
of speculative execution issues fall into this category [1, Table 1].
This patch series extends guest_memfd with the ability to remove its
memory from the host kernel's direct map, to be able to attain the above
protection for KVM guests running inside guest_memfd.
=== Changes to RFC v3 ===
- Settle relationship between direct map removal and shared/private
memory in guest_memfd (David H.)
- Omit TLB flushes upon direct map removal again
- Settle uABI for how KVM accesses guest memory in non-CoCo guest_memfd
VMs (upstream guest_memfd calls)
- Add selftests that exercise the codepaths of non-CoCo guest_memfd VMs
Lastly, this series is rebased on top of Fuad's v4 for shared mapping of
guest_memfd [2]. The KVM parts should also apply on top of 0ad2507d5d93
("Linux 6.14-rc3"), but the selftest patches need Fuad's series as base.
=== Overview ===
guest_memfd should be usable for "non-CoCo" VMs - virtual machines where
host userspace is trusted (e.g. can have access to all of guest memory),
but which should still be hardened against speculative execution attacks
(Spectre, etc.) staged through potentially existing gadgets in the host
kernel.
To attain this hardening, unmap guest memory from the host kernel's
address space (e.g. zap direct map entries), while allowing KVM to
continue accessing guest memory through userspace mappings. This works
because KVM already almost always uses userspace mappings whenever KVM
needs to access guest memory - the only parts that require direct map
entries (because they use GUP) are KVM's MMU, and kvm-clock on x86.
Building on top of guest_memfd sidesteps the MMU problem, as for
memslots with KVM_MEM_GUEST_MEMFD set, the MMU consumes fd + offset
directly without going through any VMAs. kvm-clock on the other hand is
not strictly needed (guests boot fine without it), so ignore it for
now.
=== Implementation ===
Make KVM_CREATE_GUEST_MEMFD accept a flag (KVM_GMEM_NO_DIRECT_MAP) that
instructs it to remove newly allocated folios from the host kernel's
direct map immediately after preparation.
Nothing further is needed to make non-CoCo VMs work - particularly, KVM
does not need to be taught any special ways of accessing guest memory if
it is in guest_memfd. Userspace can simply mmap guest_memfd (via
KVM_GMEM_SHARED_MEM added in Fuad's series), and set the memslot's
userspace_addr to this userspace mapping of guest_memfd.
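As a rough sketch of that userspace flow (KVM_GMEM_NO_DIRECT_MAP and the
mmap-ability via KVM_GMEM_SHARED_MEM come from this series and [2], so the
flag value below is a placeholder; vm_fd/gpa/mem_size are assumed and error
handling is omitted):

#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

#ifndef KVM_GMEM_NO_DIRECT_MAP
#define KVM_GMEM_NO_DIRECT_MAP (1ULL << 0) /* placeholder; defined by this series */
#endif

static int setup_gmem_slot(int vm_fd, __u64 gpa, __u64 mem_size)
{
	struct kvm_create_guest_memfd gmem = {
		.size = mem_size,
		.flags = KVM_GMEM_NO_DIRECT_MAP,
	};
	int gmem_fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);

	/* mmap of guest_memfd relies on KVM_GMEM_SHARED_MEM from [2] */
	void *hva = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
			 MAP_SHARED, gmem_fd, 0);

	struct kvm_userspace_memory_region2 region = {
		.slot = 0,
		.flags = KVM_MEM_GUEST_MEMFD,
		.guest_phys_addr = gpa,
		.memory_size = mem_size,
		.userspace_addr = (__u64)hva, /* KVM's non-GUP accesses use this */
		.guest_memfd = gmem_fd,
		.guest_memfd_offset = 0,
	};
	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION2, &region);
}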
=== Open Questions ===
In this patch series, stale TLB entries do not get flushed after direct
map entries are marked as not present. This is fine from a functional
point of view (as the mapping is still valid, it's just temporarily not
supposed to be used), but pokes a theoretical hole into the speculation
protection: Something could try to keep alive stale TLB entries for
specific pages until the guest starts using them for sensitive
information, and then stage a Spectre attack on that memory, despite it
being unmapped. In practice, this would require knowing in advance, at
gmem fault-time, which pages will eventually contain information of
interest, and then preventing these specific TLB entries from getting
naturally evicted (where the number of pages that can be targeted like
this is limited by the size of the TLB). These seem to be fairly
difficult requisites to fulfill, but we were wondering what the
community thinks.
=== Summary ===
Patch 1 adds a struct address_space flag that indicates that folios in a
mapping are direct map removed, and threads it through mm code to ensure
direct map removed folios don't end up in places where they can cause
mayhem (particularly, we reject them in get_user_pages). Since these
checks end up being duplicates of already existing checks for secretmem
folios, patch 2 unifies the two by using the new address_space flag for
secretmem mappings. Patches 3 through 5 are about support for direct map
removal in guest_memfd, while patches 6 through 12 are about testing the
non-CoCo setup in KVM selftests, with patches 6 through 9 being
preparatory, and patches 10 through 12 adding the actual test cases.
[1]: https://download.vusec.net/papers/quarantine_raid23.pdf
[2]: https://lore.kernel.org/kvm/20250218172500.807733-1-tabba@google.com/
[RFC v1]: https://lore.kernel.org/kvm/20240709132041.3625501-1-roypat@amazon.co.uk/
[RFC v2]: https://lore.kernel.org/kvm/20240910163038.1298452-1-roypat@amazon.co.uk/
[RFC v3]: https://lore.kernel.org/kvm/20241030134912.515725-1-roypat@amazon.co.uk/
Patrick Roy (12):
mm: introduce AS_NO_DIRECT_MAP
mm/secretmem: set AS_NO_DIRECT_MAP instead of special-casing
KVM: guest_memfd: Add flag to remove from direct map
KVM: Add capability to discover KVM_GMEM_NO_DIRECT_MAP support
KVM: Documentation: document KVM_GMEM_NO_DIRECT_MAP flag
KVM: selftests: load elf via bounce buffer
KVM: selftests: set KVM_MEM_GUEST_MEMFD in vm_mem_add() if guest_memfd
!= -1
KVM: selftests: Add guest_memfd based vm_mem_backing_src_types
KVM: selftests: stuff vm_mem_backing_src_type into vm_shape
KVM: selftests: adjust test_create_guest_memfd_invalid
KVM: selftests: set KVM_GMEM_NO_DIRECT_MAP in mem conversion tests
KVM: selftests: Test guest execution from direct map removed gmem
Documentation/virt/kvm/api.rst | 13 ++++
include/linux/pagemap.h | 16 +++++
include/linux/secretmem.h | 18 ------
include/uapi/linux/kvm.h | 3 +
lib/buildid.c | 4 +-
mm/gup.c | 14 +---
mm/mlock.c | 2 +-
mm/secretmem.c | 6 +-
.../testing/selftests/kvm/guest_memfd_test.c | 2 +-
.../testing/selftests/kvm/include/kvm_util.h | 29 ++++++---
.../testing/selftests/kvm/include/test_util.h | 8 +++
tools/testing/selftests/kvm/lib/elf.c | 8 +--
tools/testing/selftests/kvm/lib/io.c | 23 +++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 64 +++++++++++--------
tools/testing/selftests/kvm/lib/test_util.c | 8 +++
tools/testing/selftests/kvm/lib/x86/sev.c | 1 +
.../selftests/kvm/pre_fault_memory_test.c | 1 +
.../selftests/kvm/set_memory_region_test.c | 40 ++++++++++++
.../kvm/x86/private_mem_conversions_test.c | 7 +-
virt/kvm/guest_memfd.c | 24 ++++++-
virt/kvm/kvm_main.c | 5 ++
21 files changed, 214 insertions(+), 82 deletions(-)
base-commit: da40655874b54a2b563f8ceb3ed839c6cd38e0b4
--
2.48.1
$half_ufd_size_MB is supposed to be half of the available hugetlb memory
expressed in MB. But previously it was calculated in pages since
$freepgs is the number of free pages.
When huge pages are 2M it doesn't make a whole lot of difference; the
number of pages that get used is just halved. But on arm64 with 16K or
64K base pages, the PMD size (and default hugetlb size) is 32M and 512M
respectively. So in this case we end up passing a number of MB that is
smaller than a single hugetlb page and the test raises an error.
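For a concrete feel for the numbers, a small sketch assuming 64K base pages
(512M default hugetlb size) and 8 free huge pages:

#include <stdio.h>

int main(void)
{
	unsigned long freepgs = 8;         /* free huge pages */
	unsigned long hpgsize_KB = 524288; /* 512M huge page, in KB */

	/* old: page count halved and treated as MB -> 4, below one huge page */
	unsigned long old_MB = freepgs / 2;
	/* new: convert to KB, halve, express in MB -> 2048 */
	unsigned long new_MB = (freepgs * hpgsize_KB / 2) / 1024;

	printf("old=%luMB new=%luMB\n", old_MB, new_MB);
	return 0;
}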
Fixes: 2e47a445d7b3 ("selftests/mm: run_vmtests.sh: fix hugetlb mem size calculation")
Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com>
---
tools/testing/selftests/mm/run_vmtests.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index da7e26668103..14fa9d40d574 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -304,7 +304,7 @@ uffd_stress_bin=./uffd-stress
CATEGORY="userfaultfd" run_test ${uffd_stress_bin} anon 20 16
# Hugetlb tests require source and destination huge pages. Pass in half
# the size of the free pages we have, which is used for *each*.
-half_ufd_size_MB=$((freepgs / 2))
+half_ufd_size_MB=$(((freepgs * hpgsize_KB / 2) / 1024))
CATEGORY="userfaultfd" run_test ${uffd_stress_bin} hugetlb "$half_ufd_size_MB" 32
CATEGORY="userfaultfd" run_test ${uffd_stress_bin} hugetlb-private "$half_ufd_size_MB" 32
CATEGORY="userfaultfd" run_test ${uffd_stress_bin} shmem 20 16
--
2.43.0
Adding the aligned(1024) attribute to the definition of __rseq_abi did
not increase its size to 1024; for this attribute to impact the size of
__rseq_abi, it would need to be added to the declaration of 'struct
rseq_abi'. We only want to increase the size of the TLS allocation to
ensure registration will succeed with future extended ABI. Use a union
with a dummy member to ensure we allocate 1024 bytes.
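This is plain C semantics; a minimal sketch (not the selftest code) showing
why the variable attribute does not grow the allocation while the union does:

/* A 32-byte stand-in for struct rseq_abi. */
struct abi_like {
	char data[32];
};

/* aligned() on the variable changes its alignment, not its size. */
static struct abi_like plain __attribute__((aligned(1024)));

/* A union with a dummy member forces the full 1024-byte object. */
union abi_tls {
	struct abi_like abi;
	char dummy[1024];
};
static union abi_tls padded;

_Static_assert(sizeof(plain) == 32, "sizeof ignores the variable attribute");
_Static_assert(sizeof(padded) == 1024, "the union reserves the whole area");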
Signed-off-by: Michael Jeanson <mjeanson(a)efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
---
tools/testing/selftests/rseq/rseq.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index f6156790c3b4..aa9ae866bc1a 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -71,9 +71,20 @@ static int rseq_ownership;
/* Original struct rseq allocation size is 32 bytes. */
#define ORIG_RSEQ_ALLOC_SIZE 32
+/*
+ * Use a union to ensure we allocate a TLS area of 1024 bytes to accomodate an
+ * rseq registration that is larger than the current rseq ABI.
+ */
+union rseq {
+ struct rseq_abi abi;
+ char dummy[RSEQ_THREAD_AREA_ALLOC_SIZE];
+};
+
static
-__thread struct rseq_abi __rseq_abi __attribute__((tls_model("initial-exec"), aligned(RSEQ_THREAD_AREA_ALLOC_SIZE))) = {
- .cpu_id = RSEQ_ABI_CPU_ID_UNINITIALIZED,
+__thread union rseq __rseq __attribute__((tls_model("initial-exec"))) = {
+ .abi = {
+ .cpu_id = RSEQ_ABI_CPU_ID_UNINITIALIZED,
+ },
};
static int sys_rseq(struct rseq_abi *rseq_abi, uint32_t rseq_len,
@@ -149,7 +160,7 @@ int rseq_register_current_thread(void)
/* Treat libc's ownership as a successful registration. */
return 0;
}
- rc = sys_rseq(&__rseq_abi, get_rseq_min_alloc_size(), 0, RSEQ_SIG);
+ rc = sys_rseq(&__rseq.abi, get_rseq_min_alloc_size(), 0, RSEQ_SIG);
if (rc) {
/*
* After at least one thread has registered successfully
@@ -183,7 +194,7 @@ int rseq_unregister_current_thread(void)
/* Treat libc's ownership as a successful unregistration. */
return 0;
}
- rc = sys_rseq(&__rseq_abi, get_rseq_min_alloc_size(), RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG);
+ rc = sys_rseq(&__rseq.abi, get_rseq_min_alloc_size(), RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG);
if (rc)
return -1;
return 0;
@@ -249,7 +260,7 @@ void rseq_init(void)
rseq_ownership = 1;
/* Calculate the offset of the rseq area from the thread pointer. */
- rseq_offset = (void *)&__rseq_abi - rseq_thread_pointer();
+ rseq_offset = (void *)&__rseq.abi - rseq_thread_pointer();
/* rseq flags are deprecated, always set to 0. */
rseq_flags = 0;
--
2.43.0
Hi all,
This patch series continues the work to migrate the script tests into
prog_tests.
The test_xsk.sh script tests lots of AF_XDP use cases. The tests it uses
are defined in xksxceiver.c. As this script is used to test real
hardware, the goal here is to keep it as is and only integrate the
tests on veth peers into the test_progs framework.
Three tests are flaky on s390, so they won't be integrated into test_progs
yet (I'm currently trying to make them more robust).
PATCH 1 & 2 fix some small issues in xskxceiver.c
PATCH 3 to 9 rework xskxceiver to ease the integration into the
test_progs framework. Two main points are addressed in them:
- wrap kselftest calls behind macros to ease their replacement later
- handle all errors to release resources instead of calling exit() when
any error occurs.
PATCH 10 extracts test_xsk[.c/.h] from xskxceiver[.c/.h] to make the
tests available to test_progs
PATCH 11 enables kselftest de-activation
PATCH 12 isolates the flaky tests
PATCH 13 integrates the non-flaky tests into the test_progs framework
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
---
Bastien Curutchet (eBPF Foundation) (13):
selftests/bpf: test_xsk: Initialize bitmap before use
selftests/bpf: test_xsk: Fix memory leaks
selftests/bpf: test_xsk: Wrap ksft_*() behind macros
selftests/bpf: test_xsk: Add return value to init_iface()
selftests/bpf: test_xsk: Don't exit immediately when xsk_attach fails
selftests/bpf: test_xsk: Don't exit immediately when gettimeofday fails
selftests/bpf: test_xsk: Don't exit immediately when workers fail
selftests/bpf: test_xsk: Don't exit immediately if validate_traffic fails
selftests/bpf: test_xsk: Don't exit immediately on allocation failures
selftests/bpf: test_xsk: Split xskxceiver
selftests/bpf: test_xsk: Make kselftest dependency optional
selftests/bpf: test_xsk: Isolate flaky tests
selftests/bpf: test_xsk: Integrate test_xsk.c to test_progs framework
tools/testing/selftests/bpf/Makefile | 13 +-
tools/testing/selftests/bpf/prog_tests/test_xsk.c | 2416 ++++++++++++++++++++
tools/testing/selftests/bpf/prog_tests/test_xsk.h | 299 +++
tools/testing/selftests/bpf/prog_tests/xsk.c | 178 ++
tools/testing/selftests/bpf/xskxceiver.c | 2543 +--------------------
tools/testing/selftests/bpf/xskxceiver.h | 153 --
6 files changed, 3021 insertions(+), 2581 deletions(-)
---
base-commit: 720c696b16a1b1680f64cac9b3bb9e312a23ac47
change-id: 20250218-xsk-0cf90e975d14
Best regards,
--
Bastien Curutchet (eBPF Foundation) <bastien.curutchet(a)bootlin.com>
When using 'make LLVM=1 W=1 -C tools/testing/selftests/pid_namespace'
with clang-20, I've noticed the following:
pid_max.c:42:8: error: call to undeclared function 'mount'; ISO
C99 and later do not support implicit function declarations
[-Wimplicit-function-declaration]
42 | ret = mount("", "/", NULL, MS_PRIVATE | MS_REC, 0);
| ^
pid_max.c:42:29: error: use of undeclared identifier 'MS_PRIVATE'
42 | ret = mount("", "/", NULL, MS_PRIVATE | MS_REC, 0);
| ^
...
So include '<sys/mount.h>' to add all of the required declarations.
Signed-off-by: Dmitry Antipov <dmantipov(a)yandex.ru>
---
tools/testing/selftests/pid_namespace/pid_max.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/pid_namespace/pid_max.c b/tools/testing/selftests/pid_namespace/pid_max.c
index 51c414faabb0..972bedc475f1 100644
--- a/tools/testing/selftests/pid_namespace/pid_max.c
+++ b/tools/testing/selftests/pid_namespace/pid_max.c
@@ -11,6 +11,7 @@
#include <string.h>
#include <syscall.h>
#include <sys/wait.h>
+#include <sys/mount.h>
#include "../kselftest_harness.h"
#include "../pidfd/pidfd.h"
--
2.48.1
This patchset adds ftrace helper functions and
a new test that makes sure ftrace can trace
a function introduced by a livepatch.
Signed-off-by: Filipe Xavier <felipeaggger(a)gmail.com>
---
Filipe Xavier (2):
selftests: livepatch: add new ftrace helpers functions
selftests: livepatch: test if ftrace can trace a livepatched function
tools/testing/selftests/livepatch/functions.sh | 45 ++++++++++++++++++++++++
tools/testing/selftests/livepatch/test-ftrace.sh | 35 ++++++++++++++++++
2 files changed, 80 insertions(+)
---
base-commit: 848e076317446f9c663771ddec142d7c2eb4cb43
change-id: 20250306-ftrace-sftest-livepatch-60d9dc472235
Best regards,
--
Filipe Xavier <felipeaggger(a)gmail.com>
As the vIOMMU infrastructure series part-3, this introduces a new vEVENTQ
object. The existing FAULT object provides a nice notification pathway to
the user space with a queue already, so let vEVENTQ reuse that.
Mimicking the HWPT structure, add a common EVENTQ structure to support its
derivatives: IOMMUFD_OBJ_FAULT (existing) and IOMMUFD_OBJ_VEVENTQ (new).
An IOMMUFD_CMD_VEVENTQ_ALLOC is introduced to allocate a vEVENTQ object for
vIOMMUs. One vIOMMU can have multiple vEVENTQs of different types, but
cannot support multiple vEVENTQs of the same type.
The forwarding part is fairly simple but might need to replace a physical
device ID with a virtual device ID in a driver-level event data structure.
So, this also adds some helpers for drivers to use.
As usual, this series comes with the selftest coverage for this new ioctl
and with a real world use case in the ARM SMMUv3 driver.
This is on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_veventq-v9
Pairing QEMU branch for testing:
https://github.com/nicolinc/qemu/commits/wip/for_iommufd_veventq-v9
Changelog
v9
* Add Acked-by from Will
* Fix typo in commit logs and reviewer name
* Allow invalid nested STE for C_BAD_STE report
* Drop extra indentation in arm_smmu_handle_event()
* Drop comments in arm_smmu_attach_prepare_vmaster()
v8
https://lore.kernel.org/all/cover.1740504232.git.nicolinc@nvidia.com/
* Add Reviewed-by from Jason and Pranjal
* Fix errno returned in arm_smmu_handle_event()
* Validate domain->type outside of arm_smmu_attach_prepare_vmaster()
* Drop unnecessary vmaster comparison in arm_smmu_attach_commit_vmaster()
v7
https://lore.kernel.org/all/cover.1740238876.git.nicolinc@nvidia.com/
* Rebase on Jason's for-next tree for latest fault.c
* Add Reviewed-by
* Update commit logs
* Add __reserved field sanity
* Skip kfree() on the static header
* Replace "bool on_list" with list_is_last()
* Use u32 for flags in iommufd_vevent_header
* Drop casting in iommufd_viommu_get_vdev_id()
* Update the bounding logic to veventq->sequence
* Add missing cpu_to_le64() around STRTAB_STE_1_MEV
* Reuse veventq->common.lock to fence sequence and num_events
* Rename overflow to lost_events and log it in upon kmalloc failure
* Correct the error handling part in iommufd_veventq_deliver_fetch()
* Add an arm_smmu_clear_vmaster() to simplify identity/blocked domain
attach ops
* Add additional four event records to forward to user space VM, and
update the uAPI doc
* Reuse the existing smmu->streams_mutex lock to fence master->vmaster
pointer, instead of adding a new rwsem
v6
https://lore.kernel.org/all/cover.1737754129.git.nicolinc@nvidia.com/
* Drop supports_veventq viommu op
* Split bug/cosmetics fixes out of the series
* Drop the blocking mutex around copy_to_user()
* Add veventq_depth in uAPI to limit vEVENTQ size
* Revise the documentation for a clear description
* Fix sparse warnings in arm_vmaster_report_event()
* Rework iommufd_viommu_get_vdev_id() to return -ENOENT v.s. 0
* Allow Abort/Bypass STEs to allocate vEVENTQ and set STE.MEV for DoS
mitigations
v5
https://lore.kernel.org/all/cover.1736237481.git.nicolinc@nvidia.com/
* Add Reviewed-by from Baolu
* Reorder the OBJ list as well
* Fix alphabetical order after renaming in v4
* Add supports_veventq viommu op for vEVENTQ type validation
v4
https://lore.kernel.org/all/cover.1735933254.git.nicolinc@nvidia.com/
* Rename "vIRQ" to "vEVENTQ"
* Use flexible array in struct iommufd_vevent
* Add the new ioctl command to union ucmd_buffer
* Fix the alphabetical order in union ucmd_buffer too
* Rename _TYPE_NONE to _TYPE_DEFAULT aligning with vIOMMU naming
v3
https://lore.kernel.org/all/cover.1734477608.git.nicolinc@nvidia.com/
* Rebase on Will's for-joerg/arm-smmu/updates for arm_smmu_event series
* Add "Reviewed-by" lines from Kevin
* Fix typos in comments, kdocs, and jump tags
* Add a patch to sort struct iommufd_ioctl_op
* Update iommufd's userspace-api documentation
* Update uAPI kdoc to quote SMMUv3 official spec
* Drop the unused workqueue in struct iommufd_virq
* Drop might_sleep() in iommufd_viommu_report_irq() helper
* Add missing "break" in iommufd_viommu_get_vdev_id() helper
* Shrink the scope of the vmaster's read lock in SMMUv3 driver
* Pass in two arguments to iommufd_eventq_virq_handler() helper
* Move "!ops || !ops->read" validation into iommufd_eventq_init()
* Move "fault->ictx = ictx" closer to iommufd_ctx_get(fault->ictx)
* Update commit message for arm_smmu_attach_prepare/commit_vmaster()
* Keep "iommufd_fault" as-is and rename "iommufd_eventq_virq" to just
"iommufd_virq"
v2
https://lore.kernel.org/all/cover.1733263737.git.nicolinc@nvidia.com/
* Rebase on v6.13-rc1
* Add IOPF and vIRQ in iommufd.rst (userspace-api)
* Add a proper locking in iommufd_event_virq_destroy
* Add iommufd_event_virq_abort with a lockdep_assert_held
* Rename "EVENT_*" to "EVENTQ_*" to describe the objects better
* Reorganize flows in iommufd_eventq_virq_alloc for abort() to work
* Added struct arm_smmu_vmaster to store vSID upon attaching to a nested
domain, calling a newly added iommufd_viommu_get_vdev_id helper
* Added an arm_vmaster_report_event helper in the arm-smmu-v3-iommufd file
to simplify the routine in arm_smmu_handle_evt() of the main driver
v1
https://lore.kernel.org/all/cover.1724777091.git.nicolinc@nvidia.com/
Thanks!
Nicolin
Nicolin Chen (14):
iommufd/fault: Move two fault functions out of the header
iommufd/fault: Add an iommufd_fault_init() helper
iommufd: Abstract an iommufd_eventq from iommufd_fault
iommufd: Rename fault.c to eventq.c
iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC
iommufd/viommu: Add iommufd_viommu_get_vdev_id helper
iommufd/viommu: Add iommufd_viommu_report_event helper
iommufd/selftest: Require vdev_id when attaching to a nested domain
iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_VEVENT for vEVENTQ
coverage
iommufd/selftest: Add IOMMU_VEVENTQ_ALLOC test coverage
Documentation: userspace-api: iommufd: Update FAULT and VEVENTQ
iommu/arm-smmu-v3: Introduce struct arm_smmu_vmaster
iommu/arm-smmu-v3: Report events that belong to devices attached to
vIOMMU
iommu/arm-smmu-v3: Set MEV bit in nested STE for DoS mitigations
drivers/iommu/iommufd/Makefile | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 36 ++
drivers/iommu/iommufd/iommufd_private.h | 135 +++-
drivers/iommu/iommufd/iommufd_test.h | 10 +
include/linux/iommufd.h | 23 +
include/uapi/linux/iommufd.h | 105 +++
tools/testing/selftests/iommu/iommufd_utils.h | 115 ++++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 60 ++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 80 ++-
drivers/iommu/iommufd/driver.c | 72 +++
drivers/iommu/iommufd/eventq.c | 597 ++++++++++++++++++
drivers/iommu/iommufd/fault.c | 342 ----------
drivers/iommu/iommufd/hw_pagetable.c | 6 +-
drivers/iommu/iommufd/main.c | 7 +
drivers/iommu/iommufd/selftest.c | 54 ++
drivers/iommu/iommufd/viommu.c | 2 +
tools/testing/selftests/iommu/iommufd.c | 36 ++
.../selftests/iommu/iommufd_fail_nth.c | 7 +
Documentation/userspace-api/iommufd.rst | 17 +
19 files changed, 1298 insertions(+), 408 deletions(-)
create mode 100644 drivers/iommu/iommufd/eventq.c
delete mode 100644 drivers/iommu/iommufd/fault.c
base-commit: a05df03a88bc1088be8e9d958f208d6484691e43
--
2.43.0
This picks up from Michal Rostecki's work[0]. Per Michal's guidance I
have omitted Co-authored tags, as the end result is quite different.
Link: https://lore.kernel.org/rust-for-linux/20240819153656.28807-2-vadorovsky@pr… [0]
Closes: https://github.com/Rust-for-Linux/linux/issues/1075
Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Changes in v9:
- Rebase on rust-next.
- Restore `impl Display for BStr` which exists upstream[1].
- Link: https://doc.rust-lang.org/nightly/std/bstr/struct.ByteStr.html#impl-Display… [1]
- Link to v8: https://lore.kernel.org/r/20250203-cstr-core-v8-0-cb3f26e78686@gmail.com
Changes in v8:
- Move `{from,as}_char_ptr` back to `CStrExt`. This reduces the diff
some.
- Restore `from_bytes_with_nul_unchecked_mut`, `to_cstring`.
- Link to v7: https://lore.kernel.org/r/20250202-cstr-core-v7-0-da1802520438@gmail.com
Changes in v7:
- Rebased on mainline.
- Restore functionality added in commit a321f3ad0a5d ("rust: str: add
{make,to}_{upper,lower}case() to CString").
- Used `diff.algorithm patience` to improve diff readability.
- Link to v6: https://lore.kernel.org/r/20250202-cstr-core-v6-0-8469cd6d29fd@gmail.com
Changes in v6:
- Split the work into several commits for ease of review.
- Restore `{from,as}_char_ptr` to allow building on ARM (see commit
message).
- Add `CStrExt` to `kernel::prelude`. (Alice Ryhl)
- Remove `CStrExt::from_bytes_with_nul_unchecked_mut` and restore
`DerefMut for CString`. (Alice Ryhl)
- Rename and hide `kernel::c_str!` to encourage use of C-String
literals.
- Drop implementation and invocation changes in kunit.rs. (Trevor Gross)
- Drop docs on `Display` impl. (Trevor Gross)
- Rewrite docs in the style of the standard library.
- Restore the `test_cstr_debug` unit tests to demonstrate that the
implementation has changed.
Changes in v5:
- Keep the `test_cstr_display*` unit tests.
Changes in v4:
- Provide the `CStrExt` trait with `display()` method, which returns a
`CStrDisplay` wrapper with `Display` implementation. This addresses
the lack of `Display` implementation for `core::ffi::CStr`.
- Provide `from_bytes_with_nul_unchecked_mut()` method in `CStrExt`,
which might be useful and is going to prevent manual, unsafe casts.
- Fix a typo (s/preffered/prefered/).
Changes in v3:
- Fix the commit message.
- Remove redundant braces in `use`, when only one item is imported.
Changes in v2:
- Do not remove `c_str` macro. While it's preferred to use C-string
literals, there are two cases where `c_str` is helpful:
- When working with macros, which already return a Rust string literal
(e.g. `stringify!`).
- When building macros, where we want to take a Rust string literal as an
argument (for caller's convenience), but still use it as a C-string
internally.
- Use Rust literals as arguments in macros (`new_mutex`, `new_condvar`,
`new_mutex`). Use the `c_str` macro to convert these literals to C-string
literals.
- Use `c_str` in kunit.rs for converting the output of `stringify!` to a
`CStr`.
- Remove `DerefMut` implementation for `CString`.
---
Tamir Duberstein (4):
rust: move `CStr`'s `Display` to helper struct
rust: replace `CStr` with `core::ffi::CStr`
rust: replace `kernel::c_str!` with C-Strings
rust: remove core::ffi::CStr reexport
drivers/gpu/drm/drm_panic_qr.rs | 6 +-
drivers/net/phy/ax88796b_rust.rs | 8 +-
drivers/net/phy/qt2025.rs | 6 +-
rust/kernel/device.rs | 7 +-
rust/kernel/devres.rs | 2 +-
rust/kernel/driver.rs | 4 +-
rust/kernel/error.rs | 10 +-
rust/kernel/faux.rs | 5 +-
rust/kernel/firmware.rs | 8 +-
rust/kernel/kunit.rs | 18 +-
rust/kernel/lib.rs | 2 +-
rust/kernel/miscdevice.rs | 5 +-
rust/kernel/net/phy.rs | 12 +-
rust/kernel/of.rs | 5 +-
rust/kernel/pci.rs | 3 +-
rust/kernel/platform.rs | 7 +-
rust/kernel/prelude.rs | 2 +-
rust/kernel/seq_file.rs | 4 +-
rust/kernel/str.rs | 499 +++++++++++++----------------------
rust/kernel/sync.rs | 4 +-
rust/kernel/sync/condvar.rs | 3 +-
rust/kernel/sync/lock.rs | 4 +-
rust/kernel/sync/lock/global.rs | 6 +-
rust/kernel/sync/poll.rs | 1 +
rust/kernel/workqueue.rs | 1 +
rust/macros/module.rs | 2 +-
samples/rust/rust_driver_faux.rs | 4 +-
samples/rust/rust_driver_pci.rs | 4 +-
samples/rust/rust_driver_platform.rs | 4 +-
samples/rust/rust_misc_device.rs | 3 +-
30 files changed, 256 insertions(+), 393 deletions(-)
---
base-commit: 433b1bd6e0a98938105c43c0553f24e0747ef52c
change-id: 20250201-cstr-core-d4b9b69120cf
Best regards,
--
Tamir Duberstein <tamird(a)gmail.com>
This started with a patch that enabled `clippy::ptr_as_ptr`. Benno
Lossin suggested I also look into `clippy::ptr_cast_constness` and I
discovered `clippy::as_ptr_cast_mut`. This series now enables all 3
lints. It also enables `clippy::as_underscore` which ensures other
pointer casts weren't missed. The first commit reduces the need for
pointer casts and is shared with another series[1].
The final patch also enables pointer provenance lints and fixes
violations. See that commit message for details. The build system
portion of that commit is pretty messy but I couldn't find a better way
to convincingly ensure that these lints were applied globally.
Suggestions would be very welcome.
Link: https://lore.kernel.org/all/20250307-no-offset-v1-0-0c728f63b69c@gmail.com/ [1]
Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Changes in v4:
- Add missing SoB. (Benno Lossin)
- Use `without_provenance_mut` in alloc. (Boqun Feng)
- Limit strict provenance lints to the `kernel` crate to avoid complex
logic in the build system. This can be revisited on MSRV >= 1.84.0.
- Rebase on rust-next.
- Link to v3: https://lore.kernel.org/r/20250314-ptr-as-ptr-v3-0-e7ba61048f4a@gmail.com
Changes in v3:
- Fixed clippy warning in rust/kernel/firmware.rs. (kernel test robot)
Link: https://lore.kernel.org/all/202503120332.YTCpFEvv-lkp@intel.com/
- s/as u64/as bindings::phys_addr_t/g. (Benno Lossin)
- Use strict provenance APIs and enable lints. (Benno Lossin)
- Link to v2: https://lore.kernel.org/r/20250309-ptr-as-ptr-v2-0-25d60ad922b7@gmail.com
Changes in v2:
- Fixed typo in first commit message.
- Added additional patches, converted to series.
- Link to v1: https://lore.kernel.org/r/20250307-ptr-as-ptr-v1-1-582d06514c98@gmail.com
---
Tamir Duberstein (6):
rust: retain pointer mut-ness in `container_of!`
rust: enable `clippy::ptr_as_ptr` lint
rust: enable `clippy::ptr_cast_constness` lint
rust: enable `clippy::as_ptr_cast_mut` lint
rust: enable `clippy::as_underscore` lint
rust: use strict provenance APIs
Makefile | 4 +++
init/Kconfig | 3 ++
rust/bindings/lib.rs | 1 +
rust/kernel/alloc.rs | 2 +-
rust/kernel/alloc/allocator_test.rs | 2 +-
rust/kernel/alloc/kvec.rs | 4 +--
rust/kernel/block/mq/operations.rs | 2 +-
rust/kernel/block/mq/request.rs | 7 +++--
rust/kernel/device.rs | 5 +--
rust/kernel/device_id.rs | 2 +-
rust/kernel/devres.rs | 19 ++++++------
rust/kernel/error.rs | 2 +-
rust/kernel/firmware.rs | 3 +-
rust/kernel/fs/file.rs | 2 +-
rust/kernel/io.rs | 16 +++++-----
rust/kernel/kunit.rs | 15 +++++----
rust/kernel/lib.rs | 57 ++++++++++++++++++++++++++++++++--
rust/kernel/list/impl_list_item_mod.rs | 2 +-
rust/kernel/miscdevice.rs | 2 +-
rust/kernel/of.rs | 6 ++--
rust/kernel/pci.rs | 15 +++++----
rust/kernel/platform.rs | 6 ++--
rust/kernel/print.rs | 11 +++----
rust/kernel/rbtree.rs | 23 ++++++--------
rust/kernel/seq_file.rs | 3 +-
rust/kernel/str.rs | 18 +++++------
rust/kernel/sync/poll.rs | 2 +-
rust/kernel/uaccess.rs | 12 ++++---
rust/kernel/workqueue.rs | 12 +++----
rust/uapi/lib.rs | 1 +
30 files changed, 162 insertions(+), 97 deletions(-)
---
base-commit: 2aadc0fc1f85d7a9ed2822ba7ee9f06775eb6d84
change-id: 20250307-ptr-as-ptr-21b1867fc4d4
Best regards,
--
Tamir Duberstein <tamird(a)gmail.com>
As discussed here:
https://lore.kernel.org/lkml/Z9RRkL1hom48z3Tt@google.com/
This code could benefit from some more commentary.
To avoid needing to comment the same thing in multiple places (I guess
more of these SKIPs will need to be added over time; for now I am only
about 20% of the way through Project "Run run_vmtests.sh Successfully"),
add a dummy "skip tests for this specific reason" function that
basically just serves as a hook to hang comments on.
Signed-off-by: Brendan Jackman <jackmanb(a)google.com>
---
To: David Hildenbrand <david(a)redhat.com>
---
tools/testing/selftests/mm/gup_longterm.c | 6 +-----
tools/testing/selftests/mm/map_populate.c | 8 +++-----
tools/testing/selftests/mm/vm_util.h | 18 ++++++++++++++++++
3 files changed, 22 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/mm/gup_longterm.c b/tools/testing/selftests/mm/gup_longterm.c
index 03271442aae5aed060fd44010df552a2eedcdafc..21595b20bbc391a0e5d0ab0563ac4ce5e1e0069f 100644
--- a/tools/testing/selftests/mm/gup_longterm.c
+++ b/tools/testing/selftests/mm/gup_longterm.c
@@ -97,11 +97,7 @@ static void do_test(int fd, size_t size, enum test_type type, bool shared)
if (ftruncate(fd, size)) {
if (errno == ENOENT) {
- /*
- * This can happen if the file has been unlinked and the
- * filesystem doesn't support truncating unlinked files.
- */
- ksft_test_result_skip("ftruncate() failed with ENOENT\n");
+ skip_test_dodgy_fs("ftruncate()");
} else {
ksft_test_result_fail("ftruncate() failed (%s)\n", strerror(errno));
}
diff --git a/tools/testing/selftests/mm/map_populate.c b/tools/testing/selftests/mm/map_populate.c
index 433e54fb634f793f2eb4c53ba6b791045c9f4986..9df2636c829bf34d6d0517e126b3deda1f3ba834 100644
--- a/tools/testing/selftests/mm/map_populate.c
+++ b/tools/testing/selftests/mm/map_populate.c
@@ -18,6 +18,8 @@
#include <unistd.h>
#include "../kselftest.h"
+#include "vm_util.h"
+
#define MMAP_SZ 4096
#define BUG_ON(condition, description) \
@@ -88,11 +90,7 @@ int main(int argc, char **argv)
ret = ftruncate(fileno(ftmp), MMAP_SZ);
if (ret < 0 && errno == ENOENT) {
- /*
- * This probably means tmpfile() made a file on a filesystem
- * that doesn't handle temporary files the way we want.
- */
- ksft_exit_skip("ftruncate(fileno(tmpfile())) gave ENOENT, weird filesystem?\n");
+ skip_test_dodgy_fs("ftruncate()");
}
BUG_ON(ret, "ftruncate()");
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 0e629586556b5aae580d8e4ce7491bc93adcc4d6..6effafdc4d8a23f91f0adcb9e43d6196d651ba88 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -5,6 +5,7 @@
#include <err.h>
#include <strings.h> /* ffsl() */
#include <unistd.h> /* _SC_PAGESIZE */
+#include "../kselftest.h"
#define BIT_ULL(nr) (1ULL << (nr))
#define PM_SOFT_DIRTY BIT_ULL(55)
@@ -32,6 +33,23 @@ static inline unsigned int pshift(void)
return __page_shift;
}
+/*
+ * Plan 9 FS has bugs (at least on QEMU) where certain operations fail with
+ * ENOENT on unlinked files. See
+ * https://gitlab.com/qemu-project/qemu/-/issues/103 for some info about such
+ * bugs. There are rumours of NFS implementations with similar bugs.
+ *
+ * Ideally, tests should just detect filesystems known to have such issues and
+ * bail early. But 9pfs has the additional "feature" that it causes fstatfs to
+ * pass through the f_type field from the host filesystem. To avoid having to
+ * scrape /proc/mounts or some other hackery, tests can call this function when
+ * it seems such a bug might have been encountered.
+ */
+static inline void skip_test_dodgy_fs(const char *op_name)
+{
+ ksft_test_result_skip("%s failed with ENOENT. Filesystem might be buggy (9pfs?)\n", op_name);
+}
+
uint64_t pagemap_get_entry(int fd, char *start);
bool pagemap_is_softdirty(int fd, char *start);
bool pagemap_is_swapped(int fd, char *start);
---
base-commit: a91aaf8dd549dcee9caab227ecaa6cbc243bbc5a
change-id: 20250317-9pfs-comments-24b6fa5417cd
Best regards,
--
Brendan Jackman <jackmanb(a)google.com>
Signal delivery during connect() may disconnect an already established
socket. The problem is that such a socket might have been placed in a sockmap
before the connection was closed.
PATCH 1 ensures this race won't lead to an unconnected vsock staying in the
sockmap. PATCH 2 selftests it.
PATCH 3 fixes a related race. Note that the selftest in PATCH 2 does test this
code as well, but winning this race variant may take more than 2 seconds,
so I'm not advertising it.
Signed-off-by: Michal Luczaj <mhal(a)rbox.co>
---
Changes in v3:
- Selftest: drop unnecessary variable initialization and reorder the calls
- Link to v2: https://lore.kernel.org/r/20250314-vsock-trans-signal-race-v2-0-421a41f60f4…
Changes in v2:
- Handle one more path of tripping the warning
- Add a selftest
- Collect R-b [Stefano]
- Link to v1: https://lore.kernel.org/r/20250307-vsock-trans-signal-race-v1-1-3aca3f771fb…
---
Michal Luczaj (3):
vsock/bpf: Fix EINTR connect() racing sockmap update
selftest/bpf: Add test for AF_VSOCK connect() racing sockmap update
vsock/bpf: Fix bpf recvmsg() racing transport reassignment
net/vmw_vsock/af_vsock.c | 10 ++-
net/vmw_vsock/vsock_bpf.c | 24 ++++--
.../selftests/bpf/prog_tests/sockmap_basic.c | 97 ++++++++++++++++++++++
3 files changed, 122 insertions(+), 9 deletions(-)
---
base-commit: da9e8efe7ee10e8425dc356a9fc593502c8e3933
change-id: 20250305-vsock-trans-signal-race-d62f7718d099
Best regards,
--
Michal Luczaj <mhal(a)rbox.co>
virtio-net has two usages of hashes: one is RSS and the other is hash
reporting. Conventionally the hash calculation was done by the VMM.
However, computing the hash after the queue was chosen defeats the
purpose of RSS.
Another approach is to use an eBPF steering program. This approach has
another downside: it cannot report the calculated hash due to the
restrictive nature of eBPF.
Introduce the code to compute hashes into the kernel in order to overcome
these challenges.
An alternative solution is to extend the eBPF steering program so that it
would be able to report to userspace, but it is based on context
rewrites, which are in feature freeze. We could adopt kfuncs, but they
would not be UAPIs. We opt for ioctls to align with other relevant UAPIs
(KVM and vhost_net).
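For reference, the RSS hash used by virtio-net is the standard Toeplitz
hash; a generic, illustrative implementation (not the code added by this
series) looks roughly like this:

#include <stdint.h>
#include <stddef.h>

/* Generic Toeplitz hash: for every set bit of the input, XOR in the 32-bit
 * window of the key starting at that bit position. The key must be at least
 * len + 4 bytes long (the usual 40-byte RSS key covers the 36-byte
 * IPv6 + ports input). */
static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *in, size_t len)
{
	uint32_t hash = 0;
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8) | key[3];
	size_t key_bit = 32; /* next key bit to slide into the window */
	size_t i;
	int b;

	for (i = 0; i < len; i++) {
		for (b = 7; b >= 0; b--) {
			if (in[i] & (1u << b))
				hash ^= window;
			window <<= 1;
			if (key[key_bit / 8] & (0x80u >> (key_bit % 8)))
				window |= 1;
			key_bit++;
		}
	}
	return hash;
}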
The patches for QEMU to use this new feature were submitted as an RFC
and are available at:
https://patchew.org/QEMU/20250313-hash-v4-0-c75c494b495e@daynix.com/
This work was presented at LPC 2024:
https://lpc.events/event/18/contributions/1963/
V1 -> V2:
Changed to introduce a new BPF program type.
Signed-off-by: Akihiko Odaki <akihiko.odaki(a)daynix.com>
---
Changes in v10:
- Split common code and TUN/TAP-specific code into separate patches.
- Reverted a spurious style change in patch "tun: Introduce virtio-net
hash feature".
- Added a comment explaining disable_ipv6 in tests.
- Used AF_PACKET for patch "selftest: tun: Add tests for
virtio-net hashing". I also added the usage of FIXTURE_VARIANT() as
the testing function now needs access to more variant-specific
variables.
- Corrected the message of patch "selftest: tun: Add tests for
virtio-net hashing"; it mentioned validation of configuration but
it is not scope of this patch.
- Expanded the description of patch "selftest: tun: Add tests for
virtio-net hashing".
- Added patch "tun: Allow steering eBPF program to fall back".
- Changed to handle TUNGETVNETHASHCAP before taking the rtnl lock.
- Removed redundant tests for tun_vnet_ioctl().
- Added patch "selftest: tap: Add tests for virtio-net ioctls".
- Added a design explanation of ioctls for extensibility and migration.
- Removed a few branches in patch
"vhost/net: Support VIRTIO_NET_F_HASH_REPORT".
- Link to v9: https://lore.kernel.org/r/20250307-rss-v9-0-df76624025eb@daynix.com
Changes in v9:
- Added a missing return statement in patch
"tun: Introduce virtio-net hash feature".
- Link to v8: https://lore.kernel.org/r/20250306-rss-v8-0-7ab4f56ff423@daynix.com
Changes in v8:
- Disabled IPv6 to eliminate noises in tests.
- Added a branch in tap to avoid unnecessary dissection when hash
reporting is disabled.
- Removed unnecessary rtnl_lock().
- Extracted code to handle new ioctls into separate functions to avoid
adding extra NULL checks to the code handling other ioctls.
- Introduced variable named "fd" to __tun_chr_ioctl().
- s/-/=/g in a patch message to avoid confusing Git.
- Link to v7: https://lore.kernel.org/r/20250228-rss-v7-0-844205cbbdd6@daynix.com
Changes in v7:
- Ensured to set hash_report to VIRTIO_NET_HASH_REPORT_NONE for
VHOST_NET_F_VIRTIO_NET_HDR.
- s/4/sizeof(u32)/ in patch "virtio_net: Add functions for hashing".
- Added tap_skb_cb type.
- Rebased.
- Link to v6: https://lore.kernel.org/r/20250109-rss-v6-0-b1c90ad708f6@daynix.com
Changes in v6:
- Extracted changes to fill vnet header holes into another series.
- Squashed patches "skbuff: Introduce SKB_EXT_TUN_VNET_HASH", "tun:
Introduce virtio-net hash reporting feature", and "tun: Introduce
virtio-net RSS" into patch "tun: Introduce virtio-net hash feature".
- Dropped the RFC tag.
- Link to v5: https://lore.kernel.org/r/20241008-rss-v5-0-f3cf68df005d@daynix.com
Changes in v5:
- Fixed a compilation error with CONFIG_TUN_VNET_CROSS_LE.
- Optimized the calculation of the hash value according to:
https://git.dpdk.org/dpdk/commit/?id=3fb1ea032bd6ff8317af5dac9af901f1f324ca…
- Added patch "tun: Unify vnet implementation".
- Dropped patch "tap: Pad virtio header with zero".
- Added patch "selftest: tun: Test vnet ioctls without device".
- Reworked selftests to skip for older kernels.
- Documented the case when the underlying device is deleted and packets
have queue_mapping set by TC.
- Reordered test harness arguments.
- Added code to handle fragmented packets.
- Link to v4: https://lore.kernel.org/r/20240924-rss-v4-0-84e932ec0e6c@daynix.com
Changes in v4:
- Moved tun_vnet_hash_ext to if_tun.h.
- Renamed virtio_net_toeplitz() to virtio_net_toeplitz_calc().
- Replaced htons() with cpu_to_be16().
- Changed virtio_net_hash_rss() to return void.
- Reordered variable declarations in virtio_net_hash_rss().
- Removed virtio_net_hdr_v1_hash_from_skb().
- Updated messages of "tap: Pad virtio header with zero" and
"tun: Pad virtio header with zero".
- Fixed vnet_hash allocation size.
- Ensured to free vnet_hash when destructing tun_struct.
- Link to v3: https://lore.kernel.org/r/20240915-rss-v3-0-c630015db082@daynix.com
Changes in v3:
- Reverted back to add ioctl.
- Split patch "tun: Introduce virtio-net hashing feature" into
"tun: Introduce virtio-net hash reporting feature" and
"tun: Introduce virtio-net RSS".
- Changed to reuse hash values computed for automq instead of performing
RSS hashing when hash reporting is requested but RSS is not.
- Extracted relevant data from struct tun_struct to keep it minimal.
- Added kernel-doc.
- Changed to allow calling TUNGETVNETHASHCAP before TUNSETIFF.
- Initialized num_buffers with 1.
- Added a test case for unclassified packets.
- Fixed error handling in tests.
- Changed tests to verify that the queue index will not overflow.
- Rebased.
- Link to v2: https://lore.kernel.org/r/20231015141644.260646-1-akihiko.odaki@daynix.com
---
Akihiko Odaki (10):
virtio_net: Add functions for hashing
net: flow_dissector: Export flow_keys_dissector_symmetric
tun: Allow steering eBPF program to fall back
tun: Add common virtio-net hash feature code
tun: Introduce virtio-net hash feature
tap: Introduce virtio-net hash feature
selftest: tun: Test vnet ioctls without device
selftest: tun: Add tests for virtio-net hashing
selftest: tap: Add tests for virtio-net ioctls
vhost/net: Support VIRTIO_NET_F_HASH_REPORT
Documentation/networking/tuntap.rst | 7 +
drivers/net/Kconfig | 1 +
drivers/net/tap.c | 68 ++++-
drivers/net/tun.c | 90 +++++--
drivers/net/tun_vnet.h | 155 ++++++++++-
drivers/vhost/net.c | 68 ++---
include/linux/if_tap.h | 2 +
include/linux/skbuff.h | 3 +
include/linux/virtio_net.h | 188 ++++++++++++++
include/net/flow_dissector.h | 1 +
include/uapi/linux/if_tun.h | 82 ++++++
net/core/flow_dissector.c | 3 +-
net/core/skbuff.c | 4 +
tools/testing/selftests/net/Makefile | 2 +-
tools/testing/selftests/net/tap.c | 97 ++++++-
tools/testing/selftests/net/tun.c | 491 ++++++++++++++++++++++++++++++++++-
16 files changed, 1185 insertions(+), 77 deletions(-)
---
base-commit: dd83757f6e686a2188997cb58b5975f744bb7786
change-id: 20240403-rss-e737d89efa77
prerequisite-change-id: 20241230-tun-66e10a49b0c7:v6
prerequisite-patch-id: 871dc5f146fb6b0e3ec8612971a8e8190472c0fb
prerequisite-patch-id: 2797ed249d32590321f088373d4055ff3f430a0e
prerequisite-patch-id: ea3370c72d4904e2f0536ec76ba5d26784c0cede
prerequisite-patch-id: 837e4cf5d6b451424f9b1639455e83a260c4440d
prerequisite-patch-id: ea701076f57819e844f5a35efe5cbc5712d3080d
prerequisite-patch-id: 701646fb43ad04cc64dd2bf13c150ccbe6f828ce
prerequisite-patch-id: 53176dae0c003f5b6c114d43f936cf7140d31bb5
prerequisite-change-id: 20250116-buffers-96e14bf023fc:v2
prerequisite-patch-id: 25fd4f99d4236a05a5ef16ab79f3e85ee57e21cc
Best regards,
--
Akihiko Odaki <akihiko.odaki(a)daynix.com>
On Friday, 14 March 2025 05:14:30 CDT Su Hui wrote:
> On 2025/3/14 17:21, Dan Carpenter wrote:
> > On Fri, Mar 14, 2025 at 03:14:51PM +0800, Su Hui wrote:
> >> When 'manual=false' and 'signaled=true', the value returned by
> >> NTSYNC_IOC_CREATE_EVENT is expected to be non-negative. Fix this incorrect check.
> >>
> >> Signed-off-by: Su Hui<suhui(a)nfschina.com>
> >> ---
> >> tools/testing/selftests/drivers/ntsync/ntsync.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/tools/testing/selftests/drivers/ntsync/ntsync.c b/tools/testing/selftests/drivers/ntsync/ntsync.c
> >> index 3aad311574c4..bfb6fad653d0 100644
> >> --- a/tools/testing/selftests/drivers/ntsync/ntsync.c
> >> +++ b/tools/testing/selftests/drivers/ntsync/ntsync.c
> >> @@ -968,7 +968,7 @@ TEST(wake_all)
> >> auto_event_args.manual = false;
> >> auto_event_args.signaled = true;
> >> objs[3] = ioctl(fd, NTSYNC_IOC_CREATE_EVENT, &auto_event_args);
> >> - EXPECT_EQ(0, objs[3]);
> >> + EXPECT_LE(0, objs[3]);
> > It's kind of weird how these macros put the constant on the left.
> > It returns an "fd" on success. So this looks reasonable. It probably
> > won't return the zero fd so we could probably check EXPECT_LT()?
> Agreed, there are about 29 items that can be changed to EXPECT_LT().
> I can send a v2 patchset with this change if there is no more other
> suggestions.
I personally think it looks wrong to use EXPECT_LT(), but I'll certainly defer to a higher maintainer on this point.
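For reference, here is a minimal, hypothetical sketch (not part of the patch) of
how the constant-on-the-left harness macros read; get_fd() merely stands in for
the NTSYNC_IOC_CREATE_EVENT ioctl, and the include path assumes the usual
selftests layout:

/* Hypothetical sketch, not the actual ntsync selftest: it only illustrates
 * how EXPECT_LE()/EXPECT_LT() read with the constant on the left.
 */
#include "../../kselftest_harness.h"

static int get_fd(void)
{
	return 3; /* stand-in for ioctl(fd, NTSYNC_IOC_CREATE_EVENT, &args) */
}

TEST(fd_checks)
{
	int fd = get_fd();

	EXPECT_LE(0, fd); /* asserts 0 <= fd: any valid fd, including 0 */
	EXPECT_LT(0, fd); /* asserts 0 < fd: also assumes fd 0 is never handed out */
}

TEST_HARNESS_MAIN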
Replace all occurrences of `addr_of!(place)` with `&raw const place`, and
all occurrences of `addr_of_mut!(place)` with `&raw mut place`.
Utilizing the new `&raw` syntax (stabilized in Rust 1.82) allows us to reduce
macro complexity and improves consistency with the existing reference syntax:
`&raw const` and `&raw mut` closely mirror `&` and `&mut`, making them fit
more naturally alongside other existing code.
Suggested-by: Benno Lossin <benno.lossin(a)proton.me>
Link: https://github.com/Rust-for-Linux/linux/issues/1148
Signed-off-by: Antonio Hickey <contact(a)antoniohickey.com>
---
rust/kernel/block/mq/request.rs | 4 ++--
rust/kernel/faux.rs | 4 ++--
rust/kernel/fs/file.rs | 2 +-
rust/kernel/init.rs | 8 ++++----
rust/kernel/init/macros.rs | 28 +++++++++++++-------------
rust/kernel/jump_label.rs | 4 ++--
rust/kernel/kunit.rs | 4 ++--
rust/kernel/list.rs | 2 +-
rust/kernel/list/impl_list_item_mod.rs | 6 +++---
rust/kernel/net/phy.rs | 4 ++--
rust/kernel/pci.rs | 4 ++--
rust/kernel/platform.rs | 4 +---
rust/kernel/rbtree.rs | 22 ++++++++++----------
rust/kernel/sync/arc.rs | 2 +-
rust/kernel/task.rs | 4 ++--
rust/kernel/workqueue.rs | 8 ++++----
16 files changed, 54 insertions(+), 56 deletions(-)
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 7943f43b9575..4a5b7ec914ef 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -12,7 +12,7 @@
};
use core::{
marker::PhantomData,
- ptr::{addr_of_mut, NonNull},
+ ptr::NonNull,
sync::atomic::{AtomicU64, Ordering},
};
@@ -187,7 +187,7 @@ pub(crate) fn refcount(&self) -> &AtomicU64 {
pub(crate) unsafe fn refcount_ptr(this: *mut Self) -> *mut AtomicU64 {
// SAFETY: Because of the safety requirements of this function, the
// field projection is safe.
- unsafe { addr_of_mut!((*this).refcount) }
+ unsafe { &raw mut (*this).refcount }
}
}
diff --git a/rust/kernel/faux.rs b/rust/kernel/faux.rs
index 5acc0c02d451..52ac554c1119 100644
--- a/rust/kernel/faux.rs
+++ b/rust/kernel/faux.rs
@@ -7,7 +7,7 @@
//! C header: [`include/linux/device/faux.h`]
use crate::{bindings, device, error::code::*, prelude::*};
-use core::ptr::{addr_of_mut, null, null_mut, NonNull};
+use core::ptr::{null, null_mut, NonNull};
/// The registration of a faux device.
///
@@ -45,7 +45,7 @@ impl AsRef<device::Device> for Registration {
fn as_ref(&self) -> &device::Device {
// SAFETY: The underlying `device` in `faux_device` is guaranteed by the C API to be
// a valid initialized `device`.
- unsafe { device::Device::as_ref(addr_of_mut!((*self.as_raw()).dev)) }
+ unsafe { device::Device::as_ref(&raw mut (*self.as_raw()).dev) }
}
}
diff --git a/rust/kernel/fs/file.rs b/rust/kernel/fs/file.rs
index ed57e0137cdb..7ee4830b67f3 100644
--- a/rust/kernel/fs/file.rs
+++ b/rust/kernel/fs/file.rs
@@ -331,7 +331,7 @@ pub fn flags(&self) -> u32 {
// SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
//
// FIXME(read_once): Replace with `read_once` when available on the Rust side.
- unsafe { core::ptr::addr_of!((*self.as_ptr()).f_flags).read_volatile() }
+ unsafe { (&raw const (*self.as_ptr()).f_flags).read_volatile() }
}
}
diff --git a/rust/kernel/init.rs b/rust/kernel/init.rs
index 7fd1ea8265a5..a8fac6558671 100644
--- a/rust/kernel/init.rs
+++ b/rust/kernel/init.rs
@@ -122,7 +122,7 @@
//! ```rust
//! # #![expect(unreachable_pub, clippy::disallowed_names)]
//! use kernel::{init, types::Opaque};
-//! use core::{ptr::addr_of_mut, marker::PhantomPinned, pin::Pin};
+//! use core::{marker::PhantomPinned, pin::Pin};
//! # mod bindings {
//! # #![expect(non_camel_case_types)]
//! # #![expect(clippy::missing_safety_doc)]
@@ -159,7 +159,7 @@
//! unsafe {
//! init::pin_init_from_closure(move |slot: *mut Self| {
//! // `slot` contains uninit memory, avoid creating a reference.
-//! let foo = addr_of_mut!((*slot).foo);
+//! let foo = &raw mut (*slot).foo;
//!
//! // Initialize the `foo`
//! bindings::init_foo(Opaque::raw_get(foo));
@@ -541,7 +541,7 @@ macro_rules! stack_try_pin_init {
///
/// ```rust
/// # use kernel::{macros::{Zeroable, pin_data}, pin_init};
-/// # use core::{ptr::addr_of_mut, marker::PhantomPinned};
+/// # use core::marker::PhantomPinned;
/// #[pin_data]
/// #[derive(Zeroable)]
/// struct Buf {
@@ -554,7 +554,7 @@ macro_rules! stack_try_pin_init {
/// pin_init!(&this in Buf {
/// buf: [0; 64],
/// // SAFETY: TODO.
-/// ptr: unsafe { addr_of_mut!((*this.as_ptr()).buf).cast() },
+/// ptr: unsafe { (&raw mut (*this.as_ptr()).buf).cast() },
/// pin: PhantomPinned,
/// });
/// pin_init!(Buf {
diff --git a/rust/kernel/init/macros.rs b/rust/kernel/init/macros.rs
index 1fd146a83241..af525fbb2f01 100644
--- a/rust/kernel/init/macros.rs
+++ b/rust/kernel/init/macros.rs
@@ -244,25 +244,25 @@
//! struct __InitOk;
//! // This is the expansion of `t,`, which is syntactic sugar for `t: t,`.
//! {
-//! unsafe { ::core::ptr::write(::core::addr_of_mut!((*slot).t), t) };
+//! unsafe { ::core::ptr::write(&raw mut (*slot).t, t) };
//! }
//! // Since initialization could fail later (not in this case, since the
//! // error type is `Infallible`) we will need to drop this field if there
//! // is an error later. This `DropGuard` will drop the field when it gets
//! // dropped and has not yet been forgotten.
//! let __t_guard = unsafe {
-//! ::pinned_init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).t))
+//! ::pinned_init::__internal::DropGuard::new(&raw mut (*slot).t)
//! };
//! // Expansion of `x: 0,`:
//! // Since this can be an arbitrary expression we cannot place it inside
//! // of the `unsafe` block, so we bind it here.
//! {
//! let x = 0;
-//! unsafe { ::core::ptr::write(::core::addr_of_mut!((*slot).x), x) };
+//! unsafe { ::core::ptr::write(&raw mut (*slot).x, x) };
//! }
//! // We again create a `DropGuard`.
//! let __x_guard = unsafe {
-//! ::kernel::init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).x))
+//! ::kernel::init::__internal::DropGuard::new(&raw mut (*slot).x)
//! };
//! // Since initialization has successfully completed, we can now forget
//! // the guards. This is not `mem::forget`, since we only have
@@ -459,15 +459,15 @@
//! {
//! struct __InitOk;
//! {
-//! unsafe { ::core::ptr::write(::core::addr_of_mut!((*slot).a), a) };
+//! unsafe { ::core::ptr::write(&raw mut (*slot).a, a) };
//! }
//! let __a_guard = unsafe {
-//! ::kernel::init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).a))
+//! ::kernel::init::__internal::DropGuard::new(&raw mut (*slot).a)
//! };
//! let init = Bar::new(36);
-//! unsafe { data.b(::core::addr_of_mut!((*slot).b), b)? };
+//! unsafe { data.b(&raw mut (*slot).b, b)? };
//! let __b_guard = unsafe {
-//! ::kernel::init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).b))
+//! ::kernel::init::__internal::DropGuard::new(&raw mut (*slot).b)
//! };
//! ::core::mem::forget(__b_guard);
//! ::core::mem::forget(__a_guard);
@@ -1210,7 +1210,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
// SAFETY: `slot` is valid, because we are inside of an initializer closure, we
// return when an error/panic occurs.
// We also use the `data` to require the correct trait (`Init` or `PinInit`) for `$field`.
- unsafe { $data.$field(::core::ptr::addr_of_mut!((*$slot).$field), init)? };
+ unsafe { $data.$field(&raw mut (*$slot).$field, init)? };
// Create the drop guard:
//
// We rely on macro hygiene to make it impossible for users to access this local variable.
@@ -1218,7 +1218,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
::kernel::macros::paste! {
// SAFETY: We forget the guard later when initialization has succeeded.
let [< __ $field _guard >] = unsafe {
- $crate::init::__internal::DropGuard::new(::core::ptr::addr_of_mut!((*$slot).$field))
+ $crate::init::__internal::DropGuard::new(&raw mut (*$slot).$field)
};
$crate::__init_internal!(init_slot($use_data):
@@ -1241,7 +1241,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
//
// SAFETY: `slot` is valid, because we are inside of an initializer closure, we
// return when an error/panic occurs.
- unsafe { $crate::init::Init::__init(init, ::core::ptr::addr_of_mut!((*$slot).$field))? };
+ unsafe { $crate::init::Init::__init(init, &raw mut (*$slot).$field)? };
// Create the drop guard:
//
// We rely on macro hygiene to make it impossible for users to access this local variable.
@@ -1249,7 +1249,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
::kernel::macros::paste! {
// SAFETY: We forget the guard later when initialization has succeeded.
let [< __ $field _guard >] = unsafe {
- $crate::init::__internal::DropGuard::new(::core::ptr::addr_of_mut!((*$slot).$field))
+ $crate::init::__internal::DropGuard::new(&raw mut (*$slot).$field)
};
$crate::__init_internal!(init_slot():
@@ -1272,7 +1272,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
// Initialize the field.
//
// SAFETY: The memory at `slot` is uninitialized.
- unsafe { ::core::ptr::write(::core::ptr::addr_of_mut!((*$slot).$field), $field) };
+ unsafe { ::core::ptr::write(&raw mut (*$slot).$field, $field) };
}
// Create the drop guard:
//
@@ -1281,7 +1281,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
::kernel::macros::paste! {
// SAFETY: We forget the guard later when initialization has succeeded.
let [< __ $field _guard >] = unsafe {
- $crate::init::__internal::DropGuard::new(::core::ptr::addr_of_mut!((*$slot).$field))
+ $crate::init::__internal::DropGuard::new(&raw mut (*$slot).$field)
};
$crate::__init_internal!(init_slot($($use_data)?):
diff --git a/rust/kernel/jump_label.rs b/rust/kernel/jump_label.rs
index 4e974c768dbd..ca10abae0eee 100644
--- a/rust/kernel/jump_label.rs
+++ b/rust/kernel/jump_label.rs
@@ -20,8 +20,8 @@
#[macro_export]
macro_rules! static_branch_unlikely {
($key:path, $keytyp:ty, $field:ident) => {{
- let _key: *const $keytyp = ::core::ptr::addr_of!($key);
- let _key: *const $crate::bindings::static_key_false = ::core::ptr::addr_of!((*_key).$field);
+ let _key: *const $keytyp = &raw const $key;
+ let _key: *const $crate::bindings::static_key_false = &raw const (*_key).$field;
let _key: *const $crate::bindings::static_key = _key.cast();
#[cfg(not(CONFIG_JUMP_LABEL))]
diff --git a/rust/kernel/kunit.rs b/rust/kernel/kunit.rs
index 824da0e9738a..a17ef3b2e860 100644
--- a/rust/kernel/kunit.rs
+++ b/rust/kernel/kunit.rs
@@ -128,9 +128,9 @@ unsafe impl Sync for UnaryAssert {}
unsafe {
$crate::bindings::__kunit_do_failed_assertion(
kunit_test,
- core::ptr::addr_of!(LOCATION.0),
+ &raw const LOCATION.0,
$crate::bindings::kunit_assert_type_KUNIT_ASSERTION,
- core::ptr::addr_of!(ASSERTION.0.assert),
+ &raw const ASSERTION.0.assert,
Some($crate::bindings::kunit_unary_assert_format),
core::ptr::null(),
);
diff --git a/rust/kernel/list.rs b/rust/kernel/list.rs
index c0ed227b8a4f..e98f0820f002 100644
--- a/rust/kernel/list.rs
+++ b/rust/kernel/list.rs
@@ -176,7 +176,7 @@ pub fn new() -> impl PinInit<Self> {
#[inline]
unsafe fn fields(me: *mut Self) -> *mut ListLinksFields {
// SAFETY: The caller promises that the pointer is valid.
- unsafe { Opaque::raw_get(ptr::addr_of!((*me).inner)) }
+ unsafe { Opaque::raw_get(&raw const (*me).inner) }
}
/// # Safety
diff --git a/rust/kernel/list/impl_list_item_mod.rs b/rust/kernel/list/impl_list_item_mod.rs
index a0438537cee1..014b6713d59d 100644
--- a/rust/kernel/list/impl_list_item_mod.rs
+++ b/rust/kernel/list/impl_list_item_mod.rs
@@ -49,7 +49,7 @@ macro_rules! impl_has_list_links {
// SAFETY: The implementation of `raw_get_list_links` only compiles if the field has the
// right type.
//
- // The behavior of `raw_get_list_links` is not changed since the `addr_of_mut!` macro is
+ // The behavior of `raw_get_list_links` is not changed since the `&raw mut` op is
// equivalent to the pointer offset operation in the trait definition.
unsafe impl$(<$($implarg),*>)? $crate::list::HasListLinks$(<$id>)? for
$self $(<$($selfarg),*>)?
@@ -61,7 +61,7 @@ unsafe fn raw_get_list_links(ptr: *mut Self) -> *mut $crate::list::ListLinks$(<$
// SAFETY: The caller promises that the pointer is not dangling. We know that this
// expression doesn't follow any pointers, as the `offset_of!` invocation above
// would otherwise not compile.
- unsafe { ::core::ptr::addr_of_mut!((*ptr)$(.$field)*) }
+ unsafe { &raw mut (*ptr)$(.$field)* }
}
}
)*};
@@ -103,7 +103,7 @@ macro_rules! impl_has_list_links_self_ptr {
unsafe fn raw_get_list_links(ptr: *mut Self) -> *mut $crate::list::ListLinks$(<$id>)? {
// SAFETY: The caller promises that the pointer is not dangling.
let ptr: *mut $crate::list::ListLinksSelfPtr<$item_type $(, $id)?> =
- unsafe { ::core::ptr::addr_of_mut!((*ptr).$field) };
+ unsafe { &raw mut (*ptr).$field };
ptr.cast()
}
}
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index a59469c785e3..757db052cc09 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -7,7 +7,7 @@
//! C headers: [`include/linux/phy.h`](srctree/include/linux/phy.h).
use crate::{error::*, prelude::*, types::Opaque};
-use core::{marker::PhantomData, ptr::addr_of_mut};
+use core::marker::PhantomData;
pub mod reg;
@@ -285,7 +285,7 @@ impl AsRef<kernel::device::Device> for Device {
fn as_ref(&self) -> &kernel::device::Device {
let phydev = self.0.get();
// SAFETY: The struct invariant ensures that `mdio.dev` is valid.
- unsafe { kernel::device::Device::as_ref(addr_of_mut!((*phydev).mdio.dev)) }
+ unsafe { kernel::device::Device::as_ref(&raw mut (*phydev).mdio.dev) }
}
}
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index f7b2743828ae..6cb9ed1e7cbf 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -17,7 +17,7 @@
types::{ARef, ForeignOwnable, Opaque},
ThisModule,
};
-use core::{ops::Deref, ptr::addr_of_mut};
+use core::ops::Deref;
use kernel::prelude::*;
/// An adapter for the registration of PCI drivers.
@@ -60,7 +60,7 @@ extern "C" fn probe_callback(
) -> kernel::ffi::c_int {
// SAFETY: The PCI bus only ever calls the probe callback with a valid pointer to a
// `struct pci_dev`.
- let dev = unsafe { device::Device::get_device(addr_of_mut!((*pdev).dev)) };
+ let dev = unsafe { device::Device::get_device(&raw mut (*pdev).dev) };
// SAFETY: `dev` is guaranteed to be embedded in a valid `struct pci_dev` by the call
// above.
let mut pdev = unsafe { Device::from_dev(dev) };
diff --git a/rust/kernel/platform.rs b/rust/kernel/platform.rs
index 1297f5292ba9..344875ad7b82 100644
--- a/rust/kernel/platform.rs
+++ b/rust/kernel/platform.rs
@@ -14,8 +14,6 @@
ThisModule,
};
-use core::ptr::addr_of_mut;
-
/// An adapter for the registration of platform drivers.
pub struct Adapter<T: Driver>(T);
@@ -55,7 +53,7 @@ unsafe fn unregister(pdrv: &Opaque<Self::RegType>) {
impl<T: Driver + 'static> Adapter<T> {
extern "C" fn probe_callback(pdev: *mut bindings::platform_device) -> kernel::ffi::c_int {
// SAFETY: The platform bus only ever calls the probe callback with a valid `pdev`.
- let dev = unsafe { device::Device::get_device(addr_of_mut!((*pdev).dev)) };
+ let dev = unsafe { device::Device::get_device(&raw mut (*pdev).dev) };
// SAFETY: `dev` is guaranteed to be embedded in a valid `struct platform_device` by the
// call above.
let mut pdev = unsafe { Device::from_dev(dev) };
diff --git a/rust/kernel/rbtree.rs b/rust/kernel/rbtree.rs
index 1ea25c7092fb..b0ad35663cb0 100644
--- a/rust/kernel/rbtree.rs
+++ b/rust/kernel/rbtree.rs
@@ -11,7 +11,7 @@
cmp::{Ord, Ordering},
marker::PhantomData,
mem::MaybeUninit,
- ptr::{addr_of_mut, from_mut, NonNull},
+ ptr::{from_mut, NonNull},
};
/// A red-black tree with owned nodes.
@@ -238,7 +238,7 @@ pub fn values_mut(&mut self) -> impl Iterator<Item = &'_ mut V> {
/// Returns a cursor over the tree nodes, starting with the smallest key.
pub fn cursor_front(&mut self) -> Option<Cursor<'_, K, V>> {
- let root = addr_of_mut!(self.root);
+ let root = &raw mut self.root;
// SAFETY: `self.root` is always a valid root node
let current = unsafe { bindings::rb_first(root) };
NonNull::new(current).map(|current| {
@@ -253,7 +253,7 @@ pub fn cursor_front(&mut self) -> Option<Cursor<'_, K, V>> {
/// Returns a cursor over the tree nodes, starting with the largest key.
pub fn cursor_back(&mut self) -> Option<Cursor<'_, K, V>> {
- let root = addr_of_mut!(self.root);
+ let root = &raw mut self.root;
// SAFETY: `self.root` is always a valid root node
let current = unsafe { bindings::rb_last(root) };
NonNull::new(current).map(|current| {
@@ -459,7 +459,7 @@ pub fn cursor_lower_bound(&mut self, key: &K) -> Option<Cursor<'_, K, V>>
let best = best_match?;
// SAFETY: `best` is a non-null node so it is valid by the type invariants.
- let links = unsafe { addr_of_mut!((*best.as_ptr()).links) };
+ let links = unsafe { &raw mut (*best.as_ptr()).links };
NonNull::new(links).map(|current| {
// INVARIANT:
@@ -767,7 +767,7 @@ pub fn remove_current(self) -> (Option<Self>, RBTreeNode<K, V>) {
let node = RBTreeNode { node };
// SAFETY: The reference to the tree used to create the cursor outlives the cursor, so
// the tree cannot change. By the tree invariant, all nodes are valid.
- unsafe { bindings::rb_erase(&mut (*this).links, addr_of_mut!(self.tree.root)) };
+ unsafe { bindings::rb_erase(&mut (*this).links, &raw mut self.tree.root) };
let current = match (prev, next) {
(_, Some(next)) => next,
@@ -803,7 +803,7 @@ fn remove_neighbor(&mut self, direction: Direction) -> Option<RBTreeNode<K, V>>
let neighbor = neighbor.as_ptr();
// SAFETY: The reference to the tree used to create the cursor outlives the cursor, so
// the tree cannot change. By the tree invariant, all nodes are valid.
- unsafe { bindings::rb_erase(neighbor, addr_of_mut!(self.tree.root)) };
+ unsafe { bindings::rb_erase(neighbor, &raw mut self.tree.root) };
// SAFETY: By the type invariant of `Self`, all non-null `rb_node` pointers stored in `self`
// point to the links field of `Node<K, V>` objects.
let this = unsafe { container_of!(neighbor, Node<K, V>, links) }.cast_mut();
@@ -918,7 +918,7 @@ unsafe fn to_key_value_raw<'b>(node: NonNull<bindings::rb_node>) -> (&'b K, *mut
let k = unsafe { &(*this).key };
// SAFETY: The passed `node` is the current node or a non-null neighbor,
// thus `this` is valid by the type invariants.
- let v = unsafe { addr_of_mut!((*this).value) };
+ let v = unsafe { &raw mut (*this).value };
(k, v)
}
}
@@ -1027,7 +1027,7 @@ fn next(&mut self) -> Option<Self::Item> {
self.next = unsafe { bindings::rb_next(self.next) };
// SAFETY: By the same reasoning above, it is safe to dereference the node.
- Some(unsafe { (addr_of_mut!((*cur).key), addr_of_mut!((*cur).value)) })
+ Some(unsafe { (&raw mut (*cur).key, &raw mut (*cur).value) })
}
}
@@ -1170,7 +1170,7 @@ fn insert(self, node: RBTreeNode<K, V>) -> &'a mut V {
// SAFETY: `node` is valid at least until we call `Box::from_raw`, which only happens when
// the node is removed or replaced.
- let node_links = unsafe { addr_of_mut!((*node).links) };
+ let node_links = unsafe { &raw mut (*node).links };
// INVARIANT: We are linking in a new node, which is valid. It remains valid because we
// "forgot" it with `Box::into_raw`.
@@ -1178,7 +1178,7 @@ fn insert(self, node: RBTreeNode<K, V>) -> &'a mut V {
unsafe { bindings::rb_link_node(node_links, self.parent, self.child_field_of_parent) };
// SAFETY: All pointers are valid. `node` has just been inserted into the tree.
- unsafe { bindings::rb_insert_color(node_links, addr_of_mut!((*self.rbtree).root)) };
+ unsafe { bindings::rb_insert_color(node_links, &raw mut (*self.rbtree).root) };
// SAFETY: The node is valid until we remove it from the tree.
unsafe { &mut (*node).value }
@@ -1261,7 +1261,7 @@ fn replace(self, node: RBTreeNode<K, V>) -> RBTreeNode<K, V> {
// SAFETY: `node` is valid at least until we call `Box::from_raw`, which only happens when
// the node is removed or replaced.
- let new_node_links = unsafe { addr_of_mut!((*node).links) };
+ let new_node_links = unsafe { &raw mut (*node).links };
// SAFETY: This updates the pointers so that `new_node_links` is in the tree where
// `self.node_links` used to be.
diff --git a/rust/kernel/sync/arc.rs b/rust/kernel/sync/arc.rs
index 3cefda7a4372..81d8b0f84957 100644
--- a/rust/kernel/sync/arc.rs
+++ b/rust/kernel/sync/arc.rs
@@ -243,7 +243,7 @@ pub fn into_raw(self) -> *const T {
let ptr = self.ptr.as_ptr();
core::mem::forget(self);
// SAFETY: The pointer is valid.
- unsafe { core::ptr::addr_of!((*ptr).data) }
+ unsafe { &raw const (*ptr).data }
}
/// Recreates an [`Arc`] instance previously deconstructed via [`Arc::into_raw`].
diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index 49012e711942..b2ac768eed23 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -257,7 +257,7 @@ pub fn as_ptr(&self) -> *mut bindings::task_struct {
pub fn group_leader(&self) -> &Task {
// SAFETY: The group leader of a task never changes after initialization, so reading this
// field is not a data race.
- let ptr = unsafe { *ptr::addr_of!((*self.as_ptr()).group_leader) };
+ let ptr = unsafe { *(&raw const (*self.as_ptr()).group_leader) };
// SAFETY: The lifetime of the returned task reference is tied to the lifetime of `self`,
// and given that a task has a reference to its group leader, we know it must be valid for
@@ -269,7 +269,7 @@ pub fn group_leader(&self) -> &Task {
pub fn pid(&self) -> Pid {
// SAFETY: The pid of a task never changes after initialization, so reading this field is
// not a data race.
- unsafe { *ptr::addr_of!((*self.as_ptr()).pid) }
+ unsafe { *(&raw const (*self.as_ptr()).pid) }
}
/// Returns the UID of the given task.
diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
index 0cd100d2aefb..34e8abb38974 100644
--- a/rust/kernel/workqueue.rs
+++ b/rust/kernel/workqueue.rs
@@ -401,9 +401,9 @@ pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self
pub unsafe fn raw_get(ptr: *const Self) -> *mut bindings::work_struct {
// SAFETY: The caller promises that the pointer is aligned and not dangling.
//
- // A pointer cast would also be ok due to `#[repr(transparent)]`. We use `addr_of!` so that
- // the compiler does not complain that the `work` field is unused.
- unsafe { Opaque::raw_get(core::ptr::addr_of!((*ptr).work)) }
+ // A pointer cast would also be ok due to `#[repr(transparent)]`. We use `&raw const (*ptr).work`
+ // so that the compiler does not complain that the `work` field is unused.
+ unsafe { Opaque::raw_get(&raw const (*ptr).work) }
}
}
@@ -510,7 +510,7 @@ macro_rules! impl_has_work {
unsafe fn raw_get_work(ptr: *mut Self) -> *mut $crate::workqueue::Work<$work_type $(, $id)?> {
// SAFETY: The caller promises that the pointer is not dangling.
unsafe {
- ::core::ptr::addr_of_mut!((*ptr).$field)
+ &raw mut (*ptr).$field
}
}
}
--
2.48.1
There are four small fixes for the ntsync selftest and documentation. I split
them into four different patches since they address different types of errors.
If a single patch is preferred, I can do that too.
Su Hui (4):
selftests: ntsync: fix the wrong condition in wake_all
selftests: ntsync: avoid possible overflow in 32-bit machine
selftests: ntsync: update config
docs: ntsync: update NTSYNC_IOC_*
Documentation/userspace-api/ntsync.rst | 18 +++++++++---------
tools/testing/selftests/drivers/ntsync/config | 2 +-
.../testing/selftests/drivers/ntsync/ntsync.c | 6 +++---
3 files changed, 13 insertions(+), 13 deletions(-)
--
2.30.2
I never had much luck running mm selftests so I spent a few hours
digging into why.
Looks like most of the reason is missing SKIP checks, so this series is
just adding a bunch of those that I found. I did not add anywhere near
all of them, just the ones I spotted in gup_longterm, gup_test, mmap,
userfaultfd and memfd_secret.
It's a bit unfortunate to have to skip those tests when ftruncate()
fails, but I don't have time to dig deep enough into it to actually make
them pass. I have observed the issue on 9pfs and heard rumours that NFS
has a similar problem.
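For context, the kind of check being added looks roughly like the sketch below
(a minimal example of the pattern, not lifted from any one test; the message
wording and the file handling are illustrative):

/* Minimal sketch of the SKIP-on-ftruncate-failure pattern; not taken verbatim
 * from the series. The ksft_* helpers come from the kselftest.h header.
 */
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include "../kselftest.h"

static void test_with_file(int fd, off_t size)
{
	if (ftruncate(fd, size)) {
		ksft_test_result_skip("ftruncate() failed (%s); filesystem may not support it\n",
				      strerror(errno));
		return;
	}
	/* ... actual test body would go here ... */
	ksft_test_result_pass("test_with_file\n");
}

int main(void)
{
	ksft_print_header();
	ksft_set_plan(1);
	/* Passing stdout here just exercises the skip path in this sketch. */
	test_with_file(STDOUT_FILENO, 4096);
	ksft_finished();
}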
I'm now able to run these test groups successfully:
- mmap
- gup_test
- compaction
- migration
- page_frag
- userfaultfd
Signed-off-by: Brendan Jackman <jackmanb(a)google.com>
---
Changes in v3:
- Added fix for userfaultfd tests.
- Dropped attempts to use sudo.
- Fixed garbage printf in uffd-stress.
(Added EXTRA_CFLAGS=-Werror FORCE_TARGETS=1 to my scripts to prevent
such errors happening again).
- Fixed missing newlines in ksft_test_result_skip() calls.
- Link to v2: https://lore.kernel.org/r/20250221-mm-selftests-v2-0-28c4d66383c5@google.com
Changes in v2 (Thanks to Dev for the reviews):
- Improve and cleanup some error messages
- Add some extra SKIPs
- Fix misnaming of nr_cpus variable in uffd tests
- Link to v1: https://lore.kernel.org/r/20250220-mm-selftests-v1-0-9bbf57d64463@google.com
---
Brendan Jackman (10):
selftests/mm: Report errno when things fail in gup_longterm
selftests/mm: Skip uffd-stress if userfaultfd not available
selftests/mm: Skip uffd-wp-mremap if userfaultfd not available
selftests/mm/uffd: Rename nr_cpus -> nr_threads
selftests/mm: Print some details when uffd-stress gets bad params
selftests/mm: Don't fail uffd-stress if too many CPUs
selftests/mm: Skip map_populate on weird filesystems
selftests/mm: Skip gup_longerm tests on weird filesystems
selftests/mm: Drop unnecessary sudo usage
selftests/mm: Ensure uffd-wp-mremap gets pages of each size
tools/testing/selftests/mm/gup_longterm.c | 45 ++++++++++++++++++----------
tools/testing/selftests/mm/map_populate.c | 7 +++++
tools/testing/selftests/mm/run_vmtests.sh | 25 ++++++++++++++--
tools/testing/selftests/mm/uffd-common.c | 8 ++---
tools/testing/selftests/mm/uffd-common.h | 2 +-
tools/testing/selftests/mm/uffd-stress.c | 42 ++++++++++++++++----------
tools/testing/selftests/mm/uffd-unit-tests.c | 2 +-
tools/testing/selftests/mm/uffd-wp-mremap.c | 5 +++-
8 files changed, 95 insertions(+), 41 deletions(-)
---
base-commit: 76544811c850a1f4c055aa182b513b7a843868ea
change-id: 20250220-mm-selftests-2d7d0542face
Best regards,
--
Brendan Jackman <jackmanb(a)google.com>
This series is built on top of the v3 write syscall support [1].
With James's KVM userfault [2], it is possible to handle stage-2 faults
in guest_memfd in userspace. However, KVM itself also triggers faults
in guest_memfd in some cases, for example: PV interfaces like kvmclock,
PV EOI and page table walking code when fetching the MMIO instruction on
x86. It was agreed in the guest_memfd upstream call on 23 Jan 2025 [3]
that KVM would be accessing those pages via userspace page tables. In
order for such faults to be handled in userspace, guest_memfd needs to
support userfaultfd.
This series proposes a limited support for userfaultfd in guest_memfd:
- userfaultfd support is conditional on `CONFIG_KVM_GMEM_SHARED_MEM`
(as is fault support in general)
- Only `page missing` event is currently supported
- Userspace is supposed to respond to the event with the `write`
syscall followed by the `UFFDIO_CONTINUE` ioctl to unblock the faulting
process (a rough sketch of this flow follows this list). Note that we
can't use `UFFDIO_COPY` here because userfaultfd code does not know
how to prepare guest_memfd pages, eg remove them from the direct map [4].
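A rough sketch of that flow in userspace (hypothetical helper, no error handling;
it assumes the range has already been mmap'ed and registered with the userfaultfd,
and gmem_offset_of() is an illustrative placeholder mapping a faulting address to
the corresponding guest_memfd offset):

/* Hypothetical sketch of the fault-servicing step described above; not part of
 * this series.
 */
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_SIZE 4096UL

static void service_one_fault(int uffd, int gmem_fd, const void *content,
			      off_t (*gmem_offset_of)(unsigned long addr))
{
	struct uffd_msg msg;
	struct uffdio_continue cont;
	unsigned long addr;

	read(uffd, &msg, sizeof(msg));		/* wait for a fault event */
	if (msg.event != UFFD_EVENT_PAGEFAULT)
		return;

	addr = msg.arg.pagefault.address & ~(PAGE_SIZE - 1);

	/* Populate the guest_memfd page via the write() syscall support [1]. */
	pwrite(gmem_fd, content, PAGE_SIZE, gmem_offset_of(addr));

	/* Then unblock the faulting process. */
	cont.range.start = addr;
	cont.range.len = PAGE_SIZE;
	cont.mode = 0;
	ioctl(uffd, UFFDIO_CONTINUE, &cont);
}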
Not included in this series:
- Proper interface for userfaultfd to recognise guest_memfd mappings
- Proper handling of truncation cases after locking the page
Request for comments:
- Is it a sensible workflow for guest_memfd to resolve a userfault
`page missing` event with `write` syscall + `UFFDIO_CONTINUE`? One
of the alternatives is teaching `UFFDIO_COPY` how to deal with
guest_memfd pages.
- What is a way forward to make userfaultfd code aware of guest_memfd?
I saw that Patrick hit a somewhat similar problem in [5] when trying
to use direct map manipulation functions in KVM and was pointed by
David at Elliot's guestmem library [6] that might include a shim for that.
Would the library be the right place to expose required interfaces like
`vma_is_gmem`?
Nikita
[1] https://lore.kernel.org/kvm/20250303130838.28812-1-kalyazin@amazon.com/T/
[2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com/…
[3] https://docs.google.com/document/d/1M6766BzdY1Lhk7LiR5IqVR8B8mG3cr-cxTxOrAo…
[4] https://lore.kernel.org/kvm/20250221160728.1584559-1-roypat@amazon.co.uk/T/
[5] https://lore.kernel.org/kvm/20250221160728.1584559-1-roypat@amazon.co.uk/T/…
[6] https://lore.kernel.org/kvm/20241122-guestmem-library-v5-2-450e92951a15@qui…
Nikita Kalyazin (5):
KVM: guest_memfd: add kvm_gmem_vma_is_gmem
KVM: guest_memfd: add support for uffd missing
mm: userfaultfd: allow to register userfaultfd for guest_memfd
mm: userfaultfd: support continue for guest_memfd
KVM: selftests: add uffd missing test for guest_memfd
include/linux/userfaultfd_k.h | 9 ++
mm/userfaultfd.c | 23 ++++-
.../testing/selftests/kvm/guest_memfd_test.c | 88 +++++++++++++++++++
virt/kvm/guest_memfd.c | 17 +++-
virt/kvm/kvm_mm.h | 1 +
5 files changed, 136 insertions(+), 2 deletions(-)
base-commit: 592e7531753dc4b711f96cd1daf808fd493d3223
--
2.47.1
Basics and overview
===================
Software with larger attack surfaces (e.g. network facing apps like databases,
browsers or apps relying on browser runtimes) suffer from memory corruption
issues which can be utilized by attackers to bend control flow of the program
to eventually gain control (by making their payload executable). Attackers are
able to perform such attacks by leveraging call-sites which rely on indirect
calls or return sites which rely on obtaining return address from stack memory.
To mitigate such attacks, the risc-v extension zicfilp enforces that all indirect
calls must land on a landing pad instruction `lpad`, otherwise the cpu will raise
a software check exception (a new cpu exception cause code on riscv).
Similarly for return flow, risc-v extension zicfiss extends architecture with
- `sspush` instruction to push return address on a shadow stack
- `sspopchk` instruction to pop return address from shadow stack
and compare with input operand (i.e. return address on stack)
- `sspopchk` to raise a software check exception if the comparison above
was a mismatch
- Protection mechanism using which shadow stack is not writeable via
regular store instructions
More information and details can be found in the extensions github repo [1].
Equivalent to landing pad (zicfilp) on x86 is `ENDBRANCH` instruction in Intel
CET [3] and branch target identification (BTI) [4] on arm.
Similarly x86's Intel CET has shadow stack [5] and arm64 has guarded control
stack (GCS) [6] which are very similar to risc-v's zicfiss shadow stack.
x86 and arm64 support for user mode shadow stack is already in mainline.
Kernel awareness for user control flow integrity
================================================
This series picks up Samuel Holland's envcfg changes [2] as well. So if those are
being applied independently, they should be removed from this series.
Enabling:
In order to maintain compatibility and not break anything in user mode, the kernel
doesn't enable the control flow integrity cpu extensions on a binary by default.
Instead it exposes a prctl interface to enable, disable and lock the shadow stack
or landing pad feature for a task. This allows userspace (the loader) to enumerate
whether all objects in its address space are compiled with shadow stack and landing
pad support and accordingly enable the feature. Additionally, if a subsequent
`dlopen` happens on a library, user mode can take a decision again to disable
the feature (if the incoming library is not compiled with support) OR terminate the
task (if the user mode policy is strict about having all objects in the address
space compiled with the control flow integrity cpu feature). The prctl to enable
shadow stack results in allocating a shadow stack from virtual memory and
activating it for the user address space. x86 and arm64 are also following the
same direction due to similar reason(s).
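As an illustration only, enabling from userspace could look roughly like the sketch
below. It uses the arch-agnostic shadow stack prctls already in mainline
(PR_SET_SHADOW_STACK_STATUS and friends); the landing pad prctls are omitted here
because their names are introduced by this very series:

/* Rough sketch, not taken from this series: enable shadow stack for the current
 * task and read the status back. Requires uapi headers that already carry the
 * arch-agnostic shadow stack prctl definitions.
 */
#include <stdio.h>
#include <sys/prctl.h>

int main(void)
{
	unsigned long status = 0;

	if (prctl(PR_SET_SHADOW_STACK_STATUS, PR_SHADOW_STACK_ENABLE, 0, 0, 0)) {
		perror("enable shadow stack");
		return 1;
	}
	if (prctl(PR_GET_SHADOW_STACK_STATUS, &status, 0, 0, 0)) {
		perror("query shadow stack status");
		return 1;
	}
	/* PR_LOCK_SHADOW_STACK_STATUS can then be used to lock the state. */
	printf("shadow stack status: %#lx\n", status);
	return 0;
}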
clone/fork:
On clone and fork, the cfi state for a task is inherited by the child. Shadow stack
is part of virtual memory and is writeable memory from the kernel perspective
(writeable via a restricted set of instructions aka shadow stack instructions).
Thus kernel changes ensure that this memory is converted into read-only when
fork/clone happens and COWed when a fault is taken due to sspush, sspopchk or
ssamoswap. In case `CLONE_VM` is specified and shadow stack is to be enabled, the
kernel will automatically allocate a shadow stack for that clone call.
map_shadow_stack:
x86 introduced the `map_shadow_stack` system call to allow user space to explicitly
map shadow stack memory in its address space. It is useful to allocate shadow
stacks for different contexts managed by a single thread (green threads or
contexts). risc-v implements this system call as well.
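For illustration, allocating an extra shadow stack for such a user-managed context
could look like the sketch below (the flag name follows the existing x86 uapi;
this series additionally adds SHADOW_STACK_SET_MARKER, see the v5 changelog
further down):

/* Rough sketch, not taken from this series: allocate a shadow stack with a
 * restore token at its top via map_shadow_stack(). Requires uapi headers
 * providing __NR_map_shadow_stack and SHADOW_STACK_SET_TOKEN.
 */
#include <stdio.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <asm/mman.h>

int main(void)
{
	size_t size = 4 * 4096;
	void *ss = (void *)syscall(__NR_map_shadow_stack, 0, size,
				   SHADOW_STACK_SET_TOKEN);

	if (ss == MAP_FAILED) {
		perror("map_shadow_stack");
		return 1;
	}
	printf("shadow stack at %p\n", ss);
	return 0;
}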
signal management:
If shadow stack is enabled for a task, kernel performs an asynchronous control
flow diversion to deliver the signal and eventually expects userspace to issue
sigreturn so that original execution can be resumed. Even though resume context
is prepared by the kernel, it is in user space memory and is subject to memory
corruption; corruption bugs can be utilized by an attacker in this race window
to perform an arbitrary sigreturn and eventually bypass the cfi mechanism.
Another issue is how to ensure that cfi related state on sigcontext area is not
trampled by legacy apps or apps compiled with old kernel headers.
In order to mitigate control-flow hijacking, the kernel prepares a token and places
it on the shadow stack before signal delivery and places the address of the token
in the sigcontext structure. During sigreturn, the kernel obtains the address of
the token from the sigcontext structure, reads the token from the shadow stack,
validates it and only then allows sigreturn to succeed. The compatibility issue is
solved by adopting the dynamic sigcontext management introduced for the vector
extension. This series re-factors the code a little bit to make future sigcontext
management easier (as proposed by Andy Chiu from SiFive).
config and compilation:
Introduce a new risc-v config option `CONFIG_RISCV_USER_CFI`. Selecting this
config option enables the kernel support for user control flow integrity. This
option is presented only if the toolchain has shadow stack and landing pad support,
and is on purpose guarded by toolchain support. The reason being that eventually the
vDSO also needs to be compiled in with shadow stack and landing pad support.
vDSO compile patches are not included as of now because landing pad labeling
scheme is yet to settle for usermode runtime.
To get more information on kernel interactions with respect to
zicfilp and zicfiss, patch series adds documentation for
`zicfilp` and `zicfiss` in following:
Documentation/arch/riscv/zicfiss.rst
Documentation/arch/riscv/zicfilp.rst
How to test this series
=======================
Toolchain
---------
$ git clone git@github.com:sifive/riscv-gnu-toolchain.git -b cfi-dev
$ riscv-gnu-toolchain/configure --prefix=<path-to-where-to-build> --with-arch=rv64gc_zicfilp_zicfiss --enable-linux --disable-gdb --with-extra-multilib-test="rv64gc_zicfilp_zicfiss-lp64d:-static"
$ make -j$(nproc)
Qemu
----
Get the lastest qemu
$ cd qemu
$ mkdir build
$ cd build
$ ../configure --target-list=riscv64-softmmu
$ make -j$(nproc)
Opensbi
-------
$ git clone git@github.com:deepak0414/opensbi.git -b v6_cfi_spec_split_opensbi
$ make CROSS_COMPILE=<your riscv toolchain> -j$(nproc) PLATFORM=generic
Linux
-----
Running defconfig is fine. CFI is enabled by default if the toolchain
supports it.
$ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc) defconfig
$ make ARCH=riscv CROSS_COMPILE=<path-to-cfi-riscv-gnu-toolchain>/build/bin/riscv64-unknown-linux-gnu- -j$(nproc)
In case you're building your own rootfs using the toolchain, please make sure you
pick the following patch to ensure that the vDSO is compiled with lpad and shadow stack support.
"arch/riscv: compile vdso with landing pad"
Branch where above patch can be picked
https://github.com/deepak0414/linux-riscv-cfi/tree/vdso_user_cfi_v6.12-rc1
Running
-------
Modify your qemu command to have:
-bios <path-to-cfi-opensbi>/build/platform/generic/firmware/fw_dynamic.bin
-cpu rv64,zicfilp=true,zicfiss=true,zimop=true,zcmop=true
vDSO related Opens (in the flux)
=================================
I am listing these opens for laying out plan and what to expect in future
patch sets. And of course for the sake of discussion.
Shadow stack and landing pad enabling in vDSO
----------------------------------------------
vDSO must have shadow stack and landing pad support compiled in for task
to have shadow stack and landing pad support. This patch series doesn't
enable that (yet). Enabling shadow stack support in vDSO should be
straight forward (intend to do that in next versions of patch set). Enabling
landing pad support in vDSO requires some collaboration with toolchain folks
to follow a single label scheme for all object binaries. This is necessary to
ensure that all indirect call-sites are setting correct label and target landing
pads are decorated with same label scheme.
How many vDSOs
---------------
Shadow stack instructions are carved out of zimop (maybe-operations); if the CPU
doesn't implement zimop, they're illegal instructions. The kernel could be running on
a CPU which may or may not implement zimop, and thus the kernel will have to carry 2
different vDSOs and expose the appropriate one depending on whether the CPU implements
zimop or not.
References
==========
[1] - https://github.com/riscv/riscv-cfi
[2] - https://lore.kernel.org/all/20240814081126.956287-1-samuel.holland@sifive.c…
[3] - https://lwn.net/Articles/889475/
[4] - https://developer.arm.com/documentation/109576/0100/Branch-Target-Identific…
[5] - https://www.intel.com/content/dam/develop/external/us/en/documents/catc17-i…
[6] - https://lwn.net/Articles/940403/
---
changelog
---------
v11:
- patch "arch/riscv: compile vdso with landing pad" was unconditionally
selecting `_zicfilp` for the vDSO compile. Fixed that. Changed `lpad 1`
to `lpad 0`.
v10:
- dropped "mm: helper `is_shadow_stack_vma` to check shadow stack vma". This patch
is not that interesting to this patch series for risc-v. There are instances in
arch directories where VM_SHADOW_STACK flag is anyways used. Dropping this patch
to expedite merging in riscv tree.
- Took suggestions from `Clement` on "riscv: zicfiss / zicfilp enumeration" to
validate presence of cfi based on config.
- Added a patch for vDSO to have `lpad 0`. I had omitted this earlier to make sure
we add a single vdso object with cfi enabled. But a vdso object using the
zero-label landing pad scheme is the least common denominator and should work
with zero-labeled as well as function-signature-labeled objects.
v9:
- rebased on master (39a803b754d5 fix braino in "9p: fix ->rename_sem exclusion")
- dropped "mm: Introduce ARCH_HAS_USER_SHADOW_STACK" (master has it from arm64/gcs)
- dropped "prctl: arch-agnostic prctl for shadow stack" (master has it from arm64/gcs)
v8:
- rebased on palmer/for-next
- dropped samuel holland's `envcfg` context switch patches.
they are in palmer/for-next
v7:
- Removed "riscv/Kconfig: enable HAVE_EXIT_THREAD for riscv"
Instead using `deactivate_mm` flow to clean up.
see here for more context
https://lore.kernel.org/all/20230908203655.543765-1-rick.p.edgecombe@intel.…
- Changed the header include in `kselftest`. Hopefully this fixes compile
issue faced by Zong Li at SiFive.
- Cleaned up an orphaned change to `mm/mmap.c` in below patch
"riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE"
- Lock interfaces for shadow stack and indirect branch tracking expect arg == 0.
Any future evolution of this interface should accordingly define how arg should
be set up.
- `mm/mmap.c` has an instance of using `VM_SHADOW_STACK`. Fixed it to use the helper
`is_shadow_stack_vma`.
- Link to v6: https://lore.kernel.org/r/20241008-v5_user_cfi_series-v6-0-60d9fe073f37@riv…
v6:
- Picked up Samuel Holland's changes as is with `envcfg` placed in
`thread` instead of `thread_info`
- fixed unaligned newline escapes in kselftest
- cleaned up messages in kselftest and included test output in commit message
- fixed a bug in clone path reported by Zong Li
- fixed a build issue if CONFIG_RISCV_ISA_V is not selected
(this was introduced due to re-factoring signal context
management code)
v5:
- rebased on v6.12-rc1
- Fixed schema related issues in device tree file
- Fixed some of the documentation related issues in zicfilp/ss.rst
(style issues and added index)
- added `SHADOW_STACK_SET_MARKER` so that implementation can define base
of shadow stack.
- Fixed warnings on definitions added in usercfi.h when
CONFIG_RISCV_USER_CFI is not selected.
- Adopted context header based signal handling as proposed by Andy Chiu
- Added support for enabling kernel mode access to shadow stack using
FWFT
(https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-firmware…)
- Link to v5: https://lore.kernel.org/r/20241001-v5_user_cfi_series-v1-0-3ba65b6e550f@riv…
(Note: I had an issue in my workflow due to which version number wasn't
picked up correctly while sending out patches)
v4:
- rebased on 6.11-rc6
- envcfg: Converged with Samuel Holland's patches for envcfg management on per-
thread basis.
- vma_is_shadow_stack is renamed to is_vma_shadow_stack
- picked up Mark Brown's `ARCH_HAS_USER_SHADOW_STACK` patch
- signal context: using extended context management to maintain compatibility.
- fixed `-Wmissing-prototypes` compiler warnings for prctl functions
- Documentation fixes and amending typos.
- Link to v4: https://lore.kernel.org/all/20240912231650.3740732-1-debug@rivosinc.com/
v3:
- envcfg
logic to pick up base envcfg had a bug where `ENVCFG_CBZE` could have been
picked on per task basis, even though CPU didn't implement it. Fixed in
this series.
- dt-bindings
As suggested, split into separate commit. fixed the messaging that spec is
in public review
- arch_is_shadow_stack change
arch_is_shadow_stack changed to vma_is_shadow_stack
- hwprobe
zicfiss / zicfilp if present will get enumerated in hwprobe
- selftests
As suggested, added object and binary filenames to .gitignore
The selftest binary needs to be compiled with a cfi-enabled compiler anyway, which
will make sure that landing pad and shadow stack are enabled. Thus removed
separate enable/disable tests. Cleaned up tests a bit.
- Link to v3: https://lore.kernel.org/lkml/20240403234054.2020347-1-debug@rivosinc.com/
v2:
- Using config `CONFIG_RISCV_USER_CFI`, kernel support for riscv control flow
integrity for user mode programs can be compiled in the kernel.
- Enabling of control flow integrity for user programs is left to user runtime
- This patch series introduces arch agnostic `prctls` to enable shadow stack
and indirect branch tracking. And implements them on riscv.
---
---
Changes in v11:
- Link to v10: https://lore.kernel.org/r/20250210-v5_user_cfi_series-v10-0-163dcfa31c60@ri…
---
Andy Chiu (1):
riscv: signal: abstract header saving for setup_sigcontext
Clément Léger (1):
riscv: Add Firmware Feature SBI extensions definitions
Deepak Gupta (24):
mm: VM_SHADOW_STACK definition for riscv
dt-bindings: riscv: zicfilp and zicfiss in dt-bindings (extensions.yaml)
riscv: zicfiss / zicfilp enumeration
riscv: zicfiss / zicfilp extension csr and bit definitions
riscv: usercfi state for task and save/restore of CSR_SSP on trap entry/exit
riscv/mm : ensure PROT_WRITE leads to VM_READ | VM_WRITE
riscv mm: manufacture shadow stack pte
riscv mmu: teach pte_mkwrite to manufacture shadow stack PTEs
riscv mmu: write protect and shadow stack
riscv/mm: Implement map_shadow_stack() syscall
riscv/shstk: If needed allocate a new shadow stack on clone
riscv: Implements arch agnostic shadow stack prctls
prctl: arch-agnostic prctl for indirect branch tracking
riscv/traps: Introduce software check exception
riscv/signal: save and restore of shadow stack for signal
riscv/kernel: update __show_regs to print shadow stack register
riscv/ptrace: riscv cfi status and state via ptrace and in core files
riscv/hwprobe: zicfilp / zicfiss enumeration in hwprobe
riscv: enable kernel access to shadow stack memory via FWFT sbi call
riscv: kernel command line option to opt out of user cfi
riscv: create a config for shadow stack and landing pad instr support
riscv: Documentation for landing pad / indirect branch tracking
riscv: Documentation for shadow stack on riscv
kselftest/riscv: kselftest for user mode cfi
Jim Shu (1):
arch/riscv: compile vdso with landing pad
Documentation/arch/riscv/index.rst | 2 +
Documentation/arch/riscv/zicfilp.rst | 115 +++++
Documentation/arch/riscv/zicfiss.rst | 176 +++++++
.../devicetree/bindings/riscv/extensions.yaml | 14 +
arch/riscv/Kconfig | 20 +
arch/riscv/Makefile | 7 +-
arch/riscv/include/asm/asm-prototypes.h | 1 +
arch/riscv/include/asm/assembler.h | 44 ++
arch/riscv/include/asm/cpufeature.h | 13 +
arch/riscv/include/asm/csr.h | 16 +
arch/riscv/include/asm/entry-common.h | 2 +
arch/riscv/include/asm/hwcap.h | 2 +
arch/riscv/include/asm/mman.h | 25 +
arch/riscv/include/asm/mmu_context.h | 7 +
arch/riscv/include/asm/pgtable.h | 30 +-
arch/riscv/include/asm/processor.h | 2 +
arch/riscv/include/asm/sbi.h | 26 +
arch/riscv/include/asm/thread_info.h | 3 +
arch/riscv/include/asm/usercfi.h | 89 ++++
arch/riscv/include/asm/vector.h | 3 +
arch/riscv/include/uapi/asm/hwprobe.h | 2 +
arch/riscv/include/uapi/asm/ptrace.h | 22 +
arch/riscv/include/uapi/asm/sigcontext.h | 1 +
arch/riscv/kernel/Makefile | 1 +
arch/riscv/kernel/asm-offsets.c | 8 +
arch/riscv/kernel/cpufeature.c | 13 +
arch/riscv/kernel/entry.S | 31 +-
arch/riscv/kernel/head.S | 12 +
arch/riscv/kernel/process.c | 26 +-
arch/riscv/kernel/ptrace.c | 83 ++++
arch/riscv/kernel/signal.c | 142 +++++-
arch/riscv/kernel/sys_hwprobe.c | 2 +
arch/riscv/kernel/sys_riscv.c | 10 +
arch/riscv/kernel/traps.c | 43 ++
arch/riscv/kernel/usercfi.c | 524 +++++++++++++++++++++
arch/riscv/kernel/vdso/Makefile | 12 +
arch/riscv/kernel/vdso/flush_icache.S | 4 +
arch/riscv/kernel/vdso/getcpu.S | 4 +
arch/riscv/kernel/vdso/rt_sigreturn.S | 4 +
arch/riscv/kernel/vdso/sys_hwprobe.S | 4 +
arch/riscv/mm/init.c | 2 +-
arch/riscv/mm/pgtable.c | 17 +
include/linux/cpu.h | 4 +
include/linux/mm.h | 7 +
include/uapi/linux/elf.h | 1 +
include/uapi/linux/prctl.h | 27 ++
kernel/sys.c | 30 ++
tools/testing/selftests/riscv/Makefile | 2 +-
tools/testing/selftests/riscv/cfi/.gitignore | 3 +
tools/testing/selftests/riscv/cfi/Makefile | 10 +
tools/testing/selftests/riscv/cfi/cfi_rv_test.h | 84 ++++
tools/testing/selftests/riscv/cfi/riscv_cfi_test.c | 78 +++
tools/testing/selftests/riscv/cfi/shadowstack.c | 375 +++++++++++++++
tools/testing/selftests/riscv/cfi/shadowstack.h | 37 ++
54 files changed, 2193 insertions(+), 29 deletions(-)
---
base-commit: 39a803b754d5224a3522016b564113ee1e4091b2
change-id: 20240930-v5_user_cfi_series-3dc332f8f5b2
--
- debug
In Rust 1.51.0, Clippy introduced the `ptr_as_ptr` lint [1]:
> Though `as` casts between raw pointers are not terrible,
> `pointer::cast` is safer because it cannot accidentally change the
> pointer's mutability, nor cast the pointer to other types like `usize`.
There are a few classes of changes required:
- Modules generated by bindgen are marked
`#[allow(clippy::ptr_as_ptr)]`.
- Inferred casts (` as _`) are replaced with `.cast()`.
- Ascribed casts (` as *... T`) are replaced with `.cast::<T>()`.
- Multistep casts from references (` as *const _ as *const T`) are
replaced with `let x: *const _ = &x;` and `.cast()` or `.cast::<T>()`
according to the previous rules. The intermediate `let` binding is
required because `(x as *const _).cast::<T>()` results in inference
failure.
- Native literal C strings are replaced with `c_str!().as_char_ptr()`.
Apply these changes and enable the lint -- no functional change
intended.
Link: https://rust-lang.github.io/rust-clippy/master/index.html#ptr_as_ptr [1]
Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Makefile | 1 +
rust/bindings/lib.rs | 1 +
rust/kernel/alloc/allocator_test.rs | 2 +-
rust/kernel/alloc/kvec.rs | 4 ++--
rust/kernel/device.rs | 5 +++--
rust/kernel/devres.rs | 2 +-
rust/kernel/error.rs | 2 +-
rust/kernel/fs/file.rs | 2 +-
rust/kernel/kunit.rs | 15 +++++++--------
rust/kernel/lib.rs | 4 ++--
rust/kernel/list/impl_list_item_mod.rs | 2 +-
rust/kernel/pci.rs | 2 +-
rust/kernel/platform.rs | 4 +++-
rust/kernel/print.rs | 11 +++++------
rust/kernel/seq_file.rs | 3 ++-
rust/kernel/str.rs | 2 +-
rust/kernel/sync/poll.rs | 2 +-
rust/kernel/workqueue.rs | 10 +++++-----
rust/uapi/lib.rs | 1 +
19 files changed, 40 insertions(+), 35 deletions(-)
diff --git a/Makefile b/Makefile
index 70bdbf2218fc..ec8efc8e23ba 100644
--- a/Makefile
+++ b/Makefile
@@ -483,6 +483,7 @@ export rust_common_flags := --edition=2021 \
-Wclippy::needless_continue \
-Aclippy::needless_lifetimes \
-Wclippy::no_mangle_with_rust_abi \
+ -Wclippy::ptr_as_ptr \
-Wclippy::undocumented_unsafe_blocks \
-Wclippy::unnecessary_safety_comment \
-Wclippy::unnecessary_safety_doc \
diff --git a/rust/bindings/lib.rs b/rust/bindings/lib.rs
index 014af0d1fc70..0486a32ed314 100644
--- a/rust/bindings/lib.rs
+++ b/rust/bindings/lib.rs
@@ -25,6 +25,7 @@
)]
#[allow(dead_code)]
+#[allow(clippy::ptr_as_ptr)]
#[allow(clippy::undocumented_unsafe_blocks)]
mod bindings_raw {
// Manual definition for blocklisted types.
diff --git a/rust/kernel/alloc/allocator_test.rs b/rust/kernel/alloc/allocator_test.rs
index c37d4c0c64e9..8017aa9d5213 100644
--- a/rust/kernel/alloc/allocator_test.rs
+++ b/rust/kernel/alloc/allocator_test.rs
@@ -82,7 +82,7 @@ unsafe fn realloc(
// SAFETY: Returns either NULL or a pointer to a memory allocation that satisfies or
// exceeds the given size and alignment requirements.
- let dst = unsafe { libc_aligned_alloc(layout.align(), layout.size()) } as *mut u8;
+ let dst = unsafe { libc_aligned_alloc(layout.align(), layout.size()) }.cast::<u8>();
let dst = NonNull::new(dst).ok_or(AllocError)?;
if flags.contains(__GFP_ZERO) {
diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
index ae9d072741ce..c12844764671 100644
--- a/rust/kernel/alloc/kvec.rs
+++ b/rust/kernel/alloc/kvec.rs
@@ -262,7 +262,7 @@ pub fn spare_capacity_mut(&mut self) -> &mut [MaybeUninit<T>] {
// - `self.len` is smaller than `self.capacity` and hence, the resulting pointer is
// guaranteed to be part of the same allocated object.
// - `self.len` can not overflow `isize`.
- let ptr = unsafe { self.as_mut_ptr().add(self.len) } as *mut MaybeUninit<T>;
+ let ptr = unsafe { self.as_mut_ptr().add(self.len) }.cast::<MaybeUninit<T>>();
// SAFETY: The memory between `self.len` and `self.capacity` is guaranteed to be allocated
// and valid, but uninitialized.
@@ -554,7 +554,7 @@ fn drop(&mut self) {
// - `ptr` points to memory with at least a size of `size_of::<T>() * len`,
// - all elements within `b` are initialized values of `T`,
// - `len` does not exceed `isize::MAX`.
- unsafe { Vec::from_raw_parts(ptr as _, len, len) }
+ unsafe { Vec::from_raw_parts(ptr.cast(), len, len) }
}
}
diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs
index db2d9658ba47..9e500498835d 100644
--- a/rust/kernel/device.rs
+++ b/rust/kernel/device.rs
@@ -168,16 +168,17 @@ pub fn pr_dbg(&self, args: fmt::Arguments<'_>) {
/// `KERN_*`constants, for example, `KERN_CRIT`, `KERN_ALERT`, etc.
#[cfg_attr(not(CONFIG_PRINTK), allow(unused_variables))]
unsafe fn printk(&self, klevel: &[u8], msg: fmt::Arguments<'_>) {
+ let msg: *const _ = &msg;
// SAFETY: `klevel` is null-terminated and one of the kernel constants. `self.as_raw`
// is valid because `self` is valid. The "%pA" format string expects a pointer to
// `fmt::Arguments`, which is what we're passing as the last argument.
#[cfg(CONFIG_PRINTK)]
unsafe {
bindings::_dev_printk(
- klevel as *const _ as *const crate::ffi::c_char,
+ klevel.as_ptr().cast::<crate::ffi::c_char>(),
self.as_raw(),
c_str!("%pA").as_char_ptr(),
- &msg as *const _ as *const crate::ffi::c_void,
+ msg.cast::<crate::ffi::c_void>(),
)
};
}
diff --git a/rust/kernel/devres.rs b/rust/kernel/devres.rs
index 942376f6f3af..3a9d998ec371 100644
--- a/rust/kernel/devres.rs
+++ b/rust/kernel/devres.rs
@@ -157,7 +157,7 @@ fn remove_action(this: &Arc<Self>) {
#[allow(clippy::missing_safety_doc)]
unsafe extern "C" fn devres_callback(ptr: *mut kernel::ffi::c_void) {
- let ptr = ptr as *mut DevresInner<T>;
+ let ptr = ptr.cast::<DevresInner<T>>();
// Devres owned this memory; now that we received the callback, drop the `Arc` and hence the
// reference.
// SAFETY: Safe, since we leaked an `Arc` reference to devm_add_action() in
diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs
index f6ecf09cb65f..8654d52b0bb9 100644
--- a/rust/kernel/error.rs
+++ b/rust/kernel/error.rs
@@ -152,7 +152,7 @@ pub(crate) fn to_blk_status(self) -> bindings::blk_status_t {
/// Returns the error encoded as a pointer.
pub fn to_ptr<T>(self) -> *mut T {
// SAFETY: `self.0` is a valid error due to its invariant.
- unsafe { bindings::ERR_PTR(self.0.get() as _) as *mut _ }
+ unsafe { bindings::ERR_PTR(self.0.get() as _).cast() }
}
/// Returns a string representing the error, if one exists.
diff --git a/rust/kernel/fs/file.rs b/rust/kernel/fs/file.rs
index e03dbe14d62a..8936afc234a4 100644
--- a/rust/kernel/fs/file.rs
+++ b/rust/kernel/fs/file.rs
@@ -364,7 +364,7 @@ fn deref(&self) -> &LocalFile {
//
// By the type invariants, there are no `fdget_pos` calls that did not take the
// `f_pos_lock` mutex.
- unsafe { LocalFile::from_raw_file(self as *const File as *const bindings::file) }
+ unsafe { LocalFile::from_raw_file((self as *const Self).cast()) }
}
}
diff --git a/rust/kernel/kunit.rs b/rust/kernel/kunit.rs
index 824da0e9738a..7ed2063c1af0 100644
--- a/rust/kernel/kunit.rs
+++ b/rust/kernel/kunit.rs
@@ -8,19 +8,20 @@
use core::{ffi::c_void, fmt};
+#[cfg(CONFIG_PRINTK)]
+use crate::c_str;
+
/// Prints a KUnit error-level message.
///
/// Public but hidden since it should only be used from KUnit generated code.
#[doc(hidden)]
pub fn err(args: fmt::Arguments<'_>) {
+ let args: *const _ = &args;
// SAFETY: The format string is null-terminated and the `%pA` specifier matches the argument we
// are passing.
#[cfg(CONFIG_PRINTK)]
unsafe {
- bindings::_printk(
- c"\x013%pA".as_ptr() as _,
- &args as *const _ as *const c_void,
- );
+ bindings::_printk(c_str!("\x013%pA").as_char_ptr(), args.cast::<c_void>());
}
}
@@ -29,14 +30,12 @@ pub fn err(args: fmt::Arguments<'_>) {
/// Public but hidden since it should only be used from KUnit generated code.
#[doc(hidden)]
pub fn info(args: fmt::Arguments<'_>) {
+ let args: *const _ = &args;
// SAFETY: The format string is null-terminated and the `%pA` specifier matches the argument we
// are passing.
#[cfg(CONFIG_PRINTK)]
unsafe {
- bindings::_printk(
- c"\x016%pA".as_ptr() as _,
- &args as *const _ as *const c_void,
- );
+ bindings::_printk(c_str!("\x016%pA").as_char_ptr(), args.cast::<c_void>());
}
}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 7697c60b2d1a..01264e459c92 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -196,9 +196,9 @@ fn panic(info: &core::panic::PanicInfo<'_>) -> ! {
#[macro_export]
macro_rules! container_of {
($ptr:expr, $type:ty, $($f:tt)*) => {{
- let ptr = $ptr as *const _ as *const u8;
+ let ptr: *const _ = $ptr;
let offset: usize = ::core::mem::offset_of!($type, $($f)*);
- ptr.sub(offset) as *const $type
+ ptr.cast::<u8>().sub(offset).cast::<$type>()
}}
}
diff --git a/rust/kernel/list/impl_list_item_mod.rs b/rust/kernel/list/impl_list_item_mod.rs
index a0438537cee1..1f9498c1458f 100644
--- a/rust/kernel/list/impl_list_item_mod.rs
+++ b/rust/kernel/list/impl_list_item_mod.rs
@@ -34,7 +34,7 @@ pub unsafe trait HasListLinks<const ID: u64 = 0> {
unsafe fn raw_get_list_links(ptr: *mut Self) -> *mut ListLinks<ID> {
// SAFETY: The caller promises that the pointer is valid. The implementer promises that the
// `OFFSET` constant is correct.
- unsafe { (ptr as *mut u8).add(Self::OFFSET) as *mut ListLinks<ID> }
+ unsafe { ptr.cast::<u8>().add(Self::OFFSET).cast() }
}
}
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index 4c98b5b9aa1e..206f71d33ab2 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -75,7 +75,7 @@ extern "C" fn probe_callback(
// Let the `struct pci_dev` own a reference of the driver's private data.
// SAFETY: By the type invariant `pdev.as_raw` returns a valid pointer to a
// `struct pci_dev`.
- unsafe { bindings::pci_set_drvdata(pdev.as_raw(), data.into_foreign() as _) };
+ unsafe { bindings::pci_set_drvdata(pdev.as_raw(), data.into_foreign().cast()) };
}
Err(err) => return Error::to_errno(err),
}
diff --git a/rust/kernel/platform.rs b/rust/kernel/platform.rs
index 50e6b0421813..8f9e6b125faf 100644
--- a/rust/kernel/platform.rs
+++ b/rust/kernel/platform.rs
@@ -66,7 +66,9 @@ extern "C" fn probe_callback(pdev: *mut bindings::platform_device) -> kernel::ff
// Let the `struct platform_device` own a reference of the driver's private data.
// SAFETY: By the type invariant `pdev.as_raw` returns a valid pointer to a
// `struct platform_device`.
- unsafe { bindings::platform_set_drvdata(pdev.as_raw(), data.into_foreign() as _) };
+ unsafe {
+ bindings::platform_set_drvdata(pdev.as_raw(), data.into_foreign().cast())
+ };
}
Err(err) => return Error::to_errno(err),
}
diff --git a/rust/kernel/print.rs b/rust/kernel/print.rs
index b19ee490be58..0245c145ea32 100644
--- a/rust/kernel/print.rs
+++ b/rust/kernel/print.rs
@@ -25,7 +25,7 @@
// SAFETY: The C contract guarantees that `buf` is valid if it's less than `end`.
let mut w = unsafe { RawFormatter::from_ptrs(buf.cast(), end.cast()) };
// SAFETY: TODO.
- let _ = w.write_fmt(unsafe { *(ptr as *const fmt::Arguments<'_>) });
+ let _ = w.write_fmt(unsafe { *ptr.cast::<fmt::Arguments<'_>>() });
w.pos().cast()
}
@@ -102,6 +102,7 @@ pub unsafe fn call_printk(
module_name: &[u8],
args: fmt::Arguments<'_>,
) {
+ let args: *const _ = &args;
// `_printk` does not seem to fail in any path.
#[cfg(CONFIG_PRINTK)]
// SAFETY: TODO.
@@ -109,7 +110,7 @@ pub unsafe fn call_printk(
bindings::_printk(
format_string.as_ptr(),
module_name.as_ptr(),
- &args as *const _ as *const c_void,
+ args.cast::<c_void>(),
);
}
}
@@ -122,15 +123,13 @@ pub unsafe fn call_printk(
#[doc(hidden)]
#[cfg_attr(not(CONFIG_PRINTK), allow(unused_variables))]
pub fn call_printk_cont(args: fmt::Arguments<'_>) {
+ let args: *const _ = &args;
// `_printk` does not seem to fail in any path.
//
// SAFETY: The format string is fixed.
#[cfg(CONFIG_PRINTK)]
unsafe {
- bindings::_printk(
- format_strings::CONT.as_ptr(),
- &args as *const _ as *const c_void,
- );
+ bindings::_printk(format_strings::CONT.as_ptr(), args.cast::<c_void>());
}
}
diff --git a/rust/kernel/seq_file.rs b/rust/kernel/seq_file.rs
index 04947c672979..90545d28e6b7 100644
--- a/rust/kernel/seq_file.rs
+++ b/rust/kernel/seq_file.rs
@@ -31,12 +31,13 @@ pub unsafe fn from_raw<'a>(ptr: *mut bindings::seq_file) -> &'a SeqFile {
/// Used by the [`seq_print`] macro.
pub fn call_printf(&self, args: core::fmt::Arguments<'_>) {
+ let args: *const _ = &args;
// SAFETY: Passing a void pointer to `Arguments` is valid for `%pA`.
unsafe {
bindings::seq_printf(
self.inner.get(),
c_str!("%pA").as_char_ptr(),
- &args as *const _ as *const crate::ffi::c_void,
+ args.cast::<crate::ffi::c_void>(),
);
}
}
diff --git a/rust/kernel/str.rs b/rust/kernel/str.rs
index 28e2201604d6..6a1a982b946d 100644
--- a/rust/kernel/str.rs
+++ b/rust/kernel/str.rs
@@ -191,7 +191,7 @@ pub unsafe fn from_char_ptr<'a>(ptr: *const crate::ffi::c_char) -> &'a Self {
// to a `NUL`-terminated C string.
let len = unsafe { bindings::strlen(ptr) } + 1;
// SAFETY: Lifetime guaranteed by the safety precondition.
- let bytes = unsafe { core::slice::from_raw_parts(ptr as _, len) };
+ let bytes = unsafe { core::slice::from_raw_parts(ptr.cast(), len) };
// SAFETY: As `len` is returned by `strlen`, `bytes` does not contain interior `NUL`.
// As we have added 1 to `len`, the last byte is known to be `NUL`.
unsafe { Self::from_bytes_with_nul_unchecked(bytes) }
diff --git a/rust/kernel/sync/poll.rs b/rust/kernel/sync/poll.rs
index d5f17153b424..a151f54cde91 100644
--- a/rust/kernel/sync/poll.rs
+++ b/rust/kernel/sync/poll.rs
@@ -73,7 +73,7 @@ pub fn register_wait(&mut self, file: &File, cv: &PollCondVar) {
// be destroyed, the destructor must run. That destructor first removes all waiters,
// and then waits for an rcu grace period. Therefore, `cv.wait_queue_head` is valid for
// long enough.
- unsafe { qproc(file.as_ptr() as _, cv.wait_queue_head.get(), self.0.get()) };
+ unsafe { qproc(file.as_ptr().cast(), cv.wait_queue_head.get(), self.0.get()) };
}
}
}
diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
index 0cd100d2aefb..8ff54105be3f 100644
--- a/rust/kernel/workqueue.rs
+++ b/rust/kernel/workqueue.rs
@@ -170,7 +170,7 @@ impl Queue {
pub unsafe fn from_raw<'a>(ptr: *const bindings::workqueue_struct) -> &'a Queue {
// SAFETY: The `Queue` type is `#[repr(transparent)]`, so the pointer cast is valid. The
// caller promises that the pointer is not dangling.
- unsafe { &*(ptr as *const Queue) }
+ unsafe { &*ptr.cast::<Queue>() }
}
/// Enqueues a work item.
@@ -457,7 +457,7 @@ fn get_work_offset(&self) -> usize {
#[inline]
unsafe fn raw_get_work(ptr: *mut Self) -> *mut Work<T, ID> {
// SAFETY: The caller promises that the pointer is valid.
- unsafe { (ptr as *mut u8).add(Self::OFFSET) as *mut Work<T, ID> }
+ unsafe { ptr.cast::<u8>().add(Self::OFFSET).cast::<Work<T, ID>>() }
}
/// Returns a pointer to the struct containing the [`Work<T, ID>`] field.
@@ -472,7 +472,7 @@ unsafe fn work_container_of(ptr: *mut Work<T, ID>) -> *mut Self
{
// SAFETY: The caller promises that the pointer points at a field of the right type in the
// right kind of struct.
- unsafe { (ptr as *mut u8).sub(Self::OFFSET) as *mut Self }
+ unsafe { ptr.cast::<u8>().sub(Self::OFFSET).cast::<Self>() }
}
}
@@ -538,7 +538,7 @@ unsafe impl<T, const ID: u64> WorkItemPointer<ID> for Arc<T>
{
unsafe extern "C" fn run(ptr: *mut bindings::work_struct) {
// The `__enqueue` method always uses a `work_struct` stored in a `Work<T, ID>`.
- let ptr = ptr as *mut Work<T, ID>;
+ let ptr = ptr.cast::<Work<T, ID>>();
// SAFETY: This computes the pointer that `__enqueue` got from `Arc::into_raw`.
let ptr = unsafe { T::work_container_of(ptr) };
// SAFETY: This pointer comes from `Arc::into_raw` and we've been given back ownership.
@@ -591,7 +591,7 @@ unsafe impl<T, const ID: u64> WorkItemPointer<ID> for Pin<KBox<T>>
{
unsafe extern "C" fn run(ptr: *mut bindings::work_struct) {
// The `__enqueue` method always uses a `work_struct` stored in a `Work<T, ID>`.
- let ptr = ptr as *mut Work<T, ID>;
+ let ptr = ptr.cast::<Work<T, ID>>();
// SAFETY: This computes the pointer that `__enqueue` got from `Arc::into_raw`.
let ptr = unsafe { T::work_container_of(ptr) };
// SAFETY: This pointer comes from `Arc::into_raw` and we've been given back ownership.
diff --git a/rust/uapi/lib.rs b/rust/uapi/lib.rs
index 13495910271f..fe9bf7b5a306 100644
--- a/rust/uapi/lib.rs
+++ b/rust/uapi/lib.rs
@@ -15,6 +15,7 @@
#![allow(
clippy::all,
clippy::undocumented_unsafe_blocks,
+ clippy::ptr_as_ptr,
dead_code,
missing_docs,
non_camel_case_types,
---
base-commit: ff64846bee0e7e3e7bc9363ebad3bab42dd27e24
change-id: 20250307-ptr-as-ptr-21b1867fc4d4
Best regards,
--
Tamir Duberstein <tamird(a)gmail.com>
Signal delivery during connect() may disconnect an already established
socket. The problem is that such a socket might have been placed in a sockmap
before the connection was closed.
PATCH 1 ensures this race won't lead to an unconnected vsock staying in the
sockmap. PATCH 2 selftests it.
PATCH 3 fixes a related race. Note that here the race window is rather
difficult to hit and I can't think of an easy way of testing it.
Signed-off-by: Michal Luczaj <mhal(a)rbox.co>
---
Changes in v2:
- Handle one more path of tripping the warning
- Add a selftest
- Collect R-b [Stefano]
- Link to v1: https://lore.kernel.org/r/20250307-vsock-trans-signal-race-v1-1-3aca3f771fb…
---
Michal Luczaj (3):
vsock/bpf: Fix EINTR connect() racing sockmap update
selftest/bpf: Add test for AF_VSOCK connect() racing sockmap update
vsock/bpf: Fix bpf recvmsg() racing transport reassignment
net/vmw_vsock/af_vsock.c | 10 +-
net/vmw_vsock/vsock_bpf.c | 24 +++--
.../selftests/bpf/prog_tests/sockmap_basic.c | 111 +++++++++++++++++++++
3 files changed, 136 insertions(+), 9 deletions(-)
---
base-commit: da9e8efe7ee10e8425dc356a9fc593502c8e3933
change-id: 20250305-vsock-trans-signal-race-d62f7718d099
Best regards,
--
Michal Luczaj <mhal(a)rbox.co>
On 2025/3/14 18:14, Su Hui wrote:
> On 2025/3/14 17:21, Dan Carpenter wrote:
>> On Fri, Mar 14, 2025 at 03:14:51PM +0800, Su Hui wrote:
>>> When 'manual=false' and 'signaled=true', the expected value when using
>>> NTSYNC_IOC_CREATE_EVENT should be greater than zero. Fix this typo.
>>>
>>> Signed-off-by: Su Hui<suhui(a)nfschina.com>
>>> ---
>>> tools/testing/selftests/drivers/ntsync/ntsync.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/tools/testing/selftests/drivers/ntsync/ntsync.c b/tools/testing/selftests/drivers/ntsync/ntsync.c
>>> index 3aad311574c4..bfb6fad653d0 100644
>>> --- a/tools/testing/selftests/drivers/ntsync/ntsync.c
>>> +++ b/tools/testing/selftests/drivers/ntsync/ntsync.c
>>> @@ -968,7 +968,7 @@ TEST(wake_all)
>>> auto_event_args.manual = false;
>>> auto_event_args.signaled = true;
>>> objs[3] = ioctl(fd, NTSYNC_IOC_CREATE_EVENT, &auto_event_args);
>>> - EXPECT_EQ(0, objs[3]);
>>> + EXPECT_LE(0, objs[3]);
>> It's kind of weird how these macros put the constant on the left.
>> It returns an "fd" on success. So this looks reasonable. It probably
>> won't return the zero fd so we could probably check EXPECT_LT()?
> Agreed, there are about 29 items that can be changed to EXPECT_LT().
> I can send a v2 patchset with this change if there are no other
> suggestions.
Sorry for the wrong style of email:(.
Su Hui
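For illustration, here is a minimal, self-contained sketch of the EXPECT_LT()
variant discussed above. It assumes the usual kselftest harness and the ntsync
UAPI already used by ntsync.c; the standalone test name and the open() flags
are illustrative rather than taken from the existing file:

// Sketch only: mirrors the hunk above, but with EXPECT_LT(), since
// NTSYNC_IOC_CREATE_EVENT returns a file descriptor on success and
// fd 0 is normally already taken by stdin.
#include <fcntl.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/ntsync.h>

#include "../../kselftest_harness.h"

TEST(auto_event_create_returns_fd) /* hypothetical test name */
{
	struct ntsync_event_args event_args = {0};
	int fd, event;

	fd = open("/dev/ntsync", O_CLOEXEC | O_RDONLY);
	ASSERT_LE(0, fd);

	event_args.manual = false;
	event_args.signaled = true;
	event = ioctl(fd, NTSYNC_IOC_CREATE_EVENT, &event_args);
	EXPECT_LT(0, event);	/* strictly positive: a valid, non-zero fd */

	close(event);
	close(fd);
}

TEST_HARNESS_MAIN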
After the recent merge between net-next and net, I got some conflicts on
my side because the merge resolution was different from Stephen's one [1],
which I had applied in the MPTCP tree.
It looks like the code that is now in net-next is using the old way to
retrieve the local and remote addresses. This patch switches to the new
way, matching what was in Stephen's email [1].
Also, in get_interface_info(), there were no conflicts in this area,
because that was new code from 'net', but a small adaptation was needed
there as well to get the remote address.
Fixes: 941defcea7e1 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
Link: https://lore.kernel.org/20250311115758.17a1d414@canb.auug.org.au [1]
Suggested-by: Stephen Rothwell <sfr(a)canb.auug.org.au>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
tools/testing/selftests/drivers/net/ping.py | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/ping.py b/tools/testing/selftests/drivers/net/ping.py
index 7a1026a073681d159202015fc6945e91368863fe..79f07e0510ecc14d3bc2716e14f49f9381bb919f 100755
--- a/tools/testing/selftests/drivers/net/ping.py
+++ b/tools/testing/selftests/drivers/net/ping.py
@@ -15,18 +15,18 @@ no_sleep=False
def _test_v4(cfg) -> None:
cfg.require_ipver("4")
- cmd(f"ping -c 1 -W0.5 {cfg.remote_v4}")
- cmd(f"ping -c 1 -W0.5 {cfg.v4}", host=cfg.remote)
- cmd(f"ping -s 65000 -c 1 -W0.5 {cfg.remote_v4}")
- cmd(f"ping -s 65000 -c 1 -W0.5 {cfg.v4}", host=cfg.remote)
+ cmd("ping -c 1 -W0.5 " + cfg.remote_addr_v["4"])
+ cmd("ping -c 1 -W0.5 " + cfg.addr_v["4"], host=cfg.remote)
+ cmd("ping -s 65000 -c 1 -W0.5 " + cfg.remote_addr_v["4"])
+ cmd("ping -s 65000 -c 1 -W0.5 " + cfg.addr_v["4"], host=cfg.remote)
def _test_v6(cfg) -> None:
cfg.require_ipver("6")
- cmd(f"ping -c 1 -W5 {cfg.remote_v6}")
- cmd(f"ping -c 1 -W5 {cfg.v6}", host=cfg.remote)
- cmd(f"ping -s 65000 -c 1 -W0.5 {cfg.remote_v6}")
- cmd(f"ping -s 65000 -c 1 -W0.5 {cfg.v6}", host=cfg.remote)
+ cmd("ping -c 1 -W5 " + cfg.remote_addr_v["6"])
+ cmd("ping -c 1 -W5 " + cfg.addr_v["6"], host=cfg.remote)
+ cmd("ping -s 65000 -c 1 -W0.5 " + cfg.remote_addr_v["6"])
+ cmd("ping -s 65000 -c 1 -W0.5 " + cfg.addr_v["6"], host=cfg.remote)
def _test_tcp(cfg) -> None:
cfg.require_cmd("socat", remote=True)
@@ -120,7 +120,7 @@ def get_interface_info(cfg) -> None:
global remote_ifname
global no_sleep
- remote_info = cmd(f"ip -4 -o addr show to {cfg.remote_v4} | awk '{{print $2}}'", shell=True, host=cfg.remote).stdout
+ remote_info = cmd(f"ip -4 -o addr show to {cfg.remote_addr_v['4']} | awk '{{print $2}}'", shell=True, host=cfg.remote).stdout
remote_ifname = remote_info.rstrip('\n')
if remote_ifname == "":
raise KsftFailEx('Can not get remote interface')
---
base-commit: 941defcea7e11ad7ff8f0d4856716dd637d757dd
change-id: 20250314-net-next-drv-net-ping-fix-merge-b303167fde16
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Replacing all occurrences of `addr_of!(place)` with `&raw const place`, and
all occurrences of `addr_of_mut!(place)` with `&raw mut place`.
Utilizing the new feature will allow us to reduce macro complexity and
improve consistency with the existing reference syntax: `&raw const` and
`&raw mut` are very similar to `&` and `&mut`, making them fit more
naturally with other existing code.
Depends on: Patch 1/3 0001-rust-enable-raw_ref_op-feature.patch
Suggested-by: Benno Lossin <y86-dev(a)protonmail.com>
Link: https://github.com/Rust-for-Linux/linux/issues/1148
Signed-off-by: Antonio Hickey <contact(a)antoniohickey.com>
---
rust/kernel/block/mq/request.rs | 4 ++--
rust/kernel/faux.rs | 4 ++--
rust/kernel/fs/file.rs | 2 +-
rust/kernel/init.rs | 8 ++++----
rust/kernel/init/macros.rs | 28 +++++++++++++-------------
rust/kernel/jump_label.rs | 4 ++--
rust/kernel/kunit.rs | 4 ++--
rust/kernel/list.rs | 2 +-
rust/kernel/list/impl_list_item_mod.rs | 6 +++---
rust/kernel/net/phy.rs | 4 ++--
rust/kernel/pci.rs | 4 ++--
rust/kernel/platform.rs | 4 +---
rust/kernel/rbtree.rs | 22 ++++++++++----------
rust/kernel/sync/arc.rs | 2 +-
rust/kernel/task.rs | 4 ++--
rust/kernel/workqueue.rs | 8 ++++----
16 files changed, 54 insertions(+), 56 deletions(-)
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 7943f43b9575..4a5b7ec914ef 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -12,7 +12,7 @@
};
use core::{
marker::PhantomData,
- ptr::{addr_of_mut, NonNull},
+ ptr::NonNull,
sync::atomic::{AtomicU64, Ordering},
};
@@ -187,7 +187,7 @@ pub(crate) fn refcount(&self) -> &AtomicU64 {
pub(crate) unsafe fn refcount_ptr(this: *mut Self) -> *mut AtomicU64 {
// SAFETY: Because of the safety requirements of this function, the
// field projection is safe.
- unsafe { addr_of_mut!((*this).refcount) }
+ unsafe { &raw mut (*this).refcount }
}
}
diff --git a/rust/kernel/faux.rs b/rust/kernel/faux.rs
index 5acc0c02d451..52ac554c1119 100644
--- a/rust/kernel/faux.rs
+++ b/rust/kernel/faux.rs
@@ -7,7 +7,7 @@
//! C header: [`include/linux/device/faux.h`]
use crate::{bindings, device, error::code::*, prelude::*};
-use core::ptr::{addr_of_mut, null, null_mut, NonNull};
+use core::ptr::{null, null_mut, NonNull};
/// The registration of a faux device.
///
@@ -45,7 +45,7 @@ impl AsRef<device::Device> for Registration {
fn as_ref(&self) -> &device::Device {
// SAFETY: The underlying `device` in `faux_device` is guaranteed by the C API to be
// a valid initialized `device`.
- unsafe { device::Device::as_ref(addr_of_mut!((*self.as_raw()).dev)) }
+ unsafe { device::Device::as_ref(&raw mut (*self.as_raw()).dev) }
}
}
diff --git a/rust/kernel/fs/file.rs b/rust/kernel/fs/file.rs
index ed57e0137cdb..7ee4830b67f3 100644
--- a/rust/kernel/fs/file.rs
+++ b/rust/kernel/fs/file.rs
@@ -331,7 +331,7 @@ pub fn flags(&self) -> u32 {
// SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
//
// FIXME(read_once): Replace with `read_once` when available on the Rust side.
- unsafe { core::ptr::addr_of!((*self.as_ptr()).f_flags).read_volatile() }
+ unsafe { (&raw const (*self.as_ptr()).f_flags).read_volatile() }
}
}
diff --git a/rust/kernel/init.rs b/rust/kernel/init.rs
index 7fd1ea8265a5..a8fac6558671 100644
--- a/rust/kernel/init.rs
+++ b/rust/kernel/init.rs
@@ -122,7 +122,7 @@
//! ```rust
//! # #![expect(unreachable_pub, clippy::disallowed_names)]
//! use kernel::{init, types::Opaque};
-//! use core::{ptr::addr_of_mut, marker::PhantomPinned, pin::Pin};
+//! use core::{marker::PhantomPinned, pin::Pin};
//! # mod bindings {
//! # #![expect(non_camel_case_types)]
//! # #![expect(clippy::missing_safety_doc)]
@@ -159,7 +159,7 @@
//! unsafe {
//! init::pin_init_from_closure(move |slot: *mut Self| {
//! // `slot` contains uninit memory, avoid creating a reference.
-//! let foo = addr_of_mut!((*slot).foo);
+//! let foo = &raw mut (*slot).foo;
//!
//! // Initialize the `foo`
//! bindings::init_foo(Opaque::raw_get(foo));
@@ -541,7 +541,7 @@ macro_rules! stack_try_pin_init {
///
/// ```rust
/// # use kernel::{macros::{Zeroable, pin_data}, pin_init};
-/// # use core::{ptr::addr_of_mut, marker::PhantomPinned};
+/// # use core::marker::PhantomPinned;
/// #[pin_data]
/// #[derive(Zeroable)]
/// struct Buf {
@@ -554,7 +554,7 @@ macro_rules! stack_try_pin_init {
/// pin_init!(&this in Buf {
/// buf: [0; 64],
/// // SAFETY: TODO.
-/// ptr: unsafe { addr_of_mut!((*this.as_ptr()).buf).cast() },
+/// ptr: unsafe { (&raw mut (*this.as_ptr()).buf).cast() },
/// pin: PhantomPinned,
/// });
/// pin_init!(Buf {
diff --git a/rust/kernel/init/macros.rs b/rust/kernel/init/macros.rs
index 1fd146a83241..af525fbb2f01 100644
--- a/rust/kernel/init/macros.rs
+++ b/rust/kernel/init/macros.rs
@@ -244,25 +244,25 @@
//! struct __InitOk;
//! // This is the expansion of `t,`, which is syntactic sugar for `t: t,`.
//! {
-//! unsafe { ::core::ptr::write(::core::addr_of_mut!((*slot).t), t) };
+//! unsafe { ::core::ptr::write(&raw mut (*slot).t, t) };
//! }
//! // Since initialization could fail later (not in this case, since the
//! // error type is `Infallible`) we will need to drop this field if there
//! // is an error later. This `DropGuard` will drop the field when it gets
//! // dropped and has not yet been forgotten.
//! let __t_guard = unsafe {
-//! ::pinned_init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).t))
+//! ::pinned_init::__internal::DropGuard::new(&raw mut (*slot).t)
//! };
//! // Expansion of `x: 0,`:
//! // Since this can be an arbitrary expression we cannot place it inside
//! // of the `unsafe` block, so we bind it here.
//! {
//! let x = 0;
-//! unsafe { ::core::ptr::write(::core::addr_of_mut!((*slot).x), x) };
+//! unsafe { ::core::ptr::write(&raw mut (*slot).x, x) };
//! }
//! // We again create a `DropGuard`.
//! let __x_guard = unsafe {
-//! ::kernel::init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).x))
+//! ::kernel::init::__internal::DropGuard::new(&raw mut (*slot).x)
//! };
//! // Since initialization has successfully completed, we can now forget
//! // the guards. This is not `mem::forget`, since we only have
@@ -459,15 +459,15 @@
//! {
//! struct __InitOk;
//! {
-//! unsafe { ::core::ptr::write(::core::addr_of_mut!((*slot).a), a) };
+//! unsafe { ::core::ptr::write(&raw mut (*slot).a, a) };
//! }
//! let __a_guard = unsafe {
-//! ::kernel::init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).a))
+//! ::kernel::init::__internal::DropGuard::new(&raw mut (*slot).a)
//! };
//! let init = Bar::new(36);
-//! unsafe { data.b(::core::addr_of_mut!((*slot).b), b)? };
+//! unsafe { data.b(&raw mut (*slot).b, b)? };
//! let __b_guard = unsafe {
-//! ::kernel::init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).b))
+//! ::kernel::init::__internal::DropGuard::new(&raw mut (*slot).b)
//! };
//! ::core::mem::forget(__b_guard);
//! ::core::mem::forget(__a_guard);
@@ -1210,7 +1210,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
// SAFETY: `slot` is valid, because we are inside of an initializer closure, we
// return when an error/panic occurs.
// We also use the `data` to require the correct trait (`Init` or `PinInit`) for `$field`.
- unsafe { $data.$field(::core::ptr::addr_of_mut!((*$slot).$field), init)? };
+ unsafe { $data.$field(&raw mut (*$slot).$field, init)? };
// Create the drop guard:
//
// We rely on macro hygiene to make it impossible for users to access this local variable.
@@ -1218,7 +1218,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
::kernel::macros::paste! {
// SAFETY: We forget the guard later when initialization has succeeded.
let [< __ $field _guard >] = unsafe {
- $crate::init::__internal::DropGuard::new(::core::ptr::addr_of_mut!((*$slot).$field))
+ $crate::init::__internal::DropGuard::new(&raw mut (*$slot).$field)
};
$crate::__init_internal!(init_slot($use_data):
@@ -1241,7 +1241,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
//
// SAFETY: `slot` is valid, because we are inside of an initializer closure, we
// return when an error/panic occurs.
- unsafe { $crate::init::Init::__init(init, ::core::ptr::addr_of_mut!((*$slot).$field))? };
+ unsafe { $crate::init::Init::__init(init, &raw mut (*$slot).$field)? };
// Create the drop guard:
//
// We rely on macro hygiene to make it impossible for users to access this local variable.
@@ -1249,7 +1249,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
::kernel::macros::paste! {
// SAFETY: We forget the guard later when initialization has succeeded.
let [< __ $field _guard >] = unsafe {
- $crate::init::__internal::DropGuard::new(::core::ptr::addr_of_mut!((*$slot).$field))
+ $crate::init::__internal::DropGuard::new(&raw mut (*$slot).$field)
};
$crate::__init_internal!(init_slot():
@@ -1272,7 +1272,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
// Initialize the field.
//
// SAFETY: The memory at `slot` is uninitialized.
- unsafe { ::core::ptr::write(::core::ptr::addr_of_mut!((*$slot).$field), $field) };
+ unsafe { ::core::ptr::write(&raw mut (*$slot).$field, $field) };
}
// Create the drop guard:
//
@@ -1281,7 +1281,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
::kernel::macros::paste! {
// SAFETY: We forget the guard later when initialization has succeeded.
let [< __ $field _guard >] = unsafe {
- $crate::init::__internal::DropGuard::new(::core::ptr::addr_of_mut!((*$slot).$field))
+ $crate::init::__internal::DropGuard::new(&raw mut (*$slot).$field)
};
$crate::__init_internal!(init_slot($($use_data)?):
diff --git a/rust/kernel/jump_label.rs b/rust/kernel/jump_label.rs
index 4e974c768dbd..05d4564714c7 100644
--- a/rust/kernel/jump_label.rs
+++ b/rust/kernel/jump_label.rs
@@ -20,8 +20,8 @@
#[macro_export]
macro_rules! static_branch_unlikely {
($key:path, $keytyp:ty, $field:ident) => {{
- let _key: *const $keytyp = ::core::ptr::addr_of!($key);
- let _key: *const $crate::bindings::static_key_false = ::core::ptr::addr_of!((*_key).$field);
+ let _key: *const $keytyp = &raw const $key;
+ let _key: *const $crate::bindings::static_key_false = &raw const (*_key).$field;
let _key: *const $crate::bindings::static_key = _key.cast();
#[cfg(not(CONFIG_JUMP_LABEL))]
diff --git a/rust/kernel/kunit.rs b/rust/kernel/kunit.rs
index 824da0e9738a..18357dd782ed 100644
--- a/rust/kernel/kunit.rs
+++ b/rust/kernel/kunit.rs
@@ -128,9 +128,9 @@ unsafe impl Sync for UnaryAssert {}
unsafe {
$crate::bindings::__kunit_do_failed_assertion(
kunit_test,
- core::ptr::addr_of!(LOCATION.0),
+ &raw const LOCATION.0,
$crate::bindings::kunit_assert_type_KUNIT_ASSERTION,
- core::ptr::addr_of!(ASSERTION.0.assert),
+ &raw const ASSERTION.0.assert,
Some($crate::bindings::kunit_unary_assert_format),
core::ptr::null(),
);
diff --git a/rust/kernel/list.rs b/rust/kernel/list.rs
index c0ed227b8a4f..e98f0820f002 100644
--- a/rust/kernel/list.rs
+++ b/rust/kernel/list.rs
@@ -176,7 +176,7 @@ pub fn new() -> impl PinInit<Self> {
#[inline]
unsafe fn fields(me: *mut Self) -> *mut ListLinksFields {
// SAFETY: The caller promises that the pointer is valid.
- unsafe { Opaque::raw_get(ptr::addr_of!((*me).inner)) }
+ unsafe { Opaque::raw_get(&raw const (*me).inner) }
}
/// # Safety
diff --git a/rust/kernel/list/impl_list_item_mod.rs b/rust/kernel/list/impl_list_item_mod.rs
index a0438537cee1..014b6713d59d 100644
--- a/rust/kernel/list/impl_list_item_mod.rs
+++ b/rust/kernel/list/impl_list_item_mod.rs
@@ -49,7 +49,7 @@ macro_rules! impl_has_list_links {
// SAFETY: The implementation of `raw_get_list_links` only compiles if the field has the
// right type.
//
- // The behavior of `raw_get_list_links` is not changed since the `addr_of_mut!` macro is
+ // The behavior of `raw_get_list_links` is not changed since the `&raw mut` op is
// equivalent to the pointer offset operation in the trait definition.
unsafe impl$(<$($implarg),*>)? $crate::list::HasListLinks$(<$id>)? for
$self $(<$($selfarg),*>)?
@@ -61,7 +61,7 @@ unsafe fn raw_get_list_links(ptr: *mut Self) -> *mut $crate::list::ListLinks$(<$
// SAFETY: The caller promises that the pointer is not dangling. We know that this
// expression doesn't follow any pointers, as the `offset_of!` invocation above
// would otherwise not compile.
- unsafe { ::core::ptr::addr_of_mut!((*ptr)$(.$field)*) }
+ unsafe { &raw mut (*ptr)$(.$field)* }
}
}
)*};
@@ -103,7 +103,7 @@ macro_rules! impl_has_list_links_self_ptr {
unsafe fn raw_get_list_links(ptr: *mut Self) -> *mut $crate::list::ListLinks$(<$id>)? {
// SAFETY: The caller promises that the pointer is not dangling.
let ptr: *mut $crate::list::ListLinksSelfPtr<$item_type $(, $id)?> =
- unsafe { ::core::ptr::addr_of_mut!((*ptr).$field) };
+ unsafe { &raw mut (*ptr).$field };
ptr.cast()
}
}
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index a59469c785e3..757db052cc09 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -7,7 +7,7 @@
//! C headers: [`include/linux/phy.h`](srctree/include/linux/phy.h).
use crate::{error::*, prelude::*, types::Opaque};
-use core::{marker::PhantomData, ptr::addr_of_mut};
+use core::marker::PhantomData;
pub mod reg;
@@ -285,7 +285,7 @@ impl AsRef<kernel::device::Device> for Device {
fn as_ref(&self) -> &kernel::device::Device {
let phydev = self.0.get();
// SAFETY: The struct invariant ensures that `mdio.dev` is valid.
- unsafe { kernel::device::Device::as_ref(addr_of_mut!((*phydev).mdio.dev)) }
+ unsafe { kernel::device::Device::as_ref(&raw mut (*phydev).mdio.dev) }
}
}
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index f7b2743828ae..6cb9ed1e7cbf 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -17,7 +17,7 @@
types::{ARef, ForeignOwnable, Opaque},
ThisModule,
};
-use core::{ops::Deref, ptr::addr_of_mut};
+use core::ops::Deref;
use kernel::prelude::*;
/// An adapter for the registration of PCI drivers.
@@ -60,7 +60,7 @@ extern "C" fn probe_callback(
) -> kernel::ffi::c_int {
// SAFETY: The PCI bus only ever calls the probe callback with a valid pointer to a
// `struct pci_dev`.
- let dev = unsafe { device::Device::get_device(addr_of_mut!((*pdev).dev)) };
+ let dev = unsafe { device::Device::get_device(&raw mut (*pdev).dev) };
// SAFETY: `dev` is guaranteed to be embedded in a valid `struct pci_dev` by the call
// above.
let mut pdev = unsafe { Device::from_dev(dev) };
diff --git a/rust/kernel/platform.rs b/rust/kernel/platform.rs
index 1297f5292ba9..344875ad7b82 100644
--- a/rust/kernel/platform.rs
+++ b/rust/kernel/platform.rs
@@ -14,8 +14,6 @@
ThisModule,
};
-use core::ptr::addr_of_mut;
-
/// An adapter for the registration of platform drivers.
pub struct Adapter<T: Driver>(T);
@@ -55,7 +53,7 @@ unsafe fn unregister(pdrv: &Opaque<Self::RegType>) {
impl<T: Driver + 'static> Adapter<T> {
extern "C" fn probe_callback(pdev: *mut bindings::platform_device) -> kernel::ffi::c_int {
// SAFETY: The platform bus only ever calls the probe callback with a valid `pdev`.
- let dev = unsafe { device::Device::get_device(addr_of_mut!((*pdev).dev)) };
+ let dev = unsafe { device::Device::get_device(&raw mut (*pdev).dev) };
// SAFETY: `dev` is guaranteed to be embedded in a valid `struct platform_device` by the
// call above.
let mut pdev = unsafe { Device::from_dev(dev) };
diff --git a/rust/kernel/rbtree.rs b/rust/kernel/rbtree.rs
index 1ea25c7092fb..b0ad35663cb0 100644
--- a/rust/kernel/rbtree.rs
+++ b/rust/kernel/rbtree.rs
@@ -11,7 +11,7 @@
cmp::{Ord, Ordering},
marker::PhantomData,
mem::MaybeUninit,
- ptr::{addr_of_mut, from_mut, NonNull},
+ ptr::{from_mut, NonNull},
};
/// A red-black tree with owned nodes.
@@ -238,7 +238,7 @@ pub fn values_mut(&mut self) -> impl Iterator<Item = &'_ mut V> {
/// Returns a cursor over the tree nodes, starting with the smallest key.
pub fn cursor_front(&mut self) -> Option<Cursor<'_, K, V>> {
- let root = addr_of_mut!(self.root);
+ let root = &raw mut self.root;
// SAFETY: `self.root` is always a valid root node
let current = unsafe { bindings::rb_first(root) };
NonNull::new(current).map(|current| {
@@ -253,7 +253,7 @@ pub fn cursor_front(&mut self) -> Option<Cursor<'_, K, V>> {
/// Returns a cursor over the tree nodes, starting with the largest key.
pub fn cursor_back(&mut self) -> Option<Cursor<'_, K, V>> {
- let root = addr_of_mut!(self.root);
+ let root = &raw mut self.root;
// SAFETY: `self.root` is always a valid root node
let current = unsafe { bindings::rb_last(root) };
NonNull::new(current).map(|current| {
@@ -459,7 +459,7 @@ pub fn cursor_lower_bound(&mut self, key: &K) -> Option<Cursor<'_, K, V>>
let best = best_match?;
// SAFETY: `best` is a non-null node so it is valid by the type invariants.
- let links = unsafe { addr_of_mut!((*best.as_ptr()).links) };
+ let links = unsafe { &raw mut (*best.as_ptr()).links };
NonNull::new(links).map(|current| {
// INVARIANT:
@@ -767,7 +767,7 @@ pub fn remove_current(self) -> (Option<Self>, RBTreeNode<K, V>) {
let node = RBTreeNode { node };
// SAFETY: The reference to the tree used to create the cursor outlives the cursor, so
// the tree cannot change. By the tree invariant, all nodes are valid.
- unsafe { bindings::rb_erase(&mut (*this).links, addr_of_mut!(self.tree.root)) };
+ unsafe { bindings::rb_erase(&mut (*this).links, &raw mut self.tree.root) };
let current = match (prev, next) {
(_, Some(next)) => next,
@@ -803,7 +803,7 @@ fn remove_neighbor(&mut self, direction: Direction) -> Option<RBTreeNode<K, V>>
let neighbor = neighbor.as_ptr();
// SAFETY: The reference to the tree used to create the cursor outlives the cursor, so
// the tree cannot change. By the tree invariant, all nodes are valid.
- unsafe { bindings::rb_erase(neighbor, addr_of_mut!(self.tree.root)) };
+ unsafe { bindings::rb_erase(neighbor, &raw mut self.tree.root) };
// SAFETY: By the type invariant of `Self`, all non-null `rb_node` pointers stored in `self`
// point to the links field of `Node<K, V>` objects.
let this = unsafe { container_of!(neighbor, Node<K, V>, links) }.cast_mut();
@@ -918,7 +918,7 @@ unsafe fn to_key_value_raw<'b>(node: NonNull<bindings::rb_node>) -> (&'b K, *mut
let k = unsafe { &(*this).key };
// SAFETY: The passed `node` is the current node or a non-null neighbor,
// thus `this` is valid by the type invariants.
- let v = unsafe { addr_of_mut!((*this).value) };
+ let v = unsafe { &raw mut (*this).value };
(k, v)
}
}
@@ -1027,7 +1027,7 @@ fn next(&mut self) -> Option<Self::Item> {
self.next = unsafe { bindings::rb_next(self.next) };
// SAFETY: By the same reasoning above, it is safe to dereference the node.
- Some(unsafe { (addr_of_mut!((*cur).key), addr_of_mut!((*cur).value)) })
+ Some(unsafe { (&raw mut (*cur).key, &raw mut (*cur).value) })
}
}
@@ -1170,7 +1170,7 @@ fn insert(self, node: RBTreeNode<K, V>) -> &'a mut V {
// SAFETY: `node` is valid at least until we call `Box::from_raw`, which only happens when
// the node is removed or replaced.
- let node_links = unsafe { addr_of_mut!((*node).links) };
+ let node_links = unsafe { &raw mut (*node).links };
// INVARIANT: We are linking in a new node, which is valid. It remains valid because we
// "forgot" it with `Box::into_raw`.
@@ -1178,7 +1178,7 @@ fn insert(self, node: RBTreeNode<K, V>) -> &'a mut V {
unsafe { bindings::rb_link_node(node_links, self.parent, self.child_field_of_parent) };
// SAFETY: All pointers are valid. `node` has just been inserted into the tree.
- unsafe { bindings::rb_insert_color(node_links, addr_of_mut!((*self.rbtree).root)) };
+ unsafe { bindings::rb_insert_color(node_links, &raw mut (*self.rbtree).root) };
// SAFETY: The node is valid until we remove it from the tree.
unsafe { &mut (*node).value }
@@ -1261,7 +1261,7 @@ fn replace(self, node: RBTreeNode<K, V>) -> RBTreeNode<K, V> {
// SAFETY: `node` is valid at least until we call `Box::from_raw`, which only happens when
// the node is removed or replaced.
- let new_node_links = unsafe { addr_of_mut!((*node).links) };
+ let new_node_links = unsafe { &raw mut (*node).links };
// SAFETY: This updates the pointers so that `new_node_links` is in the tree where
// `self.node_links` used to be.
diff --git a/rust/kernel/sync/arc.rs b/rust/kernel/sync/arc.rs
index 3cefda7a4372..81d8b0f84957 100644
--- a/rust/kernel/sync/arc.rs
+++ b/rust/kernel/sync/arc.rs
@@ -243,7 +243,7 @@ pub fn into_raw(self) -> *const T {
let ptr = self.ptr.as_ptr();
core::mem::forget(self);
// SAFETY: The pointer is valid.
- unsafe { core::ptr::addr_of!((*ptr).data) }
+ unsafe { &raw const (*ptr).data }
}
/// Recreates an [`Arc`] instance previously deconstructed via [`Arc::into_raw`].
diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index 49012e711942..b2ac768eed23 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -257,7 +257,7 @@ pub fn as_ptr(&self) -> *mut bindings::task_struct {
pub fn group_leader(&self) -> &Task {
// SAFETY: The group leader of a task never changes after initialization, so reading this
// field is not a data race.
- let ptr = unsafe { *ptr::addr_of!((*self.as_ptr()).group_leader) };
+ let ptr = unsafe { *(&raw const (*self.as_ptr()).group_leader) };
// SAFETY: The lifetime of the returned task reference is tied to the lifetime of `self`,
// and given that a task has a reference to its group leader, we know it must be valid for
@@ -269,7 +269,7 @@ pub fn group_leader(&self) -> &Task {
pub fn pid(&self) -> Pid {
// SAFETY: The pid of a task never changes after initialization, so reading this field is
// not a data race.
- unsafe { *ptr::addr_of!((*self.as_ptr()).pid) }
+ unsafe { *(&raw const (*self.as_ptr()).pid) }
}
/// Returns the UID of the given task.
diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
index 0cd100d2aefb..34e8abb38974 100644
--- a/rust/kernel/workqueue.rs
+++ b/rust/kernel/workqueue.rs
@@ -401,9 +401,9 @@ pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self
pub unsafe fn raw_get(ptr: *const Self) -> *mut bindings::work_struct {
// SAFETY: The caller promises that the pointer is aligned and not dangling.
//
- // A pointer cast would also be ok due to `#[repr(transparent)]`. We use `addr_of!` so that
- // the compiler does not complain that the `work` field is unused.
- unsafe { Opaque::raw_get(core::ptr::addr_of!((*ptr).work)) }
+ // A pointer cast would also be ok due to `#[repr(transparent)]`. We use `&raw const (*ptr).work`
+ // so that the compiler does not complain that the `work` field is unused.
+ unsafe { Opaque::raw_get(&raw const (*ptr).work) }
}
}
@@ -510,7 +510,7 @@ macro_rules! impl_has_work {
unsafe fn raw_get_work(ptr: *mut Self) -> *mut $crate::workqueue::Work<$work_type $(, $id)?> {
// SAFETY: The caller promises that the pointer is not dangling.
unsafe {
- ::core::ptr::addr_of_mut!((*ptr).$field)
+ &raw mut (*ptr).$field
}
}
}
--
2.48.1
Replacing all occurrences of `addr_of!(place)` with `&raw const place`, and
all occurrences of `addr_of_mut!(place)` with `&raw mut place`.
Utilizing the new feature will allow us to reduce macro complexity and
improve consistency with the existing reference syntax: `&raw const` and
`&raw mut` are very similar to `&` and `&mut`, making them fit more
naturally with other existing code than the previously used macros.
Suggested-by: Benno Lossin <benno.lossin(a)proton.me>
Link: https://github.com/Rust-for-Linux/linux/issues/1148
Signed-off-by: Antonio Hickey <contact(a)antoniohickey.com>
---
rust/kernel/block/mq/request.rs | 4 ++--
rust/kernel/faux.rs | 4 ++--
rust/kernel/fs/file.rs | 2 +-
rust/kernel/init.rs | 8 ++++----
rust/kernel/init/macros.rs | 28 +++++++++++++-------------
rust/kernel/jump_label.rs | 4 ++--
rust/kernel/kunit.rs | 4 ++--
rust/kernel/list.rs | 2 +-
rust/kernel/list/impl_list_item_mod.rs | 6 +++---
rust/kernel/net/phy.rs | 4 ++--
rust/kernel/pci.rs | 4 ++--
rust/kernel/platform.rs | 4 +---
rust/kernel/rbtree.rs | 22 ++++++++++----------
rust/kernel/sync/arc.rs | 2 +-
rust/kernel/task.rs | 4 ++--
rust/kernel/workqueue.rs | 8 ++++----
16 files changed, 54 insertions(+), 56 deletions(-)
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index 7943f43b9575..4a5b7ec914ef 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -12,7 +12,7 @@
};
use core::{
marker::PhantomData,
- ptr::{addr_of_mut, NonNull},
+ ptr::NonNull,
sync::atomic::{AtomicU64, Ordering},
};
@@ -187,7 +187,7 @@ pub(crate) fn refcount(&self) -> &AtomicU64 {
pub(crate) unsafe fn refcount_ptr(this: *mut Self) -> *mut AtomicU64 {
// SAFETY: Because of the safety requirements of this function, the
// field projection is safe.
- unsafe { addr_of_mut!((*this).refcount) }
+ unsafe { &raw mut (*this).refcount }
}
}
diff --git a/rust/kernel/faux.rs b/rust/kernel/faux.rs
index 5acc0c02d451..52ac554c1119 100644
--- a/rust/kernel/faux.rs
+++ b/rust/kernel/faux.rs
@@ -7,7 +7,7 @@
//! C header: [`include/linux/device/faux.h`]
use crate::{bindings, device, error::code::*, prelude::*};
-use core::ptr::{addr_of_mut, null, null_mut, NonNull};
+use core::ptr::{null, null_mut, NonNull};
/// The registration of a faux device.
///
@@ -45,7 +45,7 @@ impl AsRef<device::Device> for Registration {
fn as_ref(&self) -> &device::Device {
// SAFETY: The underlying `device` in `faux_device` is guaranteed by the C API to be
// a valid initialized `device`.
- unsafe { device::Device::as_ref(addr_of_mut!((*self.as_raw()).dev)) }
+ unsafe { device::Device::as_ref(&raw mut (*self.as_raw()).dev) }
}
}
diff --git a/rust/kernel/fs/file.rs b/rust/kernel/fs/file.rs
index ed57e0137cdb..7ee4830b67f3 100644
--- a/rust/kernel/fs/file.rs
+++ b/rust/kernel/fs/file.rs
@@ -331,7 +331,7 @@ pub fn flags(&self) -> u32 {
// SAFETY: The file is valid because the shared reference guarantees a nonzero refcount.
//
// FIXME(read_once): Replace with `read_once` when available on the Rust side.
- unsafe { core::ptr::addr_of!((*self.as_ptr()).f_flags).read_volatile() }
+ unsafe { (&raw const (*self.as_ptr()).f_flags).read_volatile() }
}
}
diff --git a/rust/kernel/init.rs b/rust/kernel/init.rs
index 7fd1ea8265a5..a8fac6558671 100644
--- a/rust/kernel/init.rs
+++ b/rust/kernel/init.rs
@@ -122,7 +122,7 @@
//! ```rust
//! # #![expect(unreachable_pub, clippy::disallowed_names)]
//! use kernel::{init, types::Opaque};
-//! use core::{ptr::addr_of_mut, marker::PhantomPinned, pin::Pin};
+//! use core::{marker::PhantomPinned, pin::Pin};
//! # mod bindings {
//! # #![expect(non_camel_case_types)]
//! # #![expect(clippy::missing_safety_doc)]
@@ -159,7 +159,7 @@
//! unsafe {
//! init::pin_init_from_closure(move |slot: *mut Self| {
//! // `slot` contains uninit memory, avoid creating a reference.
-//! let foo = addr_of_mut!((*slot).foo);
+//! let foo = &raw mut (*slot).foo;
//!
//! // Initialize the `foo`
//! bindings::init_foo(Opaque::raw_get(foo));
@@ -541,7 +541,7 @@ macro_rules! stack_try_pin_init {
///
/// ```rust
/// # use kernel::{macros::{Zeroable, pin_data}, pin_init};
-/// # use core::{ptr::addr_of_mut, marker::PhantomPinned};
+/// # use core::marker::PhantomPinned;
/// #[pin_data]
/// #[derive(Zeroable)]
/// struct Buf {
@@ -554,7 +554,7 @@ macro_rules! stack_try_pin_init {
/// pin_init!(&this in Buf {
/// buf: [0; 64],
/// // SAFETY: TODO.
-/// ptr: unsafe { addr_of_mut!((*this.as_ptr()).buf).cast() },
+/// ptr: unsafe { (&raw mut (*this.as_ptr()).buf).cast() },
/// pin: PhantomPinned,
/// });
/// pin_init!(Buf {
diff --git a/rust/kernel/init/macros.rs b/rust/kernel/init/macros.rs
index 1fd146a83241..af525fbb2f01 100644
--- a/rust/kernel/init/macros.rs
+++ b/rust/kernel/init/macros.rs
@@ -244,25 +244,25 @@
//! struct __InitOk;
//! // This is the expansion of `t,`, which is syntactic sugar for `t: t,`.
//! {
-//! unsafe { ::core::ptr::write(::core::addr_of_mut!((*slot).t), t) };
+//! unsafe { ::core::ptr::write(&raw mut (*slot).t, t) };
//! }
//! // Since initialization could fail later (not in this case, since the
//! // error type is `Infallible`) we will need to drop this field if there
//! // is an error later. This `DropGuard` will drop the field when it gets
//! // dropped and has not yet been forgotten.
//! let __t_guard = unsafe {
-//! ::pinned_init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).t))
+//! ::pinned_init::__internal::DropGuard::new(&raw mut (*slot).t)
//! };
//! // Expansion of `x: 0,`:
//! // Since this can be an arbitrary expression we cannot place it inside
//! // of the `unsafe` block, so we bind it here.
//! {
//! let x = 0;
-//! unsafe { ::core::ptr::write(::core::addr_of_mut!((*slot).x), x) };
+//! unsafe { ::core::ptr::write(&raw mut (*slot).x, x) };
//! }
//! // We again create a `DropGuard`.
//! let __x_guard = unsafe {
-//! ::kernel::init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).x))
+//! ::kernel::init::__internal::DropGuard::new(&raw mut (*slot).x)
//! };
//! // Since initialization has successfully completed, we can now forget
//! // the guards. This is not `mem::forget`, since we only have
@@ -459,15 +459,15 @@
//! {
//! struct __InitOk;
//! {
-//! unsafe { ::core::ptr::write(::core::addr_of_mut!((*slot).a), a) };
+//! unsafe { ::core::ptr::write(&raw mut (*slot).a, a) };
//! }
//! let __a_guard = unsafe {
-//! ::kernel::init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).a))
+//! ::kernel::init::__internal::DropGuard::new(&raw mut (*slot).a)
//! };
//! let init = Bar::new(36);
-//! unsafe { data.b(::core::addr_of_mut!((*slot).b), b)? };
+//! unsafe { data.b(&raw mut (*slot).b, b)? };
//! let __b_guard = unsafe {
-//! ::kernel::init::__internal::DropGuard::new(::core::addr_of_mut!((*slot).b))
+//! ::kernel::init::__internal::DropGuard::new(&raw mut (*slot).b)
//! };
//! ::core::mem::forget(__b_guard);
//! ::core::mem::forget(__a_guard);
@@ -1210,7 +1210,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
// SAFETY: `slot` is valid, because we are inside of an initializer closure, we
// return when an error/panic occurs.
// We also use the `data` to require the correct trait (`Init` or `PinInit`) for `$field`.
- unsafe { $data.$field(::core::ptr::addr_of_mut!((*$slot).$field), init)? };
+ unsafe { $data.$field(&raw mut (*$slot).$field, init)? };
// Create the drop guard:
//
// We rely on macro hygiene to make it impossible for users to access this local variable.
@@ -1218,7 +1218,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
::kernel::macros::paste! {
// SAFETY: We forget the guard later when initialization has succeeded.
let [< __ $field _guard >] = unsafe {
- $crate::init::__internal::DropGuard::new(::core::ptr::addr_of_mut!((*$slot).$field))
+ $crate::init::__internal::DropGuard::new(&raw mut (*$slot).$field)
};
$crate::__init_internal!(init_slot($use_data):
@@ -1241,7 +1241,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
//
// SAFETY: `slot` is valid, because we are inside of an initializer closure, we
// return when an error/panic occurs.
- unsafe { $crate::init::Init::__init(init, ::core::ptr::addr_of_mut!((*$slot).$field))? };
+ unsafe { $crate::init::Init::__init(init, &raw mut (*$slot).$field)? };
// Create the drop guard:
//
// We rely on macro hygiene to make it impossible for users to access this local variable.
@@ -1249,7 +1249,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
::kernel::macros::paste! {
// SAFETY: We forget the guard later when initialization has succeeded.
let [< __ $field _guard >] = unsafe {
- $crate::init::__internal::DropGuard::new(::core::ptr::addr_of_mut!((*$slot).$field))
+ $crate::init::__internal::DropGuard::new(&raw mut (*$slot).$field)
};
$crate::__init_internal!(init_slot():
@@ -1272,7 +1272,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
// Initialize the field.
//
// SAFETY: The memory at `slot` is uninitialized.
- unsafe { ::core::ptr::write(::core::ptr::addr_of_mut!((*$slot).$field), $field) };
+ unsafe { ::core::ptr::write(&raw mut (*$slot).$field, $field) };
}
// Create the drop guard:
//
@@ -1281,7 +1281,7 @@ fn assert_zeroable<T: $crate::init::Zeroable>(_: *mut T) {}
::kernel::macros::paste! {
// SAFETY: We forget the guard later when initialization has succeeded.
let [< __ $field _guard >] = unsafe {
- $crate::init::__internal::DropGuard::new(::core::ptr::addr_of_mut!((*$slot).$field))
+ $crate::init::__internal::DropGuard::new(&raw mut (*$slot).$field)
};
$crate::__init_internal!(init_slot($($use_data)?):
diff --git a/rust/kernel/jump_label.rs b/rust/kernel/jump_label.rs
index 4e974c768dbd..ca10abae0eee 100644
--- a/rust/kernel/jump_label.rs
+++ b/rust/kernel/jump_label.rs
@@ -20,8 +20,8 @@
#[macro_export]
macro_rules! static_branch_unlikely {
($key:path, $keytyp:ty, $field:ident) => {{
- let _key: *const $keytyp = ::core::ptr::addr_of!($key);
- let _key: *const $crate::bindings::static_key_false = ::core::ptr::addr_of!((*_key).$field);
+ let _key: *const $keytyp = &raw const $key;
+ let _key: *const $crate::bindings::static_key_false = &raw const (*_key).$field;
let _key: *const $crate::bindings::static_key = _key.cast();
#[cfg(not(CONFIG_JUMP_LABEL))]
diff --git a/rust/kernel/kunit.rs b/rust/kernel/kunit.rs
index 824da0e9738a..a17ef3b2e860 100644
--- a/rust/kernel/kunit.rs
+++ b/rust/kernel/kunit.rs
@@ -128,9 +128,9 @@ unsafe impl Sync for UnaryAssert {}
unsafe {
$crate::bindings::__kunit_do_failed_assertion(
kunit_test,
- core::ptr::addr_of!(LOCATION.0),
+ &raw const LOCATION.0,
$crate::bindings::kunit_assert_type_KUNIT_ASSERTION,
- core::ptr::addr_of!(ASSERTION.0.assert),
+ &raw const ASSERTION.0.assert,
Some($crate::bindings::kunit_unary_assert_format),
core::ptr::null(),
);
diff --git a/rust/kernel/list.rs b/rust/kernel/list.rs
index c0ed227b8a4f..e98f0820f002 100644
--- a/rust/kernel/list.rs
+++ b/rust/kernel/list.rs
@@ -176,7 +176,7 @@ pub fn new() -> impl PinInit<Self> {
#[inline]
unsafe fn fields(me: *mut Self) -> *mut ListLinksFields {
// SAFETY: The caller promises that the pointer is valid.
- unsafe { Opaque::raw_get(ptr::addr_of!((*me).inner)) }
+ unsafe { Opaque::raw_get(&raw const (*me).inner) }
}
/// # Safety
diff --git a/rust/kernel/list/impl_list_item_mod.rs b/rust/kernel/list/impl_list_item_mod.rs
index a0438537cee1..014b6713d59d 100644
--- a/rust/kernel/list/impl_list_item_mod.rs
+++ b/rust/kernel/list/impl_list_item_mod.rs
@@ -49,7 +49,7 @@ macro_rules! impl_has_list_links {
// SAFETY: The implementation of `raw_get_list_links` only compiles if the field has the
// right type.
//
- // The behavior of `raw_get_list_links` is not changed since the `addr_of_mut!` macro is
+ // The behavior of `raw_get_list_links` is not changed since the `&raw mut` op is
// equivalent to the pointer offset operation in the trait definition.
unsafe impl$(<$($implarg),*>)? $crate::list::HasListLinks$(<$id>)? for
$self $(<$($selfarg),*>)?
@@ -61,7 +61,7 @@ unsafe fn raw_get_list_links(ptr: *mut Self) -> *mut $crate::list::ListLinks$(<$
// SAFETY: The caller promises that the pointer is not dangling. We know that this
// expression doesn't follow any pointers, as the `offset_of!` invocation above
// would otherwise not compile.
- unsafe { ::core::ptr::addr_of_mut!((*ptr)$(.$field)*) }
+ unsafe { &raw mut (*ptr)$(.$field)* }
}
}
)*};
@@ -103,7 +103,7 @@ macro_rules! impl_has_list_links_self_ptr {
unsafe fn raw_get_list_links(ptr: *mut Self) -> *mut $crate::list::ListLinks$(<$id>)? {
// SAFETY: The caller promises that the pointer is not dangling.
let ptr: *mut $crate::list::ListLinksSelfPtr<$item_type $(, $id)?> =
- unsafe { ::core::ptr::addr_of_mut!((*ptr).$field) };
+ unsafe { &raw mut (*ptr).$field };
ptr.cast()
}
}
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index a59469c785e3..757db052cc09 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -7,7 +7,7 @@
//! C headers: [`include/linux/phy.h`](srctree/include/linux/phy.h).
use crate::{error::*, prelude::*, types::Opaque};
-use core::{marker::PhantomData, ptr::addr_of_mut};
+use core::marker::PhantomData;
pub mod reg;
@@ -285,7 +285,7 @@ impl AsRef<kernel::device::Device> for Device {
fn as_ref(&self) -> &kernel::device::Device {
let phydev = self.0.get();
// SAFETY: The struct invariant ensures that `mdio.dev` is valid.
- unsafe { kernel::device::Device::as_ref(addr_of_mut!((*phydev).mdio.dev)) }
+ unsafe { kernel::device::Device::as_ref(&raw mut (*phydev).mdio.dev) }
}
}
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index f7b2743828ae..6cb9ed1e7cbf 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -17,7 +17,7 @@
types::{ARef, ForeignOwnable, Opaque},
ThisModule,
};
-use core::{ops::Deref, ptr::addr_of_mut};
+use core::ops::Deref;
use kernel::prelude::*;
/// An adapter for the registration of PCI drivers.
@@ -60,7 +60,7 @@ extern "C" fn probe_callback(
) -> kernel::ffi::c_int {
// SAFETY: The PCI bus only ever calls the probe callback with a valid pointer to a
// `struct pci_dev`.
- let dev = unsafe { device::Device::get_device(addr_of_mut!((*pdev).dev)) };
+ let dev = unsafe { device::Device::get_device(&raw mut (*pdev).dev) };
// SAFETY: `dev` is guaranteed to be embedded in a valid `struct pci_dev` by the call
// above.
let mut pdev = unsafe { Device::from_dev(dev) };
diff --git a/rust/kernel/platform.rs b/rust/kernel/platform.rs
index 1297f5292ba9..344875ad7b82 100644
--- a/rust/kernel/platform.rs
+++ b/rust/kernel/platform.rs
@@ -14,8 +14,6 @@
ThisModule,
};
-use core::ptr::addr_of_mut;
-
/// An adapter for the registration of platform drivers.
pub struct Adapter<T: Driver>(T);
@@ -55,7 +53,7 @@ unsafe fn unregister(pdrv: &Opaque<Self::RegType>) {
impl<T: Driver + 'static> Adapter<T> {
extern "C" fn probe_callback(pdev: *mut bindings::platform_device) -> kernel::ffi::c_int {
// SAFETY: The platform bus only ever calls the probe callback with a valid `pdev`.
- let dev = unsafe { device::Device::get_device(addr_of_mut!((*pdev).dev)) };
+ let dev = unsafe { device::Device::get_device(&raw mut (*pdev).dev) };
// SAFETY: `dev` is guaranteed to be embedded in a valid `struct platform_device` by the
// call above.
let mut pdev = unsafe { Device::from_dev(dev) };
diff --git a/rust/kernel/rbtree.rs b/rust/kernel/rbtree.rs
index 1ea25c7092fb..b0ad35663cb0 100644
--- a/rust/kernel/rbtree.rs
+++ b/rust/kernel/rbtree.rs
@@ -11,7 +11,7 @@
cmp::{Ord, Ordering},
marker::PhantomData,
mem::MaybeUninit,
- ptr::{addr_of_mut, from_mut, NonNull},
+ ptr::{from_mut, NonNull},
};
/// A red-black tree with owned nodes.
@@ -238,7 +238,7 @@ pub fn values_mut(&mut self) -> impl Iterator<Item = &'_ mut V> {
/// Returns a cursor over the tree nodes, starting with the smallest key.
pub fn cursor_front(&mut self) -> Option<Cursor<'_, K, V>> {
- let root = addr_of_mut!(self.root);
+ let root = &raw mut self.root;
// SAFETY: `self.root` is always a valid root node
let current = unsafe { bindings::rb_first(root) };
NonNull::new(current).map(|current| {
@@ -253,7 +253,7 @@ pub fn cursor_front(&mut self) -> Option<Cursor<'_, K, V>> {
/// Returns a cursor over the tree nodes, starting with the largest key.
pub fn cursor_back(&mut self) -> Option<Cursor<'_, K, V>> {
- let root = addr_of_mut!(self.root);
+ let root = &raw mut self.root;
// SAFETY: `self.root` is always a valid root node
let current = unsafe { bindings::rb_last(root) };
NonNull::new(current).map(|current| {
@@ -459,7 +459,7 @@ pub fn cursor_lower_bound(&mut self, key: &K) -> Option<Cursor<'_, K, V>>
let best = best_match?;
// SAFETY: `best` is a non-null node so it is valid by the type invariants.
- let links = unsafe { addr_of_mut!((*best.as_ptr()).links) };
+ let links = unsafe { &raw mut (*best.as_ptr()).links };
NonNull::new(links).map(|current| {
// INVARIANT:
@@ -767,7 +767,7 @@ pub fn remove_current(self) -> (Option<Self>, RBTreeNode<K, V>) {
let node = RBTreeNode { node };
// SAFETY: The reference to the tree used to create the cursor outlives the cursor, so
// the tree cannot change. By the tree invariant, all nodes are valid.
- unsafe { bindings::rb_erase(&mut (*this).links, addr_of_mut!(self.tree.root)) };
+ unsafe { bindings::rb_erase(&mut (*this).links, &raw mut self.tree.root) };
let current = match (prev, next) {
(_, Some(next)) => next,
@@ -803,7 +803,7 @@ fn remove_neighbor(&mut self, direction: Direction) -> Option<RBTreeNode<K, V>>
let neighbor = neighbor.as_ptr();
// SAFETY: The reference to the tree used to create the cursor outlives the cursor, so
// the tree cannot change. By the tree invariant, all nodes are valid.
- unsafe { bindings::rb_erase(neighbor, addr_of_mut!(self.tree.root)) };
+ unsafe { bindings::rb_erase(neighbor, &raw mut self.tree.root) };
// SAFETY: By the type invariant of `Self`, all non-null `rb_node` pointers stored in `self`
// point to the links field of `Node<K, V>` objects.
let this = unsafe { container_of!(neighbor, Node<K, V>, links) }.cast_mut();
@@ -918,7 +918,7 @@ unsafe fn to_key_value_raw<'b>(node: NonNull<bindings::rb_node>) -> (&'b K, *mut
let k = unsafe { &(*this).key };
// SAFETY: The passed `node` is the current node or a non-null neighbor,
// thus `this` is valid by the type invariants.
- let v = unsafe { addr_of_mut!((*this).value) };
+ let v = unsafe { &raw mut (*this).value };
(k, v)
}
}
@@ -1027,7 +1027,7 @@ fn next(&mut self) -> Option<Self::Item> {
self.next = unsafe { bindings::rb_next(self.next) };
// SAFETY: By the same reasoning above, it is safe to dereference the node.
- Some(unsafe { (addr_of_mut!((*cur).key), addr_of_mut!((*cur).value)) })
+ Some(unsafe { (&raw mut (*cur).key, &raw mut (*cur).value) })
}
}
@@ -1170,7 +1170,7 @@ fn insert(self, node: RBTreeNode<K, V>) -> &'a mut V {
// SAFETY: `node` is valid at least until we call `Box::from_raw`, which only happens when
// the node is removed or replaced.
- let node_links = unsafe { addr_of_mut!((*node).links) };
+ let node_links = unsafe { &raw mut (*node).links };
// INVARIANT: We are linking in a new node, which is valid. It remains valid because we
// "forgot" it with `Box::into_raw`.
@@ -1178,7 +1178,7 @@ fn insert(self, node: RBTreeNode<K, V>) -> &'a mut V {
unsafe { bindings::rb_link_node(node_links, self.parent, self.child_field_of_parent) };
// SAFETY: All pointers are valid. `node` has just been inserted into the tree.
- unsafe { bindings::rb_insert_color(node_links, addr_of_mut!((*self.rbtree).root)) };
+ unsafe { bindings::rb_insert_color(node_links, &raw mut (*self.rbtree).root) };
// SAFETY: The node is valid until we remove it from the tree.
unsafe { &mut (*node).value }
@@ -1261,7 +1261,7 @@ fn replace(self, node: RBTreeNode<K, V>) -> RBTreeNode<K, V> {
// SAFETY: `node` is valid at least until we call `Box::from_raw`, which only happens when
// the node is removed or replaced.
- let new_node_links = unsafe { addr_of_mut!((*node).links) };
+ let new_node_links = unsafe { &raw mut (*node).links };
// SAFETY: This updates the pointers so that `new_node_links` is in the tree where
// `self.node_links` used to be.
diff --git a/rust/kernel/sync/arc.rs b/rust/kernel/sync/arc.rs
index 3cefda7a4372..81d8b0f84957 100644
--- a/rust/kernel/sync/arc.rs
+++ b/rust/kernel/sync/arc.rs
@@ -243,7 +243,7 @@ pub fn into_raw(self) -> *const T {
let ptr = self.ptr.as_ptr();
core::mem::forget(self);
// SAFETY: The pointer is valid.
- unsafe { core::ptr::addr_of!((*ptr).data) }
+ unsafe { &raw const (*ptr).data }
}
/// Recreates an [`Arc`] instance previously deconstructed via [`Arc::into_raw`].
diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index 49012e711942..b2ac768eed23 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -257,7 +257,7 @@ pub fn as_ptr(&self) -> *mut bindings::task_struct {
pub fn group_leader(&self) -> &Task {
// SAFETY: The group leader of a task never changes after initialization, so reading this
// field is not a data race.
- let ptr = unsafe { *ptr::addr_of!((*self.as_ptr()).group_leader) };
+ let ptr = unsafe { *(&raw const (*self.as_ptr()).group_leader) };
// SAFETY: The lifetime of the returned task reference is tied to the lifetime of `self`,
// and given that a task has a reference to its group leader, we know it must be valid for
@@ -269,7 +269,7 @@ pub fn group_leader(&self) -> &Task {
pub fn pid(&self) -> Pid {
// SAFETY: The pid of a task never changes after initialization, so reading this field is
// not a data race.
- unsafe { *ptr::addr_of!((*self.as_ptr()).pid) }
+ unsafe { *(&raw const (*self.as_ptr()).pid) }
}
/// Returns the UID of the given task.
diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
index 0cd100d2aefb..34e8abb38974 100644
--- a/rust/kernel/workqueue.rs
+++ b/rust/kernel/workqueue.rs
@@ -401,9 +401,9 @@ pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self
pub unsafe fn raw_get(ptr: *const Self) -> *mut bindings::work_struct {
// SAFETY: The caller promises that the pointer is aligned and not dangling.
//
- // A pointer cast would also be ok due to `#[repr(transparent)]`. We use `addr_of!` so that
- // the compiler does not complain that the `work` field is unused.
- unsafe { Opaque::raw_get(core::ptr::addr_of!((*ptr).work)) }
+ // A pointer cast would also be ok due to `#[repr(transparent)]`. We use `&raw const (*ptr).work`
+ // so that the compiler does not complain that the `work` field is unused.
+ unsafe { Opaque::raw_get(&raw const (*ptr).work) }
}
}
@@ -510,7 +510,7 @@ macro_rules! impl_has_work {
unsafe fn raw_get_work(ptr: *mut Self) -> *mut $crate::workqueue::Work<$work_type $(, $id)?> {
// SAFETY: The caller promises that the pointer is not dangling.
unsafe {
- ::core::ptr::addr_of_mut!((*ptr).$field)
+ &raw mut (*ptr).$field
}
}
}
--
2.48.1
From: Jeff Xu <jeffxu(a)chromium.org>
When mseal was introduced in 6.10, the semantics were that if any VMA within
the specified address range was sealed, mprotect() would be rejected, leaving
all VMAs unmodified. However, the extra loop needed to check the mseal flag of
every VMA slowed things down a bit, so in 6.12 this was solved by removing
can_modify_mm() and checking each VMA's mseal flag directly, without an extra
loop [1]. This is a semantic change: partial updates are now allowed, i.e.
VMAs are updated until a sealed VMA is found.
The new semantics also mean we can allow mprotect on a sealed VMA if the new
attributes are identical to the existing ones. Relaxing this avoids unnecessary
breakage for applications that want to seal a particular mapping, and it has no
security impact.
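As an illustration of the relaxed behavior, here is a hypothetical userspace
sketch (not part of this series; it assumes a kernel with this change applied,
and the __NR_mseal fallback value below is an assumption for x86_64, since libc
may not provide a wrapper):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>

#ifndef __NR_mseal
#define __NR_mseal 462	/* assumed syscall number on x86_64 */
#endif

int main(void)
{
	size_t len = 4096;
	void *p = mmap(NULL, len, PROT_READ,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED || syscall(__NR_mseal, p, len, 0))
		return 1;

	/* No-op: same attributes as before; allowed on a sealed VMA now. */
	printf("mprotect(PROT_READ) = %d\n", mprotect(p, len, PROT_READ));

	/* Actual change: still rejected on a sealed VMA. */
	errno = 0;
	mprotect(p, len, PROT_READ | PROT_WRITE);
	printf("mprotect(PROT_READ|PROT_WRITE) errno = %d\n", errno);

	return 0;
}

With this series applied, the first mprotect() is expected to succeed while the
second keeps failing with EPERM.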
The mseal_test is also modified by this patch to adapt to the new semantics.
Please note that mseal_test is currently undergoing refactoring and will
eventually be replaced with a new memory sealing selftest; in this patch I only
make the minimal change needed to keep it passing. I considered adding a new
testcase to mseal_test to cover the new behavior, but the existing mseal_test
uses patterns that would not pass review, so such a test is better added to the
new, refactored memory sealing tests. The refactoring is currently pending
review [2].
[1] https://lore.kernel.org/all/20240817-mseal-depessimize-v3-0-d8d2e037df30@gm…
[2] https://lore.kernel.org/all/20241211053311.245636-1-jeffxu@google.com/
Jeff Xu (2):
selftests/mm: mseal_test: avoid using no-op mprotect
mseal: allow noop mprotect
mm/mprotect.c | 6 +++---
tools/testing/selftests/mm/mseal_test.c | 6 +++---
2 files changed, 6 insertions(+), 6 deletions(-)
--
2.49.0.rc0.332.g42c0ae87b1-goog
The SO_RCVLOWAT option is defined as 18 in the selftest header,
which matches the generic definition. However, on powerpc,
SO_RCVLOWAT is defined as 16. This discrepancy causes
sol_socket_sockopt() to hit the default switch case and fail on powerpc.
Fix this by defining SO_RCVLOWAT as 16 for powerpc.
Signed-off-by: Saket Kumar Bhaskar <skb99(a)linux.ibm.com>
---
tools/testing/selftests/bpf/progs/bpf_tracing_net.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
index 59843b430f76..bcd44d5018bf 100644
--- a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
+++ b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h
@@ -15,7 +15,11 @@
#define SO_KEEPALIVE 9
#define SO_PRIORITY 12
#define SO_REUSEPORT 15
+#if defined(__TARGET_ARCH_powerpc)
+#define SO_RCVLOWAT 16
+#else
#define SO_RCVLOWAT 18
+#endif
#define SO_BINDTODEVICE 25
#define SO_MARK 36
#define SO_MAX_PACING_RATE 47
--
2.43.5
David, Brendan, Rae,
I am seeing the following error when I run
./tools/testing/kunit/kunit.py run --arch x86_64
ERROR:root:ld:arch/x86/realmode/rm/realmode.lds:236: undefined symbol `sev_es_trampoline_start' referenced in expression
I isolated it to a dependency on CONFIG_AMD_MEM_ENCRYPT and added the option
using --kconfig_add:
./tools/testing/kunit/kunit.py run --arch x86_64 --kconfig_add CONFIG_AMD_MEM_ENCRYPT=y
I see the following:
ERROR:root:Not all Kconfig options selected in kunitconfig were in the generated .config.
This is probably due to unsatisfied dependencies.
Missing: CONFIG_AMD_MEM_ENCRYPT=y
Is there a better way to fix the dependencies? Does the kunit default config
need changing for x86_64?
thanks,
-- Shuah
This is one of just 3 remaining "Test Module" kselftests (the others
being bitmap and scanf), the rest having been converted to KUnit.
I tested this using:
$ tools/testing/kunit/kunit.py run --arch arm64 --make_options LLVM=1 printf
I have also sent out a series converting scanf[0].
Link: https://lore.kernel.org/all/20250204-scanf-kunit-convert-v3-0-386d7c3ee714@… [0]
Signed-off-by: Tamir Duberstein <tamird(a)gmail.com>
---
Changes in v6:
- Use __printf correctly on `__test`. (Petr Mladek)
- Rebase on linux-next.
- Remove leftover references to `printf.sh`.
- Update comment in `hash_pointer`. (Petr Mladek)
- Avoid overrun in `KUNIT_EXPECT_MEMNEQ`. (Petr Mladek)
- Restore trailing newlines on printk strings and add some missing ones.
(Petr Mladek)
- Use `kunit_skip` on not-yet-initialized crng. (Petr Mladek)
- Link to v5: https://lore.kernel.org/r/20250221-printf-kunit-convert-v5-0-5db840301730@g…
Changes in v5:
- Update `do_test` `__printf` annotation (Rasmus Villemoes).
- Link to v4: https://lore.kernel.org/r/20250214-printf-kunit-convert-v4-0-c254572f1565@g…
Changes in v4:
- Add patch "implicate test line in failure messages".
- Rebase on linux-next, move scanf_kunit.c into lib/tests/.
- Link to v3: https://lore.kernel.org/r/20250210-printf-kunit-convert-v3-0-ee6ac5500f5e@g…
Changes in v3:
- Remove extraneous trailing newlines from failure messages.
- Replace `pr_warn` with `kunit_warn`.
- Drop arch changes.
- Remove KUnit boilerplate from CONFIG_PRINTF_KUNIT_TEST help text.
- Restore `total_tests` counting.
- Remove tc_fail macro in last patch.
- Link to v2: https://lore.kernel.org/r/20250207-printf-kunit-convert-v2-0-057b23860823@g…
Changes in v2:
- Incorporate code review from prior work[0] by Arpitha Raghunandan.
- Link to v1: https://lore.kernel.org/r/20250204-printf-kunit-convert-v1-0-ecf1b846a4de@g…
Link: https://lore.kernel.org/lkml/20200817043028.76502-1-98.arpi@gmail.com/t/#u [0]
---
Tamir Duberstein (3):
printf: convert self-test to KUnit
printf: break kunit into test cases
printf: implicate test line in failure messages
Documentation/core-api/printk-formats.rst | 4 +-
Documentation/dev-tools/kselftest.rst | 2 +-
MAINTAINERS | 2 +-
lib/Kconfig.debug | 12 +-
lib/Makefile | 1 -
lib/tests/Makefile | 1 +
lib/{test_printf.c => tests/printf_kunit.c} | 442 ++++++++++++----------------
tools/testing/selftests/kselftest/module.sh | 2 +-
tools/testing/selftests/lib/Makefile | 2 +-
tools/testing/selftests/lib/config | 1 -
tools/testing/selftests/lib/printf.sh | 4 -
11 files changed, 207 insertions(+), 266 deletions(-)
---
base-commit: 7ec162622e66a4ff886f8f28712ea1b13069e1aa
change-id: 20250131-printf-kunit-convert-fd4012aa2ec6
Best regards,
--
Tamir Duberstein <tamird(a)gmail.com>
This series fixes some issues with the global "irq_type" variable that is used
to indicate the current IRQ type to users.
In addition, it avoids an unexpected warning that occurs because interrupts
remain requested after an error from devm_request_irq() is reported.
Patch 1 adds a GET_IRQTYPE test (checking the failure case).
Patches 2-4 contain fixes for stable kernels that have the global "irq_type".
Patches 5-6 contain improvements for the latest kernel.
Changes since v3:
- Add GET_IRQTYPE check to pci_endpoint test in selftests
- Add the reason why global variables aren't necessary (patch 5/6)
- Add Reviewed-by: lines (patch {2, 4, 6}/6)
Changes since v2:
- Rebase to v6.14-rc1
- Update message to clarify, and add result of call trace (patch 1/5)
- Add Reviewed-by: lines (patch 2/5)
- Add new patch to remove global "irq_type" variable (patch 4/5)
- Add new patch to replace "devm" version of IRQ functions (patch 5/5)
Changes since v1:
- Divide original patch into two
- Add an error message example
- Add "pcitest" display example
- Add a patch to fix an interrupt remaining issue
Kunihiko Hayashi (6):
selftests: pci_endpoint: Add GET_IRQTYPE checks to each interrupt test
misc: pci_endpoint_test: Avoid issue of interrupts remaining after
request_irq error
misc: pci_endpoint_test: Fix displaying irq_type after request_irq
error
misc: pci_endpoint_test: Fix irq_type to convey the correct type
misc: pci_endpoint_test: Remove global 'irq_type' and 'no_msi'
misc: pci_endpoint_test: Do not use managed irq functions
drivers/misc/pci_endpoint_test.c | 31 +++++++------------
.../pci_endpoint/pci_endpoint_test.c | 11 ++++++-
2 files changed, 21 insertions(+), 21 deletions(-)
--
2.25.1
Hi all,
This patchset adds a new buddy allocator like (or non-uniform) large folio
split, from an order-n folio to order-m with m < n. It
1. reduces the total number of after-split folios from 2^(n-m) to n-m+1;
2. reduces the amount of memory needed for the multi-index xarray split from
2^(n/6-m/6) to n/6-m/6, assuming XA_CHUNK_SHIFT=6;
3. keeps more large folios after a split: the result contains folios from
order-(n-1) down to order-m instead of only order-m folios.
For example, to split an order-9 to order-0, folio split generates 10
(or 11 for anonymous memory) folios instead of 512, allocates 1 xa_node
instead of 8, and leaves 1 order-8, 1 order-7, ..., 1 order-1 and 2 order-0
folios (or 4 order-0 for anonymous memory) instead of 512 order-0 folios.
Instead of duplicating existing split_huge_page*() code, __folio_split()
is introduced as the shared backend code for both
split_huge_page_to_list_to_order() and folio_split(). __folio_split()
can support both uniform split and buddy allocator like (or non-uniform) split.
All existing split_huge_page*() users can be gradually converted to use
folio_split() if possible. In this patchset, I converted
truncate_inode_partial_folio() to use folio_split().
xfstests quick group passed for both tmpfs and xfs. I also
semi-replicated Hugh's test[12] and ran it without any issue for almost
24 hours.
It is on top of mm-everything-2025-03-07-07-55. It is ready to be merged.
Changelog
===
From V9[13]
1. Incorporated Hugh's fixes[14] (Thanks Hugh):
a) moved folio_set_order() in __split_folio_to_order() to be called
only once for the input folio,
b) used folio_test_swapcache() to catch both anon and shmem in swap
cache cases,
c) moved folio_next() out of for(;;),
d) used mapping instead of origin_folio->mapping.
2. Added a TODO in __folio_split(), since large in-swap-cache shmem folio
split is not supported yet.
3. Changed __split_folio_to_order() based on David Hildenbrand's MM owner
tracking for large folios patchset[15], due to rebasing.
From V8[11]:
1. Removed gfp parameter from xas_try_split() and GFP_NOWAIT is used all
the time. (per Baolin Wang)
2. Used __xas_init_node_for_split() instead of
__xas_alloc_node_for_split() and moved the node allocation out. This fixed
a bug where an xa_node pre-allocated by xas_nomem() before xas_try_split()
is called was not initialized for the split.
From V7[9]:
1. Fixed a wrong function name in lib/test_xarray.c.
2. Made __split_folio_to_order() never fail, since the old order check
is already done in __folio_split(). (per David Hildenbrand)
3. Fixed an issue reported by syzbot[10] by not dropping the original
folio during truncate.
4. Fixed a WARNING when READ_ONLY_THP_FOR_FS is enabled. (Thank David
Hildenbrand for reporting the issue)
5. Used two separate struct page* parameters, split_at and lock_at, to
specify at which subpage the non-uniform split happens and which subpage
to keep locked after the split, respectively. It improves code
readability.
From V6[8]:
1. Added an xarray function xas_try_split() to support iterative folio split,
removing the need of using xas_split_alloc() and xas_split(). The
function guarantees that at most one xa_node is allocated for each
call.
2. Added concrete numbers of after-split folios and xa_node savings to
cover letter, commit log. (per Andrew)
From V5[7]:
1. Split shmem to any lower order patches are in mm tree, so dropped
from this series.
2. Rename split_folio_at() to try_folio_split() to clarify that
non-uniform split will not be used if it is not supported.
From V4[6]:
1. Enabled shmem support in both uniform and buddy allocator like split
and added selftests for it.
2. Added functions to check if uniform split and buddy allocator like
split are supported for the given folio and order.
3. Made truncate fall back to uniform split if buddy allocator split is
not supported (CONFIG_READ_ONLY_THP_FOR_FS and FS without large folio).
4. Added the missing folio_clear_has_hwpoisoned() to
__split_unmapped_folio().
From V3[5]:
1. Used xas_split_alloc(GFP_NOWAIT) instead of xas_nomem(), since extra
operations inside xas_split_alloc() are needed for correctness.
2. Enabled folio_split() for shmem and no issue was found with xfstests
quick test group.
3. Split both ends of a truncate range in truncate_inode_partial_folio()
to avoid wasting memory in shmem truncate (per David Hildenbrand).
4. Removed page_in_folio_offset() since page_folio() does the same
thing.
5. Finished truncate related tests from xfstests quick test group on XFS and
tmpfs without issues.
6. Disabled buddy allocator like split on CONFIG_READ_ONLY_THP_FOR_FS
and FS without large folio. This check was missed in the prior
versions.
From V2[3]:
1. Incorporated all the feedback from Kirill[4].
2. Used GFP_NOWAIT for xas_nomem().
3. Tested the code path when xas_nomem() fails.
4. Added selftests for folio_split().
5. Fixed no THP config build error.
From V1[2]:
1. Split the original patch 1 into multiple ones for easy review (per
Kirill).
2. Added xas_destroy() to avoid memory leak.
3. Fixed nr_dropped not used error (per kernel test robot).
4. Added proper error handling when xas_nomem() fails to allocate memory
for xas_split() during buddy allocator like split.
From RFC[1]:
1. Merged backend code of split_huge_page_to_list_to_order() and
folio_split(). The same code is used for both uniform split and buddy
allocator like split.
2. Use xas_nomem() instead of xas_split_alloc() for folio_split().
3. folio_split() now leaves the first after-split folio unlocked,
instead of the one containing the given page, since
the caller of truncate_inode_partial_folio() locks and unlocks the
first folio.
4. Extended split_huge_page debugfs to use folio_split().
5. Added truncate_inode_partial_folio() as first user of folio_split().
Design
===
folio_split() splits a large folio in the same way as buddy allocator
splits a large free page for allocation. The purpose is to minimize the
number of folios after the split. For example, if user wants to free the
3rd subpage in a order-9 folio, folio_split() will split the order-9 folio
as:
O-0, O-0, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8 if it is anon,
O-1, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8 if it is pagecache,
since anon folios do not support order-1 yet.
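The resulting list for the pagecache case above can be reproduced with a small
userspace sketch (illustrative only, not the kernel code; the function below is
made up to model the buddy allocator like walk, assuming order-1 is allowed):

#include <stdio.h>

/* Print the after-split folios when non-uniformly splitting an
 * order-`old_order` folio down to `new_order`, keeping subpage `page_idx`.
 */
static void show_split(unsigned int old_order, unsigned int new_order,
		       unsigned long page_idx)
{
	unsigned long start = 0;
	unsigned int order;

	for (order = old_order; order > new_order; order--) {
		unsigned long half = 1UL << (order - 1);

		if (page_idx < start + half) {
			/* Target in the lower half: release the upper half. */
			printf("order-%u at subpage %lu\n", order - 1, start + half);
		} else {
			/* Target in the upper half: release the lower half. */
			printf("order-%u at subpage %lu\n", order - 1, start);
			start += half;
		}
	}
	printf("order-%u at subpage %lu (contains the target page)\n",
	       new_order, start);
}

int main(void)
{
	show_split(9, 0, 2);	/* split order-9, keep the 3rd subpage */
	return 0;
}

Running it lists one folio each of order-8 down to order-1 plus two order-0
folios, i.e. 10 folios, matching the O-1, O-0, O-0, O-2, ..., O-8 result above.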
The split process is similar to the existing approach:
1. Unmap all page mappings (split PMD mappings if they exist);
2. Split meta data like memcg, page owner, page alloc tag;
3. Copy meta data in struct folio to sub pages, but instead of splitting
the whole folio into multiple smaller ones with the same order in one
shot, this approach splits the folio iteratively. Taking the example
above, it first splits the original order-9 into two order-8 folios,
then splits the left order-8 into two order-7 folios, and so on;
4. Post-process split folios, e.g. write mapping->i_pages for pagecache,
adjust folio refcounts, and add split folios to the corresponding list;
5. Remap split folios;
6. Unlock split folios.
__split_unmapped_folio() and __split_folio_to_order() replace
__split_huge_page() and __split_huge_page_tail() respectively.
__split_unmapped_folio() uses different approaches to perform
uniform split and buddy allocator like split:
1. uniform split: a single call to __split_folio_to_order() is used to
uniformly split the given folio. All resulting folios are put back to
the list after the split. The folio containing the given page is left
for the caller to unlock; the others are unlocked.
2. buddy allocator like (or non-uniform) split: (old_order - new_order) calls
to __split_folio_to_order() are used, each splitting a folio from order N to
order N-1. After each call, the target folio is changed to the one
containing the page, which is given as a folio_split() parameter, and
folios not containing the page are put back to the list. The folio
containing the page is put back to the list when its order reaches
new_order. All folios are unlocked except the first one, which is left
for the caller to unlock.
Patch Overview
===
1. Patch 1 added a new xarray function xas_try_split() to perform
iterative xarray split.
2. Patch 2 added __split_unmapped_folio() and __split_folio_to_order() to
prepare for moving to new backend split code.
3. Patch 3 moved common code in split_huge_page_to_list_to_order() to
__folio_split().
4. Patch 4 added new folio_split() and made
split_huge_page_to_list_to_order() share the new
__split_unmapped_folio() with folio_split().
5. Patch 5 removed no longer used __split_huge_page() and
__split_huge_page_tail().
6. Patch 6 added a new in_folio_offset to split_huge_page debugfs for
folio_split() test.
7. Patch 7 used try_folio_split() for truncate operation.
8. Patch 8 added folio_split() tests.
Any comments and/or suggestions are welcome. Thanks.
[1] https://lore.kernel.org/linux-mm/20241008223748.555845-1-ziy@nvidia.com/
[2] https://lore.kernel.org/linux-mm/20241028180932.1319265-1-ziy@nvidia.com/
[3] https://lore.kernel.org/linux-mm/20241101150357.1752726-1-ziy@nvidia.com/
[4] https://lore.kernel.org/linux-mm/e6ppwz5t4p4kvir6eqzoto4y5fmdjdxdyvxvtw43nc…
[5] https://lore.kernel.org/linux-mm/20241205001839.2582020-1-ziy@nvidia.com/
[6] https://lore.kernel.org/linux-mm/20250106165513.104899-1-ziy@nvidia.com/
[7] https://lore.kernel.org/linux-mm/20250116211042.741543-1-ziy@nvidia.com/
[8] https://lore.kernel.org/linux-mm/20250205031417.1771278-1-ziy@nvidia.com/
[9] https://lore.kernel.org/linux-mm/20250211155034.268962-1-ziy@nvidia.com/
[10] https://lore.kernel.org/all/67af65cb.050a0220.21dd3.004a.GAE@google.com/
[11] https://lore.kernel.org/linux-mm/20250218235012.1542225-1-ziy@nvidia.com/
[12] https://lore.kernel.org/linux-mm/D45D4F01-E5A5-47E6-8724-01610CC192CC@nvidi…
[13] https://lore.kernel.org/linux-mm/20250226210032.2044041-1-ziy@nvidia.com/
[14] https://lore.kernel.org/linux-mm/2fae27fe-6e2e-3587-4b68-072118d80cf8@googl…
[15] https://lore.kernel.org/all/20250303163014.1128035-4-david@redhat.com/
Zi Yan (8):
xarray: add xas_try_split() to split a multi-index entry
mm/huge_memory: add two new (not yet used) functions for folio_split()
mm/huge_memory: move folio split common code to __folio_split()
mm/huge_memory: add buddy allocator like (non-uniform) folio_split()
mm/huge_memory: remove the old, unused __split_huge_page()
mm/huge_memory: add folio_split() to debugfs testing interface
mm/truncate: use folio_split() in truncate operation
selftests/mm: add tests for folio_split(), buddy allocator like split
Documentation/core-api/xarray.rst | 14 +-
include/linux/huge_mm.h | 36 +
include/linux/xarray.h | 6 +
lib/test_xarray.c | 52 ++
lib/xarray.c | 132 ++-
mm/huge_memory.c | 786 ++++++++++++------
mm/truncate.c | 37 +-
tools/testing/radix-tree/Makefile | 1 +
.../selftests/mm/split_huge_page_test.c | 34 +-
9 files changed, 809 insertions(+), 289 deletions(-)
--
2.47.2