Hi all,
This series is based on previous RFCs/discussions:
Tech topic: https://lore.kernel.org/linux-iommu/20250918214425.2677057-1-amastro@fb.com/
RFCv1: https://lore.kernel.org/all/20260226202211.929005-1-mattev@meta.com/
RFCv2: https://lore.kernel.org/kvm/20260312184613.3710705-1-mattev@meta.com/
The background/rationale is covered in more detail in the RFC cover
letters. The TL;DR is:
The goal is to enable userspace driver designs that use VFIO to export
DMABUFs representing subsets of PCI device BARs, and "vend" those
buffers from a primary process to other subordinate processes by fd.
These processes then mmap() the buffers and their access to the device
is isolated to the exported ranges. This is an improvement on sharing
the VFIO device fd to subordinate processes, which would allow
unfettered access.
This is achieved by enabling mmap() of vfio-pci DMABUFs, passed by fd
to subordinate processes. Second, a new ioctl()-based revocation
mechanism is added to allow the primary process to forcibly revoke
access to previously-shared BAR spans, even if the subordinate
processes haven't cleanly exited.
(The related topic of safe delegation of iommufd control to the
subordinate processes is not addressed here, and is follow-up work.)
As well as isolation and revocation, another advantage to accessing a
BAR through a VMA backed by a DMABUF is that it's straightforward to
mmap() the buffer with access attributes, such as write-combining.
Feedback from the RFCs requested that, instead of creating
DMABUF-specific vm_ops and .fault paths, to go the whole way and
migrate the existing VFIO PCI BAR mmap() to be backed by a DMABUF too,
resulting in a common vm_ops and fault handler for mmap()s of both the
VFIO device and explicitly-exported DMABUFs. This will help future
iommufd emulation of VFIO Type1 peer-to-peer, making it easier to get
a DMABUF for a VFIO BAR as a DMA target.
mmap() conversion to use DMABUF underneath has been done for vfio-pci,
but not sub-drivers:
nvgrace-gpu's mmap() override path is unchanged; I kept this out of
scope for now not least because I don't have a thorough test setup
for this system. I would prefer to help the nvgrace-gpu maintainers
enable BAR mmap() DMABUFs themselves.
Notes on patches
================
PCI/P2PDMA: Add CONFIG_PCI_P2PDMA_CORE
Later in the series, vfio-pci's mmap() is going to depend on
pcim_p2pdma_provider() which depended on CONFIG_PCI_P2PDMA, which
in turn depended on ZONE_DEVICE (which isn't available on 32-bit
and some archs, because they lack MEMORY_HOTPLUG and friends).
VFIO does _not_ require actual P2P to be present for basic mmap()
functionality, only for the optional CONFIG_DMA_SHARED_BUFFER
feature.
This splits P2PDMA into a CONFIG_PCI_P2PDMA_CORE (which currently
contains pcim_p2pdma_provider()) and an optional CONFIG_PCI_P2PDMA
(which depends on ZONE_DEVICE etc., and provides P2P
functionality).
vfio/pci: Add a helper to look up PFNs for DMABUFs
vfio/pci: Add a helper to create a DMABUF for a BAR-map VMA
The first is for a DMABUF VMA fault handler to determine
arbitrary-sized PFNs from ranges in DMABUF. Secondly, refactor
DMABUF export for use by the existing export feature and add a new
helper that creates a DMABUF corresponding to a VFIO BAR mmap()
request.
vfio/pci: Convert BAR mmap() to use a DMABUF
The vfio-pci core mmap() creates a DMABUF with the helper, and the
vm_ops fault handler uses the other helper to resolve the fault.
Because this depends on DMABUF structs/code, CONFIG_VFIO_PCI_CORE
needs to depend on CONFIG_DMA_SHARED_BUFFER. The
CONFIG_VFIO_PCI_DMABUF still conditionally enables the export
support code.
NOTE: The user mmap()s a device fd, but the resulting VMA's vm_file
becomes that of the DMABUF which takes ownership of the device and
puts it on release. This maintains the existing behaviour of a VMA
keeping the VFIO device open.
BAR zapping then happens via the existing vfio_pci_dma_buf_move()
path, which now needs to unmap PTEs in the DMABUF's address_space.
vfio/pci: Provide a user-facing name for BAR mappings
There was a request for decent debug naming in /proc/<pid>/maps
etc. comparable to the existing VFIO names: since the VMAs are
DMABUFs, they have a "dmabuf:" prefix and can't be 100% identical
to before. This is a user-visible change, but this patch at least
now gives us extra info on the BDF & BAR being mapped.
vfio/pci: Clean up BAR zap and revocation
In general (see NOTE!) the vfio_pci_zap_bars() is now obsolete,
since it unmaps PTEs in the VFIO device address_space which is now
unused. This consolidates all calls (e.g. around reset) with the
neighbouring vfio_pci_dma_buf_move()s into new functions, to
revoke-zap/unrevoke.
!!! NOTE: the nvgrace-gpu driver continues to use its own private
vm_ops, fault handler, etc. for its special memregions, and these
DO still add PTEs to the VFIO device address_space. So, a
temporary flag, vdev->bar_needs_zap, maintains the old behaviour
for this use. At least this patch's consolidation makes it easy to
remove the remaining zap when this need goes away; a FIXME reminds
that this can be removed when nvgrace-gpu is converted.
vfio/pci: Support mmap() of a VFIO DMABUF
Adds mmap() for a DMABUF fd exported from vfio-pci.
It was a goal to keep the VFIO device fd lifetime behaviour
unchanged with respect to the DMABUFs. An application can close
all device fds, and this will revoke/clean up all DMABUFs; no
mappings or other access can be performed now. When enabling
mmap() of the DMABUFs, this means access through the VMA is also
revoked. This complicates the fault handler because whilst the
DMABUF exists, it has no guarantee that the corresponding VFIO
device is still alive. Adds synchronisation ensuring the vdev is
available before vdev->memory_lock is touched; this holds the
device registration so that even if the buffer has been cleaned up,
vdev hasn't been freed and so the lock can be safely taken.
(I decided against the alternative of preventing cleanup by holding
the VFIO device open if any DMABUFs exist, because it's both a
change of behaviour and less clean overall.)
I've added a chonky comment in place, happy to clarify more if you
have ideas.
This commit makes VFIO_PCI_CORE depend on PCI_P2PDMA_CORE (commit
1) to bring in (only) the P2PDMA provider code.
vfio/pci: Permanently revoke a DMABUF on request
By weight, this is mostly a rename of revoked to an enum, status.
There are now 3 states for a buffer, usable and revoked
temporary/permanent. A new VFIO device ioctl is added,
VFIO_DEVICE_PCI_DMABUF_REVOKE, which passes a DMABUF (exported from
that device) and permanently revokes it. Thus a userspace driver
can guarantee any downstream consumers of a shared fd are prevented
from accessing a BAR range, and that range can be reused.
The code doing revocation in vfio_pci_dma_buf_move() is moved,
unchanged, to a common function for use by _move() and the new
ioctl path.
Q: I can't think of a good reason to temporarily revoke/unrevoke
buffers from userspace, so didn't add a 'flags' field to the ioctl
struct. Easy to add if people think it's worthwhile for future
use.
vfio/pci: Add mmap() attributes to DMABUF feature
Adds a new VFIO feature, VFIO_DEVICE_FEATURE_DMA_BUF_MEMATTR.
After a DMABUF is exported, this feature ioctl() isused to set a
memory attribute that will be used by future mmap()s of the DMABUF
fd (i.e. it does nothing for any existing maps).
The default is UC, and via the feature one can specify CPU access
as WC. The attribute is an enum/scalar rather than
bitmap/cumulative. The attributes follow a "try-fail" model where
a client can request an attribute and either succeed or fail with
ENOTSUPP if it's unknown; if future attributes are
platform-specific then their support can be probed.
(Since it's just UC/WC for now, there is no reservation or numeric
structure to the namespace yet, but we could support
system/arch-specific values in future by carving out base +
arch-specific + IMPDEF ranges.)
Testing
=======
(The [RFC ONLY] userspace test program, for QEMU edu-plus, has been
dropped from the series, but can be found in the GitHub branch below.
It at least illustrates the export, map, revoke, attribute, and close
semantics interoperate.)
This code has been tested in mapping DMABUFs of single/multiple
ranges, aliasing mmap()s, aliasing ranges across DMABUFs, vm_pgoff >
0, revocation, shutdown/cleanup scenarios, and hugepage mappings seem
to work correctly. I've lightly tested WC mappings also (by observing
resulting PTEs as having the correct attributes...). No regressions
observed on the VFIO selftests, or on our internal vfio-pci
applications.
End
===
This is based on VFIO next (e.g. at b9285405c5f6).
These commits are on GitHub for easier browsing, along with
"[RFC ONLY] selftests: vfio: Add standalone vfio_dmabuf_mmap_test":
https://github.com/metamev/linux/compare/b9285405c5f6...metamev:linux:dev/m…
Thanks for reading,
Matt
================================================================================
Change log:
v2:
- Rebase on VFIO next, picking up Alex's
vfio_pci_dma_buf_move()/vfio_pci_dma_buf_cleanup() fixes, and
dropping "vfio/pci: Fix vfio_pci_dma_buf_cleanup() double-put"
- Added "PCI/P2PDMA: Add CONFIG_PCI_P2PDMA_CORE" so that the
newly-added vfio-pci hard dependency on the P2PDMA provider instead
pulls in the _CORE variant and not the full-fat CONFIG_PCI_P2PDMA.
This means that the core of vfio-pci does not need ZONE_DEVICE, but
if it's available then enabling P2PDMA in turn enables DMABUF
export. Fixes basic VFIO operation on 32b or other platforms without
ZONE_DEVICE.
- Fixed comment inaccuracy in vfio_pci_dma_buf_revoke() and cleaned
up vdev validity test.
- vfio_pci_dma_buf_find_pfn(): use PAGE_ALIGN(), better span variable
naming, OVF check
- Made vm_pgoffs use consistent (keeping the resource index at the
top and masking where offset is used). For BAR mmap, use new
vma_pgoff_adjust to create the DMABUF with the exact mmap()ed span
instead of from the start of the BAR with an invisible portion
before the mapping.
- Added VFIO_DEVICE_FEATURE_DMA_BUF_MEMATTR to set memory attributes,
instead of using the export `flags` field.
- vfio_pci_ioctl_reset: Moved vfio_pci_zap_revoke_bars()
(effectively, vfio_pci_dma_buf_move()) back after D0 transition.
Note, if a BAR zap is needed, it's done in this function so now
happens after this D0 transition with the _move; it was done before
it at the time of the memory_lock taking.
- Minimised vfio_pci_dma_buf_mmap() (removed redundant span check),
added READ_ONCE for memattr
- Misc fixes: comment in DMABUF name generation, removed superfluous
READ_ONCE from faulthandler
v1:
https://lore.kernel.org/kvm/20260416131815.2729131-1-mattev@meta.com/
- Cleanup of the common DMABUF-aware VMA vm_ops fault handler and
export code.
- Fixed a lot of races, particularly faults racing with DMABUF
cleanup (if the VFIO device fds close, for example).
- Added nicer human-readable names for VFIO mmap() VMAs
RFCv2: Respin based on the feedback/suggestions:
https://lore.kernel.org/kvm/20260312184613.3710705-1-mattev@meta.com/
- Transform the existing VFIO BAR mmap path to also use DMABUFs
behind the scenes, and then simply share that code for
explicitly-mapped DMABUFs. Jason wanted to go that direction to
enable iommufd VFIO type 1 emulation to pick up a DMABUF for an IO
mapping.
- Revoke buffers using a VFIO device fd ioctl
RFCv1:
https://lore.kernel.org/all/20260226202211.929005-1-mattev@meta.com/
Matt Evans (9):
PCI/P2PDMA: Add CONFIG_PCI_P2PDMA_CORE
vfio/pci: Add a helper to look up PFNs for DMABUFs
vfio/pci: Add a helper to create a DMABUF for a BAR-map VMA
vfio/pci: Convert BAR mmap() to use a DMABUF
vfio/pci: Provide a user-facing name for BAR mappings
vfio/pci: Clean up BAR zap and revocation
vfio/pci: Support mmap() of a VFIO DMABUF
vfio/pci: Permanently revoke a DMABUF on request
vfio/pci: Add mmap() attributes to DMABUF feature
drivers/pci/Kconfig | 10 +-
drivers/pci/Makefile | 2 +-
drivers/pci/p2pdma.c | 16 +
drivers/vfio/pci/Kconfig | 4 +-
drivers/vfio/pci/Makefile | 3 +-
drivers/vfio/pci/nvgrace-gpu/main.c | 5 +
drivers/vfio/pci/vfio_pci_config.c | 30 +-
drivers/vfio/pci/vfio_pci_core.c | 225 +++++++++---
drivers/vfio/pci/vfio_pci_dmabuf.c | 548 ++++++++++++++++++++++++----
drivers/vfio/pci/vfio_pci_priv.h | 57 ++-
include/linux/pci-p2pdma.h | 24 +-
include/linux/pci.h | 2 +-
include/linux/vfio_pci_core.h | 1 +
include/uapi/linux/vfio.h | 57 +++
14 files changed, 815 insertions(+), 169 deletions(-)
--
2.47.3
Most of this patch series has already been pushed upstream, this is just
the second half of the patch series that has not been pushed yet + some
additional changes which were required to implement changes requested by
the mailing list. This patch series is originally from Asahi, previously
posted by Daniel Almeida.
The previous version of the patch series can be found here:
https://patchwork.freedesktop.org/series/164580/
Branch with patches applied available here:
https://gitlab.freedesktop.org/lyudess/linux/-/commits/rust/gem-shmem
This patch series applies on top of drm-rust-next
Patch-series wide changes since V15:
* Fix some major rebasing errors I somehow didn't notice :(
* Drop the dependency on LazyInit, use the trick that Alice suggested
instead.
* Fix dependency ordering so that Tyr can get the vmap stuff first
without the other bits.
Lyude Paul (6):
rust: drm: gem/shmem: Add DmaResvGuard helper
rust: drm: gem: Add vmap functions to shmem bindings
rust: sync: Add SetOnce::reset()
rust: gem: shmem: Fix Default implementation for ObjectConfig
rust: faux: Allow retrieving a bound Device
rust: drm: gem: Introduce shmem::Object::sg_table()
rust/kernel/drm/gem/shmem.rs | 507 ++++++++++++++++++++++++++++++++++-
rust/kernel/faux.rs | 7 +-
rust/kernel/sync/set_once.rs | 60 ++++-
3 files changed, 552 insertions(+), 22 deletions(-)
base-commit: b78dab829760aee9b83f5cf15550a0fe36c6f4b0
--
2.54.0
Remember the joy of classic arcade games? The simple yet addictive thrill of guiding a character through a challenging environment, aiming for the top score? In the vast ocean of online games, one title consistently delivers that nostalgic punch with a modern twist: Slither io. This seemingly straightforward game, where you control a growing snake, offers an engaging and surprisingly strategic experience that can captivate players for hours. If you're looking for a delightful way to pass the time or a new challenge to conquer, let's dive into the charming world of Slither io.
The Hypnotic Dance of Growth: Understanding Slither io Gameplay
At its core, Slither io is an incredibly easy game to pick up. You start as a small, unassuming snake, slithering across a vast arena filled with colorful, glowing dots. Your primary objective? To eat these dots and grow. Each dot consumed adds a tiny segment to your snake, making you longer and, consequently, more imposing.
https://slitherio.onl
The catch, and where the real fun begins, lies in the other snakes. The arena is teeming with fellow players, each trying to grow their own serpentine champion. Your snake's head is its Achilles' heel. If your head collides with any part of another snake’s body, it's game over. You burst into a cloud of those precious glowing dots, which then become a feast for your rivals. Conversely, if another snake’s head collides with your body, they explode, leaving their accumulated mass for you to gobble up. This creates a thrilling dynamic of predator and prey, where even the smallest snake can take down the largest with a well-timed maneuver.
Movement is equally intuitive. You control your snake's direction with your mouse or finger (on mobile). A subtle but crucial mechanic is "boosting." By holding down the left mouse button (or tapping and holding on mobile), your snake speeds up, leaving a trail of its own mass behind. This boost is invaluable for quick escapes, aggressive maneuvers, or snatching up desirable food before a competitor. However, be warned: boosting constantly will cause your snake to shrink, as you sacrifice your hard-earned length for speed. Mastering the art of when and how to boost is a key element of success.
Weaving Your Way to Victory: Essential Tips and Strategies
While the basic gameplay is simple, becoming a top-tier Slither io player requires a blend of observation, strategy, and a little bit of nerve. Here are some tips to help you dominate the arena:
Patience is a Virtue: Don't rush into every cluster of food or chase every small snake. Focus on steady growth initially. Stay near the edges of the map where there's usually less competition and more scattered dots.
The Art of the Encirclement: One of the most satisfying ways to take down a larger opponent is to encircle them. Once you're big enough, coil around a smaller snake, slowly tightening your grip until they have nowhere to go but into your body. This takes practice and a good sense of timing.
Boost Wisely, Not Wildly: Boosting is a powerful tool, but use it strategically. It's excellent for darting in to steal food from under another snake's nose, making a quick escape from a dangerous situation, or closing in on a vulnerable opponent. Avoid continuous boosting, as it shrinks you rapidly.
The "Head-on" Gambit: Sometimes, a risky head-on approach can pay off. If you're confident in your agility, you can try to cut in front of another snake's path, forcing them to collide with your body. This is a high-risk, high-reward strategy.
Learn from the Fallen: When a large snake explodes, it leaves a significant amount of food. This is often the prime target for many players, leading to chaotic scrambles. Don't be afraid to dive into these situations, but be extremely careful. Approach the food from the outside, picking off segments rather than rushing directly into the center.
Observe and Anticipate: Pay attention to the movements of other snakes. Are they aggressive? Are they fleeing? Anticipating their next move can give you a crucial advantage in both offense and defense.
Master the Corner Turn: When you’re long, turning sharply can be tricky. Practice making smooth, wide turns to avoid accidental collisions with your own body or other snakes. Conversely, a quick, tight turn can sometimes trap an unsuspecting smaller snake.
The Endless Appeal of the Serpent's Saga
Slither io isn't just about winning; it's about the journey. The thrill of outmaneuvering a colossal snake, the satisfaction of seeing your own grow to an impressive length, and the continuous challenge of navigating a bustling arena make for a truly engaging experience. The game's vibrant colors, smooth animations, and surprisingly competitive gameplay combine to create an addictive and enjoyable pastime. Whether you're a seasoned gamer or just looking for a simple yet captivating diversion, take a chance on this modern classic. You might just find yourself happily slithering for hours on end, aiming to become the longest and most fearsome serpent in the arena. Experience the fun yourself at Slither io and embark on your own journey to the top of the leaderboard!
In the vast and ever-growing world of mobile gaming, sometimes the simplest concepts deliver the most satisfying experiences. One such gem is Block Blast, a captivating puzzle game that combines the familiar mechanics of Tetris with a fresh, strategic twist. If you're looking for a relaxing yet engaging way to challenge your mind, then Block Blast might just be your next addiction.
https://blockblasts.io/
What is Block Blast?
Imagine a 10x10 grid, initially empty. Your goal is to fill this grid with various-shaped blocks that appear at the bottom of the screen, one set of three at a time. The catch? You can't rotate the blocks, and you must place all three given blocks before new ones appear. The objective is to clear lines and columns by filling them completely. Once a line or column is full, it disappears, freeing up space for more blocks and earning you points. The game ends when you can no longer place any of the current blocks on the grid.
Gameplay: Simple to Grasp, Challenging to Master
The beauty of Block Blast lies in its straightforward mechanics. You simply drag and drop the blocks from the bottom onto the grid. There's no time limit, no frantic swiping – just thoughtful placement. However, don't let the simplicity fool you. As the grid fills up, strategic thinking becomes paramount. You'll quickly learn the importance of anticipating future block shapes and planning your placements to create more clearing opportunities. Should you save that long straight piece for a full column clear, or use it now to open up a crucial corner? These are the delightful dilemmas you'll face. The game encourages a flow state, where you’re constantly evaluating and adapting to the evolving grid.
Tips for Becoming a Block Blast Master
While the game is easy to pick up, a few strategies can significantly boost your scores and enjoyment:
Prioritize Clears: Don't be afraid to clear lines or columns, even if it means using a block in a less-than-ideal spot. Clearing space is crucial for longevity.
Think Ahead: Always glance at the next set of blocks. This allows you to plan your current placements with future pieces in mind, creating combos and larger clears.
Corners are King: Filling corners and edges can be tricky, so try to tackle them early when you have more space. Don't leave isolated blocks in odd spots.
The "L" and "T" Block Dilemma: These oddly shaped blocks can be your best friends or worst enemies. Learn how to integrate them into your clears effectively, often by leaving gaps for them.
Practice Makes Perfect: Like any puzzle game, consistent play will improve your spatial reasoning and pattern recognition, leading to higher scores.
Conclusion
Block Blast offers a delightful blend of relaxation and mental stimulation. Its intuitive design makes it accessible to everyone, while its strategic depth keeps players engaged for countless hours. Whether you're looking for a quick brain break or a long, meditative puzzle session, Block Blast provides a satisfying and rewarding experience. Give it a try, and you might just find your new favorite way to unwind and sharpen your mind, one block at a time.
Ready to unleash your inner fruit ninja without the mess? Then get ready to dive into the addictively simple, yet surprisingly challenging world of Slice Master. This game, readily available online, is perfect for a quick burst of fun or a more extended gaming session. It’s a testament to the fact that gameplay doesn't need to be complex to be engaging.
https://slicemasterfree.com
Gameplay: Simple Mechanics, Endless Fun
The core concept of Slice Master is refreshingly straightforward. Colorful fruits are launched into the air, and your mission is to slice them into pieces before they fall off the screen. You control a virtual blade with your mouse or finger (depending on the platform), and drawing lines through the fruit initiates the slicing action.
The catch? You have limited lives, and letting too many fruits fall untouched will result in a game over. Occasionally, you'll also encounter bombs mixed in with the fruit barrage. Accidentally slicing a bomb will end your run instantly, adding a layer of strategic thinking to the rapid-fire action.
As you progress, the game throws different types of fruit at you, some requiring multiple slices, and the speed increases gradually, demanding faster reflexes and more precise movements. Special fruits might offer score multipliers or other benefits, adding further depth to the gameplay. It’s a game where practice truly makes perfect, and mastering the art of fruit slicing is incredibly satisfying. You can try it out now by clicking on Slice Master.
Tips for Achieving Fruit-Slicing Mastery
While the game seems simple on the surface, a few strategies can significantly improve your score and extend your gameplay.
• Focus on Efficiency: Instead of frantically slashing at individual fruits, try to slice multiple fruits with a single, well-aimed swipe. This not only increases your score but also conserves your limited slicing time.
• Prioritize High-Value Fruits: Keep an eye out for special fruits that offer bonus points or multipliers. Slicing these at the right moment can dramatically boost your score.
• Be Mindful of Bombs: This one is crucial! Always be aware of the position of the bombs and avoid them at all costs. A moment of carelessness can instantly end your game. Try to train yourself to recognize them early and plan your slices accordingly.
• Practice Makes Perfect: Like any skill-based game, practice is essential for improving your reflexes and accuracy. The more you play, the better you'll become at predicting fruit trajectories and executing precise slices. So, keep practicing and you'll be reaching new high scores in no time!
In Conclusion: A Slice of Addictive Fun
Slice Master offers a surprisingly addictive and engaging gaming experience, despite its simple premise. Its accessible gameplay, combined with the escalating challenge, makes it a perfect choice for a quick dose of entertainment or a more extended gaming session. Whether you're looking for a casual distraction or a skill-based challenge, Slice Master provides a satisfying and fun way to test your reflexes and accuracy. So, grab your virtual blade and prepare to unleash your inner fruit-slicing ninja!
In case MMIO size is bigger than 4G and peer2peer DMA goes
through host bridge, we trigger a code path that assigns the
total linked IOVA (which is greater than 4G) to mapped_len.
Previously, `mapped_len` was declared as 32-bit `unsigned int`.
When accumulating `size_t` lengths, this leads to a silent wrap-around.
This truncation causes truncated lengths to be passed to functions
like `fill_sg_entry()`.
Fix this by changing `mapped_len` to `size_t` (64-bit). While
at it, fix similar potential overflow issues in `calc_sg_nents`
by using `size_t` for `nents` and checking against `UINT_MAX`
and using `unsigned int` for the loop iterator in `fill_sg_entry`
to match.
Fixes: 3aa31a8bb11e ("dma-buf: provide phys_vec to scatter-gather mapping routine")
Cc: stable(a)vger.kernel.org
Cc: iommu(a)lists.linux.dev
Reviewed-by: Pranjal Shrivastava <praan(a)google.com>
Signed-off-by: David Hu <xuehaohu(a)google.com>
---
Changes in v5:
- Removed WARN_ON_ONCE from calc_sg_nents() to avoid log noise (Jason).
- Added explicit check for `!nents` in dma_buf_phys_vec_to_sgt() to
cleanly return -EINVAL on overflow (Jason).
Changes in v4:
- Added WARN_ON_ONCE() to the nents overflow check to prevent silent
failures (Claude Bot).
Changes in v3:
- Removed leftover sentence fragment from the commit message.
- Kept `nents = 0` initialization (previously stated as removed in the
v2 changelog) as it is strictly required for the `+=` accumulation
loop in `calc_sg_nents()`.
Changes in v2:
- Fixed 'IVOA' -> 'IOVA' typo and expanded commit message (Claude Bot).
- Added Reverse Xmas tree formatting (Pranjal).
- Folded in extra bounds checking for calc_sg_nents() (Pranjal).
- Folded in type consistency fix for fill_sg_entry() (Pranjal).
drivers/dma-buf/dma-buf-mapping.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/dma-buf/dma-buf-mapping.c b/drivers/dma-buf/dma-buf-mapping.c
index 794acff2546a..607b7998463d 100644
--- a/drivers/dma-buf/dma-buf-mapping.c
+++ b/drivers/dma-buf/dma-buf-mapping.c
@@ -10,7 +10,7 @@ static struct scatterlist *fill_sg_entry(struct scatterlist *sgl, size_t length,
dma_addr_t addr)
{
unsigned int len, nents;
- int i;
+ unsigned int i;
nents = DIV_ROUND_UP(length, UINT_MAX);
for (i = 0; i < nents; i++) {
@@ -36,7 +36,7 @@ static unsigned int calc_sg_nents(struct dma_iova_state *state,
struct phys_vec *phys_vec, size_t nr_ranges,
size_t size)
{
- unsigned int nents = 0;
+ size_t nents = 0;
size_t i;
if (!state || !dma_use_iova(state)) {
@@ -51,6 +51,9 @@ static unsigned int calc_sg_nents(struct dma_iova_state *state,
nents = DIV_ROUND_UP(size, UINT_MAX);
}
+ if (nents > UINT_MAX)
+ return 0;
+
return nents;
}
@@ -95,9 +98,10 @@ struct sg_table *dma_buf_phys_vec_to_sgt(struct dma_buf_attachment *attach,
size_t nr_ranges, size_t size,
enum dma_data_direction dir)
{
- unsigned int nents, mapped_len = 0;
struct dma_buf_dma *dma;
struct scatterlist *sgl;
+ size_t mapped_len = 0;
+ unsigned int nents;
dma_addr_t addr;
size_t i;
int ret;
@@ -133,6 +137,11 @@ struct sg_table *dma_buf_phys_vec_to_sgt(struct dma_buf_attachment *attach,
}
nents = calc_sg_nents(dma->state, phys_vec, nr_ranges, size);
+ if (!nents) {
+ ret = -EINVAL;
+ goto err_free_state;
+ }
+
ret = sg_alloc_table(&dma->sgt, nents, GFP_KERNEL | __GFP_ZERO);
if (ret)
goto err_free_state;
--
2.54.0.929.g9b7fa37559-goog
In case MMIO size is bigger than 4G and peer2peer DMA goes
through host bridge, we trigger a code path that assigns the
total linked IOVA (which is greater than 4G) to mapped_len.
Previously, `mapped_len` was declared as 32-bit `unsigned int`.
When accumulating `size_t` lengths, this leads to a silent wrap-around.
This truncation causes truncated lengths to be passed to functions
like `fill_sg_entry()`.
Fix this by changing `mapped_len` to `size_t` (64-bit). While
at it, fix similar potential overflow issues in `calc_sg_nents`
by using `size_t` for `nents` and checking against `UINT_MAX`
and using `unsigned int` for the loop iterator in `fill_sg_entry`
to match.
Fixes: 3aa31a8bb11e ("dma-buf: provide phys_vec to scatter-gather mapping routine")
Cc: stable(a)vger.kernel.org
Cc: iommu(a)lists.linux.dev
Reviewed-by: Pranjal Shrivastava <praan(a)google.com>
Signed-off-by: David Hu <xuehaohu(a)google.com>
---
Changes in v4:
- Added WARN_ON_ONCE() to the nents overflow check to prevent silent
failures (Claude Bot).
Changes in v3:
- Removed leftover sentence fragment from the commit message.
- Kept `nents = 0` initialization (previously stated as removed in the
v2 changelog) as it is strictly required for the `+=` accumulation
loop in `calc_sg_nents()`.
Changes in v2:
- Fixed 'IVOA' -> 'IOVA' typo and expanded commit message (Claude Bot).
- Added Reverse Xmas tree formatting (Pranjal).
- Folded in extra bounds checking for calc_sg_nents() (Pranjal).
- Folded in type consistency fix for fill_sg_entry() (Pranjal).
drivers/dma-buf/dma-buf-mapping.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/dma-buf/dma-buf-mapping.c b/drivers/dma-buf/dma-buf-mapping.c
index 794acff2546a..1aabc0ee70bb 100644
--- a/drivers/dma-buf/dma-buf-mapping.c
+++ b/drivers/dma-buf/dma-buf-mapping.c
@@ -10,7 +10,7 @@ static struct scatterlist *fill_sg_entry(struct scatterlist *sgl, size_t length,
dma_addr_t addr)
{
unsigned int len, nents;
- int i;
+ unsigned int i;
nents = DIV_ROUND_UP(length, UINT_MAX);
for (i = 0; i < nents; i++) {
@@ -36,7 +36,7 @@ static unsigned int calc_sg_nents(struct dma_iova_state *state,
struct phys_vec *phys_vec, size_t nr_ranges,
size_t size)
{
- unsigned int nents = 0;
+ size_t nents = 0;
size_t i;
if (!state || !dma_use_iova(state)) {
@@ -51,6 +51,9 @@ static unsigned int calc_sg_nents(struct dma_iova_state *state,
nents = DIV_ROUND_UP(size, UINT_MAX);
}
+ if (WARN_ON_ONCE(nents > UINT_MAX))
+ return 0;
+
return nents;
}
@@ -95,9 +98,10 @@ struct sg_table *dma_buf_phys_vec_to_sgt(struct dma_buf_attachment *attach,
size_t nr_ranges, size_t size,
enum dma_data_direction dir)
{
- unsigned int nents, mapped_len = 0;
struct dma_buf_dma *dma;
struct scatterlist *sgl;
+ size_t mapped_len = 0;
+ unsigned int nents;
dma_addr_t addr;
size_t i;
int ret;
--
2.54.0.929.g9b7fa37559-goog
The reason for this is to properly support the spi nor chip on the
Jetson Xavier NX module. Prior to this, it would time out on all
transfers and sometimes even trigger a cbb fault, locking up the entire
unit. With this, reading and writing to the flash memory works as
expected.
This also fixes the tegra210-quad spi driver to properly use the dma
memory space instead of the spi controllers. Without this, enabling dma
on the controllers results in mmu faults.
The driver change has only been tested on tegra210 / p3450 and tegra194
/ p3518 as that is the only available test platforms. Tegra234 and
Tegra241 should also be verified. I have p3766 for tegra234, but the
qspi flash memory is firewalled by mb1 on all publicly available
bootloaders, and no other spi devices are part of the devkit.
---
Changes in v2:
- Drop bindings patches
- Add patch to use dma memory space instead of the spi controllers when
dma is enabled.
- Drop iommu properties from final patch
- Link to v1: https://lore.kernel.org/r/20260515-tegra194-qspi-iommu-v1-0-57dfb63cd3d6@gm…
---
Aaron Kling (2):
spi: tegra210-quad: Allocate DMA memory for DMA engine
arm64: tegra: Enable DMA Support on Tegra194 QSPI
arch/arm64/boot/dts/nvidia/tegra194.dtsi | 4 ++++
drivers/spi/spi-tegra210-quad.c | 29 ++++++++++++++++++-----------
2 files changed, 22 insertions(+), 11 deletions(-)
---
base-commit: c1ecb239fa3456529a32255359fc78b69eb9d847
change-id: 20260515-tegra194-qspi-iommu-e4e4644d5fdf
Best regards,
--
Aaron Kling <webgeek1234(a)gmail.com>
Changes since the RFC:
- Include support for ForeignOwnable for ARef, so that a Fence can be
stuffed into an XArray et al. (Code by Danilo)
- Implement ForeignOwnable (with new borrow type) for DriverFence, so
that it can be stuffed into an XArray.
- Include the rcu::RcuBox data type to defer dropping data with RCU
(Cody by Alice)
- Port DmaFence to RcuBox to make UAF bugs through later, new dma_fence
callbacks (backend_ops) impossible.
- Force users to pass their fence data in an RcuBox (or have it not
need drop()) through a Sealed trait.
- Document the rules for the user's DriverFence::data's drop
implementation very clearly (deadlock danger).
- rustfmt, Clippy.
- Various style suggestions, safety comments, etc. (Önur)
- Add __rust_helper prefix to helper functions. (Önur)
Changes in RFC v3:
- Omit JobQueue patches for now
- Completely redesign the memory layout: Instead of a Fence
refcounting a DriverFence, both now live in the same allocation to
allow for future support the dma_fence backend_ops callbacks which
need to do container_of. (mostly Boris's feedback)
- Allow for pre-allocating fences to avoid deadlocks when submitting
jobs to a GPU. (Boris)
- Simultaneously, allow for pre-preparing fence callback objects, so
the driver can allocate them when it sees fit. (code largely stolen
and inspired by Daniel).
- Signal fences on drop, ensure synchronization.
- Force users to set an error code when signalling.
- Write more documentation
- A ton of minor other changes.
Alright, so since the last RFCs did not reveal significant design
issues, I decided to transition this series to a v1 and hope that we can
get it upstream.
This now includes code for more common infrastructure that dma_fence
needs, contributed by Danilo and Alice.
---
Old cover letter for RFC:
So, this is the spiritual successor of the first / second RFC [1]. v2
also contained code for drm::JobQueue, but mostly to show how the fence
code would be used. JobQueue is under heavy rework right now, so I don't
want to bother your eyes with it. The docstring examples should show how
Rust fences are supposed to be used, though.
This v3 contains a huge amount of highly valuable feedback from a
variety of people, notably Boris, but also from Alice, Gary and Danilo.
There are some TODOs open (a better trait for fence backend_ops and RCU
support), but my hope is that this effort is now finally approaching its
end.
I would greatly appreciate feedback and especially more information
about what might be missing to make this usable, which is obviously
where Daniel's and Boris's feedback will be valuable once more.
Please regard this patch just as what it's titled: an RFC, to discuss a
bit more and to inform a broader community about what the current state
is and where this is heading at.
Many regards,
Philipp
[1] https://lore.kernel.org/rust-for-linux/20260203081403.68733-2-phasta@kernel…
Alice Ryhl (1):
rust: rcu: add RcuBox type
Danilo Krummrich (1):
rust: types: implement ForeignOwnable for ARef<T>
Philipp Stanner (2):
rust: Add dma_fence abstractions
MAINTAINERS: Add entry for Rust dma-buf
MAINTAINERS | 2 +
rust/bindings/bindings_helper.h | 2 +
rust/helpers/dma_fence.c | 48 ++
rust/helpers/helpers.c | 1 +
rust/kernel/dma_buf/dma_fence.rs | 821 +++++++++++++++++++++++++++++++
rust/kernel/dma_buf/mod.rs | 13 +
rust/kernel/lib.rs | 1 +
rust/kernel/sync/aref.rs | 39 ++
rust/kernel/sync/rcu.rs | 31 +-
rust/kernel/sync/rcu/rcu_box.rs | 145 ++++++
10 files changed, 1102 insertions(+), 1 deletion(-)
create mode 100644 rust/helpers/dma_fence.c
create mode 100644 rust/kernel/dma_buf/dma_fence.rs
create mode 100644 rust/kernel/dma_buf/mod.rs
create mode 100644 rust/kernel/sync/rcu/rcu_box.rs
--
2.54.0
The kerneldoc comment on dma_fence_init() and dma_fence_init64() describe
the legacy reason to pass an external lock as a need to prevent multiple
fences "from signaling out of order". However, this wording is a bit
misleading: a shared spinlock does not (and cannot) prevent the signaler
from signaling out of order. Signaling order is the driver's responsibility
regardless of whether the lock is shared or per-fence.
Reword both comments to better describe the legacy use cases where a
shared lock was needed.
Signed-off-by: MaÃra Canal <mcanal(a)igalia.com>
---
v1 -> v2: https://lore.kernel.org/dri-devel/20260411185756.1887119-4-mcanal@igalia.co…
- Be more explicit about not allowing new users to use an external lock.
- De-duplicate the explanation in dma_fence_init64() by pointing to the
dma_fence_init() documentation.
v2 -> v3: https://lore.kernel.org/dri-devel/20260419134943.54833-2-mcanal@igalia.com/…
- Apply Christian's suggestion with small readability improvements.
---
drivers/dma-buf/dma-fence.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index b3bfa6943a8e..c7ea1e75d38a 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -1102,9 +1102,12 @@ __dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
* context and seqno are used for easy comparison between fences, allowing
* to check which fence is later by simply using dma_fence_later().
*
- * It is strongly discouraged to provide an external lock because this couples
- * lock and fence life time. This is only allowed for legacy use cases when
- * multiple fences need to be prevented from signaling out of order.
+ * External locks are a relic of legacy use cases that needed a shared lock
+ * to serialize signaling when no out-of-order signaling was possible through
+ * &dma_fence_ops.signaled. Drivers have abandoned this concept since the
+ * introduction of the callback, but the external lock is still around. New
+ * users MUST NOT use external locks, as they force the issuer to outlive all
+ * fences that reference the lock.
*/
void
dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
@@ -1129,9 +1132,8 @@ EXPORT_SYMBOL(dma_fence_init);
* Context and seqno are used for easy comparison between fences, allowing
* to check which fence is later by simply using dma_fence_later().
*
- * It is strongly discouraged to provide an external lock because this couples
- * lock and fence life time. This is only allowed for legacy use cases when
- * multiple fences need to be prevented from signaling out of order.
+ * New users MUST NOT use external locks. Check the documentation in
+ * dma_fence_init() to understand the motives behind the legacy use cases.
*/
void
dma_fence_init64(struct dma_fence *fence, const struct dma_fence_ops *ops,
--
2.54.0