This series adds a new DRM/Accel driver that supports the C7x DSPs
inside some Texas Instruments SoCs such as the J722S. These can be used
as accelerators for various workloads, including machine learning
inference.
This driver controls the power state of the hardware via remoteproc and
communicates with the firmware running on the DSP via rpmsg_virtio. The
kernel driver itself allocates buffers, manages contexts, and submits
jobs to the DSP firmware. Buffers are mapped by the DSP itself using its
MMU, providing memory isolation among different clients.
The source code for the firmware running on the DSP is available at:
https://gitlab.freedesktop.org/tomeu/thames_firmware/.
Everything else is done in userspace, as a Gallium driver (also called
thames) that is part of the Mesa3D project: https://docs.mesa3d.org/teflon.html
If there is more than one core that advertises the same rpmsg_virtio
service name, the driver will load balance jobs between them with
drm-gpu-scheduler.
Userspace portion of the driver: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39298
Signed-off-by: Tomeu Vizoso <tomeu(a)tomeuvizoso.net>
---
Tomeu Vizoso (5):
arm64: dts: ti: k3-j722s-ti-ipc-firmware: Add memory pool for DSP i/o buffers
accel/thames: Add driver for the C7x DSPs in TI SoCs
accel/thames: Add IOCTLs for BO creation and mapping
accel/thames: Add IOCTL for job submission
accel/thames: Add IOCTL for memory synchronization
Documentation/accel/thames/index.rst | 28 ++
MAINTAINERS | 9 +
.../boot/dts/ti/k3-j722s-ti-ipc-firmware.dtsi | 11 +-
drivers/accel/Kconfig | 1 +
drivers/accel/Makefile | 3 +-
drivers/accel/thames/Kconfig | 26 ++
drivers/accel/thames/Makefile | 11 +
drivers/accel/thames/thames_core.c | 161 +++++++
drivers/accel/thames/thames_core.h | 53 +++
drivers/accel/thames/thames_device.c | 93 +++++
drivers/accel/thames/thames_device.h | 46 ++
drivers/accel/thames/thames_drv.c | 180 ++++++++
drivers/accel/thames/thames_drv.h | 21 +
drivers/accel/thames/thames_gem.c | 407 ++++++++++++++++++
drivers/accel/thames/thames_gem.h | 45 ++
drivers/accel/thames/thames_ipc.h | 204 +++++++++
drivers/accel/thames/thames_job.c | 463 +++++++++++++++++++++
drivers/accel/thames/thames_job.h | 51 +++
drivers/accel/thames/thames_rpmsg.c | 276 ++++++++++++
drivers/accel/thames/thames_rpmsg.h | 27 ++
20 files changed, 2113 insertions(+), 3 deletions(-)
---
base-commit: 27927a79b3c6aebd18f38507a8160294243763dc
change-id: 20260113-thames-334127a2d91d
Best regards,
--
Tomeu Vizoso <tomeu(a)tomeuvizoso.net>
This series implements a dma-buf “revoke” mechanism: to allow a dma-buf
exporter to explicitly invalidate (“kill”) a shared buffer after it has
been distributed to importers, so that further CPU and device access is
prevented and importers reliably observe failure.
Today, dma-buf effectively provides “if you have the fd, you can keep using
the memory indefinitely.” That assumption breaks down when an exporter must
reclaim, reset, evict, or otherwise retire backing memory after it has been
shared. Concrete cases include GPU reset and recovery where old allocations
become unsafe to access, memory eviction/overcommit where backing storage
must be withdrawn, and security or isolation situations where continued access
must be prevented. While drivers can sometimes approximate this with
exporter-specific fencing and policy, there is no core dma-buf state transition
that communicates “this buffer is no longer valid; fail access” across all
access paths.
The change in this series is to introduce a core “revoked” state on the dma-buf
object and a corresponding exporter-triggered revoke operation. Once a dma-buf
is revoked, new access paths are blocked so that attempts to DMA-map, vmap, or
mmap the buffer fail in a consistent way.
In addition, the series aims to invalidate existing access as much as the kernel
allows: device mappings are torn down where possible so devices and IOMMUs cannot
continue DMA.
The semantics are intentionally simple: revoke is a one-way, permanent transition
for the lifetime of that dma-buf instance.
From a compatibility perspective, users that never invoke revoke are unaffected,
and exporters that adopt it gain a core-supported enforcement mechanism rather
than relying on ad hoc driver behavior. The intent is to keep the interface
minimal and avoid imposing policy; the series provides the mechanism to terminate
access, with policy remaining in the exporter and higher-level components.
BTW, see this megathread [1] for additional context.
Ironically, it was posted exactly one year ago.
[1] https://lore.kernel.org/all/20250107142719.179636-2-yilun.xu@linux.intel.co…
Thanks
Cc: linux-rdma(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-media(a)vger.kernel.org
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: kvm(a)vger.kernel.org
Cc: iommu(a)lists.linux.dev
To: Jason Gunthorpe <jgg(a)ziepe.ca>
To: Leon Romanovsky <leon(a)kernel.org>
To: Sumit Semwal <sumit.semwal(a)linaro.org>
To: Christian König <christian.koenig(a)amd.com>
To: Alex Williamson <alex(a)shazbot.org>
To: Kevin Tian <kevin.tian(a)intel.com>
To: Joerg Roedel <joro(a)8bytes.org>
To: Will Deacon <will(a)kernel.org>
To: Robin Murphy <robin.murphy(a)arm.com>
Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
---
Leon Romanovsky (4):
dma-buf: Introduce revoke semantics
vfio: Use dma-buf revoke semantics
iommufd: Require DMABUF revoke semantics
iommufd/selftest: Reuse dma-buf revoke semantics
drivers/dma-buf/dma-buf.c | 36 ++++++++++++++++++++++++++++++++----
drivers/iommu/iommufd/pages.c | 2 +-
drivers/iommu/iommufd/selftest.c | 12 ++++--------
drivers/vfio/pci/vfio_pci_dmabuf.c | 27 ++++++---------------------
include/linux/dma-buf.h | 31 +++++++++++++++++++++++++++++++
5 files changed, 74 insertions(+), 34 deletions(-)
---
base-commit: 9ace4753a5202b02191d54e9fdf7f9e3d02b85eb
change-id: 20251221-dmabuf-revoke-b90ef16e4236
Best regards,
--
Leon Romanovsky <leonro(a)nvidia.com>
The system dma-buf heap lets userspace allocate buffers from the page
allocator. However, these allocations are not accounted for in memcg,
allowing processes to escape limits that may be configured.
Pass the __GFP_ACCOUNT for our allocations to account them into memcg.
Userspace components using the system heap can be constrained with, e.g:
systemd-run --user --scope -p MemoryMax=10M ...
Signed-off-by: Eric Chanudet <echanude(a)redhat.com>
---
drivers/dma-buf/heaps/system_heap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 4c782fe33fd4..c91fcdff4b77 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -38,10 +38,10 @@ struct dma_heap_attachment {
bool mapped;
};
-#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO)
+#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_ACCOUNT)
#define HIGH_ORDER_GFP (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
| __GFP_NORETRY) & ~__GFP_RECLAIM) \
- | __GFP_COMP)
+ | __GFP_COMP | __GFP_ACCOUNT)
static gfp_t order_flags[] = {HIGH_ORDER_GFP, HIGH_ORDER_GFP, LOW_ORDER_GFP};
/*
* The selection of the orders used for allocation (1MB, 64K, 4K) is designed
--
2.52.0