The Arm Ethos-U65/85 NPUs are designed for edge AI inference
applications[0].
The driver works with Mesa Teflon. A merge request for Ethos support is
here[1]. The UAPI should also be compatible with the downstream (open
source) driver stack[2] and the Vela compiler, though that has not been
implemented yet.
Testing so far has been on i.MX93 boards with the Ethos-U65. Support for
the U85 is still to do; only minor changes on the driver side will be
needed.
A git tree is here[3].
Rob
[0] https://www.arm.com/products/silicon-ip-cpu?families=ethos%20npus
[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36699/
[2] https://gitlab.arm.com/artificial-intelligence/ethos-u/
[3] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git ethos-v2
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
---
Changes in v2:
- Rebase on v6.17-rc1 adapting to scheduler changes
- scheduler: Drop the reset workqueue. According to the scheduler docs,
we don't need it since we have a single h/w queue.
- scheduler: Rework the timeout handling to continue running if we are
making progress. Fixes timeouts on larger jobs.
- Reset the NPU on resume so it's in a known state
- Add error handling on clk_get() calls
- Fix drm_mm splat on module unload. We were missing a put on the
cmdstream BO in the scheduler clean-up.
- Fix 0-day report needing explicit bitfield.h include
- Link to v1: https://lore.kernel.org/r/20250722-ethos-v1-0-cc1c5a0cbbfb@kernel.org
---
Rob Herring (Arm) (2):
dt-bindings: npu: Add Arm Ethos-U65/U85
accel: Add Arm Ethos-U NPU driver
.../devicetree/bindings/npu/arm,ethos.yaml | 79 +++
MAINTAINERS | 9 +
drivers/accel/Kconfig | 1 +
drivers/accel/Makefile | 1 +
drivers/accel/ethos/Kconfig | 10 +
drivers/accel/ethos/Makefile | 4 +
drivers/accel/ethos/ethos_device.h | 181 ++++++
drivers/accel/ethos/ethos_drv.c | 418 ++++++++++++
drivers/accel/ethos/ethos_drv.h | 15 +
drivers/accel/ethos/ethos_gem.c | 707 +++++++++++++++++++++
drivers/accel/ethos/ethos_gem.h | 46 ++
drivers/accel/ethos/ethos_job.c | 514 +++++++++++++++
drivers/accel/ethos/ethos_job.h | 41 ++
include/uapi/drm/ethos_accel.h | 262 ++++++++
14 files changed, 2288 insertions(+)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250715-ethos-3fdd39ef6f19
Best regards,
--
Rob Herring (Arm) <robh@kernel.org>
Hello all,
This series makes udmabuf sync the backing buffer with the set of
attached devices, as required for DMA-BUFs, when doing
{begin,end}_cpu_access.
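For reference, a minimal sketch of the idea (not the code in this series;
the per-attachment record and the private bookkeeping struct below are
assumed here purely for illustration):

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/scatterlist.h>

struct udmabuf_attachment {             /* hypothetical bookkeeping entry */
        struct device *dev;
        struct sg_table *sgt;
        struct list_head list;
};

struct udmabuf_priv {                   /* stand-in for the driver's private data */
        struct mutex lock;
        struct list_head attachments;
};

static int udmabuf_begin_cpu_access(struct dma_buf *buf,
                                    enum dma_data_direction direction)
{
        struct udmabuf_priv *priv = buf->priv;
        struct udmabuf_attachment *a;

        /* Sync every currently mapped device so the CPU sees coherent data. */
        mutex_lock(&priv->lock);
        list_for_each_entry(a, &priv->attachments, list)
                dma_sync_sgtable_for_cpu(a->dev, a->sgt, direction);
        mutex_unlock(&priv->lock);

        return 0;
}

end_cpu_access is the mirror image using dma_sync_sgtable_for_device().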
Thanks
Andrew
Changes for v2:
- fix attachment table use-after-free
- rebased on v6.17-rc1
Andrew Davis (3):
udmabuf: Keep track current device mappings
udmabuf: Sync buffer mappings for attached devices
udmabuf: Use module_misc_device() to register this device
drivers/dma-buf/udmabuf.c | 133 +++++++++++++++++++-------------------
1 file changed, 67 insertions(+), 66 deletions(-)
--
2.39.2
On 14.08.25 10:16, Janusz Krzysztofik wrote:
> When the first user starts waiting on a not-yet-signaled fence of a chain
> link, a dma_fence_chain callback is added to the user fence of that link.
> When the user fence of that chain link is then signaled, the chain is
> traversed in search of the first unsignaled link and the callback is
> rearmed on the user fence of that link.
>
> Since chain fences may be exposed to user space, e.g. over drm_syncobj
> IOCTLs, users may start waiting on any link of the chain; many links of a
> chain may then have signaling enabled and their callbacks added to their
> user fences. Once an arbitrary user fence is signaled, all
> dma_fence_chain callbacks added to it so far must be rearmed to another
> user fence of the chain. In extreme scenarios, when all N links of a
> chain are awaited and then signaled in reverse order, the dma_fence_chain
> callback may be called up to N * (N + 1) / 2 times (an arithmetic series).
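For scale, working the formula through for small and large chains:

    N = 4,    signaled in reverse order: 4 + 3 + 2 + 1 = 4 * 5 / 2 = 10 calls
    N = 1000, signaled in reverse order: 1000 * 1001 / 2 = 500500 calls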
>
> To avoid that potential excessive accumulation of dma_fence_chain
> callbacks, rearm a trimmed-down, signal-only callback version on the base
> fence of the previous link if it is not yet signaled; otherwise, just
> signal the base fence of the current link. This replaces traversing the
> chain in search of the first unsignaled link and moving all callbacks
> collected so far to the user fence of that link.
A clear NAK to that! You can easily overflow the kernel stack with that!
In addition to this, messing with the fence ops outside of the dma_fence code is an absolute no-go.
Regards,
Christian.
>
> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12904
> Suggested-by: Chris Wilson <chris.p.wilson@linux.intel.com>
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> ---
> drivers/dma-buf/dma-fence-chain.c | 101 +++++++++++++++++++++++++-----
> 1 file changed, 84 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
> index a8a90acf4f34d..90eff264ee05c 100644
> --- a/drivers/dma-buf/dma-fence-chain.c
> +++ b/drivers/dma-buf/dma-fence-chain.c
> @@ -119,46 +119,113 @@ static const char *dma_fence_chain_get_timeline_name(struct dma_fence *fence)
> return "unbound";
> }
>
> -static void dma_fence_chain_irq_work(struct irq_work *work)
> +static void signal_irq_work(struct irq_work *work)
> {
> struct dma_fence_chain *chain;
>
> chain = container_of(work, typeof(*chain), work);
>
> - /* Try to rearm the callback */
> - if (!dma_fence_chain_enable_signaling(&chain->base))
> - /* Ok, we are done. No more unsignaled fences left */
> - dma_fence_signal(&chain->base);
> + dma_fence_signal(&chain->base);
> dma_fence_put(&chain->base);
> }
>
> -static void dma_fence_chain_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> +static void signal_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> +{
> + struct dma_fence_chain *chain;
> +
> + chain = container_of(cb, typeof(*chain), cb);
> + init_irq_work(&chain->work, signal_irq_work);
> + irq_work_queue(&chain->work);
> +}
> +
> +static void rearm_irq_work(struct irq_work *work)
> +{
> + struct dma_fence_chain *chain;
> + struct dma_fence *prev;
> +
> + chain = container_of(work, typeof(*chain), work);
> +
> + rcu_read_lock();
> + prev = rcu_dereference(chain->prev);
> + if (prev && dma_fence_add_callback(prev, &chain->cb, signal_cb))
> + prev = NULL;
> + rcu_read_unlock();
> + if (prev)
> + return;
> +
> + /* Ok, we are done. No more unsignaled fences left */
> + signal_irq_work(work);
> +}
> +
> +static inline bool fence_is_signaled__nested(struct dma_fence *fence)
> +{
> + if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> + return true;
> +
> + if (fence->ops->signaled && fence->ops->signaled(fence)) {
> + unsigned long flags;
> +
> + spin_lock_irqsave_nested(fence->lock, flags, SINGLE_DEPTH_NESTING);
> + dma_fence_signal_locked(fence);
> + spin_unlock_irqrestore(fence->lock, flags);
> +
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static bool prev_is_signaled(struct dma_fence_chain *chain)
> +{
> + struct dma_fence *prev;
> + bool result;
> +
> + rcu_read_lock();
> + prev = rcu_dereference(chain->prev);
> + result = !prev || fence_is_signaled__nested(prev);
> + rcu_read_unlock();
> +
> + return result;
> +}
> +
> +static void rearm_or_signal_cb(struct dma_fence *f, struct dma_fence_cb *cb)
> {
> struct dma_fence_chain *chain;
>
> chain = container_of(cb, typeof(*chain), cb);
> - init_irq_work(&chain->work, dma_fence_chain_irq_work);
> + if (prev_is_signaled(chain)) {
> + /* Ok, we are done. No more unsignaled fences left */
> + init_irq_work(&chain->work, signal_irq_work);
> + } else {
> + /* Try to rearm the callback */
> + init_irq_work(&chain->work, rearm_irq_work);
> + }
> +
> irq_work_queue(&chain->work);
> - dma_fence_put(f);
> }
>
> static bool dma_fence_chain_enable_signaling(struct dma_fence *fence)
> {
> struct dma_fence_chain *head = to_dma_fence_chain(fence);
> + int err = -ENOENT;
>
> - dma_fence_get(&head->base);
> - dma_fence_chain_for_each(fence, &head->base) {
> - struct dma_fence *f = dma_fence_chain_contained(fence);
> + if (WARN_ON(!head))
> + return false;
>
> - dma_fence_get(f);
> - if (!dma_fence_add_callback(f, &head->cb, dma_fence_chain_cb)) {
> + dma_fence_get(fence);
> + if (head->fence)
> + err = dma_fence_add_callback(head->fence, &head->cb, rearm_or_signal_cb);
> + if (err) {
> + if (prev_is_signaled(head)) {
> dma_fence_put(fence);
> - return true;
> + } else {
> + init_irq_work(&head->work, rearm_irq_work);
> + irq_work_queue(&head->work);
> + err = 0;
> }
> - dma_fence_put(f);
> }
> - dma_fence_put(&head->base);
> - return false;
> +
> + return !err;
> }
>
> static bool dma_fence_chain_signaled(struct dma_fence *fence)
Hi Amirreza,
kernel test robot noticed the following build warnings:
[auto build test WARNING on 2674d1eadaa2fd3a918dfcdb6d0bb49efe8a8bb9]
url: https://github.com/intel-lab-lkp/linux/commits/Amirreza-Zarrabi/tee-allow-a…
base: 2674d1eadaa2fd3a918dfcdb6d0bb49efe8a8bb9
patch link: https://lore.kernel.org/r/20250812-qcom-tee-using-tee-ss-without-mem-obj-v7…
patch subject: [PATCH v7 08/11] tee: add Qualcomm TEE driver
config: hexagon-randconfig-r072-20250814 (https://download.01.org/0day-ci/archive/20250814/202508140527.ighXikjo-lkp@…)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 3769ce013be2879bf0b329c14a16f5cb766f26ce)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250814/202508140527.ighXikjo-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202508140527.ighXikjo-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/tee/qcomtee/user_obj.c:384:12: warning: format specifies type 'unsigned long' but the argument has type 'u64' (aka 'unsigned long long') [-Wformat]
383 | &qcomtee_user_object_ops, "uo-%lu",
| ~~~
| %llu
384 | param->u.objref.id);
| ^~~~~~~~~~~~~~~~~~
1 warning generated.
vim +384 drivers/tee/qcomtee/user_obj.c
355
356 /**
357 * qcomtee_user_param_to_object() - OBJREF parameter to &struct qcomtee_object.
358 * @object: object returned.
359 * @param: TEE parameter.
360 * @ctx: context in which the conversion should happen.
361 *
362 * @param is an OBJREF with %QCOMTEE_OBJREF_FLAG_USER flags.
363 *
364 * Return: On success, returns 0; on failure, returns < 0.
365 */
366 int qcomtee_user_param_to_object(struct qcomtee_object **object,
367 struct tee_param *param,
368 struct tee_context *ctx)
369 {
370 struct qcomtee_user_object *user_object __free(kfree) = NULL;
371 int err;
372
373 user_object = kzalloc(sizeof(*user_object), GFP_KERNEL);
374 if (!user_object)
375 return -ENOMEM;
376
377 user_object->ctx = ctx;
378 user_object->object_id = param->u.objref.id;
379 /* By default, always notify userspace upon release. */
380 user_object->notify = true;
381 err = qcomtee_object_user_init(&user_object->object,
382 QCOMTEE_OBJECT_TYPE_CB,
383 &qcomtee_user_object_ops, "uo-%lu",
> 384 param->u.objref.id);
385 if (err)
386 return err;
387 /* Matching teedev_ctx_put() is in qcomtee_user_object_release(). */
388 teedev_ctx_get(ctx);
389
390 *object = &no_free_ptr(user_object)->object;
391
392 return 0;
393 }
394
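The warning is about the "uo-%lu" conversion: param->u.objref.id is a u64,
which is unsigned long long on this hexagon build, so the %lu conversion
does not match. A likely fix (sketched here, not taken from a posted patch)
is to print the id with %llu:

        err = qcomtee_object_user_init(&user_object->object,
                                       QCOMTEE_OBJECT_TYPE_CB,
                                       &qcomtee_user_object_ops, "uo-%llu",
                                       param->u.objref.id);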
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Hi Amir,
On Wed, Aug 13, 2025 at 2:37 AM Amirreza Zarrabi
<amirreza.zarrabi@oss.qualcomm.com> wrote:
>
> This patch series introduces a Trusted Execution Environment (TEE)
> driver for Qualcomm TEE (QTEE). QTEE enables Trusted Applications (TAs)
> and services to run securely. It uses an object-based interface, where
> each service is an object with sets of operations. Clients can invoke
> these operations on objects, which can generate results, including other
> objects. For example, an object can load a TA and return another object
> that represents the loaded TA, allowing access to its services.
>
There are some build errors/warnings for arm and x86_64, see
https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/jens/plans/31DmCOn1pF…
Thanks,
Jens
Hi,
This patch set allocates the protected DMA-bufs from a DMA-heap
instantiated from the TEE subsystem.
The TEE subsystem handles the DMA-buf allocations since it is the TEE
(OP-TEE, AMD-TEE, TS-TEE, or perhaps a future QTEE) which sets up the
protection for the memory used for the DMA-bufs.
The DMA-heap uses a protected memory pool provided by the backend TEE
driver, allowing it to choose how to allocate the protected physical
memory.
The allocated DMA-bufs must be imported with a new TEE_IOC_SHM_REGISTER_FD
before they can be passed as arguments when requesting services from the
secure world.
Three use-cases (Secure Video Playback, Trusted UI, and Secure Video
Recording) have been identified so far to serve as examples of what can be
expected. The use-cases have predefined DMA-heap names,
"protected,secure-video", "protected,trusted-ui", and
"protected,secure-video-record". The backend driver registers protected
memory pools for the use-cases it supports.
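As a rough illustration of the intended userspace flow (a sketch only: the
DMA-heap ioctl is the existing uapi, the device path follows the usual
/dev/dma_heap/<name> convention with one of the use-case names above, and
the argument layout of the new TEE_IOC_SHM_REGISTER_FD ioctl is left out
since this series defines it):

#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dma-heap.h>

/* Allocate one protected DMA-buf from the heap registered by the TEE driver. */
static int alloc_protected_dmabuf(size_t len)
{
        struct dma_heap_allocation_data alloc = {
                .len = len,
                .fd_flags = O_RDWR | O_CLOEXEC,
        };
        int heap, ret;

        heap = open("/dev/dma_heap/protected,secure-video",
                    O_RDONLY | O_CLOEXEC);
        if (heap < 0)
                return -1;

        ret = ioctl(heap, DMA_HEAP_IOCTL_ALLOC, &alloc);
        close(heap);
        if (ret)
                return -1;

        /*
         * alloc.fd is now a protected DMA-buf; it still has to be registered
         * with the TEE subsystem via the new TEE_IOC_SHM_REGISTER_FD ioctl
         * before it can be passed to the secure world (not shown here).
         */
        return alloc.fd;
}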
Each use-case has its own protected memory pool since different use-cases
require isolation from different parts of the system. A protected memory
pool can be based on a static carveout instantiated while probing the TEE
backend driver, or dynamically allocated from CMA (dma_alloc_pages()) and
made protected as needed by the TEE.
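On the backend driver side, the dynamic pool case could look roughly like
this. dma_alloc_pages()/dma_free_pages() are the regular DMA API calls,
while tee_make_range_protected() is purely a placeholder for whatever
secure-world call the backend uses to protect the range, not a real API:

#include <linux/dma-mapping.h>
#include <linux/gfp.h>

int tee_make_range_protected(dma_addr_t addr, size_t size);    /* placeholder */

static struct page *protmem_pool_grow(struct device *dev, size_t size,
                                      dma_addr_t *dma)
{
        struct page *pages;

        pages = dma_alloc_pages(dev, size, dma, DMA_BIDIRECTIONAL, GFP_KERNEL);
        if (!pages)
                return NULL;

        /* Ask the TEE to protect the freshly allocated range. */
        if (tee_make_range_protected(*dma, size)) {
                dma_free_pages(dev, size, pages, *dma, DMA_BIDIRECTIONAL);
                return NULL;
        }

        return pages;
}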
This can be tested on a RockPi 4B+ with the following steps:
repo init -u https://github.com/jenswi-linaro/manifest.git -m rockpi4.xml \
-b prototype/sdp-v11
repo sync -j8
cd build
make toolchains -j$(nproc)
make all -j$(nproc)
# Copy ../out/rockpi4.img to an SD card and boot the RockPi from that
# Connect a monitor to the RockPi
# login and at the prompt:
gst-launch-1.0 videotestsrc ! \
aesenc key=1f9423681beb9a79215820f6bda73d0f \
iv=e9aa8e834d8d70b7e0d254ff670dd718 serialize-iv=true ! \
aesdec key=1f9423681beb9a79215820f6bda73d0f ! \
kmssink
The aesdec module has been hacked to use an OP-TEE TA to decrypt the stream
into protected DMA-bufs which are consumed by the kmssink.
The primitive QEMU tests from previous patch sets can be tested on RockPi
in the same way using:
xtest --sdp-basic
The primitive tests are tested on QEMU with the following steps:
repo init -u https://github.com/jenswi-linaro/manifest.git -m qemu_v8.xml \
-b prototype/sdp-v11
repo sync -j8
cd build
make toolchains -j$(nproc)
make SPMC_AT_EL=1 all -j$(nproc)
make SPMC_AT_EL=1 run-only
# login and at the prompt:
xtest --sdp-basic
The SPMC_AT_EL=1 parameter configures the build with FF-A and an SPMC at
S-EL1 inside OP-TEE. The parameter can be changed to SPMC_AT_EL=n to test
without FF-A using the original SMC ABI instead. Please remember to do
%make arm-tf-clean
for TF-A to be rebuilt properly using the new configuration.
https://optee.readthedocs.io/en/latest/building/prerequisites.html
list dependencies required to build the above.
The primitive tests are pretty basic, mostly checking that a Trusted
Application in the secure world can access and manipulate the memory. There
are also some negative tests for out of bounds buffers, etc.
Thanks,
Jens
Changes since V10:
* Changed the new ABI OPTEE_MSG_CMD_GET_PROTMEM_CONFIG to report a list
of u32 memory attributes instead of u16 endpoints to make room for both
endpoint and access permissions in each entry.
* In "tee: new ioctl to a register tee_shm from a dmabuf file descriptor",
remove the unused path for DMA-bufs allocated by other means than the one
in the TEE subsystem.
* In "tee: implement protected DMA-heap", handle unloading of the
backend driver module implementing the heap. The heap is reference
counted and also calls tee_device_get() to guarantee that the module
remains available while the heap is instantiated.
* In "optee: support protected memory allocation", use
dma_coerce_mask_and_coherent() instead of open-coding the function.
* Added Sumit's R-B to
- "optee: smc abi: dynamic protected memory allocation"
- "optee: FF-A: dynamic protected memory allocation"
- "optee: support protected memory allocation"
- "tee: implement protected DMA-heap"
- "dma-buf: dma-heap: export declared functions"
Changes since V9:
* Adding Sumit's R-B to "optee: sync secure world ABI headers"
* Update commit message as requested for "dma-buf: dma-heap: export
declared functions".
* In "tee: implement protected DMA-heap":
- add the hidden config option TEE_DMABUF_HEAPS to tell if the TEE
subsystem can support DMA heaps
- add a pfn_valid() to check that the passed physical address can be
used by __pfn_to_page() and friends
- remove the memremap() call; the caller should do that instead if needed
* In "tee: add tee_shm_alloc_dma_mem()" guard the calls to
dma_alloc_pages() and dma_free_pages() with TEE_DMABUF_HEAPS to avoid
linking errors in some configurations
* In "optee: support protected memory allocation":
- add the hidden config option OPTEE_STATIC_PROTMEM_POOL to tell if the
driver can support a static protected memory pool
- optee_protmem_pool_init() is slightly refactored to make the patches
that follow easier
- Call devm_memremap() before calling tee_protmem_static_pool_alloc()
Changes since V8:
* Using dma_alloc_pages() instead of cma_alloc() so the direct dependency on
CMA can be removed together with the patches
"cma: export cma_alloc() and cma_release()" and
"dma-contiguous: export dma_contiguous_default_area". The patch
* Renaming the patch "tee: add tee_shm_alloc_cma_phys_mem()" to
"tee: add tee_shm_alloc_dma_mem()"
* Setting DMA mask for the OP-TEE TEE device based on input from the secure
world instead of relying on the parent device so following patches are
removed: "tee: tee_device_alloc(): copy dma_mask from parent device" and
"optee: pass parent device to tee_device_alloc()".
* Adding Sumit Garg's R-B to "tee: refactor params_from_user()"
* In the patch "tee: implement protected DMA-heap", map the physical memory
passed to tee_protmem_static_pool_alloc().
Changes since V7:
* Adding "dma-buf: dma-heap: export declared functions",
"cma: export cma_alloc() and cma_release()", and
"dma-contiguous: export dma_contiguous_default_area" to export the symbols
needed to keep the TEE subsystem as a load module.
* Removing CONFIG_TEE_DMABUF_HEAP and CONFIG_TEE_CMA since they aren't
needed any longer.
* Addressing review comments in "optee: sync secure world ABI headers"
* Better align protected memory pool initialization between the smc-abi and
ffa-abi parts of the optee driver.
* Removing the patch "optee: account for direction while converting parameters"
Changes since V6:
* Restricted memory is now known as protected memory, to use the same
  term as https://docs.vulkan.org/guide/latest/protected.html. Update all
  patches to consistently use protected memory.
* In "tee: implement protected DMA-heap" add the hidden config option
TEE_DMABUF_HEAP to tell if the DMABUF_HEAPS functions are available
for the TEE subsystem
* Adding "tee: refactor params_from_user()", broken out from the patch
"tee: new ioctl to a register tee_shm from a dmabuf file descriptor"
* For "tee: new ioctl to a register tee_shm from a dmabuf file descriptor":
- Update commit message to mention protected memory
- Remove and open code tee_shm_get_parent_shm() in param_from_user_memref()
* In "tee: add tee_shm_alloc_cma_phys_mem" add the hidden config option
TEE_CMA to tell if the CMA functions are available for the TEE subsystem
* For "tee: tee_device_alloc(): copy dma_mask from parent device" and
"optee: pass parent device to tee_device_alloc", added
Reviewed-by: Sumit Garg <sumit.garg@kernel.org>
Changes since V5:
* Removing "tee: add restricted memory allocation" and
"tee: add TEE_IOC_RSTMEM_FD_INFO"
* Adding "tee: implement restricted DMA-heap",
"tee: new ioctl to a register tee_shm from a dmabuf file descriptor",
"tee: add tee_shm_alloc_cma_phys_mem()",
"optee: pass parent device to tee_device_alloc()", and
"tee: tee_device_alloc(): copy dma_mask from parent device"
* The two TEE driver OPs "rstmem_alloc()" and "rstmem_free()" are replaced
with a struct tee_rstmem_pool abstraction.
* Replaced the TEE_IOC_RSTMEM_ALLOC user space API with the DMA-heap API
Changes since V4:
* Adding the patch "tee: add TEE_IOC_RSTMEM_FD_INFO" needed by the
GStreamer demo
* Removing the dummy CPU access and mmap functions from the dma_buf_ops
* Fixing a compile error in "optee: FF-A: dynamic restricted memory allocation"
reported by kernel test robot <lkp@intel.com>
Changes since V3:
* Make the use_case and flags fields in struct tee_shm u32's instead of
  u16's
* Add more description for TEE_IOC_RSTMEM_ALLOC in the header file
* Import namespace DMA_BUF in module tee, reported by lkp@intel.com
* Added a note in the commit message for "optee: account for direction
while converting parameters" why it's needed
* Factor out dynamic restricted memory allocation from
"optee: support restricted memory allocation" into two new commits
"optee: FF-A: dynamic restricted memory allocation" and
"optee: smc abi: dynamic restricted memory allocation"
* Guard CMA usage with #ifdef CONFIG_CMA, effectively disabling dynamic
restricted memory allocate if CMA isn't configured
Changes since the V2 RFC:
* Based on v6.12
* Replaced the flags for SVP and Trusted UID memory with a u32 field with
unique id for each use case
* Added dynamic allocation of restricted memory pools
* Added OP-TEE ABI both with and without FF-A for dynamic restricted memory
* Added support for FF-A with FFA_LEND
Changes since the V1 RFC:
* Based on v6.11
* Complete rewrite, replacing the restricted heap with TEE_IOC_RSTMEM_ALLOC
Changes since Olivier's post [2]:
* Based on Yong Wu's post [1] where much of dma-buf handling is done in
the generic restricted heap
* Simplifications and cleanup
* New commit message for "dma-buf: heaps: add Linaro restricted dmabuf heap
support"
* Replaced the word "secure" with "restricted" where applicable
Etienne Carriere (1):
tee: new ioctl to a register tee_shm from a dmabuf file descriptor
Jens Wiklander (8):
optee: sync secure world ABI headers
dma-buf: dma-heap: export declared functions
tee: implement protected DMA-heap
tee: refactor params_from_user()
tee: add tee_shm_alloc_dma_mem()
optee: support protected memory allocation
optee: FF-A: dynamic protected memory allocation
optee: smc abi: dynamic protected memory allocation
drivers/dma-buf/dma-heap.c | 3 +
drivers/tee/Kconfig | 5 +
drivers/tee/Makefile | 1 +
drivers/tee/optee/Kconfig | 5 +
drivers/tee/optee/Makefile | 1 +
drivers/tee/optee/core.c | 7 +
drivers/tee/optee/ffa_abi.c | 146 ++++++++-
drivers/tee/optee/optee_ffa.h | 27 +-
drivers/tee/optee/optee_msg.h | 84 ++++-
drivers/tee/optee/optee_private.h | 15 +-
drivers/tee/optee/optee_smc.h | 37 ++-
drivers/tee/optee/protmem.c | 335 ++++++++++++++++++++
drivers/tee/optee/smc_abi.c | 141 ++++++++-
drivers/tee/tee_core.c | 157 +++++++---
drivers/tee/tee_heap.c | 500 ++++++++++++++++++++++++++++++
drivers/tee/tee_private.h | 14 +
drivers/tee/tee_shm.c | 157 +++++++++-
include/linux/tee_core.h | 59 ++++
include/linux/tee_drv.h | 10 +
include/uapi/linux/tee.h | 31 ++
20 files changed, 1668 insertions(+), 67 deletions(-)
create mode 100644 drivers/tee/optee/protmem.c
create mode 100644 drivers/tee/tee_heap.c
base-commit: 038d61fd642278bab63ee8ef722c50d10ab01e8f
--
2.43.0
On 25-07-25, 16:20, Jyothi Kumar Seerapu wrote:
>
>
> On 7/23/2025 12:45 PM, Vinod Koul wrote:
> > On 22-07-25, 15:46, Dmitry Baryshkov wrote:
> > > On Tue, Jul 22, 2025 at 05:50:08PM +0530, Jyothi Kumar Seerapu wrote:
> > > > On 7/19/2025 3:27 PM, Dmitry Baryshkov wrote:
> > > > > On Mon, Jul 07, 2025 at 09:58:30PM +0530, Jyothi Kumar Seerapu wrote:
> > > > > > On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
> > > > > > > On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
> >
> > [Folks, would be nice to trim replies]
> >
> > > > > > Could you please confirm whether we can go with a similar approach of
> > > > > > unmapping the processed TREs based on a fixed threshold or constant
> > > > > > value, instead of unmapping them all at once?
> > > > >
> > > > > I'd still say, that's a bad idea. Please stay within the boundaries of
> > > > > the DMA API.
> > > > >
> > > > I agree with the approach you suggested—it's the GPI's responsibility to
> > > > manage the available TREs.
> > > >
> > > > However, I'm curious whether we can set a dynamic watermark value (perhaps
> > > > half of the available TREs) to trigger unmapping of processed TREs? This would
> > > > allow the software to prepare the next set of TREs while the hardware
> > > > continues processing the remaining ones, enabling better parallelism and
> > > > throughput.
> > >
> > > Let's land the simple implementation first, which can then be improved.
> > > However I don't see any way to return 'above the watermark' from the DMA
> > > controller. You might need to enhance the API.
> >
> > Traditionally, we set up the DMA transfers for a watermark level and we
> > get an interrupt. So you might want to set the callback for the watermark
> > level and then do the mapping/unmapping etc. in the callback. This is the
> > typical model for dmaengines, and we should follow it.
> >
> > BR
>
> Thanks Dmitry and Vinod. I will work on a v7 patch that submits the I2C
> messages until they fit and unmaps all processed messages together for
> now.
>
> Regarding the watermark mechanism, it looks like GENI SE DMA supports
> watermark interrupts, but GPI DMA doesn't appear to have such a watermark
> provision.
What is the mechanism to get interrupts from the GPI? If you submit 10
txn, can you ask it to interrupt when half of them are done?
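For reference, the watermark/callback model described above looks roughly
like this on the dmaengine client side (a sketch: struct i2c_batch and
unmap_processed_msgs() are illustrative client bookkeeping, not existing
GPI code, and whether the GPI hardware can raise such a mid-transfer
interrupt is exactly the open question):

#include <linux/dmaengine.h>
#include <linux/scatterlist.h>

struct i2c_batch;                                       /* illustrative */
void unmap_processed_msgs(struct i2c_batch *b);         /* illustrative */

static void watermark_cb(void *param)
{
        /*
         * Runs in the descriptor's completion/watermark context: unmap what
         * the hardware has already consumed while it keeps processing.
         */
        unmap_processed_msgs(param);
}

static int submit_batch(struct dma_chan *chan, struct scatterlist *sgl,
                        unsigned int nents, struct i2c_batch *batch)
{
        struct dma_async_tx_descriptor *desc;

        desc = dmaengine_prep_slave_sg(chan, sgl, nents, DMA_MEM_TO_DEV,
                                       DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
        if (!desc)
                return -ENOMEM;

        desc->callback = watermark_cb;
        desc->callback_param = batch;
        dmaengine_submit(desc);
        dma_async_issue_pending(chan);

        return 0;
}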
--
~Vinod
On Mon, 21 Jul 2025 11:17:27 +0200, Tomeu Vizoso wrote:
> This series adds a new driver for the NPU that Rockchip includes in its
> newer SoCs, developed by them on the NVDLA base.
>
> In its current form, it supports the specific NPU in the RK3588 SoC.
>
> The userspace driver is part of Mesa and an initial draft can be found at:
>
> [...]
Applied, thanks!
[07/10] arm64: dts: rockchip: add pd_npu label for RK3588 power domains
commit: 6d64bceb97a1c93b3cc2131f7e023ef2f9cf33f2
[08/10] arm64: dts: rockchip: Add nodes for NPU and its MMU to rk3588-base
commit: a31dfc060a747f08705ace36d8de006bc13318fa
[09/10] arm64: dts: rockchip: Enable the NPU on quartzpro64
commit: 640366d644b1e282771a09c72be37162b6eda438
[10/10] arm64: dts: rockchip: enable NPU on ROCK 5B
commit: 3af6a83fc85033e44ce5cd0e1de54dc20b7e15af
Best regards,
--
Heiko Stuebner <heiko@sntech.de>