This patch series introduces the Qualcomm DSP Accelerator (QDA) driver, a DRM-based accelerator driver for Qualcomm DSPs. The driver provides a standardized interface for offloading computational tasks to DSPs found on Qualcomm SoCs, supporting all DSP domains.
The QDA driver implements the FastRPC protocol over the DRM accel subsystem. It uses the same device-tree node structure as the existing fastrpc driver in drivers/misc/. The approach for binding the QDA driver to device-tree nodes while coexisting with the fastrpc driver is an open item described below.
RFC thread: https://lore.kernel.org/dri-devel/20260224-qda-firstpost-v1-0-fe46a9c1a046@o...
User-space staging branch
=========================
https://github.com/qualcomm/fastrpc/tree/accel/staging
Key Features
============

* Standard DRM accelerator interface via /dev/accel/accelN
* GEM-based buffer management with DMA-BUF import/export (PRIME)
* IOMMU-based memory isolation using per-process context banks
* FastRPC protocol implementation for DSP communication
* RPMsg transport layer for reliable message passing
* Support for all DSP domains (ADSP, CDSP, SDSP, GDSP)
* DRM IOCTL interface for DSP session management, buffer allocation,
  and remote procedure invocation
Architecture
============
1. DRM Accelerator Framework Integration The driver registers as a DRM accel device, exposing a standard /dev/accel/accelN character device node. This provides established DRM infrastructure for device management, file operations, and IOCTL dispatch.
2. Memory Management Buffers are managed as GEM objects with full PRIME support for DMA-BUF import/export. This enables seamless buffer sharing with other DRM drivers (GPU, camera, video) using standard kernel mechanisms.
3. IOMMU Context Bank Management IOMMU context banks (CBs) are represented as proper struct device instances on a custom virtual bus (qda-compute-cb). Each CB device is registered with the IOMMU subsystem and receives its own IOMMU domain, enabling per-session address space isolation. The custom bus was introduced because IOMMU context banks are synthetic constructs — not real platform devices — and to ensure CB device lifetime is strictly subordinate to the parent QDA device. See also: https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualcom...
4. Memory Manager Architecture A pluggable memory manager coordinates IOMMU device assignment and buffer allocation. The current implementation uses a DMA-coherent backend with SID-prefixed DMA addresses for DSP firmware compatibility.
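To illustrate the SID-prefixed DMA addresses mentioned above, here is a minimal userspace sketch of the encoding. It assumes the stream ID (SID) sits above bit 31 of a 64-bit address and that the DMA address itself fits in the low 32 bits, which mirrors how the existing fastrpc driver in drivers/misc/ composes its addresses. The exact layout used by QDA is not shown in this cover letter, and every qda_example_* name is hypothetical.

```c
#include <stdint.h>

/*
 * Assumption: the SID occupies the bits above bit 31, as in the existing
 * fastrpc driver. The shift used by QDA may differ.
 */
#define QDA_EXAMPLE_SID_SHIFT 32

/* Compose the address handed to the DSP firmware from a DMA address and SID. */
static inline uint64_t qda_example_sid_prefix(uint64_t dma_addr, uint32_t sid)
{
	return dma_addr | ((uint64_t)sid << QDA_EXAMPLE_SID_SHIFT);
}

/* Recover the SID from a prefixed address. */
static inline uint32_t qda_example_sid_of(uint64_t prefixed)
{
	return (uint32_t)(prefixed >> QDA_EXAMPLE_SID_SHIFT);
}

/* Recover the plain DMA address from a prefixed address. */
static inline uint64_t qda_example_dma_of(uint64_t prefixed)
{
	return prefixed & ((UINT64_C(1) << QDA_EXAMPLE_SID_SHIFT) - 1);
}
```

Keeping the encode and decode helpers side by side makes it easy to check that the address seen by the IOMMU and the one passed to firmware stay in sync.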
5. Transport Layer RPMsg communication is handled in a dedicated transport layer (qda_rpmsg.c), separate from the core DRM driver logic.
6. Code Organization
   The driver is organized across multiple files (~4600 lines total):
   * qda_drv.c: Core driver and DRM integration
   * qda_rpmsg.c: RPMsg transport layer
   * qda_cb.c: Context bank device management
   * qda_compute_bus.c: Custom virtual bus for CB devices
   * qda_gem.c: GEM object management
   * qda_prime.c: DMA-BUF import (PRIME)
   * qda_memory_manager.c: IOMMU device registry and allocation
   * qda_memory_dma.c: DMA-coherent allocation backend
   * qda_fastrpc.c: FastRPC protocol implementation
   * qda_ioctl.c: IOCTL dispatch
7. UAPI Design The driver exposes DRM-style IOCTLs defined in include/uapi/drm/qda_accel.h, following DRM UAPI conventions (__u32/__u64 types, C++ guard, GPL-2.0-only WITH Linux-syscall-note).
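The DRM UAPI conventions listed above can be sketched as follows. This is an illustrative header fragment only: DRM_QDA_EXAMPLE, struct drm_qda_example and DRM_IOCTL_QDA_EXAMPLE are hypothetical names that do not appear in include/uapi/drm/qda_accel.h. Only DRM_IOWR, DRM_COMMAND_BASE and the __u32/__u64 types come from the real <drm/drm.h>.

```c
/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
#include <drm/drm.h>	/* DRM_IOWR, DRM_COMMAND_BASE, __u32/__u64 */

#if defined(__cplusplus)
extern "C" {
#endif

/* Hypothetical driver-private command index, for illustration only. */
#define DRM_QDA_EXAMPLE	0x00

/*
 * Hypothetical ioctl payload. Fixed-width types and explicit layout keep
 * the structure identical for 32-bit and 64-bit userspace.
 */
struct drm_qda_example {
	__u64 user_ptr;	/* in: user pointer carried as a 64-bit integer */
	__u32 size;	/* in: size in bytes */
	__u32 flags;	/* in: must be zero in this sketch */
};

/* Driver-private ioctls are numbered from DRM_COMMAND_BASE upwards. */
#define DRM_IOCTL_QDA_EXAMPLE \
	DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_EXAMPLE, struct drm_qda_example)

#if defined(__cplusplus)
}
#endif
```

Carrying pointers as __u64 and padding to a multiple of 8 bytes is what avoids the need for a compat ioctl path.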
Patch Series Organization
==========================

Patch 01:      MAINTAINERS entry
Patch 02:      Driver documentation (Documentation/accel/qda/)
Patches 03-04: Core driver skeleton and compute bus
Patch 05:      iommu: Register qda-compute-cb bus with IOMMU subsystem
Patches 06-07: CB device enumeration and memory manager
Patch 08:      QUERY IOCTL and UAPI header
Patches 09-11: GEM buffer management and PRIME import
Patches 12-15: FastRPC protocol (invoke, session create/release, map/unmap)
Open Items
===========
1. Device-Tree Compatible String The QDA driver uses the same device-tree node structure and properties as the existing fastrpc driver in drivers/misc/. A mechanism is needed to allow the QDA driver to bind to its device node independently of the fastrpc driver.
The intended coexistence model is: platforms that require the complete fastrpc feature set continue to use "qcom,fastrpc"; new platforms where a feature available only in QDA takes priority, or where QDA's current feature set is sufficient, use a QDA-specific compatible string. New feature development is directed toward QDA rather than the existing fastrpc driver. As QDA matures toward feature parity with fastrpc, platforms can adopt the QDA-specific compatible string exclusively.
The options under consideration are:
a) Add a new "qcom,qda" compatible string to the existing qcom,fastrpc.yaml binding, since the DT node structure and properties are identical. This avoids a separate binding file but adds a QDA-specific string to a fastrpc binding.
b) Introduce a separate qcom,qda.yaml binding that references or inherits the fastrpc binding properties.
Seeking guidance from DT binding maintainers on the preferred approach.
2. Privilege Level Management Currently, daemon processes and user processes have the same access level as both use the same accel device node. This needs to be addressed as daemons attach to privileged DSP protection domains and require higher privilege levels for system-level operations. Seeking guidance on the best approach: separate device nodes, capability-based checks, or DRM master/authentication mechanisms.
3. UAPI Compatibility Layer A compatibility layer is needed to facilitate migration of client applications from the existing FastRPC UAPI to the new QDA UAPI, ensuring a smooth transition for existing userspace code. Seeking guidance on the preferred implementation approach: in-kernel translation layer, userspace wrapper library, or hybrid solution.
An initial evaluation of an in-kernel translation shim was performed, where legacy FastRPC device nodes (/dev/fastrpc-*) are exposed and requests are internally routed to the QDA accel driver. The goal was to keep the compatibility layer minimal, reuse existing QDA helper paths (attach, buffer allocation, mapping, etc.), and avoid duplication of GEM and buffer management logic.
However, the following challenges were identified:
a) Dependency on drm_file for QDA helpers QDA relies on GEM-backed allocations and per-client handle namespaces, which require a valid struct drm_file. Since GEM handles are scoped per drm_file, the compatibility layer cannot directly reuse QDA helper paths without establishing a proper drm_file context for each client.
b) Lack of public API for drm_file creation Creating a drm_file directly (similar to mock_drm_getfile()-style approaches) is not feasible, as the required helpers (drm_file_alloc(), drm_file_free(), etc.) are internal to the DRM core and not exported. This prevents external drivers from safely constructing and managing drm_file instances.
c) VFS-based open is not a viable solution Opening the underlying accel device (/dev/accel/accelN) from the compatibility driver via filp_open() does provide a valid drm_file, but introduces reliance on userspace-visible device paths, lack of stability in containerized or chroot environments, and no clean mapping between legacy device nodes and accel devices.
d) Userspace proxy limitations (CUSE) A CUSE-based userspace proxy was evaluated. However, DMA-buf file descriptors passed by legacy applications cannot be directly reused in the CUSE daemon (file descriptors are process-specific), which breaks buffer sharing semantics.
e) drm_client-based approaches do not match requirements drm_client APIs (used for fbdev emulation) rely on a shared drm_file and do not provide the per-client isolation required by FastRPC semantics.
Due to the above constraints, it is currently unclear how to implement an in-kernel compatibility layer that correctly handles per-client drm_file contexts without relying on VFS paths or non-exported DRM internals.
4. Documentation Improvements Add detailed IOCTL usage examples, document DSP firmware interface requirements, and create a migration guide from the existing FastRPC driver.
5. Per-Session Memory Allocation Develop a userspace API to support memory allocation on a per-session basis, enabling session-specific memory management.
6. Audio and Sensors PD Support The current series does not handle Audio PD and Sensors PD functionalities. These specialized protection domains require additional support for real-time constraints and power management.
Interface Compatibility
========================
The QDA driver uses the same device-tree node structure and child node layout (including "qcom,fastrpc-compute-cb" child nodes) as the existing fastrpc driver. The underlying FastRPC protocol and DSP firmware interface are compatible with the existing fastrpc driver, ensuring that DSP firmware and libraries continue to work without modification.
References
==========

Previous discussions on this migration:
- https://lkml.org/lkml/2024/6/24/479
- https://lkml.org/lkml/2024/6/21/1252
Testing
=======

The driver has been tested on Qualcomm platforms with:
- Basic FastRPC attach/release operations
- DSP process creation and initialization
- Memory mapping/unmapping operations
- Dynamic invocation with various buffer types
- GEM buffer allocation and mmap
- PRIME buffer import from other subsystems
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
Ekansh Gupta (15):
      MAINTAINERS: Add entry for Qualcomm DSP Accelerator (QDA) driver
      accel/qda: Add QDA driver documentation
      accel/qda: Add initial QDA DRM accelerator driver
      accel/qda: Add compute bus for QDA context banks
      iommu: Add QDA compute context bank bus to iommu_buses
      accel/qda: Create compute context bank devices on QDA compute bus
      accel/qda: Add memory manager for CB devices
      accel/qda: Add QUERY IOCTL and QDA UAPI header
      accel/qda: Add DMA-backed GEM objects and memory manager integration
      accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
      accel/qda: Add PRIME DMA-BUF import support
      accel/qda: Add FastRPC invocation support
      accel/qda: Add DSP process creation and release
      accel/qda: Add remote memory mapping to DSP address space
      accel/qda: Add remote memory unmap from DSP address space
 Documentation/accel/index.rst          |    1 +
 Documentation/accel/qda/index.rst      |   13 +
 Documentation/accel/qda/qda.rst        |  146 +++++
 MAINTAINERS                            |    9 +
 drivers/accel/Kconfig                  |    1 +
 drivers/accel/Makefile                 |    2 +
 drivers/accel/qda/Kconfig              |   34 +
 drivers/accel/qda/Makefile             |   19 +
 drivers/accel/qda/qda_cb.c             |  146 +++++
 drivers/accel/qda/qda_cb.h             |   32 +
 drivers/accel/qda/qda_compute_bus.c    |   68 ++
 drivers/accel/qda/qda_drv.c            |  192 ++++++
 drivers/accel/qda/qda_drv.h            |   91 +++
 drivers/accel/qda/qda_fastrpc.c        | 1058 ++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_fastrpc.h        |  390 ++++++++++++
 drivers/accel/qda/qda_gem.c            |  177 ++++++
 drivers/accel/qda/qda_gem.h            |   62 ++
 drivers/accel/qda/qda_ioctl.c          |  296 +++++++++
 drivers/accel/qda/qda_ioctl.h          |   19 +
 drivers/accel/qda/qda_memory_dma.c     |  110 ++++
 drivers/accel/qda/qda_memory_dma.h     |   17 +
 drivers/accel/qda/qda_memory_manager.c |  380 ++++++++++++
 drivers/accel/qda/qda_memory_manager.h |   75 +++
 drivers/accel/qda/qda_prime.c          |  184 ++++++
 drivers/accel/qda/qda_prime.h          |   18 +
 drivers/accel/qda/qda_rpmsg.c          |  248 ++++++++
 drivers/accel/qda/qda_rpmsg.h          |   30 +
 drivers/iommu/iommu.c                  |    4 +
 include/linux/qda_compute_bus.h        |   32 +
 include/uapi/drm/qda_accel.h           |  229 +++++++
 30 files changed, 4083 insertions(+)
---
base-commit: 80dd246accce631c328ea43294e53b2b2dd2aa32
change-id: 20260519-qda-series-78c2bf0ed78b
Best regards,
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Add a MAINTAINERS entry for the Qualcomm DSP Accelerator (QDA) driver, covering the driver source under drivers/accel/qda, documentation under Documentation/accel/qda, and the UAPI header include/uapi/drm/qda_accel.h. The linux-arm-msm and dri-devel mailing lists are listed as the relevant review lists.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 MAINTAINERS | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3dd58a16f06a..c5c13ad1e6fe 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22083,6 +22083,15 @@ S:	Maintained
 F:	Documentation/devicetree/bindings/crypto/qcom-qce.yaml
 F:	drivers/crypto/qce/
 
+QUALCOMM DSP ACCELERATOR (QDA) DRIVER
+M:	Ekansh Gupta ekansh.gupta@oss.qualcomm.com
+L:	linux-arm-msm@vger.kernel.org
+L:	dri-devel@lists.freedesktop.org
+S:	Supported
+F:	Documentation/accel/qda/
+F:	drivers/accel/qda/
+F:	include/uapi/drm/qda_accel.h
+
 QUALCOMM EMAC GIGABIT ETHERNET DRIVER
 M:	Timur Tabi timur@kernel.org
 L:	netdev@vger.kernel.org
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Add documentation for the Qualcomm DSP Accelerator (QDA) driver under Documentation/accel/qda/. The documentation covers the driver architecture, GEM-based buffer management, IOMMU context bank isolation, and the RPMsg transport layer.
The user-space API section describes the DRM IOCTLs for session management, GEM buffer allocation, and remote procedure invocation via the FastRPC protocol, along with a typical application lifecycle example. Sections for dynamic debug and basic testing are also included.
Wire the new documentation into the Compute Accelerators index at Documentation/accel/index.rst.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 Documentation/accel/index.rst     |   1 +
 Documentation/accel/qda/index.rst |  13 ++++
 Documentation/accel/qda/qda.rst   | 146 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 160 insertions(+)
diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst index cbc7d4c3876a..5901ea7f784c 100644 --- a/Documentation/accel/index.rst +++ b/Documentation/accel/index.rst @@ -10,4 +10,5 @@ Compute Accelerators introduction amdxdna/index qaic/index + qda/index rocket/index diff --git a/Documentation/accel/qda/index.rst b/Documentation/accel/qda/index.rst new file mode 100644 index 000000000000..013400cf9c25 --- /dev/null +++ b/Documentation/accel/qda/index.rst @@ -0,0 +1,13 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +================================== +accel/qda Qualcomm DSP Accelerator +================================== + +The QDA driver provides a DRM accel based interface for Qualcomm DSP offload. +It uses the FastRPC protocol and integrates with DRM and GEM infrastructure +for device and buffer management. + +.. toctree:: + + qda diff --git a/Documentation/accel/qda/qda.rst b/Documentation/accel/qda/qda.rst new file mode 100644 index 000000000000..9f49af6e6acc --- /dev/null +++ b/Documentation/accel/qda/qda.rst @@ -0,0 +1,146 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +===================================== +Qualcomm DSP Accelerator (QDA) Driver +===================================== + +Introduction +============ + +The QDA driver is a DRM accel driver for Qualcomm's DSPs. It provides a +DRM accel based interface for Qualcomm DSP offload, supporting workloads +such as AI inference, computer vision, audio processing, and sensor offload +on Qualcomm SoCs. It uses the FastRPC protocol and integrates with DRM and +GEM infrastructure for device and buffer management. + +Key Features +============ + +* **DRM accel Interface**: Exposes a standard character device node + (e.g., ``/dev/accel/accel0``) via the DRM accel subsystem. +* **FastRPC Protocol**: Implements the FastRPC protocol for communication + between the application processor and the DSP. 
+* **GEM Buffer Management**: Uses the DRM GEM interface for buffer + allocation, lifecycle management, and DMA-BUF import/export. +* **IOMMU Isolation**: Uses IOMMU context banks to enforce memory isolation + between different DSP user sessions. +* **Modular Design**: Clean separation between the core DRM logic, the + memory manager, and the RPMsg-based transport layer. + +Architecture +============ + +The QDA driver consists of several functional blocks: + +1. **Core Driver (``qda_drv``)**: Manages device registration, file operations, + and DRM accel integration. +2. **Memory Manager (``qda_memory_manager``)**: A flexible memory management + layer that handles IOMMU context banks. It supports pluggable backends + (such as DMA-coherent) to adapt to different SoC memory architectures. +3. **GEM Subsystem**: Implements the DRM GEM interface for buffer management: + + * **``qda_gem``**: Core GEM object management, including allocation, mmap + operations, and buffer lifecycle management. + * **``qda_prime``**: PRIME import functionality for DMA-BUF interoperability + with other kernel subsystems. + +4. **Transport Layer (``qda_rpmsg``)**: Abstraction over the RPMsg framework + to handle low-level message passing with the DSP firmware. +5. **Compute Bus (``qda_compute_bus``)**: A custom virtual bus used to + enumerate and manage the specific compute context banks defined in the + device tree. The bus was introduced because IOMMU context banks (CBs) are + synthetic constructs — not real platform devices — making a platform driver + an incorrect abstraction for them. The earlier platform-driver approach also + had a race condition: device nodes were created before the RPMsg channel + resources were fully initialized, and because ``probe`` runs asynchronously, + applications could open a CB device and attempt to start a session before + the underlying transport was ready. 
The compute bus makes CB lifetime + explicitly subordinate to the parent QDA device, closing that window. +6. **FastRPC Core (``qda_fastrpc``)**: Implements the protocol logic for + marshalling arguments and handling remote invocations. + +User-Space API +============== + +The driver exposes a set of DRM-compliant IOCTLs: + +* ``DRM_IOCTL_QDA_QUERY``: Query DSP type (e.g., "cdsp", "adsp") + and capabilities. +* ``DRM_IOCTL_QDA_REMOTE_SESSION_CREATE``: Initialize a new process context + on the DSP. +* ``DRM_IOCTL_QDA_REMOTE_INVOKE``: Submit a remote method invocation (the + primary execution unit). +* ``DRM_IOCTL_QDA_GEM_CREATE``: Allocate a GEM buffer object for DSP usage. +* ``DRM_IOCTL_QDA_GEM_MMAP_OFFSET``: Retrieve mmap offsets for memory mapping. +* ``DRM_IOCTL_QDA_REMOTE_MAP`` / ``DRM_IOCTL_QDA_REMOTE_MUNMAP``: Map or unmap + buffers into the DSP's virtual address space. Each accepts a ``request`` + field selecting between a legacy operation (``QDA_MAP_REQUEST_LEGACY`` / + ``QDA_MUNMAP_REQUEST_LEGACY``) and an attribute-based operation + (``QDA_MAP_REQUEST_ATTR`` / ``QDA_MUNMAP_REQUEST_ATTR``). + +Usage Example +============= + +A typical lifecycle for a user-space application: + +1. **Discovery**: Open ``/dev/accel/accel*`` and use + ``DRM_IOCTL_QDA_QUERY`` to identify the DSP domain served by that + device node. +2. **Initialization**: Call ``DRM_IOCTL_QDA_REMOTE_SESSION_CREATE`` to + establish a session and create a process context on the DSP. +3. **Memory**: Allocate buffers via ``DRM_IOCTL_QDA_GEM_CREATE`` or import + DMA-BUFs (PRIME fd) from other drivers using ``DRM_IOCTL_PRIME_FD_TO_HANDLE``. +4. **Execution**: Use ``DRM_IOCTL_QDA_REMOTE_INVOKE`` to pass arguments and + execute functions on the DSP. +5. **Cleanup**: Close file descriptors to automatically release resources and + detach the session. 
+ +Internal Implementation +======================= + +Memory Management +----------------- +The driver's memory manager creates virtual "IOMMU devices" that map to +hardware context banks. This allows the driver to manage multiple isolated +address spaces. The implementation uses a DMA-coherent backend to ensure data consistency +between the CPU and DSP without manual cache maintenance in most cases. + +Debugging +========= +The driver includes extensive dynamic debug support. Enable it via the +kernel's dynamic debug control: + +.. code-block:: bash + + echo "file drivers/accel/qda/* +p" > /sys/kernel/debug/dynamic_debug/control + +Testing +======= +The QDA driver can be exercised using the ``fastrpc_test`` utility from the +FastRPC userspace library. Run the test application: + +.. code-block:: bash + + fastrpc_test -d 3 -U 1 -t linux -a v68 + +**Options** + +``-d domain`` + Select the DSP domain to run on: + + * ``0`` — ADSP + * ``1`` — MDSP + * ``2`` — SDSP + * ``3`` — CDSP *(default on targets with CDSP)* + +``-U unsigned_PD`` + Select signed or unsigned protection domain: + + * ``0`` — signed PD + * ``1`` — unsigned PD *(default)* + +``-t target`` + Target platform: ``android`` or ``linux`` *(default: linux)* + +``-a arch_version`` + DSP architecture version, e.g. ``v68``, ``v75`` *(default: v68)*
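As a rough illustration of the Discovery step in the lifecycle documented by this patch, the sketch below opens an accel node and reads the DRM driver name with the generic DRM_IOCTL_VERSION ioctl, then compares it with "qda" (the value of QDA_DRIVER_NAME in this series). It deliberately avoids DRM_IOCTL_QDA_QUERY, whose argument layout is introduced by a later patch and not reproduced here. The helper names accel_driver_name() and accel_is_qda() are hypothetical, and the sketch assumes that accel nodes accept DRM_IOCTL_VERSION as other DRM nodes do.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <drm/drm.h>	/* struct drm_version, DRM_IOCTL_VERSION */

/*
 * Read the DRM driver name of the node at @path into @name.
 * Returns 0 on success (name is NUL-terminated), -errno on failure.
 */
static inline int accel_driver_name(const char *path, char *name, size_t len)
{
	struct drm_version v;
	int fd, ret = 0;

	if (!len)
		return -EINVAL;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -errno;

	memset(&v, 0, sizeof(v));
	v.name = name;
	v.name_len = len - 1;	/* leave room for the terminator */

	if (ioctl(fd, DRM_IOCTL_VERSION, &v) < 0) {
		ret = -errno;
	} else {
		/* The kernel reports the full name length and adds no NUL. */
		size_t n = v.name_len < len - 1 ? v.name_len : len - 1;

		name[n] = '\0';
	}

	close(fd);
	return ret;
}

/* Returns 1 if @path is served by the "qda" driver, 0 if not, -errno on error. */
static inline int accel_is_qda(const char *path)
{
	char name[32];
	int ret = accel_driver_name(path, name, sizeof(name));

	return ret < 0 ? ret : strcmp(name, "qda") == 0;
}
```

An application could walk /dev/accel/accel0, accel1, and so on, calling accel_is_qda() on each node before issuing any QDA-specific ioctl.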
On Tue, May 19, 2026 at 11:45:52AM +0530, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Add documentation for the Qualcomm DSP Accelerator (QDA) driver under Documentation/accel/qda/. The documentation covers the driver architecture, GEM-based buffer management, IOMMU context bank isolation, and the RPMsg transport layer.
The user-space API section describes the DRM IOCTLs for session management, GEM buffer allocation, and remote procedure invocation via the FastRPC protocol, along with a typical application lifecycle example. Sections for dynamic debug and basic testing are also included.
Wire the new documentation into the Compute Accelerators index at Documentation/accel/index.rst.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com

 Documentation/accel/index.rst     |   1 +
 Documentation/accel/qda/index.rst |  13 ++++
 Documentation/accel/qda/qda.rst   | 146 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 160 insertions(+)
diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
index cbc7d4c3876a..5901ea7f784c 100644
--- a/Documentation/accel/index.rst
+++ b/Documentation/accel/index.rst
@@ -10,4 +10,5 @@ Compute Accelerators
    introduction
    amdxdna/index
    qaic/index
+   qda/index
    rocket/index
diff --git a/Documentation/accel/qda/index.rst b/Documentation/accel/qda/index.rst
new file mode 100644
index 000000000000..013400cf9c25
--- /dev/null
+++ b/Documentation/accel/qda/index.rst
@@ -0,0 +1,13 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+==================================
+accel/qda Qualcomm DSP Accelerator
+==================================
+
+The QDA driver provides a DRM accel based interface for Qualcomm DSP offload.
+It uses the FastRPC protocol and integrates with DRM and GEM infrastructure
+for device and buffer management.
+
+.. toctree::
+
+   qda
diff --git a/Documentation/accel/qda/qda.rst b/Documentation/accel/qda/qda.rst
new file mode 100644
index 000000000000..9f49af6e6acc
--- /dev/null
+++ b/Documentation/accel/qda/qda.rst
@@ -0,0 +1,146 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+=====================================
+Qualcomm DSP Accelerator (QDA) Driver
+=====================================
+
+Introduction
+============
+
+The QDA driver is a DRM accel driver for Qualcomm's DSPs. It provides a
+DRM accel based interface for Qualcomm DSP offload, supporting workloads
+such as AI inference, computer vision, audio processing, and sensor offload
+on Qualcomm SoCs. It uses the FastRPC protocol and integrates with DRM and
+GEM infrastructure for device and buffer management.
+
+Key Features
+============
+
+* **DRM accel Interface**: Exposes a standard character device node
+  (e.g., ``/dev/accel/accel0``) via the DRM accel subsystem.
+* **FastRPC Protocol**: Implements the FastRPC protocol for communication
+  between the application processor and the DSP.
+* **GEM Buffer Management**: Uses the DRM GEM interface for buffer
+  allocation, lifecycle management, and DMA-BUF import/export.
+* **IOMMU Isolation**: Uses IOMMU context banks to enforce memory isolation
+  between different DSP user sessions.
+* **Modular Design**: Clean separation between the core DRM logic, the
+  memory manager, and the RPMsg-based transport layer.
+
+Architecture
+============
+
+The QDA driver consists of several functional blocks:
+
+1. **Core Driver (``qda_drv``)**: Manages device registration, file operations,
+   and DRM accel integration.
+2. **Memory Manager (``qda_memory_manager``)**: A flexible memory management
+   layer that handles IOMMU context banks. It supports pluggable backends
+   (such as DMA-coherent) to adapt to different SoC memory architectures.
+3. **GEM Subsystem**: Implements the DRM GEM interface for buffer management:
+
+   * **``qda_gem``**: Core GEM object management, including allocation, mmap
+     operations, and buffer lifecycle management.
+   * **``qda_prime``**: PRIME import functionality for DMA-BUF interoperability
+     with other kernel subsystems.
+
+4. **Transport Layer (``qda_rpmsg``)**: Abstraction over the RPMsg framework
+   to handle low-level message passing with the DSP firmware.
+5. **Compute Bus (``qda_compute_bus``)**: A custom virtual bus used to
+   enumerate and manage the specific compute context banks defined in the
+   device tree. The bus was introduced because IOMMU context banks (CBs) are
+   synthetic constructs — not real platform devices — making a platform driver
+   an incorrect abstraction for them. The earlier platform-driver approach also
+   had a race condition: device nodes were created before the RPMsg channel
+   resources were fully initialized, and because ``probe`` runs asynchronously,
+   applications could open a CB device and attempt to start a session before
+   the underlying transport was ready. The compute bus makes CB lifetime
+   explicitly subordinate to the parent QDA device, closing that window.
+6. **FastRPC Core (``qda_fastrpc``)**: Implements the protocol logic for
+   marshalling arguments and handling remote invocations.
+
+User-Space API
+==============
+
+The driver exposes a set of DRM-compliant IOCTLs:
+
+* ``DRM_IOCTL_QDA_QUERY``: Query DSP type (e.g., "cdsp", "adsp")
+  and capabilities.
+* ``DRM_IOCTL_QDA_REMOTE_SESSION_CREATE``: Initialize a new process context
+  on the DSP.
+* ``DRM_IOCTL_QDA_REMOTE_INVOKE``: Submit a remote method invocation (the
+  primary execution unit).
+* ``DRM_IOCTL_QDA_GEM_CREATE``: Allocate a GEM buffer object for DSP usage.
+* ``DRM_IOCTL_QDA_GEM_MMAP_OFFSET``: Retrieve mmap offsets for memory mapping.
+* ``DRM_IOCTL_QDA_REMOTE_MAP`` / ``DRM_IOCTL_QDA_REMOTE_MUNMAP``: Map or unmap
+  buffers into the DSP's virtual address space. Each accepts a ``request``
+  field selecting between a legacy operation (``QDA_MAP_REQUEST_LEGACY`` /
+  ``QDA_MUNMAP_REQUEST_LEGACY``) and an attribute-based operation
+  (``QDA_MAP_REQUEST_ATTR`` / ``QDA_MUNMAP_REQUEST_ATTR``).
Explain what happens if users don't map the buffers into the DSP space. Will DRM_IOCTL_QDA_REMOTE_INVOKE handle the mapping or not? What is the difference between those two modes?
Would the driver benefit from using GPUVM?
+Usage Example
+=============
+
+A typical lifecycle for a user-space application:
+
+1. **Discovery**: Open ``/dev/accel/accel*`` and use
+   ``DRM_IOCTL_QDA_QUERY`` to identify the DSP domain served by that
+   device node.
+2. **Initialization**: Call ``DRM_IOCTL_QDA_REMOTE_SESSION_CREATE`` to
+   establish a session and create a process context on the DSP.
+3. **Memory**: Allocate buffers via ``DRM_IOCTL_QDA_GEM_CREATE`` or import
+   DMA-BUFs (PRIME fd) from other drivers using ``DRM_IOCTL_PRIME_FD_TO_HANDLE``.
+4. **Execution**: Use ``DRM_IOCTL_QDA_REMOTE_INVOKE`` to pass arguments and
+   execute functions on the DSP.
+5. **Cleanup**: Close file descriptors to automatically release resources and
+   detach the session.
I'd have expected a description of an actual example here, i.e. clone the app from https://the.addr, prepare clang >= NN.MM, QAIC (https://foo), run make, run the app, check the results. I'd remind you that DRM Accel has a very specific requirement that a working toolchain be available as open source.
+Internal Implementation
+=======================
+
+Memory Management
+-----------------
+The driver's memory manager creates virtual "IOMMU devices" that map to
+hardware context banks. This allows the driver to manage multiple isolated
+address spaces. The implementation uses a DMA-coherent backend to ensure data consistency
+between the CPU and DSP without manual cache maintenance in most cases.
GEM usage?
+Debugging
+=========
+The driver includes extensive dynamic debug support. Enable it via the
+kernel's dynamic debug control:
+
+.. code-block:: bash
+
+   echo "file drivers/accel/qda/* +p" > /sys/kernel/debug/dynamic_debug/control
+
+Testing
+=======
+The QDA driver can be exercised using the ``fastrpc_test`` utility from the
+FastRPC userspace library. Run the test application:
pointer
+.. code-block:: bash
+
+   fastrpc_test -d 3 -U 1 -t linux -a v68
+
+**Options**
+
+``-d domain``
+    Select the DSP domain to run on:
+
+    * ``0`` — ADSP
+    * ``1`` — MDSP
+    * ``2`` — SDSP
+    * ``3`` — CDSP *(default on targets with CDSP)*
+
+``-U unsigned_PD``
+    Select signed or unsigned protection domain:
+
+    * ``0`` — signed PD
+    * ``1`` — unsigned PD *(default)*
+
+``-t target``
+    Target platform: ``android`` or ``linux`` *(default: linux)*
+
+``-a arch_version``
+    DSP architecture version, e.g. ``v68``, ``v75`` *(default: v68)*
-- 2.34.1
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Add the foundational driver files for the Qualcomm DSP Accelerator (QDA), a DRM accel driver for Qualcomm DSPs. The driver integrates with the DRM accel subsystem (drivers/accel/) and provides:
- A standard /dev/accel/accel* character device node via DRM.
- GEM-based buffer management with DMA-BUF import/export (PRIME).
- IOMMU context bank management for per-session memory isolation.
- Standard DRM IOCTLs for device management and job submission.
qda_drv.c / qda_drv.h: Core DRM driver registration. Defines the drm_driver ops table, per-file private state (qda_file_priv), and the main device structure (qda_dev) which embeds drm_device.
qda_rpmsg.c / qda_rpmsg.h: RPMsg transport layer. Registers an rpmsg_driver matching the "qcom,fastrpc" compatible string. On probe it allocates a qda_dev, reads the DSP domain name from the "label" DT property, and registers the DRM device.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/accel/Kconfig         |  1 +
 drivers/accel/Makefile        |  1 +
 drivers/accel/qda/Kconfig     | 30 +++++++++++++
 drivers/accel/qda/Makefile    | 10 +++++
 drivers/accel/qda/qda_drv.c   | 97 ++++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_drv.h   | 62 +++++++++++++++++++++++++++
 drivers/accel/qda/qda_rpmsg.c | 99 +++++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_rpmsg.h | 13 ++++++
 8 files changed, 313 insertions(+)
diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig index bdf48ccafcf2..74ac0f71bc9d 100644 --- a/drivers/accel/Kconfig +++ b/drivers/accel/Kconfig @@ -29,6 +29,7 @@ source "drivers/accel/ethosu/Kconfig" source "drivers/accel/habanalabs/Kconfig" source "drivers/accel/ivpu/Kconfig" source "drivers/accel/qaic/Kconfig" +source "drivers/accel/qda/Kconfig" source "drivers/accel/rocket/Kconfig"
endif diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile index 1d3a7251b950..58c08dd5f389 100644 --- a/drivers/accel/Makefile +++ b/drivers/accel/Makefile @@ -5,4 +5,5 @@ obj-$(CONFIG_DRM_ACCEL_ARM_ETHOSU) += ethosu/ obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/ obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/ obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/ +obj-$(CONFIG_DRM_ACCEL_QDA) += qda/ obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/ \ No newline at end of file diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig new file mode 100644 index 000000000000..484d21ff1b55 --- /dev/null +++ b/drivers/accel/qda/Kconfig @@ -0,0 +1,30 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Qualcomm DSP accelerator driver +# + +config DRM_ACCEL_QDA + tristate "Qualcomm DSP accelerator" + depends on DRM_ACCEL + depends on ARCH_QCOM || COMPILE_TEST + depends on RPMSG + help + Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs. + This driver provides a standardized interface for offloading computational + tasks to the DSP, including audio processing, sensor offload, computer + vision, and AI inference workloads. + + The driver supports all DSP domains (ADSP, CDSP, SDSP, GDSP) and + implements the FastRPC protocol for communication between the application + processor and DSP. It integrates with the Linux kernel's Compute + Accelerators subsystem (drivers/accel/) and provides a modern alternative + to the legacy FastRPC driver found in drivers/misc/. + + Key features include DMA-BUF interoperability for seamless buffer sharing + with other multimedia subsystems, IOMMU-based memory isolation, and + standard DRM IOCTLs for device management and job submission. + + If unsure, say N. + + To compile this driver as a module, choose M here: the + module will be called qda. 
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile new file mode 100644 index 000000000000..dbe809067a8b --- /dev/null +++ b/drivers/accel/qda/Makefile @@ -0,0 +1,10 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Makefile for Qualcomm DSP accelerator driver +# + +obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o + +qda-y := \ + qda_drv.o \ + qda_rpmsg.o diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c new file mode 100644 index 000000000000..1c1bab68d445 --- /dev/null +++ b/drivers/accel/qda/qda_drv.c @@ -0,0 +1,97 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. +#include <linux/module.h> +#include <linux/slab.h> +#include <drm/drm_accel.h> +#include <drm/drm_drv.h> +#include <drm/drm_file.h> +#include <drm/drm_gem.h> +#include <drm/drm_ioctl.h> +#include <drm/drm_print.h> + +#include "qda_drv.h" +#include "qda_rpmsg.h" + +static int qda_open(struct drm_device *dev, struct drm_file *file) +{ + struct qda_file_priv *qda_file_priv; + + qda_file_priv = kzalloc_obj(*qda_file_priv); + if (!qda_file_priv) + return -ENOMEM; + + qda_file_priv->qda_dev = qda_dev_from_drm(dev); + file->driver_priv = qda_file_priv; + + return 0; +} + +static void qda_postclose(struct drm_device *dev, struct drm_file *file) +{ + struct qda_file_priv *qda_file_priv = file->driver_priv; + + kfree(qda_file_priv); + file->driver_priv = NULL; +} + +DEFINE_DRM_ACCEL_FOPS(qda_accel_fops); + +static const struct drm_driver qda_drm_driver = { + .driver_features = DRIVER_COMPUTE_ACCEL, + .fops = &qda_accel_fops, + .open = qda_open, + .postclose = qda_postclose, + .name = QDA_DRIVER_NAME, + .desc = "Qualcomm DSP Accelerator Driver", +}; + +struct qda_dev *qda_alloc_device(struct device *dev) +{ + struct qda_dev *qdev; + + qdev = devm_drm_dev_alloc(dev, &qda_drm_driver, struct qda_dev, drm_dev); + if (IS_ERR(qdev)) + return ERR_CAST(qdev); + + return qdev; +} + +void qda_unregister_device(struct qda_dev 
*qdev) +{ + drm_dev_unregister(&qdev->drm_dev); +} + +int qda_register_device(struct qda_dev *qdev) +{ + int ret; + + ret = drm_dev_register(&qdev->drm_dev, 0); + if (ret) + drm_err(&qdev->drm_dev, "Failed to register DRM device: %d\n", ret); + + return ret; +} + +static int __init qda_core_init(void) +{ + int ret; + + ret = qda_rpmsg_register(); + if (ret) + return ret; + + pr_info("qda: QDA driver initialization complete\n"); + return 0; +} + +static void __exit qda_core_exit(void) +{ + qda_rpmsg_unregister(); +} + +module_init(qda_core_init); +module_exit(qda_core_exit); + +MODULE_AUTHOR("Qualcomm AI Infra Team"); +MODULE_DESCRIPTION("Qualcomm DSP Accelerator Driver"); +MODULE_LICENSE("GPL"); diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h new file mode 100644 index 000000000000..7ba2ef19a411 --- /dev/null +++ b/drivers/accel/qda/qda_drv.h @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. + */ + +#ifndef __QDA_DRV_H__ +#define __QDA_DRV_H__ + +#include <linux/device.h> +#include <linux/rpmsg.h> +#include <linux/types.h> +#include <drm/drm_device.h> +#include <drm/drm_drv.h> +#include <drm/drm_file.h> + +/* Driver identification */ +#define QDA_DRIVER_NAME "qda" + +/** + * struct qda_file_priv - Per-process private data for DRM file + */ +struct qda_file_priv { + /** @qda_dev: Back-pointer to device structure */ + struct qda_dev *qda_dev; +}; + +/** + * struct qda_dev - Main device structure for QDA driver + * + * The DRM device is embedded as the first member so that container_of() + * can recover the qda_dev from any drm_device pointer. 
+ */ +struct qda_dev { + /** @drm_dev: Embedded DRM device; recover via qda_dev_from_drm() */ + struct drm_device drm_dev; + /** @rpdev: RPMsg device for communication with the remote processor */ + struct rpmsg_device *rpdev; + /** @dev: Underlying Linux device */ + struct device *dev; + /** @dsp_name: Name of the DSP domain (e.g. "cdsp", "adsp") */ + const char *dsp_name; +}; + +/** + * qda_dev_from_drm - Recover qda_dev from an embedded drm_device pointer + * @dev: Pointer to the embedded drm_device + * + * Return: Pointer to the enclosing qda_dev. + */ +static inline struct qda_dev *qda_dev_from_drm(struct drm_device *dev) +{ + return container_of(dev, struct qda_dev, drm_dev); +} + +/* Device allocation (uses devm_drm_dev_alloc internally) */ +struct qda_dev *qda_alloc_device(struct device *dev); + +/* Core device lifecycle */ +int qda_register_device(struct qda_dev *qdev); +void qda_unregister_device(struct qda_dev *qdev); + +#endif /* __QDA_DRV_H__ */ diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c new file mode 100644 index 000000000000..6eaf1b145f8a --- /dev/null +++ b/drivers/accel/qda/qda_rpmsg.c @@ -0,0 +1,99 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. 
+#include <linux/module.h> +#include <linux/of.h> +#include <linux/rpmsg.h> +#include <drm/drm_print.h> + +#include "qda_drv.h" +#include "qda_rpmsg.h" + +static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev) +{ + struct qda_dev *qdev; + + qdev = qda_alloc_device(&rpdev->dev); + if (IS_ERR(qdev)) + return qdev; + + qdev->dev = &rpdev->dev; + qdev->rpdev = rpdev; + dev_set_drvdata(&rpdev->dev, qdev); + + return qdev; +} + +static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, + void *priv, u32 src) +{ + /* Placeholder: responses will be dispatched here */ + return 0; +} + +static void qda_rpmsg_remove(struct rpmsg_device *rpdev) +{ + struct qda_dev *qdev = dev_get_drvdata(&rpdev->dev); + + drm_dev_unplug(&qdev->drm_dev); + qdev->rpdev = NULL; + qda_unregister_device(qdev); + dev_info(qdev->dev, "RPMsg device removed\n"); +} + +static int qda_rpmsg_probe(struct rpmsg_device *rpdev) +{ + struct qda_dev *qdev; + const char *label; + int ret; + + dev_dbg(&rpdev->dev, "QDA RPMsg probe starting\n"); + + qdev = alloc_and_init_qdev(rpdev); + if (IS_ERR(qdev)) + return PTR_ERR(qdev); + + ret = of_property_read_string(rpdev->dev.of_node, "label", &label); + if (ret) { + dev_err(qdev->dev, "Missing 'label' property in DT node: %d\n", ret); + return ret; + } + qdev->dsp_name = label; + + ret = qda_register_device(qdev); + if (ret) + return ret; + + drm_info(&qdev->drm_dev, "QDA RPMsg probe complete for %s\n", qdev->dsp_name); + return 0; +} + +static const struct of_device_id qda_rpmsg_id_table[] = { + { .compatible = "qcom,fastrpc" }, + {}, +}; +MODULE_DEVICE_TABLE(of, qda_rpmsg_id_table); + +static struct rpmsg_driver qda_rpmsg_driver = { + .probe = qda_rpmsg_probe, + .remove = qda_rpmsg_remove, + .callback = qda_rpmsg_cb, + .drv = { + .name = "qcom,fastrpc", + .of_match_table = qda_rpmsg_id_table, + }, +}; + +int qda_rpmsg_register(void) +{ + int ret = register_rpmsg_driver(&qda_rpmsg_driver); + + if (ret) + pr_err("qda: Failed to register 
RPMsg driver: %d\n", ret); + + return ret; +} + +void qda_rpmsg_unregister(void) +{ + unregister_rpmsg_driver(&qda_rpmsg_driver); +} diff --git a/drivers/accel/qda/qda_rpmsg.h b/drivers/accel/qda/qda_rpmsg.h new file mode 100644 index 000000000000..5229d834b34b --- /dev/null +++ b/drivers/accel/qda/qda_rpmsg.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. + */ + +#ifndef __QDA_RPMSG_H__ +#define __QDA_RPMSG_H__ + +/* RPMsg transport layer registration */ +int qda_rpmsg_register(void); +void qda_rpmsg_unregister(void); + +#endif /* __QDA_RPMSG_H__ */
On Tue, May 19, 2026 at 11:45:53AM +0530, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Add the foundational driver files for the Qualcomm DSP Accelerator (QDA), a DRM accel driver for Qualcomm DSPs. The driver integrates with the DRM accel subsystem (drivers/accel/) and provides:
- A standard /dev/accel/accel* character device node via DRM.
- GEM-based buffer management with DMA-BUF import/export (PRIME).
- IOMMU context bank management for per-session memory isolation.
- Standard DRM IOCTLs for device management and job submission.
qda_drv.c / qda_drv.h: Core DRM driver registration. Defines the drm_driver ops table, per-file private state (qda_file_priv), and the main device structure (qda_dev) which embeds drm_device.
qda_rpmsg.c / qda_rpmsg.h: RPMsg transport layer. Registers an rpmsg_driver matching the "qcom,fastrpc" compatible string. On probe it allocates a qda_dev, reads the DSP domain name from the "label" DT property, and registers the DRM device.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
 drivers/accel/Kconfig         |  1 +
 drivers/accel/Makefile        |  1 +
 drivers/accel/qda/Kconfig     | 30 +++++++++++++
 drivers/accel/qda/Makefile    | 10 +++++
 drivers/accel/qda/qda_drv.c   | 97 ++++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_drv.h   | 62 +++++++++++++++++++++++++++
 drivers/accel/qda/qda_rpmsg.c | 99 +++++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_rpmsg.h | 13 ++++++
 8 files changed, 313 insertions(+)
diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
index bdf48ccafcf2..74ac0f71bc9d 100644
--- a/drivers/accel/Kconfig
+++ b/drivers/accel/Kconfig
@@ -29,6 +29,7 @@ source "drivers/accel/ethosu/Kconfig"
 source "drivers/accel/habanalabs/Kconfig"
 source "drivers/accel/ivpu/Kconfig"
 source "drivers/accel/qaic/Kconfig"
+source "drivers/accel/qda/Kconfig"
 source "drivers/accel/rocket/Kconfig"

 endif
diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
index 1d3a7251b950..58c08dd5f389 100644
--- a/drivers/accel/Makefile
+++ b/drivers/accel/Makefile
@@ -5,4 +5,5 @@ obj-$(CONFIG_DRM_ACCEL_ARM_ETHOSU) += ethosu/
 obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/
 obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/
 obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/
+obj-$(CONFIG_DRM_ACCEL_QDA) += qda/
 obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/
\ No newline at end of file
diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig
new file mode 100644
index 000000000000..484d21ff1b55
--- /dev/null
+++ b/drivers/accel/qda/Kconfig
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Qualcomm DSP accelerator driver
+#
+config DRM_ACCEL_QDA
- tristate "Qualcomm DSP accelerator"
- depends on DRM_ACCEL
- depends on ARCH_QCOM || COMPILE_TEST
- depends on RPMSG
- help
    Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs.
    This driver provides a standardized interface for offloading computational
    tasks to the DSP, including audio processing, sensor offload, computer
    vision, and AI inference workloads.

    The driver supports all DSP domains (ADSP, CDSP, SDSP, GDSP) and
    implements the FastRPC protocol for communication between the application
    processor and DSP. It integrates with the Linux kernel's Compute
    Accelerators subsystem (drivers/accel/) and provides a modern alternative
    to the legacy FastRPC driver found in drivers/misc/.

    Key features include DMA-BUF interoperability for seamless buffer sharing
Key features of what? Consider distro maintainers reading your help text in order to identify whether to enable it or not.
    with other multimedia subsystems, IOMMU-based memory isolation, and
    standard DRM IOCTLs for device management and job submission.

    If unsure, say N.

    To compile this driver as a module, choose M here: the
    module will be called qda.

diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
new file mode 100644
index 000000000000..dbe809067a8b
--- /dev/null
+++ b/drivers/accel/qda/Makefile
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Makefile for Qualcomm DSP accelerator driver
+#
+obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
+qda-y := \
- qda_drv.o \
- qda_rpmsg.o
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c new file mode 100644 index 000000000000..1c1bab68d445 --- /dev/null +++ b/drivers/accel/qda/qda_drv.c @@ -0,0 +1,97 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. +#include <linux/module.h> +#include <linux/slab.h> +#include <drm/drm_accel.h> +#include <drm/drm_drv.h> +#include <drm/drm_file.h> +#include <drm/drm_gem.h> +#include <drm/drm_ioctl.h> +#include <drm/drm_print.h>
+#include "qda_drv.h" +#include "qda_rpmsg.h"
+static int qda_open(struct drm_device *dev, struct drm_file *file) +{
- struct qda_file_priv *qda_file_priv;
- qda_file_priv = kzalloc_obj(*qda_file_priv);
- if (!qda_file_priv)
    return -ENOMEM;
- qda_file_priv->qda_dev = qda_dev_from_drm(dev);
- file->driver_priv = qda_file_priv;
- return 0;
+}
+static void qda_postclose(struct drm_device *dev, struct drm_file *file) +{
- struct qda_file_priv *qda_file_priv = file->driver_priv;
- kfree(qda_file_priv);
- file->driver_priv = NULL;
+}
+DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
+static const struct drm_driver qda_drm_driver = {
- .driver_features = DRIVER_COMPUTE_ACCEL,
- .fops = &qda_accel_fops,
- .open = qda_open,
- .postclose = qda_postclose,
- .name = QDA_DRIVER_NAME,
- .desc = "Qualcomm DSP Accelerator Driver",
+};
+struct qda_dev *qda_alloc_device(struct device *dev) +{
- struct qda_dev *qdev;
- qdev = devm_drm_dev_alloc(dev, &qda_drm_driver, struct qda_dev, drm_dev);
- if (IS_ERR(qdev))
    return ERR_CAST(qdev);
- return qdev;
+}
+void qda_unregister_device(struct qda_dev *qdev) +{
- drm_dev_unregister(&qdev->drm_dev);
+}
+int qda_register_device(struct qda_dev *qdev) +{
- int ret;
- ret = drm_dev_register(&qdev->drm_dev, 0);
- if (ret)
    drm_err(&qdev->drm_dev, "Failed to register DRM device: %d\n", ret);
- return ret;
+}
+static int __init qda_core_init(void) +{
- int ret;
- ret = qda_rpmsg_register();
- if (ret)
    return ret;
- pr_info("qda: QDA driver initialization complete\n");
- return 0;
+}
+static void __exit qda_core_exit(void) +{
- qda_rpmsg_unregister();
+}
+module_init(qda_core_init); +module_exit(qda_core_exit);
+MODULE_AUTHOR("Qualcomm AI Infra Team"); +MODULE_DESCRIPTION("Qualcomm DSP Accelerator Driver"); +MODULE_LICENSE("GPL"); diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h new file mode 100644 index 000000000000..7ba2ef19a411 --- /dev/null +++ b/drivers/accel/qda/qda_drv.h @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/*
- Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
- */
+#ifndef __QDA_DRV_H__ +#define __QDA_DRV_H__
+#include <linux/device.h> +#include <linux/rpmsg.h> +#include <linux/types.h> +#include <drm/drm_device.h> +#include <drm/drm_drv.h> +#include <drm/drm_file.h>
+/* Driver identification */ +#define QDA_DRIVER_NAME "qda"
+/**
- struct qda_file_priv - Per-process private data for DRM file
- */
+struct qda_file_priv {
- /** @qda_dev: Back-pointer to device structure */
- struct qda_dev *qda_dev;
+};
+/**
- struct qda_dev - Main device structure for QDA driver
- The DRM device is embedded as the first member so that container_of()
- can recover the qda_dev from any drm_device pointer.
- */
+struct qda_dev {
- /** @drm_dev: Embedded DRM device; recover via qda_dev_from_drm() */
- struct drm_device drm_dev;
- /** @rpdev: RPMsg device for communication with the remote processor */
- struct rpmsg_device *rpdev;
- /** @dev: Underlying Linux device */
- struct device *dev;
- /** @dsp_name: Name of the DSP domain (e.g. "cdsp", "adsp") */
- const char *dsp_name;
+};
+/**
- qda_dev_from_drm - Recover qda_dev from an embedded drm_device pointer
- @dev: Pointer to the embedded drm_device
- Return: Pointer to the enclosing qda_dev.
- */
+static inline struct qda_dev *qda_dev_from_drm(struct drm_device *dev) +{
- return container_of(dev, struct qda_dev, drm_dev);
+}
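For readers less familiar with the idiom, here is a small userspace model of what qda_dev_from_drm() does. It is not driver code: the mock_drm_device and mock_qda_dev types and the simplified container_of macro are stand-ins assumed purely for illustration. The sketch also shows that the recovery goes through offsetof(), so it works wherever the embedded member sits in the structure; placing drm_dev first is a convention rather than a requirement of container_of().

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of() (no type checking). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical mock types; the real drm_device and qda_dev are much larger. */
struct mock_drm_device {
	int minor;
};

struct mock_qda_dev {
	const char *dsp_name;           /* deliberately placed before drm_dev */
	struct mock_drm_device drm_dev; /* embedded, not a pointer */
};

/* Mirrors the shape of qda_dev_from_drm() on the mock types. */
static inline struct mock_qda_dev *mock_qda_dev_from_drm(struct mock_drm_device *dev)
{
	return container_of(dev, struct mock_qda_dev, drm_dev);
}
```

Given a struct mock_qda_dev q, calling mock_qda_dev_from_drm(&q.drm_dev) yields &q again, which is the same round trip qda_open() relies on.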
+/* Device allocation (uses devm_drm_dev_alloc internally) */ +struct qda_dev *qda_alloc_device(struct device *dev);
+/* Core device lifecycle */ +int qda_register_device(struct qda_dev *qdev); +void qda_unregister_device(struct qda_dev *qdev);
+#endif /* __QDA_DRV_H__ */ diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c new file mode 100644 index 000000000000..6eaf1b145f8a --- /dev/null +++ b/drivers/accel/qda/qda_rpmsg.c @@ -0,0 +1,99 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. +#include <linux/module.h> +#include <linux/of.h> +#include <linux/rpmsg.h> +#include <drm/drm_print.h>
+#include "qda_drv.h" +#include "qda_rpmsg.h"
+static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
Use the prefix uniformly.
+{
- struct qda_dev *qdev;
- qdev = qda_alloc_device(&rpdev->dev);
- if (IS_ERR(qdev))
    return qdev;
- qdev->dev = &rpdev->dev;
- qdev->rpdev = rpdev;
- dev_set_drvdata(&rpdev->dev, qdev);
- return qdev;
+}
+static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len,
    void *priv, u32 src)
+{
- /* Placeholder: responses will be dispatched here */
- return 0;
+}
+static void qda_rpmsg_remove(struct rpmsg_device *rpdev) +{
- struct qda_dev *qdev = dev_get_drvdata(&rpdev->dev);
- drm_dev_unplug(&qdev->drm_dev);
- qdev->rpdev = NULL;
- qda_unregister_device(qdev);
- dev_info(qdev->dev, "RPMsg device removed\n");
Drop the spamming. And useless (where it is useless) drm_dbg() / dev_dbg() spamming too.
+}
+static int qda_rpmsg_probe(struct rpmsg_device *rpdev) +{
- struct qda_dev *qdev;
- const char *label;
- int ret;
- dev_dbg(&rpdev->dev, "QDA RPMsg probe starting\n");
- qdev = alloc_and_init_qdev(rpdev);
- if (IS_ERR(qdev))
    return PTR_ERR(qdev);
- ret = of_property_read_string(rpdev->dev.of_node, "label", &label);
- if (ret) {
    dev_err(qdev->dev, "Missing 'label' property in DT node: %d\n", ret);
    return ret;
- }
- qdev->dsp_name = label;
Why not just of_property_read_string(...., &qdev->dsp_name)?
- ret = qda_register_device(qdev);
return qda_register_device();
- if (ret)
    return ret;
- drm_info(&qdev->drm_dev, "QDA RPMsg probe complete for %s\n", qdev->dsp_name);
- return 0;
+}
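Taken together, the comments on the probe path above would give something like the following. This is an untested kernel fragment sketched from the quoted code, not a replacement patch. It keeps the existing alloc_and_init_qdev() name and the existing error message, and it assumes nothing else in probe needs the local label variable.

```c
static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
{
	struct qda_dev *qdev;
	int ret;

	qdev = alloc_and_init_qdev(rpdev);
	if (IS_ERR(qdev))
		return PTR_ERR(qdev);

	/* Read the DT string straight into the device structure. */
	ret = of_property_read_string(rpdev->dev.of_node, "label",
				      &qdev->dsp_name);
	if (ret) {
		dev_err(qdev->dev, "Missing 'label' property in DT node: %d\n", ret);
		return ret;
	}

	/* No trailing info print; the registration result is the probe result. */
	return qda_register_device(qdev);
}
```

Since dsp_name is declared as const char *, passing its address matches the const char ** out-parameter of of_property_read_string() without a cast.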
+static const struct of_device_id qda_rpmsg_id_table[] = {
- { .compatible = "qcom,fastrpc" },
- {},
+}; +MODULE_DEVICE_TABLE(of, qda_rpmsg_id_table);
+static struct rpmsg_driver qda_rpmsg_driver = {
- .probe = qda_rpmsg_probe,
- .remove = qda_rpmsg_remove,
- .callback = qda_rpmsg_cb,
- .drv = {
    .name = "qcom,fastrpc",
    .of_match_table = qda_rpmsg_id_table,
- },
+};
+int qda_rpmsg_register(void) +{
- int ret = register_rpmsg_driver(&qda_rpmsg_driver);
- if (ret)
    pr_err("qda: Failed to register RPMsg driver: %d\n", ret);
- return ret;
+}
+void qda_rpmsg_unregister(void) +{
- unregister_rpmsg_driver(&qda_rpmsg_driver);
+}
Just use module_rpmsg_driver(), drop all the wrappers and module_init() / exit().
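A sketch of what that could look like at the bottom of qda_rpmsg.c, assuming no other module-level setup is needed at this point in the series. module_rpmsg_driver() is the helper from <linux/rpmsg.h> that generates the module_init()/module_exit() pair around register_rpmsg_driver() and unregister_rpmsg_driver(). Untested fragment:

```c
static struct rpmsg_driver qda_rpmsg_driver = {
	.probe = qda_rpmsg_probe,
	.remove = qda_rpmsg_remove,
	.callback = qda_rpmsg_cb,
	.drv = {
		.name = "qcom,fastrpc",
		.of_match_table = qda_rpmsg_id_table,
	},
};
/* Replaces qda_rpmsg_register()/qda_rpmsg_unregister() and the init/exit pair. */
module_rpmsg_driver(qda_rpmsg_driver);
```

With this in place, the two wrapper functions, their declarations in qda_rpmsg.h, and qda_core_init()/qda_core_exit() in qda_drv.c would no longer be needed; the MODULE_* tags stay as they are.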
diff --git a/drivers/accel/qda/qda_rpmsg.h b/drivers/accel/qda/qda_rpmsg.h
new file mode 100644
index 000000000000..5229d834b34b
--- /dev/null
+++ b/drivers/accel/qda/qda_rpmsg.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_RPMSG_H__
+#define __QDA_RPMSG_H__
+
+/* RPMsg transport layer registration */
+int qda_rpmsg_register(void);
+void qda_rpmsg_unregister(void);
+
+#endif /* __QDA_RPMSG_H__ */
-- 2.34.1
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Introduce a custom virtual bus (qda-compute-cb) for managing IOMMU context bank (CB) devices used by the QDA driver.
IOMMU context banks are synthetic constructs — they are not real platform devices and do not appear as children of a platform bus node in the device tree. Using a platform driver to represent them was therefore incorrect and introduced a probe-ordering race: device nodes were created before the RPMsg channel resources were fully initialized, and because probe runs asynchronously, user-space could open a CB device and attempt to start a session before the underlying transport was ready.
The qda-compute-cb bus solves this by allowing the main QDA driver to create CB devices explicitly and under its own control, making their lifetime strictly subordinate to the parent qda_dev. The bus provides a dma_configure callback that calls of_dma_configure() so that each CB device gets its own IOMMU domain derived from its device-tree node, enabling per-session memory isolation.
The bus type and the CB device constructor (create_qda_cb_device) are exported for use by the QDA memory manager.
A hidden Kconfig symbol (DRM_ACCEL_QDA_COMPUTE_BUS) is introduced and automatically selected by DRM_ACCEL_QDA so that the bus initialisation runs via postcore_initcall before any QDA device probes.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/accel/Makefile              |  1 +
 drivers/accel/qda/Kconfig           |  4 +++
 drivers/accel/qda/Makefile          |  2 ++
 drivers/accel/qda/qda_compute_bus.c | 68 +++++++++++++++++++++++++++++++++++++
 include/linux/qda_compute_bus.h     | 32 +++++++++++++++++
 5 files changed, 107 insertions(+)
diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile index 58c08dd5f389..9ed843cd293f 100644 --- a/drivers/accel/Makefile +++ b/drivers/accel/Makefile @@ -6,4 +6,5 @@ obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/ obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/ obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/ obj-$(CONFIG_DRM_ACCEL_QDA) += qda/ +obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda/ obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/ \ No newline at end of file diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig index 484d21ff1b55..2a61a4dda054 100644 --- a/drivers/accel/qda/Kconfig +++ b/drivers/accel/qda/Kconfig @@ -3,11 +3,15 @@ # Qualcomm DSP accelerator driver #
+config DRM_ACCEL_QDA_COMPUTE_BUS + bool + config DRM_ACCEL_QDA tristate "Qualcomm DSP accelerator" depends on DRM_ACCEL depends on ARCH_QCOM || COMPILE_TEST depends on RPMSG + select DRM_ACCEL_QDA_COMPUTE_BUS help Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs. This driver provides a standardized interface for offloading computational diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile index dbe809067a8b..424176f652a5 100644 --- a/drivers/accel/qda/Makefile +++ b/drivers/accel/qda/Makefile @@ -8,3 +8,5 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o qda-y := \ qda_drv.o \ qda_rpmsg.o + +obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o diff --git a/drivers/accel/qda/qda_compute_bus.c b/drivers/accel/qda/qda_compute_bus.c new file mode 100644 index 000000000000..c59d977e924d --- /dev/null +++ b/drivers/accel/qda/qda_compute_bus.c @@ -0,0 +1,68 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. 
+#include <linux/device.h> +#include <linux/init.h> +#include <linux/of.h> +#include <linux/of_device.h> +#include <linux/qda_compute_bus.h> +#include <linux/slab.h> + +static int qda_cb_bus_dma_configure(struct device *dev) +{ + return of_dma_configure(dev, dev->of_node, true); +} + +const struct bus_type qda_cb_bus_type = { + .name = "qda-compute-cb", + .dma_configure = qda_cb_bus_dma_configure, +}; +EXPORT_SYMBOL_GPL(qda_cb_bus_type); + +static void release_qda_cb_device(struct device *dev) +{ + of_node_put(dev->of_node); + kfree(dev); +} + +struct device *create_qda_cb_device(struct device *parent_device, const char *name, + u64 dma_mask, struct device_node *of_node) +{ + struct device *dev; + int ret; + + dev = kzalloc_obj(*dev); + if (!dev) + return ERR_PTR(-ENOMEM); + + dev->release = release_qda_cb_device; + dev->bus = &qda_cb_bus_type; + dev->parent = parent_device; + dev->coherent_dma_mask = dma_mask; + dev->dma_mask = &dev->coherent_dma_mask; + dev->of_node = of_node_get(of_node); + + dev_set_name(dev, "%s", name); + + ret = device_register(dev); + if (ret) { + put_device(dev); + return ERR_PTR(ret); + } + + return dev; +} +EXPORT_SYMBOL_GPL(create_qda_cb_device); + +static int __init qda_cb_bus_init(void) +{ + int err; + + err = bus_register(&qda_cb_bus_type); + if (err < 0) { + pr_err("qda-compute-cb bus registration failed: %d\n", err); + return err; + } + return 0; +} + +postcore_initcall(qda_cb_bus_init); diff --git a/include/linux/qda_compute_bus.h b/include/linux/qda_compute_bus.h new file mode 100644 index 000000000000..90bf248c7285 --- /dev/null +++ b/include/linux/qda_compute_bus.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. 
+ */ + +#ifndef __QDA_COMPUTE_BUS_H__ +#define __QDA_COMPUTE_BUS_H__ + +#include <linux/device.h> + +/* + * Custom bus type for QDA compute context bank (CB) devices + * + * This bus type is used for manually created CB devices that represent + * IOMMU context banks. The custom bus allows proper IOMMU configuration + * and device management for these virtual devices. + */ +#ifdef CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS +extern const struct bus_type qda_cb_bus_type; + +struct device *create_qda_cb_device(struct device *parent_device, const char *name, + u64 dma_mask, struct device_node *of_node); +#else +static inline struct device *create_qda_cb_device(struct device *parent_device, + const char *name, u64 dma_mask, + struct device_node *of_node) +{ + return ERR_PTR(-ENODEV); +} +#endif + +#endif /* __QDA_COMPUTE_BUS_H__ */
On Tue, May 19, 2026 at 11:45:54AM +0530, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Introduce a custom virtual bus (qda-compute-cb) for managing IOMMU context bank (CB) devices used by the QDA driver.
IOMMU context banks are synthetic constructs — they are not real platform devices and do not appear as children of a platform bus node in the device tree. Using a platform driver to represent them was therefore incorrect and introduced a probe-ordering race: device nodes were created before the RPMsg channel resources were fully initialized, and because probe runs asynchronously, user-space could open a CB device and attempt to start a session before the underlying transport was ready.
The qda-compute-cb bus solves this by allowing the main QDA driver to create CB devices explicitly and under its own control, making their lifetime strictly subordinate to the parent qda_dev. The bus provides a dma_configure callback that calls of_dma_configure() so that each CB device gets its own IOMMU domain derived from its device-tree node, enabling per-session memory isolation.
The bus type and the CB device constructor (create_qda_cb_device) are exported for use by the QDA memory manager.
A hidden Kconfig symbol (DRM_ACCEL_QDA_COMPUTE_BUS) is introduced and automatically selected by DRM_ACCEL_QDA so that the bus initialisation runs via postcore_initcall before any QDA device probes.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
 drivers/accel/Makefile              |  1 +
 drivers/accel/qda/Kconfig           |  4 +++
 drivers/accel/qda/Makefile          |  2 ++
 drivers/accel/qda/qda_compute_bus.c | 68 +++++++++++++++++++++++++++++++++++++
 include/linux/qda_compute_bus.h     | 32 +++++++++++++++++
 5 files changed, 107 insertions(+)
diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile index 58c08dd5f389..9ed843cd293f 100644 --- a/drivers/accel/Makefile +++ b/drivers/accel/Makefile @@ -6,4 +6,5 @@ obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/ obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/ obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/ obj-$(CONFIG_DRM_ACCEL_QDA) += qda/ +obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda/
Ugh. The previous line should be enough (but don't trust me).
obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/ \ No newline at end of file diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig index 484d21ff1b55..2a61a4dda054 100644 --- a/drivers/accel/qda/Kconfig +++ b/drivers/accel/qda/Kconfig @@ -3,11 +3,15 @@ # Qualcomm DSP accelerator driver # +config DRM_ACCEL_QDA_COMPUTE_BUS
- bool
config DRM_ACCEL_QDA tristate "Qualcomm DSP accelerator" depends on DRM_ACCEL depends on ARCH_QCOM || COMPILE_TEST depends on RPMSG
- select DRM_ACCEL_QDA_COMPUTE_BUS help Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs. This driver provides a standardized interface for offloading computational
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile index dbe809067a8b..424176f652a5 100644 --- a/drivers/accel/qda/Makefile +++ b/drivers/accel/qda/Makefile @@ -8,3 +8,5 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o qda-y := \ qda_drv.o \ qda_rpmsg.o
+obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o diff --git a/drivers/accel/qda/qda_compute_bus.c b/drivers/accel/qda/qda_compute_bus.c new file mode 100644 index 000000000000..c59d977e924d --- /dev/null +++ b/drivers/accel/qda/qda_compute_bus.c @@ -0,0 +1,68 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. +#include <linux/device.h> +#include <linux/init.h> +#include <linux/of.h> +#include <linux/of_device.h> +#include <linux/qda_compute_bus.h> +#include <linux/slab.h>
+static int qda_cb_bus_dma_configure(struct device *dev) +{
- return of_dma_configure(dev, dev->of_node, true);
+}
+const struct bus_type qda_cb_bus_type = {
- .name = "qda-compute-cb",
- .dma_configure = qda_cb_bus_dma_configure,
+}; +EXPORT_SYMBOL_GPL(qda_cb_bus_type);
+static void release_qda_cb_device(struct device *dev) +{
- of_node_put(dev->of_node);
- kfree(dev);
+}
+struct device *create_qda_cb_device(struct device *parent_device, const char *name,
    u64 dma_mask, struct device_node *of_node)
+{
- struct device *dev;
- int ret;
- dev = kzalloc_obj(*dev);
- if (!dev)
    return ERR_PTR(-ENOMEM);
- dev->release = release_qda_cb_device;
- dev->bus = &qda_cb_bus_type;
- dev->parent = parent_device;
- dev->coherent_dma_mask = dma_mask;
- dev->dma_mask = &dev->coherent_dma_mask;
- dev->of_node = of_node_get(of_node);
- dev_set_name(dev, "%s", name);
- ret = device_register(dev);
- if (ret) {
    put_device(dev);
    return ERR_PTR(ret);
- }
- return dev;
+} +EXPORT_SYMBOL_GPL(create_qda_cb_device);
+static int __init qda_cb_bus_init(void) +{
- int err;
- err = bus_register(&qda_cb_bus_type);
- if (err < 0) {
    pr_err("qda-compute-cb bus registration failed: %d\n", err);
    return err;
- }
- return 0;
+}
+postcore_initcall(qda_cb_bus_init); diff --git a/include/linux/qda_compute_bus.h b/include/linux/qda_compute_bus.h new file mode 100644 index 000000000000..90bf248c7285 --- /dev/null +++ b/include/linux/qda_compute_bus.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/*
- Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
- */
+#ifndef __QDA_COMPUTE_BUS_H__ +#define __QDA_COMPUTE_BUS_H__
+#include <linux/device.h>
+/*
- Custom bus type for QDA compute context bank (CB) devices
- This bus type is used for manually created CB devices that represent
- IOMMU context banks. The custom bus allows proper IOMMU configuration
- and device management for these virtual devices.
- */
+#ifdef CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS +extern const struct bus_type qda_cb_bus_type;
+struct device *create_qda_cb_device(struct device *parent_device, const char *name,
    u64 dma_mask, struct device_node *of_node);
+#else
+static inline struct device *create_qda_cb_device(struct device *parent_device,
    const char *name, u64 dma_mask,
    struct device_node *of_node)
+{
- return ERR_PTR(-ENODEV);
+} +#endif
+#endif /* __QDA_COMPUTE_BUS_H__ */
-- 2.34.1
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Register the QDA compute context bank bus (qda-compute-cb) with the IOMMU subsystem by adding it to the iommu_buses[] array.
The QDA driver creates synthetic devices on this bus to represent IOMMU context banks (CBs). Each CB device needs its own IOMMU domain so that the DSP memory manager can enforce per-session address space isolation. Without this registration, the IOMMU subsystem does not probe CB devices for IOMMU groups and of_dma_configure() in the bus dma_configure callback has no IOMMU domain to attach to.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/iommu/iommu.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index e8f13dcebbde..7d39050a8848 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -26,6 +26,7 @@
 #include <linux/bitops.h>
 #include <linux/platform_device.h>
 #include <linux/property.h>
+#include <linux/qda_compute_bus.h>
 #include <linux/fsl/mc.h>
 #include <linux/module.h>
 #include <linux/cc_platform.h>
@@ -200,6 +201,9 @@ static const struct bus_type * const iommu_buses[] = {
 #ifdef CONFIG_CDX_BUS
 	&cdx_bus_type,
 #endif
+#ifdef CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS
+	&qda_cb_bus_type,
+#endif
 };
/*
On Tue, May 19, 2026 at 11:45:55AM +0530, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Register the QDA compute context bank bus (qda-compute-cb) with the IOMMU subsystem by adding it to the iommu_buses[] array.
The QDA driver creates synthetic devices on this bus to represent IOMMU context banks (CBs). Each CB device needs its own IOMMU domain so that the DSP memory manager can enforce per-session address space isolation. Without this registration, the IOMMU subsystem does not probe CB devices for IOMMU groups and of_dma_configure() in the bus dma_configure callback has no IOMMU domain to attach to.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/iommu/iommu.c | 4 ++++
 1 file changed, 4 insertions(+)
Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Introduce the CB (compute context bank) device management layer for the QDA driver. Each DSP domain node in the device tree may contain child nodes with compatible "qcom,fastrpc-compute-cb", each representing one IOMMU context bank. The driver enumerates those child nodes during RPMsg probe and creates a corresponding device on the qda-compute-cb bus for each one.
The CB devices are created via create_qda_cb_device(), which registers them on the qda-compute-cb bus so that the IOMMU subsystem assigns each device its own IOMMU domain, enabling per-session address space isolation for DSP buffer mapping.
The new qda_cb.c file provides two functions:
  qda_create_cb_device()
    Reads the "reg" property from the DT child node to obtain the stream ID, constructs a unique device name of the form "qda-cb-<dsp>-<sid>", and registers the device on the compute bus. A qda_cb_dev entry is allocated and appended to qdev->cb_devs so that the list can be walked during teardown.

  qda_destroy_cb_device()
    Removes the device from its IOMMU group before calling device_unregister(), ensuring the IOMMU domain is released cleanly.
CB devices are populated before the DRM device is registered and destroyed before it is unplugged, so no DRM operation can race with CB teardown. On probe failure after population, qda_cb_unpopulate() is called to clean up any CBs that were successfully created before the error.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/accel/qda/Makefile    |  1 +
 drivers/accel/qda/qda_cb.c    | 99 +++++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_cb.h    | 32 ++++++++++++++
 drivers/accel/qda/qda_drv.c   |  1 +
 drivers/accel/qda/qda_drv.h   |  3 ++
 drivers/accel/qda/qda_rpmsg.c | 12 +++++-
 6 files changed, 147 insertions(+), 1 deletion(-)

diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 424176f652a5..143c9e4e789e 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -6,6 +6,7 @@
 obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o

 qda-y := \
+	qda_cb.o \
 	qda_drv.o \
 	qda_rpmsg.o

diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
new file mode 100644
index 000000000000..77caf8438c67
--- /dev/null
+++ b/drivers/accel/qda/qda_cb.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <linux/dma-mapping.h>
+#include <linux/device.h>
+#include <linux/of.h>
+#include <linux/iommu.h>
+#include <linux/qda_compute_bus.h>
+#include <linux/slab.h>
+#include <drm/drm_print.h>
+#include "qda_drv.h"
+#include "qda_cb.h"
+
+int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node)
+{
+	struct device *cb_dev;
+	u32 sid = 0;
+	char name[64];
+	struct qda_cb_dev *entry;
+
+	drm_dbg_driver(&qdev->drm_dev, "Creating CB device for node: %s\n", cb_node->name);
+
+	of_property_read_u32(cb_node, "reg", &sid);
+
+	snprintf(name, sizeof(name), "qda-cb-%s-%u", qdev->dsp_name, sid);
+
+	cb_dev = create_qda_cb_device(qdev->dev, name, DMA_BIT_MASK(32), cb_node);
+	if (IS_ERR(cb_dev)) {
+		drm_err(&qdev->drm_dev, "Failed to create CB device for SID %u: %ld\n",
+			sid, PTR_ERR(cb_dev));
+		return PTR_ERR(cb_dev);
+	}
+
+	entry = kzalloc_obj(*entry);
+	if (!entry) {
+		device_unregister(cb_dev);
+		return -ENOMEM;
+	}
+
+	entry->dev = cb_dev;
+	list_add_tail(&entry->node, &qdev->cb_devs);
+
+	drm_dbg_driver(&qdev->drm_dev, "Successfully created CB device for SID %u\n", sid);
+	return 0;
+}
+
+void qda_cb_unpopulate(struct qda_dev *qdev)
+{
+	struct qda_cb_dev *entry, *tmp;
+
+	list_for_each_entry_safe(entry, tmp, &qdev->cb_devs, node) {
+		list_del(&entry->node);
+		qda_destroy_cb_device(entry->dev);
+		kfree(entry);
+	}
+}
+
+int qda_cb_populate(struct qda_dev *qdev, struct device_node *parent_node)
+{
+	struct device_node *child;
+	int count = 0, success = 0;
+
+	for_each_child_of_node(parent_node, child) {
+		if (of_device_is_compatible(child, "qcom,fastrpc-compute-cb")) {
+			count++;
+			if (qda_create_cb_device(qdev, child) == 0) {
+				success++;
+				dev_dbg(qdev->dev, "Created CB device for node: %s\n",
+					child->name);
+			} else {
+				dev_err(qdev->dev, "Failed to create CB device for: %s\n",
+					child->name);
+			}
+		}
+	}
+	if (count == 0)
+		return 0;
+	return success > 0 ? 0 : -ENODEV;
+}
+
+void qda_destroy_cb_device(struct device *cb_dev)
+{
+	struct iommu_group *group;
+
+	if (!cb_dev) {
+		pr_debug("qda: NULL CB device passed to destroy\n");
+		return;
+	}
+
+	dev_dbg(cb_dev, "Destroying CB device %s\n", dev_name(cb_dev));
+
+	group = iommu_group_get(cb_dev);
+	if (group) {
+		dev_dbg(cb_dev, "Removing %s from IOMMU group\n", dev_name(cb_dev));
+		iommu_group_remove_device(cb_dev);
+		iommu_group_put(group);
+	}
+
+	device_unregister(cb_dev);
+}
diff --git a/drivers/accel/qda/qda_cb.h b/drivers/accel/qda/qda_cb.h
new file mode 100644
index 000000000000..bd83d64fa425
--- /dev/null
+++ b/drivers/accel/qda/qda_cb.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_CB_H__
+#define __QDA_CB_H__
+
+#include <linux/device.h>
+#include <linux/list.h>
+#include <linux/of.h>
+#include "qda_drv.h"
+
+struct qda_cb_dev {
+	struct list_head node;
+	struct device *dev;
+};
+
+/*
+ * Compute bus (CB) device management
+ */
+int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node);
+void qda_destroy_cb_device(struct device *cb_dev);
+
+/*
+ * Transport-agnostic CB device population/teardown.
+ * Called by any transport layer (RPMsg, etc.) during probe/remove.
+ */
+int qda_cb_populate(struct qda_dev *qdev, struct device_node *parent_node);
+void qda_cb_unpopulate(struct qda_dev *qdev);
+
+#endif /* __QDA_CB_H__ */
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 1c1bab68d445..6c20d6a2fc47 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -53,6 +53,7 @@ struct qda_dev *qda_alloc_device(struct device *dev)
 	if (IS_ERR(qdev))
 		return ERR_CAST(qdev);

+	INIT_LIST_HEAD(&qdev->cb_devs);
 	return qdev;
 }

diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index 7ba2ef19a411..2715f378775d 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -7,6 +7,7 @@
 #define __QDA_DRV_H__

 #include <linux/device.h>
+#include <linux/list.h>
 #include <linux/rpmsg.h>
 #include <linux/types.h>
 #include <drm/drm_device.h>
@@ -37,6 +38,8 @@ struct qda_dev {
 	struct rpmsg_device *rpdev;
 	/** @dev: Underlying Linux device */
 	struct device *dev;
+	/** @cb_devs: Compute context-bank (CB) child devices */
+	struct list_head cb_devs;
 	/** @dsp_name: Name of the DSP domain (e.g. "cdsp", "adsp") */
 	const char *dsp_name;
 };
diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
index 6eaf1b145f8a..afd9e851d00e 100644
--- a/drivers/accel/qda/qda_rpmsg.c
+++ b/drivers/accel/qda/qda_rpmsg.c
@@ -5,6 +5,7 @@
 #include <linux/rpmsg.h>
 #include <drm/drm_print.h>

+#include "qda_cb.h"
 #include "qda_drv.h"
 #include "qda_rpmsg.h"

@@ -34,6 +35,7 @@ static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
 {
 	struct qda_dev *qdev = dev_get_drvdata(&rpdev->dev);

+	qda_cb_unpopulate(qdev);
 	drm_dev_unplug(&qdev->drm_dev);
 	qdev->rpdev = NULL;
 	qda_unregister_device(qdev);
@@ -59,9 +61,17 @@ static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
 	}
 	qdev->dsp_name = label;

+	ret = qda_cb_populate(qdev, rpdev->dev.of_node);
+	if (ret) {
+		dev_err(qdev->dev, "Failed to populate child devices: %d\n", ret);
+		return ret;
+	}
+
 	ret = qda_register_device(qdev);
-	if (ret)
+	if (ret) {
+		qda_cb_unpopulate(qdev);
 		return ret;
+	}

 	drm_info(&qdev->drm_dev, "QDA RPMsg probe complete for %s\n", qdev->dsp_name);
 	return 0;
On Tue, May 19, 2026 at 11:45:56AM +0530, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Introduce the CB (compute context bank) device management layer for the QDA driver. Each DSP domain node in the device tree may contain child nodes with compatible "qcom,fastrpc-compute-cb", each representing one IOMMU context bank. The driver enumerates those child nodes during RPMsg probe and creates a corresponding device on the qda-compute-cb bus for each one.
The CB devices are created via create_qda_cb_device(), which registers them on the qda-compute-cb bus so that the IOMMU subsystem assigns each device its own IOMMU domain, enabling per-session address space isolation for DSP buffer mapping.
The new qda_cb.c file provides two functions:
  qda_create_cb_device()
    Reads the "reg" property from the DT child node to obtain the stream ID, constructs a unique device name of the form "qda-cb-<dsp>-<sid>", and registers the device on the compute bus. A qda_cb_dev entry is allocated and appended to qdev->cb_devs so that the list can be walked during teardown.

  qda_destroy_cb_device()
    Removes the device from its IOMMU group before calling device_unregister(), ensuring the IOMMU domain is released cleanly.
CB devices are populated before the DRM device is registered and destroyed before it is unplugged, so no DRM operation can race with CB teardown. On probe failure after population, qda_cb_unpopulate() is called to clean up any CBs that were successfully created before the error.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/accel/qda/Makefile    |  1 +
 drivers/accel/qda/qda_cb.c    | 99 +++++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_cb.h    | 32 ++++++++++++++
 drivers/accel/qda/qda_drv.c   |  1 +
 drivers/accel/qda/qda_drv.h   |  3 ++
 drivers/accel/qda/qda_rpmsg.c | 12 +++++-
 6 files changed, 147 insertions(+), 1 deletion(-)

diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 424176f652a5..143c9e4e789e 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -6,6 +6,7 @@
 obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o

 qda-y := \
+	qda_cb.o \
 	qda_drv.o \
 	qda_rpmsg.o

diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
new file mode 100644
index 000000000000..77caf8438c67
--- /dev/null
+++ b/drivers/accel/qda/qda_cb.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <linux/dma-mapping.h>
+#include <linux/device.h>
+#include <linux/of.h>
+#include <linux/iommu.h>
+#include <linux/qda_compute_bus.h>
+#include <linux/slab.h>
+#include <drm/drm_print.h>
+#include "qda_drv.h"
+#include "qda_cb.h"
+
+int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node)
+{
+	struct device *cb_dev;
+	u32 sid = 0;
+	char name[64];
+	struct qda_cb_dev *entry;
+
+	drm_dbg_driver(&qdev->drm_dev, "Creating CB device for node: %s\n", cb_node->name);
+
+	of_property_read_u32(cb_node, "reg", &sid);
+
+	snprintf(name, sizeof(name), "qda-cb-%s-%u", qdev->dsp_name, sid);
+
+	cb_dev = create_qda_cb_device(qdev->dev, name, DMA_BIT_MASK(32), cb_node);

Wrong prefix. Pass the name format and the params to this function. Use kasprintf in it.
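
As a rough illustration of the suggestion above, the following userspace sketch builds the device name inside the creation helper from a printf-style format and its arguments, so callers no longer fill a fixed char name[64] buffer. Everything in it is an assumption made for illustration: struct sketch_cb_device, sketch_cb_device_create(), sketch_cb_device_destroy() and sketch_vasprintf() are hypothetical names, and sketch_vasprintf() only stands in for the kernel's kvasprintf()/kasprintf(). It is not the QDA implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a CB device; only the fields the sketch needs. */
struct sketch_cb_device {
	char *name;                  /* heap-allocated, owned by the device */
	unsigned long long dma_mask;
};

/* Userspace analogue of kvasprintf(): size the buffer, then format into it. */
static char *sketch_vasprintf(const char *fmt, va_list ap)
{
	va_list ap_len;
	char *buf;
	int len;

	va_copy(ap_len, ap);
	len = vsnprintf(NULL, 0, fmt, ap_len);
	va_end(ap_len);
	if (len < 0)
		return NULL;

	buf = malloc((size_t)len + 1);
	if (!buf)
		return NULL;

	vsnprintf(buf, (size_t)len + 1, fmt, ap);
	return buf;
}

/* The helper owns name construction; callers pass only format and params. */
static struct sketch_cb_device *
sketch_cb_device_create(unsigned long long dma_mask, const char *fmt, ...)
{
	struct sketch_cb_device *cb;
	va_list ap;

	cb = calloc(1, sizeof(*cb));
	if (!cb)
		return NULL;

	va_start(ap, fmt);
	cb->name = sketch_vasprintf(fmt, ap);
	va_end(ap);
	if (!cb->name) {
		free(cb);
		return NULL;
	}

	cb->dma_mask = dma_mask;
	return cb;
}

/* Release the name together with the device that owns it. */
static void sketch_cb_device_destroy(struct sketch_cb_device *cb)
{
	free(cb->name);
	free(cb);
}
```

A caller would then pass the format directly, for example sketch_cb_device_create(mask, "qda-cb-%s-%u", dsp_name, sid), and the name can never be truncated by a caller-side buffer.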

+	if (IS_ERR(cb_dev)) {
+		drm_err(&qdev->drm_dev, "Failed to create CB device for SID %u: %ld\n",
+			sid, PTR_ERR(cb_dev));
+		return PTR_ERR(cb_dev);
+	}
+
+	entry = kzalloc_obj(*entry);
+	if (!entry) {
+		device_unregister(cb_dev);
+		return -ENOMEM;
+	}
+
+	entry->dev = cb_dev;
+	list_add_tail(&entry->node, &qdev->cb_devs);
+
+	drm_dbg_driver(&qdev->drm_dev, "Successfully created CB device for SID %u\n", sid);
+	return 0;
+}
+
+void qda_cb_unpopulate(struct qda_dev *qdev)
+{
+	struct qda_cb_dev *entry, *tmp;
+
+	list_for_each_entry_safe(entry, tmp, &qdev->cb_devs, node) {
+		list_del(&entry->node);
+		qda_destroy_cb_device(entry->dev);
+		kfree(entry);
+	}
+}
+
+int qda_cb_populate(struct qda_dev *qdev, struct device_node *parent_node)
+{
+	struct device_node *child;
+	int count = 0, success = 0;
+
+	for_each_child_of_node(parent_node, child) {
+		if (of_device_is_compatible(child, "qcom,fastrpc-compute-cb")) {
+			count++;
+			if (qda_create_cb_device(qdev, child) == 0) {
+				success++;
+				dev_dbg(qdev->dev, "Created CB device for node: %s\n",
+					child->name);

Stop counting successes.

+			} else {
+				dev_err(qdev->dev, "Failed to create CB device for: %s\n",
+					child->name);

Unwind, return error.

+			}
+		}
+	}
+	if (count == 0)
+		return 0;
+	return success > 0 ? 0 : -ENODEV;
+}
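
The two comments above ask for populate to fail as a whole when any single CB cannot be created, rather than counting partial successes. A minimal userspace sketch of that control flow follows. The sketch_* names, the fixed-size created[] array and the fail_on test hook are all hypothetical and exist only to make the behaviour observable; the real driver would create and unregister devices instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_CB 8

/* Hypothetical device state: which CB IDs have been created so far. */
struct sketch_qdev {
	int created[SKETCH_MAX_CB];
	size_t n_created;
	int fail_on;            /* test hook: ID whose creation fails, or -1 */
};

/* Create one CB; fails for the ID selected by the test hook. */
static int sketch_create_cb(struct sketch_qdev *q, int id)
{
	if (id == q->fail_on)
		return -EINVAL;
	if (q->n_created == SKETCH_MAX_CB)
		return -ENOSPC;

	q->created[q->n_created++] = id;
	return 0;
}

/* Tear down every CB created so far, newest first. */
static void sketch_unpopulate(struct sketch_qdev *q)
{
	while (q->n_created > 0)
		q->n_created--;
}

/* All-or-nothing populate: the first failure unwinds and is returned. */
static int sketch_populate(struct sketch_qdev *q, const int *ids, size_t n)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = sketch_create_cb(q, ids[i]);
		if (ret) {
			sketch_unpopulate(q);
			return ret;
		}
	}

	return 0;
}
```

With this shape there is no count or success variable to maintain, and the caller sees the original error code instead of a generic -ENODEV.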
+
+void qda_destroy_cb_device(struct device *cb_dev)
+{
+	struct iommu_group *group;
+
+	if (!cb_dev) {

How can it be?

+		pr_debug("qda: NULL CB device passed to destroy\n");
+		return;
+	}
+
+	dev_dbg(cb_dev, "Destroying CB device %s\n", dev_name(cb_dev));
+
+	group = iommu_group_get(cb_dev);
+	if (group) {
+		dev_dbg(cb_dev, "Removing %s from IOMMU group\n", dev_name(cb_dev));

Be uniform. It's either drm_dbg_foo() or dev_dbg() all over the place. Don't mix them.

+		iommu_group_remove_device(cb_dev);
+		iommu_group_put(group);
+	}
+
+	device_unregister(cb_dev);
+}
@@ -59,9 +61,17 @@ static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
 	}
 	qdev->dsp_name = label;

+	ret = qda_cb_populate(qdev, rpdev->dev.of_node);
+	if (ret) {
+		dev_err(qdev->dev, "Failed to populate child devices: %d\n", ret);
+		return ret;
+	}
+
 	ret = qda_register_device(qdev);
-	if (ret)
+	if (ret) {
+		qda_cb_unpopulate(qdev);
 		return ret;

Unwinding registration?

+	}

 	drm_info(&qdev->drm_dev, "QDA RPMsg probe complete for %s\n", qdev->dsp_name);
 	return 0;
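
A probe error path like the one questioned above is commonly written as a goto ladder that releases resources in the reverse order of acquisition. The userspace sketch below records each step as a letter so the ordering can be checked. It assumes a three-step probe (init, populate, register), which only loosely follows the sequence this series ends up with; struct sketch_trace, sketch_do(), sketch_probe() and the fail_at hook are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

enum sketch_step { STEP_INIT = 1, STEP_POPULATE, STEP_REGISTER };

/* Records the order of setup and teardown steps as single letters. */
struct sketch_trace {
	char log[16];
	size_t len;
	int fail_at;            /* test hook: step that fails, or 0 for none */
};

static void sketch_trace_add(struct sketch_trace *t, char tag)
{
	if (t->len < sizeof(t->log) - 1) {
		t->log[t->len++] = tag;
		t->log[t->len] = '\0';
	}
}

/* Perform one setup step, or fail without side effects. */
static int sketch_do(struct sketch_trace *t, int step, char tag)
{
	if (t->fail_at == step)
		return -EIO;

	sketch_trace_add(t, tag);
	return 0;
}

static int sketch_probe(struct sketch_trace *t)
{
	int ret;

	ret = sketch_do(t, STEP_INIT, 'I');
	if (ret)
		return ret;

	ret = sketch_do(t, STEP_POPULATE, 'P');
	if (ret)
		goto err_deinit;

	ret = sketch_do(t, STEP_REGISTER, 'R');
	if (ret)
		goto err_unpopulate;

	return 0;

	/* Labels fall through, so teardown mirrors setup in reverse. */
err_unpopulate:
	sketch_trace_add(t, 'p');
err_deinit:
	sketch_trace_add(t, 'i');
	return ret;
}
```

Each label undoes exactly one earlier step, so adding a new setup step means adding one label rather than editing every error branch.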
-- 2.34.1
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Introduce the QDA memory manager (qda_memory_manager) to track and manage the IOMMU devices that back each compute context bank (CB).
Each CB device registered on the qda-compute-cb bus is assigned a unique ID via an XArray and wrapped in a qda_iommu_device descriptor that records the device pointer and its stream ID. This registry allows the driver to look up the correct IOMMU domain for a given session when mapping DSP buffers.
The memory manager is initialised in qda_init_device() before CB devices are populated and torn down in qda_deinit_device() after they are destroyed, ensuring no dangling references remain in the XArray.
qda_cb.c is extended with qda_cb_setup_device(), which is called immediately after a CB device is registered on the bus. It allocates a qda_iommu_device, registers it with the memory manager, and stores it as the CB device's driver data so that qda_destroy_cb_device() can retrieve and unregister it during teardown.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/accel/qda/Makefile             |   1 +
 drivers/accel/qda/qda_cb.c             |  47 ++++++++++++++
 drivers/accel/qda/qda_drv.c            |  34 ++++++++++
 drivers/accel/qda/qda_drv.h            |   5 ++
 drivers/accel/qda/qda_memory_manager.c | 111 +++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_memory_manager.h |  49 +++++++++++++++
 drivers/accel/qda/qda_rpmsg.c          |   7 +++
 7 files changed, 254 insertions(+)

diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 143c9e4e789e..701fad5ffb50 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
 qda-y := \
 	qda_cb.o \
 	qda_drv.o \
+	qda_memory_manager.o \
 	qda_rpmsg.o

 obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
index 77caf8438c67..6d540bb0ec7b 100644
--- a/drivers/accel/qda/qda_cb.c
+++ b/drivers/accel/qda/qda_cb.c
@@ -8,11 +8,42 @@
 #include <linux/slab.h>
 #include <drm/drm_print.h>
 #include "qda_drv.h"
+#include "qda_memory_manager.h"
 #include "qda_cb.h"

+static int qda_cb_setup_device(struct qda_dev *qdev, struct device *cb_dev, u32 sid)
+{
+	struct qda_iommu_device *iommu_dev;
+	int rc;
+
+	drm_dbg_driver(&qdev->drm_dev, "Setting up CB device %s\n", dev_name(cb_dev));
+
+	iommu_dev = kzalloc_obj(*iommu_dev);
+	if (!iommu_dev)
+		return -ENOMEM;
+
+	iommu_dev->dev = cb_dev;
+	iommu_dev->qdev = qdev;
+	iommu_dev->sid = sid;
+
+	rc = qda_memory_manager_register_device(qdev->iommu_mgr, iommu_dev);
+	if (rc) {
+		drm_err(&qdev->drm_dev, "Failed to register IOMMU device: %d\n", rc);
+		kfree(iommu_dev);
+		return rc;
+	}
+
+	dev_set_drvdata(cb_dev, iommu_dev);
+
+	drm_dbg_driver(&qdev->drm_dev, "CB device setup complete - SID: %u\n", sid);
+
+	return 0;
+}
+
 int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node)
 {
 	struct device *cb_dev;
+	int ret;
 	u32 sid = 0;
 	char name[64];
 	struct qda_cb_dev *entry;
@@ -30,6 +61,13 @@ int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node)
 		return PTR_ERR(cb_dev);
 	}

+	ret = qda_cb_setup_device(qdev, cb_dev, sid);
+	if (ret) {
+		drm_err(&qdev->drm_dev, "CB device setup failed: %d\n", ret);
+		device_unregister(cb_dev);
+		return ret;
+	}
+
 	entry = kzalloc_obj(*entry);
 	if (!entry) {
 		device_unregister(cb_dev);
@@ -80,6 +118,7 @@ int qda_cb_populate(struct qda_dev *qdev, struct device_node *parent_node)
 void qda_destroy_cb_device(struct device *cb_dev)
 {
 	struct iommu_group *group;
+	struct qda_iommu_device *iommu_dev;

 	if (!cb_dev) {
 		pr_debug("qda: NULL CB device passed to destroy\n");
@@ -88,6 +127,14 @@ void qda_destroy_cb_device(struct device *cb_dev)

 	dev_dbg(cb_dev, "Destroying CB device %s\n", dev_name(cb_dev));

+	iommu_dev = dev_get_drvdata(cb_dev);
+	if (iommu_dev && iommu_dev->qdev && iommu_dev->qdev->iommu_mgr) {
+		dev_dbg(cb_dev, "Unregistering IOMMU device for %s\n",
+			dev_name(cb_dev));
+		qda_memory_manager_unregister_device(iommu_dev->qdev->iommu_mgr,
+						     iommu_dev);
+	}
+
 	group = iommu_group_get(cb_dev);
 	if (group) {
 		dev_dbg(cb_dev, "Removing %s from IOMMU group\n", dev_name(cb_dev));
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 6c20d6a2fc47..0ad5d9873d7e 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -57,6 +57,40 @@ struct qda_dev *qda_alloc_device(struct device *dev)
 	return qdev;
 }

+static void cleanup_memory_manager(struct qda_dev *qdev)
+{
+	if (qdev->iommu_mgr) {
+		qda_memory_manager_exit(qdev->iommu_mgr);
+		kfree(qdev->iommu_mgr);
+		qdev->iommu_mgr = NULL;
+	}
+}
+
+static int init_memory_manager(struct qda_dev *qdev)
+{
+	qdev->iommu_mgr = kzalloc_obj(*qdev->iommu_mgr);
+	if (!qdev->iommu_mgr)
+		return -ENOMEM;
+
+	return qda_memory_manager_init(qdev->iommu_mgr);
+}
+
+void qda_deinit_device(struct qda_dev *qdev)
+{
+	cleanup_memory_manager(qdev);
+}
+
+int qda_init_device(struct qda_dev *qdev)
+{
+	int ret;
+
+	ret = init_memory_manager(qdev);
+	if (ret)
+		drm_err(&qdev->drm_dev, "Failed to initialize memory manager: %d\n", ret);
+
+	return ret;
+}
+
 void qda_unregister_device(struct qda_dev *qdev)
 {
 	drm_dev_unregister(&qdev->drm_dev);
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index 2715f378775d..eb089e586b17 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -13,6 +13,7 @@
 #include <drm/drm_device.h>
 #include <drm/drm_drv.h>
 #include <drm/drm_file.h>
+#include "qda_memory_manager.h"

 /* Driver identification */
 #define QDA_DRIVER_NAME "qda"
@@ -40,6 +41,8 @@ struct qda_dev {
 	struct device *dev;
 	/** @cb_devs: Compute context-bank (CB) child devices */
 	struct list_head cb_devs;
+	/** @iommu_mgr: IOMMU/memory manager instance */
+	struct qda_memory_manager *iommu_mgr;
 	/** @dsp_name: Name of the DSP domain (e.g. "cdsp", "adsp") */
 	const char *dsp_name;
 };
@@ -59,6 +62,8 @@ static inline struct qda_dev *qda_dev_from_drm(struct drm_device *dev)
 struct qda_dev *qda_alloc_device(struct device *dev);

 /* Core device lifecycle */
+int qda_init_device(struct qda_dev *qdev);
+void qda_deinit_device(struct qda_dev *qdev);
 int qda_register_device(struct qda_dev *qdev);
 void qda_unregister_device(struct qda_dev *qdev);

diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
new file mode 100644
index 000000000000..00a9c0ae4224
--- /dev/null
+++ b/drivers/accel/qda/qda_memory_manager.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+
+#include <linux/refcount.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/xarray.h>
+#include <drm/drm_file.h>
+#include "qda_drv.h"
+#include "qda_memory_manager.h"
+
+static void cleanup_all_memory_devices(struct qda_memory_manager *mem_mgr)
+{
+	unsigned long index;
+	void *entry;
+
+	pr_debug("qda: Starting cleanup of all memory devices\n");
+
+	xa_for_each(&mem_mgr->device_xa, index, entry) {
+		struct qda_iommu_device *iommu_dev = entry;
+
+		pr_debug("qda: Cleaning up device id=%lu\n", index);
+
+		xa_erase(&mem_mgr->device_xa, index);
+		kfree(iommu_dev);
+	}
+
+	pr_debug("qda: Completed cleanup of all memory devices\n");
+}
+
+static int allocate_device_id(struct qda_memory_manager *mem_mgr,
+			      struct qda_iommu_device *iommu_dev, u32 *id)
+{
+	int ret;
+
+	ret = xa_alloc(&mem_mgr->device_xa, id, iommu_dev,
+		       xa_limit_31b, GFP_KERNEL);
+	if (ret) {
+		dev_err(iommu_dev->dev, "Failed to allocate XArray ID: %d\n", ret);
+		return ret;
+	}
+
+	dev_dbg(iommu_dev->dev, "Allocated device id=%u\n", *id);
+	return 0;
+}
+
+/**
+ * qda_memory_manager_register_device() - Register an IOMMU device
+ * @mem_mgr: Pointer to memory manager
+ * @iommu_dev: Pointer to IOMMU device to register
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
+				       struct qda_iommu_device *iommu_dev)
+{
+	int ret;
+	u32 id;
+
+	ret = allocate_device_id(mem_mgr, iommu_dev, &id);
+	if (ret) {
+		dev_err(iommu_dev->dev,
+			"Failed to allocate device ID: %d (sid=%u)\n",
+			ret, iommu_dev->sid);
+		return ret;
+	}
+
+	iommu_dev->id = id;
+
+	dev_dbg(iommu_dev->dev, "Registered device id=%u (sid=%u)\n", id, iommu_dev->sid);
+
+	return 0;
+}
+
+/**
+ * qda_memory_manager_unregister_device() - Unregister an IOMMU device
+ * @mem_mgr: Pointer to memory manager
+ * @iommu_dev: Pointer to IOMMU device to unregister
+ */
+void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
+					  struct qda_iommu_device *iommu_dev)
+{
+	xa_erase(&mem_mgr->device_xa, iommu_dev->id);
+	kfree(iommu_dev);
+}
+
+/**
+ * qda_memory_manager_init() - Initialize the memory manager
+ * @mem_mgr: Pointer to memory manager structure to initialize
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_memory_manager_init(struct qda_memory_manager *mem_mgr)
+{
+	pr_debug("qda: Initializing memory manager\n");
+
+	xa_init_flags(&mem_mgr->device_xa, XA_FLAGS_ALLOC);
+
+	pr_debug("qda: Memory manager initialized successfully\n");
+	return 0;
+}
+
+/**
+ * qda_memory_manager_exit() - Clean up the memory manager
+ * @mem_mgr: Pointer to memory manager structure to clean up
+ */
+void qda_memory_manager_exit(struct qda_memory_manager *mem_mgr)
+{
+	cleanup_all_memory_devices(mem_mgr);
+	pr_debug("qda: Memory manager exited\n");
+}
diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h
new file mode 100644
index 000000000000..0243f9c0c5aa
--- /dev/null
+++ b/drivers/accel/qda/qda_memory_manager.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_MEMORY_MANAGER_H__
+#define __QDA_MEMORY_MANAGER_H__
+
+#include <linux/device.h>
+#include <linux/xarray.h>
+#include "qda_drv.h"
+
+/**
+ * struct qda_iommu_device - IOMMU device instance for memory management
+ *
+ * Represents a single IOMMU-enabled device managed by the memory manager.
+ * Each device can be assigned to a specific process session.
+ */
+struct qda_iommu_device {
+	/** @dev: Pointer to the underlying device */
+	struct device *dev;
+	/** @qdev: Back-pointer to the parent QDA device */
+	struct qda_dev *qdev;
+	/** @id: Unique identifier assigned by the memory manager XArray */
+	u32 id;
+	/** @sid: Stream ID for IOMMU transactions */
+	u32 sid;
+};
+
+/**
+ * struct qda_memory_manager - Central memory management coordinator
+ *
+ * Coordinates memory management across multiple IOMMU devices. Maintains
+ * a registry of devices using an XArray for O(1) lookup by ID.
+ */
+struct qda_memory_manager {
+	/** @device_xa: XArray storing all registered IOMMU devices */
+	struct xarray device_xa;
+};
+
+int qda_memory_manager_init(struct qda_memory_manager *mem_mgr);
+void qda_memory_manager_exit(struct qda_memory_manager *mem_mgr);
+
+int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
+				       struct qda_iommu_device *iommu_dev);
+void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
+					  struct qda_iommu_device *iommu_dev);
+
+#endif /* __QDA_MEMORY_MANAGER_H__ */
diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
index afd9e851d00e..719dabb028c5 100644
--- a/drivers/accel/qda/qda_rpmsg.c
+++ b/drivers/accel/qda/qda_rpmsg.c
@@ -39,6 +39,7 @@ static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
 	drm_dev_unplug(&qdev->drm_dev);
 	qdev->rpdev = NULL;
 	qda_unregister_device(qdev);
+	qda_deinit_device(qdev);
 	dev_info(qdev->dev, "RPMsg device removed\n");
 }

@@ -61,14 +62,20 @@ static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
 	}
 	qdev->dsp_name = label;

+	ret = qda_init_device(qdev);
+	if (ret)
+		return ret;
+
 	ret = qda_cb_populate(qdev, rpdev->dev.of_node);
 	if (ret) {
 		dev_err(qdev->dev, "Failed to populate child devices: %d\n", ret);
+		qda_deinit_device(qdev);
 		return ret;
 	}

 	ret = qda_register_device(qdev);
 	if (ret) {
+		qda_deinit_device(qdev);
 		qda_cb_unpopulate(qdev);
 		return ret;
 	}
On Tue, May 19, 2026 at 11:45:57AM +0530, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Introduce the QDA memory manager (qda_memory_manager) to track and manage the IOMMU devices that back each compute context bank (CB).
Each CB device registered on the qda-compute-cb bus is assigned a unique ID via an XArray and wrapped in a qda_iommu_device descriptor
Why do you need an XArray? The number of devices is (more or less) fixed. You can use a normal array, allocated in the probe function after counting OF children nodes.
that records the device pointer and its stream ID. This registry allows the driver to look up the correct IOMMU domain for a given session when mapping DSP buffers.
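
One possible reading of the question above, as a userspace sketch: size a plain array once from the number of child nodes counted at probe time and use the array index as the device ID. The sketch_* names are hypothetical, calloc()/free() stand in for the kernel allocators, and the n_children argument is assumed to come from counting the "qcom,fastrpc-compute-cb" children beforehand. It is not code from this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-CB descriptor; only the stream ID is modelled. */
struct sketch_iommu_dev {
	uint32_t sid;
};

/* Fixed-capacity registry sized once from the counted child nodes. */
struct sketch_registry {
	struct sketch_iommu_dev *devs;
	size_t capacity;
	size_t count;
};

static int sketch_registry_init(struct sketch_registry *r, size_t n_children)
{
	r->devs = NULL;
	r->capacity = 0;
	r->count = 0;

	if (n_children == 0)
		return 0;

	r->devs = calloc(n_children, sizeof(*r->devs));
	if (!r->devs)
		return -ENOMEM;

	r->capacity = n_children;
	return 0;
}

/* Returns the assigned ID (the array index) or a negative errno. */
static int sketch_registry_add(struct sketch_registry *r, uint32_t sid)
{
	if (r->count == r->capacity)
		return -ENOSPC;

	r->devs[r->count].sid = sid;
	return (int)r->count++;
}

/* Bounds-checked lookup by ID. */
static const struct sketch_iommu_dev *
sketch_registry_get(const struct sketch_registry *r, size_t id)
{
	return id < r->count ? &r->devs[id] : NULL;
}

static void sketch_registry_exit(struct sketch_registry *r)
{
	free(r->devs);
	r->devs = NULL;
	r->capacity = 0;
	r->count = 0;
}
```

Because all descriptors live in one allocation, teardown is a single free and there is no per-entry ownership to track in the registry.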
The memory manager is initialised in qda_init_device() before CB devices are populated and torn down in qda_deinit_device() after they are destroyed, ensuring no dangling references remain in the XArray.
qda_cb.c is extended with qda_cb_setup_device(), which is called immediately after a CB device is registered on the bus. It allocates a qda_iommu_device, registers it with the memory manager, and stores it as the CB device's driver data so that qda_destroy_cb_device() can retrieve and unregister it during teardown.
Assisted-by: Claude:claude-4-6-sonnet Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
drivers/accel/qda/Makefile | 1 + drivers/accel/qda/qda_cb.c | 47 ++++++++++++++ drivers/accel/qda/qda_drv.c | 34 ++++++++++ drivers/accel/qda/qda_drv.h | 5 ++ drivers/accel/qda/qda_memory_manager.c | 111 +++++++++++++++++++++++++++++++++ drivers/accel/qda/qda_memory_manager.h | 49 +++++++++++++++ drivers/accel/qda/qda_rpmsg.c | 7 +++ 7 files changed, 254 insertions(+)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile index 143c9e4e789e..701fad5ffb50 100644 --- a/drivers/accel/qda/Makefile +++ b/drivers/accel/qda/Makefile @@ -8,6 +8,7 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o qda-y := \ qda_cb.o \ qda_drv.o \
- qda_memory_manager.o \ qda_rpmsg.o
obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c index 77caf8438c67..6d540bb0ec7b 100644 --- a/drivers/accel/qda/qda_cb.c +++ b/drivers/accel/qda/qda_cb.c @@ -8,11 +8,42 @@ #include <linux/slab.h> #include <drm/drm_print.h> #include "qda_drv.h" +#include "qda_memory_manager.h" #include "qda_cb.h" +static int qda_cb_setup_device(struct qda_dev *qdev, struct device *cb_dev, u32 sid) +{
- struct qda_iommu_device *iommu_dev;
- int rc;
- drm_dbg_driver(&qdev->drm_dev, "Setting up CB device %s\n", dev_name(cb_dev));
- iommu_dev = kzalloc_obj(*iommu_dev);
- if (!iommu_dev)
return -ENOMEM;- iommu_dev->dev = cb_dev;
- iommu_dev->qdev = qdev;
- iommu_dev->sid = sid;
- rc = qda_memory_manager_register_device(qdev->iommu_mgr, iommu_dev);
- if (rc) {
drm_err(&qdev->drm_dev, "Failed to register IOMMU device: %d\n", rc);kfree(iommu_dev);return rc;- }
- dev_set_drvdata(cb_dev, iommu_dev);
- drm_dbg_driver(&qdev->drm_dev, "CB device setup complete - SID: %u\n", sid);
- return 0;
+}
int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node) { struct device *cb_dev;
- int ret; u32 sid = 0; char name[64]; struct qda_cb_dev *entry;
@@ -30,6 +61,13 @@ int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node) return PTR_ERR(cb_dev); }
- ret = qda_cb_setup_device(qdev, cb_dev, sid);
- if (ret) {
drm_err(&qdev->drm_dev, "CB device setup failed: %d\n", ret);device_unregister(cb_dev);return ret;- }
- entry = kzalloc_obj(*entry); if (!entry) { device_unregister(cb_dev);
@@ -80,6 +118,7 @@ int qda_cb_populate(struct qda_dev *qdev, struct device_node *parent_node) void qda_destroy_cb_device(struct device *cb_dev) { struct iommu_group *group;
- struct qda_iommu_device *iommu_dev;
if (!cb_dev) { pr_debug("qda: NULL CB device passed to destroy\n"); @@ -88,6 +127,14 @@ void qda_destroy_cb_device(struct device *cb_dev) dev_dbg(cb_dev, "Destroying CB device %s\n", dev_name(cb_dev));
- iommu_dev = dev_get_drvdata(cb_dev);
- if (iommu_dev && iommu_dev->qdev && iommu_dev->qdev->iommu_mgr) {
dev_dbg(cb_dev, "Unregistering IOMMU device for %s\n",dev_name(cb_dev));qda_memory_manager_unregister_device(iommu_dev->qdev->iommu_mgr,iommu_dev);- }
- group = iommu_group_get(cb_dev); if (group) { dev_dbg(cb_dev, "Removing %s from IOMMU group\n", dev_name(cb_dev));
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c index 6c20d6a2fc47..0ad5d9873d7e 100644 --- a/drivers/accel/qda/qda_drv.c +++ b/drivers/accel/qda/qda_drv.c @@ -57,6 +57,40 @@ struct qda_dev *qda_alloc_device(struct device *dev) return qdev; } +static void cleanup_memory_manager(struct qda_dev *qdev)
Prefixes...
+{
- if (qdev->iommu_mgr) {
qda_memory_manager_exit(qdev->iommu_mgr);kfree(qdev->iommu_mgr);qdev->iommu_mgr = NULL;- }
+}
+static int init_memory_manager(struct qda_dev *qdev) +{
- qdev->iommu_mgr = kzalloc_obj(*qdev->iommu_mgr);
- if (!qdev->iommu_mgr)
return -ENOMEM;- return qda_memory_manager_init(qdev->iommu_mgr);
+}
+void qda_deinit_device(struct qda_dev *qdev) +{
- cleanup_memory_manager(qdev);
Ugh, inline all your one-line wrappers.
+}
+int qda_init_device(struct qda_dev *qdev) +{
- int ret;
- ret = init_memory_manager(qdev);
- if (ret)
drm_err(&qdev->drm_dev, "Failed to initialize memory manager: %d\n", ret);- return ret;
+}
 void qda_unregister_device(struct qda_dev *qdev)
 {
 	drm_dev_unregister(&qdev->drm_dev);
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index 2715f378775d..eb089e586b17 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -13,6 +13,7 @@
 #include <drm/drm_device.h>
 #include <drm/drm_drv.h>
 #include <drm/drm_file.h>
+#include "qda_memory_manager.h"
 
 /* Driver identification */
 #define QDA_DRIVER_NAME "qda"
@@ -40,6 +41,8 @@ struct qda_dev {
 	struct device *dev;
 	/** @cb_devs: Compute context-bank (CB) child devices */
 	struct list_head cb_devs;
+	/** @iommu_mgr: IOMMU/memory manager instance */
+	struct qda_memory_manager *iommu_mgr;
 	/** @dsp_name: Name of the DSP domain (e.g. "cdsp", "adsp") */
 	const char *dsp_name;
 };
@@ -59,6 +62,8 @@ static inline struct qda_dev *qda_dev_from_drm(struct drm_device *dev)
 struct qda_dev *qda_alloc_device(struct device *dev);
 
 /* Core device lifecycle */
+int qda_init_device(struct qda_dev *qdev);
+void qda_deinit_device(struct qda_dev *qdev);
 int qda_register_device(struct qda_dev *qdev);
 void qda_unregister_device(struct qda_dev *qdev);
 
diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
new file mode 100644
index 000000000000..00a9c0ae4224
--- /dev/null
+++ b/drivers/accel/qda/qda_memory_manager.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+
+#include <linux/refcount.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/xarray.h>
+#include <drm/drm_file.h>
+#include "qda_drv.h"
+#include "qda_memory_manager.h"
+
+static void cleanup_all_memory_devices(struct qda_memory_manager *mem_mgr)
+{
+	unsigned long index;
+	void *entry;
+
+	pr_debug("qda: Starting cleanup of all memory devices\n");

pr_debug is a third way to debug. Stop it, please.

+
+	xa_for_each(&mem_mgr->device_xa, index, entry) {
+		struct qda_iommu_device *iommu_dev = entry;
+
+		pr_debug("qda: Cleaning up device id=%lu\n", index);
+		xa_erase(&mem_mgr->device_xa, index);
+		kfree(iommu_dev);
+	}
+
+	pr_debug("qda: Completed cleanup of all memory devices\n");
+}
On Tue, May 19, 2026 at 11:45:57AM +0530, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Introduce the QDA memory manager (qda_memory_manager) to track and manage the IOMMU devices that back each compute context bank (CB).
Each CB device registered on the qda-compute-cb bus is assigned a unique ID via an XArray and wrapped in a qda_iommu_device descriptor that records the device pointer and its stream ID. This registry allows the driver to look up the correct IOMMU domain for a given session when mapping DSP buffers.
The memory manager is initialised in qda_init_device() before CB devices are populated and torn down in qda_deinit_device() after they are destroyed, ensuring no dangling references remain in the XArray.
qda_cb.c is extended with qda_cb_setup_device(), which is called immediately after a CB device is registered on the bus. It allocates a qda_iommu_device, registers it with the memory manager, and stores it as the CB device's driver data so that qda_destroy_cb_device() can retrieve and unregister it during teardown.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com

 drivers/accel/qda/Makefile             |   1 +
 drivers/accel/qda/qda_cb.c             |  47 ++++++++++++++
 drivers/accel/qda/qda_drv.c            |  34 ++++++++++
 drivers/accel/qda/qda_drv.h            |   5 ++
 drivers/accel/qda/qda_memory_manager.c | 111 +++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_memory_manager.h |  49 +++++++++++++++
 drivers/accel/qda/qda_rpmsg.c          |   7 +++
 7 files changed, 254 insertions(+)
@@ -61,14 +62,20 @@ static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
 	}
 	qdev->dsp_name = label;
 
+	ret = qda_init_device(qdev);
+	if (ret)
+		return ret;
+
 	ret = qda_cb_populate(qdev, rpdev->dev.of_node);
 	if (ret) {
 		dev_err(qdev->dev, "Failed to populate child devices: %d\n", ret);
+		qda_deinit_device(qdev);
 		return ret;
 	}
 
 	ret = qda_register_device(qdev);
 	if (ret) {
 		qda_cb_unpopulate(qdev);
+		qda_deinit_device(qdev);

No, this is not how you unwind in the error case in the kernel. Follow the established patterns.

 		return ret;
 	}
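The "established patterns" here presumably means goto-based unwinding with a single exit path, where each label undoes one step and falls through to the labels for the earlier steps. A minimal user-space sketch of that shape follows. struct probe_state and the step_*() helpers are hypothetical stand-ins for qda_init_device(), qda_cb_populate() and qda_register_device(), and fail_at exists only so the failure paths can be exercised.

```c
#include <errno.h>

/* Hypothetical stand-in for probe state; each flag records a completed step. */
struct probe_state {
	int fail_at;	/* 0 = no failure, 1..3 = step that fails */
	int inited;
	int populated;
	int registered;
};

static int step_init(struct probe_state *s)
{
	if (s->fail_at == 1)
		return -ENOMEM;
	s->inited = 1;
	return 0;
}

static void step_deinit(struct probe_state *s)
{
	s->inited = 0;
}

static int step_populate(struct probe_state *s)
{
	if (s->fail_at == 2)
		return -EINVAL;
	s->populated = 1;
	return 0;
}

static void step_unpopulate(struct probe_state *s)
{
	s->populated = 0;
}

static int step_register(struct probe_state *s)
{
	if (s->fail_at == 3)
		return -ENODEV;
	s->registered = 1;
	return 0;
}

/* Each error label undoes one step, then falls through to the earlier ones. */
int probe_sketch(struct probe_state *s)
{
	int ret;

	ret = step_init(s);
	if (ret)
		return ret;

	ret = step_populate(s);
	if (ret)
		goto err_deinit;

	ret = step_register(s);
	if (ret)
		goto err_unpopulate;

	return 0;

err_unpopulate:
	step_unpopulate(s);
err_deinit:
	step_deinit(s);
	return ret;
}
```

Adding a fourth step then needs one new label rather than another copy of every cleanup call in each error branch.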
-- 2.34.1
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Introduce the DRM_IOCTL_QDA_QUERY IOCTL, which allows user-space to identify which DSP domain a given /dev/accel/accel* node represents (e.g. "cdsp", "adsp").
include/uapi/drm/qda_accel.h Defines the QDA IOCTL command numbers and the associated data structures. The header follows the standard DRM UAPI conventions: __u8/__u32 types, a C++ extern "C" guard, and GPL-2.0-only WITH Linux-syscall-note licensing.
drivers/accel/qda/qda_ioctl.c / qda_ioctl.h Implements qda_ioctl_query(), which copies the DSP domain name stored in qda_dev.dsp_name into the user-supplied drm_qda_query buffer using strscpy().
drivers/accel/qda/qda_drv.c Registers the qda_ioctls[] table with the drm_driver so that the DRM core dispatches DRM_IOCTL_QDA_QUERY to qda_ioctl_query().
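To illustrate the copy semantics described above for the 16-byte dsp_name field, here is a user-space sketch. bounded_copy() is a hypothetical approximation of the kernel's strscpy() (always NUL-terminates, reports truncation), and struct query_sketch and query_fill() are illustrative names, not part of the proposed UAPI. The kernel helper returns -E2BIG on truncation; the sketch returns -1 instead.

```c
#include <stddef.h>

/* Illustrative mirror of the 16-byte name buffer; not the real UAPI struct. */
struct query_sketch {
	unsigned char dsp_name[16];
};

/*
 * Hypothetical approximation of strscpy(): copy at most size - 1 bytes,
 * always NUL-terminate, and return the copied length or -1 on truncation.
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -1;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If the source still has characters left, it did not fit. */
	return src[i] == '\0' ? (long)i : -1;
}

/* Fill the query buffer the way the commit message describes. */
long query_fill(struct query_sketch *q, const char *name)
{
	return bounded_copy((char *)q->dsp_name, name, sizeof(q->dsp_name));
}
```

A name of up to 15 characters is returned intact; anything longer is cut to 15 characters plus the terminator, so user space always receives a terminated string.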
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/accel/qda/Makefile    |  1 +
 drivers/accel/qda/qda_drv.c   |  8 +++++++
 drivers/accel/qda/qda_ioctl.c | 26 +++++++++++++++++++++++
 drivers/accel/qda/qda_ioctl.h | 13 ++++++++++++
 include/uapi/drm/qda_accel.h  | 49 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 97 insertions(+)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 701fad5ffb50..b658dad35fee 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
 qda-y := \
 	qda_cb.o \
 	qda_drv.o \
+	qda_ioctl.o \
 	qda_memory_manager.o \
 	qda_rpmsg.o

diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 0ad5d9873d7e..becd831d10be 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -8,8 +8,10 @@
 #include <drm/drm_gem.h>
 #include <drm/drm_ioctl.h>
 #include <drm/drm_print.h>
+#include <drm/qda_accel.h>
 
 #include "qda_drv.h"
+#include "qda_ioctl.h"
 #include "qda_rpmsg.h"
 
 static int qda_open(struct drm_device *dev, struct drm_file *file)
@@ -36,11 +38,17 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
 
 DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
 
+static const struct drm_ioctl_desc qda_ioctls[] = {
+	DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
+};
+
 static const struct drm_driver qda_drm_driver = {
 	.driver_features = DRIVER_COMPUTE_ACCEL,
 	.fops = &qda_accel_fops,
 	.open = qda_open,
 	.postclose = qda_postclose,
+	.ioctls = qda_ioctls,
+	.num_ioctls = ARRAY_SIZE(qda_ioctls),
 	.name = QDA_DRIVER_NAME,
 	.desc = "Qualcomm DSP Accelerator Driver",
 };
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
new file mode 100644
index 000000000000..761d3567c33f
--- /dev/null
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <drm/drm_ioctl.h>
+#include <drm/qda_accel.h>
+#include "qda_drv.h"
+#include "qda_ioctl.h"
+
+/**
+ * qda_ioctl_query() - Query DSP device information
+ * @dev: DRM device structure
+ * @data: User-space data (struct drm_qda_query)
+ * @file_priv: DRM file private data
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+	struct drm_qda_query *args = data;
+	struct qda_dev *qdev;
+
+	qdev = qda_dev_from_drm(dev);
+
+	strscpy(args->dsp_name, qdev->dsp_name, sizeof(args->dsp_name));
+
+	return 0;
+}
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
new file mode 100644
index 000000000000..b8fd536a111f
--- /dev/null
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_IOCTL_H__
+#define __QDA_IOCTL_H__
+
+#include "qda_drv.h"
+
+int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
+
+#endif /* __QDA_IOCTL_H__ */
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
new file mode 100644
index 000000000000..1971a4263065
--- /dev/null
+++ b/include/uapi/drm/qda_accel.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_ACCEL_H__
+#define __QDA_ACCEL_H__
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+/*
+ * QDA IOCTL command numbers
+ *
+ * These define the command numbers for QDA-specific IOCTLs.
+ * They are used with DRM_COMMAND_BASE to create the full IOCTL numbers.
+ */
+#define DRM_QDA_QUERY 0x00
+
+/*
+ * QDA IOCTL definitions
+ *
+ * These macros define the actual IOCTL numbers used by userspace applications.
+ * They combine the command numbers with DRM_COMMAND_BASE and specify the
+ * data structure and direction (read/write) for each IOCTL.
+ */
+#define DRM_IOCTL_QDA_QUERY DRM_IOR(DRM_COMMAND_BASE + DRM_QDA_QUERY, \
+				struct drm_qda_query)
+
+/**
+ * struct drm_qda_query - Device information query structure
+ * @dsp_name: Name of DSP (e.g., "adsp", "cdsp", "cdsp1", "gdsp0", "gdsp1")
+ *
+ * This structure is used with DRM_IOCTL_QDA_QUERY to query device type,
+ * allowing userspace to identify which DSP a device node represents. The
+ * kernel provides the DSP name directly as a null-terminated string.
+ */
+struct drm_qda_query {
+	__u8 dsp_name[16];
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* __QDA_ACCEL_H__ */
On Tue, May 19, 2026 at 11:45:58AM +0530, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Introduce the DRM_IOCTL_QDA_QUERY IOCTL, which allows user-space to identify which DSP domain a given /dev/accel/accel* node represents (e.g. "cdsp", "adsp").
include/uapi/drm/qda_accel.h Defines the QDA IOCTL command numbers and the associated data structures. The header follows the standard DRM UAPI conventions: __u8/__u32 types, a C++ extern "C" guard, and GPL-2.0-only WITH Linux-syscall-note licensing.
drivers/accel/qda/qda_ioctl.c / qda_ioctl.h Implements qda_ioctl_query(), which copies the DSP domain name stored in qda_dev.dsp_name into the user-supplied drm_qda_query buffer using strscpy().
drivers/accel/qda/qda_drv.c Registers the qda_ioctls[] table with the drm_driver so that the DRM core dispatches DRM_IOCTL_QDA_QUERY to qda_ioctl_query().
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/accel/qda/Makefile    |  1 +
 drivers/accel/qda/qda_drv.c   |  8 +++++++
 drivers/accel/qda/qda_ioctl.c | 26 +++++++++++++++++++++++
 drivers/accel/qda/qda_ioctl.h | 13 ++++++++++++
 include/uapi/drm/qda_accel.h  | 49 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 97 insertions(+)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 701fad5ffb50..b658dad35fee 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
 qda-y := \
 	qda_cb.o \
 	qda_drv.o \
+	qda_ioctl.o \
 	qda_memory_manager.o \
 	qda_rpmsg.o
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 0ad5d9873d7e..becd831d10be 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -8,8 +8,10 @@
 #include <drm/drm_gem.h>
 #include <drm/drm_ioctl.h>
 #include <drm/drm_print.h>
+#include <drm/qda_accel.h>
 
 #include "qda_drv.h"
+#include "qda_ioctl.h"
 #include "qda_rpmsg.h"
 
 static int qda_open(struct drm_device *dev, struct drm_file *file)
@@ -36,11 +38,17 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
 
 DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
 
+static const struct drm_ioctl_desc qda_ioctls[] = {
+	DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
+};
+
 static const struct drm_driver qda_drm_driver = {
 	.driver_features = DRIVER_COMPUTE_ACCEL,
 	.fops = &qda_accel_fops,
 	.open = qda_open,
 	.postclose = qda_postclose,
+	.ioctls = qda_ioctls,
+	.num_ioctls = ARRAY_SIZE(qda_ioctls),
 	.name = QDA_DRIVER_NAME,
 	.desc = "Qualcomm DSP Accelerator Driver",
 };
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
new file mode 100644
index 000000000000..761d3567c33f
--- /dev/null
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <drm/drm_ioctl.h>
+#include <drm/qda_accel.h>
+#include "qda_drv.h"
+#include "qda_ioctl.h"
+
+/**
+ * qda_ioctl_query() - Query DSP device information
+ * @dev: DRM device structure
+ * @data: User-space data (struct drm_qda_query)
+ * @file_priv: DRM file private data
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+	struct drm_qda_query *args = data;
+	struct qda_dev *qdev;
+
+	qdev = qda_dev_from_drm(dev);
+
+	strscpy(args->dsp_name, qdev->dsp_name, sizeof(args->dsp_name));
+
+	return 0;
+}
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
new file mode 100644
index 000000000000..b8fd536a111f
--- /dev/null
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_IOCTL_H__
+#define __QDA_IOCTL_H__
+
+#include "qda_drv.h"
+
+int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
+
+#endif /* __QDA_IOCTL_H__ */
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
new file mode 100644
index 000000000000..1971a4263065
--- /dev/null
+++ b/include/uapi/drm/qda_accel.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_ACCEL_H__
+#define __QDA_ACCEL_H__
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+/*
+ * QDA IOCTL command numbers
+ *
+ * These define the command numbers for QDA-specific IOCTLs.
+ * They are used with DRM_COMMAND_BASE to create the full IOCTL numbers.
+ */
+#define DRM_QDA_QUERY 0x00
+
+/*
+ * QDA IOCTL definitions
+ *
+ * These macros define the actual IOCTL numbers used by userspace applications.
+ * They combine the command numbers with DRM_COMMAND_BASE and specify the
+ * data structure and direction (read/write) for each IOCTL.
+ */
+#define DRM_IOCTL_QDA_QUERY DRM_IOR(DRM_COMMAND_BASE + DRM_QDA_QUERY, \
+				struct drm_qda_query)
+
+/**
+ * struct drm_qda_query - Device information query structure
+ * @dsp_name: Name of DSP (e.g., "adsp", "cdsp", "cdsp1", "gdsp0", "gdsp1")
+ *
+ * This structure is used with DRM_IOCTL_QDA_QUERY to query device type,
+ * allowing userspace to identify which DSP a device node represents. The
+ * kernel provides the DSP name directly as a null-terminated string.
+ */
+struct drm_qda_query {
+	__u8 dsp_name[16];

Are you sure that you want to query only the name? No extra options, no attributes, no hardware capabilities?

+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* __QDA_ACCEL_H__ */
-- 2.34.1
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Introduce DMA-coherent buffer management for the QDA driver, wiring together the GEM subsystem, the IOMMU memory manager, and a DMA allocation backend.
qda_gem.c / qda_gem.h Implements the GEM object lifecycle for QDA buffers. Each buffer is represented by a qda_gem_obj which embeds a drm_gem_object and carries the kernel virtual address, DMA address, and a pointer to the IOMMU device that performed the allocation. The .free callback delegates to the memory manager, and the .mmap callback uses dma_mmap_coherent() via the DMA backend.
qda_memory_dma.c / qda_memory_dma.h DMA coherent allocation backend. qda_dma_alloc() calls dma_alloc_coherent() on the CB device and encodes the stream ID (SID) in the upper 32 bits of the returned DMA address, following the Qualcomm FastRPC convention for IOMMU address space tagging. qda_dma_free() strips the SID prefix before calling dma_free_coherent().
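The SID tagging described above is plain 64-bit arithmetic and can be sketched in isolation. The helper names below (sid_encode(), sid_strip(), sid_of()) are hypothetical; the posted code does the same with an addition in qda_dma_alloc() and a subtraction in get_actual_dma_addr(). Note that reading the SID back with a shift, as sid_of() does, is only valid under the assumption that the untagged DMA address fits in the lower 32 bits.

```c
#include <stdint.h>

/* Tag a DMA address with the stream ID in bits 63:32. */
static inline uint64_t sid_encode(uint64_t dma_addr, uint32_t sid)
{
	return dma_addr + ((uint64_t)sid << 32);
}

/* Recover the address to hand back to the DMA API, given the same SID. */
static inline uint64_t sid_strip(uint64_t tagged, uint32_t sid)
{
	return tagged - ((uint64_t)sid << 32);
}

/*
 * Read the SID back from a tagged address. Assumption: the untagged
 * address was below 4 GiB, so the upper half holds nothing but the SID.
 */
static inline uint32_t sid_of(uint64_t tagged)
{
	return (uint32_t)(tagged >> 32);
}
```

For example, address 0x80001000 with SID 5 becomes 0x0000000580001000, and stripping with SID 5 returns the original address.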
qda_memory_manager.c Adds process-to-device assignment: each DRM file (process) is assigned one IOMMU context bank device for the lifetime of the session. qda_memory_manager_assign_device() first checks whether the process already has a device (reusing it with a refcount increment), then falls back to claiming an unassigned device. qda_memory_manager_alloc() and qda_memory_manager_free() delegate to the DMA backend after resolving the correct CB device for the calling process.
qda_drv.c / qda_drv.h qda_file_priv gains an assigned_iommu_dev pointer and a pid field. The .postclose callback decrements the IOMMU device refcount and clears the process assignment when the last reference is dropped.
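The assignment policy described above (reuse the CB already owned by the PID, otherwise claim a free one, and release it when the last reference goes away) can be sketched as follows. This is a single-threaded user-space illustration with no locking; struct cb_sketch, cb_assign() and cb_release() are hypothetical names, a plain counter stands in for refcount_t, and PID 0 is assumed to mean "unassigned" as in the posted code.

```c
#include <stddef.h>

/* Hypothetical per-context-bank bookkeeping; not the driver's real struct. */
struct cb_sketch {
	int assigned_pid;	/* 0 means unassigned */
	unsigned int refcount;
};

/* Reuse the CB already owned by pid, else claim the first free one. */
struct cb_sketch *cb_assign(struct cb_sketch *cbs, size_t n, int pid)
{
	size_t i;

	/* First pass: the process already holds a CB, take another reference. */
	for (i = 0; i < n; i++) {
		if (cbs[i].assigned_pid == pid) {
			cbs[i].refcount++;
			return &cbs[i];
		}
	}

	/* Second pass: claim an unassigned CB for this process. */
	for (i = 0; i < n; i++) {
		if (cbs[i].assigned_pid == 0) {
			cbs[i].assigned_pid = pid;
			cbs[i].refcount = 1;
			return &cbs[i];
		}
	}

	return NULL;	/* every CB is owned by some other process */
}

/* Drop one reference; the CB becomes free again when the last one goes. */
void cb_release(struct cb_sketch *cb)
{
	if (--cb->refcount == 0)
		cb->assigned_pid = 0;
}
```

With two CBs, two opens from PID 100 share the first CB, PID 200 takes the second, and a third process gets nothing until one of them is fully released.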
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/accel/qda/Makefile             |   2 +
 drivers/accel/qda/qda_drv.c            |  13 ++
 drivers/accel/qda/qda_drv.h            |   4 +
 drivers/accel/qda/qda_gem.c            | 156 +++++++++++++++++++++++
 drivers/accel/qda/qda_gem.h            |  54 ++++++++
 drivers/accel/qda/qda_memory_dma.c     | 110 ++++++++++++++++
 drivers/accel/qda/qda_memory_dma.h     |  17 +++
 drivers/accel/qda/qda_memory_manager.c | 224 +++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_memory_manager.h |  28 ++++-
 9 files changed, 607 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index b658dad35fee..a46ddceecfc5 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -8,7 +8,9 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
 qda-y := \
 	qda_cb.o \
 	qda_drv.o \
+	qda_gem.o \
 	qda_ioctl.o \
+	qda_memory_dma.o \
 	qda_memory_manager.o \
 	qda_rpmsg.o

diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index becd831d10be..1b534fea50c8 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -22,6 +22,7 @@ static int qda_open(struct drm_device *dev, struct drm_file *file)
 	if (!qda_file_priv)
 		return -ENOMEM;
 
+	qda_file_priv->pid = current->pid;
 	qda_file_priv->qda_dev = qda_dev_from_drm(dev);
 	file->driver_priv = qda_file_priv;
 
@@ -32,6 +33,18 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
 {
 	struct qda_file_priv *qda_file_priv = file->driver_priv;
 
+	if (qda_file_priv->assigned_iommu_dev) {
+		struct qda_iommu_device *iommu_dev = qda_file_priv->assigned_iommu_dev;
+		unsigned long flags;
+
+		if (refcount_dec_and_test(&iommu_dev->refcount)) {
+			spin_lock_irqsave(&iommu_dev->lock, flags);
+			iommu_dev->assigned_pid = 0;
+			iommu_dev->assigned_file_priv = NULL;
+			spin_unlock_irqrestore(&iommu_dev->lock, flags);
+		}
+	}
+
 	kfree(qda_file_priv);
 	file->driver_priv = NULL;
 }
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index eb089e586b17..8a7d647ac8fc 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -24,6 +24,10 @@
 struct qda_file_priv {
 	/** @qda_dev: Back-pointer to device structure */
 	struct qda_dev *qda_dev;
+	/** @assigned_iommu_dev: IOMMU device assigned to this process */
+	struct qda_iommu_device *assigned_iommu_dev;
+	/** @pid: Process ID for tracking */
+	pid_t pid;
 };
 
/** diff --git a/drivers/accel/qda/qda_gem.c b/drivers/accel/qda/qda_gem.c new file mode 100644 index 000000000000..568b3c2e64b7 --- /dev/null +++ b/drivers/accel/qda/qda_gem.c @@ -0,0 +1,156 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. +#include <drm/drm_gem.h> +#include <drm/drm_prime.h> +#include <drm/drm_print.h> +#include <linux/slab.h> +#include <linux/dma-mapping.h> +#include "qda_drv.h" +#include "qda_gem.h" +#include "qda_memory_manager.h" +#include "qda_memory_dma.h" + +static void setup_vma_flags(struct vm_area_struct *vma) +{ + vm_flags_set(vma, VM_DONTEXPAND); + vm_flags_set(vma, VM_DONTDUMP); +} + +/** + * qda_gem_free_object() - Free a GEM object and its associated resources + * @gem_obj: DRM GEM object to free + */ +void qda_gem_free_object(struct drm_gem_object *gem_obj) +{ + struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(gem_obj); + struct qda_dev *qdev = qda_dev_from_drm(gem_obj->dev); + + if (qda_gem_obj->virt && qdev->iommu_mgr) + qda_memory_manager_free(qdev->iommu_mgr, qda_gem_obj); + + drm_gem_object_release(gem_obj); + kfree(qda_gem_obj); +} + +/** + * qda_gem_mmap_obj() - Map a GEM object into userspace + * @drm_obj: DRM GEM object to map + * @vma: Virtual memory area to map into + * + * Return: 0 on success, negative error code on failure + */ +int qda_gem_mmap_obj(struct drm_gem_object *drm_obj, struct vm_area_struct *vma) +{ + struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(drm_obj); + int ret; + + /* Reset vm_pgoff for DMA mmap */ + vma->vm_pgoff = 0; + + ret = qda_dma_mmap(qda_gem_obj, vma); + if (ret == 0) + setup_vma_flags(vma); + + return ret; +} + +static const struct drm_gem_object_funcs qda_gem_object_funcs = { + .free = qda_gem_free_object, + .mmap = qda_gem_mmap_obj, +}; + +/** + * qda_gem_alloc_object() - Allocate a new QDA GEM object + * @drm_dev: DRM device + * @aligned_size: Size of the object in bytes (must be page-aligned) + * + * Return: Pointer 
to the new GEM object, or ERR_PTR on failure + */ +struct qda_gem_obj *qda_gem_alloc_object(struct drm_device *drm_dev, size_t aligned_size) +{ + struct qda_gem_obj *qda_gem_obj; + int ret; + + qda_gem_obj = kzalloc_obj(*qda_gem_obj); + if (!qda_gem_obj) + return ERR_PTR(-ENOMEM); + + ret = drm_gem_object_init(drm_dev, &qda_gem_obj->base, aligned_size); + if (ret) { + drm_err(drm_dev, "Failed to initialize GEM object: %d\n", ret); + kfree(qda_gem_obj); + return ERR_PTR(ret); + } + + qda_gem_obj->base.funcs = &qda_gem_object_funcs; + qda_gem_obj->size = aligned_size; + + drm_dbg_driver(drm_dev, "Allocated GEM object size=%zu\n", aligned_size); + return qda_gem_obj; +} + +void qda_gem_cleanup_object(struct qda_gem_obj *qda_gem_obj) +{ + drm_gem_object_release(&qda_gem_obj->base); + kfree(qda_gem_obj); +} + +struct drm_gem_object *qda_gem_lookup_object(struct drm_file *file_priv, u32 handle) +{ + struct drm_gem_object *gem_obj; + + gem_obj = drm_gem_object_lookup(file_priv, handle); + if (!gem_obj) + return ERR_PTR(-ENOENT); + + return gem_obj; +} + +int qda_gem_create_handle(struct drm_file *file_priv, struct drm_gem_object *gem_obj, u32 *handle) +{ + int ret; + + ret = drm_gem_handle_create(file_priv, gem_obj, handle); + drm_gem_object_put(gem_obj); + + return ret; +} + +/** + * qda_gem_create_object() - Allocate and initialize a GEM object with DMA backing + * @drm_dev: DRM device + * @iommu_mgr: Memory manager to use for DMA allocation + * @size: Requested size in bytes + * @file_priv: DRM file private data for process association + * + * Return: Pointer to the base DRM GEM object on success, ERR_PTR on failure + */ +struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev, + struct qda_memory_manager *iommu_mgr, size_t size, + struct drm_file *file_priv) +{ + struct qda_gem_obj *qda_gem_obj; + size_t aligned_size; + int ret; + + if (size == 0) { + drm_err(drm_dev, "Invalid size for GEM object creation\n"); + return ERR_PTR(-EINVAL); + } + + 
aligned_size = PAGE_ALIGN(size); + + qda_gem_obj = qda_gem_alloc_object(drm_dev, aligned_size); + if (IS_ERR(qda_gem_obj)) + return ERR_CAST(qda_gem_obj); + + ret = qda_memory_manager_alloc(iommu_mgr, qda_gem_obj, file_priv); + if (ret) { + drm_err(drm_dev, "Memory manager allocation failed: %d\n", ret); + qda_gem_cleanup_object(qda_gem_obj); + return ERR_PTR(ret); + } + + drm_dbg_driver(drm_dev, "GEM object created successfully size=%zu\n", aligned_size); + return &qda_gem_obj->base; +} diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h new file mode 100644 index 000000000000..bb18f8155aa4 --- /dev/null +++ b/drivers/accel/qda/qda_gem.h @@ -0,0 +1,54 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. + */ +#ifndef __QDA_GEM_H__ +#define __QDA_GEM_H__ + +#include <linux/dma-mapping.h> +#include <linux/xarray.h> +#include <drm/drm_device.h> +#include <drm/drm_gem.h> +#include "qda_memory_manager.h" + +/** + * struct qda_gem_obj - QDA GEM buffer object + * + * Represents a GEM buffer object that can be allocated by the driver + * or imported from another driver via DMA-BUF. 
+ */ +struct qda_gem_obj { + /** @base: DRM GEM object base — must be first member */ + struct drm_gem_object base; + /** @iommu_dev: IOMMU context bank device that performed the allocation */ + struct qda_iommu_device *iommu_dev; + /** @virt: Kernel virtual address of the allocated DMA memory */ + void *virt; + /** @dma_addr: DMA address (with SID encoded in upper 32 bits) */ + dma_addr_t dma_addr; + /** @size: Size of the buffer in bytes */ + size_t size; +}; + +/** + * to_qda_gem_obj - Cast a drm_gem_object pointer to qda_gem_obj + * @gem_obj: Pointer to the embedded drm_gem_object + */ +#define to_qda_gem_obj(gem_obj) container_of(gem_obj, struct qda_gem_obj, base) + +/* GEM object lifecycle */ +struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev, + struct qda_memory_manager *iommu_mgr, + size_t size, struct drm_file *file_priv); +void qda_gem_free_object(struct drm_gem_object *gem_obj); +int qda_gem_mmap_obj(struct drm_gem_object *gem_obj, struct vm_area_struct *vma); + +/* Internal helpers (also used by PRIME import) */ +struct qda_gem_obj *qda_gem_alloc_object(struct drm_device *drm_dev, size_t aligned_size); +void qda_gem_cleanup_object(struct qda_gem_obj *qda_gem_obj); + +/* Utility functions */ +struct drm_gem_object *qda_gem_lookup_object(struct drm_file *file_priv, u32 handle); +int qda_gem_create_handle(struct drm_file *file_priv, struct drm_gem_object *gem_obj, u32 *handle); + +#endif /* __QDA_GEM_H__ */ diff --git a/drivers/accel/qda/qda_memory_dma.c b/drivers/accel/qda/qda_memory_dma.c new file mode 100644 index 000000000000..97488c755d2d --- /dev/null +++ b/drivers/accel/qda/qda_memory_dma.c @@ -0,0 +1,110 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. 
+#include <linux/slab.h> +#include <linux/dma-mapping.h> +#include "qda_drv.h" +#include "qda_gem.h" +#include "qda_memory_dma.h" + +static dma_addr_t get_actual_dma_addr(struct qda_gem_obj *gem_obj) +{ + return gem_obj->dma_addr - ((u64)gem_obj->iommu_dev->sid << 32); +} + +static void setup_gem_object(struct qda_gem_obj *gem_obj, void *virt, + dma_addr_t dma_addr, struct qda_iommu_device *iommu_dev) +{ + gem_obj->virt = virt; + gem_obj->dma_addr = dma_addr; + gem_obj->iommu_dev = iommu_dev; +} + +static void cleanup_gem_object_fields(struct qda_gem_obj *gem_obj) +{ + gem_obj->virt = NULL; + gem_obj->dma_addr = 0; + gem_obj->iommu_dev = NULL; +} + +/** + * qda_dma_alloc() - Allocate DMA coherent memory for a GEM object + * @iommu_dev: Pointer to the QDA IOMMU device structure + * @gem_obj: Pointer to GEM object to allocate memory for + * @size: Size of memory to allocate in bytes + * + * Return: 0 on success, negative error code on failure + */ +int qda_dma_alloc(struct qda_iommu_device *iommu_dev, + struct qda_gem_obj *gem_obj, size_t size) +{ + void *virt; + dma_addr_t dma_addr; + + if (!iommu_dev || !iommu_dev->dev) { + pr_err("qda: Invalid iommu_dev or device for DMA allocation\n"); + return -EINVAL; + } + + virt = dma_alloc_coherent(iommu_dev->dev, size, &dma_addr, GFP_KERNEL); + if (!virt) + return -ENOMEM; + + dma_addr += ((u64)iommu_dev->sid << 32); + + dev_dbg(iommu_dev->dev, "DMA address with SID prefix: 0x%llx (sid=%u)\n", + (u64)dma_addr, iommu_dev->sid); + + setup_gem_object(gem_obj, virt, dma_addr, iommu_dev); + + return 0; +} + +/** + * qda_dma_free() - Free DMA coherent memory for a GEM object + * @gem_obj: Pointer to GEM object to free memory for + */ +void qda_dma_free(struct qda_gem_obj *gem_obj) +{ + if (!gem_obj || !gem_obj->iommu_dev) { + pr_debug("qda: Invalid gem_obj or iommu_dev for DMA free\n"); + return; + } + + dev_dbg(gem_obj->iommu_dev->dev, "DMA freeing: size=%zu, device_id=%u, dma_addr=0x%llx\n", + gem_obj->size, 
gem_obj->iommu_dev->id, gem_obj->dma_addr); + + dma_free_coherent(gem_obj->iommu_dev->dev, gem_obj->size, + gem_obj->virt, get_actual_dma_addr(gem_obj)); + + cleanup_gem_object_fields(gem_obj); +} + +/** + * qda_dma_mmap() - Map DMA memory into userspace + * @gem_obj: Pointer to GEM object containing DMA memory + * @vma: Virtual memory area to map into + * + * Return: 0 on success, negative error code on failure + */ +int qda_dma_mmap(struct qda_gem_obj *gem_obj, struct vm_area_struct *vma) +{ + struct qda_iommu_device *iommu_dev; + int ret; + + if (!gem_obj || !gem_obj->virt || !gem_obj->iommu_dev || !gem_obj->iommu_dev->dev) { + pr_err("qda: Invalid parameters for DMA mmap\n"); + return -EINVAL; + } + + iommu_dev = gem_obj->iommu_dev; + + ret = dma_mmap_coherent(iommu_dev->dev, vma, gem_obj->virt, + get_actual_dma_addr(gem_obj), gem_obj->size); + if (ret) { + dev_err(iommu_dev->dev, "DMA mmap failed: size=%zu, device_id=%u, ret=%d\n", + gem_obj->size, iommu_dev->id, ret); + return ret; + } + + return 0; +} diff --git a/drivers/accel/qda/qda_memory_dma.h b/drivers/accel/qda/qda_memory_dma.h new file mode 100644 index 000000000000..99352a99dc33 --- /dev/null +++ b/drivers/accel/qda/qda_memory_dma.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. 
+ */ + +#ifndef __QDA_MEMORY_DMA_H__ +#define __QDA_MEMORY_DMA_H__ + +#include <linux/dma-mapping.h> +#include "qda_memory_manager.h" + +int qda_dma_alloc(struct qda_iommu_device *iommu_dev, + struct qda_gem_obj *gem_obj, size_t size); +void qda_dma_free(struct qda_gem_obj *gem_obj); +int qda_dma_mmap(struct qda_gem_obj *gem_obj, struct vm_area_struct *vma); + +#endif /* __QDA_MEMORY_DMA_H__ */ diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c index 00a9c0ae4224..82111275f420 100644 --- a/drivers/accel/qda/qda_memory_manager.c +++ b/drivers/accel/qda/qda_memory_manager.c @@ -6,8 +6,11 @@ #include <linux/spinlock.h> #include <linux/xarray.h> #include <drm/drm_file.h> +#include <drm/drm_print.h> #include "qda_drv.h" +#include "qda_gem.h" #include "qda_memory_manager.h" +#include "qda_memory_dma.h"
static void cleanup_all_memory_devices(struct qda_memory_manager *mem_mgr) { @@ -28,6 +31,14 @@ static void cleanup_all_memory_devices(struct qda_memory_manager *mem_mgr) pr_debug("qda: Completed cleanup of all memory devices\n"); }
+static void init_iommu_device_fields(struct qda_iommu_device *iommu_dev) +{ + spin_lock_init(&iommu_dev->lock); + refcount_set(&iommu_dev->refcount, 0); + iommu_dev->assigned_pid = 0; + iommu_dev->assigned_file_priv = NULL; +} + static int allocate_device_id(struct qda_memory_manager *mem_mgr, struct qda_iommu_device *iommu_dev, u32 *id) { @@ -44,6 +55,216 @@ static int allocate_device_id(struct qda_memory_manager *mem_mgr, return 0; }
+static struct qda_iommu_device *find_device_for_pid(struct qda_memory_manager *mem_mgr, + pid_t pid) +{ + unsigned long index; + void *entry; + struct qda_iommu_device *found_dev = NULL; + unsigned long flags; + + xa_lock_irqsave(&mem_mgr->device_xa, flags); + xa_for_each(&mem_mgr->device_xa, index, entry) { + struct qda_iommu_device *iommu_dev = entry; + + spin_lock(&iommu_dev->lock); + if (iommu_dev->assigned_pid == pid) { + found_dev = iommu_dev; + refcount_inc(&found_dev->refcount); + dev_dbg(found_dev->dev, "Reusing device id=%u for PID=%d (refcount=%u)\n", + found_dev->id, pid, refcount_read(&found_dev->refcount)); + spin_unlock(&iommu_dev->lock); + break; + } + spin_unlock(&iommu_dev->lock); + } + xa_unlock_irqrestore(&mem_mgr->device_xa, flags); + + return found_dev; +} + +static struct qda_iommu_device *assign_available_device_to_pid(struct qda_memory_manager *mem_mgr, + pid_t pid, + struct drm_file *file_priv) +{ + unsigned long index; + void *entry; + struct qda_iommu_device *selected_dev = NULL; + unsigned long flags; + + xa_lock_irqsave(&mem_mgr->device_xa, flags); + xa_for_each(&mem_mgr->device_xa, index, entry) { + struct qda_iommu_device *iommu_dev = entry; + + spin_lock(&iommu_dev->lock); + if (iommu_dev->assigned_pid == 0) { + iommu_dev->assigned_pid = pid; + iommu_dev->assigned_file_priv = file_priv; + selected_dev = iommu_dev; + refcount_set(&selected_dev->refcount, 1); + dev_dbg(selected_dev->dev, "Assigned device id=%u to PID=%d\n", + selected_dev->id, pid); + spin_unlock(&iommu_dev->lock); + break; + } + spin_unlock(&iommu_dev->lock); + } + xa_unlock_irqrestore(&mem_mgr->device_xa, flags); + + return selected_dev; +} + +static struct qda_iommu_device *get_process_iommu_device(struct qda_memory_manager *mem_mgr, + struct drm_file *file_priv) +{ + struct qda_file_priv *qda_priv; + + if (!file_priv || !file_priv->driver_priv) + return NULL; + + qda_priv = (struct qda_file_priv *)file_priv->driver_priv; + return qda_priv->assigned_iommu_dev; +} 
+ +/** + * qda_memory_manager_assign_device() - Assign an IOMMU device to a process + * @mem_mgr: Pointer to memory manager + * @file_priv: DRM file private data for process association + * + * Return: 0 on success, negative error code on failure + */ +int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr, + struct drm_file *file_priv) +{ + struct qda_file_priv *qda_priv; + struct qda_iommu_device *selected_dev = NULL; + int ret = 0; + pid_t current_pid; + + if (!file_priv || !file_priv->driver_priv) { + pr_err("qda: Invalid file_priv or driver_priv\n"); + return -EINVAL; + } + + qda_priv = (struct qda_file_priv *)file_priv->driver_priv; + current_pid = qda_priv->pid; + + mutex_lock(&mem_mgr->process_assignment_lock); + + if (qda_priv->assigned_iommu_dev) { + dev_dbg(qda_priv->assigned_iommu_dev->dev, + "PID=%d already has device id=%u assigned\n", + current_pid, qda_priv->assigned_iommu_dev->id); + ret = 0; + goto unlock_and_return; + } + + selected_dev = find_device_for_pid(mem_mgr, current_pid); + + if (selected_dev) { + qda_priv->assigned_iommu_dev = selected_dev; + goto unlock_and_return; + } + + selected_dev = assign_available_device_to_pid(mem_mgr, current_pid, file_priv); + + if (!selected_dev) { + pr_err("qda: No available device for PID=%d\n", current_pid); + ret = -ENOMEM; + goto unlock_and_return; + } + + qda_priv->assigned_iommu_dev = selected_dev; + +unlock_and_return: + mutex_unlock(&mem_mgr->process_assignment_lock); + return ret; +} + +static struct qda_iommu_device *get_or_assign_iommu_device(struct qda_memory_manager *mem_mgr, + struct drm_file *file_priv) +{ + struct qda_iommu_device *iommu_dev; + int ret; + + iommu_dev = get_process_iommu_device(mem_mgr, file_priv); + if (iommu_dev) + return iommu_dev; + + ret = qda_memory_manager_assign_device(mem_mgr, file_priv); + if (ret) + return NULL; + + iommu_dev = get_process_iommu_device(mem_mgr, file_priv); + if (iommu_dev) + return iommu_dev; + + return NULL; +} + +/** + * 
qda_memory_manager_alloc() - Allocate memory for a GEM object + * @mem_mgr: Pointer to memory manager + * @gem_obj: Pointer to GEM object to allocate memory for + * @file_priv: DRM file private data for process association + * + * Return: 0 on success, negative error code on failure + */ +int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj, + struct drm_file *file_priv) +{ + struct qda_iommu_device *selected_dev; + size_t size; + int ret; + + if (!mem_mgr || !gem_obj || !file_priv) { + pr_err("qda: Invalid parameters for memory allocation\n"); + return -EINVAL; + } + + size = gem_obj->size; + if (size == 0) { + drm_err(gem_obj->base.dev, "Invalid allocation size: 0\n"); + return -EINVAL; + } + + selected_dev = get_or_assign_iommu_device(mem_mgr, file_priv); + + if (!selected_dev) { + drm_err(gem_obj->base.dev, + "Failed to get/assign device for allocation (size=%zu)\n", + size); + return -ENOMEM; + } + + ret = qda_dma_alloc(selected_dev, gem_obj, size); + if (ret) { + drm_err(gem_obj->base.dev, "Allocation failed: size=%zu, device_id=%u, ret=%d\n", + size, selected_dev->id, ret); + return ret; + } + + drm_dbg_driver(gem_obj->base.dev, + "Successfully allocated: size=%zu, device_id=%u, dma_addr=0x%llx\n", + size, selected_dev->id, gem_obj->dma_addr); + return 0; +} + +/** + * qda_memory_manager_free() - Free memory for a GEM object + * @mem_mgr: Pointer to memory manager + * @gem_obj: Pointer to GEM object to free memory for + */ +void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj) +{ + if (!gem_obj || !gem_obj->iommu_dev) { + pr_debug("qda: Invalid gem_obj or iommu_dev for free\n"); + return; + } + + qda_dma_free(gem_obj); +} + /** * qda_memory_manager_register_device() - Register an IOMMU device * @mem_mgr: Pointer to memory manager @@ -57,6 +278,8 @@ int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr, int ret; u32 id;
+ init_iommu_device_fields(iommu_dev); + ret = allocate_device_id(mem_mgr, iommu_dev, &id); if (ret) { dev_err(iommu_dev->dev, @@ -95,6 +318,7 @@ int qda_memory_manager_init(struct qda_memory_manager *mem_mgr) pr_debug("qda: Initializing memory manager\n");
xa_init_flags(&mem_mgr->device_xa, XA_FLAGS_ALLOC); + mutex_init(&mem_mgr->process_assignment_lock);
pr_debug("qda: Memory manager initialized successfully\n"); return 0; diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h index 0243f9c0c5aa..252459bc10d0 100644 --- a/drivers/accel/qda/qda_memory_manager.h +++ b/drivers/accel/qda/qda_memory_manager.h @@ -7,8 +7,15 @@ #define __QDA_MEMORY_MANAGER_H__
#include <linux/device.h> +#include <linux/mutex.h> +#include <linux/refcount.h> +#include <linux/spinlock.h> #include <linux/xarray.h> -#include "qda_drv.h" +#include <drm/drm_file.h> + +/* Forward declarations */ +struct qda_dev; +struct qda_gem_obj;
/** * struct qda_iommu_device - IOMMU device instance for memory management @@ -21,10 +28,18 @@ struct qda_iommu_device { struct device *dev; /** @qdev: Back-pointer to the parent QDA device */ struct qda_dev *qdev; + /** @assigned_file_priv: DRM file private data for the assigned process */ + struct drm_file *assigned_file_priv; /** @id: Unique identifier assigned by the memory manager XArray */ u32 id; /** @sid: Stream ID for IOMMU transactions */ u32 sid; + /** @assigned_pid: Process ID of the process assigned to this device */ + pid_t assigned_pid; + /** @refcount: Reference counter for device */ + refcount_t refcount; + /** @lock: Spinlock protecting concurrent access to device */ + spinlock_t lock; };
/** @@ -36,6 +51,8 @@ struct qda_iommu_device { struct qda_memory_manager { /** @device_xa: XArray storing all registered IOMMU devices */ struct xarray device_xa; + /** @process_assignment_lock: Mutex protecting process-to-device assignments */ + struct mutex process_assignment_lock; };
int qda_memory_manager_init(struct qda_memory_manager *mem_mgr); @@ -46,4 +63,13 @@ int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr, void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr, struct qda_iommu_device *iommu_dev);
+int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr, + struct drm_file *file_priv); + +int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, + struct qda_gem_obj *gem_obj, + struct drm_file *file_priv); +void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, + struct qda_gem_obj *gem_obj); + #endif /* __QDA_MEMORY_MANAGER_H__ */
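The assignment policy in the patch above (reuse the context bank already bound to a PID, otherwise claim a free one, and release it when the last reference goes away) can be modelled in plain user-space C. This is only a sketch: `struct cb_slot`, `cb_get_for_pid()` and `cb_put()` are hypothetical names, locking is omitted, and PIDs are assumed to be non-zero because 0 marks a free slot, as in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for struct qda_iommu_device (no locking here) */
struct cb_slot {
	pid_t assigned_pid;	/* 0 means "free", mirroring the patch */
	unsigned int refcount;	/* number of open files sharing this slot */
};

/* Return the slot already bound to @pid, or claim the first free one. */
static struct cb_slot *cb_get_for_pid(struct cb_slot *slots, size_t n, pid_t pid)
{
	size_t i;

	/* Pass 1: reuse an existing assignment and take another reference */
	for (i = 0; i < n; i++) {
		if (slots[i].assigned_pid == pid) {
			slots[i].refcount++;
			return &slots[i];
		}
	}

	/* Pass 2: bind the first free slot to this PID */
	for (i = 0; i < n; i++) {
		if (slots[i].assigned_pid == 0) {
			slots[i].assigned_pid = pid;
			slots[i].refcount = 1;
			return &slots[i];
		}
	}

	return NULL;	/* every slot is owned by some other PID */
}

/* Drop one reference; the slot becomes free when the last one is gone. */
static void cb_put(struct cb_slot *slot)
{
	if (--slot->refcount == 0)
		slot->assigned_pid = 0;
}
```

In the driver both passes run under `process_assignment_lock`, which is what makes the "look up, then claim" sequence atomic with respect to other openers.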
…
Assisted-by: Claude:claude-4-6-sonnet
…
Does such an information source know enough to make better use of scope-based resource management?
…
+++ b/drivers/accel/qda/qda_drv.c
…
@@ -32,6 +33,18 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file) {
…
if (refcount_dec_and_test(&iommu_dev->refcount)) {
	spin_lock_irqsave(&iommu_dev->lock, flags);
	iommu_dev->assigned_pid = 0;
	iommu_dev->assigned_file_priv = NULL;
	spin_unlock_irqrestore(&iommu_dev->lock, flags);
}
…
Under which circumstances would you consider using a statement like “guard(spinlock_irqsave)(&iommu_dev->lock);” here? https://elixir.bootlin.com/linux/v7.1-rc4/source/include/linux/spinlock.h#L6...
Regards, Markus
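For readers unfamiliar with the idea behind the question, scope-based locking can be illustrated in user space. The kernel's guard() helpers are built on the compiler's cleanup attribute; the sketch below uses the same attribute with a pthread mutex. It is an analogy only, not kernel code: `MUTEX_GUARD`, `struct cb_state` and `cb_release()` are hypothetical names, and the macro as written allows one guard per scope.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical user-space stand-in for the per-device state */
struct cb_state {
	pthread_mutex_t lock;
	pid_t assigned_pid;
	void *assigned_file_priv;
};

/* Called automatically when the guard variable goes out of scope */
static void mutex_guard_release(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Lock now, unlock on every exit path of the enclosing scope (GCC/Clang) */
#define MUTEX_GUARD(m)							\
	pthread_mutex_t *guard_						\
	__attribute__((cleanup(mutex_guard_release), unused)) =	\
		(pthread_mutex_lock(m), (m))

/* Clear the assignment if @pid owns it; returns 0 on success, -1 otherwise */
static int cb_release(struct cb_state *cb, pid_t pid)
{
	MUTEX_GUARD(&cb->lock);

	if (cb->assigned_pid != pid)
		return -1;	/* early return: the lock is still dropped */

	cb->assigned_pid = 0;
	cb->assigned_file_priv = NULL;
	return 0;
}
```

The benefit shows up mainly in functions with several exit paths, where each early return would otherwise need its own unlock call.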
Feel free to ignore everything Markus says.
On Tue, May 19, 2026 at 02:14:34PM +0200, Markus Elfring wrote:
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Expose two new DRM IOCTLs that allow user-space to allocate DMA-backed GEM buffer objects and retrieve their mmap offsets.
DRM_IOCTL_QDA_GEM_CREATE (drm_qda_gem_create) Allocates a DMA-coherent GEM buffer of the requested size. The memory manager assigns an IOMMU context bank to the calling process on first use and performs the DMA allocation against that bank. Returns a GEM handle that identifies the buffer for subsequent operations.
DRM_IOCTL_QDA_GEM_MMAP_OFFSET (drm_qda_gem_mmap_offset) Returns the mmap offset for an existing GEM handle. The offset can be passed directly to mmap() to map the buffer into user-space. Imported DMA-BUF objects are rejected by drm_gem_dumb_map_offset().
DRIVER_GEM is added to driver_features now that GEM IOCTLs are present.
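A user-space caller might combine the two IOCTLs roughly as follows. This is a sketch under stated assumptions: the structs are local mirrors of the uAPI definitions in this patch, the request numbers are built by hand from the DRM ioctl type 'd' and DRM_COMMAND_BASE (0x40) instead of including the real header, and `qda_alloc_and_map()` is a hypothetical helper name. It also assumes a 64-bit off_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Local mirrors of the uAPI structs added by this patch */
struct drm_qda_gem_create {
	uint64_t size;		/* in: requested size in bytes */
	uint32_t handle;	/* out: GEM handle */
	uint32_t pad;		/* must be zero, the driver rejects anything else */
};

struct drm_qda_gem_mmap_offset {
	uint64_t offset;	/* out: fake offset to pass to mmap() */
	uint32_t handle;	/* in: GEM handle */
	uint32_t pad;		/* must be zero */
};

/* DRM ioctls use type 'd'; driver-private commands start at 0x40 */
#define QDA_IOWR(nr, type)		_IOWR('d', 0x40 + (nr), type)
#define IOCTL_QDA_GEM_CREATE		QDA_IOWR(0x01, struct drm_qda_gem_create)
#define IOCTL_QDA_GEM_MMAP_OFFSET	QDA_IOWR(0x02, struct drm_qda_gem_mmap_offset)

/*
 * Allocate a buffer on an open /dev/accel/accelN fd and map it.
 * Returns the mapping or NULL. A real caller would also release the
 * handle with DRM_IOCTL_GEM_CLOSE on the error paths after creation.
 */
static void *qda_alloc_and_map(int fd, size_t size, uint32_t *handle_out)
{
	struct drm_qda_gem_create create = { .size = size };
	struct drm_qda_gem_mmap_offset map = { 0 };
	void *ptr;

	if (ioctl(fd, IOCTL_QDA_GEM_CREATE, &create) < 0)
		return NULL;

	map.handle = create.handle;
	if (ioctl(fd, IOCTL_QDA_GEM_MMAP_OFFSET, &map) < 0)
		return NULL;

	ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
		   (off_t)map.offset);
	if (ptr == MAP_FAILED)
		return NULL;

	*handle_out = create.handle;
	return ptr;
}
```

Zero-initialising both structs matters here, since the handlers return -EINVAL when the pad field is non-zero.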
Assisted-by: Claude:claude-4-6-sonnet Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com --- drivers/accel/qda/qda_drv.c | 4 +++- drivers/accel/qda/qda_ioctl.c | 50 +++++++++++++++++++++++++++++++++++++++++++ drivers/accel/qda/qda_ioctl.h | 2 ++ include/uapi/drm/qda_accel.h | 36 +++++++++++++++++++++++++++++++ 4 files changed, 91 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c index 1b534fea50c8..c9b9e56dcb28 100644 --- a/drivers/accel/qda/qda_drv.c +++ b/drivers/accel/qda/qda_drv.c @@ -53,10 +53,12 @@ DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
static const struct drm_ioctl_desc qda_ioctls[] = { DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0), + DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0), + DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0), };
static const struct drm_driver qda_drm_driver = { - .driver_features = DRIVER_COMPUTE_ACCEL, + .driver_features = DRIVER_GEM | DRIVER_COMPUTE_ACCEL, .fops = &qda_accel_fops, .open = qda_open, .postclose = qda_postclose, diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c index 761d3567c33f..1769c85a3e98 100644 --- a/drivers/accel/qda/qda_ioctl.c +++ b/drivers/accel/qda/qda_ioctl.c @@ -3,6 +3,7 @@ #include <drm/drm_ioctl.h> #include <drm/qda_accel.h> #include "qda_drv.h" +#include "qda_gem.h" #include "qda_ioctl.h"
/** @@ -24,3 +25,52 @@ int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_pr
return 0; } + +/** + * qda_ioctl_gem_create() - Create a GEM buffer object + * @dev: DRM device structure + * @data: User-space data (struct drm_qda_gem_create) + * @file_priv: DRM file private data + * + * Return: 0 on success, negative error code on failure + */ +int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv) +{ + struct drm_qda_gem_create *args = data; + struct drm_gem_object *gem_obj; + struct qda_dev *qdev; + + if (args->pad) + return -EINVAL; + + qdev = qda_dev_from_drm(dev); + if (!qdev->iommu_mgr) + return -ENODEV; + + gem_obj = qda_gem_create_object(dev, qdev->iommu_mgr, args->size, file_priv); + if (IS_ERR(gem_obj)) + return PTR_ERR(gem_obj); + + return qda_gem_create_handle(file_priv, gem_obj, &args->handle); +} + +/** + * qda_ioctl_gem_mmap_offset() - Get the mmap offset for a GEM object + * @dev: DRM device structure + * @data: User-space data (struct drm_qda_gem_mmap_offset) + * @file_priv: DRM file private data + * + * Uses drm_gem_dumb_map_offset() which rejects imported dma-buf objects + * (mmap of imported objects is not allowed). + * + * Return: 0 on success, negative error code on failure + */ +int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv) +{ + struct drm_qda_gem_mmap_offset *args = data; + + if (args->pad) + return -EINVAL; + + return drm_gem_dumb_map_offset(file_priv, dev, args->handle, &args->offset); +} diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h index b8fd536a111f..d1cbbfb6d965 100644 --- a/drivers/accel/qda/qda_ioctl.h +++ b/drivers/accel/qda/qda_ioctl.h @@ -9,5 +9,7 @@ #include "qda_drv.h"
int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv); +int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv); +int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv);
#endif /* __QDA_IOCTL_H__ */ diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h index 1971a4263065..319e21aae0d6 100644 --- a/include/uapi/drm/qda_accel.h +++ b/include/uapi/drm/qda_accel.h @@ -19,6 +19,8 @@ extern "C" { * They are used with DRM_COMMAND_BASE to create the full IOCTL numbers. */ #define DRM_QDA_QUERY 0x00 +#define DRM_QDA_GEM_CREATE 0x01 +#define DRM_QDA_GEM_MMAP_OFFSET 0x02
/* * QDA IOCTL definitions @@ -29,6 +31,10 @@ extern "C" { */ #define DRM_IOCTL_QDA_QUERY DRM_IOR(DRM_COMMAND_BASE + DRM_QDA_QUERY, \ struct drm_qda_query) +#define DRM_IOCTL_QDA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_CREATE, \ + struct drm_qda_gem_create) +#define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \ + struct drm_qda_gem_mmap_offset)
/** * struct drm_qda_query - Device information query structure @@ -42,6 +48,36 @@ struct drm_qda_query { __u8 dsp_name[16]; };
+/** + * struct drm_qda_gem_create - GEM buffer object creation parameters + * @size: Size of the GEM object to create in bytes (input) + * @handle: Allocated GEM handle (output) + * + * This structure is used with DRM_IOCTL_QDA_GEM_CREATE to allocate + * a new GEM buffer object. + */ +struct drm_qda_gem_create { + __u64 size; + __u32 handle; + __u32 pad; +}; + +/** + * struct drm_qda_gem_mmap_offset - GEM object mmap offset query + * @offset: mmap offset for the GEM object (output) + * @handle: GEM handle (input) + * @pad: Padding for 64-bit alignment + * + * This structure is used with DRM_IOCTL_QDA_GEM_MMAP_OFFSET to retrieve + * the mmap offset that can be used with mmap() to map the GEM object into + * user space. + */ +struct drm_qda_gem_mmap_offset { + __u64 offset; + __u32 handle; + __u32 pad; +}; + #if defined(__cplusplus) } #endif
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Allow user-space to import DMA-BUF file descriptors from other subsystems (GPU, camera, video) into the QDA driver via the standard DRM PRIME interface.
qda_prime.c Implements qda_gem_prime_import(), which is set as the driver's .gem_prime_import callback. On import it:
1. Short-circuits self-import: if the dma_buf was exported by this device and is not itself an import, the existing GEM object is returned with an incremented reference count.
2. Attaches to the dma_buf and maps it with DMA_BIDIRECTIONAL via dma_buf_map_attachment_unlocked(), obtaining an sg_table whose DMA addresses are IOMMU virtual addresses in the CB device's address space.
3. Calls qda_memory_manager_alloc() to record the IOMMU mapping and encode the SID in the upper 32 bits of the DMA address, matching the convention used for natively allocated buffers.
qda_prime_fd_to_handle() wraps drm_gem_prime_fd_to_handle() under qdev->import_lock, storing the calling file_priv in qdev->current_import_file_priv so that qda_gem_prime_import() can retrieve it (the .gem_prime_import callback does not receive file_priv directly).
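The "stash the caller's context under a lock because the callback cannot receive it" pattern described above can be shown in isolation. The following user-space sketch uses a pthread mutex in place of the kernel mutex; `struct importer`, `import_with_ctx()` and `example_cb()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct importer {
	pthread_mutex_t import_lock;
	void *current_ctx;	/* only meaningful while import_lock is held */
};

/* Fixed callback signature with no room for a per-caller context */
typedef int (*import_cb)(struct importer *imp, int buf_id);

/* Publish @ctx for the duration of the callback, then clear it again */
static int import_with_ctx(struct importer *imp, void *ctx, int buf_id,
			   import_cb cb)
{
	int ret;

	pthread_mutex_lock(&imp->import_lock);
	imp->current_ctx = ctx;

	ret = cb(imp, buf_id);

	imp->current_ctx = NULL;
	pthread_mutex_unlock(&imp->import_lock);

	return ret;
}

/* Example callback: fails if no context was published for it */
static int example_cb(struct importer *imp, int buf_id)
{
	return imp->current_ctx ? buf_id : -1;
}
```

The trade-off of this design is that every import on the device is serialized by one lock, and the stashed pointer must never be read outside the locked region.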
qda_gem.c qda_gem_free_object() is extended to handle the imported-buffer teardown path: unmap the sg_table, detach from the dma_buf, and release the dma_buf reference. qda_gem_mmap_obj() rejects mmap requests on imported objects.
qda_memory_manager.c qda_memory_manager_map_imported() records the IOMMU-mapped DMA address from the first sg entry (the IOMMU maps the buffer as a contiguous range) and encodes the SID prefix. qda_memory_manager_free() skips the DMA free path for imported buffers since the memory is owned by the exporter.
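The SID encoding mentioned above amounts to placing the stream ID in bits 63:32 of the address handed to the DSP. A minimal sketch, assuming the IOMMU virtual address fits in the lower 32 bits (the patch itself uses an addition, so a larger address would carry into the SID field); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a (assumed 32-bit) IOVA with the stream ID, as the patch does */
static uint64_t qda_encode_sid(uint64_t iova, uint32_t sid)
{
	return iova + ((uint64_t)sid << 32);
}

/* Recover the stream ID from the upper 32 bits */
static uint32_t qda_addr_sid(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}

/* Recover the IOVA from the lower 32 bits */
static uint32_t qda_addr_iova(uint64_t addr)
{
	return (uint32_t)addr;
}
```

For example, an IOVA of 0x1000 with SID 5 encodes to 0x500001000.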
Assisted-by: Claude:claude-4-6-sonnet Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com --- drivers/accel/qda/Makefile | 1 + drivers/accel/qda/qda_drv.c | 12 ++- drivers/accel/qda/qda_drv.h | 4 + drivers/accel/qda/qda_gem.c | 25 ++++- drivers/accel/qda/qda_gem.h | 8 ++ drivers/accel/qda/qda_memory_manager.c | 47 ++++++++- drivers/accel/qda/qda_prime.c | 184 +++++++++++++++++++++++++++++++++ drivers/accel/qda/qda_prime.h | 18 ++++ 8 files changed, 295 insertions(+), 4 deletions(-)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile index a46ddceecfc5..fb092e56d7f3 100644 --- a/drivers/accel/qda/Makefile +++ b/drivers/accel/qda/Makefile @@ -12,6 +12,7 @@ qda-y := \ qda_ioctl.o \ qda_memory_dma.o \ qda_memory_manager.o \ + qda_prime.o \ qda_rpmsg.o
obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c index c9b9e56dcb28..ef8bd573b836 100644 --- a/drivers/accel/qda/qda_drv.c +++ b/drivers/accel/qda/qda_drv.c @@ -7,10 +7,12 @@ #include <drm/drm_file.h> #include <drm/drm_gem.h> #include <drm/drm_ioctl.h> +#include <drm/drm_prime.h> #include <drm/drm_print.h> #include <drm/qda_accel.h>
#include "qda_drv.h" +#include "qda_prime.h" #include "qda_ioctl.h" #include "qda_rpmsg.h"
@@ -64,6 +66,8 @@ static const struct drm_driver qda_drm_driver = { .postclose = qda_postclose, .ioctls = qda_ioctls, .num_ioctls = ARRAY_SIZE(qda_ioctls), + .gem_prime_import = qda_gem_prime_import, + .prime_fd_to_handle = qda_prime_fd_to_handle, .name = QDA_DRIVER_NAME, .desc = "Qualcomm DSP Accelerator Driver", }; @@ -100,6 +104,7 @@ static int init_memory_manager(struct qda_dev *qdev)
void qda_deinit_device(struct qda_dev *qdev) { + mutex_destroy(&qdev->import_lock); cleanup_memory_manager(qdev); }
@@ -107,9 +112,14 @@ int qda_init_device(struct qda_dev *qdev) { int ret;
+ mutex_init(&qdev->import_lock); + qdev->current_import_file_priv = NULL; + ret = init_memory_manager(qdev); - if (ret) + if (ret) { drm_err(&qdev->drm_dev, "Failed to initialize memory manager: %d\n", ret); + mutex_destroy(&qdev->import_lock); + }
return ret; } diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h index 8a7d647ac8fc..96ce4135e2d9 100644 --- a/drivers/accel/qda/qda_drv.h +++ b/drivers/accel/qda/qda_drv.h @@ -47,6 +47,10 @@ struct qda_dev { struct list_head cb_devs; /** @iommu_mgr: IOMMU/memory manager instance */ struct qda_memory_manager *iommu_mgr; + /** @import_lock: Lock protecting prime import context */ + struct mutex import_lock; + /** @current_import_file_priv: Current file_priv during prime import */ + struct drm_file *current_import_file_priv; /** @dsp_name: Name of the DSP domain (e.g. "cdsp", "adsp") */ const char *dsp_name; }; diff --git a/drivers/accel/qda/qda_gem.c b/drivers/accel/qda/qda_gem.c index 568b3c2e64b7..9e1ac7582d0c 100644 --- a/drivers/accel/qda/qda_gem.c +++ b/drivers/accel/qda/qda_gem.c @@ -9,6 +9,7 @@ #include "qda_gem.h" #include "qda_memory_manager.h" #include "qda_memory_dma.h" +#include "qda_prime.h"
static void setup_vma_flags(struct vm_area_struct *vma) { @@ -25,8 +26,20 @@ void qda_gem_free_object(struct drm_gem_object *gem_obj) struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(gem_obj); struct qda_dev *qdev = qda_dev_from_drm(gem_obj->dev);
- if (qda_gem_obj->virt && qdev->iommu_mgr) - qda_memory_manager_free(qdev->iommu_mgr, qda_gem_obj); + if (qda_gem_obj->is_imported) { + if (qda_gem_obj->attachment && qda_gem_obj->sgt) + dma_buf_unmap_attachment_unlocked(qda_gem_obj->attachment, + qda_gem_obj->sgt, DMA_BIDIRECTIONAL); + if (qda_gem_obj->attachment) + dma_buf_detach(qda_gem_obj->dma_buf, qda_gem_obj->attachment); + if (qda_gem_obj->dma_buf) + dma_buf_put(qda_gem_obj->dma_buf); + if (qda_gem_obj->iommu_dev && qdev->iommu_mgr) + qda_memory_manager_free(qdev->iommu_mgr, qda_gem_obj); + } else { + if (qda_gem_obj->virt && qdev->iommu_mgr) + qda_memory_manager_free(qdev->iommu_mgr, qda_gem_obj); + }
drm_gem_object_release(gem_obj); kfree(qda_gem_obj); @@ -44,6 +57,10 @@ int qda_gem_mmap_obj(struct drm_gem_object *drm_obj, struct vm_area_struct *vma) struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(drm_obj); int ret;
+ /* Imported dma-buf objects must be mmap'd through the exporter, not the importer */ + if (qda_gem_obj->is_imported) + return -EINVAL; + /* Reset vm_pgoff for DMA mmap */ vma->vm_pgoff = 0;
@@ -143,6 +160,10 @@ struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev, qda_gem_obj = qda_gem_alloc_object(drm_dev, aligned_size); if (IS_ERR(qda_gem_obj)) return ERR_CAST(qda_gem_obj); + qda_gem_obj->is_imported = false; + qda_gem_obj->dma_buf = NULL; + qda_gem_obj->attachment = NULL; + qda_gem_obj->sgt = NULL;
ret = qda_memory_manager_alloc(iommu_mgr, qda_gem_obj, file_priv); if (ret) { diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h index bb18f8155aa4..0878f57715f6 100644 --- a/drivers/accel/qda/qda_gem.h +++ b/drivers/accel/qda/qda_gem.h @@ -22,12 +22,20 @@ struct qda_gem_obj { struct drm_gem_object base; /** @iommu_dev: IOMMU context bank device that performed the allocation */ struct qda_iommu_device *iommu_dev; + /** @dma_buf: Reference to imported dma_buf */ + struct dma_buf *dma_buf; + /** @attachment: DMA buf attachment */ + struct dma_buf_attachment *attachment; + /** @sgt: Scatter-gather table */ + struct sg_table *sgt; /** @virt: Kernel virtual address of the allocated DMA memory */ void *virt; /** @dma_addr: DMA address (with SID encoded in upper 32 bits) */ dma_addr_t dma_addr; /** @size: Size of the buffer in bytes */ size_t size; + /** @is_imported: True if buffer is imported, false if allocated */ + bool is_imported; };
/** diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c index 82111275f420..d2aa0e0e65f5 100644 --- a/drivers/accel/qda/qda_memory_manager.c +++ b/drivers/accel/qda/qda_memory_manager.c @@ -202,6 +202,41 @@ static struct qda_iommu_device *get_or_assign_iommu_device(struct qda_memory_man return NULL; }
+static int qda_memory_manager_map_imported(struct qda_memory_manager *mem_mgr, + struct qda_gem_obj *gem_obj, + struct qda_iommu_device *iommu_dev) +{ + struct scatterlist *sg; + dma_addr_t dma_addr; + + if (!gem_obj->is_imported || !gem_obj->sgt || !iommu_dev) { + drm_err(gem_obj->base.dev, "Invalid parameters for imported buffer mapping\n"); + return -EINVAL; + } + + sg = gem_obj->sgt->sgl; + if (!sg) { + drm_err(gem_obj->base.dev, "Invalid scatter-gather list for imported buffer\n"); + return -EINVAL; + } + + gem_obj->iommu_dev = iommu_dev; + + /* + * After dma_buf_map_attachment_unlocked(), sg_dma_address() returns the + * IOMMU virtual address, not the physical address. The IOMMU maps the + * entire buffer as a contiguous range in the IOMMU address space even if + * the underlying physical memory is non-contiguous. Therefore the first + * sg entry's DMA address is the start of the complete contiguous + * IOMMU-mapped range and is sufficient to describe the buffer to the DSP. + */ + dma_addr = sg_dma_address(sg); + dma_addr += ((u64)iommu_dev->sid << 32); + gem_obj->dma_addr = dma_addr; + + return 0; +} + /** * qda_memory_manager_alloc() - Allocate memory for a GEM object * @mem_mgr: Pointer to memory manager @@ -237,7 +272,11 @@ int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_ return -ENOMEM; }
- ret = qda_dma_alloc(selected_dev, gem_obj, size); + if (gem_obj->is_imported) + ret = qda_memory_manager_map_imported(mem_mgr, gem_obj, selected_dev); + else + ret = qda_dma_alloc(selected_dev, gem_obj, size); + if (ret) { drm_err(gem_obj->base.dev, "Allocation failed: size=%zu, device_id=%u, ret=%d\n", size, selected_dev->id, ret); @@ -262,6 +301,12 @@ void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_ return; }
+ if (gem_obj->is_imported) { + drm_dbg_driver(gem_obj->base.dev, + "Freed imported buffer tracking (no DMA free needed)\n"); + return; + } + qda_dma_free(gem_obj); }
diff --git a/drivers/accel/qda/qda_prime.c b/drivers/accel/qda/qda_prime.c new file mode 100644 index 000000000000..acb0ac8c40fd --- /dev/null +++ b/drivers/accel/qda/qda_prime.c @@ -0,0 +1,184 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. +#include <drm/drm_gem.h> +#include <drm/drm_prime.h> +#include <drm/drm_print.h> +#include <linux/slab.h> +#include <linux/dma-mapping.h> +#include "qda_drv.h" +#include "qda_gem.h" +#include "qda_prime.h" +#include "qda_memory_manager.h" + +static struct drm_gem_object *check_own_buffer(struct drm_device *dev, struct dma_buf *dma_buf) +{ + struct drm_gem_object *existing_gem; + + /* Only safe to access priv if this dma-buf was exported by this device */ + if (!drm_gem_is_prime_exported_dma_buf(dev, dma_buf)) + return NULL; + + existing_gem = dma_buf->priv; + if (existing_gem->dev != dev) + return NULL; + + if (to_qda_gem_obj(existing_gem)->is_imported) + return NULL; + + drm_gem_object_get(existing_gem); + return existing_gem; +} + +static struct qda_iommu_device *get_iommu_device_for_import(struct qda_dev *qdev, + struct drm_file **file_priv_out) +{ + struct drm_file *file_priv; + struct qda_file_priv *qda_file_priv; + struct qda_iommu_device *iommu_dev = NULL; + int ret; + + file_priv = qdev->current_import_file_priv; + *file_priv_out = file_priv; + + if (!file_priv || !file_priv->driver_priv) + return NULL; + + qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv; + iommu_dev = qda_file_priv->assigned_iommu_dev; + + if (!iommu_dev) { + ret = qda_memory_manager_assign_device(qdev->iommu_mgr, file_priv); + if (ret) { + drm_err(&qdev->drm_dev, "Failed to assign IOMMU device: %d\n", ret); + return NULL; + } + + iommu_dev = qda_file_priv->assigned_iommu_dev; + } + + return iommu_dev; +} + +static int setup_dma_buf_mapping(struct qda_gem_obj *qda_gem_obj, struct dma_buf *dma_buf, + struct device *attach_dev, struct qda_dev *qdev) +{ + struct 
dma_buf_attachment *attachment; + struct sg_table *sgt; + int ret; + + attachment = dma_buf_attach(dma_buf, attach_dev); + if (IS_ERR(attachment)) { + ret = PTR_ERR(attachment); + drm_err(&qdev->drm_dev, "Failed to attach dma_buf: %d\n", ret); + return ret; + } + qda_gem_obj->attachment = attachment; + + sgt = dma_buf_map_attachment_unlocked(attachment, DMA_BIDIRECTIONAL); + if (IS_ERR(sgt)) { + ret = PTR_ERR(sgt); + drm_err(&qdev->drm_dev, "Failed to map dma_buf attachment: %d\n", ret); + dma_buf_detach(dma_buf, attachment); + return ret; + } + qda_gem_obj->sgt = sgt; + + return 0; +} + +/** + * qda_gem_prime_import() - Import a DMA-BUF as a GEM object + * @dev: DRM device structure + * @dma_buf: DMA-BUF to import + * + * Return: Pointer to the imported GEM object on success, ERR_PTR on failure + */ +struct drm_gem_object *qda_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf) +{ + struct qda_dev *qdev = qda_dev_from_drm(dev); + struct qda_gem_obj *qda_gem_obj; + struct drm_file *file_priv; + struct qda_iommu_device *iommu_dev; + struct drm_gem_object *existing_gem; + size_t aligned_size; + int ret; + + if (!qdev->iommu_mgr) { + drm_err(dev, "Invalid iommu_mgr\n"); + return ERR_PTR(-ENODEV); + } + + existing_gem = check_own_buffer(dev, dma_buf); + if (existing_gem) + return existing_gem; + + iommu_dev = get_iommu_device_for_import(qdev, &file_priv); + if (!iommu_dev || !iommu_dev->dev) { + drm_err(dev, "No IOMMU device assigned for prime import\n"); + return ERR_PTR(-ENODEV); + } + + drm_dbg_driver(dev, "Using IOMMU device %u for prime import\n", iommu_dev->id); + + aligned_size = PAGE_ALIGN(dma_buf->size); + qda_gem_obj = qda_gem_alloc_object(dev, aligned_size); + if (IS_ERR(qda_gem_obj)) + return ERR_CAST(qda_gem_obj); + + qda_gem_obj->is_imported = true; + qda_gem_obj->dma_buf = dma_buf; + qda_gem_obj->virt = NULL; + qda_gem_obj->iommu_dev = iommu_dev; + + get_dma_buf(dma_buf); + + ret = setup_dma_buf_mapping(qda_gem_obj, dma_buf, iommu_dev->dev, 
qdev); + if (ret) + goto err_put_dma_buf; + + ret = qda_memory_manager_alloc(qdev->iommu_mgr, qda_gem_obj, file_priv); + if (ret) { + drm_err(dev, "Failed to allocate IOMMU mapping: %d\n", ret); + goto err_unmap; + } + + drm_dbg_driver(dev, "Prime import completed successfully size=%zu\n", aligned_size); + return &qda_gem_obj->base; + +err_unmap: + dma_buf_unmap_attachment_unlocked(qda_gem_obj->attachment, + qda_gem_obj->sgt, DMA_BIDIRECTIONAL); + dma_buf_detach(dma_buf, qda_gem_obj->attachment); +err_put_dma_buf: + dma_buf_put(dma_buf); + qda_gem_cleanup_object(qda_gem_obj); + return ERR_PTR(ret); +} + +/** + * qda_prime_fd_to_handle() - Convert a PRIME fd to a GEM handle + * @dev: DRM device structure + * @file_priv: DRM file private data + * @prime_fd: File descriptor of the PRIME buffer + * @handle: Output GEM handle + * + * Return: 0 on success, negative error code on failure + */ +int qda_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv, + int prime_fd, u32 *handle) +{ + struct qda_dev *qdev = qda_dev_from_drm(dev); + int ret; + + mutex_lock(&qdev->import_lock); + qdev->current_import_file_priv = file_priv; + + ret = drm_gem_prime_fd_to_handle(dev, file_priv, prime_fd, handle); + + qdev->current_import_file_priv = NULL; + mutex_unlock(&qdev->import_lock); + + return ret; +} + +MODULE_IMPORT_NS("DMA_BUF"); diff --git a/drivers/accel/qda/qda_prime.h b/drivers/accel/qda/qda_prime.h new file mode 100644 index 000000000000..9b3850d54fa7 --- /dev/null +++ b/drivers/accel/qda/qda_prime.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. 
+ */ + +#ifndef __QDA_PRIME_H__ +#define __QDA_PRIME_H__ + +#include <drm/drm_device.h> +#include <drm/drm_file.h> +#include <drm/drm_gem.h> +#include <linux/dma-buf.h> + +struct drm_gem_object *qda_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf); +int qda_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv, + int prime_fd, u32 *handle); + +#endif /* __QDA_PRIME_H__ */
On 5/19/26 08:16, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Allow user-space to import DMA-BUF file descriptors from other subsystems (GPU, camera, video) into the QDA driver via the standard DRM PRIME interface.
qda_prime.c Implements qda_gem_prime_import(), which is set as the driver's .gem_prime_import callback. On import it:
1. Short-circuits self-import: if the dma_buf was exported by this device and is not itself an import, the existing GEM object is returned with an incremented reference count.
2. Attaches to the dma_buf and maps it with DMA_BIDIRECTIONAL via dma_buf_map_attachment_unlocked(), obtaining an sg_table whose DMA addresses are IOMMU virtual addresses in the CB device's address space.
3. Calls qda_memory_manager_alloc() to record the IOMMU mapping and encode the SID in the upper 32 bits of the DMA address, matching the convention used for natively allocated buffers.
qda_prime_fd_to_handle() wraps drm_gem_prime_fd_to_handle() under qdev->import_lock, storing the calling file_priv in qdev->current_import_file_priv so that qda_gem_prime_import() can retrieve it (the .gem_prime_import callback does not receive file_priv directly).
qda_gem.c qda_gem_free_object() is extended to handle the imported-buffer teardown path: unmap the sg_table, detach from the dma_buf, and release the dma_buf reference. qda_gem_mmap_obj() rejects mmap requests on imported objects.
qda_memory_manager.c qda_memory_manager_map_imported() records the IOMMU-mapped DMA address from the first sg entry (the IOMMU maps the buffer as a contiguous range) and encodes the SID prefix.
No it doesn't.
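The objection concerns the assumption that the first sg entry describes the whole buffer. One defensive option, offered here only as an illustration and not as the reviewer's proposed remedy, would be to verify that the mapped segments really form a single contiguous DMA range before relying on the first address. The sketch below models the mapped sg_table as a plain array; `struct dma_seg` and `segs_dma_contiguous()` are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one DMA-mapped scatterlist segment */
struct dma_seg {
	uint64_t addr;	/* DMA (IOMMU virtual) address of the segment */
	uint64_t len;	/* segment length in bytes */
};

/*
 * Return true only if the segments form one gap-free DMA range that
 * covers at least @total bytes, i.e. the first address alone is enough
 * to describe the buffer.
 */
static bool segs_dma_contiguous(const struct dma_seg *segs, size_t n,
				uint64_t total)
{
	uint64_t expected, covered = 0;
	size_t i;

	if (n == 0)
		return false;

	expected = segs[0].addr;
	for (i = 0; i < n; i++) {
		if (segs[i].addr != expected)
			return false;	/* gap or reordering between segments */
		expected += segs[i].len;
		covered += segs[i].len;
	}

	return covered >= total;
}
```

An importer that cannot handle multiple segments could then fail the import instead of passing a partial range to the DSP.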
qda_memory_manager_free() skips the DMA free path for imported buffers since the memory is owned by the exporter.
Assisted-by: Claude:claude-4-6-sonnet Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
drivers/accel/qda/Makefile | 1 + drivers/accel/qda/qda_drv.c | 12 ++- drivers/accel/qda/qda_drv.h | 4 + drivers/accel/qda/qda_gem.c | 25 ++++- drivers/accel/qda/qda_gem.h | 8 ++ drivers/accel/qda/qda_memory_manager.c | 47 ++++++++- drivers/accel/qda/qda_prime.c | 184 +++++++++++++++++++++++++++++++++ drivers/accel/qda/qda_prime.h | 18 ++++ 8 files changed, 295 insertions(+), 4 deletions(-)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index a46ddceecfc5..fb092e56d7f3 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -12,6 +12,7 @@ qda-y := \
 	qda_ioctl.o \
 	qda_memory_dma.o \
 	qda_memory_manager.o \
+	qda_prime.o \
 	qda_rpmsg.o
 
 obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index c9b9e56dcb28..ef8bd573b836 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -7,10 +7,12 @@
 #include <drm/drm_file.h>
 #include <drm/drm_gem.h>
 #include <drm/drm_ioctl.h>
+#include <drm/drm_prime.h>
 #include <drm/drm_print.h>
 #include <drm/qda_accel.h>
 
 #include "qda_drv.h"
+#include "qda_prime.h"
 #include "qda_ioctl.h"
 #include "qda_rpmsg.h"
 
@@ -64,6 +66,8 @@ static const struct drm_driver qda_drm_driver = {
 	.postclose = qda_postclose,
 	.ioctls = qda_ioctls,
 	.num_ioctls = ARRAY_SIZE(qda_ioctls),
+	.gem_prime_import = qda_gem_prime_import,
+	.prime_fd_to_handle = qda_prime_fd_to_handle,
 	.name = QDA_DRIVER_NAME,
 	.desc = "Qualcomm DSP Accelerator Driver",
 };
@@ -100,6 +104,7 @@ static int init_memory_manager(struct qda_dev *qdev)
 
 void qda_deinit_device(struct qda_dev *qdev)
 {
+	mutex_destroy(&qdev->import_lock);
 	cleanup_memory_manager(qdev);
 }
 
@@ -107,9 +112,14 @@ int qda_init_device(struct qda_dev *qdev)
 {
 	int ret;
 
+	mutex_init(&qdev->import_lock);
+	qdev->current_import_file_priv = NULL;
+
 	ret = init_memory_manager(qdev);
-	if (ret)
+	if (ret) {
 		drm_err(&qdev->drm_dev, "Failed to initialize memory manager: %d\n", ret);
+		mutex_destroy(&qdev->import_lock);
+	}
 
 	return ret;
 }
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index 8a7d647ac8fc..96ce4135e2d9 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -47,6 +47,10 @@ struct qda_dev {
 	struct list_head cb_devs;
 	/** @iommu_mgr: IOMMU/memory manager instance */
 	struct qda_memory_manager *iommu_mgr;
+	/** @import_lock: Lock protecting prime import context */
+	struct mutex import_lock;
+	/** @current_import_file_priv: Current file_priv during prime import */
+	struct drm_file *current_import_file_priv;
 	/** @dsp_name: Name of the DSP domain (e.g. "cdsp", "adsp") */
 	const char *dsp_name;
 };
diff --git a/drivers/accel/qda/qda_gem.c b/drivers/accel/qda/qda_gem.c
index 568b3c2e64b7..9e1ac7582d0c 100644
--- a/drivers/accel/qda/qda_gem.c
+++ b/drivers/accel/qda/qda_gem.c
@@ -9,6 +9,7 @@
 #include "qda_gem.h"
 #include "qda_memory_manager.h"
 #include "qda_memory_dma.h"
+#include "qda_prime.h"
 
 static void setup_vma_flags(struct vm_area_struct *vma)
 {
@@ -25,8 +26,20 @@ void qda_gem_free_object(struct drm_gem_object *gem_obj)
 	struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(gem_obj);
 	struct qda_dev *qdev = qda_dev_from_drm(gem_obj->dev);
 
-	if (qda_gem_obj->virt && qdev->iommu_mgr)
-		qda_memory_manager_free(qdev->iommu_mgr, qda_gem_obj);
+	if (qda_gem_obj->is_imported) {
+		if (qda_gem_obj->attachment && qda_gem_obj->sgt)
+			dma_buf_unmap_attachment_unlocked(qda_gem_obj->attachment,
+							  qda_gem_obj->sgt, DMA_BIDIRECTIONAL);
+		if (qda_gem_obj->attachment)
+			dma_buf_detach(qda_gem_obj->dma_buf, qda_gem_obj->attachment);
+		if (qda_gem_obj->dma_buf)
+			dma_buf_put(qda_gem_obj->dma_buf);
+		if (qda_gem_obj->iommu_dev && qdev->iommu_mgr)
+			qda_memory_manager_free(qdev->iommu_mgr, qda_gem_obj);
+	} else {
+		if (qda_gem_obj->virt && qdev->iommu_mgr)
+			qda_memory_manager_free(qdev->iommu_mgr, qda_gem_obj);
+	}
 
 	drm_gem_object_release(gem_obj);
 	kfree(qda_gem_obj);
@@ -44,6 +57,10 @@ int qda_gem_mmap_obj(struct drm_gem_object *drm_obj, struct vm_area_struct *vma)
 	struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(drm_obj);
 	int ret;
 
+	/* Imported dma-buf objects must be mmap'd through the exporter, not the importer */
+	if (qda_gem_obj->is_imported)
+		return -EINVAL;
+
 	/* Reset vm_pgoff for DMA mmap */
 	vma->vm_pgoff = 0;
 
@@ -143,6 +160,10 @@ struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
 	qda_gem_obj = qda_gem_alloc_object(drm_dev, aligned_size);
 	if (IS_ERR(qda_gem_obj))
 		return ERR_CAST(qda_gem_obj);
+	qda_gem_obj->is_imported = false;
+	qda_gem_obj->dma_buf = NULL;
+	qda_gem_obj->attachment = NULL;
+	qda_gem_obj->sgt = NULL;
 
 	ret = qda_memory_manager_alloc(iommu_mgr, qda_gem_obj, file_priv);
 	if (ret) {
diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h
index bb18f8155aa4..0878f57715f6 100644
--- a/drivers/accel/qda/qda_gem.h
+++ b/drivers/accel/qda/qda_gem.h
@@ -22,12 +22,20 @@ struct qda_gem_obj {
 	struct drm_gem_object base;
 	/** @iommu_dev: IOMMU context bank device that performed the allocation */
 	struct qda_iommu_device *iommu_dev;
+	/** @dma_buf: Reference to imported dma_buf */
+	struct dma_buf *dma_buf;
+	/** @attachment: DMA buf attachment */
+	struct dma_buf_attachment *attachment;
+	/** @sgt: Scatter-gather table */
+	struct sg_table *sgt;
 	/** @virt: Kernel virtual address of the allocated DMA memory */
 	void *virt;
 	/** @dma_addr: DMA address (with SID encoded in upper 32 bits) */
 	dma_addr_t dma_addr;
 	/** @size: Size of the buffer in bytes */
 	size_t size;
+	/** @is_imported: True if buffer is imported, false if allocated */
+	bool is_imported;
 };
 
 /**
diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
index 82111275f420..d2aa0e0e65f5 100644
--- a/drivers/accel/qda/qda_memory_manager.c
+++ b/drivers/accel/qda/qda_memory_manager.c
@@ -202,6 +202,41 @@ static struct qda_iommu_device *get_or_assign_iommu_device(struct qda_memory_man
 	return NULL;
 }
 
+static int qda_memory_manager_map_imported(struct qda_memory_manager *mem_mgr,
+					   struct qda_gem_obj *gem_obj,
+					   struct qda_iommu_device *iommu_dev)
+{
+	struct scatterlist *sg;
+	dma_addr_t dma_addr;
+
+	if (!gem_obj->is_imported || !gem_obj->sgt || !iommu_dev) {
+		drm_err(gem_obj->base.dev, "Invalid parameters for imported buffer mapping\n");
+		return -EINVAL;
+	}
+
+	sg = gem_obj->sgt->sgl;
+	if (!sg) {
+		drm_err(gem_obj->base.dev, "Invalid scatter-gather list for imported buffer\n");
+		return -EINVAL;
+	}
+
+	gem_obj->iommu_dev = iommu_dev;
+
+	/*
+	 * After dma_buf_map_attachment_unlocked(), sg_dma_address() returns the
+	 * IOMMU virtual address, not the physical address. The IOMMU maps the
+	 * entire buffer as a contiguous range in the IOMMU address space even if
+	 * the underlying physical memory is non-contiguous. Therefore the first
+	 * sg entry's DMA address is the start of the complete contiguous
+	 * IOMMU-mapped range and is sufficient to describe the buffer to the DSP.
+	 */
+	dma_addr = sg_dma_address(sg);
+	dma_addr += ((u64)iommu_dev->sid << 32);
+	gem_obj->dma_addr = dma_addr;
That handling here is completely broken since it assumes that the exporter maps the buffer as a contiguous range.
But that's in no way guaranteed.
Regards, Christian.
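To make the objection above concrete, here is a sketch (plain user-space C, not kernel code) of the condition that would have to hold for the first segment's DMA address to describe the whole buffer. struct dma_seg and dma_segs_contiguous() are hypothetical stand-ins; in the driver the equivalent walk would presumably use for_each_sgtable_dma_sg() together with sg_dma_address() and sg_dma_len().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct dma_seg {
	uint64_t dma_addr;	/* device-visible start address */
	uint32_t dma_len;	/* length of the segment in bytes */
};

/*
 * Return true only if the mapping is non-empty and every segment starts
 * exactly where the previous one ends, i.e. the device sees one
 * contiguous range beginning at segs[0].dma_addr.
 */
static bool dma_segs_contiguous(const struct dma_seg *segs, size_t nsegs)
{
	size_t i;

	assert(segs || nsegs == 0);

	for (i = 1; i < nsegs; i++) {
		if (segs[i].dma_addr != segs[i - 1].dma_addr + segs[i - 1].dma_len)
			return false;
	}

	return nsegs > 0;
}
```

An importer that hands a single base address to the DSP would need to reject, or handle segment by segment, any mapping for which such a check fails, rather than assume it always passes.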
+
+	return 0;
+}
+
 /**
  * qda_memory_manager_alloc() - Allocate memory for a GEM object
  * @mem_mgr: Pointer to memory manager
@@ -237,7 +272,11 @@ int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_
 		return -ENOMEM;
 	}
 
-	ret = qda_dma_alloc(selected_dev, gem_obj, size);
+	if (gem_obj->is_imported)
+		ret = qda_memory_manager_map_imported(mem_mgr, gem_obj, selected_dev);
+	else
+		ret = qda_dma_alloc(selected_dev, gem_obj, size);
+
 	if (ret) {
 		drm_err(gem_obj->base.dev, "Allocation failed: size=%zu, device_id=%u, ret=%d\n",
 			size, selected_dev->id, ret);
@@ -262,6 +301,12 @@ void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_
 		return;
 	}
 
+	if (gem_obj->is_imported) {
+		drm_dbg_driver(gem_obj->base.dev,
+			       "Freed imported buffer tracking (no DMA free needed)\n");
+		return;
+	}
+
 	qda_dma_free(gem_obj);
 }
 
diff --git a/drivers/accel/qda/qda_prime.c b/drivers/accel/qda/qda_prime.c
new file mode 100644
index 000000000000..acb0ac8c40fd
--- /dev/null
+++ b/drivers/accel/qda/qda_prime.c
@@ -0,0 +1,184 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <drm/drm_gem.h>
+#include <drm/drm_prime.h>
+#include <drm/drm_print.h>
+#include <linux/slab.h>
+#include <linux/dma-mapping.h>
+#include "qda_drv.h"
+#include "qda_gem.h"
+#include "qda_prime.h"
+#include "qda_memory_manager.h"
+
+static struct drm_gem_object *check_own_buffer(struct drm_device *dev, struct dma_buf *dma_buf)
+{
+	struct drm_gem_object *existing_gem;
+
+	/* Only safe to access priv if this dma-buf was exported by this device */
+	if (!drm_gem_is_prime_exported_dma_buf(dev, dma_buf))
+		return NULL;
+
+	existing_gem = dma_buf->priv;
+	if (existing_gem->dev != dev)
+		return NULL;
+
+	if (to_qda_gem_obj(existing_gem)->is_imported)
+		return NULL;
+
+	drm_gem_object_get(existing_gem);
+	return existing_gem;
+}
+
+static struct qda_iommu_device *get_iommu_device_for_import(struct qda_dev *qdev,
+							    struct drm_file **file_priv_out)
+{
+	struct drm_file *file_priv;
+	struct qda_file_priv *qda_file_priv;
+	struct qda_iommu_device *iommu_dev = NULL;
+	int ret;
+
+	file_priv = qdev->current_import_file_priv;
+	*file_priv_out = file_priv;
+
+	if (!file_priv || !file_priv->driver_priv)
+		return NULL;
+
+	qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv;
+	iommu_dev = qda_file_priv->assigned_iommu_dev;
+
+	if (!iommu_dev) {
+		ret = qda_memory_manager_assign_device(qdev->iommu_mgr, file_priv);
+		if (ret) {
+			drm_err(&qdev->drm_dev, "Failed to assign IOMMU device: %d\n", ret);
+			return NULL;
+		}
+
+		iommu_dev = qda_file_priv->assigned_iommu_dev;
+	}
+
+	return iommu_dev;
+}
+
+static int setup_dma_buf_mapping(struct qda_gem_obj *qda_gem_obj, struct dma_buf *dma_buf,
+				 struct device *attach_dev, struct qda_dev *qdev)
+{
+	struct dma_buf_attachment *attachment;
+	struct sg_table *sgt;
+	int ret;
+
+	attachment = dma_buf_attach(dma_buf, attach_dev);
+	if (IS_ERR(attachment)) {
+		ret = PTR_ERR(attachment);
+		drm_err(&qdev->drm_dev, "Failed to attach dma_buf: %d\n", ret);
+		return ret;
+	}
+	qda_gem_obj->attachment = attachment;
+
+	sgt = dma_buf_map_attachment_unlocked(attachment, DMA_BIDIRECTIONAL);
+	if (IS_ERR(sgt)) {
+		ret = PTR_ERR(sgt);
+		drm_err(&qdev->drm_dev, "Failed to map dma_buf attachment: %d\n", ret);
+		dma_buf_detach(dma_buf, attachment);
+		return ret;
+	}
+	qda_gem_obj->sgt = sgt;
+
+	return 0;
+}
+
+/**
+ * qda_gem_prime_import() - Import a DMA-BUF as a GEM object
+ * @dev: DRM device structure
+ * @dma_buf: DMA-BUF to import
+ *
+ * Return: Pointer to the imported GEM object on success, ERR_PTR on failure
+ */
+struct drm_gem_object *qda_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf)
+{
+	struct qda_dev *qdev = qda_dev_from_drm(dev);
+	struct qda_gem_obj *qda_gem_obj;
+	struct drm_file *file_priv;
+	struct qda_iommu_device *iommu_dev;
+	struct drm_gem_object *existing_gem;
+	size_t aligned_size;
+	int ret;
+
+	if (!qdev->iommu_mgr) {
+		drm_err(dev, "Invalid iommu_mgr\n");
+		return ERR_PTR(-ENODEV);
+	}
+
+	existing_gem = check_own_buffer(dev, dma_buf);
+	if (existing_gem)
+		return existing_gem;
+
+	iommu_dev = get_iommu_device_for_import(qdev, &file_priv);
+	if (!iommu_dev || !iommu_dev->dev) {
+		drm_err(dev, "No IOMMU device assigned for prime import\n");
+		return ERR_PTR(-ENODEV);
+	}
+
+	drm_dbg_driver(dev, "Using IOMMU device %u for prime import\n", iommu_dev->id);
+
+	aligned_size = PAGE_ALIGN(dma_buf->size);
+
+	qda_gem_obj = qda_gem_alloc_object(dev, aligned_size);
+	if (IS_ERR(qda_gem_obj))
+		return ERR_CAST(qda_gem_obj);
+
+	qda_gem_obj->is_imported = true;
+	qda_gem_obj->dma_buf = dma_buf;
+	qda_gem_obj->virt = NULL;
+	qda_gem_obj->iommu_dev = iommu_dev;
+
+	get_dma_buf(dma_buf);
+
+	ret = setup_dma_buf_mapping(qda_gem_obj, dma_buf, iommu_dev->dev, qdev);
+	if (ret)
+		goto err_put_dma_buf;
+
+	ret = qda_memory_manager_alloc(qdev->iommu_mgr, qda_gem_obj, file_priv);
+	if (ret) {
+		drm_err(dev, "Failed to allocate IOMMU mapping: %d\n", ret);
+		goto err_unmap;
+	}
+
+	drm_dbg_driver(dev, "Prime import completed successfully size=%zu\n", aligned_size);
+
+	return &qda_gem_obj->base;
+
+err_unmap:
+	dma_buf_unmap_attachment_unlocked(qda_gem_obj->attachment,
+					  qda_gem_obj->sgt, DMA_BIDIRECTIONAL);
+	dma_buf_detach(dma_buf, qda_gem_obj->attachment);
+err_put_dma_buf:
+	dma_buf_put(dma_buf);
+	qda_gem_cleanup_object(qda_gem_obj);
+	return ERR_PTR(ret);
+}
+
+/**
+ * qda_prime_fd_to_handle() - Convert a PRIME fd to a GEM handle
+ * @dev: DRM device structure
+ * @file_priv: DRM file private data
+ * @prime_fd: File descriptor of the PRIME buffer
+ * @handle: Output GEM handle
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
+			   int prime_fd, u32 *handle)
+{
+	struct qda_dev *qdev = qda_dev_from_drm(dev);
+	int ret;
+
+	mutex_lock(&qdev->import_lock);
+	qdev->current_import_file_priv = file_priv;
+	ret = drm_gem_prime_fd_to_handle(dev, file_priv, prime_fd, handle);
+	qdev->current_import_file_priv = NULL;
+	mutex_unlock(&qdev->import_lock);
+
+	return ret;
+}
+
+MODULE_IMPORT_NS("DMA_BUF");
diff --git a/drivers/accel/qda/qda_prime.h b/drivers/accel/qda/qda_prime.h
new file mode 100644
index 000000000000..9b3850d54fa7
--- /dev/null
+++ b/drivers/accel/qda/qda_prime.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_PRIME_H__
+#define __QDA_PRIME_H__
+
+#include <drm/drm_device.h>
+#include <drm/drm_file.h>
+#include <drm/drm_gem.h>
+#include <linux/dma-buf.h>
+
+struct drm_gem_object *qda_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf);
+int qda_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
+			   int prime_fd, u32 *handle);
+
+#endif /* __QDA_PRIME_H__ */
-- 
2.34.1
From: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
Implement the FastRPC remote procedure call path, allowing user-space to invoke methods on the DSP via DRM_IOCTL_QDA_REMOTE_INVOKE.
qda_fastrpc.c / qda_fastrpc.h
  Implements the FastRPC protocol layer: argument marshalling (qda_fastrpc_invoke_pack), response unmarshalling (qda_fastrpc_invoke_unpack), and invocation context lifecycle management. Each invocation allocates a fastrpc_invoke_context which tracks buffer descriptors, GEM objects, and the completion used to synchronise with the DSP response.
Buffer arguments are handled in three ways:
- DMA-BUF fd: imported via PRIME, IOMMU-mapped dma_addr used
- Direct (inline): copied into the GEM-backed message buffer
- DMA handle: fd forwarded to DSP, physical page descriptor computed
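The dispatch between these three cases, as it appears in fastrpc_get_args() later in this patch, can be summarised by the following sketch. The enum and function names are hypothetical; only the conditions are taken from the patch (arguments below nbufs are buffers, the rest are handles, and zero-length buffers are skipped).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification, for illustration only. */
enum qda_arg_kind {
	QDA_ARG_SKIP,		/* zero-length buffer, nothing to map or copy */
	QDA_ARG_FD_BUFFER,	/* DMA-BUF fd, handled by process_fd_buffer() */
	QDA_ARG_DIRECT_BUFFER,	/* inline data, handled by process_direct_buffer() */
	QDA_ARG_DMA_HANDLE,	/* handle scalar, handled by process_dma_handle() */
};

static enum qda_arg_kind qda_classify_arg(int index, int nbufs, int fd, uint64_t length)
{
	assert(index >= 0 && nbufs >= 0);

	/* Scalars past the in/out buffers are DMA handles. */
	if (index >= nbufs)
		return QDA_ARG_DMA_HANDLE;

	/* Empty buffers are left with a NULL remote pointer. */
	if (!length)
		return QDA_ARG_SKIP;

	/* A positive fd selects the PRIME import path, anything else is inline. */
	return fd > 0 ? QDA_ARG_FD_BUFFER : QDA_ARG_DIRECT_BUFFER;
}
```

Note that in the patch a handle argument whose fd is not positive is not imported at all; process_dma_handle() only forwards its pointer and length.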
qda_rpmsg.c
  Implements qda_rpmsg_send_msg() which sends the wire-format fastrpc_msg (embedded as the first member of qda_msg) directly via rpmsg_send(), and qda_rpmsg_wait_for_rsp() which blocks on the context completion. The RPMsg callback dispatches responses to waiting contexts via the ctx_xa XArray.
qda_ioctl.c
  qda_ioctl_invoke() drives the full invocation lifecycle: allocate context → assign XArray ID → prepare args → allocate GEM message buffer → pack → send → wait → unpack → free.
qda_drv.h / qda_drv.c
  qda_dev gains ctx_xa (XArray for in-flight context lookup) and remote_session_id_counter (atomic counter for session IDs). qda_file_priv gains remote_session_id for per-session tracking.
include/uapi/drm/qda_accel.h
  Adds DRM_IOCTL_QDA_REMOTE_INVOKE (command 0x07; command numbers 0x03–0x06 are reserved) and the associated drm_qda_invoke_args and drm_qda_fastrpc_invoke_args structures.
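As an illustration of how the scalar word `sc` in drm_qda_invoke_args determines the number of arguments copied from user space, here is a small user-space sketch using the extraction macros from qda_fastrpc.h in this patch. make_sc_counts() and sc_nbufs() are hypothetical helpers that only fill or read the count fields; any other bits of `sc` are ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Extraction macros as defined in qda_fastrpc.h in this patch. */
#define REMOTE_SCALARS_INBUFS(sc)	(((sc) >> 16) & 0x0ff)
#define REMOTE_SCALARS_OUTBUFS(sc)	(((sc) >> 8) & 0x0ff)
#define REMOTE_SCALARS_INHANDLES(sc)	(((sc) >> 4) & 0x0f)
#define REMOTE_SCALARS_OUTHANDLES(sc)	((sc) & 0x0f)
#define REMOTE_SCALARS_LENGTH(sc)	(REMOTE_SCALARS_INBUFS(sc) + \
					 REMOTE_SCALARS_OUTBUFS(sc) + \
					 REMOTE_SCALARS_INHANDLES(sc) + \
					 REMOTE_SCALARS_OUTHANDLES(sc))

/* Hypothetical helper: pack only the four count fields into sc. */
static uint32_t make_sc_counts(uint32_t in, uint32_t out, uint32_t inh, uint32_t outh)
{
	/* Each count must fit its field: 8 bits for buffers, 4 for handles. */
	assert(in <= 0xff && out <= 0xff && inh <= 0x0f && outh <= 0x0f);
	return (in << 16) | (out << 8) | (inh << 4) | outh;
}

/* nbufs as computed in qda_fastrpc_prepare_args(): in plus out buffers. */
static uint32_t sc_nbufs(uint32_t sc)
{
	return REMOTE_SCALARS_INBUFS(sc) + REMOTE_SCALARS_OUTBUFS(sc);
}
```

For two input buffers, one output buffer and one input handle, the count fields give sc = 0x020110, so nscalars is 4 and nbufs is 3; the argument array copied by fastrpc_prepare_args_invoke() would then hold four entries.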
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
 drivers/accel/qda/Makefile      |   1 +
 drivers/accel/qda/qda_drv.c     |  17 ++
 drivers/accel/qda/qda_drv.h     |   8 +
 drivers/accel/qda/qda_fastrpc.c | 597 ++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_fastrpc.h | 271 ++++++++++++++++++
 drivers/accel/qda/qda_ioctl.c   | 104 +++++++
 drivers/accel/qda/qda_ioctl.h   |   1 +
 drivers/accel/qda/qda_rpmsg.c   | 136 ++++++++-
 drivers/accel/qda/qda_rpmsg.h   |  17 ++
 include/uapi/drm/qda_accel.h    |  39 +++
 10 files changed, 1189 insertions(+), 2 deletions(-)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile index fb092e56d7f3..2d10420cd1ec 100644 --- a/drivers/accel/qda/Makefile +++ b/drivers/accel/qda/Makefile @@ -8,6 +8,7 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o qda-y := \ qda_cb.o \ qda_drv.o \ + qda_fastrpc.o \ qda_gem.o \ qda_ioctl.o \ qda_memory_dma.o \ diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c index ef8bd573b836..704c7d3127d2 100644 --- a/drivers/accel/qda/qda_drv.c +++ b/drivers/accel/qda/qda_drv.c @@ -26,6 +26,8 @@ static int qda_open(struct drm_device *dev, struct drm_file *file)
qda_file_priv->pid = current->pid; qda_file_priv->qda_dev = qda_dev_from_drm(dev); + qda_file_priv->remote_session_id = + atomic_inc_return(&qda_file_priv->qda_dev->remote_session_id_counter); file->driver_priv = qda_file_priv;
return 0; @@ -57,6 +59,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = { DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0), DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0), DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0), + DRM_IOCTL_DEF_DRV(QDA_REMOTE_INVOKE, qda_ioctl_invoke, 0), };
static const struct drm_driver qda_drm_driver = { @@ -93,6 +96,17 @@ static void cleanup_memory_manager(struct qda_dev *qdev) } }
+static void cleanup_device_resources(struct qda_dev *qdev) +{ + xa_destroy(&qdev->ctx_xa); +} + +static void init_device_resources(struct qda_dev *qdev) +{ + atomic_set(&qdev->remote_session_id_counter, 0); + xa_init_flags(&qdev->ctx_xa, XA_FLAGS_ALLOC1); +} + static int init_memory_manager(struct qda_dev *qdev) { qdev->iommu_mgr = kzalloc_obj(*qdev->iommu_mgr); @@ -106,6 +120,7 @@ void qda_deinit_device(struct qda_dev *qdev) { mutex_destroy(&qdev->import_lock); cleanup_memory_manager(qdev); + cleanup_device_resources(qdev); }
int qda_init_device(struct qda_dev *qdev) @@ -114,10 +129,12 @@ int qda_init_device(struct qda_dev *qdev)
mutex_init(&qdev->import_lock); qdev->current_import_file_priv = NULL; + init_device_resources(qdev);
ret = init_memory_manager(qdev); if (ret) { drm_err(&qdev->drm_dev, "Failed to initialize memory manager: %d\n", ret); + cleanup_device_resources(qdev); mutex_destroy(&qdev->import_lock); }
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h index 96ce4135e2d9..420cccff42bf 100644 --- a/drivers/accel/qda/qda_drv.h +++ b/drivers/accel/qda/qda_drv.h @@ -6,10 +6,12 @@ #ifndef __QDA_DRV_H__ #define __QDA_DRV_H__
+#include <linux/atomic.h> #include <linux/device.h> #include <linux/list.h> #include <linux/rpmsg.h> #include <linux/types.h> +#include <linux/xarray.h> #include <drm/drm_device.h> #include <drm/drm_drv.h> #include <drm/drm_file.h> @@ -28,6 +30,8 @@ struct qda_file_priv { struct qda_iommu_device *assigned_iommu_dev; /** @pid: Process ID for tracking */ pid_t pid; + /** @remote_session_id: Unique session identifier */ + u32 remote_session_id; };
/** @@ -51,8 +55,12 @@ struct qda_dev { struct mutex import_lock; /** @current_import_file_priv: Current file_priv during prime import */ struct drm_file *current_import_file_priv; + /** @ctx_xa: XArray for FastRPC context management */ + struct xarray ctx_xa; /** @dsp_name: Name of the DSP domain (e.g. "cdsp", "adsp") */ const char *dsp_name; + /** @remote_session_id_counter: Atomic counter for unique session IDs */ + atomic_t remote_session_id_counter; };
/** diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c new file mode 100644 index 000000000000..0ec37175a098 --- /dev/null +++ b/drivers/accel/qda/qda_fastrpc.c @@ -0,0 +1,597 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. +#include <linux/slab.h> +#include <linux/uaccess.h> +#include <linux/sort.h> +#include <linux/completion.h> +#include <linux/dma-buf.h> +#include <drm/drm_gem.h> +#include "qda_fastrpc.h" +#include "qda_drv.h" +#include "qda_gem.h" +#include "qda_memory_manager.h" +#include "qda_prime.h" + +/** + * get_gem_obj_from_dmabuf_fd() - Import a DMA-BUF fd and return the GEM object + * @ctx: FastRPC invocation context + * @dmabuf_fd: DMA-BUF file descriptor supplied by user space + * @gem_obj: Output GEM object (caller must call drm_gem_object_put() when done) + * + * Imports the DMA-BUF fd into the QDA device via qda_prime_fd_to_handle() + * (which performs IOMMU device assignment for newly imported buffers) and + * then looks up the resulting GEM object. The caller is responsible for + * calling drm_gem_object_put() on the returned object. 
+ * + * Return: 0 on success, negative error code on failure + */ +static int get_gem_obj_from_dmabuf_fd(struct fastrpc_invoke_context *ctx, + int dmabuf_fd, + struct drm_gem_object **gem_obj) +{ + struct drm_device *dev = ctx->file_priv->minor->dev; + u32 handle; + int ret; + + ret = qda_prime_fd_to_handle(dev, ctx->file_priv, dmabuf_fd, &handle); + if (ret) + return ret; + + *gem_obj = drm_gem_object_lookup(ctx->file_priv, handle); + if (!*gem_obj) + return -ENOENT; + + return 0; +} + +static void setup_pages_from_gem_obj(struct qda_gem_obj *qda_gem_obj, + struct fastrpc_phy_page *pages) +{ + pages->addr = qda_gem_obj->dma_addr; + pages->size = qda_gem_obj->size; +} + +static u64 calculate_vma_offset(u64 user_ptr) +{ + struct vm_area_struct *vma; + u64 user_ptr_page_mask = user_ptr & PAGE_MASK; + u64 vma_offset = 0; + + mmap_read_lock(current->mm); + vma = find_vma(current->mm, user_ptr); + if (vma) + vma_offset = user_ptr_page_mask - vma->vm_start; + mmap_read_unlock(current->mm); + + return vma_offset; +} + +static u64 calculate_page_aligned_size(u64 ptr, u64 len) +{ + u64 pg_start = (ptr & PAGE_MASK) >> PAGE_SHIFT; + u64 pg_end = ((ptr + len - 1) & PAGE_MASK) >> PAGE_SHIFT; + u64 aligned_size = (pg_end - pg_start + 1) * PAGE_SIZE; + + return aligned_size; +} + +static struct fastrpc_invoke_buf *fastrpc_invoke_buf_start(union fastrpc_remote_arg *pra, int len) +{ + return (struct fastrpc_invoke_buf *)(&pra[len]); +} + +static struct fastrpc_phy_page *fastrpc_phy_page_start(struct fastrpc_invoke_buf *buf, int len) +{ + return (struct fastrpc_phy_page *)(&buf[len]); +} + +static int fastrpc_get_meta_size(struct fastrpc_invoke_context *ctx) +{ + int size = 0; + + size = (sizeof(struct fastrpc_remote_buf) + + sizeof(struct fastrpc_invoke_buf) + + sizeof(struct fastrpc_phy_page)) * ctx->nscalars + + sizeof(u64) * FASTRPC_MAX_FDLIST + + sizeof(u32) * FASTRPC_MAX_CRCLIST; + + return size; +} + +static u64 fastrpc_get_payload_size(struct fastrpc_invoke_context *ctx, int 
metalen) +{ + u64 size = 0; + int oix; + + size = ALIGN(metalen, FASTRPC_ALIGN); + + for (oix = 0; oix < ctx->nbufs; oix++) { + int i = ctx->olaps[oix].raix; + + if (ctx->args[i].fd == 0 || ctx->args[i].fd == -1) { + if (ctx->olaps[oix].offset == 0) + size = ALIGN(size, FASTRPC_ALIGN); + + size += (ctx->olaps[oix].mend - ctx->olaps[oix].mstart); + } + } + + return size; +} + +/** + * qda_fastrpc_context_free() - Free an invocation context + * @ref: Reference counter embedded in the context + * + * Called when the reference count reaches zero; releases all resources + * associated with the invocation context. + */ +void qda_fastrpc_context_free(struct kref *ref) +{ + struct fastrpc_invoke_context *ctx; + int i; + + ctx = container_of(ref, struct fastrpc_invoke_context, refcount); + if (ctx->gem_objs) { + for (i = 0; i < ctx->nscalars; ++i) { + if (ctx->gem_objs[i]) + drm_gem_object_put(ctx->gem_objs[i]); + } + kfree(ctx->gem_objs); + } + + if (ctx->msg_gem_obj) + drm_gem_object_put(&ctx->msg_gem_obj->base); + + kfree(ctx->olaps); + + kfree(ctx->args); + kfree(ctx->req); + kfree(ctx->rsp); + kfree(ctx->input_pages); + kfree(ctx->inbuf); + + kfree(ctx); +} + +#define CMP(aa, bb) ((aa) == (bb) ? 0 : (aa) < (bb) ? -1 : 1) + +static int olaps_cmp(const void *a, const void *b) +{ + struct fastrpc_buf_overlap *pa = (struct fastrpc_buf_overlap *)a; + struct fastrpc_buf_overlap *pb = (struct fastrpc_buf_overlap *)b; + /* sort with lowest starting buffer first */ + int st = CMP(pa->start, pb->start); + /* sort with highest ending buffer first */ + int ed = CMP(pb->end, pa->end); + + return st == 0 ? 
ed : st; +} + +static void fastrpc_get_buff_overlaps(struct fastrpc_invoke_context *ctx) +{ + u64 max_end = 0; + int i; + + for (i = 0; i < ctx->nbufs; ++i) { + ctx->olaps[i].start = ctx->args[i].ptr; + ctx->olaps[i].end = ctx->olaps[i].start + ctx->args[i].length; + ctx->olaps[i].raix = i; + } + + sort(ctx->olaps, ctx->nbufs, sizeof(*ctx->olaps), olaps_cmp, NULL); + + for (i = 0; i < ctx->nbufs; ++i) { + if (ctx->olaps[i].start < max_end) { + ctx->olaps[i].mstart = max_end; + ctx->olaps[i].mend = ctx->olaps[i].end; + ctx->olaps[i].offset = max_end - ctx->olaps[i].start; + + if (ctx->olaps[i].end > max_end) { + max_end = ctx->olaps[i].end; + } else { + ctx->olaps[i].mend = 0; + ctx->olaps[i].mstart = 0; + } + } else { + ctx->olaps[i].mend = ctx->olaps[i].end; + ctx->olaps[i].mstart = ctx->olaps[i].start; + ctx->olaps[i].offset = 0; + max_end = ctx->olaps[i].end; + } + } +} + +/** + * qda_fastrpc_context_alloc() - Allocate a new FastRPC invocation context + * + * Return: Pointer to allocated context, or ERR_PTR on failure + */ +struct fastrpc_invoke_context *qda_fastrpc_context_alloc(void) +{ + struct fastrpc_invoke_context *ctx = NULL; + + ctx = kzalloc_obj(*ctx); + if (!ctx) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&ctx->node); + + ctx->retval = -1; + ctx->pid = current->pid; + init_completion(&ctx->work); + ctx->msg_gem_obj = NULL; + kref_init(&ctx->refcount); + + return ctx; +} + +/* + * process_fd_buffer() - Handle an in/out buffer argument backed by a DMA-BUF fd + * + * args[i].fd is a DMA-BUF fd. We import it to obtain the GEM object and its + * IOMMU-mapped dma_addr for the physical page descriptor. The DSP uses the + * physical address directly for this buffer type; the fd is not forwarded. 
+ */ +static int process_fd_buffer(struct fastrpc_invoke_context *ctx, int i, + union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages) +{ + struct drm_gem_object *gem_obj; + struct qda_gem_obj *qda_gem_obj; + int err; + u64 len = ctx->args[i].length; + u64 vma_offset; + + err = get_gem_obj_from_dmabuf_fd(ctx, ctx->args[i].fd, &gem_obj); + if (err) + return err; + + ctx->gem_objs[i] = gem_obj; + qda_gem_obj = to_qda_gem_obj(gem_obj); + + rpra[i].buf.pv = (u64)ctx->args[i].ptr; + + pages[i].addr = qda_gem_obj->dma_addr; + + vma_offset = calculate_vma_offset(ctx->args[i].ptr); + pages[i].addr += vma_offset; + pages[i].size = calculate_page_aligned_size(ctx->args[i].ptr, len); + + return 0; +} + +static int process_direct_buffer(struct fastrpc_invoke_context *ctx, int i, int oix, + union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages, + uintptr_t *args, u64 *rlen, u64 pkt_size) +{ + int mlen; + u64 len = ctx->args[i].length; + int inbufs = ctx->inbufs; + + if (ctx->olaps[oix].offset == 0) { + *rlen -= ALIGN(*args, FASTRPC_ALIGN) - *args; + *args = ALIGN(*args, FASTRPC_ALIGN); + } + + mlen = ctx->olaps[oix].mend - ctx->olaps[oix].mstart; + + if (*rlen < mlen) + return -ENOSPC; + + rpra[i].buf.pv = *args - ctx->olaps[oix].offset; + + pages[i].addr = ctx->msg->phys - ctx->olaps[oix].offset + (pkt_size - *rlen); + pages[i].addr = pages[i].addr & PAGE_MASK; + pages[i].size = calculate_page_aligned_size(rpra[i].buf.pv, len); + + *args = *args + mlen; + *rlen -= mlen; + + if (i < inbufs) { + void *dst = (void *)(uintptr_t)rpra[i].buf.pv; + void *src = (void *)(uintptr_t)ctx->args[i].ptr; + + /* + * For user-space invocations (INVOKE_DYNAMIC), ptr is a user + * virtual address and must be copied safely. For all other + * (kernel-internal) invocations, ptr is a kernel address set + * by the driver itself and can be copied directly. 
+ */ + if (ctx->type == FASTRPC_RMID_INVOKE_DYNAMIC) { + if (copy_from_user(dst, (void __user *)src, len)) + return -EFAULT; + } else { + memcpy(dst, src, len); + } + } + + return 0; +} + +/* + * process_dma_handle() - Handle a DMA-handle scalar argument + * + * args[i].fd is a DMA-BUF fd. We import it to get the physical page + * descriptor for the kernel, but forward the original DMA-BUF fd to the + * DSP in rpra[i].dma.fd so the DSP can identify the buffer by its fd. + */ +static int process_dma_handle(struct fastrpc_invoke_context *ctx, int i, + union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages) +{ + if (ctx->args[i].fd > 0) { + struct drm_gem_object *gem_obj; + struct qda_gem_obj *qda_gem_obj; + int err; + + err = get_gem_obj_from_dmabuf_fd(ctx, ctx->args[i].fd, &gem_obj); + if (err) + return err; + + ctx->gem_objs[i] = gem_obj; + qda_gem_obj = to_qda_gem_obj(gem_obj); + + setup_pages_from_gem_obj(qda_gem_obj, &pages[i]); + + /* Forward the original DMA-BUF fd to the DSP */ + rpra[i].dma.fd = ctx->args[i].fd; + rpra[i].dma.len = ctx->args[i].length; + rpra[i].dma.offset = (u64)ctx->args[i].ptr; + } else { + rpra[i].buf.pv = ctx->args[i].ptr; + rpra[i].buf.len = ctx->args[i].length; + } + + return 0; +} + +/** + * qda_fastrpc_get_header_size() - Compute the FastRPC message header size + * @ctx: FastRPC invocation context + * @out_size: Pointer to store the aligned packet size in bytes + * + * Return: 0 on success, negative error code on failure + */ +int qda_fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size) +{ + ctx->inbufs = REMOTE_SCALARS_INBUFS(ctx->sc); + ctx->metalen = fastrpc_get_meta_size(ctx); + ctx->pkt_size = fastrpc_get_payload_size(ctx, ctx->metalen); + + ctx->aligned_pkt_size = PAGE_ALIGN(ctx->pkt_size); + if (ctx->aligned_pkt_size == 0) + return -EINVAL; + + *out_size = ctx->aligned_pkt_size; + return 0; +} + +static int fastrpc_get_args(struct fastrpc_invoke_context *ctx) +{ + union fastrpc_remote_arg 
*rpra;
+	struct fastrpc_invoke_buf *list;
+	struct fastrpc_phy_page *pages;
+	int i, oix, err = 0;
+	u64 rlen;
+	uintptr_t args;
+	size_t hdr_size;
+
+	ctx->inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
+	err = qda_fastrpc_get_header_size(ctx, &hdr_size);
+	if (err)
+		return err;
+
+	ctx->msg->buf = ctx->msg_gem_obj->virt;
+	ctx->msg->phys = ctx->msg_gem_obj->dma_addr;
+
+	memset(ctx->msg->buf, 0, ctx->aligned_pkt_size);
+
+	rpra = (union fastrpc_remote_arg *)ctx->msg->buf;
+	ctx->list = fastrpc_invoke_buf_start(rpra, ctx->nscalars);
+	ctx->pages = fastrpc_phy_page_start(ctx->list, ctx->nscalars);
+	list = ctx->list;
+	pages = ctx->pages;
+	args = (uintptr_t)ctx->msg->buf + ctx->metalen;
+	rlen = ctx->pkt_size - ctx->metalen;
+	ctx->rpra = rpra;
+
+	for (oix = 0; oix < ctx->nbufs; ++oix) {
+		i = ctx->olaps[oix].raix;
+
+		rpra[i].buf.pv = 0;
+		rpra[i].buf.len = ctx->args[i].length;
+		list[i].num = ctx->args[i].length ? 1 : 0;
+		list[i].pgidx = i;
+
+		if (!ctx->args[i].length)
+			continue;
+
+		if (ctx->args[i].fd > 0)
+			err = process_fd_buffer(ctx, i, rpra, pages);
+		else
+			err = process_direct_buffer(ctx, i, oix, rpra, pages, &args, &rlen,
+						    ctx->pkt_size);
+
+		if (err)
+			goto bail_gem;
+	}
+
+	for (i = ctx->nbufs; i < ctx->nscalars; ++i) {
+		list[i].num = ctx->args[i].length ? 1 : 0;
+		list[i].pgidx = i;
+
+		err = process_dma_handle(ctx, i, rpra, pages);
+		if (err)
+			goto bail_gem;
+	}
+
+	return 0;
+
+bail_gem:
+	if (ctx->msg_gem_obj) {
+		drm_gem_object_put(&ctx->msg_gem_obj->base);
+		ctx->msg_gem_obj = NULL;
+	}
+
+	return err;
+}
+
+static int fastrpc_put_args(struct fastrpc_invoke_context *ctx, struct qda_msg *msg)
+{
+	union fastrpc_remote_arg *rpra;
+	int i, err = 0;
+
+	if (!ctx)
+		return -EINVAL;
+
+	rpra = ctx->rpra;
+	if (!rpra)
+		return -EINVAL;
+
+	for (i = ctx->inbufs; i < ctx->nbufs; ++i) {
+		if (ctx->args[i].fd <= 0) {
+			void *src = (void *)(uintptr_t)rpra[i].buf.pv;
+			void *dst = (void *)(uintptr_t)ctx->args[i].ptr;
+			u64 len = rpra[i].buf.len;
+
+			if (ctx->type == FASTRPC_RMID_INVOKE_DYNAMIC)
+				err = copy_to_user((void __user *)dst, src, len) ? -EFAULT : 0;
+			else
+				memcpy(dst, src, len);
+			if (err)
+				break;
+		}
+	}
+
+	return err;
+}
+
+/**
+ * qda_fastrpc_invoke_pack() - Pack an invocation context into a QDA message
+ * @ctx: FastRPC invocation context
+ * @msg: QDA message structure to pack into
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_fastrpc_invoke_pack(struct fastrpc_invoke_context *ctx,
+			    struct qda_msg *msg)
+{
+	int err = 0;
+
+	if (ctx->handle == FASTRPC_INIT_HANDLE)
+		msg->fastrpc.remote_session_id = 0;
+	else
+		msg->fastrpc.remote_session_id = ctx->remote_session_id;
+
+	ctx->msg = msg;
+
+	err = fastrpc_get_args(ctx);
+	if (err)
+		return err;
+
+	dma_wmb();
+
+	msg->fastrpc.tid = ctx->pid;
+	msg->fastrpc.ctx = ctx->ctxid | ctx->pd;
+	msg->fastrpc.handle = ctx->handle;
+	msg->fastrpc.sc = ctx->sc;
+	msg->fastrpc.addr = ctx->msg->phys;
+	msg->fastrpc.size = roundup(ctx->pkt_size, PAGE_SIZE);
+	msg->fastrpc_ctx = ctx;
+	msg->file_priv = ctx->file_priv;
+
+	return 0;
+}
+
+/**
+ * qda_fastrpc_invoke_unpack() - Unpack a response message into an invocation context
+ * @ctx: FastRPC invocation context
+ * @msg: QDA message structure to unpack from
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_fastrpc_invoke_unpack(struct fastrpc_invoke_context *ctx,
+			      struct qda_msg *msg)
+{
+	int err;
+
+	dma_rmb();
+
+	err = fastrpc_put_args(ctx, msg);
+	if (err)
+		return err;
+
+	err = ctx->retval;
+	return err;
+}
+
+static int fastrpc_prepare_args_invoke(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+	struct drm_qda_invoke_args invoke_args;
+	struct drm_qda_fastrpc_invoke_args *args = NULL;
+	u32 nscalars;
+
+	/* argp is DRM ioctl data (kernel pointer); args pointer within it is user-space */
+	memcpy(&invoke_args, argp, sizeof(invoke_args));
+
+	ctx->handle = invoke_args.handle;
+	ctx->sc = invoke_args.sc;
+
+	nscalars = REMOTE_SCALARS_LENGTH(ctx->sc);
+	if (!nscalars) {
+		ctx->args = NULL;
+		return 0;
+	}
+
+	args = kcalloc(nscalars, sizeof(*args), GFP_KERNEL);
+	if (!args)
+		return -ENOMEM;
+
+	if (copy_from_user(args, u64_to_user_ptr(invoke_args.args),
+			   nscalars * sizeof(*args))) {
+		kfree(args);
+		return -EFAULT;
+	}
+
+	ctx->args = args;
+	return 0;
+}
+
+/**
+ * qda_fastrpc_prepare_args() - Prepare arguments for a FastRPC invocation
+ * @ctx: FastRPC invocation context
+ * @argp: User-space pointer to invocation arguments
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+	int err;
+
+	switch (ctx->type) {
+	case FASTRPC_RMID_INVOKE_DYNAMIC:
+		err = fastrpc_prepare_args_invoke(ctx, argp);
+		break;
+	default:
+		return -EINVAL;
+	}
+	if (err)
+		return err;
+
+	ctx->nscalars = REMOTE_SCALARS_LENGTH(ctx->sc);
+	ctx->nbufs = REMOTE_SCALARS_INBUFS(ctx->sc) + REMOTE_SCALARS_OUTBUFS(ctx->sc);
+
+	if (ctx->nscalars) {
+		ctx->gem_objs = kcalloc(ctx->nscalars, sizeof(*ctx->gem_objs), GFP_KERNEL);
+		if (!ctx->gem_objs)
+			return -ENOMEM;
+		ctx->olaps = kcalloc(ctx->nscalars, sizeof(*ctx->olaps), GFP_KERNEL);
+		if (!ctx->olaps) {
+			kfree(ctx->gem_objs);
+			ctx->gem_objs = NULL;
+			return -ENOMEM;
+		}
+		fastrpc_get_buff_overlaps(ctx);
+	}
+
+	return err;
+}
diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
new file mode 100644
index 000000000000..ce77baeccfba
--- /dev/null
+++ b/drivers/accel/qda/qda_fastrpc.h
@@ -0,0 +1,271 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_FASTRPC_H__
+#define __QDA_FASTRPC_H__
+
+#include <linux/completion.h>
+#include <linux/kref.h>
+#include <linux/list.h>
+#include <linux/types.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_file.h>
+#include <drm/qda_accel.h>
+
+/* Forward declarations */
+struct qda_gem_obj;
+
+/*
+ * FastRPC scalar extraction macros
+ *
+ * These macros extract different fields from the scalar value that describes
+ * the arguments passed in a FastRPC invocation.
+ */
+#define REMOTE_SCALARS_INBUFS(sc)	(((sc) >> 16) & 0x0ff)
+#define REMOTE_SCALARS_OUTBUFS(sc)	(((sc) >> 8) & 0x0ff)
+#define REMOTE_SCALARS_INHANDLES(sc)	(((sc) >> 4) & 0x0f)
+#define REMOTE_SCALARS_OUTHANDLES(sc)	((sc) & 0x0f)
+#define REMOTE_SCALARS_LENGTH(sc)	(REMOTE_SCALARS_INBUFS(sc) + \
+					 REMOTE_SCALARS_OUTBUFS(sc) + \
+					 REMOTE_SCALARS_INHANDLES(sc) + \
+					 REMOTE_SCALARS_OUTHANDLES(sc))
+
+/* FastRPC configuration constants */
+#define FASTRPC_ALIGN		128	/* Alignment requirement */
+#define FASTRPC_MAX_FDLIST	16	/* Maximum file descriptors */
+#define FASTRPC_MAX_CRCLIST	64	/* Maximum CRC list entries */
+
+/*
+ * FastRPC scalar construction macros
+ *
+ * These macros build the scalar value that describes the arguments
+ * for a FastRPC invocation.
+ */
+#define FASTRPC_BUILD_SCALARS(attr, method, in, out, oin, oout) \
+	(((attr & 0x07) << 29) | \
+	 ((method & 0x1f) << 24) | \
+	 ((in & 0xff) << 16) | \
+	 ((out & 0xff) << 8) | \
+	 ((oin & 0x0f) << 4) | \
+	 (oout & 0x0f))
+
+#define FASTRPC_SCALARS(method, in, out) \
+	FASTRPC_BUILD_SCALARS(0, method, in, out, 0, 0)
+
+/**
+ * struct fastrpc_buf_overlap - Buffer overlap tracking structure
+ *
+ * Tracks overlapping buffer regions to optimise memory mapping and avoid
+ * redundant mappings of the same physical memory.
+ */
+struct fastrpc_buf_overlap {
+	/** @start: Start address of the buffer in user virtual address space */
+	u64 start;
+	/** @end: End address of the buffer in user virtual address space */
+	u64 end;
+	/** @raix: Remote argument index associated with this overlap */
+	int raix;
+	/** @mstart: Start address of the mapped region */
+	u64 mstart;
+	/** @mend: End address of the mapped region */
+	u64 mend;
+	/** @offset: Offset within the mapped region */
+	u64 offset;
+};
+
+/**
+ * struct fastrpc_remote_dmahandle - Remote DMA handle descriptor
+ */
+struct fastrpc_remote_dmahandle {
+	/** @fd: DMA-BUF file descriptor */
+	s32 fd;
+	/** @offset: Byte offset within the DMA-BUF */
+	u32 offset;
+	/** @len: Length of the region in bytes */
+	u32 len;
+};
+
+/**
+ * struct fastrpc_remote_buf - Remote buffer descriptor
+ */
+struct fastrpc_remote_buf {
+	/** @pv: Buffer pointer (user virtual address) */
+	u64 pv;
+	/** @len: Length of the buffer in bytes */
+	u64 len;
+};
+
+/**
+ * union fastrpc_remote_arg - Remote argument (buffer or DMA handle)
+ */
+union fastrpc_remote_arg {
+	/** @buf: Inline buffer descriptor */
+	struct fastrpc_remote_buf buf;
+	/** @dma: DMA-BUF handle descriptor */
+	struct fastrpc_remote_dmahandle dma;
+};
+
+/**
+ * struct fastrpc_phy_page - Physical page descriptor
+ */
+struct fastrpc_phy_page {
+	/** @addr: Physical (IOMMU) address of the page */
+	u64 addr;
+	/** @size: Size of the contiguous region in bytes */
+	u64 size;
+};
+
+/**
+ * struct fastrpc_invoke_buf - Invoke buffer descriptor
+ */
+struct fastrpc_invoke_buf {
+	/** @num: Number of contiguous physical regions */
+	u32 num;
+	/** @pgidx: Index into the physical page array */
+	u32 pgidx;
+};
+
+/**
+ * struct fastrpc_msg - FastRPC wire message for remote invocations
+ *
+ * Sent to the remote processor via RPMsg. This is the exact layout
+ * the DSP expects; do not reorder or add fields without DSP firmware
+ * coordination.
+ */
+struct fastrpc_msg {
+	/** @remote_session_id: Session identifier on the remote processor */
+	int remote_session_id;
+	/** @tid: Thread ID of the invoking thread */
+	int tid;
+	/** @ctx: Context identifier for matching request/response */
+	u64 ctx;
+	/** @handle: Handle of the remote method to invoke */
+	u32 handle;
+	/** @sc: Scalars value encoding in/out buffer counts */
+	u32 sc;
+	/** @addr: Physical address of the message payload buffer */
+	u64 addr;
+	/** @size: Size of the message payload in bytes */
+	u64 size;
+};
+
+/**
+ * struct qda_msg - FastRPC message with kernel-internal bookkeeping
+ *
+ * The wire-format portion is kept in the embedded @fastrpc member (must
+ * be first) so that &qda_msg->fastrpc can be passed directly to
+ * rpmsg_send() without a copy.
+ */
+struct qda_msg {
+	/**
+	 * @fastrpc: Wire-format message sent to the DSP via RPMsg.
+	 * Must be the first member.
+	 */
+	struct fastrpc_msg fastrpc;
+	/** @buf: Kernel virtual address of the payload buffer */
+	void *buf;
+	/** @phys: Physical/DMA address of the payload buffer */
+	u64 phys;
+	/** @ret: Return value from the remote processor */
+	int ret;
+	/** @fastrpc_ctx: Back-pointer to the owning invocation context */
+	struct fastrpc_invoke_context *fastrpc_ctx;
+	/** @file_priv: DRM file private data for GEM object lookup */
+	struct drm_file *file_priv;
+};
+
+/**
+ * struct fastrpc_invoke_context - Remote procedure call invocation context
+ *
+ * Maintains all state for a single remote procedure call, including buffer
+ * management, synchronisation, and result handling.
+ */
+struct fastrpc_invoke_context {
+	/** @node: List node for linking contexts in a queue */
+	struct list_head node;
+	/** @ctxid: Unique context identifier (XArray key shifted left by 4) */
+	u64 ctxid;
+	/** @inbufs: Number of input buffers */
+	int inbufs;
+	/** @outbufs: Number of output buffers */
+	int outbufs;
+	/** @handles: Number of DMA-BUF handle arguments */
+	int handles;
+	/** @nscalars: Total number of scalar arguments */
+	int nscalars;
+	/** @nbufs: Total number of buffer arguments (inbufs + outbufs) */
+	int nbufs;
+	/** @pid: Process ID of the calling process */
+	int pid;
+	/** @retval: Return value from the remote invocation */
+	int retval;
+	/** @metalen: Length of the FastRPC metadata header in bytes */
+	int metalen;
+	/** @remote_session_id: Session identifier on the remote processor */
+	int remote_session_id;
+	/** @pd: Protection domain identifier encoded into the context ID */
+	int pd;
+	/** @type: Invocation type (e.g. FASTRPC_RMID_INVOKE_DYNAMIC) */
+	int type;
+	/** @sc: Scalars value encoding in/out buffer counts */
+	u32 sc;
+	/** @handle: Handle of the remote method being invoked */
+	u32 handle;
+	/** @crc: Pointer to CRC values for data integrity checking */
+	u32 *crc;
+	/** @fdlist: Pointer to array of DMA-BUF file descriptors */
+	u64 *fdlist;
+	/** @pkt_size: Total payload size in bytes */
+	u64 pkt_size;
+	/** @aligned_pkt_size: Page-aligned payload size for GEM allocation */
+	u64 aligned_pkt_size;
+	/** @list: Array of invoke buffer descriptors */
+	struct fastrpc_invoke_buf *list;
+	/** @pages: Array of physical page descriptors for all arguments */
+	struct fastrpc_phy_page *pages;
+	/** @input_pages: Array of physical page descriptors for input buffers */
+	struct fastrpc_phy_page *input_pages;
+	/** @work: Completion used to synchronise with the DSP response */
+	struct completion work;
+	/** @msg: Pointer to the QDA message structure for this invocation */
+	struct qda_msg *msg;
+	/** @rpra: Array of remote procedure arguments */
+	union fastrpc_remote_arg *rpra;
+	/** @gem_objs: Array of GEM objects imported for argument buffers */
+	struct drm_gem_object **gem_objs;
+	/** @args: User-space invoke argument descriptors */
+	struct drm_qda_fastrpc_invoke_args *args;
+	/** @olaps: Array of buffer overlap descriptors for deduplication */
+	struct fastrpc_buf_overlap *olaps;
+	/** @refcount: Reference counter for context lifetime management */
+	struct kref refcount;
+	/** @msg_gem_obj: GEM object backing the message payload buffer */
+	struct qda_gem_obj *msg_gem_obj;
+	/** @file_priv: DRM file private data */
+	struct drm_file *file_priv;
+	/** @init_mem_gem_obj: GEM object for protection domain init memory */
+	struct qda_gem_obj *init_mem_gem_obj;
+	/** @req: Pointer to kernel-internal request buffer */
+	void *req;
+	/** @rsp: Pointer to kernel-internal response buffer */
+	void *rsp;
+	/** @inbuf: Pointer to kernel-internal input buffer */
+	void *inbuf;
+};
+
+/* Remote Method ID table - identifies initialization and control operations */
+#define FASTRPC_RMID_INVOKE_DYNAMIC	0xFFFFFFFF	/* Dynamic method invocation */
+
+/* Common handle for initialization operations */
+#define FASTRPC_INIT_HANDLE	0x1
+
+void qda_fastrpc_context_free(struct kref *ref);
+struct fastrpc_invoke_context *qda_fastrpc_context_alloc(void);
+int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp);
+int qda_fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size);
+int qda_fastrpc_invoke_pack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
+int qda_fastrpc_invoke_unpack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
+
+#endif /* __QDA_FASTRPC_H__ */
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index 1769c85a3e98..c81268c20b04 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -3,8 +3,10 @@
 #include <drm/drm_ioctl.h>
 #include <drm/qda_accel.h>
 #include "qda_drv.h"
+#include "qda_fastrpc.h"
 #include "qda_gem.h"
 #include "qda_ioctl.h"
+#include "qda_rpmsg.h"
 
 /**
  * qda_ioctl_query() - Query DSP device information
@@ -74,3 +76,105 @@ int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_fil
 
 	return drm_gem_dumb_map_offset(file_priv, dev, args->handle, &args->offset);
 }
+
+static int fastrpc_context_get_id(struct fastrpc_invoke_context *ctx, struct qda_dev *qdev)
+{
+	int ret;
+	u32 id;
+
+	if (!qdev)
+		return -EINVAL;
+
+	ret = xa_alloc(&qdev->ctx_xa, &id, ctx, xa_limit_32b, GFP_KERNEL);
+	if (ret)
+		return ret;
+
+	ctx->ctxid = id << 4;
+	return 0;
+}
+
+static void fastrpc_context_put_id(struct fastrpc_invoke_context *ctx, struct qda_dev *qdev)
+{
+	if (qdev)
+		xa_erase(&qdev->ctx_xa, ctx->ctxid >> 4);
+}
+
+static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
+			  struct drm_file *file_priv)
+{
+	struct qda_file_priv *qda_file_priv = file_priv->driver_priv;
+	struct qda_dev *qdev = qda_file_priv->qda_dev;
+	struct qda_msg msg;
+	struct fastrpc_invoke_context *ctx;
+	struct drm_gem_object *gem_obj;
+	int err;
+	size_t hdr_size;
+
+	ctx = qda_fastrpc_context_alloc();
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	err = fastrpc_context_get_id(ctx, qdev);
+	if (err) {
+		kref_put(&ctx->refcount, qda_fastrpc_context_free);
+		return err;
+	}
+
+	ctx->type = type;
+	ctx->file_priv = file_priv;
+	ctx->remote_session_id = qda_file_priv->remote_session_id;
+
+	err = qda_fastrpc_prepare_args(ctx, (char __user *)data);
+	if (err)
+		goto err_context_free;
+
+	err = qda_fastrpc_get_header_size(ctx, &hdr_size);
+	if (err)
+		goto err_context_free;
+
+	gem_obj = qda_gem_create_object(dev, qdev->iommu_mgr, hdr_size, file_priv);
+	if (IS_ERR(gem_obj)) {
+		err = PTR_ERR(gem_obj);
+		goto err_context_free;
+	}
+
+	ctx->msg_gem_obj = to_qda_gem_obj(gem_obj);
+
+	err = qda_fastrpc_invoke_pack(ctx, &msg);
+	if (err)
+		goto err_context_free;
+
+	err = qda_rpmsg_send_msg(qdev, &msg);
+	if (err)
+		goto err_context_free;
+
+	err = qda_rpmsg_wait_for_rsp(ctx);
+	if (err)
+		goto err_context_free;
+
+	err = qda_fastrpc_invoke_unpack(ctx, &msg);
+	if (err)
+		goto err_context_free;
+
+	fastrpc_context_put_id(ctx, qdev);
+	kref_put(&ctx->refcount, qda_fastrpc_context_free);
+	return 0;
+
+err_context_free:
+	fastrpc_context_put_id(ctx, qdev);
+	kref_put(&ctx->refcount, qda_fastrpc_context_free);
+	return err;
+}
+
+/**
+ * qda_ioctl_invoke() - Perform a dynamic FastRPC method invocation
+ * @dev: DRM device structure
+ * @data: User-space data (struct qda_invoke_args)
+ * @file_priv: DRM file private data
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+	return fastrpc_invoke(FASTRPC_RMID_INVOKE_DYNAMIC, dev, data, file_priv);
+}
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index d1cbbfb6d965..3bb9cfd98370 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -11,5 +11,6 @@
 int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv);
+int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv);
 
 #endif /* __QDA_IOCTL_H__ */
diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
index 719dabb028c5..44b12a9f2808 100644
--- a/drivers/accel/qda/qda_rpmsg.c
+++ b/drivers/accel/qda/qda_rpmsg.c
@@ -1,14 +1,81 @@
 // SPDX-License-Identifier: GPL-2.0-only
 // Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <linux/completion.h>
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/rpmsg.h>
+#include <linux/sched.h>
+#include <linux/wait.h>
 #include <drm/drm_print.h>
 
 #include "qda_cb.h"
 #include "qda_drv.h"
+#include "qda_fastrpc.h"
 #include "qda_rpmsg.h"
 
+static int validate_device_availability(struct qda_dev *qdev)
+{
+	if (!qdev)
+		return -ENODEV;
+
+	if (!qdev->rpdev) {
+		drm_dbg_driver(&qdev->drm_dev, "RPMsg device unavailable: rpdev is NULL\n");
+		return -ENODEV;
+	}
+	return 0;
+}
+
+static struct fastrpc_invoke_context *get_and_validate_context(struct qda_msg *msg,
+								struct qda_dev *qdev)
+{
+	struct fastrpc_invoke_context *ctx = msg->fastrpc_ctx;
+
+	if (!ctx) {
+		drm_dbg_driver(&qdev->drm_dev, "FastRPC context not found in message\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	kref_get(&ctx->refcount);
+	return ctx;
+}
+
+static int validate_callback_params(struct qda_dev *qdev, void *data, int len)
+{
+	if (!qdev)
+		return -ENODEV;
+
+	if (len < sizeof(struct qda_invoke_rsp)) {
+		drm_dbg_driver(&qdev->drm_dev, "Invalid message size from remote: %d\n", len);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static unsigned long extract_context_id(struct qda_invoke_rsp *resp_msg)
+{
+	return resp_msg->ctx >> 4;
+}
+
+static struct fastrpc_invoke_context *find_context_by_id(struct qda_dev *qdev,
+							  unsigned long ctxid)
+{
+	struct fastrpc_invoke_context *ctx;
+
+	ctx = xa_load(&qdev->ctx_xa, ctxid);
+	if (!ctx) {
+		drm_dbg_driver(&qdev->drm_dev, "FastRPC context not found for ctxid: %lu\n", ctxid);
+		return ERR_PTR(-ENOENT);
+	}
+	return ctx;
+}
+
+static void complete_context_processing(struct fastrpc_invoke_context *ctx, int retval)
+{
+	ctx->retval = retval;
+	complete(&ctx->work);
+	kref_put(&ctx->refcount, qda_fastrpc_context_free);
+}
+
 static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
 {
 	struct qda_dev *qdev;
@@ -24,11 +91,76 @@ static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
 	return qdev;
 }
 
+int qda_rpmsg_send_msg(struct qda_dev *qdev, struct qda_msg *msg)
+{
+	int ret, idx;
+	struct fastrpc_invoke_context *ctx;
+
+	if (!qdev)
+		return -ENODEV;
+
+	if (!drm_dev_enter(&qdev->drm_dev, &idx))
+		return -ENODEV;
+
+	ret = validate_device_availability(qdev);
+	if (ret)
+		goto out_exit;
+
+	ctx = get_and_validate_context(msg, qdev);
+	if (IS_ERR(ctx)) {
+		ret = PTR_ERR(ctx);
+		goto out_exit;
+	}
+
+	ret = rpmsg_send(qdev->rpdev->ept, &msg->fastrpc, sizeof(msg->fastrpc));
+	if (ret) {
+		drm_err(&qdev->drm_dev, "rpmsg_send failed: %d\n", ret);
+		kref_put(&ctx->refcount, qda_fastrpc_context_free);
+	}
+
+out_exit:
+	drm_dev_exit(idx);
+	return ret;
+}
+
+int qda_rpmsg_wait_for_rsp(struct fastrpc_invoke_context *ctx)
+{
+	return wait_for_completion_interruptible(&ctx->work);
+}
+
 static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, void *priv, u32 src)
 {
-	/* Placeholder: responses will be dispatched here */
-	return 0;
+	struct qda_dev *qdev = dev_get_drvdata(&rpdev->dev);
+	struct qda_invoke_rsp *resp_msg = (struct qda_invoke_rsp *)data;
+	struct fastrpc_invoke_context *ctx;
+	unsigned long ctxid;
+	int ret, idx;
+
+	if (!qdev)
+		return -ENODEV;
+
+	if (!drm_dev_enter(&qdev->drm_dev, &idx))
+		return -ENODEV;
+
+	ret = validate_callback_params(qdev, data, len);
+	if (ret)
+		goto out_exit;
+
+	ctxid = extract_context_id(resp_msg);
+
+	ctx = find_context_by_id(qdev, ctxid);
+	if (IS_ERR(ctx)) {
+		ret = PTR_ERR(ctx);
+		goto out_exit;
+	}
+
+	complete_context_processing(ctx, resp_msg->retval);
+	ret = 0;
+
+out_exit:
+	drm_dev_exit(idx);
+	return ret;
 }
 
 static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
diff --git a/drivers/accel/qda/qda_rpmsg.h b/drivers/accel/qda/qda_rpmsg.h
index 5229d834b34b..bf601e915017 100644
--- a/drivers/accel/qda/qda_rpmsg.h
+++ b/drivers/accel/qda/qda_rpmsg.h
@@ -6,6 +6,23 @@
 #ifndef __QDA_RPMSG_H__
 #define __QDA_RPMSG_H__
 
+#include "qda_drv.h"
+#include "qda_fastrpc.h"
+
+/**
+ * struct qda_invoke_rsp - Response structure for FastRPC invocations
+ */
+struct qda_invoke_rsp {
+	/** @ctx: Invoke caller context for matching request/response */
+	u64 ctx;
+	/** @retval: Return value from the remote invocation */
+	int retval;
+};
+
+/* RPMsg transport layer functions */
+int qda_rpmsg_send_msg(struct qda_dev *qdev, struct qda_msg *msg);
+int qda_rpmsg_wait_for_rsp(struct fastrpc_invoke_context *ctx);
+
 /* RPMsg transport layer registration */
 int qda_rpmsg_register(void);
 void qda_rpmsg_unregister(void);
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index 319e21aae0d6..72512213741f 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -21,6 +21,8 @@ extern "C" {
 #define DRM_QDA_QUERY			0x00
 #define DRM_QDA_GEM_CREATE		0x01
 #define DRM_QDA_GEM_MMAP_OFFSET		0x02
+/* Command numbers 0x03-0x06 reserved for INIT_ATTACH, INIT_CREATE, MAP, MUNMAP */
+#define DRM_QDA_REMOTE_INVOKE		0x07
 
 /*
  * QDA IOCTL definitions
@@ -35,6 +37,8 @@ extern "C" {
 					     struct drm_qda_gem_create)
 #define DRM_IOCTL_QDA_GEM_MMAP_OFFSET	DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
 						 struct drm_qda_gem_mmap_offset)
+#define DRM_IOCTL_QDA_REMOTE_INVOKE	DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_INVOKE, \
+						 struct drm_qda_invoke_args)
 
 /**
  * struct drm_qda_query - Device information query structure
@@ -78,6 +82,41 @@ struct drm_qda_gem_mmap_offset {
 	__u32 pad;
 };
 
+/**
+ * struct drm_qda_fastrpc_invoke_args - FastRPC invocation argument descriptor
+ * @ptr: Pointer to argument data (user virtual address)
+ * @length: Length of the argument data in bytes
+ * @fd: DMA-BUF file descriptor for buffer arguments, -1/0 for scalar arguments
+ * @attr: Argument attributes and flags
+ *
+ * This structure describes a single argument passed to a FastRPC invocation.
+ * Arguments can be either scalar values or buffer references (via DMA-BUF fd).
+ */
+struct drm_qda_fastrpc_invoke_args {
+	__u64 ptr;
+	__u64 length;
+	__s32 fd;
+	__u32 attr;
+};
+
+/**
+ * struct drm_qda_invoke_args - Dynamic FastRPC invocation parameters
+ * @handle: Remote handle to invoke on the DSP
+ * @sc: FastRPC scalars value encoding the number of in/out buffers
+ * @args: User-space pointer to array of drm_qda_fastrpc_invoke_args descriptors;
+ *        the fd field in each entry must be a DMA-BUF fd (or -1/0 for
+ *        inline scalar buffers)
+ *
+ * This structure is used with DRM_IOCTL_QDA_REMOTE_INVOKE to perform a
+ * dynamic remote procedure call on the DSP. The args pointer must reference
+ * an array of REMOTE_SCALARS_LENGTH(sc) drm_qda_fastrpc_invoke_args entries.
+ */
+struct drm_qda_invoke_args {
+	__u32 handle;
+	__u32 sc;
+	__u64 args;
+};
+
 #if defined(__cplusplus)
 }
 #endif
On Tue, May 19, 2026 at 11:46:02AM +0530, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
Implement the FastRPC remote procedure call path, allowing user-space to invoke methods on the DSP via DRM_IOCTL_QDA_REMOTE_INVOKE.
qda_fastrpc.c / qda_fastrpc.h
  Implements the FastRPC protocol layer: argument marshalling
  (qda_fastrpc_invoke_pack), response unmarshalling
  (qda_fastrpc_invoke_unpack), and invocation context lifecycle
  management. Each invocation allocates a fastrpc_invoke_context which
  tracks buffer descriptors, GEM objects, and the completion used to
  synchronise with the DSP response.
Buffer arguments are handled in three ways:
- DMA-BUF fd: imported via PRIME, IOMMU-mapped dma_addr used
- Direct (inline): copied into the GEM-backed message buffer
- DMA handle: fd forwarded to DSP, physical page descriptor computed
No. This needs to go away. The QDA should support only one way to pass data - via the GEM buffers. Everything else should be handled by the shim layer, etc.
qda_rpmsg.c
  Implements qda_rpmsg_send_msg() which sends the wire-format
  fastrpc_msg (embedded as the first member of qda_msg) directly via
  rpmsg_send(), and qda_rpmsg_wait_for_rsp() which blocks on the
  context completion. The RPMsg callback dispatches responses to
  waiting contexts via the ctx_xa XArray.
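The copy-free send described in this paragraph depends on the wire header sitting at offset 0 of the bookkeeping structure. A minimal user-space sketch of how that layout assumption could be checked at compile time follows. The `example_*` names are hypothetical stand-ins for fastrpc_msg and qda_msg, fixed-width types replace the kernel ones, and only a subset of the qda_msg fields is mirrored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the wire-format header (fields as in fastrpc_msg). */
struct example_wire_msg {
	int32_t remote_session_id;
	int32_t tid;
	uint64_t ctx;
	uint32_t handle;
	uint32_t sc;
	uint64_t addr;
	uint64_t size;
};

/* Hypothetical mirror of the kernel-internal wrapper (subset of qda_msg). */
struct example_msg {
	struct example_wire_msg wire;	/* must stay first */
	void *buf;
	uint64_t phys;
	int ret;
};

/* Fails the build if the wire header is moved away from offset 0. */
static_assert(offsetof(struct example_msg, wire) == 0,
	      "wire header must be the first member");

/* Pointer that a transport send call would be given. */
static inline const void *example_wire_ptr(const struct example_msg *m)
{
	return &m->wire;
}

/* Length that a transport send call would be given. */
static inline size_t example_wire_len(void)
{
	return sizeof(struct example_wire_msg);
}
```

With this layout, only the wire header length is handed to the transport, so the trailing bookkeeping fields never leave the kernel.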
qda_ioctl.c
  qda_ioctl_invoke() drives the full invocation lifecycle:
  allocate context → assign XArray ID → prepare args → allocate GEM
  message buffer → pack → send → wait → unpack → free.
qda_drv.h / qda_drv.c
  qda_dev gains ctx_xa (XArray for in-flight context lookup) and
  remote_session_id_counter (atomic counter for session IDs).
  qda_file_priv gains remote_session_id for per-session tracking.
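To illustrate how an XArray index can double as the request/response cookie, the sketch below packs an allocated ID into the upper bits of the 64-bit ctx field and leaves the low four bits for the protection domain, mirroring `ctxid = id << 4`, `ctx = ctxid | pd` and `ctx >> 4` in the patch. The `example_*` helpers are hypothetical. The 0xf mask on pd and the widening to 64 bits before the shift are assumptions of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PD_BITS 4
#define EXAMPLE_PD_MASK ((1u << EXAMPLE_PD_BITS) - 1)

/* Combine the allocated ID and the protection domain into one cookie. */
static inline uint64_t example_ctx_encode(uint32_t id, uint32_t pd)
{
	/* Widen first so that large IDs keep their top bits (sketch assumption). */
	return ((uint64_t)id << EXAMPLE_PD_BITS) | (pd & EXAMPLE_PD_MASK);
}

/* Recover the lookup key from the cookie echoed back in a response. */
static inline uint32_t example_ctx_to_id(uint64_t ctx)
{
	return (uint32_t)(ctx >> EXAMPLE_PD_BITS);
}

/* Recover the protection domain bits from the cookie. */
static inline uint32_t example_ctx_to_pd(uint64_t ctx)
{
	return (uint32_t)(ctx & EXAMPLE_PD_MASK);
}
```

A response handler would then use the value returned by `example_ctx_to_id()` as the key for the in-flight context lookup.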
include/uapi/drm/qda_accel.h
  Adds DRM_IOCTL_QDA_REMOTE_INVOKE (command 0x07; command numbers
  0x03–0x06 are reserved) and the associated drm_qda_invoke_args and
  drm_qda_fastrpc_invoke_args structures.
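As a rough illustration of how a caller might fill the new uAPI structures, the sketch below builds a scalars word for one input and one output buffer and points the request at a descriptor array. The `example_*` types are local mirrors written for this sketch, not the installed header, and the bit layout is copied from the FASTRPC_BUILD_SCALARS() and REMOTE_SCALARS_*() macros in qda_fastrpc.h. No ioctl is issued here.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of drm_qda_fastrpc_invoke_args (illustration only). */
struct example_invoke_arg {
	uint64_t ptr;
	uint64_t length;
	int32_t fd;
	uint32_t attr;
};

/* Local mirror of drm_qda_invoke_args (illustration only). */
struct example_invoke_args {
	uint32_t handle;
	uint32_t sc;
	uint64_t args;
};

/* Same field positions as FASTRPC_SCALARS(method, in, out). */
static inline uint32_t example_scalars(uint32_t method, uint32_t in, uint32_t out)
{
	return ((method & 0x1f) << 24) | ((in & 0xff) << 16) | ((out & 0xff) << 8);
}

/* Same sum as REMOTE_SCALARS_LENGTH(sc): buffers plus handles. */
static inline uint32_t example_scalars_length(uint32_t sc)
{
	return ((sc >> 16) & 0xff) + ((sc >> 8) & 0xff) +
	       ((sc >> 4) & 0x0f) + (sc & 0x0f);
}

/* Fill the top-level request for one input and one output buffer. */
static inline struct example_invoke_args
example_prepare(uint32_t handle, uint32_t method, struct example_invoke_arg *argv)
{
	struct example_invoke_args req = {
		.handle = handle,
		.sc = example_scalars(method, 1, 1),
		.args = (uint64_t)(uintptr_t)argv,
	};

	return req;
}
```

The descriptor array passed as `argv` needs `example_scalars_length(req.sc)` entries, which matches the REMOTE_SCALARS_LENGTH(sc) requirement stated in the uAPI comment.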
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
 drivers/accel/qda/Makefile      |   1 +
 drivers/accel/qda/qda_drv.c     |  17 ++
 drivers/accel/qda/qda_drv.h     |   8 +
 drivers/accel/qda/qda_fastrpc.c | 597 ++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_fastrpc.h | 271 ++++++++++++++++++
 drivers/accel/qda/qda_ioctl.c   | 104 +++++++
 drivers/accel/qda/qda_ioctl.h   |   1 +
 drivers/accel/qda/qda_rpmsg.c   | 136 ++++++++-
 drivers/accel/qda/qda_rpmsg.h   |  17 ++
 include/uapi/drm/qda_accel.h    |  39 +++
 10 files changed, 1189 insertions(+), 2 deletions(-)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index fb092e56d7f3..2d10420cd1ec 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
 qda-y := \
 	qda_cb.o \
 	qda_drv.o \
+	qda_fastrpc.o \
 	qda_gem.o \
 	qda_ioctl.o \
 	qda_memory_dma.o \
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index ef8bd573b836..704c7d3127d2 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -26,6 +26,8 @@ static int qda_open(struct drm_device *dev, struct drm_file *file)
 
 	qda_file_priv->pid = current->pid;
 	qda_file_priv->qda_dev = qda_dev_from_drm(dev);
+	qda_file_priv->remote_session_id =
+		atomic_inc_return(&qda_file_priv->qda_dev->remote_session_id_counter);
 	file->driver_priv = qda_file_priv;
 
 	return 0;
@@ -57,6 +59,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
 	DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
 	DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
+	DRM_IOCTL_DEF_DRV(QDA_REMOTE_INVOKE, qda_ioctl_invoke, 0),
 };
 
 static const struct drm_driver qda_drm_driver = {
@@ -93,6 +96,17 @@ static void cleanup_memory_manager(struct qda_dev *qdev)
 	}
 }
 
+static void cleanup_device_resources(struct qda_dev *qdev)
+{
+	xa_destroy(&qdev->ctx_xa);
+}
+
+static void init_device_resources(struct qda_dev *qdev)
+{
+	atomic_set(&qdev->remote_session_id_counter, 0);
+	xa_init_flags(&qdev->ctx_xa, XA_FLAGS_ALLOC1);
+}
+
 static int init_memory_manager(struct qda_dev *qdev)
 {
 	qdev->iommu_mgr = kzalloc_obj(*qdev->iommu_mgr);
@@ -106,6 +120,7 @@ void qda_deinit_device(struct qda_dev *qdev)
 {
 	mutex_destroy(&qdev->import_lock);
 	cleanup_memory_manager(qdev);
+	cleanup_device_resources(qdev);
 }
 
 int qda_init_device(struct qda_dev *qdev)
@@ -114,10 +129,12 @@ int qda_init_device(struct qda_dev *qdev)
 
 	mutex_init(&qdev->import_lock);
 	qdev->current_import_file_priv = NULL;
+	init_device_resources(qdev);
 
 	ret = init_memory_manager(qdev);
 	if (ret) {
 		drm_err(&qdev->drm_dev, "Failed to initialize memory manager: %d\n", ret);
+		cleanup_device_resources(qdev);
 		mutex_destroy(&qdev->import_lock);
 	}
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index 96ce4135e2d9..420cccff42bf 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -6,10 +6,12 @@
 #ifndef __QDA_DRV_H__
 #define __QDA_DRV_H__
 
+#include <linux/atomic.h>
 #include <linux/device.h>
 #include <linux/list.h>
 #include <linux/rpmsg.h>
 #include <linux/types.h>
+#include <linux/xarray.h>
 #include <drm/drm_device.h>
 #include <drm/drm_drv.h>
 #include <drm/drm_file.h>
@@ -28,6 +30,8 @@ struct qda_file_priv {
 	struct qda_iommu_device *assigned_iommu_dev;
 	/** @pid: Process ID for tracking */
 	pid_t pid;
+	/** @remote_session_id: Unique session identifier */
+	u32 remote_session_id;
 };
 
 /**
@@ -51,8 +55,12 @@ struct qda_dev {
 	struct mutex import_lock;
 	/** @current_import_file_priv: Current file_priv during prime import */
 	struct drm_file *current_import_file_priv;
+	/** @ctx_xa: XArray for FastRPC context management */
+	struct xarray ctx_xa;
 	/** @dsp_name: Name of the DSP domain (e.g. "cdsp", "adsp") */
 	const char *dsp_name;
+	/** @remote_session_id_counter: Atomic counter for unique session IDs */
+	atomic_t remote_session_id_counter;
 };
 
 /**
diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
new file mode 100644
index 000000000000..0ec37175a098
--- /dev/null
+++ b/drivers/accel/qda/qda_fastrpc.c
@@ -0,0 +1,597 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/sort.h>
+#include <linux/completion.h>
+#include <linux/dma-buf.h>
+#include <drm/drm_gem.h>
+#include "qda_fastrpc.h"
+#include "qda_drv.h"
+#include "qda_gem.h"
+#include "qda_memory_manager.h"
+#include "qda_prime.h"
+
+/**
+ * get_gem_obj_from_dmabuf_fd() - Import a DMA-BUF fd and return the GEM object
+ * @ctx: FastRPC invocation context
+ * @dmabuf_fd: DMA-BUF file descriptor supplied by user space
+ * @gem_obj: Output GEM object (caller must call drm_gem_object_put() when done)
+ *
+ * Imports the DMA-BUF fd into the QDA device via qda_prime_fd_to_handle()
+ * (which performs IOMMU device assignment for newly imported buffers) and
+ * then looks up the resulting GEM object. The caller is responsible for
+ * calling drm_gem_object_put() on the returned object.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+static int get_gem_obj_from_dmabuf_fd(struct fastrpc_invoke_context *ctx,
+				      int dmabuf_fd,
+				      struct drm_gem_object **gem_obj)
+{
+	struct drm_device *dev = ctx->file_priv->minor->dev;
+	u32 handle;
+	int ret;
+
+	ret = qda_prime_fd_to_handle(dev, ctx->file_priv, dmabuf_fd, &handle);
+	if (ret)
+		return ret;
+
+	*gem_obj = drm_gem_object_lookup(ctx->file_priv, handle);
+	if (!*gem_obj)
+		return -ENOENT;
+
+	return 0;
+}
+
+static void setup_pages_from_gem_obj(struct qda_gem_obj *qda_gem_obj,
+				     struct fastrpc_phy_page *pages)
+{
+	pages->addr = qda_gem_obj->dma_addr;
+	pages->size = qda_gem_obj->size;
+}
+
+static u64 calculate_vma_offset(u64 user_ptr)
+{
+	struct vm_area_struct *vma;
+	u64 user_ptr_page_mask = user_ptr & PAGE_MASK;
+	u64 vma_offset = 0;
+
+	mmap_read_lock(current->mm);
+	vma = find_vma(current->mm, user_ptr);
+	if (vma)
+		vma_offset = user_ptr_page_mask - vma->vm_start;
+	mmap_read_unlock(current->mm);
+
+	return vma_offset;
+}
+
+static u64 calculate_page_aligned_size(u64 ptr, u64 len)
+{
+	u64 pg_start = (ptr & PAGE_MASK) >> PAGE_SHIFT;
+	u64 pg_end = ((ptr + len - 1) & PAGE_MASK) >> PAGE_SHIFT;
+	u64 aligned_size = (pg_end - pg_start + 1) * PAGE_SIZE;
+
+	return aligned_size;
+}
+
+static struct fastrpc_invoke_buf *fastrpc_invoke_buf_start(union fastrpc_remote_arg *pra, int len)
+{
+	return (struct fastrpc_invoke_buf *)(&pra[len]);
+}
+
+static struct fastrpc_phy_page *fastrpc_phy_page_start(struct fastrpc_invoke_buf *buf, int len)
+{
+	return (struct fastrpc_phy_page *)(&buf[len]);
+}
+
+static int fastrpc_get_meta_size(struct fastrpc_invoke_context *ctx)
+{
+	int size = 0;
+
+	size = (sizeof(struct fastrpc_remote_buf) +
+		sizeof(struct fastrpc_invoke_buf) +
+		sizeof(struct fastrpc_phy_page)) * ctx->nscalars +
+		sizeof(u64) * FASTRPC_MAX_FDLIST +
+		sizeof(u32) * FASTRPC_MAX_CRCLIST;
+
+	return size;
+}
+
+static u64 fastrpc_get_payload_size(struct fastrpc_invoke_context *ctx, int metalen)
+{
+	u64 size = 0;
+	int oix;
+
+	size = ALIGN(metalen, FASTRPC_ALIGN);
+	for (oix = 0; oix < ctx->nbufs; oix++) {
+		int i = ctx->olaps[oix].raix;
+
+		if (ctx->args[i].fd == 0 || ctx->args[i].fd == -1) {
+			if (ctx->olaps[oix].offset == 0)
+				size = ALIGN(size, FASTRPC_ALIGN);
+
+			size += (ctx->olaps[oix].mend - ctx->olaps[oix].mstart);
+		}
+	}
+
+	return size;
+}
+/**
- qda_fastrpc_context_free() - Free an invocation context
- @ref: Reference counter embedded in the context
- Called when the reference count reaches zero; releases all resources
- associated with the invocation context.
- */
+void qda_fastrpc_context_free(struct kref *ref)
+{
+	struct fastrpc_invoke_context *ctx;
+	int i;
+
+	ctx = container_of(ref, struct fastrpc_invoke_context, refcount);
+	if (ctx->gem_objs) {
+		for (i = 0; i < ctx->nscalars; ++i) {
+			if (ctx->gem_objs[i])
+				drm_gem_object_put(ctx->gem_objs[i]);
+		}
+		kfree(ctx->gem_objs);
+	}
+
+	if (ctx->msg_gem_obj)
+		drm_gem_object_put(&ctx->msg_gem_obj->base);
+
+	kfree(ctx->olaps);
+	kfree(ctx->args);
+	kfree(ctx->req);
+	kfree(ctx->rsp);
+	kfree(ctx->input_pages);
+	kfree(ctx->inbuf);
+	kfree(ctx);
+}
+#define CMP(aa, bb) ((aa) == (bb) ? 0 : (aa) < (bb) ? -1 : 1)
+static int olaps_cmp(const void *a, const void *b) +{
- struct fastrpc_buf_overlap *pa = (struct fastrpc_buf_overlap *)a;
- struct fastrpc_buf_overlap *pb = (struct fastrpc_buf_overlap *)b;
- /* sort with lowest starting buffer first */
- int st = CMP(pa->start, pb->start);
- /* sort with highest ending buffer first */
- int ed = CMP(pb->end, pa->end);
- return st == 0 ? ed : st;
+}
+static void fastrpc_get_buff_overlaps(struct fastrpc_invoke_context *ctx)
+{
+	u64 max_end = 0;
+	int i;
+
+	for (i = 0; i < ctx->nbufs; ++i) {
+		ctx->olaps[i].start = ctx->args[i].ptr;
+		ctx->olaps[i].end = ctx->olaps[i].start + ctx->args[i].length;
+		ctx->olaps[i].raix = i;
+	}
+
+	sort(ctx->olaps, ctx->nbufs, sizeof(*ctx->olaps), olaps_cmp, NULL);
+
+	for (i = 0; i < ctx->nbufs; ++i) {
+		if (ctx->olaps[i].start < max_end) {
+			ctx->olaps[i].mstart = max_end;
+			ctx->olaps[i].mend = ctx->olaps[i].end;
+			ctx->olaps[i].offset = max_end - ctx->olaps[i].start;
+
+			if (ctx->olaps[i].end > max_end) {
+				max_end = ctx->olaps[i].end;
+			} else {
+				ctx->olaps[i].mend = 0;
+				ctx->olaps[i].mstart = 0;
+			}
+		} else {
+			ctx->olaps[i].mend = ctx->olaps[i].end;
+			ctx->olaps[i].mstart = ctx->olaps[i].start;
+			ctx->olaps[i].offset = 0;
+			max_end = ctx->olaps[i].end;
+		}
+	}
+}
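The effect of the overlap pass is easier to see on concrete numbers. The sketch below is a user-space approximation of the same sweep, using `qsort()` in place of the kernel's `sort()`; `struct demo_overlap`, `demo_cmp()` and `demo_get_overlaps()` are hypothetical names, and the caller is assumed to have filled in `start`, `end` and `raix` beforehand. Buffers are ordered by lowest start and then longest first, and each buffer keeps only the sub-range `[mstart, mend)` that no earlier buffer already covers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_overlap {
	uint64_t start, end;   /* [start, end) as supplied by the caller */
	int raix;              /* index of the original argument */
	uint64_t mstart, mend; /* sub-range that still has to be copied */
	uint64_t offset;       /* leading bytes already covered earlier */
};

/* Lowest start first; for equal starts, the longer buffer first. */
static int demo_cmp(const void *a, const void *b)
{
	const struct demo_overlap *pa = a, *pb = b;

	if (pa->start != pb->start)
		return pa->start < pb->start ? -1 : 1;
	if (pa->end != pb->end)
		return pa->end > pb->end ? -1 : 1;
	return 0;
}

static void demo_get_overlaps(struct demo_overlap *o, int n)
{
	uint64_t max_end = 0;
	int i;

	assert(n >= 0);
	qsort(o, (size_t)n, sizeof(*o), demo_cmp);

	for (i = 0; i < n; i++) {
		if (o[i].start < max_end) {
			/* Front of this buffer is already covered. */
			o[i].offset = max_end - o[i].start;
			if (o[i].end > max_end) {
				o[i].mstart = max_end;
				o[i].mend = o[i].end;
				max_end = o[i].end;
			} else {
				/* Fully contained: nothing new to copy. */
				o[i].mstart = 0;
				o[i].mend = 0;
			}
		} else {
			o[i].mstart = o[i].start;
			o[i].mend = o[i].end;
			o[i].offset = 0;
			max_end = o[i].end;
		}
	}
}
```

For the ranges [100, 200), [150, 250) and [120, 180), the sweep keeps 100 bytes for the first, nothing for the fully contained third, and only the 50-byte tail [200, 250) for the second, so the payload holds the 150-byte union rather than 260 bytes of separate copies.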
+/**
- qda_fastrpc_context_alloc() - Allocate a new FastRPC invocation context
- Return: Pointer to allocated context, or ERR_PTR on failure
- */
+struct fastrpc_invoke_context *qda_fastrpc_context_alloc(void) +{
- struct fastrpc_invoke_context *ctx = NULL;
- ctx = kzalloc_obj(*ctx);
- if (!ctx)
return ERR_PTR(-ENOMEM);- INIT_LIST_HEAD(&ctx->node);
- ctx->retval = -1;
- ctx->pid = current->pid;
- init_completion(&ctx->work);
- ctx->msg_gem_obj = NULL;
- kref_init(&ctx->refcount);
- return ctx;
+}
+/*
- process_fd_buffer() - Handle an in/out buffer argument backed by a DMA-BUF fd
- args[i].fd is a DMA-BUF fd. We import it to obtain the GEM object and its
- IOMMU-mapped dma_addr for the physical page descriptor. The DSP uses the
- physical address directly for this buffer type; the fd is not forwarded.
- */
+static int process_fd_buffer(struct fastrpc_invoke_context *ctx, int i,
union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages)+{
- struct drm_gem_object *gem_obj;
- struct qda_gem_obj *qda_gem_obj;
- int err;
- u64 len = ctx->args[i].length;
- u64 vma_offset;
- err = get_gem_obj_from_dmabuf_fd(ctx, ctx->args[i].fd, &gem_obj);
- if (err)
return err;- ctx->gem_objs[i] = gem_obj;
- qda_gem_obj = to_qda_gem_obj(gem_obj);
- rpra[i].buf.pv = (u64)ctx->args[i].ptr;
- pages[i].addr = qda_gem_obj->dma_addr;
- vma_offset = calculate_vma_offset(ctx->args[i].ptr);
- pages[i].addr += vma_offset;
- pages[i].size = calculate_page_aligned_size(ctx->args[i].ptr, len);
- return 0;
+}
+static int process_direct_buffer(struct fastrpc_invoke_context *ctx, int i, int oix,
+				 union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages,
+				 uintptr_t *args, u64 *rlen, u64 pkt_size)
+{
+	int mlen;
+	u64 len = ctx->args[i].length;
+	int inbufs = ctx->inbufs;
+
+	if (ctx->olaps[oix].offset == 0) {
+		*rlen -= ALIGN(*args, FASTRPC_ALIGN) - *args;
+		*args = ALIGN(*args, FASTRPC_ALIGN);
+	}
+
+	mlen = ctx->olaps[oix].mend - ctx->olaps[oix].mstart;
+	if (*rlen < mlen)
+		return -ENOSPC;
+
+	rpra[i].buf.pv = *args - ctx->olaps[oix].offset;
+	pages[i].addr = ctx->msg->phys - ctx->olaps[oix].offset + (pkt_size - *rlen);
+	pages[i].addr = pages[i].addr & PAGE_MASK;
+	pages[i].size = calculate_page_aligned_size(rpra[i].buf.pv, len);
+	*args = *args + mlen;
+	*rlen -= mlen;
+
+	if (i < inbufs) {
+		void *dst = (void *)(uintptr_t)rpra[i].buf.pv;
+		void *src = (void *)(uintptr_t)ctx->args[i].ptr;
+
+		/*
+		 * For user-space invocations (INVOKE_DYNAMIC), ptr is a user
+		 * virtual address and must be copied safely. For all other
+		 * (kernel-internal) invocations, ptr is a kernel address set
+		 * by the driver itself and can be copied directly.
+		 */
+		if (ctx->type == FASTRPC_RMID_INVOKE_DYNAMIC) {
+			if (copy_from_user(dst, (void __user *)src, len))
+				return -EFAULT;
+		} else {
+			memcpy(dst, src, len);
+		}
+	}
+
+	return 0;
+}
+/*
- process_dma_handle() - Handle a DMA-handle scalar argument
- args[i].fd is a DMA-BUF fd. We import it to get the physical page
- descriptor for the kernel, but forward the original DMA-BUF fd to the
- DSP in rpra[i].dma.fd so the DSP can identify the buffer by its fd.
- */
+static int process_dma_handle(struct fastrpc_invoke_context *ctx, int i,
union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages)+{
- if (ctx->args[i].fd > 0) {
struct drm_gem_object *gem_obj;struct qda_gem_obj *qda_gem_obj;int err;err = get_gem_obj_from_dmabuf_fd(ctx, ctx->args[i].fd, &gem_obj);if (err)return err;ctx->gem_objs[i] = gem_obj;qda_gem_obj = to_qda_gem_obj(gem_obj);setup_pages_from_gem_obj(qda_gem_obj, &pages[i]);/* Forward the original DMA-BUF fd to the DSP */rpra[i].dma.fd = ctx->args[i].fd;rpra[i].dma.len = ctx->args[i].length;rpra[i].dma.offset = (u64)ctx->args[i].ptr;- } else {
rpra[i].buf.pv = ctx->args[i].ptr;rpra[i].buf.len = ctx->args[i].length;- }
- return 0;
+}
+/**
- qda_fastrpc_get_header_size() - Compute the FastRPC message header size
- @ctx: FastRPC invocation context
- @out_size: Pointer to store the aligned packet size in bytes
- Return: 0 on success, negative error code on failure
- */
+int qda_fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size) +{
- ctx->inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
- ctx->metalen = fastrpc_get_meta_size(ctx);
- ctx->pkt_size = fastrpc_get_payload_size(ctx, ctx->metalen);
- ctx->aligned_pkt_size = PAGE_ALIGN(ctx->pkt_size);
- if (ctx->aligned_pkt_size == 0)
return -EINVAL;- *out_size = ctx->aligned_pkt_size;
- return 0;
+}
+static int fastrpc_get_args(struct fastrpc_invoke_context *ctx)
+{
+	union fastrpc_remote_arg *rpra;
+	struct fastrpc_invoke_buf *list;
+	struct fastrpc_phy_page *pages;
+	int i, oix, err = 0;
+	u64 rlen;
+	uintptr_t args;
+	size_t hdr_size;
+
+	ctx->inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
+	err = qda_fastrpc_get_header_size(ctx, &hdr_size);
+	if (err)
+		return err;
+
+	ctx->msg->buf = ctx->msg_gem_obj->virt;
+	ctx->msg->phys = ctx->msg_gem_obj->dma_addr;
+	memset(ctx->msg->buf, 0, ctx->aligned_pkt_size);
+
+	rpra = (union fastrpc_remote_arg *)ctx->msg->buf;
+	ctx->list = fastrpc_invoke_buf_start(rpra, ctx->nscalars);
+	ctx->pages = fastrpc_phy_page_start(ctx->list, ctx->nscalars);
+	list = ctx->list;
+	pages = ctx->pages;
+	args = (uintptr_t)ctx->msg->buf + ctx->metalen;
+	rlen = ctx->pkt_size - ctx->metalen;
+	ctx->rpra = rpra;
+
+	for (oix = 0; oix < ctx->nbufs; ++oix) {
+		i = ctx->olaps[oix].raix;
+
+		rpra[i].buf.pv = 0;
+		rpra[i].buf.len = ctx->args[i].length;
+		list[i].num = ctx->args[i].length ? 1 : 0;
+		list[i].pgidx = i;
+
+		if (!ctx->args[i].length)
+			continue;
+
+		if (ctx->args[i].fd > 0)
+			err = process_fd_buffer(ctx, i, rpra, pages);
+		else
+			err = process_direct_buffer(ctx, i, oix, rpra, pages, &args, &rlen,
+						    ctx->pkt_size);
+		if (err)
+			goto bail_gem;
+	}
+
+	for (i = ctx->nbufs; i < ctx->nscalars; ++i) {
+		list[i].num = ctx->args[i].length ? 1 : 0;
+		list[i].pgidx = i;
+
+		err = process_dma_handle(ctx, i, rpra, pages);
+		if (err)
+			goto bail_gem;
+	}
+
+	return 0;
+
+bail_gem:
+	if (ctx->msg_gem_obj) {
+		drm_gem_object_put(&ctx->msg_gem_obj->base);
+		ctx->msg_gem_obj = NULL;
+	}
+
+	return err;
+}
+static int fastrpc_put_args(struct fastrpc_invoke_context *ctx, struct qda_msg *msg) +{
- union fastrpc_remote_arg *rpra;
- int i, err = 0;
- if (!ctx)
return -EINVAL;- rpra = ctx->rpra;
- if (!rpra)
return -EINVAL;- for (i = ctx->inbufs; i < ctx->nbufs; ++i) {
if (ctx->args[i].fd <= 0) {void *src = (void *)(uintptr_t)rpra[i].buf.pv;void *dst = (void *)(uintptr_t)ctx->args[i].ptr;u64 len = rpra[i].buf.len;if (ctx->type == FASTRPC_RMID_INVOKE_DYNAMIC)err = copy_to_user((void __user *)dst, src, len) ? -EFAULT : 0;elsememcpy(dst, src, len);if (err)break;}- }
- return err;
+}
+/**
- qda_fastrpc_invoke_pack() - Pack an invocation context into a QDA message
- @ctx: FastRPC invocation context
- @msg: QDA message structure to pack into
- Return: 0 on success, negative error code on failure
- */
+int qda_fastrpc_invoke_pack(struct fastrpc_invoke_context *ctx,
struct qda_msg *msg)+{
- int err = 0;
- if (ctx->handle == FASTRPC_INIT_HANDLE)
msg->fastrpc.remote_session_id = 0;- else
msg->fastrpc.remote_session_id = ctx->remote_session_id;- ctx->msg = msg;
- err = fastrpc_get_args(ctx);
- if (err)
return err;- dma_wmb();
- msg->fastrpc.tid = ctx->pid;
- msg->fastrpc.ctx = ctx->ctxid | ctx->pd;
- msg->fastrpc.handle = ctx->handle;
- msg->fastrpc.sc = ctx->sc;
- msg->fastrpc.addr = ctx->msg->phys;
- msg->fastrpc.size = roundup(ctx->pkt_size, PAGE_SIZE);
- msg->fastrpc_ctx = ctx;
- msg->file_priv = ctx->file_priv;
- return 0;
+}
+/**
- qda_fastrpc_invoke_unpack() - Unpack a response message into an invocation context
- @ctx: FastRPC invocation context
- @msg: QDA message structure to unpack from
- Return: 0 on success, negative error code on failure
- */
+int qda_fastrpc_invoke_unpack(struct fastrpc_invoke_context *ctx,
struct qda_msg *msg)+{
- int err;
- dma_rmb();
- err = fastrpc_put_args(ctx, msg);
- if (err)
return err;- err = ctx->retval;
- return err;
+}
+static int fastrpc_prepare_args_invoke(struct fastrpc_invoke_context *ctx, char __user *argp) +{
- struct drm_qda_invoke_args invoke_args;
- struct drm_qda_fastrpc_invoke_args *args = NULL;
- u32 nscalars;
- /* argp is DRM ioctl data (kernel pointer); args pointer within it is user-space */
- memcpy(&invoke_args, argp, sizeof(invoke_args));
- ctx->handle = invoke_args.handle;
- ctx->sc = invoke_args.sc;
- nscalars = REMOTE_SCALARS_LENGTH(ctx->sc);
- if (!nscalars) {
ctx->args = NULL;return 0;- }
- args = kcalloc(nscalars, sizeof(*args), GFP_KERNEL);
- if (!args)
return -ENOMEM;- if (copy_from_user(args, u64_to_user_ptr(invoke_args.args),
nscalars * sizeof(*args))) {kfree(args);return -EFAULT;- }
- ctx->args = args;
- return 0;
+}
+/**
- qda_fastrpc_prepare_args() - Prepare arguments for a FastRPC invocation
- @ctx: FastRPC invocation context
- @argp: User-space pointer to invocation arguments
- Return: 0 on success, negative error code on failure
- */
+int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp) +{
- int err;
- switch (ctx->type) {
- case FASTRPC_RMID_INVOKE_DYNAMIC:
err = fastrpc_prepare_args_invoke(ctx, argp);break;- default:
return -EINVAL;- }
- if (err)
return err;- ctx->nscalars = REMOTE_SCALARS_LENGTH(ctx->sc);
- ctx->nbufs = REMOTE_SCALARS_INBUFS(ctx->sc) + REMOTE_SCALARS_OUTBUFS(ctx->sc);
- if (ctx->nscalars) {
ctx->gem_objs = kcalloc(ctx->nscalars, sizeof(*ctx->gem_objs), GFP_KERNEL);if (!ctx->gem_objs)return -ENOMEM;ctx->olaps = kcalloc(ctx->nscalars, sizeof(*ctx->olaps), GFP_KERNEL);if (!ctx->olaps) {kfree(ctx->gem_objs);ctx->gem_objs = NULL;return -ENOMEM;}fastrpc_get_buff_overlaps(ctx);- }
- return err;
+} diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h new file mode 100644 index 000000000000..ce77baeccfba --- /dev/null +++ b/drivers/accel/qda/qda_fastrpc.h @@ -0,0 +1,271 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/*
- Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
- */
+#ifndef __QDA_FASTRPC_H__ +#define __QDA_FASTRPC_H__
+#include <linux/completion.h> +#include <linux/kref.h> +#include <linux/list.h> +#include <linux/types.h> +#include <drm/drm_drv.h> +#include <drm/drm_file.h> +#include <drm/qda_accel.h>
+/* Forward declarations */ +struct qda_gem_obj;
+/*
- FastRPC scalar extraction macros
- These macros extract different fields from the scalar value that describes
- the arguments passed in a FastRPC invocation.
- */
+#define REMOTE_SCALARS_INBUFS(sc)	(((sc) >> 16) & 0x0ff)
+#define REMOTE_SCALARS_OUTBUFS(sc)	(((sc) >> 8) & 0x0ff)
+#define REMOTE_SCALARS_INHANDLES(sc)	(((sc) >> 4) & 0x0f)
+#define REMOTE_SCALARS_OUTHANDLES(sc)	((sc) & 0x0f)
+#define REMOTE_SCALARS_LENGTH(sc)	(REMOTE_SCALARS_INBUFS(sc) + \
+					 REMOTE_SCALARS_OUTBUFS(sc) + \
+					 REMOTE_SCALARS_INHANDLES(sc) + \
+					 REMOTE_SCALARS_OUTHANDLES(sc))
+
+/* FastRPC configuration constants */
+#define FASTRPC_ALIGN		128	/* Alignment requirement */
+#define FASTRPC_MAX_FDLIST	16	/* Maximum file descriptors */
+#define FASTRPC_MAX_CRCLIST	64	/* Maximum CRC list entries */
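For readers unfamiliar with the scalars word, this user-space sketch packs and unpacks one using the bit positions that the extraction macros above and the construction macros later in this header rely on. The `demo_*` functions are hypothetical equivalents written as plain functions and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the six fields using the same bit positions as the patch's macros. */
static uint32_t demo_build_scalars(uint32_t attr, uint32_t method,
				   uint32_t in, uint32_t out,
				   uint32_t oin, uint32_t oout)
{
	/* The masks below would otherwise silently truncate larger counts. */
	assert(in <= 0xff && out <= 0xff && oin <= 0x0f && oout <= 0x0f);

	return ((attr & 0x07) << 29) | ((method & 0x1f) << 24) |
	       ((in & 0xff) << 16) | ((out & 0xff) << 8) |
	       ((oin & 0x0f) << 4) | (oout & 0x0f);
}

static uint32_t demo_inbufs(uint32_t sc)     { return (sc >> 16) & 0xff; }
static uint32_t demo_outbufs(uint32_t sc)    { return (sc >> 8) & 0xff; }
static uint32_t demo_inhandles(uint32_t sc)  { return (sc >> 4) & 0x0f; }
static uint32_t demo_outhandles(uint32_t sc) { return sc & 0x0f; }

/* Total argument count, the analogue of REMOTE_SCALARS_LENGTH(). */
static uint32_t demo_length(uint32_t sc)
{
	return demo_inbufs(sc) + demo_outbufs(sc) +
	       demo_inhandles(sc) + demo_outhandles(sc);
}
```

Under this layout, method 6 with four input buffers encodes as 0x06040000, and the total argument count is recovered by summing the four count fields.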
+/*
- FastRPC scalar construction macros
- These macros build the scalar value that describes the arguments
- for a FastRPC invocation.
- */
+#define FASTRPC_BUILD_SCALARS(attr, method, in, out, oin, oout) \
+	(((attr & 0x07) << 29) | \
+	 ((method & 0x1f) << 24) | \
+	 ((in & 0xff) << 16) | \
+	 ((out & 0xff) << 8) | \
+	 ((oin & 0x0f) << 4) | \
+	 (oout & 0x0f))
+
+#define FASTRPC_SCALARS(method, in, out) \
+	FASTRPC_BUILD_SCALARS(0, method, in, out, 0, 0)
+
+/**
- struct fastrpc_buf_overlap - Buffer overlap tracking structure
- Tracks overlapping buffer regions to optimise memory mapping and avoid
- redundant mappings of the same physical memory.
What for? Even if this is a valid optimization, implement it as a subsequent patch. The first goal should be very simple: get GEM buffers from the app, pass them to the DSP, read the results.
- */
+struct fastrpc_buf_overlap {
Stop clashing the names with the existing fastrpc driver.
- /** @start: Start address of the buffer in user virtual address space */
- u64 start;
- /** @end: End address of the buffer in user virtual address space */
- u64 end;
- /** @raix: Remote argument index associated with this overlap */
- int raix;
- /** @mstart: Start address of the mapped region */
- u64 mstart;
- /** @mend: End address of the mapped region */
- u64 mend;
- /** @offset: Offset within the mapped region */
- u64 offset;
+};
+/**
- struct fastrpc_remote_dmahandle - Remote DMA handle descriptor
- */
+struct fastrpc_remote_dmahandle {
- /** @fd: DMA-BUF file descriptor */
- s32 fd;
- /** @offset: Byte offset within the DMA-BUF */
- u32 offset;
- /** @len: Length of the region in bytes */
- u32 len;
+};
+/**
- struct fastrpc_remote_buf - Remote buffer descriptor
- */
+struct fastrpc_remote_buf {
- /** @pv: Buffer pointer (user virtual address) */
- u64 pv;
- /** @len: Length of the buffer in bytes */
- u64 len;
+};
+/**
- union fastrpc_remote_arg - Remote argument (buffer or DMA handle)
- */
+union fastrpc_remote_arg {
- /** @buf: Inline buffer descriptor */
- struct fastrpc_remote_buf buf;
- /** @dma: DMA-BUF handle descriptor */
- struct fastrpc_remote_dmahandle dma;
+};
+/**
- struct fastrpc_phy_page - Physical page descriptor
- */
+struct fastrpc_phy_page {
- /** @addr: Physical (IOMMU) address of the page */
- u64 addr;
- /** @size: Size of the contiguous region in bytes */
- u64 size;
+};
+/**
- struct fastrpc_invoke_buf - Invoke buffer descriptor
- */
+struct fastrpc_invoke_buf {
- /** @num: Number of contiguous physical regions */
- u32 num;
- /** @pgidx: Index into the physical page array */
- u32 pgidx;
+};
+/**
- struct fastrpc_msg - FastRPC wire message for remote invocations
- Sent to the remote processor via RPMsg. This is the exact layout
- the DSP expects; do not reorder or add fields without DSP firmware
- coordination.
- */
+struct fastrpc_msg {
- /** @remote_session_id: Session identifier on the remote processor */
- int remote_session_id;
- /** @tid: Thread ID of the invoking thread */
- int tid;
- /** @ctx: Context identifier for matching request/response */
- u64 ctx;
- /** @handle: Handle of the remote method to invoke */
- u32 handle;
- /** @sc: Scalars value encoding in/out buffer counts */
- u32 sc;
- /** @addr: Physical address of the message payload buffer */
- u64 addr;
- /** @size: Size of the message payload in bytes */
- u64 size;
+};
+/**
- struct qda_msg - FastRPC message with kernel-internal bookkeeping
- The wire-format portion is kept in the embedded @fastrpc member (must
- be first) so that &qda_msg->fastrpc can be passed directly to
- rpmsg_send() without a copy.
- */
+struct qda_msg {
- /**
* @fastrpc: Wire-format message sent to the DSP via RPMsg.* Must be the first member.*/- struct fastrpc_msg fastrpc;
- /** @buf: Kernel virtual address of the payload buffer */
- void *buf;
- /** @phys: Physical/DMA address of the payload buffer */
- u64 phys;
- /** @ret: Return value from the remote processor */
- int ret;
- /** @fastrpc_ctx: Back-pointer to the owning invocation context */
- struct fastrpc_invoke_context *fastrpc_ctx;
- /** @file_priv: DRM file private data for GEM object lookup */
- struct drm_file *file_priv;
+};
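The "must be first" requirement on the embedded wire message can be checked at compile time. The sketch below is a user-space illustration with hypothetical `demo_*` types that copy the field layout of struct fastrpc_msg and the first few fields of struct qda_msg; the 40-byte size assumes the usual fixed-width integer sizes with no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order and widths as the wire message in the patch. */
struct demo_wire_msg {
	int32_t remote_session_id;
	int32_t tid;
	uint64_t ctx;
	uint32_t handle;
	uint32_t sc;
	uint64_t addr;
	uint64_t size;
};

/* Wire message first, kernel-internal bookkeeping after it. */
struct demo_msg {
	struct demo_wire_msg wire;
	void *buf;
	uint64_t phys;
	int ret;
};

/* Fails the build if someone moves the wire message away from offset 0. */
static_assert(offsetof(struct demo_msg, wire) == 0,
	      "wire message must be the first member");
static_assert(sizeof(struct demo_wire_msg) == 40,
	      "wire layout changed; the remote side expects 40 bytes");

/* Recover the containing message from a pointer to its wire part. */
static struct demo_msg *demo_msg_from_wire(struct demo_wire_msg *w)
{
	return (struct demo_msg *)((char *)w - offsetof(struct demo_msg, wire));
}
```

In the kernel the same guards would typically be written with `static_assert()` or `BUILD_BUG_ON()` next to the struct definitions, and the reverse lookup with `container_of()`.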
+/**
- struct fastrpc_invoke_context - Remote procedure call invocation context
- Maintains all state for a single remote procedure call, including buffer
- management, synchronisation, and result handling.
- */
+struct fastrpc_invoke_context {
- /** @node: List node for linking contexts in a queue */
- struct list_head node;
- /** @ctxid: Unique context identifier (XArray key shifted left by 4) */
- u64 ctxid;
- /** @inbufs: Number of input buffers */
- int inbufs;
- /** @outbufs: Number of output buffers */
- int outbufs;
- /** @handles: Number of DMA-BUF handle arguments */
- int handles;
- /** @nscalars: Total number of scalar arguments */
- int nscalars;
- /** @nbufs: Total number of buffer arguments (inbufs + outbufs) */
- int nbufs;
If it is inbufs + outbufs, why do you need it here?
- /** @pid: Process ID of the calling process */
- int pid;
- /** @retval: Return value from the remote invocation */
- int retval;
- /** @metalen: Length of the FastRPC metadata header in bytes */
- int metalen;
size_t, also why do you need it?
- /** @remote_session_id: Session identifier on the remote processor */
- int remote_session_id;
- /** @pd: Protection domain identifier encoded into the context ID */
- int pd;
- /** @type: Invocation type (e.g. FASTRPC_RMID_INVOKE_DYNAMIC) */
- int type;
- /** @sc: Scalars value encoding in/out buffer counts */
- u32 sc;
How is this different from the counts above?
- /** @handle: Handle of the remote method being invoked */
- u32 handle;
- /** @crc: Pointer to CRC values for data integrity checking */
- u32 *crc;
Add it later. It's unused. Drop all unused fields.
- /** @fdlist: Pointer to array of DMA-BUF file descriptors */
- u64 *fdlist;
Why do you need DMA-BUFs in the invocation context? They all should be GEM buffers.
- /** @pkt_size: Total payload size in bytes */
- u64 pkt_size;
- /** @aligned_pkt_size: Page-aligned payload size for GEM allocation */
- u64 aligned_pkt_size;
- /** @list: Array of invoke buffer descriptors */
- struct fastrpc_invoke_buf *list;
- /** @pages: Array of physical page descriptors for all arguments */
- struct fastrpc_phy_page *pages;
- /** @input_pages: Array of physical page descriptors for input buffers */
- struct fastrpc_phy_page *input_pages;
I think you are trying to bring all the complexity from the old driver with no added benefit. Please don't. Use the existing memory manager. Let it handle all the gory details. If something is not there, we should consider extending GEM instead.
- /** @work: Completion used to synchronise with the DSP response */
- struct completion work;
- /** @msg: Pointer to the QDA message structure for this invocation */
- struct qda_msg *msg;
- /** @rpra: Array of remote procedure arguments */
- union fastrpc_remote_arg *rpra;
- /** @gem_objs: Array of GEM objects imported for argument buffers */
- struct drm_gem_object **gem_objs;
- /** @args: User-space invoke argument descriptors */
- struct drm_qda_fastrpc_invoke_args *args;
- /** @olaps: Array of buffer overlap descriptors for deduplication */
- struct fastrpc_buf_overlap *olaps;
- /** @refcount: Reference counter for context lifetime management */
- struct kref refcount;
- /** @msg_gem_obj: GEM object backing the message payload buffer */
- struct qda_gem_obj *msg_gem_obj;
- /** @file_priv: DRM file private data */
- struct drm_file *file_priv;
- /** @init_mem_gem_obj: GEM object for protection domain init memory */
- struct qda_gem_obj *init_mem_gem_obj;
- /** @req: Pointer to kernel-internal request buffer */
- void *req;
- /** @rsp: Pointer to kernel-internal response buffer */
- void *rsp;
- /** @inbuf: Pointer to kernel-internal input buffer */
- void *inbuf;
+};
+/* Remote Method ID table - identifies initialization and control operations */ +#define FASTRPC_RMID_INVOKE_DYNAMIC 0xFFFFFFFF /* Dynamic method invocation */
+/* Common handle for initialization operations */ +#define FASTRPC_INIT_HANDLE 0x1
+void qda_fastrpc_context_free(struct kref *ref); +struct fastrpc_invoke_context *qda_fastrpc_context_alloc(void); +int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp); +int qda_fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size); +int qda_fastrpc_invoke_pack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg); +int qda_fastrpc_invoke_unpack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
+#endif /* __QDA_FASTRPC_H__ */ diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c index 1769c85a3e98..c81268c20b04 100644 --- a/drivers/accel/qda/qda_ioctl.c +++ b/drivers/accel/qda/qda_ioctl.c @@ -3,8 +3,10 @@ #include <drm/drm_ioctl.h> #include <drm/qda_accel.h> #include "qda_drv.h" +#include "qda_fastrpc.h" #include "qda_gem.h" #include "qda_ioctl.h" +#include "qda_rpmsg.h" /**
- qda_ioctl_query() - Query DSP device information
@@ -74,3 +76,105 @@ int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_fil return drm_gem_dumb_map_offset(file_priv, dev, args->handle, &args->offset); }
+static int fastrpc_context_get_id(struct fastrpc_invoke_context *ctx, struct qda_dev *qdev) +{
- int ret;
- u32 id;
- if (!qdev)
return -EINVAL;- ret = xa_alloc(&qdev->ctx_xa, &id, ctx, xa_limit_32b, GFP_KERNEL);
- if (ret)
return ret;- ctx->ctxid = id << 4;
Why is it being shifted?
- return 0;
+}
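The patch does not say why the ID is shifted. One reading consistent with qda_fastrpc_invoke_pack(), which sends `ctx->ctxid | ctx->pd`, is that the low four bits are reserved for the protection-domain identifier. The user-space sketch below illustrates that reading only; the `demo_*` names are hypothetical, and the decode step is an assumption about how a response would be matched back to the XArray key. It also widens the ID before shifting, whereas `id << 4` on a `u32` is evaluated in 32 bits and would drop the top four bits of any ID at or above 0x10000000.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PD_BITS 4
#define DEMO_PD_MASK ((1u << DEMO_PD_BITS) - 1)

/* Combine an allocated ID with a protection-domain value in the low bits. */
static uint64_t demo_ctx_encode(uint32_t id, uint32_t pd)
{
	assert(pd <= DEMO_PD_MASK);
	return ((uint64_t)id << DEMO_PD_BITS) | pd; /* widen first, then shift */
}

/* Recover the allocated ID (assumed lookup key on the response path). */
static uint32_t demo_ctx_id(uint64_t ctx)
{
	return (uint32_t)(ctx >> DEMO_PD_BITS);
}

/* Recover the protection-domain value. */
static uint32_t demo_ctx_pd(uint64_t ctx)
{
	return (uint32_t)(ctx & DEMO_PD_MASK);
}
```

If this reading is right, either limiting the XArray range to 28 bits or widening before the shift would keep the encoding lossless.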
From: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>

Implement the REMOTE_SESSION_CREATE and INIT_RELEASE FastRPC operations, which establish and tear down a user process on the DSP.

DRM_IOCTL_QDA_REMOTE_SESSION_CREATE (drm_qda_init_create)
  Creates a new process on the DSP by sending an INIT_CREATE message via the FastRPC INIT_HANDLE. The caller provides an ELF file (via DMA-BUF fd or direct pointer) and optional process attributes. A 4 MB GEM buffer is allocated per session to hold the DSP process image; this buffer is stored in qda_file_priv and reused for the lifetime of the session.

  If attrs is non-zero, INIT_CREATE_ATTR is used instead of INIT_CREATE to pass the extended attribute and signature fields.

INIT_RELEASE
  Sends a release message to the DSP when the DRM file is closed (qda_postclose via qda_release_dsp_process), freeing the remote process and its resources. The release is skipped if the device has already been unplugged.

qda_fastrpc.c
  fastrpc_prepare_args_init_create() marshals the six-argument create-process payload: the inbuf descriptor, process name, ELF file, physical pages, attrs, and siglen. fastrpc_prepare_args_release_process() marshals the single-argument release payload (remote_session_id).

qda_drv.c
  qda_postclose() is extended to call qda_release_dsp_process() under drm_dev_enter() so the release message is only sent while the device is still accessible.

Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
 drivers/accel/qda/qda_drv.c     |   8 +++
 drivers/accel/qda/qda_drv.h     |   5 ++
 drivers/accel/qda/qda_fastrpc.c | 140 ++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_fastrpc.h |  39 +++++++++--
 drivers/accel/qda/qda_ioctl.c   |  52 +++++++++++++++
 drivers/accel/qda/qda_ioctl.h   |   1 +
 include/uapi/drm/qda_accel.h    |  32 ++++++++-
 7 files changed, 270 insertions(+), 7 deletions(-)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 704c7d3127d2..4eaba9b050c0 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -36,6 +36,13 @@ static int qda_open(struct drm_device *dev, struct drm_file *file)
 static void qda_postclose(struct drm_device *dev, struct drm_file *file)
 {
 	struct qda_file_priv *qda_file_priv = file->driver_priv;
+	int idx;
+
+	/* Only send the DSP release message while the device is accessible */
+	if (drm_dev_enter(dev, &idx)) {
+		qda_release_dsp_process(qda_file_priv->qda_dev, file);
+		drm_dev_exit(idx);
+	}
if (qda_file_priv->assigned_iommu_dev) { struct qda_iommu_device *iommu_dev = qda_file_priv->assigned_iommu_dev; @@ -59,6 +66,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = { DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0), DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0), DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0), + DRM_IOCTL_DEF_DRV(QDA_REMOTE_SESSION_CREATE, qda_ioctl_init_create, 0), DRM_IOCTL_DEF_DRV(QDA_REMOTE_INVOKE, qda_ioctl_invoke, 0), };
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h index 420cccff42bf..4b4639961d95 100644 --- a/drivers/accel/qda/qda_drv.h +++ b/drivers/accel/qda/qda_drv.h @@ -28,6 +28,8 @@ struct qda_file_priv { struct qda_dev *qda_dev; /** @assigned_iommu_dev: IOMMU device assigned to this process */ struct qda_iommu_device *assigned_iommu_dev; + /** @init_mem_gem_obj: GEM object for PD initialization memory */ + struct qda_gem_obj *init_mem_gem_obj; /** @pid: Process ID for tracking */ pid_t pid; /** @remote_session_id: Unique session identifier */ @@ -83,4 +85,7 @@ void qda_deinit_device(struct qda_dev *qdev); int qda_register_device(struct qda_dev *qdev); void qda_unregister_device(struct qda_dev *qdev);
+/* DSP process / protection domain management */ +int qda_release_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv); + #endif /* __QDA_DRV_H__ */ diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c index 0ec37175a098..305915022b91 100644 --- a/drivers/accel/qda/qda_fastrpc.c +++ b/drivers/accel/qda/qda_fastrpc.c @@ -524,6 +524,138 @@ int qda_fastrpc_invoke_unpack(struct fastrpc_invoke_context *ctx, return err; }
+static void setup_create_process_args(struct drm_qda_fastrpc_invoke_args *args, + struct fastrpc_create_process_inbuf *inbuf, + struct drm_qda_init_create *init, + struct fastrpc_phy_page *pages) +{ + args[0].ptr = (u64)(uintptr_t)inbuf; + args[0].length = sizeof(*inbuf); + args[0].fd = -1; + + args[1].ptr = (u64)(uintptr_t)current->comm; + args[1].length = inbuf->namelen; + args[1].fd = -1; + + args[2].ptr = (u64)init->file; + args[2].length = inbuf->filelen; + args[2].fd = init->filefd; /* DMA-BUF fd forwarded to DSP */ + + args[3].ptr = (u64)(uintptr_t)pages; + args[3].length = 1 * sizeof(*pages); + args[3].fd = -1; + + args[4].ptr = (u64)(uintptr_t)&inbuf->attrs; + args[4].length = sizeof(inbuf->attrs); + args[4].fd = -1; + + args[5].ptr = (u64)(uintptr_t)&inbuf->siglen; + args[5].length = sizeof(inbuf->siglen); + args[5].fd = -1; +} + +static void setup_single_arg(struct drm_qda_fastrpc_invoke_args *args, const void *ptr, size_t size) +{ + args[0].ptr = (u64)(uintptr_t)ptr; + args[0].length = size; + args[0].fd = -1; +} + +static int fastrpc_prepare_args_release_process(struct fastrpc_invoke_context *ctx) +{ + struct drm_qda_fastrpc_invoke_args *args; + + args = kzalloc_obj(*args); + if (!args) + return -ENOMEM; + + setup_single_arg(args, &ctx->remote_session_id, sizeof(ctx->remote_session_id)); + ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_RELEASE, 1, 0); + ctx->args = args; + ctx->handle = FASTRPC_INIT_HANDLE; + + return 0; +} + +static int fastrpc_prepare_args_init_create(struct fastrpc_invoke_context *ctx, + char __user *argp) +{ + struct drm_qda_init_create init; + struct drm_qda_fastrpc_invoke_args *args; + struct fastrpc_create_process_inbuf *inbuf; + int err; + u32 sc; + + args = kcalloc(FASTRPC_CREATE_PROCESS_NARGS, sizeof(*args), GFP_KERNEL); + if (!args) + return -ENOMEM; + + ctx->input_pages = kcalloc(1, sizeof(*ctx->input_pages), GFP_KERNEL); + if (!ctx->input_pages) { + err = -ENOMEM; + goto err_free_args; + } + + ctx->inbuf = kcalloc(1, 
sizeof(*inbuf), GFP_KERNEL); + if (!ctx->inbuf) { + err = -ENOMEM; + goto err_free_input_pages; + } + inbuf = ctx->inbuf; + + memcpy(&init, argp, sizeof(init)); + + if (init.filelen > FASTRPC_INIT_FILELEN_MAX) { + err = -EINVAL; + goto err_free_inbuf; + } + + /* + * Validate that the DMA-BUF fd is importable. The fd itself is kept + * in init.filefd and forwarded to the DSP via setup_create_process_args(). + */ + if (init.filelen && init.filefd > 0) { + struct drm_gem_object *file_gem_obj; + + err = get_gem_obj_from_dmabuf_fd(ctx, init.filefd, &file_gem_obj); + if (err) { + err = -EINVAL; + goto err_free_inbuf; + } + drm_gem_object_put(file_gem_obj); + } + + inbuf->remote_session_id = ctx->remote_session_id; + inbuf->namelen = strlen(current->comm) + 1; + inbuf->filelen = init.filelen; + inbuf->pageslen = 1; + inbuf->attrs = init.attrs; + inbuf->siglen = init.siglen; + + setup_pages_from_gem_obj(ctx->init_mem_gem_obj, &ctx->input_pages[0]); + + setup_create_process_args(args, inbuf, &init, ctx->input_pages); + + sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE, 4, 0); + if (init.attrs) + sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE_ATTR, 4, 0); + ctx->sc = sc; + ctx->args = args; + ctx->handle = FASTRPC_INIT_HANDLE; + + return 0; + +err_free_inbuf: + kfree(ctx->inbuf); + ctx->inbuf = NULL; +err_free_input_pages: + kfree(ctx->input_pages); + ctx->input_pages = NULL; +err_free_args: + kfree(args); + return err; +} + static int fastrpc_prepare_args_invoke(struct fastrpc_invoke_context *ctx, char __user *argp) { struct drm_qda_invoke_args invoke_args; @@ -568,6 +700,14 @@ int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *ar int err;
 	switch (ctx->type) {
+	case FASTRPC_RMID_INIT_RELEASE:
+		err = fastrpc_prepare_args_release_process(ctx);
+		break;
+	case FASTRPC_RMID_INIT_CREATE:
+	case FASTRPC_RMID_INIT_CREATE_ATTR:
+		ctx->pd = QDA_USER_PD;
+		err = fastrpc_prepare_args_init_create(ctx, argp);
+		break;
 	case FASTRPC_RMID_INVOKE_DYNAMIC:
 		err = fastrpc_prepare_args_invoke(ctx, argp);
 		break;
diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
index ce77baeccfba..1c1236f9525e 100644
--- a/drivers/accel/qda/qda_fastrpc.h
+++ b/drivers/accel/qda/qda_fastrpc.h
@@ -127,6 +127,27 @@ struct fastrpc_invoke_buf {
 	u32 pgidx;
 };
 
+/**
+ * struct fastrpc_create_process_inbuf - Input buffer for process creation
+ *
+ * This structure defines the input buffer format for creating a new
+ * process on the remote DSP.
+ */
+struct fastrpc_create_process_inbuf {
+	/** @remote_session_id: Client identifier for the session */
+	int remote_session_id;
+	/** @namelen: Length of the process name string including NUL terminator */
+	u32 namelen;
+	/** @filelen: Length of the ELF shell file in bytes */
+	u32 filelen;
+	/** @pageslen: Number of physical page descriptors */
+	u32 pageslen;
+	/** @attrs: Process attribute flags */
+	u32 attrs;
+	/** @siglen: Length of the signature data in bytes */
+	u32 siglen;
+};
+
 /**
  * struct fastrpc_msg - FastRPC wire message for remote invocations
  *
@@ -153,10 +174,6 @@ struct fastrpc_msg {
 
 /**
  * struct qda_msg - FastRPC message with kernel-internal bookkeeping
- *
- * The wire-format portion is kept in the embedded @fastrpc member (must
- * be first) so that &qda_msg->fastrpc can be passed directly to
- * rpmsg_send() without a copy.
  */
 struct qda_msg {
 	/**
@@ -245,7 +262,7 @@ struct fastrpc_invoke_context {
 	struct qda_gem_obj *msg_gem_obj;
 	/** @file_priv: DRM file private data */
 	struct drm_file *file_priv;
-	/** @init_mem_gem_obj: GEM object for protection domain init memory */
+	/** @init_mem_gem_obj: GEM object for PD initialization memory */
 	struct qda_gem_obj *init_mem_gem_obj;
 	/** @req: Pointer to kernel-internal request buffer */
 	void *req;
@@ -256,11 +273,23 @@ struct fastrpc_invoke_context {
 };
 
 /* Remote Method ID table - identifies initialization and control operations */
+#define FASTRPC_RMID_INIT_RELEASE	1	/* Release DSP process */
+#define FASTRPC_RMID_INIT_CREATE	6	/* Create DSP process */
+#define FASTRPC_RMID_INIT_CREATE_ATTR	7	/* Create DSP process with attributes */
 #define FASTRPC_RMID_INVOKE_DYNAMIC	0xFFFFFFFF	/* Dynamic method invocation */
 
 /* Common handle for initialization operations */
 #define FASTRPC_INIT_HANDLE	0x1
 
+/* Protection Domain (PD) identifiers */
+#define QDA_ROOT_PD	(0)
+#define QDA_USER_PD	(1)
+
+/* Number of arguments for process creation */
+#define FASTRPC_CREATE_PROCESS_NARGS	6
+/* Maximum initialization file size (4 MB) */
+#define FASTRPC_INIT_FILELEN_MAX	(4 * 1024 * 1024)
+
 void qda_fastrpc_context_free(struct kref *ref);
 struct fastrpc_invoke_context *qda_fastrpc_context_alloc(void);
 int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp);
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index c81268c20b04..33f0a798ad13 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -109,6 +109,7 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
 	struct drm_gem_object *gem_obj;
 	int err;
 	size_t hdr_size;
+	size_t initmem_size = FASTRPC_INIT_FILELEN_MAX;
 
 	ctx = qda_fastrpc_context_alloc();
 	if (IS_ERR(ctx))
@@ -124,6 +125,27 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
 	ctx->file_priv = file_priv;
 	ctx->remote_session_id = qda_file_priv->remote_session_id;
 
+	if (type == FASTRPC_RMID_INIT_CREATE) {
+		struct drm_gem_object *initmem_gem_obj;
+
+		if (qda_file_priv->init_mem_gem_obj) {
+			drm_gem_object_put(&qda_file_priv->init_mem_gem_obj->base);
+			qda_file_priv->init_mem_gem_obj = NULL;
+		}
+
+		initmem_gem_obj = qda_gem_create_object(dev, qdev->iommu_mgr,
+							initmem_size, file_priv);
+		if (IS_ERR(initmem_gem_obj)) {
+			err = PTR_ERR(initmem_gem_obj);
+			goto err_context_free;
+		}
+
+		ctx->init_mem_gem_obj = to_qda_gem_obj(initmem_gem_obj);
+		qda_file_priv->init_mem_gem_obj = ctx->init_mem_gem_obj;
+	} else if (type == FASTRPC_RMID_INIT_RELEASE) {
+		ctx->init_mem_gem_obj = qda_file_priv->init_mem_gem_obj;
+	}
+
 	err = qda_fastrpc_prepare_args(ctx, (char __user *)data);
 	if (err)
 		goto err_context_free;
@@ -161,11 +183,41 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
 	return 0;
 
 err_context_free:
+	if (type == FASTRPC_RMID_INIT_RELEASE && !err && qda_file_priv->init_mem_gem_obj) {
+		drm_gem_object_put(&qda_file_priv->init_mem_gem_obj->base);
+		qda_file_priv->init_mem_gem_obj = NULL;
+	}
+
 	fastrpc_context_put_id(ctx, qdev);
 	kref_put(&ctx->refcount, qda_fastrpc_context_free);
 	return err;
 }
 
+/**
+ * qda_ioctl_init_create() - Create a DSP process
+ * @dev: DRM device structure
+ * @data: User-space data (struct drm_qda_init_create)
+ * @file_priv: DRM file private data
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_init_create(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+	return fastrpc_invoke(FASTRPC_RMID_INIT_CREATE, dev, data, file_priv);
+}
+
+/**
+ * qda_release_dsp_process() - Release DSP process resources for a file
+ * @qdev: QDA device structure
+ * @file_priv: DRM file private data
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_release_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv)
+{
+	return fastrpc_invoke(FASTRPC_RMID_INIT_RELEASE, &qdev->drm_dev, NULL, file_priv);
+}
+
 /**
  * qda_ioctl_invoke() - Perform a dynamic FastRPC method invocation
  * @dev: DRM device structure
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index 3bb9cfd98370..192565434363 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -9,6 +9,7 @@
 #include "qda_drv.h"
 
 int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
+int qda_ioctl_init_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv);
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index 72512213741f..711e2523a570 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -21,8 +21,9 @@ extern "C" {
 #define DRM_QDA_QUERY			0x00
 #define DRM_QDA_GEM_CREATE		0x01
 #define DRM_QDA_GEM_MMAP_OFFSET		0x02
-/* Command numbers 0x03-0x06 reserved for INIT_ATTACH, INIT_CREATE, MAP, MUNMAP */
-#define DRM_QDA_REMOTE_INVOKE		0x07
+/* Command number 0x03 reserved for INIT_ATTACH; 0x05-0x06 reserved for MAP, MUNMAP */
+#define DRM_QDA_REMOTE_SESSION_CREATE	0x04
+#define DRM_QDA_REMOTE_INVOKE		0x07
 
 /*
  * QDA IOCTL definitions
@@ -37,6 +38,9 @@ extern "C" {
 						 struct drm_qda_gem_create)
 #define DRM_IOCTL_QDA_GEM_MMAP_OFFSET	DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
 						 struct drm_qda_gem_mmap_offset)
+#define DRM_IOCTL_QDA_REMOTE_SESSION_CREATE \
+	DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_SESSION_CREATE, \
+		 struct drm_qda_init_create)
 #define DRM_IOCTL_QDA_REMOTE_INVOKE	DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_INVOKE, \
 						 struct drm_qda_invoke_args)
 
@@ -99,6 +103,30 @@ struct drm_qda_fastrpc_invoke_args {
 	__u32 attr;
 };
 
+/**
+ * struct drm_qda_init_create - Accelerator process initialization parameters
+ * @filelen: Length of the ELF file in bytes
+ * @filefd: DMA-BUF file descriptor containing the ELF file
+ * @attrs: Process attributes flags
+ * @siglen: Length of signature data in bytes
+ * @file: Pointer to ELF file data if not using filefd
+ *
+ * This structure is used with DRM_IOCTL_QDA_INIT_CREATE to initialize
+ * a new process on the accelerator. The process code is provided either
+ * via a file descriptor (filefd, typically a GEM object) or a direct
+ * pointer (file). Set file to 0 if using filefd.
+ *
+ * The attrs field contains bit flags for debug mode, privileged execution,
+ * and other process attributes.
+ */
+struct drm_qda_init_create {
+	__u32 filelen;
+	__s32 filefd;
+	__u32 attrs;
+	__u32 siglen;
+	__u64 file;
+};
+
 /**
  * struct drm_qda_invoke_args - Dynamic FastRPC invocation parameters
  * @handle: Remote handle to invoke on the DSP
On Tue, May 19, 2026 at 11:46:03AM +0530, Ekansh Gupta via B4 Relay wrote:
From: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
Implement the REMOTE_SESSION_CREATE and INIT_RELEASE FastRPC operations, which establish and tear down a user process on the DSP.
DRM_IOCTL_QDA_REMOTE_SESSION_CREATE (drm_qda_init_create)
  Creates a new process on the DSP by sending an INIT_CREATE message via
  the FastRPC INIT_HANDLE. The caller provides an ELF file (via DMA-BUF
  fd or direct pointer) and optional process attributes. A 4 MB GEM
  buffer is allocated per session to hold the DSP process image; this
  buffer is stored in qda_file_priv and reused for the lifetime of the
  session.

  If attrs is non-zero, INIT_CREATE_ATTR is used instead of INIT_CREATE
  to pass the extended attribute and signature fields.
What is the difference?
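For reference when reading the question above: in the code posted in this patch, both method IDs go through the same marshalling path, and the only visible difference is the method ID encoded in the scalars word. The following is a minimal sketch of that selection. It assumes FASTRPC_SCALARS packs its fields the same way as the macro in the existing drivers/misc/fastrpc.c (the QDA definition is not shown in this thread, so the bit layout is an assumption), and qda_init_create_scalars() is a hypothetical helper name used only for illustration.

```c
#include <stdint.h>

/* Method IDs as defined in qda_fastrpc.h in this patch. */
#define FASTRPC_RMID_INIT_CREATE	6
#define FASTRPC_RMID_INIT_CREATE_ATTR	7

/*
 * Assumed layout, modelled on drivers/misc/fastrpc.c: method ID in
 * bits 24-28, input buffer count in bits 16-23, output buffer count in
 * bits 8-15. The QDA driver's own macro may differ.
 */
#define FASTRPC_SCALARS(method, in, out) \
	((((uint32_t)(method) & 0x1f) << 24) | \
	 (((uint32_t)(in) & 0xff) << 16) | \
	 (((uint32_t)(out) & 0xff) << 8))

/* Hypothetical helper: pick the scalars word the way the patch does. */
static uint32_t qda_init_create_scalars(uint32_t attrs)
{
	uint32_t method = attrs ? FASTRPC_RMID_INIT_CREATE_ATTR
				: FASTRPC_RMID_INIT_CREATE;

	/* Both variants declare four input buffers and no output buffers. */
	return FASTRPC_SCALARS(method, 4, 0);
}
```

Under these assumptions, a request with attrs == 0 and one with attrs != 0 differ only in the method field of the scalars word; whether the DSP treats the two methods differently is not something this sketch can show.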
INIT_RELEASE
  Sends a release message to the DSP when the DRM file is closed
  (qda_postclose via qda_release_dsp_process), freeing the remote
  process and its resources. The release is skipped if the device has
  already been unplugged.

qda_fastrpc.c
  fastrpc_prepare_args_init_create() marshals the six-argument
  create-process payload: the inbuf descriptor, process name, ELF file,
  physical pages, attrs, and siglen.
  fastrpc_prepare_args_release_process() marshals the single-argument
  release payload (remote_session_id).

qda_drv.c
  qda_postclose() is extended to call qda_release_dsp_process() under
  drm_dev_enter() so the release message is only sent while the device
  is still accessible.
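The create-process input buffer described above can be sketched in user-space C as follows. This is an illustration only, not the driver code: the two structs mirror the definitions in this patch using <stdint.h> types, fill_create_process_inbuf() is a hypothetical helper, and its comm parameter stands in for current->comm, which exists only in kernel context.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Limit from qda_fastrpc.h in this patch (4 MB). */
#define FASTRPC_INIT_FILELEN_MAX (4 * 1024 * 1024)

/* User-space mirror of the uapi struct from this patch. */
struct drm_qda_init_create {
	uint32_t filelen;
	int32_t  filefd;
	uint32_t attrs;
	uint32_t siglen;
	uint64_t file;
};

/* Mirror of the input buffer layout from qda_fastrpc.h. */
struct fastrpc_create_process_inbuf {
	int      remote_session_id;
	uint32_t namelen;
	uint32_t filelen;
	uint32_t pageslen;
	uint32_t attrs;
	uint32_t siglen;
};

/*
 * Hypothetical helper: apply the size check and fill the input buffer
 * in the same order as fastrpc_prepare_args_init_create().
 */
static int fill_create_process_inbuf(struct fastrpc_create_process_inbuf *inbuf,
				     const struct drm_qda_init_create *init,
				     int remote_session_id, const char *comm)
{
	if (init->filelen > FASTRPC_INIT_FILELEN_MAX)
		return -EINVAL;

	inbuf->remote_session_id = remote_session_id;
	inbuf->namelen = (uint32_t)strlen(comm) + 1;	/* include the NUL */
	inbuf->filelen = init->filelen;
	inbuf->pageslen = 1;	/* one descriptor for the init memory buffer */
	inbuf->attrs = init->attrs;
	inbuf->siglen = init->siglen;
	return 0;
}
```

Note that setup_create_process_args() in the patch fills six argument descriptors, while the scalars word it is paired with declares four input buffers; the sketch does not try to resolve that difference.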
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
 drivers/accel/qda/qda_drv.c     |   8 +++
 drivers/accel/qda/qda_drv.h     |   5 ++
 drivers/accel/qda/qda_fastrpc.c | 140 ++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_fastrpc.h |  39 +++++++++--
 drivers/accel/qda/qda_ioctl.c   |  52 +++++++++++++++
 drivers/accel/qda/qda_ioctl.h   |   1 +
 include/uapi/drm/qda_accel.h    |  32 ++++++++-
 7 files changed, 270 insertions(+), 7 deletions(-)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 704c7d3127d2..4eaba9b050c0 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -36,6 +36,13 @@ static int qda_open(struct drm_device *dev, struct drm_file *file)
 static void qda_postclose(struct drm_device *dev, struct drm_file *file)
 {
 	struct qda_file_priv *qda_file_priv = file->driver_priv;
+	int idx;
+
+	/* Only send the DSP release message while the device is accessible */
+	if (drm_dev_enter(dev, &idx)) {
+		qda_release_dsp_process(qda_file_priv->qda_dev, file);
+		drm_dev_exit(idx);
+	}
 
 	if (qda_file_priv->assigned_iommu_dev) {
 		struct qda_iommu_device *iommu_dev = qda_file_priv->assigned_iommu_dev;
@@ -59,6 +66,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
 	DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
 	DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
+	DRM_IOCTL_DEF_DRV(QDA_REMOTE_SESSION_CREATE, qda_ioctl_init_create, 0),
Why is it being added in the middle?
 	DRM_IOCTL_DEF_DRV(QDA_REMOTE_INVOKE, qda_ioctl_invoke, 0),
 };
 
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index 420cccff42bf..4b4639961d95 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -28,6 +28,8 @@ struct qda_file_priv {
 	struct qda_dev *qda_dev;
 	/** @assigned_iommu_dev: IOMMU device assigned to this process */
 	struct qda_iommu_device *assigned_iommu_dev;
+	/** @init_mem_gem_obj: GEM object for PD initialization memory */
+	struct qda_gem_obj *init_mem_gem_obj;
 	/** @pid: Process ID for tracking */
 	pid_t pid;
 	/** @remote_session_id: Unique session identifier */
@@ -83,4 +85,7 @@ void qda_deinit_device(struct qda_dev *qdev);
 int qda_register_device(struct qda_dev *qdev);
 void qda_unregister_device(struct qda_dev *qdev);
 
+/* DSP process / protection domain management */
+int qda_release_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv);
+
 #endif /* __QDA_DRV_H__ */
diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
index 0ec37175a098..305915022b91 100644
--- a/drivers/accel/qda/qda_fastrpc.c
+++ b/drivers/accel/qda/qda_fastrpc.c
@@ -524,6 +524,138 @@ int qda_fastrpc_invoke_unpack(struct fastrpc_invoke_context *ctx,
 	return err;
 }
 
+static void setup_create_process_args(struct drm_qda_fastrpc_invoke_args *args,
+				      struct fastrpc_create_process_inbuf *inbuf,
+				      struct drm_qda_init_create *init,
+				      struct fastrpc_phy_page *pages)
+{
+	args[0].ptr = (u64)(uintptr_t)inbuf;
+	args[0].length = sizeof(*inbuf);
+	args[0].fd = -1;
+
+	args[1].ptr = (u64)(uintptr_t)current->comm;
+	args[1].length = inbuf->namelen;
+	args[1].fd = -1;
+
+	args[2].ptr = (u64)init->file;
+	args[2].length = inbuf->filelen;
+	args[2].fd = init->filefd; /* DMA-BUF fd forwarded to DSP */
+
+	args[3].ptr = (u64)(uintptr_t)pages;
+	args[3].length = 1 * sizeof(*pages);
+	args[3].fd = -1;
+
+	args[4].ptr = (u64)(uintptr_t)&inbuf->attrs;
+	args[4].length = sizeof(inbuf->attrs);
+	args[4].fd = -1;
+
+	args[5].ptr = (u64)(uintptr_t)&inbuf->siglen;
+	args[5].length = sizeof(inbuf->siglen);
+	args[5].fd = -1;
+}
+
+static void setup_single_arg(struct drm_qda_fastrpc_invoke_args *args, const void *ptr, size_t size)
+{
+	args[0].ptr = (u64)(uintptr_t)ptr;
+	args[0].length = size;
+	args[0].fd = -1;
+}
+
+static int fastrpc_prepare_args_release_process(struct fastrpc_invoke_context *ctx)
+{
+	struct drm_qda_fastrpc_invoke_args *args;
+
+	args = kzalloc_obj(*args);
+	if (!args)
+		return -ENOMEM;
+
+	setup_single_arg(args, &ctx->remote_session_id, sizeof(ctx->remote_session_id));
+	ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_RELEASE, 1, 0);
+	ctx->args = args;
+	ctx->handle = FASTRPC_INIT_HANDLE;
+
+	return 0;
+}
+
+static int fastrpc_prepare_args_init_create(struct fastrpc_invoke_context *ctx,
+					    char __user *argp)
+{
+	struct drm_qda_init_create init;
+	struct drm_qda_fastrpc_invoke_args *args;
+	struct fastrpc_create_process_inbuf *inbuf;
+	int err;
+	u32 sc;
+
+	args = kcalloc(FASTRPC_CREATE_PROCESS_NARGS, sizeof(*args), GFP_KERNEL);
+	if (!args)
+		return -ENOMEM;
+
+	ctx->input_pages = kcalloc(1, sizeof(*ctx->input_pages), GFP_KERNEL);
+	if (!ctx->input_pages) {
+		err = -ENOMEM;
+		goto err_free_args;
+	}
+
+	ctx->inbuf = kcalloc(1, sizeof(*inbuf), GFP_KERNEL);
+	if (!ctx->inbuf) {
+		err = -ENOMEM;
+		goto err_free_input_pages;
+	}
+	inbuf = ctx->inbuf;
+
+	memcpy(&init, argp, sizeof(init));
+
+	if (init.filelen > FASTRPC_INIT_FILELEN_MAX) {
+		err = -EINVAL;
+		goto err_free_inbuf;
+	}
+
+	/*
+	 * Validate that the DMA-BUF fd is importable. The fd itself is kept
+	 * in init.filefd and forwarded to the DSP via setup_create_process_args().
+	 */
+	if (init.filelen && init.filefd > 0) {
+		struct drm_gem_object *file_gem_obj;
+
+		err = get_gem_obj_from_dmabuf_fd(ctx, init.filefd, &file_gem_obj);
+		if (err) {
+			err = -EINVAL;
+			goto err_free_inbuf;
+		}
+		drm_gem_object_put(file_gem_obj);
+	}
+
+	inbuf->remote_session_id = ctx->remote_session_id;
+	inbuf->namelen = strlen(current->comm) + 1;
+	inbuf->filelen = init.filelen;
+	inbuf->pageslen = 1;
+	inbuf->attrs = init.attrs;
+	inbuf->siglen = init.siglen;
+
+	setup_pages_from_gem_obj(ctx->init_mem_gem_obj, &ctx->input_pages[0]);
+
+	setup_create_process_args(args, inbuf, &init, ctx->input_pages);
+
+	sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE, 4, 0);
+	if (init.attrs)
+		sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE_ATTR, 4, 0);
+	ctx->sc = sc;
+	ctx->args = args;
+	ctx->handle = FASTRPC_INIT_HANDLE;
+
+	return 0;
+
+err_free_inbuf:
+	kfree(ctx->inbuf);
+	ctx->inbuf = NULL;
+err_free_input_pages:
+	kfree(ctx->input_pages);
+	ctx->input_pages = NULL;
+err_free_args:
+	kfree(args);
+	return err;
+}
+
 static int fastrpc_prepare_args_invoke(struct fastrpc_invoke_context *ctx, char __user *argp)
 {
 	struct drm_qda_invoke_args invoke_args;
@@ -568,6 +700,14 @@ int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *ar
 	int err;
 
 	switch (ctx->type) {
+	case FASTRPC_RMID_INIT_RELEASE:
+		err = fastrpc_prepare_args_release_process(ctx);
+		break;
+	case FASTRPC_RMID_INIT_CREATE:
+	case FASTRPC_RMID_INIT_CREATE_ATTR:
+		ctx->pd = QDA_USER_PD;
+		err = fastrpc_prepare_args_init_create(ctx, argp);
+		break;
 	case FASTRPC_RMID_INVOKE_DYNAMIC:
 		err = fastrpc_prepare_args_invoke(ctx, argp);
 		break;
diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
index ce77baeccfba..1c1236f9525e 100644
--- a/drivers/accel/qda/qda_fastrpc.h
+++ b/drivers/accel/qda/qda_fastrpc.h
@@ -127,6 +127,27 @@ struct fastrpc_invoke_buf {
 	u32 pgidx;
 };
 
+/**
+ * struct fastrpc_create_process_inbuf - Input buffer for process creation
+ *
+ * This structure defines the input buffer format for creating a new
+ * process on the remote DSP.
+ */
+struct fastrpc_create_process_inbuf {
+	/** @remote_session_id: Client identifier for the session */
+	int remote_session_id;
+	/** @namelen: Length of the process name string including NUL terminator */
+	u32 namelen;
+	/** @filelen: Length of the ELF shell file in bytes */
+	u32 filelen;
+	/** @pageslen: Number of physical page descriptors */
+	u32 pageslen;
+	/** @attrs: Process attribute flags */
+	u32 attrs;
+	/** @siglen: Length of the signature data in bytes */
+	u32 siglen;
+};
+
 /**
  * struct fastrpc_msg - FastRPC wire message for remote invocations
  *
@@ -153,10 +174,6 @@ struct fastrpc_msg {
 
 /**
  * struct qda_msg - FastRPC message with kernel-internal bookkeeping
- *
- * The wire-format portion is kept in the embedded @fastrpc member (must
- * be first) so that &qda_msg->fastrpc can be passed directly to
- * rpmsg_send() without a copy.
  */
 struct qda_msg {
 	/**
@@ -245,7 +262,7 @@ struct fastrpc_invoke_context {
 	struct qda_gem_obj *msg_gem_obj;
 	/** @file_priv: DRM file private data */
 	struct drm_file *file_priv;
-	/** @init_mem_gem_obj: GEM object for protection domain init memory */
+	/** @init_mem_gem_obj: GEM object for PD initialization memory */
 	struct qda_gem_obj *init_mem_gem_obj;
 	/** @req: Pointer to kernel-internal request buffer */
 	void *req;
@@ -256,11 +273,23 @@ struct fastrpc_invoke_context {
 };
 
 /* Remote Method ID table - identifies initialization and control operations */
+#define FASTRPC_RMID_INIT_RELEASE	1	/* Release DSP process */
+#define FASTRPC_RMID_INIT_CREATE	6	/* Create DSP process */
+#define FASTRPC_RMID_INIT_CREATE_ATTR	7	/* Create DSP process with attributes */
 #define FASTRPC_RMID_INVOKE_DYNAMIC	0xFFFFFFFF	/* Dynamic method invocation */
 
 /* Common handle for initialization operations */
 #define FASTRPC_INIT_HANDLE	0x1
 
+/* Protection Domain (PD) identifiers */
+#define QDA_ROOT_PD	(0)
+#define QDA_USER_PD	(1)
+
+/* Number of arguments for process creation */
+#define FASTRPC_CREATE_PROCESS_NARGS	6
+/* Maximum initialization file size (4 MB) */
+#define FASTRPC_INIT_FILELEN_MAX	(4 * 1024 * 1024)
+
 void qda_fastrpc_context_free(struct kref *ref);
 struct fastrpc_invoke_context *qda_fastrpc_context_alloc(void);
 int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp);
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index c81268c20b04..33f0a798ad13 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -109,6 +109,7 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
 	struct drm_gem_object *gem_obj;
 	int err;
 	size_t hdr_size;
+	size_t initmem_size = FASTRPC_INIT_FILELEN_MAX;
 
 	ctx = qda_fastrpc_context_alloc();
 	if (IS_ERR(ctx))
@@ -124,6 +125,27 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
 	ctx->file_priv = file_priv;
 	ctx->remote_session_id = qda_file_priv->remote_session_id;
 
+	if (type == FASTRPC_RMID_INIT_CREATE) {
+		struct drm_gem_object *initmem_gem_obj;
+
+		if (qda_file_priv->init_mem_gem_obj) {
Why is it non-NULL here?
+			drm_gem_object_put(&qda_file_priv->init_mem_gem_obj->base);
+			qda_file_priv->init_mem_gem_obj = NULL;
+		}
+
+		initmem_gem_obj = qda_gem_create_object(dev, qdev->iommu_mgr,
+							initmem_size, file_priv);
+		if (IS_ERR(initmem_gem_obj)) {
+			err = PTR_ERR(initmem_gem_obj);
+			goto err_context_free;
+		}
+
+		ctx->init_mem_gem_obj = to_qda_gem_obj(initmem_gem_obj);
+		qda_file_priv->init_mem_gem_obj = ctx->init_mem_gem_obj;
+	} else if (type == FASTRPC_RMID_INIT_RELEASE) {
+		ctx->init_mem_gem_obj = qda_file_priv->init_mem_gem_obj;
+	}
+
 	err = qda_fastrpc_prepare_args(ctx, (char __user *)data);
 	if (err)
 		goto err_context_free;
@@ -161,11 +183,41 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
 	return 0;
 
 err_context_free:
+	if (type == FASTRPC_RMID_INIT_RELEASE && !err && qda_file_priv->init_mem_gem_obj) {
+		drm_gem_object_put(&qda_file_priv->init_mem_gem_obj->base);
+		qda_file_priv->init_mem_gem_obj = NULL;
+	}
+
 	fastrpc_context_put_id(ctx, qdev);
 	kref_put(&ctx->refcount, qda_fastrpc_context_free);
 	return err;
 }
 
+/**
+ * qda_ioctl_init_create() - Create a DSP process
+ * @dev: DRM device structure
+ * @data: User-space data (struct drm_qda_init_create)
+ * @file_priv: DRM file private data
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_init_create(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+	return fastrpc_invoke(FASTRPC_RMID_INIT_CREATE, dev, data, file_priv);
Where is INIT_CREATE_ATTR, which you described earlier?
+}
+
+/**
+ * qda_release_dsp_process() - Release DSP process resources for a file
+ * @qdev: QDA device structure
+ * @file_priv: DRM file private data
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_release_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv)
+{
+	return fastrpc_invoke(FASTRPC_RMID_INIT_RELEASE, &qdev->drm_dev, NULL, file_priv);
+}
+
 /**
  * qda_ioctl_invoke() - Perform a dynamic FastRPC method invocation
  * @dev: DRM device structure
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index 3bb9cfd98370..192565434363 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -9,6 +9,7 @@
 #include "qda_drv.h"
 
 int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
+int qda_ioctl_init_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv);
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index 72512213741f..711e2523a570 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -21,8 +21,9 @@ extern "C" {
 #define DRM_QDA_QUERY			0x00
 #define DRM_QDA_GEM_CREATE		0x01
 #define DRM_QDA_GEM_MMAP_OFFSET		0x02
-/* Command numbers 0x03-0x06 reserved for INIT_ATTACH, INIT_CREATE, MAP, MUNMAP */
-#define DRM_QDA_REMOTE_INVOKE		0x07
+/* Command number 0x03 reserved for INIT_ATTACH; 0x05-0x06 reserved for MAP, MUNMAP */
+#define DRM_QDA_REMOTE_SESSION_CREATE	0x04
+#define DRM_QDA_REMOTE_INVOKE		0x07
 
 /*
  * QDA IOCTL definitions
@@ -37,6 +38,9 @@ extern "C" {
 						 struct drm_qda_gem_create)
 #define DRM_IOCTL_QDA_GEM_MMAP_OFFSET	DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
 						 struct drm_qda_gem_mmap_offset)
+#define DRM_IOCTL_QDA_REMOTE_SESSION_CREATE \
+	DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_SESSION_CREATE, \
+		 struct drm_qda_init_create)
 #define DRM_IOCTL_QDA_REMOTE_INVOKE	DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_INVOKE, \
 						 struct drm_qda_invoke_args)
 
@@ -99,6 +103,30 @@ struct drm_qda_fastrpc_invoke_args {
 	__u32 attr;
 };
 
+/**
+ * struct drm_qda_init_create - Accelerator process initialization parameters
+ * @filelen: Length of the ELF file in bytes
+ * @filefd: DMA-BUF file descriptor containing the ELF file
+ * @attrs: Process attributes flags
+ * @siglen: Length of signature data in bytes
+ * @file: Pointer to ELF file data if not using filefd
+ *
+ * This structure is used with DRM_IOCTL_QDA_INIT_CREATE to initialize
+ * a new process on the accelerator. The process code is provided either
+ * via a file descriptor (filefd, typically a GEM object) or a direct
+ * pointer (file). Set file to 0 if using filefd.
+ *
+ * The attrs field contains bit flags for debug mode, privileged execution,
+ * and other process attributes.
+ */
+struct drm_qda_init_create {
+	__u32 filelen;
+	__s32 filefd;
+	__u32 attrs;
+	__u32 siglen;
+	__u64 file;
+};
+
 /**
  * struct drm_qda_invoke_args - Dynamic FastRPC invocation parameters
  * @handle: Remote handle to invoke on the DSP
-- 2.34.1
From: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
Implement DRM_IOCTL_QDA_REMOTE_MAP, which maps a DMA buffer into the DSP's virtual address space and returns the DSP virtual address to user-space. Two mapping modes are supported:
QDA_MAP_REQUEST_LEGACY (FASTRPC_RMID_INIT_MMAP)
  Legacy three-argument mapping: sends a fastrpc_map_req_msg to the DSP
  containing the session ID, mapping flags, and virtual address hint,
  together with the physical page descriptor resolved from the DMA-BUF
  fd. The DSP returns the assigned virtual address in
  fastrpc_map_rsp_msg.vaddrout.

QDA_MAP_REQUEST_ATTR (FASTRPC_RMID_INIT_MEM_MAP)
  Attribute-based four-argument mapping: sends a fastrpc_mem_map_req_msg
  which additionally carries the DMA-BUF fd, byte offset, and SMMU
  attribute flags. The DSP uses these to apply custom cache and
  permission attributes to the mapping.
In both cases qda_fastrpc_return_result() writes the DSP virtual address back into the drm_qda_mem_map.vaddrout field so the DRM framework copies it to user-space on IOCTL return.
The DMA-BUF fd is resolved to a fastrpc_phy_page descriptor via setup_mmap_pages(), which imports the fd as a GEM object to obtain the IOMMU-mapped dma_addr and then releases the extra reference.
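The two mapping modes described above differ in method ID and in the number of input buffers they declare. The following is a minimal sketch of that dispatch. It assumes FASTRPC_SCALARS uses the same bit layout as the macro in drivers/misc/fastrpc.c, the numeric values given to the QDA_MAP_REQUEST_* constants are hypothetical (the uapi definitions are not part of this excerpt), and qda_map_scalars() is a helper name invented for illustration.

```c
#include <stdint.h>

/* Method IDs as defined in qda_fastrpc.h in this patch. */
#define FASTRPC_RMID_INIT_MMAP		4
#define FASTRPC_RMID_INIT_MEM_MAP	10

/* Assumed layout, modelled on drivers/misc/fastrpc.c. */
#define FASTRPC_SCALARS(method, in, out) \
	((((uint32_t)(method) & 0x1f) << 24) | \
	 (((uint32_t)(in) & 0xff) << 16) | \
	 (((uint32_t)(out) & 0xff) << 8))

/* Hypothetical numeric values; the real uapi values are not shown here. */
enum qda_map_request {
	QDA_MAP_REQUEST_LEGACY = 0,
	QDA_MAP_REQUEST_ATTR = 1,
};

/* Hypothetical helper: scalars word for each mapping mode, 0 if unknown. */
static uint32_t qda_map_scalars(enum qda_map_request req)
{
	switch (req) {
	case QDA_MAP_REQUEST_LEGACY:
		/* in: request message, page descriptor; out: response */
		return FASTRPC_SCALARS(FASTRPC_RMID_INIT_MMAP, 2, 1);
	case QDA_MAP_REQUEST_ATTR:
		/* in: request, page descriptor, zero-length entry; out: response */
		return FASTRPC_SCALARS(FASTRPC_RMID_INIT_MEM_MAP, 3, 1);
	}
	return 0;
}
```

In both modes the single output buffer is the fastrpc_map_rsp_msg whose vaddrout field is copied back to user-space.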
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
 drivers/accel/qda/qda_drv.c     |   1 +
 drivers/accel/qda/qda_fastrpc.c | 237 ++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_fastrpc.h |  56 ++++++++++
 drivers/accel/qda/qda_ioctl.c   |  36 ++++++
 drivers/accel/qda/qda_ioctl.h   |   1 +
 include/uapi/drm/qda_accel.h    |  45 +++++++-
 6 files changed, 375 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 4eaba9b050c0..3640e4a41605 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -67,6 +67,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
 	DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
 	DRM_IOCTL_DEF_DRV(QDA_REMOTE_SESSION_CREATE, qda_ioctl_init_create, 0),
+	DRM_IOCTL_DEF_DRV(QDA_REMOTE_MAP, qda_ioctl_mmap, 0),
 	DRM_IOCTL_DEF_DRV(QDA_REMOTE_INVOKE, qda_ioctl_invoke, 0),
 };
 
diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
index 305915022b91..cab3a560ceb5 100644
--- a/drivers/accel/qda/qda_fastrpc.c
+++ b/drivers/accel/qda/qda_fastrpc.c
@@ -524,6 +524,44 @@ int qda_fastrpc_invoke_unpack(struct fastrpc_invoke_context *ctx,
 	return err;
 }
 
+static int fastrpc_return_result_mem_map(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+	struct drm_qda_mem_map margs;
+	struct fastrpc_map_rsp_msg *rsp_msg;
+
+	rsp_msg = ctx->rsp;
+
+	memcpy(&margs, argp, sizeof(margs));
+
+	margs.vaddrout = rsp_msg->vaddrout;
+
+	memcpy(argp, &margs, sizeof(margs));
+	return 0;
+}
+
+/**
+ * qda_fastrpc_return_result() - Return invocation result to user-space
+ * @ctx: FastRPC invocation context
+ * @argp: User-space pointer to write result into
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_fastrpc_return_result(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+	int err = 0;
+
+	switch (ctx->type) {
+	case FASTRPC_RMID_INIT_MMAP:
+	case FASTRPC_RMID_INIT_MEM_MAP:
+		err = fastrpc_return_result_mem_map(ctx, argp);
+		break;
+	default:
+		break;
+	}
+
+	return err;
+}
+
 static void setup_create_process_args(struct drm_qda_fastrpc_invoke_args *args,
 				      struct fastrpc_create_process_inbuf *inbuf,
 				      struct drm_qda_init_create *init,
@@ -561,6 +599,37 @@ static void setup_single_arg(struct drm_qda_fastrpc_invoke_args *args, const voi
 	args[0].fd = -1;
 }
 
+/*
+ * setup_mmap_pages() - Resolve a DMA-BUF fd to a physical page descriptor
+ *
+ * Imports the DMA-BUF fd as a GEM object to obtain the IOMMU-mapped
+ * dma_addr, fills in the fastrpc_phy_page entry, then releases the extra
+ * GEM object reference. The handle table keeps the object alive.
+ */
+static int setup_mmap_pages(struct fastrpc_invoke_context *ctx, int dmabuf_fd,
+			    struct fastrpc_phy_page *pages)
+{
+	struct drm_gem_object *gem_obj;
+	struct qda_gem_obj *qda_gem_obj;
+	int err;
+
+	if (dmabuf_fd <= 0) {
+		pages->addr = 0;
+		pages->size = 0;
+		return 0;
+	}
+
+	err = get_gem_obj_from_dmabuf_fd(ctx, dmabuf_fd, &gem_obj);
+	if (err)
+		return err;
+
+	qda_gem_obj = to_qda_gem_obj(gem_obj);
+	setup_pages_from_gem_obj(qda_gem_obj, pages);
+
+	drm_gem_object_put(gem_obj);
+	return 0;
+}
+
 static int fastrpc_prepare_args_release_process(struct fastrpc_invoke_context *ctx)
 {
 	struct drm_qda_fastrpc_invoke_args *args;
@@ -656,6 +725,168 @@ static int fastrpc_prepare_args_init_create(struct fastrpc_invoke_context *ctx,
 	return err;
 }
 
+static int fastrpc_prepare_args_map(struct fastrpc_invoke_context *ctx, char __user *argp) +{ + struct drm_qda_mem_map margs; + struct drm_qda_fastrpc_invoke_args *args; + void *req, *rsp; + struct fastrpc_map_req_msg *req_msg; + struct fastrpc_map_rsp_msg *rsp_msg; + int err; + + memcpy(&margs, argp, sizeof(margs)); + + args = kzalloc_objs(*args, 3); + if (!args) + return -ENOMEM; + + req = kzalloc_obj(*req_msg); + if (!req) { + err = -ENOMEM; + goto err_free_args; + } + req_msg = (struct fastrpc_map_req_msg *)req; + + rsp = kzalloc_obj(*rsp_msg); + if (!rsp) { + err = -ENOMEM; + goto err_free_req; + } + rsp_msg = (struct fastrpc_map_rsp_msg *)rsp; + + ctx->input_pages = kzalloc_objs(*ctx->input_pages, 1); + if (!ctx->input_pages) { + err = -ENOMEM; + goto err_free_rsp; + } + + req_msg->remote_session_id = ctx->remote_session_id; + req_msg->flags = margs.flags; + req_msg->vaddr = margs.vaddrin; + req_msg->num = sizeof(*ctx->input_pages); + + args[0].ptr = (u64)(uintptr_t)req; + args[0].length = sizeof(*req_msg); + args[0].fd = -1; + + /* Resolve DMA-BUF fd to physical page descriptor */ + err = setup_mmap_pages(ctx, margs.fd, ctx->input_pages); + if (err) + goto err_free_input_pages; + + args[1].ptr = (u64)(uintptr_t)ctx->input_pages; + args[1].length = sizeof(*ctx->input_pages); + args[1].fd = -1; + + args[2].ptr = (u64)(uintptr_t)rsp; + args[2].length = sizeof(*rsp_msg); + args[2].fd = -1; + + ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_MMAP, 2, 1); + ctx->args = args; + ctx->req = req; + ctx->rsp = rsp; + ctx->handle = FASTRPC_INIT_HANDLE; + + return 0; + +err_free_input_pages: + kfree(ctx->input_pages); + ctx->input_pages = NULL; +err_free_rsp: + kfree(rsp); +err_free_req: + kfree(req); +err_free_args: + kfree(args); + return err; +} + +static int fastrpc_prepare_args_mem_map_attr(struct fastrpc_invoke_context *ctx, char __user *argp) +{ + struct drm_qda_mem_map margs; + struct drm_qda_fastrpc_invoke_args *args; + void *req, *rsp; + struct 
fastrpc_mem_map_req_msg *req_msg; + struct fastrpc_map_rsp_msg *rsp_msg; + int err; + + memcpy(&margs, argp, sizeof(margs)); + + args = kzalloc_objs(*args, 4); + if (!args) + return -ENOMEM; + + req = kzalloc_obj(*req_msg); + if (!req) { + err = -ENOMEM; + goto err_free_args; + } + req_msg = (struct fastrpc_mem_map_req_msg *)req; + + rsp = kzalloc_obj(*rsp_msg); + if (!rsp) { + err = -ENOMEM; + goto err_free_req; + } + rsp_msg = (struct fastrpc_map_rsp_msg *)rsp; + + ctx->input_pages = kzalloc_objs(*ctx->input_pages, 1); + if (!ctx->input_pages) { + err = -ENOMEM; + goto err_free_rsp; + } + + req_msg->remote_session_id = ctx->remote_session_id; + req_msg->fd = margs.fd; /* DMA-BUF fd forwarded to DSP */ + req_msg->offset = margs.offset; + req_msg->flags = margs.flags; + req_msg->vaddrin = margs.vaddrin; + req_msg->num = sizeof(*ctx->input_pages); + req_msg->data_len = 0; + + args[0].ptr = (u64)(uintptr_t)req; + args[0].length = sizeof(*req_msg); + args[0].fd = -1; + + /* Resolve DMA-BUF fd to physical page descriptor */ + err = setup_mmap_pages(ctx, margs.fd, ctx->input_pages); + if (err) + goto err_free_input_pages; + + args[1].ptr = (u64)(uintptr_t)ctx->input_pages; + args[1].length = sizeof(*ctx->input_pages); + args[1].fd = -1; + + /* args[2] is a zero-length handle-only entry required by the DSP protocol */ + args[2].ptr = (u64)(uintptr_t)ctx->input_pages; + args[2].length = 0; + args[2].fd = -1; + + args[3].ptr = (u64)(uintptr_t)rsp; + args[3].length = sizeof(*rsp_msg); + args[3].fd = -1; + + ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_MEM_MAP, 3, 1); + ctx->args = args; + ctx->req = req; + ctx->rsp = rsp; + ctx->handle = FASTRPC_INIT_HANDLE; + + return 0; + +err_free_input_pages: + kfree(ctx->input_pages); + ctx->input_pages = NULL; +err_free_rsp: + kfree(rsp); +err_free_req: + kfree(req); +err_free_args: + kfree(args); + return err; +} + static int fastrpc_prepare_args_invoke(struct fastrpc_invoke_context *ctx, char __user *argp) { struct 
drm_qda_invoke_args invoke_args; @@ -708,6 +939,12 @@ int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *ar ctx->pd = QDA_USER_PD; err = fastrpc_prepare_args_init_create(ctx, argp); break; + case FASTRPC_RMID_INIT_MMAP: + err = fastrpc_prepare_args_map(ctx, argp); + break; + case FASTRPC_RMID_INIT_MEM_MAP: + err = fastrpc_prepare_args_mem_map_attr(ctx, argp); + break; case FASTRPC_RMID_INVOKE_DYNAMIC: err = fastrpc_prepare_args_invoke(ctx, argp); break; diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h index 1c1236f9525e..71812eaf9a54 100644 --- a/drivers/accel/qda/qda_fastrpc.h +++ b/drivers/accel/qda/qda_fastrpc.h @@ -274,8 +274,10 @@ struct fastrpc_invoke_context {
 /* Remote Method ID table - identifies initialization and control operations */
 #define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP process */
+#define FASTRPC_RMID_INIT_MMAP 4 /* Map memory region to DSP */
 #define FASTRPC_RMID_INIT_CREATE 6 /* Create DSP process */
 #define FASTRPC_RMID_INIT_CREATE_ATTR 7 /* Create DSP process with attributes */
+#define FASTRPC_RMID_INIT_MEM_MAP 10 /* Map DMA buffer with attributes to DSP */
 #define FASTRPC_RMID_INVOKE_DYNAMIC 0xFFFFFFFF /* Dynamic method invocation */

 /* Common handle for initialization operations */
@@ -290,11 +292,65 @@ struct fastrpc_invoke_context {
 /* Maximum initialization file size (4 MB) */
 #define FASTRPC_INIT_FILELEN_MAX (4 * 1024 * 1024)
+/* Message structures for internal FastRPC calls */
+
+/**
+ * struct fastrpc_mem_map_req_msg - Memory map request message with attributes
+ *
+ * This message structure is sent to the DSP to request mapping
+ * of a DMA buffer with custom attributes (ATTR request).
+ */
+struct fastrpc_mem_map_req_msg {
+	/** @remote_session_id: Client identifier for the session */
+	s32 remote_session_id;
+	/** @fd: DMA-BUF file descriptor of the buffer to map */
+	s32 fd;
+	/** @offset: Byte offset within the buffer */
+	s32 offset;
+	/** @flags: Mapping flags (cache attributes, permissions) */
+	u32 flags;
+	/** @vaddrin: Virtual address hint for the DSP mapping */
+	u64 vaddrin;
+	/** @num: Size of the physical page descriptor array in bytes */
+	s32 num;
+	/** @data_len: Length of additional inline data */
+	s32 data_len;
+};
+
+/**
+ * struct fastrpc_map_req_msg - Legacy memory map request message
+ *
+ * This message structure is sent to the DSP to request mapping
+ * of a DMA buffer into the DSP's virtual address space.
+ */
+struct fastrpc_map_req_msg {
+	/** @remote_session_id: Client identifier for the session */
+	s32 remote_session_id;
+	/** @flags: Mapping flags (cache attributes, permissions) */
+	u32 flags;
+	/** @vaddr: Virtual address hint for the DSP mapping */
+	u64 vaddr;
+	/** @num: Size of the physical page descriptor array in bytes */
+	s32 num;
+};
+
+/**
+ * struct fastrpc_map_rsp_msg - Memory map response message
+ *
+ * This message structure is returned by the DSP after successfully
+ * mapping a buffer, providing the virtual address for future access.
+ */
+struct fastrpc_map_rsp_msg {
+	/** @vaddrout: DSP virtual address assigned to the mapped buffer */
+	u64 vaddrout;
+};
+
 void qda_fastrpc_context_free(struct kref *ref);
 struct fastrpc_invoke_context *qda_fastrpc_context_alloc(void);
 int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp);
 int qda_fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size);
 int qda_fastrpc_invoke_pack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
 int qda_fastrpc_invoke_unpack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
+int qda_fastrpc_return_result(struct fastrpc_invoke_context *ctx, char __user *argp);
 #endif /* __QDA_FASTRPC_H__ */
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index 33f0a798ad13..283eb7535c45 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 // Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
 #include <drm/drm_ioctl.h>
+#include <drm/drm_print.h>
 #include <drm/qda_accel.h>
 #include "qda_drv.h"
 #include "qda_fastrpc.h"
@@ -178,6 +179,10 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
 	if (err)
 		goto err_context_free;
+	err = qda_fastrpc_return_result(ctx, (char __user *)data);
+	if (err)
+		goto err_context_free;
+
 	fastrpc_context_put_id(ctx, qdev);
 	kref_put(&ctx->refcount, qda_fastrpc_context_free);
 	return 0;
@@ -218,6 +223,37 @@ int qda_release_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv)
 	return fastrpc_invoke(FASTRPC_RMID_INIT_RELEASE, &qdev->drm_dev, NULL, file_priv);
 }
+/**
+ * qda_ioctl_mmap() - Map memory to DSP address space
+ * @dev: DRM device structure
+ * @data: User-space data (struct drm_qda_mem_map)
+ * @file_priv: DRM file private data
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_mmap(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+	struct drm_qda_mem_map *map_req;
+
+	if (!data)
+		return -EINVAL;
+
+	map_req = (struct drm_qda_mem_map *)data;
+
+	if (map_req->pad)
+		return -EINVAL;
+
+	switch (map_req->request) {
+	case QDA_MAP_REQUEST_LEGACY:
+		return fastrpc_invoke(FASTRPC_RMID_INIT_MMAP, dev, data, file_priv);
+	case QDA_MAP_REQUEST_ATTR:
+		return fastrpc_invoke(FASTRPC_RMID_INIT_MEM_MAP, dev, data, file_priv);
+	default:
+		drm_err(dev, "Invalid map request type: %u\n", map_req->request);
+		return -EINVAL;
+	}
+}
+
 /**
  * qda_ioctl_invoke() - Perform a dynamic FastRPC method invocation
  * @dev: DRM device structure
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index 192565434363..457ceccede08 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -13,5 +13,6 @@ int qda_ioctl_init_create(struct drm_device *dev, void *data, struct drm_file *f
 int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv);
+int qda_ioctl_mmap(struct drm_device *dev, void *data, struct drm_file *file_priv);
 #endif /* __QDA_IOCTL_H__ */
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index 711e2523a570..173f59abd361 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -21,8 +21,9 @@ extern "C" {
 #define DRM_QDA_QUERY 0x00
 #define DRM_QDA_GEM_CREATE 0x01
 #define DRM_QDA_GEM_MMAP_OFFSET 0x02
-/* Command number 0x03 reserved for INIT_ATTACH; 0x05-0x06 reserved for MAP, MUNMAP */
+/* Command number 0x03 reserved for INIT_ATTACH; 0x06 reserved for MUNMAP */
 #define DRM_QDA_REMOTE_SESSION_CREATE 0x04
+#define DRM_QDA_REMOTE_MAP 0x05
 #define DRM_QDA_REMOTE_INVOKE 0x07
 /*
@@ -41,9 +42,15 @@ extern "C" {
 #define DRM_IOCTL_QDA_REMOTE_SESSION_CREATE \
 	DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_SESSION_CREATE, \
 		 struct drm_qda_init_create)
+#define DRM_IOCTL_QDA_REMOTE_MAP DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_MAP, \
+	struct drm_qda_mem_map)
 #define DRM_IOCTL_QDA_REMOTE_INVOKE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_INVOKE, \
 	struct drm_qda_invoke_args)
+/* Request type definitions for qda_mem_map */
+#define QDA_MAP_REQUEST_LEGACY 1 /* Legacy MMAP operation */
+#define QDA_MAP_REQUEST_ATTR 2 /* Handle-based MEM_MAP operation with attributes */
+
 /**
  * struct drm_qda_query - Device information query structure
  * @dsp_name: Name of DSP (e.g., "adsp", "cdsp", "cdsp1", "gdsp0", "gdsp1")
@@ -145,6 +152,42 @@ struct drm_qda_invoke_args {
 	__u64 args;
 };
+/**
+ * struct drm_qda_mem_map - Memory mapping request structure
+ * @request: Request type (QDA_MAP_REQUEST_LEGACY or QDA_MAP_REQUEST_ATTR)
+ * @flags: Mapping flags for DSP (cache attributes, permissions)
+ * @fd: DMA-BUF file descriptor of the buffer to map
+ * @attrs: Mapping attributes (used for ATTR request)
+ * @offset: Offset within buffer (used for ATTR request)
+ * @pad: Padding for 64-bit alignment (must be zero)
+ * @vaddrin: Optional virtual address hint for mapping
+ * @size: Size of the memory region to map in bytes
+ * @vaddrout: Output DSP virtual address after successful mapping
+ *
+ * This structure is used to request mapping of a DMA buffer into the
+ * DSP's virtual address space. The DSP will map the buffer according
+ * to the specified flags and return the virtual address in vaddrout.
+ *
+ * For QDA_MAP_REQUEST_LEGACY (value 1):
+ * - Uses fields: fd, flags, vaddrin, size, vaddrout
+ * - Legacy MMAP operation for backward compatibility
+ *
+ * For QDA_MAP_REQUEST_ATTR (value 2):
+ * - Uses all fields including attrs and offset
+ * - FD-based MEM_MAP operation with custom SMMU attributes
+ */
+struct drm_qda_mem_map {
+	__u32 request;
+	__u32 flags;
+	__s32 fd;
+	__u32 attrs;
+	__u32 offset;
+	__u32 pad;
+	__u64 vaddrin;
+	__u64 size;
+	__u64 vaddrout;
+};
+
 #if defined(__cplusplus)
 }
 #endif
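
Reviewer-style illustration (not part of the patch): a minimal user-space sketch of how the two request types documented above might be populated. The struct `qda_mem_map_sketch` is a local mirror of `struct drm_qda_mem_map` so the example compiles without the proposed uapi header, and the helper names `qda_fill_map_legacy()` and `qda_fill_map_attr()` are hypothetical. The actual submission, assumed here to be `ioctl(accel_fd, DRM_IOCTL_QDA_REMOTE_MAP, &m)`, is left as a comment because it needs the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QDA_MAP_REQUEST_LEGACY 1
#define QDA_MAP_REQUEST_ATTR   2

/* Local mirror of struct drm_qda_mem_map from the patch (illustration only). */
struct qda_mem_map_sketch {
	uint32_t request;
	uint32_t flags;
	int32_t  fd;
	uint32_t attrs;
	uint32_t offset;
	uint32_t pad;      /* must be zero; qda_ioctl_mmap() returns -EINVAL otherwise */
	uint64_t vaddrin;
	uint64_t size;
	uint64_t vaddrout; /* filled in by the driver on success */
};

/* The explicit pad keeps the 64-bit members naturally aligned. */
static_assert(offsetof(struct qda_mem_map_sketch, vaddrin) == 24, "unexpected layout");
static_assert(sizeof(struct qda_mem_map_sketch) == 48, "unexpected size");

/* LEGACY request: only fd, flags, vaddrin, size (and vaddrout on return) are used. */
static void qda_fill_map_legacy(struct qda_mem_map_sketch *m, int dmabuf_fd,
				uint32_t flags, uint64_t size)
{
	memset(m, 0, sizeof(*m)); /* zeroes pad and the unused fields */
	m->request = QDA_MAP_REQUEST_LEGACY;
	m->fd = dmabuf_fd;
	m->flags = flags;
	m->size = size;
}

/* ATTR request: same as LEGACY plus attrs and offset. */
static void qda_fill_map_attr(struct qda_mem_map_sketch *m, int dmabuf_fd,
			      uint32_t flags, uint32_t attrs,
			      uint32_t offset, uint64_t size)
{
	qda_fill_map_legacy(m, dmabuf_fd, flags, size);
	m->request = QDA_MAP_REQUEST_ATTR;
	m->attrs = attrs;
	m->offset = offset;
	/* Assumed next step: ioctl(accel_fd, DRM_IOCTL_QDA_REMOTE_MAP, m); */
}
```

Zeroing the whole structure first is a deliberate choice in this sketch: it satisfies the pad check and leaves vaddrout in a known state before the driver writes the DSP address back.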
From: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
Implement DRM_IOCTL_QDA_REMOTE_MUNMAP (command 0x06), which unmaps a previously mapped memory region from the DSP's virtual address space. Two unmap modes mirror the two map modes:
  QDA_MUNMAP_REQUEST_LEGACY (FASTRPC_RMID_INIT_MUNMAP)
    Legacy single-argument unmap: sends a fastrpc_munmap_req_msg
    containing the session ID, the DSP virtual address (vaddrout from
    the original map response), and the region size.
  QDA_MUNMAP_REQUEST_ATTR (FASTRPC_RMID_INIT_MEM_UNMAP)
    Attribute-based unmap: sends a fastrpc_mem_unmap_req_msg which
    additionally carries the original DMA-BUF fd and virtual address,
    matching the fd-based MEM_MAP path.
DRM_QDA_REMOTE_MUNMAP is assigned command number 0x06, filling the slot that was previously reserved for this purpose.
Assisted-by: Claude:claude-4-6-sonnet
Signed-off-by: Ekansh Gupta ekansh.gupta@oss.qualcomm.com
---
 drivers/accel/qda/qda_drv.c     |  1 +
 drivers/accel/qda/qda_fastrpc.c | 84 +++++++++++++++++++++++++++++++++++++++++
 drivers/accel/qda/qda_fastrpc.h | 34 +++++++++++++++++
 drivers/accel/qda/qda_ioctl.c   | 28 ++++++++++++++
 drivers/accel/qda/qda_ioctl.h   |  1 +
 include/uapi/drm/qda_accel.h    | 36 +++++++++++++++++-
 6 files changed, 183 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 3640e4a41605..41cc207447b4 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -68,6 +68,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
 	DRM_IOCTL_DEF_DRV(QDA_REMOTE_SESSION_CREATE, qda_ioctl_init_create, 0),
 	DRM_IOCTL_DEF_DRV(QDA_REMOTE_MAP, qda_ioctl_mmap, 0),
+	DRM_IOCTL_DEF_DRV(QDA_REMOTE_MUNMAP, qda_ioctl_munmap, 0),
 	DRM_IOCTL_DEF_DRV(QDA_REMOTE_INVOKE, qda_ioctl_invoke, 0),
 };
diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
index cab3a560ceb5..0513beede428 100644
--- a/drivers/accel/qda/qda_fastrpc.c
+++ b/drivers/accel/qda/qda_fastrpc.c
@@ -887,6 +887,84 @@ static int fastrpc_prepare_args_mem_map_attr(struct fastrpc_invoke_context *ctx,
 	return err;
 }
+static int fastrpc_prepare_args_munmap(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+	struct drm_qda_fastrpc_invoke_args *args;
+	struct fastrpc_munmap_req_msg *req_msg;
+	struct drm_qda_mem_unmap uargs;
+	void *req;
+	int err;
+
+	memcpy(&uargs, argp, sizeof(uargs));
+
+	args = kzalloc_obj(*args);
+	if (!args)
+		return -ENOMEM;
+
+	req = kzalloc_obj(*req_msg);
+	if (!req) {
+		err = -ENOMEM;
+		goto err_free_args;
+	}
+	req_msg = (struct fastrpc_munmap_req_msg *)req;
+
+	req_msg->remote_session_id = ctx->remote_session_id;
+	req_msg->size = uargs.size;
+	req_msg->vaddr = uargs.vaddrout;
+
+	setup_single_arg(args, req_msg, sizeof(*req_msg));
+	ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_MUNMAP, 1, 0);
+	ctx->args = args;
+	ctx->req = req;
+	ctx->handle = FASTRPC_INIT_HANDLE;
+
+	return 0;
+
+err_free_args:
+	kfree(args);
+	return err;
+}
+
+static int fastrpc_prepare_args_mem_unmap_attr(struct fastrpc_invoke_context *ctx,
+					       char __user *argp)
+{
+	struct drm_qda_fastrpc_invoke_args *args;
+	struct fastrpc_mem_unmap_req_msg *req_msg;
+	struct drm_qda_mem_unmap uargs;
+	void *req;
+	int err;
+
+	memcpy(&uargs, argp, sizeof(uargs));
+
+	args = kzalloc_obj(*args);
+	if (!args)
+		return -ENOMEM;
+
+	req = kzalloc_obj(*req_msg);
+	if (!req) {
+		err = -ENOMEM;
+		goto err_free_args;
+	}
+	req_msg = (struct fastrpc_mem_unmap_req_msg *)req;
+
+	req_msg->remote_session_id = ctx->remote_session_id;
+	req_msg->fd = uargs.fd; /* DMA-BUF fd forwarded to DSP */
+	req_msg->vaddrin = uargs.vaddr;
+	req_msg->len = uargs.size;
+
+	setup_single_arg(args, req_msg, sizeof(*req_msg));
+	ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_MEM_UNMAP, 1, 0);
+	ctx->args = args;
+	ctx->req = req;
+	ctx->handle = FASTRPC_INIT_HANDLE;
+
+	return 0;
+
+err_free_args:
+	kfree(args);
+	return err;
+}
+
 static int fastrpc_prepare_args_invoke(struct fastrpc_invoke_context *ctx, char __user *argp)
 {
 	struct drm_qda_invoke_args invoke_args;
@@ -945,6 +1023,12 @@ int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *ar
 	case FASTRPC_RMID_INIT_MEM_MAP:
 		err = fastrpc_prepare_args_mem_map_attr(ctx, argp);
 		break;
+	case FASTRPC_RMID_INIT_MUNMAP:
+		err = fastrpc_prepare_args_munmap(ctx, argp);
+		break;
+	case FASTRPC_RMID_INIT_MEM_UNMAP:
+		err = fastrpc_prepare_args_mem_unmap_attr(ctx, argp);
+		break;
 	case FASTRPC_RMID_INVOKE_DYNAMIC:
 		err = fastrpc_prepare_args_invoke(ctx, argp);
 		break;
diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
index 71812eaf9a54..030e9b954f7a 100644
--- a/drivers/accel/qda/qda_fastrpc.h
+++ b/drivers/accel/qda/qda_fastrpc.h
@@ -275,9 +275,11 @@ struct fastrpc_invoke_context {
 /* Remote Method ID table - identifies initialization and control operations */
 #define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP process */
 #define FASTRPC_RMID_INIT_MMAP 4 /* Map memory region to DSP */
+#define FASTRPC_RMID_INIT_MUNMAP 5 /* Unmap DSP memory region */
 #define FASTRPC_RMID_INIT_CREATE 6 /* Create DSP process */
 #define FASTRPC_RMID_INIT_CREATE_ATTR 7 /* Create DSP process with attributes */
 #define FASTRPC_RMID_INIT_MEM_MAP 10 /* Map DMA buffer with attributes to DSP */
+#define FASTRPC_RMID_INIT_MEM_UNMAP 11 /* Unmap DMA buffer from DSP */
 #define FASTRPC_RMID_INVOKE_DYNAMIC 0xFFFFFFFF /* Dynamic method invocation */
 /* Common handle for initialization operations */
@@ -345,6 +347,38 @@ struct fastrpc_map_rsp_msg {
 	u64 vaddrout;
 };
+/**
+ * struct fastrpc_mem_unmap_req_msg - Memory unmap request message with attributes
+ *
+ * This message structure is sent to the DSP to request unmapping
+ * of a previously mapped memory region (ATTR request).
+ */
+struct fastrpc_mem_unmap_req_msg {
+	/** @remote_session_id: Client identifier for the session */
+	s32 remote_session_id;
+	/** @fd: DMA-BUF file descriptor of the buffer to unmap */
+	s32 fd;
+	/** @vaddrin: DSP virtual address of the mapped region to unmap */
+	u64 vaddrin;
+	/** @len: Size of the region to unmap in bytes */
+	u64 len;
+};
+
+/**
+ * struct fastrpc_munmap_req_msg - Legacy memory unmap request message
+ *
+ * This message structure is sent to the DSP to request unmapping
+ * of a previously mapped memory region.
+ */
+struct fastrpc_munmap_req_msg {
+	/** @remote_session_id: Client identifier for the session */
+	s32 remote_session_id;
+	/** @vaddr: DSP virtual address of the mapped region to unmap */
+	u64 vaddr;
+	/** @size: Size of the region to unmap in bytes */
+	u64 size;
+};
+
 void qda_fastrpc_context_free(struct kref *ref);
 struct fastrpc_invoke_context *qda_fastrpc_context_alloc(void);
 int qda_fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp);
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index 283eb7535c45..aeba6190182e 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -254,6 +254,34 @@ int qda_ioctl_mmap(struct drm_device *dev, void *data, struct drm_file *file_pri
 	}
 }
+/**
+ * qda_ioctl_munmap() - Unmap memory from DSP address space
+ * @dev: DRM device structure
+ * @data: User-space data (struct drm_qda_mem_unmap)
+ * @file_priv: DRM file private data
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_munmap(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+	struct drm_qda_mem_unmap *unmap_req;
+
+	if (!data)
+		return -EINVAL;
+
+	unmap_req = (struct drm_qda_mem_unmap *)data;
+
+	switch (unmap_req->request) {
+	case QDA_MUNMAP_REQUEST_LEGACY:
+		return fastrpc_invoke(FASTRPC_RMID_INIT_MUNMAP, dev, data, file_priv);
+	case QDA_MUNMAP_REQUEST_ATTR:
+		return fastrpc_invoke(FASTRPC_RMID_INIT_MEM_UNMAP, dev, data, file_priv);
+	default:
+		drm_err(dev, "Invalid munmap request type: %u\n", unmap_req->request);
+		return -EINVAL;
+	}
+}
+
 /**
  * qda_ioctl_invoke() - Perform a dynamic FastRPC method invocation
  * @dev: DRM device structure
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index 457ceccede08..e14a39050d09 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -14,5 +14,6 @@ int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *fi
 int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv);
 int qda_ioctl_mmap(struct drm_device *dev, void *data, struct drm_file *file_priv);
+int qda_ioctl_munmap(struct drm_device *dev, void *data, struct drm_file *file_priv);
 #endif /* __QDA_IOCTL_H__ */
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index 173f59abd361..e3b5c9a963bf 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -21,9 +21,10 @@ extern "C" {
 #define DRM_QDA_QUERY 0x00
 #define DRM_QDA_GEM_CREATE 0x01
 #define DRM_QDA_GEM_MMAP_OFFSET 0x02
-/* Command number 0x03 reserved for INIT_ATTACH; 0x06 reserved for MUNMAP */
+/* Command number 0x03 reserved for INIT_ATTACH */
 #define DRM_QDA_REMOTE_SESSION_CREATE 0x04
 #define DRM_QDA_REMOTE_MAP 0x05
+#define DRM_QDA_REMOTE_MUNMAP 0x06
 #define DRM_QDA_REMOTE_INVOKE 0x07
 /*
@@ -44,6 +45,8 @@ extern "C" {
 		 struct drm_qda_init_create)
 #define DRM_IOCTL_QDA_REMOTE_MAP DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_MAP, \
 	struct drm_qda_mem_map)
+#define DRM_IOCTL_QDA_REMOTE_MUNMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_MUNMAP, \
+	struct drm_qda_mem_unmap)
 #define DRM_IOCTL_QDA_REMOTE_INVOKE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_REMOTE_INVOKE, \
 	struct drm_qda_invoke_args)
@@ -51,6 +54,10 @@ extern "C" {
 #define QDA_MAP_REQUEST_LEGACY 1 /* Legacy MMAP operation */
 #define QDA_MAP_REQUEST_ATTR 2 /* Handle-based MEM_MAP operation with attributes */
+/* Request type definitions for qda_mem_unmap */
+#define QDA_MUNMAP_REQUEST_LEGACY 1 /* Legacy MUNMAP operation */
+#define QDA_MUNMAP_REQUEST_ATTR 2 /* Handle-based MEM_UNMAP operation */
+
 /**
  * struct drm_qda_query - Device information query structure
  * @dsp_name: Name of DSP (e.g., "adsp", "cdsp", "cdsp1", "gdsp0", "gdsp1")
@@ -188,6 +195,33 @@ struct drm_qda_mem_map {
 	__u64 vaddrout;
 };
+/**
+ * struct drm_qda_mem_unmap - Memory unmapping request structure
+ * @request: Request type (QDA_MUNMAP_REQUEST_LEGACY or QDA_MUNMAP_REQUEST_ATTR)
+ * @fd: DMA-BUF file descriptor (used for ATTR request)
+ * @vaddr: Virtual address (used for ATTR request)
+ * @vaddrout: DSP virtual address (used for LEGACY request)
+ * @size: Size of the memory region to unmap in bytes
+ *
+ * This structure is used to request unmapping of a previously mapped
+ * memory region from the DSP's virtual address space.
+ *
+ * For QDA_MUNMAP_REQUEST_LEGACY (value 1):
+ * - Uses fields: vaddrout, size
+ * - Legacy MUNMAP operation for backward compatibility
+ *
+ * For QDA_MUNMAP_REQUEST_ATTR (value 2):
+ * - Uses fields: fd, vaddr, size
+ * - Handle-based MEM_UNMAP operation
+ */
+struct drm_qda_mem_unmap {
+	__u32 request;
+	__s32 fd;
+	__u64 vaddr;
+	__u64 vaddrout;
+	__u64 size;
+};
+
 #if defined(__cplusplus)
 }
 #endif
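
Reviewer-style illustration (not part of the patch): a small stand-alone sketch of which user fields feed which wire message for the two unmap request types, following the field assignments in fastrpc_prepare_args_munmap() and fastrpc_prepare_args_mem_unmap_attr() above. All `*_sketch` types and the function `qda_build_unmap_sketch()` are hypothetical local mirrors so the example builds in user space; it performs no allocation and no RPC.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define QDA_MUNMAP_REQUEST_LEGACY 1
#define QDA_MUNMAP_REQUEST_ATTR   2

/* Mirror of struct drm_qda_mem_unmap (uapi side). */
struct qda_mem_unmap_sketch {
	uint32_t request;
	int32_t  fd;
	uint64_t vaddr;    /* consumed by the ATTR request */
	uint64_t vaddrout; /* consumed by the LEGACY request */
	uint64_t size;
};

static_assert(sizeof(struct qda_mem_unmap_sketch) == 32, "unexpected size");

/* Mirror of struct fastrpc_munmap_req_msg (legacy wire message). */
struct munmap_req_sketch {
	int32_t  remote_session_id;
	uint64_t vaddr;
	uint64_t size;
};

/* Mirror of struct fastrpc_mem_unmap_req_msg (attribute wire message). */
struct mem_unmap_req_sketch {
	int32_t  remote_session_id;
	int32_t  fd;
	uint64_t vaddrin;
	uint64_t len;
};

union unmap_wire_sketch {
	struct munmap_req_sketch    legacy;
	struct mem_unmap_req_sketch attr;
};

/* Translate the user request into the matching wire message; -EINVAL on unknown type. */
static int qda_build_unmap_sketch(const struct qda_mem_unmap_sketch *u,
				  int32_t session_id,
				  union unmap_wire_sketch *out)
{
	memset(out, 0, sizeof(*out));

	switch (u->request) {
	case QDA_MUNMAP_REQUEST_LEGACY:
		out->legacy.remote_session_id = session_id;
		out->legacy.vaddr = u->vaddrout; /* address returned by the map call */
		out->legacy.size = u->size;
		return 0;
	case QDA_MUNMAP_REQUEST_ATTR:
		out->attr.remote_session_id = session_id;
		out->attr.fd = u->fd;
		out->attr.vaddrin = u->vaddr;
		out->attr.len = u->size;
		return 0;
	default:
		return -EINVAL;
	}
}
```

One point the sketch makes visible: the LEGACY path reads the address from vaddrout while the ATTR path reads it from vaddr, so a caller switching request types has to populate a different field with the address it received from the map call.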