Linaro-mm-sig April 2025

linaro-mm-sig@lists.linaro.org

24 participants
109 discussions

Re: [PATCH v7 04/11] optee: sync secure world ABI headers

by Jens Wiklander

Hi Rouven, On Fri, Apr 25, 2025 at 3:36 PM Rouven Czerwinski <rouven.czerwinski(a)linaro.org> wrote: > > Hi, > > On Fri, 4 Apr 2025 at 16:31, Jens Wiklander <jens.wiklander(a)linaro.org> wrote: > > > > Update the header files describing the secure world ABI, both with and > > without FF-A. The ABI is extended to deal with protected memory, but as > > usual backward compatible. > > > > Signed-off-by: Jens Wiklander <jens.wiklander(a)linaro.org> > > --- > > drivers/tee/optee/optee_ffa.h | 27 +++++++++--- > > drivers/tee/optee/optee_msg.h | 83 ++++++++++++++++++++++++++++++----- > > drivers/tee/optee/optee_smc.h | 71 +++++++++++++++++++++++++++++- > > 3 files changed, 163 insertions(+), 18 deletions(-) > > > > diff --git a/drivers/tee/optee/optee_ffa.h b/drivers/tee/optee/optee_ffa.h > > index 257735ae5b56..cc257e7956a3 100644 > > --- a/drivers/tee/optee/optee_ffa.h > > +++ b/drivers/tee/optee/optee_ffa.h > > @@ -81,7 +81,7 @@ > > * as the second MSG arg struct for > > * OPTEE_FFA_YIELDING_CALL_WITH_ARG. > > * Bit[31:8]: Reserved (MBZ) > > - * w5: Bitfield of secure world capabilities OPTEE_FFA_SEC_CAP_* below, > > + * w5: Bitfield of OP-TEE capabilities OPTEE_FFA_SEC_CAP_* > > * w6: The maximum secure world notification number > > * w7: Not used (MBZ) > > */ > > @@ -94,6 +94,8 @@ > > #define OPTEE_FFA_SEC_CAP_ASYNC_NOTIF BIT(1) > > /* OP-TEE supports probing for RPMB device if needed */ > > #define OPTEE_FFA_SEC_CAP_RPMB_PROBE BIT(2) > > +/* OP-TEE supports Protected Memory for secure data path */ > > +#define OPTEE_FFA_SEC_CAP_PROTMEM BIT(3) > > > > #define OPTEE_FFA_EXCHANGE_CAPABILITIES OPTEE_FFA_BLOCKING_CALL(2) > > > > @@ -108,7 +110,7 @@ > > * > > * Return register usage: > > * w3: Error code, 0 on success > > - * w4-w7: Note used (MBZ) > > + * w4-w7: Not used (MBZ) > > */ > > #define OPTEE_FFA_UNREGISTER_SHM OPTEE_FFA_BLOCKING_CALL(3) > > > > @@ -119,16 +121,31 @@ > > * Call register usage: > > * w3: Service ID, OPTEE_FFA_ENABLE_ASYNC_NOTIF > > * w4: Notification value to request bottom half processing, should be > > - * less than OPTEE_FFA_MAX_ASYNC_NOTIF_VALUE. > > + * less than OPTEE_FFA_MAX_ASYNC_NOTIF_VALUE > > * w5-w7: Not used (MBZ) > > * > > * Return register usage: > > * w3: Error code, 0 on success > > - * w4-w7: Note used (MBZ) > > + * w4-w7: Not used (MBZ) > > */ > > #define OPTEE_FFA_ENABLE_ASYNC_NOTIF OPTEE_FFA_BLOCKING_CALL(5) > > > > -#define OPTEE_FFA_MAX_ASYNC_NOTIF_VALUE 64 > > +#define OPTEE_FFA_MAX_ASYNC_NOTIF_VALUE 64 > > + > > +/* > > + * Release Protected memory > > + * > > + * Call register usage: > > + * w3: Service ID, OPTEE_FFA_RECLAIM_PROTMEM > > + * w4: Shared memory handle, lower bits > > + * w5: Shared memory handle, higher bits > > + * w6-w7: Not used (MBZ) > > + * > > + * Return register usage: > > + * w3: Error code, 0 on success > > + * w4-w7: Note used (MBZ) > > + */ > > +#define OPTEE_FFA_RELEASE_PROTMEM OPTEE_FFA_BLOCKING_CALL(8) > > > > /* > > * Call with struct optee_msg_arg as argument in the supplied shared memory > > diff --git a/drivers/tee/optee/optee_msg.h b/drivers/tee/optee/optee_msg.h > > index e8840a82b983..22d71d6f110d 100644 > > --- a/drivers/tee/optee/optee_msg.h > > +++ b/drivers/tee/optee/optee_msg.h > > @@ -133,13 +133,13 @@ struct optee_msg_param_rmem { > > }; > > > > /** > > - * struct optee_msg_param_fmem - ffa memory reference parameter > > + * struct optee_msg_param_fmem - FF-A memory reference parameter > > * @offs_lower: Lower bits of offset into shared memory reference > > * @offs_upper: Upper bits of offset into shared memory reference > > * @internal_offs: Internal offset into the first page of shared memory > > * reference > > * @size: Size of the buffer > > - * @global_id: Global identifier of Shared memory > > + * @global_id: Global identifier of the shared memory > > */ > > struct optee_msg_param_fmem { > > u32 offs_low; > > @@ -165,7 +165,7 @@ struct optee_msg_param_value { > > * @attr: attributes > > * @tmem: parameter by temporary memory reference > > * @rmem: parameter by registered memory reference > > - * @fmem: parameter by ffa registered memory reference > > + * @fmem: parameter by FF-A registered memory reference > > * @value: parameter by opaque value > > * @octets: parameter by octet string > > * > > @@ -296,6 +296,18 @@ struct optee_msg_arg { > > */ > > #define OPTEE_MSG_FUNCID_GET_OS_REVISION 0x0001 > > > > +/* > > + * Values used in OPTEE_MSG_CMD_LEND_PROTMEM below > > + * OPTEE_MSG_PROTMEM_RESERVED Reserved > > + * OPTEE_MSG_PROTMEM_SECURE_VIDEO_PLAY Secure Video Playback > > + * OPTEE_MSG_PROTMEM_TRUSTED_UI Trused UI > > + * OPTEE_MSG_PROTMEM_SECURE_VIDEO_RECORD Secure Video Recording > > + */ > > +#define OPTEE_MSG_PROTMEM_RESERVED 0 > > +#define OPTEE_MSG_PROTMEM_SECURE_VIDEO_PLAY 1 > > +#define OPTEE_MSG_PROTMEM_TRUSTED_UI 2 > > +#define OPTEE_MSG_PROTMEM_SECURE_VIDEO_RECORD 3 > > + > > /* > > * Do a secure call with struct optee_msg_arg as argument > > * The OPTEE_MSG_CMD_* below defines what goes in struct optee_msg_arg::cmd > > @@ -337,15 +349,62 @@ struct optee_msg_arg { > > * OPTEE_MSG_CMD_STOP_ASYNC_NOTIF informs secure world that from now is > > * normal world unable to process asynchronous notifications. Typically > > * used when the driver is shut down. > > + * > > + * OPTEE_MSG_CMD_LEND_PROTMEM lends protected memory. The passed normal > > + * physical memory is protected from normal world access. The memory > > + * should be unmapped prior to this call since it becomes inaccessible > > + * during the request. > > + * Parameters are passed as: > > + * [in] param[0].attr OPTEE_MSG_ATTR_TYPE_VALUE_INPUT > > + * [in] param[0].u.value.a OPTEE_MSG_PROTMEM_* defined above > > + * [in] param[1].attr OPTEE_MSG_ATTR_TYPE_TMEM_INPUT > > + * [in] param[1].u.tmem.buf_ptr physical address > > + * [in] param[1].u.tmem.size size > > + * [in] param[1].u.tmem.shm_ref holds protected memory reference > > + * > > + * OPTEE_MSG_CMD_RECLAIM_PROTMEM reclaims a previously lent protected > > + * memory reference. The physical memory is accessible by the normal world > > + * after this function has return and can be mapped again. The information > > + * is passed as: > > + * [in] param[0].attr OPTEE_MSG_ATTR_TYPE_VALUE_INPUT > > + * [in] param[0].u.value.a holds protected memory cookie > > + * > > + * OPTEE_MSG_CMD_GET_PROTMEM_CONFIG get configuration for a specific > > + * protected memory use case. Parameters are passed as: > > + * [in] param[0].attr OPTEE_MSG_ATTR_TYPE_VALUE_INOUT > > + * [in] param[0].value.a OPTEE_MSG_PROTMEM_* > > + * [in] param[1].attr OPTEE_MSG_ATTR_TYPE_{R,F}MEM_OUTPUT > > + * [in] param[1].u.{r,f}mem Buffer or NULL > > + * [in] param[1].u.{r,f}mem.size Provided size of buffer or 0 for query > > + * output for the protected use case: > > + * [out] param[0].value.a Minimal size of protected memory > > + * [out] param[0].value.b Required alignment of size and start of > > + * protected memory > > + * [out] param[1].{r,f}mem.size Size of output data > > + * [out] param[1].{r,f}mem If non-NULL, contains an array of > > + * uint16_t holding endpoints that > > + * must be included when lending > > + * memory for this use case > > + * > > + * OPTEE_MSG_CMD_ASSIGN_PROTMEM assigns use-case to protected memory > > + * previously lent using the FFA_LEND framework ABI. Parameters are passed > > + * as: > > + * [in] param[0].attr OPTEE_MSG_ATTR_TYPE_VALUE_INPUT > > + * [in] param[0].u.value.a holds protected memory cookie > > + * [in] param[0].u.value.b OPTEE_MSG_PROTMEM_* defined above > > */ > > -#define OPTEE_MSG_CMD_OPEN_SESSION 0 > > -#define OPTEE_MSG_CMD_INVOKE_COMMAND 1 > > -#define OPTEE_MSG_CMD_CLOSE_SESSION 2 > > -#define OPTEE_MSG_CMD_CANCEL 3 > > -#define OPTEE_MSG_CMD_REGISTER_SHM 4 > > -#define OPTEE_MSG_CMD_UNREGISTER_SHM 5 > > -#define OPTEE_MSG_CMD_DO_BOTTOM_HALF 6 > > -#define OPTEE_MSG_CMD_STOP_ASYNC_NOTIF 7 > > -#define OPTEE_MSG_FUNCID_CALL_WITH_ARG 0x0004 > > +#define OPTEE_MSG_CMD_OPEN_SESSION 0 > > +#define OPTEE_MSG_CMD_INVOKE_COMMAND 1 > > +#define OPTEE_MSG_CMD_CLOSE_SESSION 2 > > +#define OPTEE_MSG_CMD_CANCEL 3 > > +#define OPTEE_MSG_CMD_REGISTER_SHM 4 > > +#define OPTEE_MSG_CMD_UNREGISTER_SHM 5 > > +#define OPTEE_MSG_CMD_DO_BOTTOM_HALF 6 > > +#define OPTEE_MSG_CMD_STOP_ASYNC_NOTIF 7 > > +#define OPTEE_MSG_CMD_LEND_PROTMEM 8 > > +#define OPTEE_MSG_CMD_RECLAIM_PROTMEM 9 > > +#define OPTEE_MSG_CMD_GET_PROTMEM_CONFIG 10 > > +#define OPTEE_MSG_CMD_ASSIGN_PROTMEM 11 > > +#define OPTEE_MSG_FUNCID_CALL_WITH_ARG 0x0004 > > > > #endif /* _OPTEE_MSG_H */ > > diff --git a/drivers/tee/optee/optee_smc.h b/drivers/tee/optee/optee_smc.h > > index 879426300821..b17e81f464a3 100644 > > --- a/drivers/tee/optee/optee_smc.h > > +++ b/drivers/tee/optee/optee_smc.h > > @@ -264,7 +264,6 @@ struct optee_smc_get_shm_config_result { > > #define OPTEE_SMC_SEC_CAP_HAVE_RESERVED_SHM BIT(0) > > /* Secure world can communicate via previously unregistered shared memory */ > > #define OPTEE_SMC_SEC_CAP_UNREGISTERED_SHM BIT(1) > > - > > /* > > * Secure world supports commands "register/unregister shared memory", > > * secure world accepts command buffers located in any parts of non-secure RAM > > @@ -280,6 +279,10 @@ struct optee_smc_get_shm_config_result { > > #define OPTEE_SMC_SEC_CAP_RPC_ARG BIT(6) > > /* Secure world supports probing for RPMB device if needed */ > > #define OPTEE_SMC_SEC_CAP_RPMB_PROBE BIT(7) > > +/* Secure world supports protected memory */ > > +#define OPTEE_SMC_SEC_CAP_PROTMEM BIT(8) > > +/* Secure world supports dynamic protected memory */ > > +#define OPTEE_SMC_SEC_CAP_DYNAMIC_PROTMEM BIT(9) > > > > #define OPTEE_SMC_FUNCID_EXCHANGE_CAPABILITIES 9 > > #define OPTEE_SMC_EXCHANGE_CAPABILITIES \ > > @@ -451,6 +454,72 @@ struct optee_smc_disable_shm_cache_result { > > > > /* See OPTEE_SMC_CALL_WITH_REGD_ARG above */ > > #define OPTEE_SMC_FUNCID_CALL_WITH_REGD_ARG 19 > > +/* > > + * Get protected memory config > > + * > > + * Returns the protected memory config. > > + * > > + * Call register usage: > > + * a0 SMC Function ID, OPTEE_SMC_GET_PROTMEM_CONFIG > > + * a2-6 Not used, must be zero > > + * a7 Hypervisor Client ID register > > + * > > + * Have config return register usage: > > + * a0 OPTEE_SMC_RETURN_OK > > + * a1 Physical address of start of protected memory > > + * a2 Size of protected memory > > + * a3 Not used > > + * a4-7 Preserved > > + * > > + * Not available register usage: > > + * a0 OPTEE_SMC_RETURN_ENOTAVAIL > > + * a1-3 Not used > > + * a4-7 Preserved > > + */ > > +#define OPTEE_SMC_FUNCID_GET_PROTMEM_CONFIG 20 > > +#define OPTEE_SMC_GET_PROTMEM_CONFIG \ > > + OPTEE_SMC_FAST_CALL_VAL(OPTEE_SMC_FUNCID_GET_PROTMEM_CONFIG) > > + > > +struct optee_smc_get_protmem_config_result { > > + unsigned long status; > > + unsigned long start; > > + unsigned long size; > > + unsigned long flags; > > The ABI comment does not document a flags return argument, either > this can be removed or the ABI comment needs to be fixed. Sure, I'll remove the field. > Same for > > +}; > > + > > +/* > > + * Get dynamic protected memory config > > + * > > + * Returns the dynamic protected memory config. > > + * > > + * Call register usage: > > + * a0 SMC Function ID, OPTEE_SMC_GET_DYN_SHM_CONFIG > > should be OPTEE_SMC_GET_DYN_PROTMEM_CONFIG Thanks, I'll update. > > > + * a2-6 Not used, must be zero > > + * a7 Hypervisor Client ID register > > + * > > + * Have config return register usage: > > + * a0 OPTEE_SMC_RETURN_OK > > + * a1 Minamal size of protected memory > > Nit: Typo, should be "Minimal" Yes, I'll update. Cheers, Jens > > > + * a2 Required alignment of size and start of registered protected memory > > + * a3 Not used > > + * a4-7 Preserved > > + * > > + * Not available register usage: > > + * a0 OPTEE_SMC_RETURN_ENOTAVAIL > > + * a1-3 Not used > > + * a4-7 Preserved > > + */ > > + > > +#define OPTEE_SMC_FUNCID_GET_DYN_PROTMEM_CONFIG 21 > > +#define OPTEE_SMC_GET_DYN_PROTMEM_CONFIG \ > > + OPTEE_SMC_FAST_CALL_VAL(OPTEE_SMC_FUNCID_GET_DYN_PROTMEM_CONFIG) > > + > > +struct optee_smc_get_dyn_protmem_config_result { > > + unsigned long status; > > + unsigned long size; > > + unsigned long align; > > + unsigned long flags; > > +}; > > > > /* > > * Resume from RPC (for example after processing a foreign interrupt) > > -- > > 2.43.0 > > - Rouven

7 months

[PATCH AUTOSEL 6.13 15/34] drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P

by Sasha Levin

From: Christian König <christian.koenig(a)amd.com> [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ] Try pinning into VRAM to allow P2P with RDMA NICs without ODP support if all attachments can do P2P. If any attachment can't do P2P just pin into GTT instead. Acked-by: Simona Vetter <simona.vetter(a)ffwll.ch> Signed-off-by: Christian König <christian.koenig(a)amd.com> Signed-off-by: Felix Kuehling <felix.kuehling(a)amd.com> Reviewed-by: Felix Kuehling <felix.kuehling(a)amd.com> Tested-by: Pak Nin Lui <pak.lui(a)amd.com> Cc: Simona Vetter <simona.vetter(a)ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 25 +++++++++++++++------ 1 file changed, 18 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index 8e81a83d37d84..83390143c2e9f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -72,11 +72,25 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf, */ static int amdgpu_dma_buf_pin(struct dma_buf_attachment *attach) { - struct drm_gem_object *obj = attach->dmabuf->priv; - struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); + struct dma_buf *dmabuf = attach->dmabuf; + struct amdgpu_bo *bo = gem_to_amdgpu_bo(dmabuf->priv); + u32 domains = bo->preferred_domains; - /* pin buffer into GTT */ - return amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT); + dma_resv_assert_held(dmabuf->resv); + + /* + * Try pinning into VRAM to allow P2P with RDMA NICs without ODP + * support if all attachments can do P2P. If any attachment can't do + * P2P just pin into GTT instead. + */ + list_for_each_entry(attach, &dmabuf->attachments, node) + if (!attach->peer2peer) + domains &= ~AMDGPU_GEM_DOMAIN_VRAM; + + if (domains & AMDGPU_GEM_DOMAIN_VRAM) + bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; + + return amdgpu_bo_pin(bo, domains); } /** @@ -131,9 +145,6 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach, r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx); if (r) return ERR_PTR(r); - - } else if (bo->tbo.resource->mem_type != TTM_PL_TT) { - return ERR_PTR(-EBUSY); } switch (bo->tbo.resource->mem_type) { -- 2.39.5

7 months, 1 week

[PATCH v3 0/2] dma-buf: heaps: Support carved-out heaps

by Maxime Ripard

Hi, This series is the follow-up of the discussion that John and I had some time ago here: https://lore.kernel.org/all/CANDhNCquJn6bH3KxKf65BWiTYLVqSd9892-xtFDHHqqyrr… The initial problem we were discussing was that I'm currently working on a platform which has a memory layout with ECC enabled. However, enabling the ECC has a number of drawbacks on that platform: lower performance, increased memory usage, etc. So for things like framebuffers, the trade-off isn't great and thus there's a memory region with ECC disabled to allocate from for such use cases. After a suggestion from John, I chose to first start using heap allocations flags to allow for userspace to ask for a particular ECC setup. This is then backed by a new heap type that runs from reserved memory chunks flagged as such, and the existing DT properties to specify the ECC properties. After further discussion, it was considered that flags were not the right solution, and relying on the names of the heaps would be enough to let userspace know the kind of buffer it deals with. Thus, even though the uAPI part of it has been dropped in this second version, we still need a driver to create heaps out of carved-out memory regions. In addition to the original usecase, a similar driver can be found in BSPs from most vendors, so I believe it would be a useful addition to the kernel. I submitted a draft PR to the DT schema for the bindings used in this PR: https://github.com/devicetree-org/dt-schema/pull/138 Let me know what you think, Maxime Signed-off-by: Maxime Ripard <mripard(a)kernel.org> --- Changes in v3: - Reworked global variable patch - Link to v2: https://lore.kernel.org/r/20250401-dma-buf-ecc-heap-v2-0-043fd006a1af@kerne… Changes in v2: - Add vmap/vunmap operations - Drop ECC flags uapi - Rebase on top of 6.14 - Link to v1: https://lore.kernel.org/r/20240515-dma-buf-ecc-heap-v1-0-54cbbd049511@kerne… --- Maxime Ripard (2): dma-buf: heaps: system: Remove global variable dma-buf: heaps: Introduce a new heap for reserved memory drivers/dma-buf/heaps/Kconfig | 8 + drivers/dma-buf/heaps/Makefile | 1 + drivers/dma-buf/heaps/carveout_heap.c | 360 ++++++++++++++++++++++++++++++++++ drivers/dma-buf/heaps/system_heap.c | 3 +- 4 files changed, 370 insertions(+), 2 deletions(-) --- base-commit: fcbf30774e82a441890b722bf0c26542fb82150f change-id: 20240515-dma-buf-ecc-heap-28a311d2c94e Best regards, -- Maxime Ripard <mripard(a)kernel.org>

7 months, 1 week

Re: [PATCH 2/4] bpf: Add dmabuf iterator

by T.J. Mercier

On Tue, Apr 22, 2025 at 4:01 PM Alexei Starovoitov <alexei.starovoitov(a)gmail.com> wrote: > > On Tue, Apr 22, 2025 at 12:57 PM T.J. Mercier <tjmercier(a)google.com> wrote: > > > > On Mon, Apr 21, 2025 at 4:39 PM Alexei Starovoitov > > <alexei.starovoitov(a)gmail.com> wrote: > > > > > > On Mon, Apr 21, 2025 at 1:40 PM T.J. Mercier <tjmercier(a)google.com> wrote: > > > > > > > > > > new file mode 100644 > > > > > > index 000000000000..b4b8be1d6aa4 > > > > > > --- /dev/null > > > > > > +++ b/kernel/bpf/dmabuf_iter.c > > > > > > > > > > Maybe we should add this file to drivers/dma-buf. I would like to > > > > > hear other folks thoughts on this. > > > > > > > > This is fine with me, and would save us the extra > > > > CONFIG_DMA_SHARED_BUFFER check that's currently needed in > > > > kernel/bpf/Makefile but would require checking CONFIG_BPF instead. > > > > Sumit / Christian any objections to moving the dmabuf bpf iterator > > > > implementation into drivers/dma-buf? > > > > > > The driver directory would need to 'depends on BPF_SYSCALL'. > > > Are you sure you want this? > > > imo kernel/bpf/ is fine for this. > > > > I don't have a strong preference so either way is fine with me. The > > main difference I see is maintainership. > > > > > You also probably want > > > .feature = BPF_ITER_RESCHED > > > in bpf_dmabuf_reg_info. > > > > Thank you, this looks like a good idea. > > > > > Also have you considered open coded iterator for dmabufs? > > > Would it help with the interface to user space? > > > > I read through the open coded iterator patches, and it looks like they > > would be slightly more efficient by avoiding seq_file overhead. As far > > as the interface to userspace, for the purpose of replacing what's > > currently exposed by CONFIG_DMABUF_SYSFS_STATS I don't think there is > > a difference. However it looks like if I were to try to replace all of > > our userspace analysis of dmabufs with a single bpf program then an > > open coded iterator would make that much easier. I had not considered > > attempting that. > > > > One problem I see with open coded iterators is that support is much > > more recent (2023 vs 2020). We support longterm stable kernels (back > > to 5.4 currently but probably 5.10 by the time this would be used), so > > it seems like it would be harder to backport the kernel support for an > > open-coded iterator that far since it only goes back as far as 6.6 > > now. Actually it doesn't look like it is possible while also > > maintaining the stable ABI we provide to device vendors. Which means > > we couldn't get rid of the dmabuf sysfs stats userspace dependency > > until 6.1 EOL in Dec. 2027. :\ So I'm in favor of a traditional bpf > > iterator here for now. > > Fair enough, but please implement both and backport only > the old style pinned iterator. Ok, will do. > The code will be mostly shared between them. > bpf_iter_dmabuf_new/_next will be more flexible with more > options to return data to user space. Like android can invent > their own binary format. Pack into it in a bpf prog, send to > bpf ringbuf and unmarshal efficiently in user space. > Instead of being limited to text output that pinned iterators > are supposed to do usually. Also a neat idea! > You can do binary with bpf_seq_write() too, but it's rare. > > Also please provide full bpf prog that you'll use in production > in a selftest instead of trivial: > +SEC("iter/dmabuf") > +int dmabuf_collector(struct bpf_iter__dmabuf *ctx) > > just to make sure it's tested end to end and future changes > won't break it. The final bpf program should be something pretty close to that, but I'll start working on the AOSP side as well so I can put up patches. > > pw-bot: cr

7 months, 1 week

Re: [PATCH v2 0/2] dma-buf: heaps: Use constant name for CMA heap

by Sumit Semwal

Hello Jared, On Wed, 23 Apr 2025 at 00:49, Jared Kangas <jkangas(a)redhat.com> wrote: > > Hi all, > > This patch series is based on a previous discussion around CMA heap > naming. [1] The heap's name depends on the device name, which is > generally "reserved", "linux,cma", or "default-pool", but could be any > arbitrary name given to the default CMA area in the devicetree. For a > consistent userspace interface, the series introduces a constant name > for the CMA heap, and for backwards compatibility, an additional Kconfig > that controls the creation of a legacy-named heap with the same CMA > backing. > > The ideas to handle backwards compatibility in [1] are to either use a > symlink or add a heap node with a duplicate minor. However, I assume > that we don't want to create symlinks in /dev from module initcalls, and > attempting to duplicate minors would cause device_create() to fail. > Because of these drawbacks, after brainstorming with Maxime Ripard, I > went with creating a new node in devtmpfs with its own minor. This > admittedly makes it a little unclear that the old and new nodes are > backed by the same heap when both are present. The only approach that I > think would provide total clarity on this in userspace is symlinking, > which seemed like a fairly involved solution for devtmpfs, but if I'm > wrong on this, please let me know. Thanks indeed for this patch; just one minor nit: the link referred to as [1] here seems to be missing. Could you please add it? This would make it easier to follow the chain of discussion in posterity. > > Changelog: > v2: Use tabs instead of spaces for large vertical alignment. > > Jared Kangas (2): > dma-buf: heaps: Parameterize heap name in __add_cma_heap() > dma-buf: heaps: Give default CMA heap a fixed name > > Documentation/userspace-api/dma-buf-heaps.rst | 11 ++++--- > drivers/dma-buf/heaps/Kconfig | 10 +++++++ > drivers/dma-buf/heaps/cma_heap.c | 30 ++++++++++++++----- > 3 files changed, 40 insertions(+), 11 deletions(-) > > -- > 2.49.0 > Best, Sumit

7 months, 1 week

Re: [PATCH v2 2/2] dma-buf: heaps: Give default CMA heap a fixed name

by John Stultz

On Tue, Apr 22, 2025 at 12:19 PM Jared Kangas <jkangas(a)redhat.com> wrote: > > The CMA heap's name in devtmpfs can vary depending on how the heap is > defined. Its name defaults to "reserved", but if a CMA area is defined > in the devicetree, the heap takes on the devicetree node's name, such as > "default-pool" or "linux,cma". To simplify naming, just name it > "default_cma", and keep a legacy node in place backed by the same > underlying structure for backwards compatibility. > > Signed-off-by: Jared Kangas <jkangas(a)redhat.com> Once again, thanks for working out how to improve the standard naming while keeping compatibility. I do still hope we can get to the point where other cma regions can be registered as heaps with unique/purpose-specific names, but I can see having a standard name for the default region is a nice improvement. Acked-by: John Stultz <jstultz(a)google.com> thanks -john

7 months, 1 week

Re: [PATCH v2 1/2] dma-buf: heaps: Parameterize heap name in __add_cma_heap()

by John Stultz

On Tue, Apr 22, 2025 at 12:19 PM Jared Kangas <jkangas(a)redhat.com> wrote: > > Prepare for the introduction of a fixed-name CMA heap by replacing the > unused void pointer parameter in __add_cma_heap() with the heap name. > > Signed-off-by: Jared Kangas <jkangas(a)redhat.com> Thanks so much for taking this effort on. Looks good to me! Acked-by: John Stultz <jstultz(a)google.com>

7 months, 1 week

Re: [PATCH 2/4] bpf: Add dmabuf iterator

by T.J. Mercier

On Mon, Apr 21, 2025 at 4:39 PM Alexei Starovoitov <alexei.starovoitov(a)gmail.com> wrote: > > On Mon, Apr 21, 2025 at 1:40 PM T.J. Mercier <tjmercier(a)google.com> wrote: > > > > > > new file mode 100644 > > > > index 000000000000..b4b8be1d6aa4 > > > > --- /dev/null > > > > +++ b/kernel/bpf/dmabuf_iter.c > > > > > > Maybe we should add this file to drivers/dma-buf. I would like to > > > hear other folks thoughts on this. > > > > This is fine with me, and would save us the extra > > CONFIG_DMA_SHARED_BUFFER check that's currently needed in > > kernel/bpf/Makefile but would require checking CONFIG_BPF instead. > > Sumit / Christian any objections to moving the dmabuf bpf iterator > > implementation into drivers/dma-buf? > > The driver directory would need to 'depends on BPF_SYSCALL'. > Are you sure you want this? > imo kernel/bpf/ is fine for this. I don't have a strong preference so either way is fine with me. The main difference I see is maintainership. > You also probably want > .feature = BPF_ITER_RESCHED > in bpf_dmabuf_reg_info. Thank you, this looks like a good idea. > Also have you considered open coded iterator for dmabufs? > Would it help with the interface to user space? I read through the open coded iterator patches, and it looks like they would be slightly more efficient by avoiding seq_file overhead. As far as the interface to userspace, for the purpose of replacing what's currently exposed by CONFIG_DMABUF_SYSFS_STATS I don't think there is a difference. However it looks like if I were to try to replace all of our userspace analysis of dmabufs with a single bpf program then an open coded iterator would make that much easier. I had not considered attempting that. One problem I see with open coded iterators is that support is much more recent (2023 vs 2020). We support longterm stable kernels (back to 5.4 currently but probably 5.10 by the time this would be used), so it seems like it would be harder to backport the kernel support for an open-coded iterator that far since it only goes back as far as 6.6 now. Actually it doesn't look like it is possible while also maintaining the stable ABI we provide to device vendors. Which means we couldn't get rid of the dmabuf sysfs stats userspace dependency until 6.1 EOL in Dec. 2027. :\ So I'm in favor of a traditional bpf iterator here for now.

7 months, 1 week

Re: [PATCH 0/3] uio/dma-buf: Give UIO users access to DMA addresses.

by Jason Gunthorpe

On Mon, Apr 14, 2025 at 09:21:25PM +0200, Thomas Petazzoni wrote: > > "UIO is a broken legacy mess, so let's add more broken things > > to it as broken + broken => still broken, so no harm done", am I > > getting that right? > > Who says UIO is a "broken legacy mess"? Only you says so. I don't see > any indication anywhere in the kernel tree suggesting that UIO is > considered a broken legacy mess. Explain what the difference is between UIO and VFIO, especially VFIO no-iommu mode? I've always understood that UIO is for very simple devices that cannot do DMA. So it's very simple operating model and simple security work fine. IMHO, if the can use DMA it should use VFIO. If you have no iommu then you should use the VFIO unsafe no-iommu path. It still provides a solid framework. As to this series, I have seen a number of requests to improve the VFIO no-iommu path to allow working with the existing IOAS scheme to register memory but to allow the kernel the return the no-iommu DMAable address of the IOAS pinned memory. This would replace the hacky use of mlock and /proc/XX/pagemap that people use today. If that were done, could you use VFIO no-iommu? > Keep in mind that when you're running code as root, you can load a > kernel module, which can do anything on the system security-wise. So > letting UIO expose MMIO registers of devices to userspace applications > running as root is not any worse than that. That isn't fully true.. UIO isn't fitting into the security model by allowing DMA capable devices to be exposed without checking for CAP_SYS_RAW_IO first. Jason

7 months, 1 week

Re: [PATCH 2/4] bpf: Add dmabuf iterator

by T.J. Mercier

On Mon, Apr 21, 2025 at 10:58 AM Song Liu <song(a)kernel.org> wrote: > > On Mon, Apr 14, 2025 at 3:53 PM T.J. Mercier <tjmercier(a)google.com> wrote: > > > > The dmabuf iterator traverses the list of all DMA buffers. The list is > > maintained only when CONFIG_DEBUG_FS is enabled. > > > > DMA buffers are refcounted through their associated struct file. A > > reference is taken on each buffer as the list is iterated to ensure each > > buffer persists for the duration of the bpf program execution without > > holding the list mutex. > > > > Signed-off-by: T.J. Mercier <tjmercier(a)google.com> > > --- > > include/linux/btf_ids.h | 1 + > > kernel/bpf/Makefile | 3 + > > kernel/bpf/dmabuf_iter.c | 130 +++++++++++++++++++++++++++++++++++++++ > > 3 files changed, 134 insertions(+) > > create mode 100644 kernel/bpf/dmabuf_iter.c > > > > diff --git a/include/linux/btf_ids.h b/include/linux/btf_ids.h > > index 139bdececdcf..899ead57d89d 100644 > > --- a/include/linux/btf_ids.h > > +++ b/include/linux/btf_ids.h > > @@ -284,5 +284,6 @@ extern u32 bpf_cgroup_btf_id[]; > > extern u32 bpf_local_storage_map_btf_id[]; > > extern u32 btf_bpf_map_id[]; > > extern u32 bpf_kmem_cache_btf_id[]; > > +extern u32 bpf_dmabuf_btf_id[]; > > This is not necessary. See below. > > > > > #endif > > diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile > > index 70502f038b92..5b30d37ef055 100644 > > --- a/kernel/bpf/Makefile > > +++ b/kernel/bpf/Makefile > > @@ -53,6 +53,9 @@ obj-$(CONFIG_BPF_SYSCALL) += relo_core.o > > obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o > > obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o > > obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o > > +ifeq ($(CONFIG_DEBUG_FS),y) > > +obj-$(CONFIG_BPF_SYSCALL) += dmabuf_iter.o > > +endif > > > > CFLAGS_REMOVE_percpu_freelist.o = $(CC_FLAGS_FTRACE) > > CFLAGS_REMOVE_bpf_lru_list.o = $(CC_FLAGS_FTRACE) > > diff --git a/kernel/bpf/dmabuf_iter.c b/kernel/bpf/dmabuf_iter.c > > new file mode 100644 > > index 000000000000..b4b8be1d6aa4 > > --- /dev/null > > +++ b/kernel/bpf/dmabuf_iter.c > > Maybe we should add this file to drivers/dma-buf. I would like to > hear other folks thoughts on this. This is fine with me, and would save us the extra CONFIG_DMA_SHARED_BUFFER check that's currently needed in kernel/bpf/Makefile but would require checking CONFIG_BPF instead. Sumit / Christian any objections to moving the dmabuf bpf iterator implementation into drivers/dma-buf? > > @@ -0,0 +1,130 @@ > > +// SPDX-License-Identifier: GPL-2.0-only > > +/* Copyright (c) 2025 Google LLC */ > > +#include <linux/bpf.h> > > +#include <linux/btf_ids.h> > > +#include <linux/dma-buf.h> > > +#include <linux/kernel.h> > > +#include <linux/seq_file.h> > > + > > +BTF_ID_LIST_GLOBAL_SINGLE(bpf_dmabuf_btf_id, struct, dma_buf) > > Use BTF_ID_LIST_SINGLE(), then we don't need this in btf_ids.h > > > +DEFINE_BPF_ITER_FUNC(dmabuf, struct bpf_iter_meta *meta, struct dma_buf *dmabuf) > > + > > +static void *dmabuf_iter_seq_start(struct seq_file *seq, loff_t *pos) > > +{ > > + struct dma_buf *dmabuf, *ret = NULL; > > + > > + if (*pos) { > > + *pos = 0; > > + return NULL; > > + } > > + /* Look for the first buffer we can obtain a reference to. > > + * The list mutex does not protect a dmabuf's refcount, so it can be > > + * zeroed while we are iterating. Therefore we cannot call get_dma_buf() > > + * since the caller of this program may not already own a reference to > > + * the buffer. > > + */ > > We should use kernel comment style for new code. IOW, > /* > * Look for ... > */ > > > Thanks, > Song Thanks, I have incorporated all of your comments and retested. I will give some time for more feedback before sending a v2. > > [...]

7 months, 1 week

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Linaro-mm-sig April 2025