Hi,
On Mon, Oct 23, 2023 at 10:25:50AM -0700, Doug Anderson wrote:
> On Mon, Oct 23, 2023 at 9:31 AM Yuran Pereira <yuran.pereira(a)hotmail.com> wrote:
> >
> > Since "Clean up checks for already prepared/enabled in panels" has
> > already been done and merged [1], I think there is no longer a need
> > for this item to be in the gpu TODO.
> >
> > [1] https://patchwork.freedesktop.org/patch/551421/
> >
> > Signed-off-by: Yuran Pereira <yuran.pereira(a)hotmail.com>
> > ---
> > Documentation/gpu/todo.rst | 25 -------------------------
> > 1 file changed, 25 deletions(-)
>
> It's not actually all done. It's in a bit of a limbo state right now,
> unfortunately. I landed all of the "simple" cases where panels were
> needlessly tracking prepare/enable, but the less simple cases are
> still outstanding.
>
> Specifically the issue is that many panels have code to properly power
> cycle themselves off at shutdown time and in order to do that they
> need to keep track of the prepare/enable state. After a big, long
> discussion [1] it was decided that we could get rid of all the panel
> code handling shutdown if only all relevant DRM KMS drivers would
> properly call drm_atomic_helper_shutdown().
>
> I made an attempt to get DRM KMS drivers to call
> drm_atomic_helper_shutdown() [2] [3] [4]. I was able to land the
> patches that went through drm-misc, but currently many of the
> non-drm-misc ones are blocked waiting for attention.
>
> ...so things that could be done to help out:
>
> a) Could review patches that haven't landed in [4]. Maybe adding a
> Reviewed-by tag would help wake up maintainers?
>
> b) Could see if you can identify panels that are exclusively used w/
> DRM drivers that have already been converted and then we could post
> patches for just those panels. I have no idea how easy this task would
> be. Is it enough to look at upstream dts files by "compatible" string?
I think it is, yes.
Maxime
On Mon, Jan 20, 2025 at 08:45:51PM +1100, Alexey Kardashevskiy wrote:
> > For CC I'm expecting the KVM fd to be the handle for the cVM, so any
> > RPCs that want to call into the secure world need the KVM FD to get
> > the cVM's identifier. Ie a "bind to cVM" RPC will need the PCI
> > information and the cVM's handle.
>
> And keep KVM fd open until unbind? Or just for the short time to call the
> PSP?
iommufd will keep the KVM fd alive so long as the vIOMMU object
exists. Other uses for kvm require it to work like this.
> > But it also seems to me that VFIO should be able to support putting
> > the device into the RUN state without involving KVM or cVMs.
>
> AMD's TDI bind handler in the PSP wants a guest handle ("GCTX") and a guest
> device BDFn, and VFIO has no desire to dive into this KVM business beyond
> IOMMUFD.
As in my other email, VFIO is not restricted to running VMs, useful
things should be available to apps like DPDK.
There is a use case for using TDISP and getting devices up into an
ecrypted/attested state on pure bare metal without any KVM, VFIO
should work in that use case too.
Jason
I can't see patch #1 in my inbox for some reason, but I already know
what it does from your repository.
Feel free to add Reviewed-by: Christian König <christian.koenig(a)amd.com>
to the entire series.
Regards,
Christian.
Am 31.01.25 um 12:02 schrieb Pierre-Eric Pelloux-Prayer:
> Hi,
>
> The initial goal of this series was to improve the drm and amdgpu
> trace events to be able to expose more of the inner workings of
> the scheduler and drivers to developers via tools.
>
> Then, the series evolved to become focused only on gpu_scheduler.
> The changes around vblank events will be part of a different
> series, as well as the amdgpu ones.
>
> Moreover Sima suggested to make some trace events stable uAPI,
> so tools can rely on them long term.
>
> The first patches extend and cleanup the gpu scheduler events.
>
> The last one adds a documentation entry in drm-uapi.rst.
>
> Changes since v6:
> * Addressed comments from Philipp, Tvrtko, Steven
> * The commit storing drm_client_id in sched_fence adds an extra
> field which looks like a duplicate of the owner field. Currently
> amdgpu uses the owner field with some magic values, so we have to
> have 2 separate fields for now, but ultimately one could be removed.
> Similarly storing the drm_client_id in the sched_entity is not
> really possible as there's nothing preventing a driver to use a
> sched_entity with multiple drm_file instances.
>
>
> Useful links:
> - userspace tool using the updated events:
> https://gitlab.freedesktop.org/tomstdenis/umr/-/merge_requests/37
> - v6:
> https://lists.freedesktop.org/archives/dri-devel/2024-November/477644.html
>
> Pierre-Eric Pelloux-Prayer (7):
> drm/debugfs: output client_id in in drm_clients_info
> drm/sched: store the drm client_id in drm_sched_fence
> drm/sched: add device name to the drm_sched_process_job event
> drm/sched: cleanup gpu_scheduler trace events
> drm/sched: trace dependencies for gpu jobs
> drm/sched: add the drm_client_id to the drm_sched_run/exec_job events
> drm/doc: document some tracepoints as uAPI
>
> Documentation/gpu/drm-uapi.rst | 19 +++
> drivers/accel/amdxdna/aie2_ctx.c | 3 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 8 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +-
> drivers/gpu/drm/drm_debugfs.c | 10 +-
> drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +-
> drivers/gpu/drm/imagination/pvr_job.c | 2 +-
> drivers/gpu/drm/imagination/pvr_queue.c | 5 +-
> drivers/gpu/drm/imagination/pvr_queue.h | 2 +-
> drivers/gpu/drm/lima/lima_gem.c | 2 +-
> drivers/gpu/drm/lima/lima_sched.c | 6 +-
> drivers/gpu/drm/lima/lima_sched.h | 3 +-
> drivers/gpu/drm/msm/msm_gem_submit.c | 8 +-
> drivers/gpu/drm/nouveau/nouveau_sched.c | 3 +-
> drivers/gpu/drm/panfrost/panfrost_drv.c | 2 +-
> drivers/gpu/drm/panthor/panthor_drv.c | 3 +-
> drivers/gpu/drm/panthor/panthor_mmu.c | 2 +-
> drivers/gpu/drm/panthor/panthor_sched.c | 5 +-
> drivers/gpu/drm/panthor/panthor_sched.h | 3 +-
> .../gpu/drm/scheduler/gpu_scheduler_trace.h | 123 ++++++++++++++----
> drivers/gpu/drm/scheduler/sched_entity.c | 8 +-
> drivers/gpu/drm/scheduler/sched_fence.c | 4 +-
> drivers/gpu/drm/scheduler/sched_main.c | 8 +-
> drivers/gpu/drm/v3d/v3d_submit.c | 2 +-
> drivers/gpu/drm/xe/xe_sched_job.c | 3 +-
> include/drm/gpu_scheduler.h | 12 +-
> 28 files changed, 192 insertions(+), 64 deletions(-)
>
On Thu, Jan 30, 2025 at 01:08:57PM +0000, Florent Tomasin wrote:
> Introduce a CMA Heap dt-binding allowing custom
> CMA heap registrations.
>
> * Note to the reviewers:
> The patch was used for the development of the protected mode
> feature in Panthor CSF kernel driver and is not initially thought
> to land in the Linux kernel. It is mostly relevant if someone
> wants to reproduce the environment of testing. Please, raise
> interest if you think the patch has value in the Linux kernel.
Why would panthor need CMA, it has an MMU.
In any case, I agree with Maxime that this is redundant.
Rob
Hi,
I started to review it, but it's probably best to discuss it here.
On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> Hi,
>
> This is a patch series covering the support for protected mode execution in
> Mali Panthor CSF kernel driver.
>
> The Mali CSF GPUs come with the support for protected mode execution at the
> HW level. This feature requires two main changes in the kernel driver:
>
> 1) Configure the GPU with a protected buffer. The system must provide a DMA
> heap from which the driver can allocate a protected buffer.
> It can be a carved-out memory or dynamically allocated protected memory region.
> Some system includes a trusted FW which is in charge of the protected memory.
> Since this problem is integration specific, the Mali Panthor CSF kernel
> driver must import the protected memory from a device specific exporter.
Why do you need a heap for it in the first place? My understanding of
your series is that you have a carved out memory region somewhere, and
you want to allocate from that carved out memory region your buffers.
How is that any different from using a reserved-memory region, adding
the reserved-memory property to the GPU device and doing all your
allocation through the usual dma_alloc_* API?
Or do you expect to have another dma-buf heap that would call into the
firmware to perform the allocations?
The semantics of the CMA heap allocations is a concern too.
Another question is how would you expect something like a secure
video-playback pipeline to operate with that kind of interface. Assuming
you have a secure codec, you would likely get that protected buffer from
the codec, right? So the most likely scenario would be to then import
that dma-buf into the GPU driver, but not allocate the buffer from it.
Overall, I think a better abstraction would be to have a heap indeed to
allocate your protected buffers from, and then import them in the
devices that need them. The responsibility would be on the userspace to
do so, but it already kind of does with your design anyway.
Maxime
Hi,
On Thu, Jan 30, 2025 at 01:08:58PM +0000, Florent Tomasin wrote:
> This patch introduces a cma-heap probe function, allowing
> users to register custom cma heaps in the device tree.
>
> A "memory-region" is bound to the cma heap at probe time
> allowing allocation of DMA buffers from that heap.
>
> Use cases:
> - registration of carved out secure heaps. Some devices
> are implementing secure memory by reserving a specific
> memory regions for that purpose. For example, this is the
> case of platforms making use of early version of
> ARM TrustZone.
In such a case, the CMA heap would de-facto become un-mappable for
userspace, right?
> - registration of multiple memory regions at different
> locations for efficiency or HW integration reasons.
> For example, a peripheral may expect to share data at a
> specific location in RAM. This information could have been
> programmed by a FW prior to the kernel boot.
How would you differentiate between them?
Maxime
On 30/01/2025 14:08, Florent Tomasin wrote:
> Allow mali-valhall-csf driver to retrieve a protected
> heap at probe time by passing the name of the heap
> as attribute to the device tree GPU node.
Please wrap commit message according to Linux coding style / submission
process (neither too early nor over the limit):
https://elixir.bootlin.com/linux/v6.4-rc1/source/Documentation/process/subm…
Why this cannot be passed by phandle, just like all reserved regions?
From where do you take these protected heaps? Firmware? This would
explain why no relation is here (no probe ordering, no device links,
nothing connecting separate devices).
Best regards,
Krzysztof
Even the kerneldoc says that with a zero timeout the function should not
wait for anything, but still return 1 to indicate that the fences are
signaled now.
Unfortunately that isn't what was implemented, instead of only returning
1 we also waited for at least one jiffies.
Fix that by adjusting the handling to what the function is actually
documented to do.
v2: improve code readability
Reported-by: Marek Olšák <marek.olsak(a)amd.com>
Reported-by: Lucas Stach <l.stach(a)pengutronix.de>
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Cc: <stable(a)vger.kernel.org>
---
drivers/dma-buf/dma-resv.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 5f8d010516f0..c78cdae3deaf 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -684,11 +684,13 @@ long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
dma_resv_iter_begin(&cursor, obj, usage);
dma_resv_for_each_fence_unlocked(&cursor, fence) {
- ret = dma_fence_wait_timeout(fence, intr, ret);
- if (ret <= 0) {
- dma_resv_iter_end(&cursor);
- return ret;
- }
+ ret = dma_fence_wait_timeout(fence, intr, timeout);
+ if (ret <= 0)
+ break;
+
+ /* Even for zero timeout the return value is 1 */
+ if (timeout)
+ timeout = ret;
}
dma_resv_iter_end(&cursor);
--
2.34.1
Even the kerneldoc says that with a zero timeout the function should not
wait for anything, but still return 1 to indicate that the fences are
signaled now.
Unfortunately that isn't what was implemented, instead of only returning
1 we also waited for at least one jiffies.
Fix that by adjusting the handling to what the function is actually
documented to do.
Reported-by: Marek Olšák <marek.olsak(a)amd.com>
Reported-by: Lucas Stach <l.stach(a)pengutronix.de>
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Cc: <stable(a)vger.kernel.org>
---
drivers/dma-buf/dma-resv.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 5f8d010516f0..ae92d9d2394d 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -684,11 +684,12 @@ long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
dma_resv_iter_begin(&cursor, obj, usage);
dma_resv_for_each_fence_unlocked(&cursor, fence) {
- ret = dma_fence_wait_timeout(fence, intr, ret);
- if (ret <= 0) {
- dma_resv_iter_end(&cursor);
- return ret;
- }
+ ret = dma_fence_wait_timeout(fence, intr, timeout);
+ if (ret <= 0)
+ break;
+
+ /* Even for zero timeout the return value is 1 */
+ timeout = min(timeout, ret);
}
dma_resv_iter_end(&cursor);
--
2.34.1