From: Rob Clark robdclark@chromium.org
Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse Memory[2] in the form of:

1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/MAP_NULL/UNMAP commands
2. Extending the `SUBMIT` ioctl to allow submitting batches of one or more MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue

The UABI takes a slightly different approach from what other drivers have done, and what would make sense if starting from a clean sheet, i.e. separate VM_BIND and EXEC ioctls. But since we have to maintain support for the existing `SUBMIT` ioctl, and because the fence, syncobj, and BO pinning handling are largely the same between legacy "BO-table" style SUBMIT ioctls and new-style VM updates submitted to a VM_BIND submitqueue, I chose to go the route of extending the existing `SUBMIT` ioctl rather than adding a new ioctl.
I also did not implement support for synchronous VM_BIND commands. Since userspace could just immediately wait for the `SUBMIT` to complete, I don't think we need this extra complexity in the kernel.
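For illustration, a userspace "synchronous bind" could be layered on top of the async path roughly as sketched below. This assumes the `SUBMIT` was issued with `MSM_SUBMIT_FENCE_FD_OUT` and that the returned `fence_fd` is a sync_file fd, which becomes readable for `poll()` once its fence signals. The helper name `wait_and_close_fence_fd` is hypothetical and the ioctl call itself is omitted; this is a sketch, not what mesa does.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait for the fence behind fence_fd to signal, then close the fd.
 * timeout_ms follows poll() semantics (-1 waits forever).
 * Returns 0 once signaled, -ETIME on timeout, -errno on failure.
 * (Hypothetical helper, for illustration only.)
 */
static int wait_and_close_fence_fd(int fence_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fence_fd, .events = POLLIN };
	int ret, status;

	assert(fence_fd >= 0);

	/* Restart the wait if a signal handler interrupted poll(): */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		status = -errno;
	else if (ret == 0)
		status = -ETIME;
	else if (pfd.revents & (POLLERR | POLLNVAL))
		status = -EINVAL;
	else
		status = 0;

	close(fence_fd);
	return status;
}
```

A caller would submit the MAP/UNMAP batch, take `fence_fd` from the ioctl args, and call this helper before touching the newly bound range.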
The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533
### Notes/TODOs/Open Questions:

1. The first handful of patches are from Bibek Kumar Patro's series, "iommu/arm-smmu: introduction of ACTLR implementation for Qualcomm SoCs"[3], which introduces PRR (Partially-Resident-Region) support, needed to implement MAP_NULL (for Vulkan Sparse Residency[4]).
2. Why do VM_BIND commands need fence fd support, instead of just syncobjs? Mainly for the benefit of virtgpu drm native context guest<->host fence passing[5], where the host VMM is operating in terms of fence fd's (syncobjs are just a convenience wrapper above a dma_fence, and don't exist below the guest kernel).
3. Currently shrinker support is disabled (hence this being in Draft/RFC state). To properly support the shrinker, we need to pre-allocate various objects and pages needed for the pagetables themselves, to move memory allocations out of the fence signaling path. This shortcut was taken to unblock userspace implementation of sparse buffer/image support.
4. Could/should we do all the vm/vma updates synchronously and defer _only_ the io-pgtable updates to the VM_BIND scheduler queue? This would simplify the previous point, in that we'd only have to pre-allocate pages for the io-pgtable updates.
5. Currently we lose support for BO dumping for devcoredump. Ideally we'd plumb the `MSM_SUBMIT_BO_DUMP` flag in `MAP` commands thru to the resulting drm_gpuva's. To do this, I think we need to extend drm_gpuva with a flags field. The flags can be driver defined, but drm_gpuvm needs to know not to merge drm_gpuva's with different flags.
This series can be found in MR form, if you prefer: https://gitlab.freedesktop.org/drm/msm/-/merge_requests/144
[1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm
[2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html
[3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700
[4] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html#sparsememory-par...
[5] https://patchew.org/linux/20231007194747.788934-1-dmitry.osipenko@collabora....

Rob Clark (24):
  HACK: drm/msm: Disable shrinker
  drm/gpuvm: Don't require obj lock in destructor path
  drm/gpuvm: Remove bogus lock assert
  drm/msm: Rename msm_file_private -> msm_context
  drm/msm: Improve msm_context comments
  drm/msm: Rename msm_gem_address_space -> msm_gem_vm
  drm/msm: Remove vram carveout support
  drm/msm: Collapse vma allocation and initialization
  drm/msm: Collapse vma close and delete
  drm/msm: drm_gpuvm conversion
  drm/msm: Use drm_gpuvm types more
  drm/msm: Split submit_pin_objects()
  drm/msm: Lazily create context VM
  drm/msm: Add opt-in for VM_BIND
  drm/msm: Mark VM as unusable on faults
  drm/msm: Extend SUBMIT ioctl for VM_BIND
  drm/msm: Add VM_BIND submitqueue
  drm/msm: Add _NO_SHARE flag
  drm/msm: Split out helper to get iommu prot flags
  drm/msm: Add mmu support for non-zero offset
  drm/msm: Add PRR support
  drm/msm: Rename msm_gem_vma_purge() -> _unmap()
  drm/msm: Wire up gpuvm ops
  drm/msm: Bump UAPI version
 drivers/gpu/drm/drm_gpuvm.c | 10 +-
 drivers/gpu/drm/msm/Kconfig | 1 +
 drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 19 +-
 drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +-
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 4 +-
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 4 +-
 drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 24 +-
 drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +-
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 51 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +-
 drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 78 ++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 22 +-
 .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +-
 drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +-
 drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +-
 drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +-
 drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +-
 drivers/gpu/drm/msm/msm_drv.c | 175 ++----
 drivers/gpu/drm/msm/msm_drv.h | 31 +-
 drivers/gpu/drm/msm/msm_fb.c | 18 +-
 drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
 drivers/gpu/drm/msm/msm_gem.c | 403 ++++++-------
 drivers/gpu/drm/msm/msm_gem.h | 193 +++++--
 drivers/gpu/drm/msm/msm_gem_prime.c | 15 +
 drivers/gpu/drm/msm/msm_gem_submit.c | 223 +++++--
 drivers/gpu/drm/msm/msm_gem_vma.c | 543 +++++++++++++++---
 drivers/gpu/drm/msm/msm_gpu.c | 66 ++-
 drivers/gpu/drm/msm/msm_gpu.h | 132 +++--
 drivers/gpu/drm/msm/msm_iommu.c | 84 ++-
 drivers/gpu/drm/msm/msm_kms.c | 14 +-
 drivers/gpu/drm/msm/msm_kms.h | 2 +-
 drivers/gpu/drm/msm/msm_mmu.h | 2 +-
 drivers/gpu/drm/msm/msm_ringbuffer.c | 4 +-
 drivers/gpu/drm/msm/msm_submitqueue.c | 86 ++-
 include/uapi/drm/msm_drm.h | 98 +++-
 48 files changed, 1637 insertions(+), 903 deletions(-)
From: Rob Clark robdclark@chromium.org
This is a bit different from the path taken by other clean-slate drivers. But the BO pinning in the legacy "EXEC" path and the "VM_BIND" MAP path have a lot in common. Also, we want the same fence and syncobj handling.

(Why bother with fence fd's? Because for virtgpu nctx, guest syncobj's exist only as dma_fence's between the guest kernel and host.)
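To make the `bos_stride` handling easier to follow, here is a userspace-only sketch of the same idea: copy `min(stride, sizeof(v2))` bytes into a zeroed v2 struct so that old (16 byte) entries read back with `bo_offset`/`range` of zero, then validate alignment and bounds. The struct mirrors, `SKETCH_PAGE_SIZE` (4K assumed) and the helper names are illustrative, not the kernel's. The bounds check uses the subtraction form so that `bo_offset + range` cannot wrap, which is a variation on the check in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the two UABI layouts (illustration only): */
struct bo_v1 { uint32_t flags, handle; uint64_t address; };
struct bo_v2 { uint32_t flags, handle; uint64_t address, bo_offset, range; };

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4K pages */

/* Read entry idx from a table with the given stride (0 means v1 layout). */
static void read_bo_entry(struct bo_v2 *dst, const void *table,
			  uint32_t stride, uint32_t idx)
{
	size_t copy_sz;

	assert(dst && table);

	if (!stride)
		stride = sizeof(struct bo_v1);

	/* Never copy more than we understand; unknown tail bytes are skipped: */
	copy_sz = stride < sizeof(*dst) ? stride : sizeof(*dst);

	/* Fields missing from a shorter entry stay zero: */
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, (const uint8_t *)table + (size_t)idx * stride, copy_sz);
}

/* Page alignment plus an overflow-safe "fits inside the BO" check. */
static bool bo_range_ok(const struct bo_v2 *bo, uint64_t obj_size)
{
	const uint64_t mask = SKETCH_PAGE_SIZE - 1;

	if ((bo->address | bo->bo_offset | bo->range) & mask)
		return false;
	if (bo->bo_offset > obj_size)
		return false;
	return bo->range <= obj_size - bo->bo_offset;
}
```

The stride approach lets the entry struct grow again later without another ioctl revision, at the cost of the kernel having to tolerate both shorter and longer entries than it knows about.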
Signed-off-by: Rob Clark robdclark@chromium.org --- drivers/gpu/drm/msm/msm_gem.h | 10 ++--- drivers/gpu/drm/msm/msm_gem_submit.c | 65 ++++++++++++++++++++++++---- include/uapi/drm/msm_drm.h | 49 ++++++++++++++++++--- 3 files changed, 103 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index 7cb720137548..8e29e36ca9c5 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -345,13 +345,13 @@ struct msm_gem_submit { uint32_t nr_relocs; struct drm_msm_gem_submit_reloc *relocs; } *cmd; /* array of size nr_cmds */ - struct { + struct msm_gem_submit_bo { uint32_t flags; - union { - struct drm_gem_object *obj; - uint32_t handle; - }; + uint32_t handle; + struct drm_gem_object *obj; uint64_t iova; + uint64_t bo_offset; + uint64_t range; } bos[]; };
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 79bbe552f23e..9ac74f9a139e 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -115,23 +115,37 @@ void __msm_gem_submit_destroy(struct kref *kref) kfree(submit); }
+static bool invalid_alignment(uint64_t addr) +{ + /* + * Technically this is about GPU alignment, not CPU alignment. But + * I've not seen any qcom SoC where the SMMU does not support the + * CPU's smallest page size. + */ + return !PAGE_ALIGNED(addr); +} + static int submit_lookup_objects(struct msm_gem_submit *submit, struct drm_msm_gem_submit *args, struct drm_file *file) { - unsigned i; + unsigned i, bo_stride = args->bos_stride; int ret = 0;
+ if (!bo_stride) + bo_stride = sizeof(struct drm_msm_gem_submit_bo); + for (i = 0; i < args->nr_bos; i++) { - struct drm_msm_gem_submit_bo submit_bo; + struct drm_msm_gem_submit_bo_v2 submit_bo = {0}; void __user *userptr = - u64_to_user_ptr(args->bos + (i * sizeof(submit_bo))); + u64_to_user_ptr(args->bos + (i * bo_stride)); + unsigned copy_sz = min(bo_stride, sizeof(submit_bo));
/* make sure we don't have garbage flags, in case we hit * error path before flags is initialized: */ submit->bos[i].flags = 0;
- if (copy_from_user(&submit_bo, userptr, sizeof(submit_bo))) { + if (copy_from_user(&submit_bo, userptr, copy_sz)) { ret = -EFAULT; i = 0; goto out; @@ -141,14 +155,27 @@ static int submit_lookup_objects(struct msm_gem_submit *submit, #define MANDATORY_FLAGS (MSM_SUBMIT_BO_READ | MSM_SUBMIT_BO_WRITE)
if ((submit_bo.flags & ~MSM_SUBMIT_BO_FLAGS) || - !(submit_bo.flags & MANDATORY_FLAGS)) { + !(submit_bo.flags & MANDATORY_FLAGS)) ret = SUBMIT_ERROR(EINVAL, submit, "invalid flags: %x\n", submit_bo.flags); + + if (invalid_alignment(submit_bo.address)) + ret = SUBMIT_ERROR(EINVAL, submit, "invalid address: %016llx\n", submit_bo.address); + + if (invalid_alignment(submit_bo.bo_offset)) + ret = SUBMIT_ERROR(EINVAL, submit, "invalid bo_offset: %016llx\n", submit_bo.bo_offset); + + if (invalid_alignment(submit_bo.range)) + ret = SUBMIT_ERROR(EINVAL, submit, "invalid range: %016llx\n", submit_bo.range); + + if (ret) { i = 0; goto out; }
submit->bos[i].handle = submit_bo.handle; submit->bos[i].flags = submit_bo.flags; + submit->bos[i].bo_offset = submit_bo.bo_offset; + submit->bos[i].range = submit_bo.range; }
spin_lock(&file->table_lock); @@ -167,6 +194,15 @@ static int submit_lookup_objects(struct msm_gem_submit *submit,
drm_gem_object_get(obj);
+ if (submit->bos[i].bo_offset > obj->size) + ret = SUBMIT_ERROR(EINVAL, submit, "bo_offset too large: %016llx\n", submit->bos[i].bo_offset); + + if ((submit->bos[i].bo_offset + submit->bos[i].range) > obj->size) + ret = SUBMIT_ERROR(EINVAL, submit, "range too large: %016llx\n", submit->bos[i].range); + + if (ret) + goto out_unlock; + submit->bos[i].obj = obj; }
@@ -182,6 +218,7 @@ static int submit_lookup_objects(struct msm_gem_submit *submit, static int submit_lookup_cmds(struct msm_gem_submit *submit, struct drm_msm_gem_submit *args, struct drm_file *file) { + struct msm_context *ctx = file->driver_priv; unsigned i; size_t sz; int ret = 0; @@ -213,6 +250,19 @@ static int submit_lookup_cmds(struct msm_gem_submit *submit, goto out; }
+ if (msm_context_is_vmbind(ctx)) { + if (submit_cmd.nr_relocs) { + ret = SUBMIT_ERROR(EINVAL, submit, "nr_relocs must be zero"); + goto out; + } + if (submit_cmd.submit_idx || submit_cmd.submit_offset) { + ret = SUBMIT_ERROR(EINVAL, submit, "submit_idx/offset must be zero"); + goto out; + } + + submit->cmd[i].iova = submit_cmd.iova; + } + submit->cmd[i].type = submit_cmd.type; submit->cmd[i].size = submit_cmd.size / 4; submit->cmd[i].offset = submit_cmd.submit_offset / 4; @@ -665,9 +715,6 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (!gpu) return -ENXIO;
- if (args->pad) - return -EINVAL; - if (to_msm_vm(ctx->vm)->unusable) return UERR(EPIPE, dev, "context is unusable");
@@ -677,7 +724,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (MSM_PIPE_ID(args->flags) != MSM_PIPE_3D0) return UERR(EINVAL, dev, "invalid pipe");
- if (MSM_PIPE_FLAGS(args->flags) & ~MSM_SUBMIT_FLAGS) + if (MSM_PIPE_FLAGS(args->flags) & ~MSM_SUBMIT_EXEC_FLAGS) return UERR(EINVAL, dev, "invalid flags");
if (args->flags & MSM_SUBMIT_SUDO) { diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h index 072e82a80607..1a948d49c610 100644 --- a/include/uapi/drm/msm_drm.h +++ b/include/uapi/drm/msm_drm.h @@ -245,7 +245,10 @@ struct drm_msm_gem_submit_cmd { __u32 size; /* in, cmdstream size */ __u32 pad; __u32 nr_relocs; /* in, number of submit_reloc's */ - __u64 relocs; /* in, ptr to array of submit_reloc's */ + union { + __u64 relocs; /* in, ptr to array of submit_reloc's */ + __u64 iova; /* cmdstream address (for VM_BIND contexts) */ + }; };
/* Each buffer referenced elsewhere in the cmdstream submit (ie. the @@ -264,6 +267,19 @@ struct drm_msm_gem_submit_cmd { #define MSM_SUBMIT_BO_DUMP 0x0004 #define MSM_SUBMIT_BO_NO_IMPLICIT 0x0008
+/* Map OP for submits to a VM_BIND submitqueue: + * - MAP: map a specified range of the BO into the VM + * - MAP_NULL: map a NULL page into the specified range of the VM, handle + * and bo_offset MBZ. A NULL range will return zero on reads + * and discard writes + * see: VkPhysicalDeviceSparseProperties::residencyNonResidentStrict + * - UNMAP: unmap a specified VM range, handle and bo_offset MBZ + */ +#define MSM_SUBMIT_BO_OP_MASK 0xf000 +#define MSM_SUBMIT_BO_OP_MAP 0x0000 +#define MSM_SUBMIT_BO_OP_MAP_NULL 0x1000 +#define MSM_SUBMIT_BO_OP_UNMAP 0x2000 + #define MSM_SUBMIT_BO_FLAGS (MSM_SUBMIT_BO_READ | \ MSM_SUBMIT_BO_WRITE | \ MSM_SUBMIT_BO_DUMP | \ @@ -272,7 +288,16 @@ struct drm_msm_gem_submit_cmd { struct drm_msm_gem_submit_bo { __u32 flags; /* in, mask of MSM_SUBMIT_BO_x */ __u32 handle; /* in, GEM handle */ - __u64 presumed; /* in/out, presumed buffer address */ + __u64 address; /* in/out, presumed buffer address */ +}; + +struct drm_msm_gem_submit_bo_v2 { + __u32 flags; /* in, mask of MSM_SUBMIT_BO_x */ + __u32 handle; /* in, GEM handle */ + __u64 address; /* in/out, presumed buffer address */ + /* Remaining fields are only used with MSM_SUBMIT_OP_VM_BIND/_ASYNC: */ + __u64 bo_offset; + __u64 range; };
/* Valid submit ioctl flags: */ @@ -283,7 +308,8 @@ struct drm_msm_gem_submit_bo { #define MSM_SUBMIT_SYNCOBJ_IN 0x08000000 /* enable input syncobj */ #define MSM_SUBMIT_SYNCOBJ_OUT 0x04000000 /* enable output syncobj */ #define MSM_SUBMIT_FENCE_SN_IN 0x02000000 /* userspace passes in seqno fence */ -#define MSM_SUBMIT_FLAGS ( \ + +#define MSM_SUBMIT_EXEC_FLAGS ( \ MSM_SUBMIT_NO_IMPLICIT | \ MSM_SUBMIT_FENCE_FD_IN | \ MSM_SUBMIT_FENCE_FD_OUT | \ @@ -293,6 +319,13 @@ struct drm_msm_gem_submit_bo { MSM_SUBMIT_FENCE_SN_IN | \ 0)
+#define MSM_SUBMIT_VM_BIND_FLAGS ( \ + MSM_SUBMIT_FENCE_FD_IN | \ + MSM_SUBMIT_FENCE_FD_OUT | \ + MSM_SUBMIT_SYNCOBJ_IN | \ + MSM_SUBMIT_SYNCOBJ_OUT | \ + 0) + #define MSM_SUBMIT_SYNCOBJ_RESET 0x00000001 /* Reset syncobj after wait. */ #define MSM_SUBMIT_SYNCOBJ_FLAGS ( \ MSM_SUBMIT_SYNCOBJ_RESET | \ @@ -307,14 +340,17 @@ struct drm_msm_gem_submit_syncobj { /* Each cmdstream submit consists of a table of buffers involved, and * one or more cmdstream buffers. This allows for conditional execution * (context-restore), and IB buffers needed for per tile/bin draw cmds. + * + * For MSM_SUBMIT_VM_BIND/_ASYNC operations, the queue must have been + * created with the MSM_SUBMITQUEUE_VM_BIND flag. */ struct drm_msm_gem_submit { __u32 flags; /* MSM_PIPE_x | MSM_SUBMIT_x */ __u32 fence; /* out (or in with MSM_SUBMIT_FENCE_SN_IN flag) */ __u32 nr_bos; /* in, number of submit_bo's */ - __u32 nr_cmds; /* in, number of submit_cmd's */ + __u32 nr_cmds; /* in, number of submit_cmd's, MBZ for VM_BIND queue */ __u64 bos; /* in, ptr to array of submit_bo's */ - __u64 cmds; /* in, ptr to array of submit_cmd's */ + __u64 cmds; /* in, ptr to array of submit_cmd's, MBZ for VM_BIND queue */ __s32 fence_fd; /* in/out fence fd (see MSM_SUBMIT_FENCE_FD_IN/OUT) */ __u32 queueid; /* in, submitqueue id */ __u64 in_syncobjs; /* in, ptr to array of drm_msm_gem_submit_syncobj */ @@ -322,8 +358,7 @@ struct drm_msm_gem_submit { __u32 nr_in_syncobjs; /* in, number of entries in in_syncobj */ __u32 nr_out_syncobjs; /* in, number of entries in out_syncobj. */ __u32 syncobj_stride; /* in, stride of syncobj arrays. */ - __u32 pad; /*in, reserved for future use, always 0. */ - + __u32 bos_stride; /* in, stride of bos array, if zero 16bytes used. */ };
#define MSM_WAIT_FENCE_BOOST 0x00000001
From: Rob Clark robdclark@chromium.org
This submitqueue type isn't tied to a hw ringbuffer, but instead executes on the CPU for performing async VM_BIND ops.
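As a rough illustration of what each entry on such a queue carries, the sketch below decodes the op from the per-BO flags and applies the "handle and bo_offset MBZ" rule from the UAPI comment. The `MSM_SUBMIT_BO_OP_*` values are copied from the msm_drm.h hunk in the previous patch; `enum bind_op`, `decode_bind_op()` and `bind_entry_ok()` are hypothetical names, and whether the kernel rejects MBZ violations is not shown in this series excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the UAPI header changes in this series: */
#define MSM_SUBMIT_BO_OP_MASK     0xf000
#define MSM_SUBMIT_BO_OP_MAP      0x0000
#define MSM_SUBMIT_BO_OP_MAP_NULL 0x1000
#define MSM_SUBMIT_BO_OP_UNMAP    0x2000

/* Every op value must fit inside the op mask: */
static_assert((MSM_SUBMIT_BO_OP_MAP_NULL & ~MSM_SUBMIT_BO_OP_MASK) == 0,
	      "MAP_NULL outside op mask");
static_assert((MSM_SUBMIT_BO_OP_UNMAP & ~MSM_SUBMIT_BO_OP_MASK) == 0,
	      "UNMAP outside op mask");

enum bind_op { BIND_OP_MAP, BIND_OP_MAP_NULL, BIND_OP_UNMAP, BIND_OP_INVALID };

/* Extract the op from the flags word (hypothetical helper). */
static enum bind_op decode_bind_op(uint32_t flags)
{
	switch (flags & MSM_SUBMIT_BO_OP_MASK) {
	case MSM_SUBMIT_BO_OP_MAP:
		return BIND_OP_MAP;
	case MSM_SUBMIT_BO_OP_MAP_NULL:
		return BIND_OP_MAP_NULL;
	case MSM_SUBMIT_BO_OP_UNMAP:
		return BIND_OP_UNMAP;
	default:
		return BIND_OP_INVALID;
	}
}

/*
 * MAP needs a BO (GEM handle 0 is never valid); MAP_NULL and UNMAP
 * must leave handle and bo_offset zero, per the UAPI comment.
 */
static bool bind_entry_ok(uint32_t flags, uint32_t handle, uint64_t bo_offset)
{
	switch (decode_bind_op(flags)) {
	case BIND_OP_MAP:
		return handle != 0;
	case BIND_OP_MAP_NULL:
	case BIND_OP_UNMAP:
		return handle == 0 && bo_offset == 0;
	default:
		return false;
	}
}
```

Because only MAP entries reference a BO, the lock/pin/fence loops in the submit path have to tolerate entries without an object, which is what the `if (!obj) continue;` checks in the diff below are for.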
Signed-off-by: Rob Clark robdclark@chromium.org --- drivers/gpu/drm/msm/msm_gem.c | 3 +- drivers/gpu/drm/msm/msm_gem.h | 10 +++ drivers/gpu/drm/msm/msm_gem_submit.c | 123 ++++++++++++++++++++++---- drivers/gpu/drm/msm/msm_gem_vma.c | 100 +++++++++++++++++++++ drivers/gpu/drm/msm/msm_gpu.h | 3 + drivers/gpu/drm/msm/msm_submitqueue.c | 57 +++++++++--- include/uapi/drm/msm_drm.h | 9 +- 7 files changed, 275 insertions(+), 30 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 0bfc993571fc..66332481c4c3 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -215,8 +215,7 @@ static void put_pages(struct drm_gem_object *obj) } }
-static struct page **msm_gem_get_pages_locked(struct drm_gem_object *obj, - unsigned madv) +struct page **msm_gem_get_pages_locked(struct drm_gem_object *obj, unsigned madv) { struct msm_gem_object *msm_obj = to_msm_bo(obj);
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index 8e29e36ca9c5..a2255fd269ca 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -53,6 +53,13 @@ struct msm_gem_vm { /** @base: Inherit from drm_gpuvm. */ struct drm_gpuvm base;
+ /** + * @sched: Scheduler used for asynchronous VM_BIND request. + * + * Unused for kernel managed VMs (where all operations are synchronous). + */ + struct drm_gpu_scheduler sched; + /** * @mm: Memory management for kernel managed VA allocations * @@ -106,6 +113,8 @@ struct drm_gpuvm * msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name, u64 va_start, u64 va_size, bool managed);
+void msm_gem_vm_close(struct drm_gpuvm *vm); + struct msm_fence_context;
/** @@ -195,6 +204,7 @@ int msm_gem_get_and_pin_iova(struct drm_gem_object *obj, struct drm_gpuvm *vm, uint64_t *iova); void msm_gem_unpin_iova(struct drm_gem_object *obj, struct drm_gpuvm *vm); void msm_gem_pin_obj_locked(struct drm_gem_object *obj); +struct page **msm_gem_get_pages_locked(struct drm_gem_object *obj, unsigned madv); struct page **msm_gem_pin_pages_locked(struct drm_gem_object *obj); void msm_gem_unpin_pages_locked(struct drm_gem_object *obj); int msm_gem_dumb_create(struct drm_file *file, struct drm_device *dev, diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 9ac74f9a139e..8295c21e4ca0 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -115,6 +115,17 @@ void __msm_gem_submit_destroy(struct kref *kref) kfree(submit); }
+static bool invalid_bo_flags(bool vm_bind, uint32_t flags) +{ + if (vm_bind) { + return flags & ~(MSM_SUBMIT_BO_FLAGS | MSM_SUBMIT_BO_OP_MASK); + } else { + /* at least one of READ and/or WRITE flags should be set: */ + return (flags & ~MSM_SUBMIT_BO_FLAGS) || + !(flags & (MSM_SUBMIT_BO_READ | MSM_SUBMIT_BO_WRITE)); + } +} + static bool invalid_alignment(uint64_t addr) { /* @@ -129,9 +140,10 @@ static int submit_lookup_objects(struct msm_gem_submit *submit, struct drm_msm_gem_submit *args, struct drm_file *file) { unsigned i, bo_stride = args->bos_stride; + bool vm_bind = !!(submit->queue->flags & MSM_SUBMITQUEUE_VM_BIND); int ret = 0;
- if (!bo_stride) + if (!bo_stride || !vm_bind) bo_stride = sizeof(struct drm_msm_gem_submit_bo);
for (i = 0; i < args->nr_bos; i++) { @@ -151,11 +163,7 @@ static int submit_lookup_objects(struct msm_gem_submit *submit, goto out; }
-/* at least one of READ and/or WRITE flags should be set: */ -#define MANDATORY_FLAGS (MSM_SUBMIT_BO_READ | MSM_SUBMIT_BO_WRITE) - - if ((submit_bo.flags & ~MSM_SUBMIT_BO_FLAGS) || - !(submit_bo.flags & MANDATORY_FLAGS)) + if (invalid_bo_flags(vm_bind, submit_bo.flags)) ret = SUBMIT_ERROR(EINVAL, submit, "invalid flags: %x\n", submit_bo.flags);
if (invalid_alignment(submit_bo.address)) @@ -174,6 +182,7 @@ static int submit_lookup_objects(struct msm_gem_submit *submit,
submit->bos[i].handle = submit_bo.handle; submit->bos[i].flags = submit_bo.flags; + submit->bos[i].iova = submit_bo.address; submit->bos[i].bo_offset = submit_bo.bo_offset; submit->bos[i].range = submit_bo.range; } @@ -183,6 +192,12 @@ static int submit_lookup_objects(struct msm_gem_submit *submit, for (i = 0; i < args->nr_bos; i++) { struct drm_gem_object *obj;
+ if (vm_bind) { + unsigned op = submit->bos[i].flags & MSM_SUBMIT_BO_OP_MASK; + if (op != MSM_SUBMIT_BO_OP_MAP) + continue; + } + /* normally use drm_gem_object_lookup(), but for bulk lookup * all under single table_lock just hit object_idr directly: */ @@ -297,13 +312,21 @@ static int submit_lookup_cmds(struct msm_gem_submit *submit, /* This is where we make sure all the bo's are reserved and pin'd: */ static int submit_lock_objects(struct msm_gem_submit *submit) { + unsigned flags = DRM_EXEC_INTERRUPTIBLE_WAIT; int ret;
- drm_exec_init(&submit->exec, DRM_EXEC_INTERRUPTIBLE_WAIT, submit->nr_bos); + if (submit->queue->flags & MSM_SUBMITQUEUE_VM_BIND) + flags |= DRM_EXEC_IGNORE_DUPLICATES; + + drm_exec_init(&submit->exec, flags, submit->nr_bos);
drm_exec_until_all_locked (&submit->exec) { for (unsigned i = 0; i < submit->nr_bos; i++) { struct drm_gem_object *obj = submit->bos[i].obj; + + if (!obj) + continue; + ret = drm_exec_prepare_obj(&submit->exec, obj, 1); drm_exec_retry_on_contention(&submit->exec); if (ret) @@ -372,6 +395,28 @@ static int submit_pin_vmas(struct msm_gem_submit *submit) return ret; }
+static int submit_get_pages(struct msm_gem_submit *submit) +{ + /* + * First loop, before holding the LRU lock, avoids holding the + * LRU lock while calling msm_gem_pin_vma_locked (which could + * trigger get_pages()) + */ + for (int i = 0; i < submit->nr_bos; i++) { + struct drm_gem_object *obj = submit->bos[i].obj; + struct page **pages; + + if (!obj) + continue; + + pages = msm_gem_get_pages_locked(obj, MSM_MADV_WILLNEED); + if (IS_ERR(pages)) + return PTR_ERR(pages); + } + + return 0; +} + static void submit_pin_objects(struct msm_gem_submit *submit) { struct msm_drm_private *priv = submit->dev->dev_private; @@ -385,7 +430,12 @@ static void submit_pin_objects(struct msm_gem_submit *submit) */ mutex_lock(&priv->lru.lock); for (int i = 0; i < submit->nr_bos; i++) { - msm_gem_pin_obj_locked(submit->bos[i].obj); + struct drm_gem_object *obj = submit->bos[i].obj; + + if (!obj) + continue; + + msm_gem_pin_obj_locked(obj); } mutex_unlock(&priv->lru.lock);
@@ -400,6 +450,9 @@ static void submit_unpin_objects(struct msm_gem_submit *submit) for (int i = 0; i < submit->nr_bos; i++) { struct drm_gem_object *obj = submit->bos[i].obj;
+ if (!obj) + continue; + msm_gem_unpin_locked(obj); }
@@ -413,6 +466,9 @@ static void submit_attach_object_fences(struct msm_gem_submit *submit) for (i = 0; i < submit->nr_bos; i++) { struct drm_gem_object *obj = submit->bos[i].obj;
+ if (!obj) + continue; + if (submit->bos[i].flags & MSM_SUBMIT_BO_WRITE) dma_resv_add_fence(obj->resv, submit->user_fence, DMA_RESV_USAGE_WRITE); @@ -708,6 +764,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct msm_ringbuffer *ring; struct msm_submit_post_dep *post_deps = NULL; struct drm_syncobj **syncobjs_to_reset = NULL; + unsigned cmds_to_parse; int out_fence_fd = -1; unsigned i; int ret; @@ -724,9 +781,6 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (MSM_PIPE_ID(args->flags) != MSM_PIPE_3D0) return UERR(EINVAL, dev, "invalid pipe");
- if (MSM_PIPE_FLAGS(args->flags) & ~MSM_SUBMIT_EXEC_FLAGS) - return UERR(EINVAL, dev, "invalid flags"); - if (args->flags & MSM_SUBMIT_SUDO) { if (!IS_ENABLED(CONFIG_DRM_MSM_GPU_SUDO) || !capable(CAP_SYS_RAWIO)) @@ -737,6 +791,26 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (!queue) return -ENOENT;
+ if (queue->flags & MSM_SUBMITQUEUE_VM_BIND) { + if (args->nr_cmds || args->cmds) { + ret = UERR(EINVAL, dev, "nr_cmds should be zero for VM_BIND queue"); + goto out_post_unlock; + } + if (MSM_PIPE_FLAGS(args->flags) & ~MSM_SUBMIT_VM_BIND_FLAGS) { + ret = UERR(EINVAL, dev, "invalid flags"); + goto out_post_unlock; + } + } else { + if (msm_context_is_vmbind(ctx) && (args->nr_bos || args->bos)) { + ret = UERR(EINVAL, dev, "nr_bos should be zero for VM_BIND contexts"); + goto out_post_unlock; + } + if (MSM_PIPE_FLAGS(args->flags) & ~MSM_SUBMIT_EXEC_FLAGS) { + ret = UERR(EINVAL, dev, "invalid flags"); + goto out_post_unlock; + } + } + ring = gpu->rb[queue->ring_nr];
if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) { @@ -813,19 +887,38 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out;
- if (!(args->flags & MSM_SUBMIT_NO_IMPLICIT)) { + if (msm_context_is_vmbind(ctx) && !(queue->flags & MSM_SUBMITQUEUE_VM_BIND)) { + /* + * If we are not using VM_BIND, submit_pin_vmas() will validate + * just the BOs attached to the submit. In that case we don't + * need to validate the _entire_ vm, because userspace tracked + * what BOs are associated with the submit. + */ + ret = drm_gpuvm_validate(submit->vm, &submit->exec); + if (ret) + goto out; + } + + if (!(args->flags & MSM_SUBMIT_NO_IMPLICIT) && + !(queue->flags & MSM_SUBMITQUEUE_VM_BIND)) { ret = submit_fence_sync(submit); if (ret) goto out; }
- ret = submit_pin_vmas(submit); + if (queue->flags & MSM_SUBMITQUEUE_VM_BIND) { + ret = submit_get_pages(submit); + } else { + ret = submit_pin_vmas(submit); + } if (ret) goto out;
submit_pin_objects(submit);
- for (i = 0; i < args->nr_cmds; i++) { + cmds_to_parse = msm_context_is_vmbind(ctx) ? 0 : args->nr_cmds; + + for (i = 0; i < cmds_to_parse; i++) { struct drm_gem_object *obj; uint64_t iova;
@@ -857,7 +950,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, goto out; }
- submit->nr_cmds = i; + submit->nr_cmds = args->nr_cmds;
idr_preload(GFP_KERNEL);
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c index b37bfd80bca9..2160d492a999 100644 --- a/drivers/gpu/drm/msm/msm_gem_vma.c +++ b/drivers/gpu/drm/msm/msm_gem_vma.c @@ -162,6 +162,70 @@ static const struct drm_gpuvm_ops msm_gpuvm_ops = { .vm_free = msm_gem_vm_free, };
+static int +run_bo_op(struct msm_gem_submit *submit, const struct msm_gem_submit_bo *bo) +{ + unsigned op = bo->flags & MSM_SUBMIT_BO_OP_MASK; + + switch (op) { + case MSM_SUBMIT_BO_OP_MAP: + case MSM_SUBMIT_BO_OP_MAP_NULL: + return drm_gpuvm_sm_map(submit->vm, submit->vm, bo->iova, + bo->range, bo->obj, bo->bo_offset); + break; + case MSM_SUBMIT_BO_OP_UNMAP: + return drm_gpuvm_sm_unmap(submit->vm, submit->vm, bo->iova, + bo->range); + } + + return -EINVAL; +} + +static struct dma_fence * +msm_vma_job_run(struct drm_sched_job *job) +{ + struct msm_gem_submit *submit = to_msm_submit(job); + + for (unsigned i = 0; i < submit->nr_bos; i++) { + int ret = run_bo_op(submit, &submit->bos[i]); + if (ret) { + to_msm_vm(submit->vm)->unusable = true; + return ERR_PTR(ret); + } + } + + /* VM_BIND ops run on CPU, so we are done now: */ + msm_submit_retire(submit); + + for (int i = 0; i < submit->nr_bos; i++) { + struct drm_gem_object *obj = submit->bos[i].obj; + + if (!obj) + continue; + + msm_gem_lock(obj); + msm_gem_unpin_locked(obj); + msm_gem_unlock(obj); + } + + /* VM_BIND ops are synchronous, so no fence to wait on: */ + return NULL; +} + +static void +msm_vma_job_free(struct drm_sched_job *job) +{ + struct msm_gem_submit *submit = to_msm_submit(job); + + drm_sched_job_cleanup(job); + msm_gem_submit_put(submit); +} + +static const struct drm_sched_backend_ops msm_vm_bind_ops = { + .run_job = msm_vma_job_run, + .free_job = msm_vma_job_free +}; + /** * msm_gem_vm_create() - Create and initialize a &msm_gem_vm * @drm: the drm device @@ -197,6 +261,14 @@ msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name, goto err_free_vm; }
+ if (!managed) { + ret = drm_sched_init(&vm->sched, &msm_vm_bind_ops, NULL, 1, 1, 0, + MAX_SCHEDULE_TIMEOUT, NULL, NULL, + "msm-vm-bind", drm->dev); + if (ret) + goto err_free_dummy; + } + drm_gpuvm_init(&vm->base, name, 0, drm, dummy_gem, va_start, va_size, 0, 0, &msm_gpuvm_ops); drm_gem_object_put(dummy_gem); @@ -211,8 +283,36 @@ msm_gem_vm_create(struct drm_device *drm, struct msm_mmu *mmu, const char *name,
return &vm->base;
+err_free_dummy: + drm_gem_object_put(dummy_gem); + err_free_vm: kfree(vm); return ERR_PTR(ret);
} + +/** + * msm_gem_vm_close() - Close a VM + * @_vm: The VM to close + * + * Called when the drm device file is closed, to tear down VM related resources + * (which will drop refcounts to GEM objects that were still mapped into the + * VM at the time). + */ +void +msm_gem_vm_close(struct drm_gpuvm *_vm) +{ + struct msm_gem_vm *vm = to_msm_vm(_vm); + + /* + * For kernel managed VMs, the VMAs are torn down when the handle is + * closed, so nothing more to do. + */ + if (vm->managed) + return; + + /* Kill the scheduler now, so we aren't racing with it for cleanup: */ + drm_sched_stop(&vm->sched, NULL); + drm_sched_fini(&vm->sched); +} diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 70abbd93e11b..fe716f0004f2 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -554,6 +554,9 @@ struct msm_gpu_submitqueue { struct mutex lock; struct kref ref; struct drm_sched_entity *entity; + + /** @_vm_bind_entity: used for @entity pointer for VM_BIND queues */ + struct drm_sched_entity _vm_bind_entity[0]; };
struct msm_gpu_state_bo { diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c index 8ced49c7557b..99ab780d5d7b 100644 --- a/drivers/gpu/drm/msm/msm_submitqueue.c +++ b/drivers/gpu/drm/msm/msm_submitqueue.c @@ -72,6 +72,9 @@ void msm_submitqueue_destroy(struct kref *kref)
idr_destroy(&queue->fence_idr);
+ if (queue->entity == &queue->_vm_bind_entity[0]) + drm_sched_entity_destroy(queue->entity); + msm_context_put(queue->ctx);
kfree(queue); @@ -115,6 +118,11 @@ void msm_submitqueue_close(struct msm_context *ctx) list_del(&entry->node); msm_submitqueue_put(entry); } + + if (!ctx->vm) + return; + + msm_gem_vm_close(ctx->vm); }
static struct drm_sched_entity * @@ -160,8 +168,6 @@ int msm_submitqueue_create(struct drm_device *drm, struct msm_context *ctx, struct msm_drm_private *priv = drm->dev_private; struct msm_gpu_submitqueue *queue; enum drm_sched_priority sched_prio; - extern int enable_preemption; - bool preemption_supported; unsigned ring_nr; int ret;
@@ -171,26 +177,53 @@ int msm_submitqueue_create(struct drm_device *drm, struct msm_context *ctx, if (!priv->gpu) return -ENODEV;
- preemption_supported = priv->gpu->nr_rings == 1 && enable_preemption != 0; + if (flags & MSM_SUBMITQUEUE_VM_BIND) { + unsigned sz;
- if (flags & MSM_SUBMITQUEUE_ALLOW_PREEMPT && preemption_supported) - return -EINVAL; + /* Not allowed for kernel managed VMs (ie. kernel allocs VA) */ + if (!msm_context_is_vmbind(ctx)) + return -EINVAL;
-	ret = msm_gpu_convert_priority(priv->gpu, prio, &ring_nr, &sched_prio);
-	if (ret)
-		return ret;
+		if (prio)
+			return -EINVAL;
+
+		sz = struct_size(queue, _vm_bind_entity, 1);
+		queue = kzalloc(sz, GFP_KERNEL);
+	} else {
+		extern int enable_preemption;
+		bool preemption_supported =
+			priv->gpu->nr_rings == 1 && enable_preemption != 0;
+
+		if (flags & MSM_SUBMITQUEUE_ALLOW_PREEMPT && preemption_supported)
+			return -EINVAL;

-	queue = kzalloc(sizeof(*queue), GFP_KERNEL);
+		ret = msm_gpu_convert_priority(priv->gpu, prio, &ring_nr, &sched_prio);
+		if (ret)
+			return ret;
+
+		queue = kzalloc(sizeof(*queue), GFP_KERNEL);
+	}

 	if (!queue)
 		return -ENOMEM;

 	kref_init(&queue->ref);
 	queue->flags = flags;
-	queue->ring_nr = ring_nr;

-	queue->entity = get_sched_entity(ctx, priv->gpu->rb[ring_nr],
-					  ring_nr, sched_prio);
+	if (flags & MSM_SUBMITQUEUE_VM_BIND) {
+		struct drm_gpu_scheduler *sched = &to_msm_vm(msm_context_vm(drm, ctx))->sched;
+
+		queue->entity = &queue->_vm_bind_entity[0];
+
+		drm_sched_entity_init(queue->entity, DRM_SCHED_PRIORITY_KERNEL,
+				      &sched, 1, NULL);
+	} else {
+		queue->ring_nr = ring_nr;
+
+		queue->entity = get_sched_entity(ctx, priv->gpu->rb[ring_nr],
+						 ring_nr, sched_prio);
+	}
+
 	if (IS_ERR(queue->entity)) {
 		ret = PTR_ERR(queue->entity);
 		kfree(queue);
diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h
index 1a948d49c610..39b55c8d7413 100644
--- a/include/uapi/drm/msm_drm.h
+++ b/include/uapi/drm/msm_drm.h
@@ -404,12 +404,19 @@ struct drm_msm_gem_madvise {
 /*
  * Draw queues allow the user to set specific submission parameter. Command
  * submissions specify a specific submitqueue to use. ID 0 is reserved for
- * backwards compatibility as a "default" submitqueue
+ * backwards compatibility as a "default" submitqueue.
+ *
+ * Because VM_BIND async updates happen on the CPU, they must run on a
+ * virtual queue created with the flag MSM_SUBMITQUEUE_VM_BIND. If we had
+ * a way to do pgtable updates on the GPU, we could drop this restriction.
  */

 #define MSM_SUBMITQUEUE_ALLOW_PREEMPT 0x00000001
+#define MSM_SUBMITQUEUE_VM_BIND 0x00000002 /* virtual queue for VM_BIND ops */
+
 #define MSM_SUBMITQUEUE_FLAGS ( \
 		MSM_SUBMITQUEUE_ALLOW_PREEMPT | \
+		MSM_SUBMITQUEUE_VM_BIND | \
 		0)

 /*
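For reference, the argument checks that the hunk above adds to queue creation can be modelled in a few lines of standalone C. This is only a sketch: `check_submitqueue_args()` is a hypothetical helper that does not exist in the driver, the flag values are copied from the msm_drm.h hunk, and the unknown-flag check is an assumption about the surrounding ioctl code rather than something this diff shows. The preemption check and the `msm_gpu_convert_priority()` validation on the non-VM_BIND path are deliberately left out.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as defined in the msm_drm.h hunk above. */
#define MSM_SUBMITQUEUE_ALLOW_PREEMPT	0x00000001
#define MSM_SUBMITQUEUE_VM_BIND		0x00000002
#define MSM_SUBMITQUEUE_FLAGS \
	(MSM_SUBMITQUEUE_ALLOW_PREEMPT | MSM_SUBMITQUEUE_VM_BIND)

/*
 * Hypothetical helper, not part of the patch: returns 0 if the
 * (flags, prio) pair would pass the checks modelled here, or -EINVAL.
 */
static int check_submitqueue_args(uint32_t flags, uint32_t prio)
{
	/* Assumption: unknown flag bits are rejected by the ioctl. */
	if (flags & ~(uint32_t)MSM_SUBMITQUEUE_FLAGS)
		return -EINVAL;

	/* From the diff: a VM_BIND queue is not tied to a ring, so prio must be 0. */
	if ((flags & MSM_SUBMITQUEUE_VM_BIND) && prio)
		return -EINVAL;

	return 0;
}
```

Under these assumptions, userspace would request a VM_BIND queue with `flags = MSM_SUBMITQUEUE_VM_BIND` and `prio = 0`; any non-zero priority is refused.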
From: Rob Clark robdclark@chromium.org
Buffers that are not shared between contexts can share a single resv object. This way drm_gpuvm will not track them as external objects, and submit-time validation overhead will be O(1) for all N non-shared BOs, instead of O(N).
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/gpu/drm/msm/msm_drv.h       |  1 +
 drivers/gpu/drm/msm/msm_gem.c       | 23 +++++++++++++++++++++++
 drivers/gpu/drm/msm/msm_gem_prime.c | 15 +++++++++++++++
 include/uapi/drm/msm_drm.h          | 14 ++++++++++++++
 4 files changed, 53 insertions(+)
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 80582c0c2bf7..4c7ff83a0a20 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -246,6 +246,7 @@ int msm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map);
 void msm_gem_prime_vunmap(struct drm_gem_object *obj, struct iosys_map *map);
 struct drm_gem_object *msm_gem_prime_import_sg_table(struct drm_device *dev,
 		struct dma_buf_attachment *attach, struct sg_table *sg);
+struct dma_buf *msm_gem_prime_export(struct drm_gem_object *obj, int flags);
 int msm_gem_prime_pin(struct drm_gem_object *obj);
 void msm_gem_prime_unpin(struct drm_gem_object *obj);

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 66332481c4c3..c21e1284f289 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -511,6 +511,9 @@ static int get_and_pin_iova_range_locked(struct drm_gem_object *obj,

 	msm_gem_assert_locked(obj);

+	if (to_msm_bo(obj)->flags & MSM_BO_NO_SHARE)
+		return -EINVAL;
+
 	vma = get_vma_locked(obj, vm, range_start, range_end);
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
@@ -1026,6 +1029,16 @@ static void msm_gem_free_object(struct drm_gem_object *obj)
 		put_iova_vmas(obj);
 	}

+	if (msm_obj->flags & MSM_BO_NO_SHARE) {
+		struct drm_gem_object *r_obj =
+			container_of(obj->resv, struct drm_gem_object, _resv);
+
+		BUG_ON(obj->resv == &obj->_resv);
+
+		/* Drop reference we hold to shared resv obj: */
+		drm_gem_object_put(r_obj);
+	}
+
 	drm_gem_object_release(obj);

 	kfree(msm_obj->metadata);
@@ -1058,6 +1071,15 @@ int msm_gem_new_handle(struct drm_device *dev, struct drm_file *file,
 	if (name)
 		msm_gem_object_set_name(obj, "%s", name);

+	if (flags & MSM_BO_NO_SHARE) {
+		struct msm_context *ctx = file->driver_priv;
+		struct drm_gem_object *r_obj = drm_gpuvm_resv_obj(ctx->vm);
+
+		drm_gem_object_get(r_obj);
+
+		obj->resv = r_obj->resv;
+	}
+
 	ret = drm_gem_handle_create(file, obj, handle);

 	/* drop reference from allocate - handle holds it now */
@@ -1090,6 +1112,7 @@ static const struct drm_gem_object_funcs msm_gem_object_funcs = {
 	.free = msm_gem_free_object,
 	.open = msm_gem_open,
 	.close = msm_gem_close,
+	.export = msm_gem_prime_export,
 	.pin = msm_gem_prime_pin,
 	.unpin = msm_gem_prime_unpin,
 	.get_sg_table = msm_gem_prime_get_sg_table,
diff --git a/drivers/gpu/drm/msm/msm_gem_prime.c b/drivers/gpu/drm/msm/msm_gem_prime.c
index ee267490c935..1a6d8099196a 100644
--- a/drivers/gpu/drm/msm/msm_gem_prime.c
+++ b/drivers/gpu/drm/msm/msm_gem_prime.c
@@ -16,6 +16,9 @@ struct sg_table *msm_gem_prime_get_sg_table(struct drm_gem_object *obj)
 	struct msm_gem_object *msm_obj = to_msm_bo(obj);
 	int npages = obj->size >> PAGE_SHIFT;

+	if (msm_obj->flags & MSM_BO_NO_SHARE)
+		return ERR_PTR(-EINVAL);
+
 	if (WARN_ON(!msm_obj->pages)) /* should have already pinned! */
 		return ERR_PTR(-ENOMEM);

@@ -45,6 +48,15 @@ struct drm_gem_object *msm_gem_prime_import_sg_table(struct drm_device *dev,
 	return msm_gem_import(dev, attach->dmabuf, sg);
 }

+
+struct dma_buf *msm_gem_prime_export(struct drm_gem_object *obj, int flags)
+{
+	if (to_msm_bo(obj)->flags & MSM_BO_NO_SHARE)
+		return ERR_PTR(-EPERM);
+
+	return drm_gem_prime_export(obj, flags);
+}
+
 int msm_gem_prime_pin(struct drm_gem_object *obj)
 {
 	struct page **pages;
@@ -53,6 +65,9 @@ int msm_gem_prime_pin(struct drm_gem_object *obj)
 	if (obj->import_attach)
 		return 0;

+	if (to_msm_bo(obj)->flags & MSM_BO_NO_SHARE)
+		return -EINVAL;
+
 	pages = msm_gem_pin_pages_locked(obj);
 	if (IS_ERR(pages))
 		ret = PTR_ERR(pages);
diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h
index 39b55c8d7413..a7e48ee1dd95 100644
--- a/include/uapi/drm/msm_drm.h
+++ b/include/uapi/drm/msm_drm.h
@@ -138,6 +138,19 @@ struct drm_msm_param {

 #define MSM_BO_SCANOUT 0x00000001 /* scanout capable */
 #define MSM_BO_GPU_READONLY 0x00000002
+/* Private buffers do not need to be explicitly listed in the SUBMIT
+ * ioctl, unless referenced by a drm_msm_gem_submit_cmd. Private
+ * buffers may NOT be imported/exported or used for scanout (or any
+ * other situation where buffers can be indefinitely pinned, but
+ * cases other than scanout are all kernel owned BOs which are not
+ * visible to userspace).
+ *
+ * In exchange for those constraints, all private BOs associated with
+ * a single context (drm_file) share a single dma_resv, and if there
+ * has been no eviction since the last submit, there is no per-BO
+ * bookkeeping to do, significantly cutting the SUBMIT overhead.
+ */
+#define MSM_BO_NO_SHARE 0x00000004
 #define MSM_BO_CACHE_MASK 0x000f0000
 /* cache modes */
 #define MSM_BO_CACHED 0x00010000
@@ -147,6 +160,7 @@ struct drm_msm_param {

 #define MSM_BO_FLAGS (MSM_BO_SCANOUT | \
 		MSM_BO_GPU_READONLY | \
+		MSM_BO_NO_SHARE | \
 		MSM_BO_CACHE_MASK)

 struct drm_msm_gem_new {
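The O(1) claim in this patch's commit message can be illustrated with a toy model in standalone C. This is a sketch under stated assumptions, not driver code: `resv_objects_to_lock()` is a hypothetical function, and it assumes a submit locks the VM's shared resv exactly once plus one resv per BO that is not `MSM_BO_NO_SHARE`. The model walks the BO array only in order to count; the point of the patch is that the kernel does not need to visit private BOs at all.

```c
#include <stddef.h>
#include <stdint.h>

#define MSM_BO_NO_SHARE 0x00000004	/* value from the msm_drm.h hunk above */

/*
 * Hypothetical model, not part of the patch: count the distinct
 * dma_resv objects a submit would have to lock for the BOs of one
 * context, given each BO's creation flags.
 */
static size_t resv_objects_to_lock(const uint32_t *bo_flags, size_t nr_bos)
{
	/* Assumption: the VM's shared resv is locked once per submit. */
	size_t count = 1;

	for (size_t i = 0; i < nr_bos; i++) {
		/* A shareable BO keeps its own resv and needs its own lock. */
		if (!(bo_flags[i] & MSM_BO_NO_SHARE))
			count++;
	}

	return count;
}
```

With every BO created as `MSM_BO_NO_SHARE` the count stays at one regardless of N, whereas each shareable BO adds one more resv to lock and validate.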