On Mon, May 06, 2024 at 04:01:42PM +0200, Hans de Goede wrote:
Hi Sima,
On 5/6/24 3:38 PM, Daniel Vetter wrote:
On Mon, May 06, 2024 at 02:05:12PM +0200, Maxime Ripard wrote:
Hi,
On Mon, May 06, 2024 at 01:49:17PM GMT, Hans de Goede wrote:
Hi dma-buf maintainers, et al.,
Various people have been working on making complex/MIPI cameras work OOTB with mainline Linux kernels and an open-source userspace stack.
The generic solution adds a software ISP (for debayering and 3A) to libcamera. Libcamera's API guarantees that buffers handed to applications using it are dma-bufs so that these can be passed to e.g. a video encoder.
In order to meet this API guarantee the libcamera software ISP allocates dma-bufs from userspace through one of the /dev/dma_heap/* heaps. For the Fedora COPR repo with the PoC of this, see: https://hansdegoede.dreamwidth.org/28153.html
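For reference, the allocation itself is just an ioctl on the heap chardev, so anyone who can open /dev/dma_heap/* can allocate from it. A minimal sketch, with the heap name and buffer size only as examples:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dma-heap.h>

int main(void)
{
        struct dma_heap_allocation_data alloc = {
                .len = 4 * 1024 * 1024,          /* 4 MiB example buffer */
                .fd_flags = O_RDWR | O_CLOEXEC,  /* flags for the new dma-buf fd */
        };
        int heap = open("/dev/dma_heap/system", O_RDWR | O_CLOEXEC);

        if (heap < 0 || ioctl(heap, DMA_HEAP_IOCTL_ALLOC, &alloc) < 0) {
                perror("dma-heap alloc");
                return 1;
        }
        printf("got dma-buf fd %d\n", alloc.fd);
        close(alloc.fd);
        close(heap);
        return 0;
}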
For the record, we're also considering using them for ARM KMS devices, so it would be better if the solution didn't only consider v4l2 devices.
I have added a simple udev rule to give physically present users access to the dma_heap-s:
KERNEL=="system", SUBSYSTEM=="dma_heap", TAG+="uaccess"
(and on Raspberry Pi devices any users in the video group get access)
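(A group-based rule along those lines would look roughly like the one below; this is just a sketch, the exact rule Raspberry Pi OS ships may differ, and a CMA heap, under whatever name the platform exposes it, would need a matching rule as well:)

SUBSYSTEM=="dma_heap", GROUP="video", MODE="0660"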
This was just a quick fix for the PoC. Now that we are ready to move out of the PoC phase and start actually integrating this into distributions, the question becomes whether this is an acceptable solution, or whether we need some other way to deal with this?
Specifically, the question is whether this will have any negative security implications. I can certainly see this being used for some sort of denial-of-service attack on the system (1). This is especially true for the cma heap, which generally speaking is a limited resource.
There's plenty of other ways to exhaust CMA, like allocating too many KMS or v4l2 buffers. I'm not sure we should consider dma-heaps differently than those if it's part of our threat model.
So generally on an ARM SoC your display needs CMA, but your render node doesn't. And user applications only have access to the latter, while only the compositor gets a KMS fd through logind. At least in drm, aside from vc4 there's really no render driver that just gives you access to CMA and allows you to exhaust that; you need to be a compositor with drm master access to the display.
Which means we're mostly protected against bad applications, and that's not a threat the "user physically sits in front of the machine" model accounts for, and which giving cma access to everyone would open up. And with flathub/snaps/... this is very much an issue.
I agree that bad applications are an issue, but not for the flathub / snaps case. Flatpaks / snaps run sandboxed and don't have access to a full /dev so those should not be able to open /dev/dma_heap/* independent of the ACLs on /dev/dma_heap/*. The plan is for cameras using the libcamera software ISP to always be accessed through pipewire and the camera portal, so in this case pipewire is taking the place of the compositor in your kms vs render node example.
Yeah essentially if you clarify to "set the permissions such that pipewire can do allocations", then I think that makes sense. And is at the same level as e.g. drm kms giving compositors (but _only_ compositors) special access rights.
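One hypothetical way to express "only pipewire can do allocations" with plain udev would be something like the rule below, assuming a dedicated group that only the pipewire service user is a member of; a logind-style service handing out already-opened fds would be the closer analogue to the kms master case:

SUBSYSTEM=="dma_heap", GROUP="pipewire", MODE="0660"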
So this reduces the problem to bad apps packaged by regular distributions, and if any of those misbehave the distros should fix that.
So I think that for the denial-of-service side, allowing physically present users (but not sandboxed apps running as those users) to access /dev/dma_heap/* should be ok.
My bigger worry is if dma_heap (u)dma-bufs can be abused in other ways than causing a denial of service.
I guess that the answer there is that causing other security issues should not be possible?
Well pinned memory exhaustion is a very useful tool to make all kinds of other kernel issues exploitable. Like if you have that you can weaponize all kinds of kmalloc error paths (and since it's untracked memory the oom killer will not get you out of these issues).
I think for the pipewire based desktop it'd be best if you only allow pipewire to get at an fd for allocating from dma-heaps, kinda like logind furnishes the kms master fd ... Still has the issue that you can't nuke these buffers, but that's for another day. But at least from a "limit attack surface" design pov I think this would be better than just handing out access to the current user outright. But that's also not the worst option I guess, as long as snaps/flatpaks only go through the pipewire service. -Sima
Hi,
On Tue, 7 May 2024 at 12:15, Daniel Vetter daniel@ffwll.ch wrote:
On Mon, May 06, 2024 at 04:01:42PM +0200, Hans de Goede wrote:
On 5/6/24 3:38 PM, Daniel Vetter wrote: I agree that bad applications are an issue, but not for the flathub / snaps case. Flatpaks / snaps run sandboxed and don't have access to a full /dev so those should not be able to open /dev/dma_heap/* independent of the ACLs on /dev/dma_heap/*. The plan is for cameras using the libcamera software ISP to always be accessed through pipewire and the camera portal, so in this case pipewire is taking the place of the compositor in your kms vs render node example.
Yeah essentially if you clarify to "set the permissions such that pipewire can do allocations", then I think that makes sense. And is at the same level as e.g. drm kms giving compositors (but _only_ compositors) special access rights.
That would have the unfortunate side effect of making sandboxed apps less efficient on some platforms, since they wouldn't be able to do direct scanout anymore ...
Cheers, Daniel
On Wed, May 08, 2024 at 06:46:53AM +0100, Daniel Stone wrote:
Hi,
On Tue, 7 May 2024 at 12:15, Daniel Vetter daniel@ffwll.ch wrote:
On Mon, May 06, 2024 at 04:01:42PM +0200, Hans de Goede wrote:
On 5/6/24 3:38 PM, Daniel Vetter wrote: I agree that bad applications are an issue, but not for the flathub / snaps case. Flatpaks / snaps run sandboxed and don't have access to a full /dev so those should not be able to open /dev/dma_heap/* independent of the ACLs on /dev/dma_heap/*. The plan is for cameras using the libcamera software ISP to always be accessed through pipewire and the camera portal, so in this case pipewire is taking the place of the compositor in your kms vs render node example.
Yeah essentially if you clarify to "set the permissions such that pipewire can do allocations", then I think that makes sense. And is at the same level as e.g. drm kms giving compositors (but _only_ compositors) special access rights.
That would have the unfortunate side effect of making sandboxed apps less efficient on some platforms, since they wouldn't be able to do direct scanout anymore ...
I was assuming that everyone goes through pipewire, and ideally that is the only one that can even get at these special chardevs.
If pipewire is only for sandboxed apps then yeah this ain't great :-/ -Sima
On Wed, 8 May 2024 at 09:33, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, May 08, 2024 at 06:46:53AM +0100, Daniel Stone wrote:
That would have the unfortunate side effect of making sandboxed apps less efficient on some platforms, since they wouldn't be able to do direct scanout anymore ...
I was assuming that everyone goes through pipewire, and ideally that is the only one that can even get at these special chardevs.
If pipewire is only for sandboxed apps then yeah this ain't great :-/
No, PipeWire is fine, I mean graphical apps.
Right now, if your platform requires CMA for display, then the app needs access to the GPU render node and the display node too, in order to allocate buffers which the compositor can scan out directly. If it only has access to the render nodes and not the display node, it won't be able to allocate correctly, so its content will need a composition pass, i.e. performance penalty for sandboxing. But if it can allocate correctly, then hey, it can exhaust CMA just like heaps can.
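To make the "allocate correctly" part concrete, this is roughly the GBM path a client uses on such a platform; a sketch only, with the device path, size and format as arbitrary examples and no error handling:

#include <fcntl.h>
#include <gbm.h>

struct gbm_bo *alloc_scanout_buffer(void)
{
        /* Needs the display node here, not just the render node: on these
         * platforms scanout-capable buffers have to come from CMA, which
         * the display driver hands out, while the render node only gives
         * you shmem-backed memory the display can't scan out. */
        int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
        struct gbm_device *gbm = gbm_create_device(fd);

        return gbm_bo_create(gbm, 1920, 1080, GBM_FORMAT_XRGB8888,
                             GBM_BO_USE_SCANOUT | GBM_BO_USE_RENDERING);
}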
Personally I think we'd be better off just allowing access and figuring out cgroups later. It's not like the OOM story is great generally, and hey, you can get there with just render nodes ...
Cheers, Daniel
On Wed, May 08, 2024 at 09:38:33AM +0100, Daniel Stone wrote:
On Wed, 8 May 2024 at 09:33, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, May 08, 2024 at 06:46:53AM +0100, Daniel Stone wrote:
That would have the unfortunate side effect of making sandboxed apps less efficient on some platforms, since they wouldn't be able to do direct scanout anymore ...
I was assuming that everyone goes through pipewire, and ideally that is the only one that can even get at these special chardevs.
If pipewire is only for sandboxed apps then yeah this ain't great :-/
No, PipeWire is fine, I mean graphical apps.
Right now, if your platform requires CMA for display, then the app needs access to the GPU render node and the display node too, in order to allocate buffers which the compositor can scan out directly. If it only has access to the render nodes and not the display node, it won't be able to allocate correctly, so its content will need a composition pass, i.e. performance penalty for sandboxing. But if it can allocate correctly, then hey, it can exhaust CMA just like heaps can.
Personally I think we'd be better off just allowing access and figuring out cgroups later. It's not like the OOM story is great generally, and hey, you can get there with just render nodes ...
Imo the right fix is to ask the compositor to allocate the buffers in this case, and then maybe have some kind of revoke/purge behaviour on these buffers. Compositor has an actual idea of who's a candidate for direct scanout after all, not the app. Or well at least force migrate the memory from cma to shmem.
If you only whack cgroups on this issue you're still stuck in the world where either all apps together can ddos the display or no one can realistically direct scanout.
So yeah on the display side the problem isn't solved either, but we knew that already. -Sima
Hi,
On Wed, 8 May 2024 at 16:49, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, May 08, 2024 at 09:38:33AM +0100, Daniel Stone wrote:
Right now, if your platform requires CMA for display, then the app needs access to the GPU render node and the display node too, in order to allocate buffers which the compositor can scan out directly. If it only has access to the render nodes and not the display node, it won't be able to allocate correctly, so its content will need a composition pass, i.e. performance penalty for sandboxing. But if it can allocate correctly, then hey, it can exhaust CMA just like heaps can.
Personally I think we'd be better off just allowing access and figuring out cgroups later. It's not like the OOM story is great generally, and hey, you can get there with just render nodes ...
Imo the right fix is to ask the compositor to allocate the buffers in this case, and then maybe have some kind of revoke/purge behaviour on these buffers. Compositor has an actual idea of who's a candidate for direct scanout after all, not the app. Or well at least force migrate the memory from cma to shmem.
If you only whack cgroups on this issue you're still stuck in the world where either all apps together can ddos the display or no one can realistically direct scanout.
Mmm, back to DRI2. I can't say I'm wildly enthused about that, not least because a client using GPU/codec/etc for those buffers would have to communicate its requirements (alignment etc) forward to the compositor in order for the compositor to allocate for it. Obviously passing the constraints etc around isn't a solved problem yet, but it is at least contained down in clients rather than going back and forth between client and compositor.
I'm extremely not-wild about the compositor migrating memory from CMA to shmem behind the client's back, and tbh I'm not sure how that would even work if the client has it pinned through whatever API it's imported into.
Anyway, like Laurent says, if we're deciding that heaps can't be used by generic apps (unlike DRM/V4L2/etc), then we need gralloc.
Cheers, Daniel
On Thu, May 09, 2024 at 10:23:16AM +0100, Daniel Stone wrote:
Hi,
On Wed, 8 May 2024 at 16:49, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, May 08, 2024 at 09:38:33AM +0100, Daniel Stone wrote:
Right now, if your platform requires CMA for display, then the app needs access to the GPU render node and the display node too, in order to allocate buffers which the compositor can scan out directly. If it only has access to the render nodes and not the display node, it won't be able to allocate correctly, so its content will need a composition pass, i.e. performance penalty for sandboxing. But if it can allocate correctly, then hey, it can exhaust CMA just like heaps can.
Personally I think we'd be better off just allowing access and figuring out cgroups later. It's not like the OOM story is great generally, and hey, you can get there with just render nodes ...
Imo the right fix is to ask the compositor to allocate the buffers in this case, and then maybe have some kind of revoke/purge behaviour on these buffers. Compositor has an actual idea of who's a candidate for direct scanout after all, not the app. Or well at least force migrate the memory from cma to shmem.
If you only whack cgroups on this issue you're still stuck in the world where either all apps together can ddos the display or no one can realistically direct scanout.
Mmm, back to DRI2. I can't say I'm wildly enthused about that, not least because a client using GPU/codec/etc for those buffers would have to communicate its requirements (alignment etc) forward to the compositor in order for the compositor to allocate for it. Obviously passing the constraints etc around isn't a solved problem yet, but it is at least contained down in clients rather than going back and forth between client and compositor.
I don't think you need the compositor to allocate the buffer from the requirements, you only need a protocol that a) allocates a buffer of a given size from a given heap and b) has some kind of revoke provisions so that the compositor can claw back the memory again when it needs it.
I'm extremely not-wild about the compositor migrating memory from CMA to shmem behind the client's back, and tbh I'm not sure how that would even work if the client has it pinned through whatever API it's imported into.
The other option is revoke on cma buffers that are allocated by clients, for the case where the compositor needs it.
Anyway, like Laurent says, if we're deciding that heaps can't be used by generic apps (unlike DRM/V4L2/etc), then we need gralloc.
gralloc doesn't really fix this, it's just an abstraction around how/where you allocate?
Anyway the current plan is that we all pretend this issue of CMA-allocated buffers doesn't exist and we let clients allocate without limits. Given that we don't even have cgroups to sort out the mess for anything else I wouldn't worry too much ... -Sima
On Wednesday, May 8th, 2024 at 17:49, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, May 08, 2024 at 09:38:33AM +0100, Daniel Stone wrote:
On Wed, 8 May 2024 at 09:33, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, May 08, 2024 at 06:46:53AM +0100, Daniel Stone wrote:
That would have the unfortunate side effect of making sandboxed apps less efficient on some platforms, since they wouldn't be able to do direct scanout anymore ...
I was assuming that everyone goes through pipewire, and ideally that is the only one that can even get at these special chardevs.
If pipewire is only for sandboxed apps then yeah this ain't great :-/
No, PipeWire is fine, I mean graphical apps.
Right now, if your platform requires CMA for display, then the app needs access to the GPU render node and the display node too, in order to allocate buffers which the compositor can scan out directly. If it only has access to the render nodes and not the display node, it won't be able to allocate correctly, so its content will need a composition pass, i.e. performance penalty for sandboxing. But if it can allocate correctly, then hey, it can exhaust CMA just like heaps can.
Personally I think we'd be better off just allowing access and figuring out cgroups later. It's not like the OOM story is great generally, and hey, you can get there with just render nodes ...
Imo the right fix is to ask the compositor to allocate the buffers in this case, and then maybe have some kind of revoke/purge behaviour on these buffers. Compositor has an actual idea of who's a candidate for direct scanout after all, not the app. Or well at least force migrate the memory from cma to shmem.
If you only whack cgroups on this issue you're still stuck in the world where either all apps together can ddos the display or no one can realistically direct scanout.
So yeah on the display side the problem isn't solved either, but we knew that already.
What makes scanout memory so special?
The way I see it, any kind of memory will always be a limited resource: regular programs can exhaust system memory, as well as GPU VRAM, as well as scanout memory. I think we need to have ways to limit/control/arbiter the allocations regardless, and I don't think scanout memory should be a special case here.
On Mon, May 13, 2024 at 01:51:23PM +0000, Simon Ser wrote:
On Wednesday, May 8th, 2024 at 17:49, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, May 08, 2024 at 09:38:33AM +0100, Daniel Stone wrote:
On Wed, 8 May 2024 at 09:33, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, May 08, 2024 at 06:46:53AM +0100, Daniel Stone wrote:
That would have the unfortunate side effect of making sandboxed apps less efficient on some platforms, since they wouldn't be able to do direct scanout anymore ...
I was assuming that everyone goes through pipewire, and ideally that is the only one that can even get at these special chardevs.
If pipewire is only for sandboxed apps then yeah this ain't great :-/
No, PipeWire is fine, I mean graphical apps.
Right now, if your platform requires CMA for display, then the app needs access to the GPU render node and the display node too, in order to allocate buffers which the compositor can scan out directly. If it only has access to the render nodes and not the display node, it won't be able to allocate correctly, so its content will need a composition pass, i.e. performance penalty for sandboxing. But if it can allocate correctly, then hey, it can exhaust CMA just like heaps can.
Personally I think we'd be better off just allowing access and figuring out cgroups later. It's not like the OOM story is great generally, and hey, you can get there with just render nodes ...
Imo the right fix is to ask the compositor to allocate the buffers in this case, and then maybe have some kind of revoke/purge behaviour on these buffers. Compositor has an actual idea of who's a candidate for direct scanout after all, not the app. Or well at least force migrate the memory from cma to shmem.
If you only whack cgroups on this issue you're still stuck in the world where either all apps together can ddos the display or no one can realistically direct scanout.
So yeah on the display side the problem isn't solved either, but we knew that already.
What makes scanout memory so special?
The way I see it, any kind of memory will always be a limited resource: regular programs can exhaust system memory, as well as GPU VRAM, as well as scanout memory. I think we need to have ways to limit/control/arbiter the allocations regardless, and I don't think scanout memory should be a special case here.
(Long weekend and I caught a cold)
It's not scanout that's special, it's cma memory that's special. Because once you've allocated it, it's gone since it cannot be swapped out, and there's not a lot of it to go around. Which means even if we had cgroups for all the various gpu allocation heaps, you can't use cgroups to manage cma in a meaningful way:
- You set the cgroup limits so low for apps that it's guaranteed that the compositor will always be able to allocate enough scanout memory for its needs. That will be low enough that apps can never allocate scanout buffers themselves.
- Or you set the limit high enough so that apps can allocate enough, which means (as soon as you have more than just one app and not a totally bonkers amount of cma) that the compositor might not be able to allocate anymore.
It's kind of a shit situation, which is also why you need the compositor to be able to revoke cma allocations it has handed to clients (like with drm leases).
Or we just keep the current yolo situation.
For any other memory type than CMA most of the popular drivers at least implement swapping, which gives you a ton more flexibility in setting up limits in a way that actually works. But even there we'd need cgroups first to make sure things don't go wrong too badly in the face of evil apps ... -Sima
On 5/16/24 12:13 PM, Daniel Vetter wrote:
(Long weekend and I caught a cold)
Handing over a cup of tea.
I've been fighting a cold since last week and I think it's one of the worst I've ever had.
(On the other hand, every cold feels like the worst you've ever had.)
Christian.