On Thu, Jan 28, 2021 at 11:08 PM Suren Baghdasaryan surenb@google.com wrote:
On Thu, Jan 28, 2021 at 11:51 AM Suren Baghdasaryan surenb@google.com wrote:
On Tue, Jan 26, 2021 at 5:52 AM 'Michal Hocko' via kernel-team kernel-team@android.com wrote:
On Wed 20-01-21 14:17:39, Jann Horn wrote:
On Wed, Jan 13, 2021 at 3:22 PM Michal Hocko mhocko@suse.com wrote:
On Tue 12-01-21 09:51:24, Suren Baghdasaryan wrote:
On Tue, Jan 12, 2021 at 9:45 AM Oleg Nesterov oleg@redhat.com wrote: > > On 01/12, Michal Hocko wrote: > > > > On Mon 11-01-21 09:06:22, Suren Baghdasaryan wrote: > > > > > What we want is the ability for one process to influence another process > > > in order to optimize performance across the entire system while leaving > > > the security boundary intact. > > > Replace PTRACE_MODE_ATTACH with a combination of PTRACE_MODE_READ > > > and CAP_SYS_NICE. PTRACE_MODE_READ to prevent leaking ASLR metadata > > > and CAP_SYS_NICE for influencing process performance. > > > > I have to say that ptrace modes are rather obscure to me. So I cannot > > really judge whether MODE_READ is sufficient. My understanding has > > always been that this is requred to RO access to the address space. But > > this operation clearly has a visible side effect. Do we have any actual > > documentation for the existing modes? > > > > I would be really curious to hear from Jann and Oleg (now Cced). > > Can't comment, sorry. I never understood these security checks and never tried. > IIUC only selinux/etc can treat ATTACH/READ differently and I have no idea what > is the difference.
Yama in particular only does its checks on ATTACH and ignores READ, that's the difference you're probably most likely to encounter on a normal desktop system, since some distros turn Yama on by default. Basically the idea there is that running "gdb -p $pid" or "strace -p $pid" as a normal user will usually fail, but reading /proc/$pid/maps still works; so you can see things like detailed memory usage information and such, but you're not supposed to be able to directly peek into a running SSH client and inject data into the existing SSH connection, or steal the cryptographic keys for the current connection, or something like that.
I haven't seen a written explanation on ptrace modes but when I consulted Jann his explanation was:
PTRACE_MODE_READ means you can inspect metadata about processes with the specified domain, across UID boundaries. PTRACE_MODE_ATTACH means you can fully impersonate processes with the specified domain, across UID boundaries.
Maybe this would be a good start to document expectations. Some more practical examples where the difference is visible would be great as well.
Before documenting the behavior, it would be a good idea to figure out what to do with perf_event_open(). That one's weird in that it only requires PTRACE_MODE_READ, but actually allows you to sample stuff like userspace stack and register contents (if perf_event_paranoid is 1 or 2). Maybe for SELinux things (and maybe also for Yama), there should be a level in between that allows fully inspecting the process (for purposes like profiling) but without the ability to corrupt its memory or registers or things like that. Or maybe perf_event_open() should just use the ATTACH mode.
Thanks for the clarification. I still cannot say I would have a good mental picture. Having something in Documentation/core-api/ sounds really needed. Wrt to perf_event_open it sounds really odd it can do more than other places restrict indeed. Something for the respective maintainer but I strongly suspect people simply copy the pattern from other places because the expected semantic is not really clear.
Sorry, back to the matters of this patch. Are there any actionable items for me to take care of before it can be accepted? The only request from Andrew to write a man page is being worked on at https://lore.kernel.org/linux-mm/20210120202337.1481402-1-surenb@google.com/ and I'll follow up with the next version. I also CC'ed stable@ for this to be included into 5.10 per Andrew's request. That CC was lost at some point, so CC'ing again.
I do not see anything else on this patch to fix. Please chime in if there are any more concerns, otherwise I would ask Andrew to take it into mm-tree and stable@ to apply it to 5.10. Thanks!
process_madvise man page V2 is posted at: https://lore.kernel.org/linux-mm/20210129070340.566340-1-surenb@google.com/
process_madvise man page V3 is posted at: https://lore.kernel.org/linux-mm/20210202053046.1653012-1-surenb@google.com/
-- Michal Hocko SUSE Labs
-- To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.