Android captures per-process system memory state when certain low memory events (e.g a foreground app kill) occur, to identify potential memory hoggers. In order to measure how much memory a process actually consumes, it is necessary to include the DMA buffer sizes for that process in the memory accounting. Since the handle to DMA buffers are raw FDs, it is important to be able to identify which processes have FD references to a DMA buffer.
Currently, DMA buffer FDs can be accounted using /proc/<pid>/fd/* and /proc/<pid>/fdinfo -- both are only readable by the process owner, as follows: 1. Do a readlink on each FD. 2. If the target path begins with "/dmabuf", then the FD is a dmabuf FD. 3. stat the file to get the dmabuf inode number. 4. Read/ proc/<pid>/fdinfo/<fd>, to get the DMA buffer size.
Accessing other processes’ fdinfo requires root privileges. This limits the use of the interface to debugging environments and is not suitable for production builds. Granting root privileges even to a system process increases the attack surface and is highly undesirable.
This series proposes making the requirement to read fdinfo less strict with PTRACE_MODE_READ.
Kalesh Singh (2): procfs: Allow reading fdinfo with PTRACE_MODE_READ dmabuf: Add dmabuf inode no to fdinfo
drivers/dma-buf/dma-buf.c | 1 + fs/proc/base.c | 4 ++-- fs/proc/fd.c | 15 ++++++++++++++- 3 files changed, 17 insertions(+), 3 deletions(-)
Since fdinfo doesn't permit reading process memory and manipulating process state, allow accessing fdinfo under PTRACE_MODE_READ_FSCRED.
Suggested-by: Jann Horn jannh@google.com Signed-off-by: Kalesh Singh kaleshsingh@google.com --- fs/proc/base.c | 4 ++-- fs/proc/fd.c | 15 ++++++++++++++- 2 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c index b3422cda2a91..a37f9de7103f 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -3160,7 +3160,7 @@ static const struct pid_entry tgid_base_stuff[] = { DIR("task", S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations), DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations), DIR("map_files", S_IRUSR|S_IXUSR, proc_map_files_inode_operations, proc_map_files_operations), - DIR("fdinfo", S_IRUSR|S_IXUSR, proc_fdinfo_inode_operations, proc_fdinfo_operations), + DIR("fdinfo", S_IRUGO|S_IXUGO, proc_fdinfo_inode_operations, proc_fdinfo_operations), DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations), #ifdef CONFIG_NET DIR("net", S_IRUGO|S_IXUGO, proc_net_inode_operations, proc_net_operations), @@ -3504,7 +3504,7 @@ static const struct inode_operations proc_tid_comm_inode_operations = { */ static const struct pid_entry tid_base_stuff[] = { DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations), - DIR("fdinfo", S_IRUSR|S_IXUSR, proc_fdinfo_inode_operations, proc_fdinfo_operations), + DIR("fdinfo", S_IRUGO|S_IXUGO, proc_fdinfo_inode_operations, proc_fdinfo_operations), DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations), #ifdef CONFIG_NET DIR("net", S_IRUGO|S_IXUGO, proc_net_inode_operations, proc_net_operations), diff --git a/fs/proc/fd.c b/fs/proc/fd.c index cb51763ed554..585e213301f9 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -6,6 +6,7 @@ #include <linux/fdtable.h> #include <linux/namei.h> #include <linux/pid.h> +#include <linux/ptrace.h> #include <linux/security.h> #include <linux/file.h> #include <linux/seq_file.h> @@ -72,6 +73,18 @@ static int seq_show(struct seq_file *m, void *v)
static int seq_fdinfo_open(struct inode *inode, struct file *file) { + bool allowed = false; + struct task_struct *task = get_proc_task(inode); + + if (!task) + return -ESRCH; + + allowed = ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS); + put_task_struct(task); + + if (!allowed) + return -EACCES; + return single_open(file, seq_show, inode); }
@@ -307,7 +320,7 @@ static struct dentry *proc_fdinfo_instantiate(struct dentry *dentry, struct proc_inode *ei; struct inode *inode;
- inode = proc_pid_make_inode(dentry->d_sb, task, S_IFREG | S_IRUSR); + inode = proc_pid_make_inode(dentry->d_sb, task, S_IFREG | S_IRUGO); if (!inode) return ERR_PTR(-ENOENT);
On Thu, Jan 28, 2021 at 10:24 AM Kalesh Singh kaleshsingh@google.com wrote:
Since fdinfo doesn't permit reading process memory and manipulating process state, allow accessing fdinfo under PTRACE_MODE_READ_FSCRED.
I would copy some of the reasoning from your cover letter so that this patch has more context on its own when merged.
Suggested-by: Jann Horn jannh@google.com Signed-off-by: Kalesh Singh kaleshsingh@google.com
fs/proc/base.c | 4 ++-- fs/proc/fd.c | 15 ++++++++++++++- 2 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c index b3422cda2a91..a37f9de7103f 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -3160,7 +3160,7 @@ static const struct pid_entry tgid_base_stuff[] = { DIR("task", S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations), DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations), DIR("map_files", S_IRUSR|S_IXUSR, proc_map_files_inode_operations, proc_map_files_operations),
DIR("fdinfo", S_IRUSR|S_IXUSR, proc_fdinfo_inode_operations, proc_fdinfo_operations),
DIR("fdinfo", S_IRUGO|S_IXUGO, proc_fdinfo_inode_operations, proc_fdinfo_operations), DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations),
#ifdef CONFIG_NET DIR("net", S_IRUGO|S_IXUGO, proc_net_inode_operations, proc_net_operations), @@ -3504,7 +3504,7 @@ static const struct inode_operations proc_tid_comm_inode_operations = { */ static const struct pid_entry tid_base_stuff[] = { DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations),
DIR("fdinfo", S_IRUSR|S_IXUSR, proc_fdinfo_inode_operations, proc_fdinfo_operations),
DIR("fdinfo", S_IRUGO|S_IXUGO, proc_fdinfo_inode_operations, proc_fdinfo_operations), DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations),
#ifdef CONFIG_NET DIR("net", S_IRUGO|S_IXUGO, proc_net_inode_operations, proc_net_operations), diff --git a/fs/proc/fd.c b/fs/proc/fd.c index cb51763ed554..585e213301f9 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -6,6 +6,7 @@ #include <linux/fdtable.h> #include <linux/namei.h> #include <linux/pid.h> +#include <linux/ptrace.h> #include <linux/security.h> #include <linux/file.h> #include <linux/seq_file.h> @@ -72,6 +73,18 @@ static int seq_show(struct seq_file *m, void *v)
static int seq_fdinfo_open(struct inode *inode, struct file *file) {
bool allowed = false;
struct task_struct *task = get_proc_task(inode);
if (!task)
return -ESRCH;
allowed = ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
put_task_struct(task);
if (!allowed)
return -EACCES;
return single_open(file, seq_show, inode);
}
@@ -307,7 +320,7 @@ static struct dentry *proc_fdinfo_instantiate(struct dentry *dentry, struct proc_inode *ei; struct inode *inode;
inode = proc_pid_make_inode(dentry->d_sb, task, S_IFREG | S_IRUSR);
inode = proc_pid_make_inode(dentry->d_sb, task, S_IFREG | S_IRUGO); if (!inode) return ERR_PTR(-ENOENT);
-- 2.30.0.365.g02bc693789-goog
On Thu, Jan 28, 2021 at 2:04 PM Suren Baghdasaryan surenb@google.com wrote:
On Thu, Jan 28, 2021 at 10:24 AM Kalesh Singh kaleshsingh@google.com wrote:
Since fdinfo doesn't permit reading process memory and manipulating process state, allow accessing fdinfo under PTRACE_MODE_READ_FSCRED.
I would copy some of the reasoning from your cover letter so that this patch has more context on its own when merged.
Sounds good. I'll update this description in the next version.
Suggested-by: Jann Horn jannh@google.com Signed-off-by: Kalesh Singh kaleshsingh@google.com
fs/proc/base.c | 4 ++-- fs/proc/fd.c | 15 ++++++++++++++- 2 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c index b3422cda2a91..a37f9de7103f 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -3160,7 +3160,7 @@ static const struct pid_entry tgid_base_stuff[] = { DIR("task", S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations), DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations), DIR("map_files", S_IRUSR|S_IXUSR, proc_map_files_inode_operations, proc_map_files_operations),
DIR("fdinfo", S_IRUSR|S_IXUSR, proc_fdinfo_inode_operations, proc_fdinfo_operations),
DIR("fdinfo", S_IRUGO|S_IXUGO, proc_fdinfo_inode_operations, proc_fdinfo_operations), DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations),
#ifdef CONFIG_NET DIR("net", S_IRUGO|S_IXUGO, proc_net_inode_operations, proc_net_operations), @@ -3504,7 +3504,7 @@ static const struct inode_operations proc_tid_comm_inode_operations = { */ static const struct pid_entry tid_base_stuff[] = { DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations),
DIR("fdinfo", S_IRUSR|S_IXUSR, proc_fdinfo_inode_operations, proc_fdinfo_operations),
DIR("fdinfo", S_IRUGO|S_IXUGO, proc_fdinfo_inode_operations, proc_fdinfo_operations), DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations),
#ifdef CONFIG_NET DIR("net", S_IRUGO|S_IXUGO, proc_net_inode_operations, proc_net_operations), diff --git a/fs/proc/fd.c b/fs/proc/fd.c index cb51763ed554..585e213301f9 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -6,6 +6,7 @@ #include <linux/fdtable.h> #include <linux/namei.h> #include <linux/pid.h> +#include <linux/ptrace.h> #include <linux/security.h> #include <linux/file.h> #include <linux/seq_file.h> @@ -72,6 +73,18 @@ static int seq_show(struct seq_file *m, void *v)
static int seq_fdinfo_open(struct inode *inode, struct file *file) {
bool allowed = false;
struct task_struct *task = get_proc_task(inode);
if (!task)
return -ESRCH;
allowed = ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
put_task_struct(task);
if (!allowed)
return -EACCES;
return single_open(file, seq_show, inode);
}
@@ -307,7 +320,7 @@ static struct dentry *proc_fdinfo_instantiate(struct dentry *dentry, struct proc_inode *ei; struct inode *inode;
inode = proc_pid_make_inode(dentry->d_sb, task, S_IFREG | S_IRUSR);
inode = proc_pid_make_inode(dentry->d_sb, task, S_IFREG | S_IRUGO); if (!inode) return ERR_PTR(-ENOENT);
-- 2.30.0.365.g02bc693789-goog
The dmabuf inode number allows userspace to uniquely identify the buffer and avoids a dependency on /proc/<pid>/fd/* when accounting per-process DMA buffer sizes.
Signed-off-by: Kalesh Singh kaleshsingh@google.com --- drivers/dma-buf/dma-buf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 9ad6397aaa97..d869099ede83 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -414,6 +414,7 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file) { struct dma_buf *dmabuf = file->private_data;
+ seq_printf(m, "dmabuf_inode_no:\t%lu\n", file_inode(file)->i_ino); seq_printf(m, "size:\t%zu\n", dmabuf->size); /* Don't count the temporary reference taken inside procfs seq_show */ seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
On 1/28/21 10:24 AM, Kalesh Singh wrote:
The dmabuf inode number allows userspace to uniquely identify the buffer and avoids a dependency on /proc/<pid>/fd/* when accounting per-process DMA buffer sizes.
Signed-off-by: Kalesh Singh kaleshsingh@google.com
drivers/dma-buf/dma-buf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 9ad6397aaa97..d869099ede83 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -414,6 +414,7 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file) { struct dma_buf *dmabuf = file->private_data;
- seq_printf(m, "dmabuf_inode_no:\t%lu\n", file_inode(file)->i_ino); seq_printf(m, "size:\t%zu\n", dmabuf->size); /* Don't count the temporary reference taken inside procfs seq_show */ seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
Hi,
The subject is a little confusing. It seems to say that the dmabuf inode no is being added to /proc/<pid>/fdino, but looking at the code, it is being added to /proc/dmabuf (or /proc/<pid>/dmabuf).
Please clarify.
and is the /proc file format documented anywhere?
thanks.
On 1/28/21 10:24 AM, Kalesh Singh wrote:
The dmabuf inode number allows userspace to uniquely identify the buffer and avoids a dependency on /proc/<pid>/fd/* when accounting per-process DMA buffer sizes.
Signed-off-by: Kalesh Singh kaleshsingh@google.com
drivers/dma-buf/dma-buf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 9ad6397aaa97..d869099ede83 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -414,6 +414,7 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file) { struct dma_buf *dmabuf = file->private_data;
- seq_printf(m, "dmabuf_inode_no:\t%lu\n", file_inode(file)->i_ino); seq_printf(m, "size:\t%zu\n", dmabuf->size); /* Don't count the temporary reference taken inside procfs seq_show */ seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
[resending; hopefully the first one didn't go anywhere]
Hi,
The subject is a little confusing. It seems to say that the dmabuf inode no is being added to /proc/<pid>/fdinfo, but looking at the code, it is being added to /proc/dmabuf (or /proc/<pid>/dmabuf).
Please clarify.
and is the /proc file format documented anywhere?
thanks.
On Thu, Jan 28, 2021 at 1:55 PM Randy Dunlap rdunlap@infradead.org wrote:
On 1/28/21 10:24 AM, Kalesh Singh wrote:
The dmabuf inode number allows userspace to uniquely identify the buffer and avoids a dependency on /proc/<pid>/fd/* when accounting per-process DMA buffer sizes.
Signed-off-by: Kalesh Singh kaleshsingh@google.com
drivers/dma-buf/dma-buf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 9ad6397aaa97..d869099ede83 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -414,6 +414,7 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file) { struct dma_buf *dmabuf = file->private_data;
seq_printf(m, "dmabuf_inode_no:\t%lu\n", file_inode(file)->i_ino); seq_printf(m, "size:\t%zu\n", dmabuf->size); /* Don't count the temporary reference taken inside procfs seq_show */ seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
[resending; hopefully the first one didn't go anywhere]
Hi,
The subject is a little confusing. It seems to say that the dmabuf inode no is being added to /proc/<pid>/fdinfo, but looking at the code, it is being added to /proc/dmabuf (or /proc/<pid>/dmabuf).
Please clarify.
and is the /proc file format documented anywhere?
Hi Randy,
The patch adds the inode number in /proc/<pid>/fdinfo/* and /proc/<pid>/task/<tid>/fdinfo/* for FDs corresponding to dmabufs. fdinfo is documented in proc man pages, but it doesn't include the dmabuf specific fields, so this will need to be updated. I hope this clarifies.
Thanks, Kalesh
thanks.
-- ~Randy
On Thu, Jan 28, 2021 at 1:24 PM Kalesh Singh kaleshsingh@google.com wrote:
Android captures per-process system memory state when certain low memory events (e.g a foreground app kill) occur, to identify potential memory hoggers. In order to measure how much memory a process actually consumes, it is necessary to include the DMA buffer sizes for that process in the memory accounting. Since the handle to DMA buffers are raw FDs, it is important to be able to identify which processes have FD references to a DMA buffer.
Currently, DMA buffer FDs can be accounted using /proc/<pid>/fd/* and /proc/<pid>/fdinfo -- both are only readable by the process owner, as follows:
- Do a readlink on each FD.
- If the target path begins with "/dmabuf", then the FD is a dmabuf FD.
- stat the file to get the dmabuf inode number.
- Read/ proc/<pid>/fdinfo/<fd>, to get the DMA buffer size.
Accessing other processes’ fdinfo requires root privileges. This limits the use of the interface to debugging environments and is not suitable for production builds. Granting root privileges even to a system process increases the attack surface and is highly undesirable.
This series proposes making the requirement to read fdinfo less strict with PTRACE_MODE_READ.
Hi everyone,
I will send v2 of this patch series. Please let me know if you have any other comments or feedback, that should be addressed in the new version.
Thanks, Kalesh
Kalesh Singh (2): procfs: Allow reading fdinfo with PTRACE_MODE_READ dmabuf: Add dmabuf inode no to fdinfo
drivers/dma-buf/dma-buf.c | 1 + fs/proc/base.c | 4 ++-- fs/proc/fd.c | 15 ++++++++++++++- 3 files changed, 17 insertions(+), 3 deletions(-)
-- 2.30.0.365.g02bc693789-goog
linaro-mm-sig@lists.linaro.org