[PATCH] tracing: do not leak kernel addresses

List overview All Threads
Download

newer

older

Request for 4.4-stable:...

[PATCH] block:...

Mark Salyzyn

25 Jul 2018 25 Jul '18

8:22 p.m.

From: Nick Desaulniers ndesaulniers@google.com

Switch from 0x%lx to 0x%pK to print the kernel addresses.

Fixes: CVE-2017-0630 Signed-off-by: Mark Salyzyn salyzyn@android.com Cc: Nick Desaulniers ndesaulniers@google.com Cc: Steven Rostedt rostedt@goodmis.org Cc: Ingo Molnar mingo@redhat.com Cc: kernel-team@android.com Cc: stable@vger.kernel.org # 3.18, 4.4, 4.9, 4.14 Cc: linux-kernel@vger.kernel.org

--- kernel/trace/trace_printk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_printk.c b/kernel/trace/trace_printk.c index ad1d6164e946..93698023baf1 100644 --- a/kernel/trace/trace_printk.c +++ b/kernel/trace/trace_printk.c @@ -304,7 +304,7 @@ static int t_show(struct seq_file *m, void *v) if (!*fmt) return 0;

- seq_printf(m, "0x%lx : "", *(unsigned long *)fmt); + seq_printf(m, "0x%pK : "", *(unsigned long *)fmt);

/* * Tabs and new lines need to be converted.

-- 2.18.0.233.g985f88cf7e-goog

Show replies by date

Nick Desaulniers

25 Jul 25 Jul

9:14 p.m.

On Wed, Jul 25, 2018 at 1:23 PM Mark Salyzyn salyzyn@android.com wrote:

...

From: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Mark Salyzyn salyzyn@android.com

Thanks for sending. You should take credit/authorship, and add my Reviewed-by: Nick Desaulniers ndesaulniers@google.com

-- Thanks, ~Nick Desaulniers

Steven Rostedt

26 Jul 26 Jul

1:07 a.m.

On Wed, 25 Jul 2018 13:22:36 -0700 Mark Salyzyn salyzyn@android.com wrote:

...

From: Nick Desaulniers ndesaulniers@google.com

Switch from 0x%lx to 0x%pK to print the kernel addresses.

Fixes: CVE-2017-0630

Wait!!!! This breaks perf and trace-cmd! They require this to be able to print various strings in trace events. This file is root read only, as the CVE says.

NAK for this fix. Come up with something that doesn't break perf and trace-cmd. That will not be trivial, as the format is stored in the ring buffer with an address, then referenced directly. It also handles trace_printk() functions that simply point to the string format itself.

A fix would require having a pointer be the same that is referenced inside the kernel as well as in this file. Maybe make the format string placed in a location that doesn't leak where the rest of the kernel exists?

-- Steve

...

Signed-off-by: Mark Salyzyn salyzyn@android.com Cc: Nick Desaulniers ndesaulniers@google.com Cc: Steven Rostedt rostedt@goodmis.org Cc: Ingo Molnar mingo@redhat.com Cc: kernel-team@android.com Cc: stable@vger.kernel.org # 3.18, 4.4, 4.9, 4.14 Cc: linux-kernel@vger.kernel.org

kernel/trace/trace_printk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_printk.c b/kernel/trace/trace_printk.c index ad1d6164e946..93698023baf1 100644 --- a/kernel/trace/trace_printk.c +++ b/kernel/trace/trace_printk.c @@ -304,7 +304,7 @@ static int t_show(struct seq_file *m, void *v) if (!*fmt) return 0;

seq_printf(m, "0x%lx : "", *(unsigned long *)fmt);

seq_printf(m, "0x%pK : "", *(unsigned long *)fmt);

/* * Tabs and new lines need to be converted.

Mark Salyzyn

3:14 p.m.

On 07/25/2018 06:07 PM, Steven Rostedt wrote:

...

On Wed, 25 Jul 2018 13:22:36 -0700 Mark Salyzyn salyzyn@android.com wrote:

...
From: Nick Desaulniers ndesaulniers@google.com

Switch from 0x%lx to 0x%pK to print the kernel addresses.

Fixes: CVE-2017-0630

Wait!!!! This breaks perf and trace-cmd! They require this to be able to print various strings in trace events. This file is root read only, as the CVE says.

NAK for this fix. Come up with something that doesn't break perf and trace-cmd. That will not be trivial, as the format is stored in the ring buffer with an address, then referenced directly. It also handles trace_printk() functions that simply point to the string format itself.

A fix would require having a pointer be the same that is referenced inside the kernel as well as in this file. Maybe make the format string placed in a location that doesn't leak where the rest of the kernel exists?

-- Steve

Thank you Steve, much appreciated feedback, I have asked the security developers to keep this in mind and come up with a correct fix.

The correct fix that meets your guidelines would _not_ be suitable for stable due to the invasiveness it sounds, only for the latest will such a rework make sense. As such, the fix proposed in this patch is the only one that meets the bar for stable patch simplicity, and merely(!) needs to state that if the fix is taken, perf and trace are broken.

Posting this patch publicly on the lists, that may never be applied, may be the limit of our responsibility for a fix to stable kernel releases, to be optionally applied by vendors concerned with this CVE criteria?

-- Mark

Steven Rostedt

3:22 p.m.

On Thu, 26 Jul 2018 08:14:08 -0700 Mark Salyzyn salyzyn@android.com wrote:

...

Thank you Steve, much appreciated feedback, I have asked the security developers to keep this in mind and come up with a correct fix.

The correct fix that meets your guidelines would _not_ be suitable for stable due to the invasiveness it sounds, only for the latest will such a rework make sense. As such, the fix proposed in this patch is the only one that meets the bar for stable patch simplicity, and merely(!) needs to state that if the fix is taken, perf and trace are broken.

Posting this patch publicly on the lists, that may never be applied, may be the limit of our responsibility for a fix to stable kernel releases, to be optionally applied by vendors concerned with this CVE criteria?

The patch breaks the code it touches. It makes it useless. If you want something for stable, add a command line parameter that just disables the creation of that file. Otherwise you will break usespace and that will be a definitely NAK from Linus, and for stable itself. This is a very minor security issue, and does not justify breaking userspace applications. I would be very upset if a new stable release broke both perf and trace-cmd's ability to read certain trace events.

-- Steve

Nick Desaulniers

4:32 p.m.

On Thu, Jul 26, 2018 at 8:22 AM Steven Rostedt rostedt@goodmis.org wrote:

...

On Thu, 26 Jul 2018 08:14:08 -0700 Mark Salyzyn salyzyn@android.com wrote:

...
Thank you Steve, much appreciated feedback, I have asked the security developers to keep this in mind and come up with a correct fix.

The correct fix that meets your guidelines would _not_ be suitable for stable due to the invasiveness it sounds, only for the latest will such a rework make sense. As such, the fix proposed in this patch is the only one that meets the bar for stable patch simplicity, and merely(!) needs to state that if the fix is taken, perf and trace are broken.

Posting this patch publicly on the lists, that may never be applied, may be the limit of our responsibility for a fix to stable kernel releases, to be optionally applied by vendors concerned with this CVE criteria?

The patch breaks the code it touches. It makes it useless.

Doesn't that depend on kptr_restrict, or would it be broken if kptr_restrict was set to 0?

...

If you want something for stable, add a command line parameter that just disables the creation of that file. Otherwise you will break usespace and that will be a definitely NAK from Linus, and for stable itself. This is a very minor security issue, and does not justify breaking userspace applications. I would be very upset if a new stable release broke both perf and trace-cmd's ability to read certain trace events.

I don't disagree.

-- Thanks, ~Nick Desaulniers

Steven Rostedt

4:37 p.m.

On Thu, 26 Jul 2018 09:32:07 -0700 Nick Desaulniers ndesaulniers@google.com wrote:

...

On Thu, Jul 26, 2018 at 8:22 AM Steven Rostedt rostedt@goodmis.org wrote:

...
On Thu, 26 Jul 2018 08:14:08 -0700 Mark Salyzyn salyzyn@android.com wrote:

...
Thank you Steve, much appreciated feedback, I have asked the security developers to keep this in mind and come up with a correct fix.

The correct fix that meets your guidelines would _not_ be suitable for stable due to the invasiveness it sounds, only for the latest will such a rework make sense. As such, the fix proposed in this patch is the only one that meets the bar for stable patch simplicity, and merely(!) needs to state that if the fix is taken, perf and trace are broken.

Posting this patch publicly on the lists, that may never be applied, may be the limit of our responsibility for a fix to stable kernel releases, to be optionally applied by vendors concerned with this CVE criteria?

The patch breaks the code it touches. It makes it useless.

Doesn't that depend on kptr_restrict, or would it be broken if kptr_restrict was set to 0?

Is that what governs the output of kallsyms?

-- Steve

Nick Desaulniers

4:59 p.m.

On Thu, Jul 26, 2018 at 9:37 AM Steven Rostedt rostedt@goodmis.org wrote:

...

On Thu, 26 Jul 2018 09:32:07 -0700 Nick Desaulniers ndesaulniers@google.com wrote:

...
On Thu, Jul 26, 2018 at 8:22 AM Steven Rostedt rostedt@goodmis.org wrote:

...
On Thu, 26 Jul 2018 08:14:08 -0700 Mark Salyzyn salyzyn@android.com wrote:

...
Thank you Steve, much appreciated feedback, I have asked the security developers to keep this in mind and come up with a correct fix.

The correct fix that meets your guidelines would _not_ be suitable for stable due to the invasiveness it sounds, only for the latest will such a rework make sense. As such, the fix proposed in this patch is the only one that meets the bar for stable patch simplicity, and merely(!) needs to state that if the fix is taken, perf and trace are broken.

Posting this patch publicly on the lists, that may never be applied, may be the limit of our responsibility for a fix to stable kernel releases, to be optionally applied by vendors concerned with this CVE criteria?

The patch breaks the code it touches. It makes it useless.

Doesn't that depend on kptr_restrict, or would it be broken if kptr_restrict was set to 0?

Is that what governs the output of kallsyms?

...

From my workstation:

$ cat /proc/kallsyms

prints a bunch of zero'd out addresses, while

$ sudo !!

prints out actual addresses. Looking at kernel/kallsyms.c, it seems that there's no use of %pK, but kallsyms_show_value() switches on kptr_restrict (and additional values):

/* * We show kallsyms information even to normal users if we've enabled * kernel profiling and are explicitly not paranoid (so kptr_restrict * is clear, and sysctl_perf_event_paranoid isn't set). * * Otherwise, require CAP_SYSLOG (assuming kptr_restrict isn't set to * block even that). */ int kallsyms_show_value(void) { switch (kptr_restrict) { case 0: if (kallsyms_for_perf()) return 1; /* fallthrough */ case 1: if (has_capability_noaudit(current, CAP_SYSLOG)) return 1; /* fallthrough */ default: return 0; } }

-- Thanks, ~Nick Desaulniers

Nick Desaulniers

9:56 p.m.

On Thu, Jul 26, 2018 at 9:59 AM Nick Desaulniers ndesaulniers@google.com wrote:

...

On Thu, Jul 26, 2018 at 9:37 AM Steven Rostedt rostedt@goodmis.org wrote:

...
On Thu, 26 Jul 2018 09:32:07 -0700 Nick Desaulniers ndesaulniers@google.com wrote:

...
On Thu, Jul 26, 2018 at 8:22 AM Steven Rostedt rostedt@goodmis.org wrote:

...
On Thu, 26 Jul 2018 08:14:08 -0700 Mark Salyzyn salyzyn@android.com wrote:

...
Thank you Steve, much appreciated feedback, I have asked the security developers to keep this in mind and come up with a correct fix.

The correct fix that meets your guidelines would _not_ be suitable for stable due to the invasiveness it sounds, only for the latest will such a rework make sense. As such, the fix proposed in this patch is the only one that meets the bar for stable patch simplicity, and merely(!) needs to state that if the fix is taken, perf and trace are broken.

Posting this patch publicly on the lists, that may never be applied, may be the limit of our responsibility for a fix to stable kernel releases, to be optionally applied by vendors concerned with this CVE criteria?

The patch breaks the code it touches. It makes it useless.

Doesn't that depend on kptr_restrict, or would it be broken if kptr_restrict was set to 0?

Is that what governs the output of kallsyms?

From my workstation:

$ cat /proc/kallsyms

prints a bunch of zero'd out addresses, while

$ sudo !!

prints out actual addresses. Looking at kernel/kallsyms.c, it seems that there's no use of %pK, but kallsyms_show_value() switches on kptr_restrict (and additional values):

/*

We show kallsyms information even to normal users if we've enabled

kernel profiling and are explicitly not paranoid (so kptr_restrict

is clear, and sysctl_perf_event_paranoid isn't set).

Otherwise, require CAP_SYSLOG (assuming kptr_restrict isn't set to

block even that).

*/ int kallsyms_show_value(void) { switch (kptr_restrict) { case 0: if (kallsyms_for_perf()) return 1; /* fallthrough */ case 1: if (has_capability_noaudit(current, CAP_SYSLOG)) return 1; /* fallthrough */ default: return 0; } }

What are folks thoughts on this: 1. create function show_symbols_for_perf() that replaces kallsyms_show_value(), maybe in linux/ftrace.c (since linux/ftrace.h is included in kernel/trace/trace_printk.c and kernel/kallsyms.c). 2. use new show_symbols_for_perf() in kernel/kallsyms.c and kernel/trace/trace_printk.c

Where the implementation of show_symbols_for_perf() is kallsyms_show_value() implementation-wise (just renamed since it's no longer kallsyms specific). Does that make sense, or should I just send a patch? Does it make sense to check whether kernel/trace/trace_printk.c#t_show() should print an address based on the same checks done in kallsyms_show_value()?

-- Thanks, ~Nick Desaulniers

Greg KH

3:31 p.m.

On Thu, Jul 26, 2018 at 08:14:08AM -0700, Mark Salyzyn wrote:

...

On 07/25/2018 06:07 PM, Steven Rostedt wrote:

...
On Wed, 25 Jul 2018 13:22:36 -0700 Mark Salyzyn salyzyn@android.com wrote:

...
From: Nick Desaulniers ndesaulniers@google.com

Switch from 0x%lx to 0x%pK to print the kernel addresses.

Fixes: CVE-2017-0630

Wait!!!! This breaks perf and trace-cmd! They require this to be able to print various strings in trace events. This file is root read only, as the CVE says.

NAK for this fix. Come up with something that doesn't break perf and trace-cmd. That will not be trivial, as the format is stored in the ring buffer with an address, then referenced directly. It also handles trace_printk() functions that simply point to the string format itself.

A fix would require having a pointer be the same that is referenced inside the kernel as well as in this file. Maybe make the format string placed in a location that doesn't leak where the rest of the kernel exists?

-- Steve

Thank you Steve, much appreciated feedback, I have asked the security developers to keep this in mind and come up with a correct fix.

The correct fix that meets your guidelines would _not_ be suitable for stable due to the invasiveness it sounds, only for the latest will such a rework make sense. As such, the fix proposed in this patch is the only one that meets the bar for stable patch simplicity, and merely(!) needs to state that if the fix is taken, perf and trace are broken.

Why would I take something for the stable trees that does not match what is upstream? It feels to me that this CVE is just invalid. Yes, root can read the kernel address, does that mean it is a problem? Only if you allow unprotected users to run with root privileges :)

What exactly is the problem here in the current kernel that you are trying to solve?

thanks,

greg k-h

Nick Desaulniers

4:52 p.m.

On Thu, Jul 26, 2018 at 8:32 AM Greg KH greg@kroah.com wrote:

...

On Thu, Jul 26, 2018 at 08:14:08AM -0700, Mark Salyzyn wrote:

...
On 07/25/2018 06:07 PM, Steven Rostedt wrote:

...
On Wed, 25 Jul 2018 13:22:36 -0700 Mark Salyzyn salyzyn@android.com wrote:

...
From: Nick Desaulniers ndesaulniers@google.com

Switch from 0x%lx to 0x%pK to print the kernel addresses.

Fixes: CVE-2017-0630

Wait!!!! This breaks perf and trace-cmd! They require this to be able to print various strings in trace events. This file is root read only, as the CVE says.

NAK for this fix. Come up with something that doesn't break perf and trace-cmd. That will not be trivial, as the format is stored in the ring buffer with an address, then referenced directly. It also handles trace_printk() functions that simply point to the string format itself.

A fix would require having a pointer be the same that is referenced inside the kernel as well as in this file. Maybe make the format string placed in a location that doesn't leak where the rest of the kernel exists?

-- Steve

Thank you Steve, much appreciated feedback, I have asked the security developers to keep this in mind and come up with a correct fix.

The correct fix that meets your guidelines would _not_ be suitable for stable due to the invasiveness it sounds, only for the latest will such a rework make sense. As such, the fix proposed in this patch is the only one that meets the bar for stable patch simplicity, and merely(!) needs to state that if the fix is taken, perf and trace are broken.

Why would I take something for the stable trees that does not match what is upstream? It feels to me that this CVE is just invalid. Yes, root can read the kernel address, does that mean it is a problem? Only if you allow unprotected users to run with root privileges :)

What exactly is the problem here in the current kernel that you are trying to solve?

See the section "Kernel addresses" in Documentation/security/self-protection. IIRC, the issue is that a process may have CAP_SYSLOG but not necessarily CAP_SYS_ADMIN (so it can read dmesg, but not necessarily issue a sysctl to change kptr_restrict), get compromised and used to leak kernel addresses, which can then be used to defeat KASLR.

-- Thanks, ~Nick Desaulniers

Steven Rostedt

10:15 p.m.

On Thu, 26 Jul 2018 09:52:11 -0700 Nick Desaulniers ndesaulniers@google.com wrote:

...

See the section "Kernel addresses" in Documentation/security/self-protection. IIRC, the issue is that a process may have CAP_SYSLOG but not necessarily CAP_SYS_ADMIN (so it can read dmesg, but not necessarily issue a sysctl to change kptr_restrict), get compromised and used to leak kernel addresses, which can then be used to defeat KASLR.

But the code doesn't go to dmesg. It's only available via /sys/kernel/debug/tracing/printk_formats which is only available via root. Nobody else has access to that directory.

-- Steve

Jordan Glover

27 Jul 27 Jul

12:07 p.m.

On July 27, 2018 12:15 AM, Steven Rostedt rostedt@goodmis.org wrote:

...

On Thu, 26 Jul 2018 09:52:11 -0700 Nick Desaulniers ndesaulniers@google.com wrote:

...
See the section "Kernel addresses" in Documentation/security/self-protection. IIRC, the issue is that a process may have CAP_SYSLOG but not necessarily CAP_SYS_ADMIN (so it can read dmesg, but not necessarily issue a sysctl to change kptr_restrict), get compromised and used to leak kernel addresses, which can then be used to defeat KASLR.

But the code doesn't go to dmesg. It's only available via /sys/kernel/debug/tracing/printk_formats which is only available via root. Nobody else has access to that directory.

-- Steve

I think the point was that when we take capabilities into account the root privileges aren't unequivocal anymore. The 'root' owned process with only 'CAP_SYSLOG' shouldn't have access to /sys/kernel/debug/tracing/printk_formats

Jordan

Jann Horn

1:40 p.m.

On Fri, Jul 27, 2018 at 2:07 PM Jordan Glover Golden_Miller83@protonmail.ch wrote:

...

On July 27, 2018 12:15 AM, Steven Rostedt rostedt@goodmis.org wrote:

...
On Thu, 26 Jul 2018 09:52:11 -0700 Nick Desaulniers ndesaulniers@google.com wrote:

...
See the section "Kernel addresses" in Documentation/security/self-protection. IIRC, the issue is that a process may have CAP_SYSLOG but not necessarily CAP_SYS_ADMIN (so it can read dmesg, but not necessarily issue a sysctl to change kptr_restrict), get compromised and used to leak kernel addresses, which can then be used to defeat KASLR.

But the code doesn't go to dmesg. It's only available via /sys/kernel/debug/tracing/printk_formats which is only available via root. Nobody else has access to that directory.

-- Steve

I think the point was that when we take capabilities into account the root privileges aren't unequivocal anymore. The 'root' owned process with only 'CAP_SYSLOG' shouldn't have access to /sys/kernel/debug/tracing/printk_formats

Then they shouldn't have access to debugfs at all, right?

Steven Rostedt

1:47 p.m.

On Fri, 27 Jul 2018 15:40:32 +0200 Jann Horn jannh@google.com wrote:

...

...
...
But the code doesn't go to dmesg. It's only available via /sys/kernel/debug/tracing/printk_formats which is only available via root. Nobody else has access to that directory.

-- Steve

I think the point was that when we take capabilities into account the root privileges aren't unequivocal anymore. The 'root' owned process with only 'CAP_SYSLOG' shouldn't have access to /sys/kernel/debug/tracing/printk_formats

Then they shouldn't have access to debugfs at all, right?

That's what I'm thinking.

-- Steve

Nick Desaulniers

6:13 p.m.

On Fri, Jul 27, 2018 at 6:47 AM Steven Rostedt rostedt@goodmis.org wrote:

...

On Fri, 27 Jul 2018 15:40:32 +0200 Jann Horn jannh@google.com wrote:

...
...
...
But the code doesn't go to dmesg. It's only available via /sys/kernel/debug/tracing/printk_formats which is only available via root. Nobody else has access to that directory.

Oh, sorry, you're right. We're not printing an address to dmesg, but to a sysfs node. If you must have CAP_SYS_ADMIN to read this dir, then printk's %pK wont save you, because then you can modify kptr_restrict with sysctl.

...

...
...
I think the point was that when we take capabilities into account the root privileges aren't unequivocal anymore. The 'root' owned process with only 'CAP_SYSLOG' shouldn't have access to /sys/kernel/debug/tracing/printk_formats

Then they shouldn't have access to debugfs at all, right?

That's what I'm thinking.

I found the internal bug report (reported Jan '17, you'll have to forgive me if my memory of the issue is hazy, or if the fix used at the time wasn't perfect), which was reported against the Nexus 6.

...

From the report, it was possible to `cat

/sys/kernel/debug/tracing/printk_formats` without being root, which I can't do on my workstations much more modern kernel (Nexus 6 was 3.10). So I guess the question is what governs access to files below /sys/kernel/debug, and why was it missing from those kernels? I assume some check was added, but either not backported to 3.10 stable (or more likely not pulled in to Nexus 6's kernel through stable; Android is now in a much better place for that kind of issue).

-- Thanks, ~Nick Desaulniers

Steven Rostedt

6:31 p.m.

On Fri, 27 Jul 2018 11:13:51 -0700 Nick Desaulniers ndesaulniers@google.com wrote:

...

I found the internal bug report (reported Jan '17, you'll have to forgive me if my memory of the issue is hazy, or if the fix used at the time wasn't perfect), which was reported against the Nexus 6.

...
From the report, it was possible to `cat

/sys/kernel/debug/tracing/printk_formats` without being root, which I can't do on my workstations much more modern kernel (Nexus 6 was 3.10). So I guess the question is what governs access to files below /sys/kernel/debug, and why was it missing from those kernels? I assume some check was added, but either not backported to 3.10 stable (or more likely not pulled in to Nexus 6's kernel through stable; Android is now in a much better place for that kind of issue).

As of commit 82aceae4f0d4 ("debugfs: more tightly restrict default mount mode") /sys/kernel/debug has been default mounted as 0700 (root only). But that was introduced in 3.7. Not sure why your 3.10 kernel didn't have that. Perhaps there's another commit that fixed permissions not being inherited?

-- Steve

Mark Salyzyn

6:41 p.m.

Any system can chose to change the permissions of a sysfs node, default, DAC (and MAC) is just layers of multi-level security (or lack thereof). As well intentioned as a default DAC is in the kernel, leaking kernel addresses is still an attack surface that we want to close tightly.

For instance on Android:

chmod 0755 /sys/kernel/debug/tracing

is in the common init.rc file ...

If DAC has been adjusted at runtime to permit access to the node, I would think that if the caller does not have all the credentials/capabilities, we do want the addresses to be abstracted by the kernel.

-- Mark

On Fri, Jul 27, 2018 at 11:31 AM, Steven Rostedt rostedt@goodmis.org wrote:

...

On Fri, 27 Jul 2018 11:13:51 -0700 Nick Desaulniers ndesaulniers@google.com wrote:

...
I found the internal bug report (reported Jan '17, you'll have to forgive me if my memory of the issue is hazy, or if the fix used at the time wasn't perfect), which was reported against the Nexus 6.

...
From the report, it was possible to `cat

/sys/kernel/debug/tracing/printk_formats` without being root, which I can't do on my workstations much more modern kernel (Nexus 6 was 3.10). So I guess the question is what governs access to files below /sys/kernel/debug, and why was it missing from those kernels? I assume some check was added, but either not backported to 3.10 stable (or more likely not pulled in to Nexus 6's kernel through stable; Android is now in a much better place for that kind of issue).

As of commit 82aceae4f0d4 ("debugfs: more tightly restrict default mount mode") /sys/kernel/debug has been default mounted as 0700 (root only). But that was introduced in 3.7. Not sure why your 3.10 kernel didn't have that. Perhaps there's another commit that fixed permissions not being inherited?

-- Steve

-- You received this message because you are subscribed to the Google Groups "kernel-team" group. To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.

Jann Horn

6:47 p.m.

On Fri, Jul 27, 2018 at 8:41 PM Mark Salyzyn salyzyn@google.com wrote:

...

Any system can chose to change the permissions of a sysfs node, default, DAC (and MAC) is just layers of multi-level security (or lack thereof). As well intentioned as a default DAC is in the kernel, leaking kernel addresses is still an attack surface that we want to close tightly.

For instance on Android:
 chmod 0755 /sys/kernel/debug/tracing
is in the common init.rc file ...

If DAC has been adjusted at runtime to permit access to the node, I would think that if the caller does not have all the credentials/capabilities, we do want the addresses to be abstracted by the kernel.

If you adjust the access controls on debugfs to permit things that aren't possible upstream, you may have to add new access controls to ensure that that doesn't lead to security issues. And, in fact, you did:

walleye:/ # ls -laZ /sys/kernel/debug total 0 drwxr-xr-x 100 root root u:object_r:debugfs:s0 0 2018-07-27 18:08 . drwxr-xr-x 19 root root u:object_r:sysfs:s0 0 1970-06-04 10:30 .. [...] drwxr-xr-x 6 root root u:object_r:debugfs_tracing:s0 0 1970-01-01 01:00 tracing drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 tsens drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 tzdbg drwxr-xr-x 4 root root u:object_r:debugfs_ufs:s0 0 1970-01-01 01:00 ufshcd0 drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 usb drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 usb-pdphy drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 usb_diag drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 vmem -r--r--r-- 1 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 wakeup_sources drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 2018-07-27 18:07 wcd_spi drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 2018-07-27 18:07 wdsp0 drwxr-xr-x 2 root root u:object_r:debugfs_wlan:s0 0 2018-07-27 18:07 wlan0 drwxr-xr-x 3 root root u:object_r:debugfs:s0 0 2018-07-27 18:07 wlan_qdf

Stuff in the debugfs mount is labeled as "debugfs", with a few exceptions. And the SELinux policy locks down access to debugfs:

public/domain.te:neverallow { domain -init -vendor_init -system_server -dumpstate } debugfs:file no_rw_file_perms;

...

On Fri, Jul 27, 2018 at 11:31 AM, Steven Rostedt rostedt@goodmis.org wrote:

...
On Fri, 27 Jul 2018 11:13:51 -0700 Nick Desaulniers ndesaulniers@google.com wrote:

...
I found the internal bug report (reported Jan '17, you'll have to forgive me if my memory of the issue is hazy, or if the fix used at the time wasn't perfect), which was reported against the Nexus 6.

...
From the report, it was possible to `cat

/sys/kernel/debug/tracing/printk_formats` without being root, which I can't do on my workstations much more modern kernel (Nexus 6 was 3.10). So I guess the question is what governs access to files below /sys/kernel/debug, and why was it missing from those kernels? I assume some check was added, but either not backported to 3.10 stable (or more likely not pulled in to Nexus 6's kernel through stable; Android is now in a much better place for that kind of issue).

As of commit 82aceae4f0d4 ("debugfs: more tightly restrict default mount mode") /sys/kernel/debug has been default mounted as 0700 (root only). But that was introduced in 3.7. Not sure why your 3.10 kernel didn't have that. Perhaps there's another commit that fixed permissions not being inherited?

-- Steve

-- You received this message because you are subscribed to the Google Groups "kernel-team" group. To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.

Jann Horn

6:58 p.m.

+cc jeffv

On Fri, Jul 27, 2018 at 8:47 PM Jann Horn jannh@google.com wrote:

...

On Fri, Jul 27, 2018 at 8:41 PM Mark Salyzyn salyzyn@google.com wrote:

...
Any system can chose to change the permissions of a sysfs node, default, DAC (and MAC) is just layers of multi-level security (or lack thereof). As well intentioned as a default DAC is in the kernel, leaking kernel addresses is still an attack surface that we want to close tightly.

For instance on Android:
 chmod 0755 /sys/kernel/debug/tracing
is in the common init.rc file ...

If DAC has been adjusted at runtime to permit access to the node, I would think that if the caller does not have all the credentials/capabilities, we do want the addresses to be abstracted by the kernel.
If you adjust the access controls on debugfs to permit things that aren't possible upstream, you may have to add new access controls to ensure that that doesn't lead to security issues. And, in fact, you did:

walleye:/ # ls -laZ /sys/kernel/debug total 0 drwxr-xr-x 100 root root u:object_r:debugfs:s0 0 2018-07-27 18:08 . drwxr-xr-x 19 root root u:object_r:sysfs:s0 0 1970-06-04 10:30 .. [...] drwxr-xr-x 6 root root u:object_r:debugfs_tracing:s0 0 1970-01-01 01:00 tracing drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 tsens drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 tzdbg drwxr-xr-x 4 root root u:object_r:debugfs_ufs:s0 0 1970-01-01 01:00 ufshcd0 drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 usb drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 usb-pdphy drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 usb_diag drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 vmem -r--r--r-- 1 root root u:object_r:debugfs:s0 0 1970-01-01 01:00 wakeup_sources drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 2018-07-27 18:07 wcd_spi drwxr-xr-x 2 root root u:object_r:debugfs:s0 0 2018-07-27 18:07 wdsp0 drwxr-xr-x 2 root root u:object_r:debugfs_wlan:s0 0 2018-07-27 18:07 wlan0 drwxr-xr-x 3 root root u:object_r:debugfs:s0 0 2018-07-27 18:07 wlan_qdf

Stuff in the debugfs mount is labeled as "debugfs", with a few exceptions. And the SELinux policy locks down access to debugfs:

public/domain.te:neverallow { domain -init -vendor_init -system_server -dumpstate } debugfs:file no_rw_file_perms;

And yes, if you check from an ADB shell, you can still access the mentioned file even on walleye:

walleye:/ $ ls -lZ /sys/kernel/debug/tracing/printk_formats -r--r--r-- 1 root root u:object_r:debugfs_tracing:s0 0 1970-01-01 01:00 /sys/kernel/debug/tracing/printk_formats walleye:/ $ cat /sys/kernel/debug/tracing/printk_formats 0xffffff9c60ba04de : "Rescheduling interrupts" 0xffffff9c60ba04f6 : "Function call interrupts" 0xffffff9c60ba050f : "CPU stop interrupts" 0xffffff9c60ba0523 : "Timer broadcast interrupts" 0xffffff9c60ba053e : "IRQ work interrupts" 0xffffff9c60ba0552 : "CPU wakeup interrupts" 0xffffff9c60ba0568 : "CPU backtrace" 0xffffff9c619a6600 : "rcu_sched" 0xffffff9c619a6a40 : "rcu_bh" 0xffffff9c619a6ef8 : "rcu_preempt"

But that's only because you're coming from "shell" context, and "shell" context has explicitly been granted access to files labeled as debugfs_tracing:

# systrace support - allow atrace to run allow shell debugfs_tracing_debug:dir r_dir_perms; allow shell debugfs_tracing:dir r_dir_perms; allow shell debugfs_tracing:file rw_file_perms; allow shell debugfs_trace_marker:file getattr; allow shell atrace_exec:file rx_file_perms;

Normal apps can't access it, AFAICS.

If you think that the contents of this particular file should not be exposed, because you think that even someone with ADB access or in traceur_app context shouldn't be able to see those pointers, then you may wish to make a change to how you're labeling tracefs files.

Theodore Y. Ts'o

7:54 p.m.

More generally, stupid question, but does Android *really* need to have debugfs mounted? And if so, can we figure out what facilities that are needed and can we find some other way of meeting those requirements? - Ted

Steven Rostedt

8:11 p.m.

On Fri, 27 Jul 2018 15:54:16 -0400 "Theodore Y. Ts'o" tytso@mit.edu wrote:

...

More generally, stupid question, but does Android *really* need to have debugfs mounted? And if so, can we figure out what facilities that are needed and can we find some other way of meeting those requirements?

I do know that they have applications that use ftrace. But then again, the ftrace files are under its own tracefs file system (that just happens to be mounted when debugfs is). That said, I would assume that other Android utilities are using other debugfs files for system status and such.

-- Steve

Theodore Y. Ts'o

8:21 p.m.

On Fri, Jul 27, 2018 at 04:11:03PM -0400, Steven Rostedt wrote:

...

That said, I would assume that other Android utilities are using other debugfs files for system status and such.

Yeah, I know we probably have lost the "debugfs is only for debugging and has no place in a production system" battle, and we should just move on and assume we need to completely harden all of debugfs. But it's worth at least *asking* whether or not the use of debugfs for Android can be avoided....

- Ted

Steven Rostedt

8:53 p.m.

On Fri, 27 Jul 2018 16:21:14 -0400 "Theodore Y. Ts'o" tytso@mit.edu wrote:

...

On Fri, Jul 27, 2018 at 04:11:03PM -0400, Steven Rostedt wrote:

...
That said, I would assume that other Android utilities are using other debugfs files for system status and such.

Yeah, I know we probably have lost the "debugfs is only for debugging and has no place in a production system" battle, and we should just move on and assume we need to completely harden all of debugfs. But it's worth at least *asking* whether or not the use of debugfs for Android can be avoided....

Perhaps we should have a way to disable directories in debugfs at boot? That way, people can only have what is needed. The reason I created tracefs, is because I was asked to so that tracing utilities could be enabled without bringing in all of debugfs itself. But now it appears there's more there that makes it have to be mounted.

-- Steve

Sandeep Patil

10:05 p.m.

On Fri, Jul 27, 2018 at 04:21:14PM -0400, Theodore Y. Ts'o wrote:

...

On Fri, Jul 27, 2018 at 04:11:03PM -0400, Steven Rostedt wrote:

...
That said, I would assume that other Android utilities are using other debugfs files for system status and such.

As of today, I think a lot of information in 'bugreports' is read out of debugfs (including things like binder stats). We do have a plan to change that.

...

Yeah, I know we probably have lost the "debugfs is only for debugging and has no place in a production system" battle, and we should just move on and assume we need to completely harden all of debugfs. But it's worth at least *asking* whether or not the use of debugfs for Android can be avoided....

Indeed, I think it can. However, the problem is the last time I tried to remove this a whole bunch of things just broke. So, it wasn't about losing a functionality here and there. Agree, we need to clean up platform to not use debugfs first. Then we can expect Apps or other native processes to not rely on debugfs at all.

The work is in progress..[1]

- ssp

1] https://source.android.com/devices/architecture/kernel/modular-kernels#debug...

...

			- Ted
-- You received this message because you are subscribed to the Google Groups "kernel-team" group. To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.

Theodore Y. Ts'o

28 Jul 28 Jul

12:04 a.m.

On Fri, Jul 27, 2018 at 03:05:43PM -0700, Sandeep Patil wrote:

...

On Fri, Jul 27, 2018 at 04:21:14PM -0400, Theodore Y. Ts'o wrote:

...
On Fri, Jul 27, 2018 at 04:11:03PM -0400, Steven Rostedt wrote:

...
That said, I would assume that other Android utilities are using other debugfs files for system status and such.

As of today, I think a lot of information in 'bugreports' is read out of debugfs (including things like binder stats). We do have a plan to change that.

Hmm, if it's only for bugreports, maybe it can be only mounted when about root processes getting tricked into reading from debugfs.

...

Indeed, I think it can. However, the problem is the last time I tried to remove this a whole bunch of things just broke. So, it wasn't about losing a functionality here and there. Agree, we need to clean up platform to not use debugfs first. Then we can expect Apps or other native processes to not rely on debugfs at all.

Is Android controlling access to debugfs files via SELinux? If so, then access to debugfs can be gradually cranked down as use cases are removed.

- Ted

Sandeep Patil

30 Jul 30 Jul

2:35 p.m.

On Fri, Jul 27, 2018 at 08:04:28PM -0400, Theodore Y. Ts'o wrote:

...

On Fri, Jul 27, 2018 at 03:05:43PM -0700, Sandeep Patil wrote:

...
On Fri, Jul 27, 2018 at 04:21:14PM -0400, Theodore Y. Ts'o wrote:

...
On Fri, Jul 27, 2018 at 04:11:03PM -0400, Steven Rostedt wrote:

...
That said, I would assume that other Android utilities are using other debugfs files for system status and such.

As of today, I think a lot of information in 'bugreports' is read out of debugfs (including things like binder stats). We do have a plan to change that.

Hmm, if it's only for bugreports, maybe it can be only mounted when about root processes getting tricked into reading from debugfs.

Yes, that's an interesting idea. May be a quicker way to get ourselves rid of relying on debugfs. We need some platform cleanup to remove all debugfs accessing code that's not "debug only" first. That work has been ongoing ..

...

...
Indeed, I think it can. However, the problem is the last time I tried to remove this a whole bunch of things just broke. So, it wasn't about losing a functionality here and there. Agree, we need to clean up platform to not use debugfs first. Then we can expect Apps or other native processes to not rely on debugfs at all.

Is Android controlling access to debugfs files via SELinux? If so, then access to debugfs can be gradually cranked down as use cases are removed.

Yes, that's what we've done now, so we know where the code is that depends on it and working on moving it out. New domains aren't allowed to rely on debugfs now.

- ssp

...

				- Ted

2725

days inactive

2730

days old

linux-stable-mirror@lists.linaro.org

26 comments

participants

tags (0)

participants (9)

Greg KH
Jann Horn
Jordan Glover
Mark Salyzyn
Mark Salyzyn
Nick Desaulniers
Sandeep Patil
Steven Rostedt
Theodore Y. Ts'o