Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault

23 Feb 2019


      On Fri, 22 Feb 2019 15:56:20 -0800
Alexei Starovoitov alexei.starovoitov@gmail.com wrote:
...
On Fri, Feb 22, 2019 at 03:16:35PM -0800, Linus Torvalds wrote:
...
So a kernel pointer value of 0x12345678 could be a value kernel
pointer pointing to some random kmalloc'ed kernel memory, and a user
pointer value of 0x12345678 could be a valid _user_ pointer pointing
to some user mapping.
See?
If you access a user pointer, you need to use a user accessor function
(eg "get_user()"), while if you access a kernel pointer you need to
just dereference it directly (unless you can't trust it, in which case
you need to use a _different_ accessor function).
that was clear already.
Reading 0x12345678 via probe_kernel_read can return valid value
and via get_user() can return another valid value on _some_ architectures.
...
The fact that user and kernel pointers happen to be distinct on x86-64
(right now) is just a random implementation detail.
yes and my point that people already rely on this implementation detail.
Say we implement 
int bpf_probe_read(void *val, void *unsafe_ptr)
{
  if (probe_kernel_read(val, unsafe_ptr) == OK) {
     return 0;
  } else (get_user(val, unsafe_ptr) == OK) {
     return 0;
  } else {
     *val = 0;
     return -EFAULT;
  }
}
Note that we can not use get_user() form kprobe handler. If you use it,
you have to prepare fault_handler() and make bpf itself can be aborted.
So, maybe you can use probe_user_read().
Hmm, however, it still doesn't work correctly on "some" architecture,
since whether a pointer (address) points user-space or kernel-space
depends on the context. In kprobe/bpf, the context means where you
put the probe and which pointer you record.
I think only "__user" tag tells us which one is user-space. But 
unfortunately, that "__user" tag is only for compiler or checker, not
for runtime binary. Such useful attribute goes away when we execute it.
So, even if we introduce "ustring", ftrace/perf users has to decide to use
it by themselves. As far as I know, DWARF(debuginfo) also doesn't have
that attribute. So perf-probe can not help it from debuginfo.
(Maybe if we introduce C parser, it might be detected...)
...
It will preserve existing bpf_probe_read() behavior on x86.
If x86 implementation changes tomorrow then progs that read user
addresses may start failing randomly because first probe_kernel_read()
will be returning random values from kernel memory and that's no good,
but at least we won't be breaking them today, so we have time to
introduce bpf_user_read and bpf_kernel_read and folks have time to adopt them.
I see. I think bpf also has to introduce new bpf_probe_read_user() and
keep bpf_probe_read() for kernel dataa only.
...
Imo that's much better than making current bpf_probe_read() fail
on user addresses today and not providing a non disruptive path forward.
Agreed.
Thank you,
-- 
Masami Hiramatsu mhiramat@kernel.org

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault