Hi there,
I found a nullptr dereference in perf subsystem and it affects at least v5.10 and v6.1 stable trees. (the same poc cannot trigger the crash in the mainline).
I fail to find the root cause the bug. All I know is that it is a race condition in the logic of moving_groups from pure software-based perf events to hardware ones. More specifically, when we add a hardware perf event to a software event group, it will trigger a "move_group" logic in perf_event_open. When the "move_group" logic happens, it will remove all existing events from the context first using `perf_remove_from_context`. And it will invoke `__perf_remove_from_context` through `event_function_call`.
Notice that `event_function_call` is defined as follow: ~~~ static void event_function_call(struct perf_event *event, event_f func, void *data) { ... func(event, NULL, ctx, data); ... } ~~~ This means `__perf_remove_from_context` will be invoked with cpuctx==NULL, which leads to invoking `event_sched_out` with cpuctx == NULL. At this moment, as long as the event is active, we are going to invoke the `if (event->attr.exclusive || !cpuctx->active_oncpu)` logic, which is a null pointer deference.
I don't know the proper way to patch this bug. So I'm asking for help.
A reproducer is attached to this email.
Best, Kyle Zeng
On Fri, Oct 06, 2023 at 10:47:57PM -0700, Kyle Zeng wrote:
Hi there,
I found a nullptr dereference in perf subsystem and it affects at least v5.10 and v6.1 stable trees. (the same poc cannot trigger the crash in the mainline).
Can you use 'git bisect' to find the patch that fixes this?
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org