[PATCH v1] trace: Fix race in trace_open and buffer resize call


Gaurav Kohli

6 Oct 2020

9:33 a.m.

The race below can occur if trace_open and a resize of the cpu buffer run in parallel on different cpus:

CPUX                                CPUY
                                    ring_buffer_resize
                                    atomic_read(&buffer->resize_disabled)
tracing_open
tracing_reset_online_cpus
ring_buffer_reset_cpu
rb_reset_cpu
                                    rb_update_pages
                                    remove/insert pages
resetting pointer

This race can cause a data abort or sometimes an infinite loop in rb_remove_pages and rb_insert_pages while checking pages for sanity.

Take buffer lock to fix this.

Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org>
Cc: stable@vger.kernel.org
---
Changes since v0:
  -Addressed Steven's review comments.

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 93ef0ab..15bf28b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4866,6 +4866,9 @@ void ring_buffer_reset_cpu(struct trace_buffer *buffer, int cpu)
 	if (!cpumask_test_cpu(cpu, buffer->cpumask))
 		return;
 
+	/* prevent another thread from changing buffer sizes */
+	mutex_lock(&buffer->mutex);
+
 	atomic_inc(&cpu_buffer->resize_disabled);
 	atomic_inc(&cpu_buffer->record_disabled);
 
@@ -4876,6 +4879,8 @@ void ring_buffer_reset_cpu(struct trace_buffer *buffer, int cpu)
 
 	atomic_dec(&cpu_buffer->record_disabled);
 	atomic_dec(&cpu_buffer->resize_disabled);
+
+	mutex_unlock(&buffer->mutex);
 }
 EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);
 
@@ -4889,6 +4894,9 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
 	struct ring_buffer_per_cpu *cpu_buffer;
 	int cpu;
 
+	/* prevent another thread from changing buffer sizes */
+	mutex_lock(&buffer->mutex);
+
 	for_each_online_buffer_cpu(buffer, cpu) {
 		cpu_buffer = buffer->buffers[cpu];
 
@@ -4907,6 +4915,8 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
 		atomic_dec(&cpu_buffer->record_disabled);
 		atomic_dec(&cpu_buffer->resize_disabled);
 	}
+
+	mutex_unlock(&buffer->mutex);
 }
 
 /**

-- Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
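
For readers following the race description in the patch above, here is a minimal single-threaded replay of that interleaving in plain C. It is a toy model, not kernel code: struct toy_buffer, toy_resize_may_start(), toy_reset() and toy_replay() are hypothetical names, the reset is collapsed into a single step, and blocking on the mutex is modeled simply by running the reset only after the resize has finished.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state; none of these names are kernel APIs. */
struct toy_buffer {
	int resize_disabled;	/* models the resize_disabled counter */
	bool resizing;		/* true while pages are being moved */
	bool overlap;		/* set if a reset ran during a resize */
};

/* CPUY, first step: the resize path samples the counter once. */
static bool toy_resize_may_start(const struct toy_buffer *b)
{
	return b->resize_disabled == 0;
}

/* CPUX: reset raises the counter, rewinds pointers, drops the counter. */
static void toy_reset(struct toy_buffer *b)
{
	b->resize_disabled++;
	if (b->resizing)
		b->overlap = true;	/* pointers reset under a live resize */
	b->resize_disabled--;
}

/*
 * Replay the interleaving from the commit message. When reset_takes_mutex
 * is true, the reset is modeled as blocked until the resize has dropped
 * the mutex, so it runs only after the page update is complete.
 * Returns true if the reset overlapped with the resize.
 */
static bool toy_replay(bool reset_takes_mutex)
{
	struct toy_buffer b = { 0, false, false };

	if (!toy_resize_may_start(&b))	/* CPUY: counter reads as zero */
		return false;
	b.resizing = true;		/* CPUY: starts removing/inserting pages */

	if (!reset_takes_mutex)
		toy_reset(&b);		/* CPUX: slips in after the check */

	b.resizing = false;		/* CPUY: finishes and releases the mutex */

	if (reset_takes_mutex)
		toy_reset(&b);		/* CPUX: was blocked, runs only now */

	return b.overlap;
}
```

Under these assumptions, toy_replay(false) reports an overlap and toy_replay(true) does not. The point of the sketch is that raising resize_disabled after the resize path has already sampled it comes too late, whereas taking the mutex that the resize path holds keeps the two paths apart.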


Denis Efremov

21 Jan 2021

2:30 p.m.

Hi,

This patch (CVE-2020-27825) was tagged with Fixes: b23d7a5f4a07a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")

I'm not an expert here but it seems like b23d7a5f4a07a only refactored ring_buffer_reset_cpu() by introducing reset_disabled_cpu_buffer() without significant changes. Hence, mutex_lock(&buffer->mutex)/mutex_unlock(&buffer->mutex) can be backported further than b23d7a5f4a07a~ and to all LTS kernels. Is b23d7a5f4a07a the actual cause of the bug?

Thanks, Denis


Steven Rostedt

7:09 p.m.

On Thu, 21 Jan 2021 17:30:40 +0300 Denis Efremov <efremov@linux.com> wrote:

> Hi,
>
> This patch (CVE-2020-27825) was tagged with Fixes: b23d7a5f4a07a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
>
> I'm not an expert here but it seems like b23d7a5f4a07a only refactored ring_buffer_reset_cpu() by introducing reset_disabled_cpu_buffer() without significant changes. Hence, mutex_lock(&buffer->mutex)/mutex_unlock(&buffer->mutex) can be backported further than b23d7a5f4a07a~ and to all LTS kernels. Is b23d7a5f4a07a the actual cause of the bug?

Ug, that looks to be a mistake. Looking back at the thread about this:

https://lore.kernel.org/linux-arm-msm/20200915141304.41fa7c30@gandalf.local....

That should have been:

Depends-on: b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")

-- Steve

Denis Efremov

8:15 p.m.

On 1/21/21 10:09 PM, Steven Rostedt wrote:

> Ug, that looks to be a mistake. Looking back at the thread about this:
>
> https://lore.kernel.org/linux-arm-msm/20200915141304.41fa7c30@gandalf.local....

I see from the link that it was planned to backport the patch to LTS kernels:

> Actually we are seeing issue in older kernel like 4.19/4.14/5.4 and there below patch was not present in stable branches: Commit b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")

The point is that it has not been backported yet, maybe because of the Fixes tag. I discovered this while trying to formalize the CVE-2020-27825 bug in cvehound https://github.com/evdenis/cvehound/blob/master/cvehound/cve/CVE-2020-27825....

I think that the backport to the 4.4+ should be something like:

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 547a3a5ac57b..2171b377bbc1 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4295,6 +4295,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
 	if (!cpumask_test_cpu(cpu, buffer->cpumask))
 		return;
 
+	mutex_lock(&buffer->mutex);
+
 	atomic_inc(&buffer->resize_disabled);
 	atomic_inc(&cpu_buffer->record_disabled);
 
@@ -4317,6 +4319,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
 
 	atomic_dec(&cpu_buffer->record_disabled);
 	atomic_dec(&buffer->resize_disabled);
+
+	mutex_unlock(&buffer->mutex);
 }
 EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);

Thanks, Denis
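
The proposed backport wraps the whole reset in the buffer-wide mutex, the same pattern as the mainline fix. The locking discipline can be exercised in user space with a small pthread model. This is only a sketch under stated assumptions: struct toy_ring, toy_ring_resize(), toy_ring_reset() and toy_ring_stress() are hypothetical names, a pthread mutex stands in for buffer->mutex, and a circular singly linked list stands in for the ring buffer pages.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; these are not kernel structures. */
struct toy_page {
	struct toy_page *next;
};

struct toy_ring {
	pthread_mutex_t mutex;		/* plays the role of buffer->mutex */
	struct toy_page *head;		/* circular list of pages */
	struct toy_page *reader;	/* pointer rewound by a reset */
	int nr_pages;
};

static struct toy_page *toy_page_alloc(void)
{
	struct toy_page *p = malloc(sizeof(*p));

	if (!p)
		abort();
	return p;
}

/* Resize path: insert or remove pages until the ring holds `want` pages. */
static void toy_ring_resize(struct toy_ring *r, int want)
{
	pthread_mutex_lock(&r->mutex);
	while (r->nr_pages < want) {
		struct toy_page *p = toy_page_alloc();

		p->next = r->head->next;
		r->head->next = p;
		r->nr_pages++;
	}
	while (r->nr_pages > want && r->nr_pages > 1) {
		struct toy_page *victim = r->head->next;

		r->head->next = victim->next;
		if (r->reader == victim)
			r->reader = r->head;
		free(victim);
		r->nr_pages--;
	}
	pthread_mutex_unlock(&r->mutex);
}

/* Reset path: rewind the reader and sanity-check the list, under the lock. */
static int toy_ring_reset(struct toy_ring *r)
{
	int seen = 1, ok;

	pthread_mutex_lock(&r->mutex);
	r->reader = r->head;
	for (struct toy_page *p = r->head->next; p != r->head; p = p->next)
		seen++;
	ok = (seen == r->nr_pages);
	pthread_mutex_unlock(&r->mutex);
	return ok;
}

struct toy_job {
	struct toy_ring *ring;
	int iters;
	int failures;
};

static void *toy_resize_thread(void *arg)
{
	struct toy_job *j = arg;

	for (int i = 0; i < j->iters; i++)
		toy_ring_resize(j->ring, (i % 2) ? 8 : 2);
	return NULL;
}

static void *toy_reset_thread(void *arg)
{
	struct toy_job *j = arg;

	for (int i = 0; i < j->iters; i++)
		if (!toy_ring_reset(j->ring))
			j->failures++;
	return NULL;
}

/* Run both paths concurrently; returns how many sanity checks failed. */
static int toy_ring_stress(int iters)
{
	struct toy_ring ring = { .nr_pages = 1 };
	struct toy_job resize = { &ring, iters, 0 }, reset = { &ring, iters, 0 };
	pthread_t a, b;

	assert(iters > 0);
	pthread_mutex_init(&ring.mutex, NULL);
	ring.head = ring.reader = toy_page_alloc();
	ring.head->next = ring.head;
	if (pthread_create(&a, NULL, toy_resize_thread, &resize) ||
	    pthread_create(&b, NULL, toy_reset_thread, &reset))
		abort();
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	toy_ring_resize(&ring, 1);	/* free all but the head page */
	free(ring.head);
	pthread_mutex_destroy(&ring.mutex);
	return reset.failures;
}
```

Because both paths take the same mutex, a call such as toy_ring_stress(2000) should report no failed sanity checks. Dropping the lock from toy_ring_reset() would let the list walk run while pages are being freed, which is the user-space analogue of the data abort or endless loop described in the original report.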

Steven Rostedt

8:37 p.m.

On Thu, 21 Jan 2021 23:15:22 +0300 Denis Efremov <efremov@linux.com> wrote:

> I think that the backport to the 4.4+ should be something like:
>
> ...

That could possibly work.

-- Steve

Greg KH

22 Jan 2021

10:59 a.m.

On Thu, Jan 21, 2021 at 03:37:32PM -0500, Steven Rostedt wrote:

> On Thu, 21 Jan 2021 23:15:22 +0300 Denis Efremov <efremov@linux.com> wrote:
> > I think that the backport to the 4.4+ should be something like:
> >
> > ...
>
> That could possibly work.

Ok, so what can I do here? Can someone resend this as a backport to the other stable kernels in this way so that I can queue it up?

thanks,

greg k-h

Gaurav Kohli

11:25 a.m.

On 1/22/2021 4:29 PM, Greg KH wrote:

> On Thu, Jan 21, 2021 at 03:37:32PM -0500, Steven Rostedt wrote:
> > On Thu, 21 Jan 2021 23:15:22 +0300 Denis Efremov <efremov@linux.com> wrote:
> > > I think that the backport to the 4.4+ should be something like:
> > >
> > > ...
> >
> > That could possibly work.

Yes, this will work, as I have tested a similar patch internally on kernel branches like 5.4/4.19.

> Ok, so what can I do here? Can someone resend this as a backport to the other stable kernels in this way so that I can queue it up?
>
> thanks,
>
> greg k-h


Steven Rostedt

2:37 p.m.

On Fri, 22 Jan 2021 16:55:29 +0530 Gaurav Kohli <gkohli@codeaurora.org> wrote:

> > That could possibly work.
>
> Yes, this will work, as I have tested a similar patch internally on kernel branches like 5.4/4.19.

Can you or Denis send a proper patch for Greg to backport? I'll review it, test it and give my ack to it, so Greg can take it without issue.

Thanks!

-- Steve


Denis Efremov

23 Jan 2021

10:49 a.m.

On 1/22/21 5:37 PM, Steven Rostedt wrote:

> Can you or Denis send a proper patch for Greg to backport? I'll review it, test it and give my ack to it, so Greg can take it without issue.

I can prepare the patch, but it will be compile-tested only on my side. Honestly, I think it's better when the patch and its backports have the same author and commit message. I also can't test the fix myself, as I don't know how to reproduce the conditions for the bug. I think it's better if Gaurav prepares this backport, unless he has reasons for me to do it or just doesn't have enough time at the moment. Gaurav, if you want to mention me somehow, you can add my Reported-by:

Thanks, Denis

Gaurav Kohli

4:33 p.m.

On 1/23/2021 4:19 PM, Denis Efremov wrote:

> On 1/22/21 5:37 PM, Steven Rostedt wrote:
> > Can you or Denis send a proper patch for Greg to backport? I'll review it, test it and give my ack to it, so Greg can take it without issue.
>
> I can prepare the patch, but it will be compile-tested only on my side. Honestly, I think it's better when the patch and its backports have the same author and commit message. I also can't test the fix myself, as I don't know how to reproduce the conditions for the bug. I think it's better if Gaurav prepares this backport, unless he has reasons for me to do it or just doesn't have enough time at the moment. Gaurav, if you want to mention me somehow, you can add my Reported-by:
>
> Thanks, Denis

Sure, I will do it. I have never posted to backport branches before. Let me check and post it.


Steven Rostedt

24 Jan 2021

3:21 a.m.

On Sat, 23 Jan 2021 22:03:27 +0530 Gaurav Kohli <gkohli@codeaurora.org> wrote:

> Sure, I will do it. I have never posted to backport branches before. Let me check and post it.

Basically you take your original patch that was in mainline (as the subject and commit message), and make it work as if you were doing the same exact fix for the stable release.

Send it to me (and Cc everyone else), and I'll give it a test too.

Thanks!

-- Steve

Gaurav Kohli

9:57 a.m.

On 1/24/2021 8:51 AM, Steven Rostedt wrote:

> On Sat, 23 Jan 2021 22:03:27 +0530 Gaurav Kohli <gkohli@codeaurora.org> wrote:
> > Sure, I will do it. I have never posted to backport branches before. Let me check and post it.
>
> Basically you take your original patch that was in mainline (as the subject and commit message), and make it work as if you were doing the same exact fix for the stable release.
>
> Send it to me (and Cc everyone else), and I'll give it a test too.

Thanks for the guidance. I have just sent it, tested on the 5.4 kernel; please review it.

> Thanks!
>
> -- Steve


Greg KH

10:05 a.m.

On Sun, Jan 24, 2021 at 03:27:25PM +0530, Gaurav Kohli wrote:

> Thanks for the guidance. I have just sent it, tested on the 5.4 kernel; please review it.

<formletter>

This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly.

</formletter>


linux-stable-mirror@lists.linaro.org


participants (4)

Denis Efremov
Gaurav Kohli
Greg KH
Steven Rostedt