Hi, steve,
after some operations on file 'tracing_cpumask' and 'snapshot', trace ring buffer can no longer record anything. This series contain a patch to fix it and a constrived testcase to reproduce it.
I think the reproduction testcase is useful to help others to check if their version has this problem and verify the bugfix. However, currently in "tools/testing/selftests/ftrace/test.d", there seems no appropriate subdirectory to put this kind reproductions.
So I now put the testcase in "00basic" because it is basicly simple. Or would there be a new directory to collect simple reproduction testcases?
Zheng Yejian (2): tracing: Fix cpu buffers unavailable due to 'record_disabled' messed selftests/ftrace: Add a basic testcase for snapshot
kernel/trace/trace.c | 2 ++ .../ftrace/test.d/00basic/snapshot1.tc | 17 +++++++++++++++++ 2 files changed, 19 insertions(+) create mode 100644 tools/testing/selftests/ftrace/test.d/00basic/snapshot1.tc
Trace ring buffer can no longer record anything after executing like: # cd /sys/kernel/tracing # cat tracing_cpumask fff # echo 0 > tracing_cpumask # echo 1 > snapshot # echo fff > tracing_cpumask # echo "hello world" > trace_marker -bash: echo: write error: Bad file descriptor
The root cause is that: 1. After `echo 0 > tracing_cpumask`, 'record_disabled' of cpu buffers in 'tr->array_buffer.buffer' became 1 (see tracing_set_cpumask()); 2. After `echo 1 > snapshot`, 'tr->array_buffer.buffer' is swapped with 'tr->max_buffer.buffer', then the 'record_disabled' became 0 (see update_max_tr()); 3. After `echo fff > tracing_cpumask`, the 'record_disabled' become -1; Then array_buffer and max_buffer are both unavailable due to value of 'record_disabled' is not 0.
To fix it, enable or disable both array_buffer and max_buffer at the same time in tracing_set_cpumask().
Fixes: 71babb2705e2 ("tracing: change CPU ring buffer state from tracing_cpumask") Signed-off-by: Zheng Yejian zhengyejian1@huawei.com --- kernel/trace/trace.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index b8870078ef58..691ec4d4cd19 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5277,11 +5277,13 @@ int tracing_set_cpumask(struct trace_array *tr, !cpumask_test_cpu(cpu, tracing_cpumask_new)) { atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu); + ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu); } if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) && cpumask_test_cpu(cpu, tracing_cpumask_new)) { atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu); + ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); } } arch_spin_unlock(&tr->max_lock);
Hi Zheng,
kernel test robot noticed the following build errors:
[auto build test ERROR on linus/master] [cannot apply to rostedt-trace/for-next-urgent] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Zheng-Yejian/tracing-Fix-cpu-... base: linus/master patch link: https://lore.kernel.org/r/20230804124549.2562977-2-zhengyejian1%40huawei.com patch subject: [PATCH 1/2] tracing: Fix cpu buffers unavailable due to 'record_disabled' messed config: i386-randconfig-r025-20230731 (https://download.01.org/0day-ci/archive/20230805/202308050048.bUnVeBjV-lkp@i...) compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07) reproduce: (https://download.01.org/0day-ci/archive/20230805/202308050048.bUnVeBjV-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot lkp@intel.com | Closes: https://lore.kernel.org/oe-kbuild-all/202308050048.bUnVeBjV-lkp@intel.com/
All errors (new ones prefixed by >>):
kernel/trace/trace.c:5280:39: error: no member named 'max_buffer' in 'struct trace_array'
ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu); ~~ ^ kernel/trace/trace.c:5286:38: error: no member named 'max_buffer' in 'struct trace_array' ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); ~~ ^ 2 errors generated.
vim +5280 kernel/trace/trace.c
5260 5261 int tracing_set_cpumask(struct trace_array *tr, 5262 cpumask_var_t tracing_cpumask_new) 5263 { 5264 int cpu; 5265 5266 if (!tr) 5267 return -EINVAL; 5268 5269 local_irq_disable(); 5270 arch_spin_lock(&tr->max_lock); 5271 for_each_tracing_cpu(cpu) { 5272 /* 5273 * Increase/decrease the disabled counter if we are 5274 * about to flip a bit in the cpumask: 5275 */ 5276 if (cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5277 !cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5278 atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5279 ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
5280 ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
5281 } 5282 if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5283 cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5284 atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5285 ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu); 5286 ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); 5287 } 5288 } 5289 arch_spin_unlock(&tr->max_lock); 5290 local_irq_enable(); 5291 5292 cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new); 5293 5294 return 0; 5295 } 5296
On Sat, 5 Aug 2023 00:41:13 +0800 kernel test robot lkp@intel.com wrote:
5276 if (cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5277 !cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5278 atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5279 ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
5280 ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
The access to max_buffer requires a:
#ifdef CONFIG_TRACER_MAX_TRACE
Around them.
-- Steve
5281 } 5282 if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5283 cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5284 atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5285 ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu); 5286 ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); 5287 } 5288 } 5289 arch_spin_unlock(&tr->max_lock);
On 2023/8/5 00:51, Steven Rostedt wrote:
On Sat, 5 Aug 2023 00:41:13 +0800 kernel test robot lkp@intel.com wrote:
5276 if (cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5277 !cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5278 atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5279 ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
5280 ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
The access to max_buffer requires a:
#ifdef CONFIG_TRACER_MAX_TRACE
Around them.
-- Steve
Thanks, Steve, I'll add it in v2.
-- Zheng Yejian
5281 } 5282 if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5283 cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5284 atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5285 ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu); 5286 ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); 5287 } 5288 } 5289 arch_spin_unlock(&tr->max_lock);
Hi Zheng,
kernel test robot noticed the following build errors:
[auto build test ERROR on linus/master] [also build test ERROR on rostedt-trace/for-next v6.5-rc4 next-20230804] [cannot apply to rostedt-trace/for-next-urgent] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Zheng-Yejian/tracing-Fix-cpu-... base: linus/master patch link: https://lore.kernel.org/r/20230804124549.2562977-2-zhengyejian1%40huawei.com patch subject: [PATCH 1/2] tracing: Fix cpu buffers unavailable due to 'record_disabled' messed config: i386-defconfig (https://download.01.org/0day-ci/archive/20230805/202308050601.nsx7Z4CY-lkp@i...) compiler: gcc-12 (Debian 12.2.0-14) 12.2.0 reproduce: (https://download.01.org/0day-ci/archive/20230805/202308050601.nsx7Z4CY-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot lkp@intel.com | Closes: https://lore.kernel.org/oe-kbuild-all/202308050601.nsx7Z4CY-lkp@intel.com/
All errors (new ones prefixed by >>):
kernel/trace/trace.c: In function 'tracing_set_cpumask':
kernel/trace/trace.c:5280:60: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'?
5280 | ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu); | ^~~~~~~~~~ | array_buffer kernel/trace/trace.c:5286:59: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'? 5286 | ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); | ^~~~~~~~~~ | array_buffer
vim +5280 kernel/trace/trace.c
5260 5261 int tracing_set_cpumask(struct trace_array *tr, 5262 cpumask_var_t tracing_cpumask_new) 5263 { 5264 int cpu; 5265 5266 if (!tr) 5267 return -EINVAL; 5268 5269 local_irq_disable(); 5270 arch_spin_lock(&tr->max_lock); 5271 for_each_tracing_cpu(cpu) { 5272 /* 5273 * Increase/decrease the disabled counter if we are 5274 * about to flip a bit in the cpumask: 5275 */ 5276 if (cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5277 !cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5278 atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5279 ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
5280 ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
5281 } 5282 if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5283 cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5284 atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5285 ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu); 5286 ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); 5287 } 5288 } 5289 arch_spin_unlock(&tr->max_lock); 5290 local_irq_enable(); 5291 5292 cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new); 5293 5294 return 0; 5295 } 5296
Hi Zheng,
kernel test robot noticed the following build errors:
[auto build test ERROR on linus/master] [also build test ERROR on rostedt-trace/for-next v6.5-rc4 next-20230804] [cannot apply to rostedt-trace/for-next-urgent] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Zheng-Yejian/tracing-Fix-cpu-... base: linus/master patch link: https://lore.kernel.org/r/20230804124549.2562977-2-zhengyejian1%40huawei.com patch subject: [PATCH 1/2] tracing: Fix cpu buffers unavailable due to 'record_disabled' messed config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20230805/202308050731.PQutr3r0-lkp@i...) compiler: gcc-12 (Debian 12.2.0-14) 12.2.0 reproduce: (https://download.01.org/0day-ci/archive/20230805/202308050731.PQutr3r0-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot lkp@intel.com | Closes: https://lore.kernel.org/oe-kbuild-all/202308050731.PQutr3r0-lkp@intel.com/
All errors (new ones prefixed by >>):
kernel/trace/trace.c: In function 'tracing_set_cpumask':
kernel/trace/trace.c:5280:60: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'?
5280 | ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu); | ^~~~~~~~~~ | array_buffer kernel/trace/trace.c:5286:59: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'? 5286 | ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); | ^~~~~~~~~~ | array_buffer
vim +5280 kernel/trace/trace.c
5260 5261 int tracing_set_cpumask(struct trace_array *tr, 5262 cpumask_var_t tracing_cpumask_new) 5263 { 5264 int cpu; 5265 5266 if (!tr) 5267 return -EINVAL; 5268 5269 local_irq_disable(); 5270 arch_spin_lock(&tr->max_lock); 5271 for_each_tracing_cpu(cpu) { 5272 /* 5273 * Increase/decrease the disabled counter if we are 5274 * about to flip a bit in the cpumask: 5275 */ 5276 if (cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5277 !cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5278 atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5279 ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
5280 ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
5281 } 5282 if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5283 cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5284 atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5285 ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu); 5286 ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); 5287 } 5288 } 5289 arch_spin_unlock(&tr->max_lock); 5290 local_irq_enable(); 5291 5292 cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new); 5293 5294 return 0; 5295 } 5296
On 2023/8/5 09:15, kernel test robot wrote:
Hi Zheng,
kernel test robot noticed the following build errors:
[auto build test ERROR on linus/master] [also build test ERROR on rostedt-trace/for-next v6.5-rc4 next-20230804] [cannot apply to rostedt-trace/for-next-urgent] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Zheng-Yejian/tracing-Fix-cpu-... base: linus/master patch link: https://lore.kernel.org/r/20230804124549.2562977-2-zhengyejian1%40huawei.com patch subject: [PATCH 1/2] tracing: Fix cpu buffers unavailable due to 'record_disabled' messed config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20230805/202308050731.PQutr3r0-lkp@i...) compiler: gcc-12 (Debian 12.2.0-14) 12.2.0 reproduce: (https://download.01.org/0day-ci/archive/20230805/202308050731.PQutr3r0-lkp@i...)
If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot lkp@intel.com | Closes: https://lore.kernel.org/oe-kbuild-all/202308050731.PQutr3r0-lkp@intel.com/
All errors (new ones prefixed by >>):
kernel/trace/trace.c: In function 'tracing_set_cpumask':
kernel/trace/trace.c:5280:60: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'?
5280 | ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu); | ^~~~~~~~~~ | array_buffer kernel/trace/trace.c:5286:59: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'? 5286 | ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); | ^~~~~~~~~~ | array_buffer
Thank you, robot! I'll fix it in v2 soon.
vim +5280 kernel/trace/trace.c
5260 5261 int tracing_set_cpumask(struct trace_array *tr, 5262 cpumask_var_t tracing_cpumask_new) 5263 { 5264 int cpu; 5265 5266 if (!tr) 5267 return -EINVAL; 5268 5269 local_irq_disable(); 5270 arch_spin_lock(&tr->max_lock); 5271 for_each_tracing_cpu(cpu) { 5272 /* 5273 * Increase/decrease the disabled counter if we are 5274 * about to flip a bit in the cpumask: 5275 */ 5276 if (cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5277 !cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5278 atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5279 ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
5280 ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
5281 } 5282 if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) && 5283 cpumask_test_cpu(cpu, tracing_cpumask_new)) { 5284 atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled); 5285 ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu); 5286 ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu); 5287 } 5288 } 5289 arch_spin_unlock(&tr->max_lock); 5290 local_irq_enable(); 5291 5292 cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new); 5293 5294 return 0; 5295 } 5296
This testcase is constrived to reproduce a problem that the cpu buffers become unavailable which is due to 'record_disabled' of array_buffer and max_buffer being messed up.
Local test result after bugfix: # ./ftracetest test.d/00basic/snapshot1.tc === Ftrace unit tests === [1] Snapshot and tracing_cpumask [PASS] [2] (instance) Snapshot and tracing_cpumask [PASS]
# of passed: 2 # of failed: 0 # of unresolved: 0 # of untested: 0 # of unsupported: 0 # of xfailed: 0 # of undefined(test bug): 0
Signed-off-by: Zheng Yejian zhengyejian1@huawei.com --- .../ftrace/test.d/00basic/snapshot1.tc | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) create mode 100644 tools/testing/selftests/ftrace/test.d/00basic/snapshot1.tc
diff --git a/tools/testing/selftests/ftrace/test.d/00basic/snapshot1.tc b/tools/testing/selftests/ftrace/test.d/00basic/snapshot1.tc new file mode 100644 index 000000000000..0d9b84b3adc3 --- /dev/null +++ b/tools/testing/selftests/ftrace/test.d/00basic/snapshot1.tc @@ -0,0 +1,17 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 +# description: Snapshot and tracing_cpumask +# requires: trace_marker tracing_cpumask snapshot +# flags: instance + +# This testcase is constrived to reproduce a problem that the cpu buffers +# become unavailable which is due to 'record_disabled' of array_buffer and +# max_buffer being messed up. + +ORIG_CPUMASK=`cat tracing_cpumask` +echo 0 > tracing_cpumask +echo 1 > snapshot +echo ${ORIG_CPUMASK} > tracing_cpumask +echo "test input 1" > trace_marker +grep -q "test input 1" trace +exit 0
linux-kselftest-mirror@lists.linaro.org