On 4/28/2021 8:52 AM, George Kennedy wrote:
On 4/28/2021 12:57 AM, Greg Kroah-Hartman wrote:
On Tue, Apr 27, 2021 at 06:18:05PM -0400, George Kennedy wrote:
CC+ stable@vger.kernel.org
On 4/27/2021 6:17 PM, George Kennedy wrote:
Hello Greg,
We need the following 2 upstream commits applied to 5.4.y to fix an iBFT boot failure:
2021-03-29 rafael.j.wysocki@intel.com - 1a1c130a 2021-03-23 Rafael J. Wysocki ACPI: tables: x86: Reserve memory occupied by ACPI tables
2021-04-13 rafael.j.wysocki@intel.com - 6998a88 2021-04-13 Rafael J. Wysocki ACPI: x86: Call acpi_boot_table_init() after acpi_table_upgrade()
Currently, only the first commit (1a1c130a) is destined for 5.10 & 5.11.
The 2nd commit (6998a88) is needed as well and both commits are needed in 5.4.y.
Is this a regression (i.e. did this hardware work on older kernels?), and if so, what commit caused the problem?
These commits are already in 5.10.y, what changed in older kernels to require this to be backported?
Hello Greg,
Can the same 2 patches also be applied to 4.14.y, which one of the distros is based on?
Without the 2 patches, 4.14.y crashes during iBFT boot when KASAN is enabled.
Thank you, George
Not sure. With KASAN enabled, the bug is exposed, but only during boot, as the ACPI tables are freed and their memory is re-allocated. If KASAN is not enabled, silent data corruption occurs instead.
This is a latent bug that in upstream was more readily exposed with the following commit:
commit 7fef431be9c9ac255838a9578331567b9dba4477
Author: David Hildenbrand <david@redhat.com>
Date: Thu Oct 15 20:09:35 2020 -0700

    mm/page_alloc: place pages to tail in __free_pages_core()
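As a rough illustration of the failure mode described above, here is a toy model in Python. It is not kernel code, and every name in it (`ToyBootAllocator`, `boot`, the page layout) is hypothetical. It only shows why a firmware table sitting in memory that was never reserved can be silently overwritten by an unrelated allocation, and why reserving that memory first (the idea behind the first commit's title) avoids it.

```python
# Toy model only: not kernel code. All names here are hypothetical.

class ToyBootAllocator:
    """A tiny page allocator that hands out any page not marked reserved."""

    def __init__(self, num_pages):
        self.memory = ["free"] * num_pages   # contents of each page
        self.reserved = set()                # pages the allocator must skip
        self.allocated = set()               # pages already handed out

    def reserve(self, page):
        # Loosely analogous to reserving the table's memory early in boot.
        self.reserved.add(page)

    def alloc(self, owner):
        # Hand out the lowest-numbered page that is neither reserved nor in use.
        for page in range(len(self.memory)):
            if page not in self.reserved and page not in self.allocated:
                self.allocated.add(page)
                self.memory[page] = owner    # the new owner overwrites the page
                return page
        raise MemoryError("no free pages")


def boot(reserve_table):
    alloc = ToyBootAllocator(num_pages=4)
    table_page = 0
    alloc.memory[table_page] = "iBFT"        # firmware placed a table here
    if reserve_table:
        alloc.reserve(table_page)
    alloc.alloc("other-subsystem")           # an unrelated early allocation
    # A later init step reads the table through its saved location.
    return alloc.memory[table_page]


print(boot(reserve_table=False))  # other-subsystem
print(boot(reserve_table=True))   # iBFT
```

In the unreserved case the later reader sees another owner's data, which corresponds to the silent corruption case; KASAN turns the same stale read into a use-after-free report.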
This is the failure with latest upstream stable and KASAN enabled:
[ 22.986842] OPA Virtual Network Driver - v1.0
[ 22.988565] iBFT detected.
[ 22.989244] ==================================================================
[ 22.990233] BUG: KASAN: use-after-free in ibft_init+0x134/0xb8b
[ 22.990233] Read of size 4 at addr ffff8880be451004 by task swapper/0/1
[ 22.990233]
[ 22.990233] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.4.115-rc1.syzk #1
[ 22.990233] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 22.990233] Call Trace:
[ 22.990233] dump_stack+0xd4/0x119
[ 22.990233] ? ibft_init+0x134/0xb8b
[ 22.990233] print_address_description.constprop.6+0x20/0x220
[ 22.990233] ? ibft_init+0x134/0xb8b
[ 22.990233] ? ibft_init+0x134/0xb8b
[ 22.990233] __kasan_report.cold.9+0x37/0x77
[ 22.990233] ? ibft_init+0x134/0xb8b
[ 22.990233] kasan_report+0x14/0x20
[ 22.990233] __asan_report_load_n_noabort+0xf/0x20
[ 22.990233] ibft_init+0x134/0xb8b
[ 22.990233] ? dmi_sysfs_init+0x1a5/0x1a5
[ 22.990233] ? dmi_walk+0x72/0x90
[ 22.990233] ? ibft_check_initiator_for+0x159/0x159
[ 22.990233] ? rvt_init_port+0x110/0x110
[ 22.990233] ? ibft_check_initiator_for+0x159/0x159
[ 22.990233] do_one_initcall+0xc3/0x480
[ 22.990233] ? perf_trace_initcall_level+0x410/0x410
[ 22.990233] kernel_init_freeable+0x54c/0x66e
[ 22.990233] ? start_kernel+0x94b/0x94b
[ 22.990233] ? __switch_to_asm+0x34/0x70
[ 22.990233] ? __sanitizer_cov_trace_const_cmp1+0x1a/0x20
[ 22.990233] ? __kasan_check_write+0x14/0x20
[ 22.990233] ? rest_init+0xe6/0xe6
[ 22.990233] kernel_init+0x16/0x1ca
[ 22.990233] ? rest_init+0xe6/0xe6
[ 22.990233] ret_from_fork+0x35/0x40
[ 22.990233]
[ 22.990233] The buggy address belongs to the page:
[ 22.990233] page:ffffea0002f91440 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1
[ 22.990233] flags: 0xfffffc0000000()
[ 22.990233] raw: 000fffffc0000000 ffffea0002f914c8 ffffea0002fa4708 0000000000000000
[ 22.990233] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
[ 22.990233] page dumped because: kasan: bad access detected
[ 22.990233]
[ 22.990233] Memory state around the buggy address:
[ 22.990233] ffff8880be450f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 22.990233] ffff8880be450f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 22.990233] >ffff8880be451000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 22.990233] ^
[ 22.990233] ffff8880be451080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 22.990233] ffff8880be451100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 22.990233] ==================================================================
[ 22.990233] Disabling lock debugging due to kernel taint
[ 23.047129] Kernel panic - not syncing: panic_on_warn set ...
[ 23.048110] CPU: 3 PID: 1 Comm: swapper/0 Tainted: G B 5.4.115-rc1v5.4.114-21-gf9824ac.syzk #1
[ 23.048110] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 23.048110] Call Trace:
[ 23.048110] dump_stack+0xd4/0x119
[ 23.048110] ? ibft_init+0xc3/0xb8b
[ 23.048110] panic+0x28f/0x6ad
[ 23.048110] ? add_taint.cold.9+0x16/0x16
[ 23.048110] ? ibft_init+0x134/0xb8b
[ 23.048110] ? add_taint+0x47/0x90
[ 23.048110] ? add_taint+0x47/0x90
[ 23.048110] ? ibft_init+0x134/0xb8b
[ 23.048110] ? ibft_init+0x134/0xb8b
[ 23.048110] end_report+0x4c/0x54
[ 23.048110] __kasan_report.cold.9+0x55/0x77
[ 23.048110] ? ibft_init+0x134/0xb8b
[ 23.048110] kasan_report+0x14/0x20
[ 23.048110] __asan_report_load_n_noabort+0xf/0x20
[ 23.048110] ibft_init+0x134/0xb8b
[ 23.048110] ? dmi_sysfs_init+0x1a5/0x1a5
[ 23.048110] ? dmi_walk+0x72/0x90
[ 23.048110] ? ibft_check_initiator_for+0x159/0x159
[ 23.048110] ? rvt_init_port+0x110/0x110
[ 23.048110] ? ibft_check_initiator_for+0x159/0x159
[ 23.048110] do_one_initcall+0xc3/0x480
[ 23.048110] ? perf_trace_initcall_level+0x410/0x410
[ 23.048110] kernel_init_freeable+0x54c/0x66e
[ 23.048110] ? start_kernel+0x94b/0x94b
[ 23.048110] ? __switch_to_asm+0x34/0x70
[ 23.048110] ? __sanitizer_cov_trace_const_cmp1+0x1a/0x20
[ 23.048110] ? __kasan_check_write+0x14/0x20
[ 23.048110] ? rest_init+0xe6/0xe6
[ 23.048110] kernel_init+0x16/0x1ca
[ 23.048110] ? rest_init+0xe6/0xe6
[ 23.048110] ret_from_fork+0x35/0x40
[ 23.048110] Dumping ftrace buffer:
[ 23.048110] ---------------------------------
[ 23.048110] rb_produ-210 3.... 7555323us : ring_buffer_producer_thread: Starting ring buffer hammer
[ 23.048110] rb_produ-210 3.... 17555348us : ring_buffer_producer_thread: End ring buffer hammer
[ 23.048110] rb_produ-210 3.... 17640105us : ring_buffer_producer_thread: Running Consumer at nice: 19
[ 23.048110] rb_produ-210 3.... 17640111us : ring_buffer_producer_thread: Running Producer at nice: 19
[ 23.048110] rb_produ-210 3.... 17640113us : ring_buffer_producer_thread: WARNING!!! This test is running at lowest priority.
[ 23.048110] rb_produ-210 3.... 17640118us : ring_buffer_producer_thread: Time: 10000017 (usecs)
[ 23.048110] rb_produ-210 3.... 17640122us : ring_buffer_producer_thread: Overruns: 4460970
[ 23.048110] rb_produ-210 3.... 17640129us : ring_buffer_producer_thread: Read: 3807780 (by events)
[ 23.048110] rb_produ-210 3.... 17640134us : ring_buffer_producer_thread: Entries: 0
[ 23.048110] rb_produ-210 3.... 17640137us : ring_buffer_producer_thread: Total: 8268750
[ 23.048110] rb_produ-210 3.... 17640142us : ring_buffer_producer_thread: Missed: 0
[ 23.048110] rb_produ-210 3.... 17640146us : ring_buffer_producer_thread: Hit: 8268750
[ 23.048110] rb_produ-210 3.... 17640150us : ring_buffer_producer_thread: Entries per millisec: 826
[ 23.048110] rb_produ-210 3.... 17640154us : ring_buffer_producer_thread: 1210 ns per entry
[ 23.048110] rb_produ-210 3.... 17640157us : ring_buffer_producer_thread: Sleeping for 10 secs
[ 23.048110] ---------------------------------
2021-04-26 gregkh@linuxfoundation.org - f9824ac 2021-04-26 Greg Kroah-Hartman Linux 5.4.115-rc1
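For anyone triaging similar reports, the two header lines of the KASAN splat carry the essential facts (bug class, faulting function, access type, size, address). The following is a minimal Python sketch that pulls them out; the regular expressions and the `summarize_kasan` helper are assumptions written for this example, not part of any kernel tooling, and they are only matched against the two lines quoted from the report above.

```python
import re

# The two header lines are copied from the report above.
REPORT = """\
[ 22.990233] BUG: KASAN: use-after-free in ibft_init+0x134/0xb8b
[ 22.990233] Read of size 4 at addr ffff8880be451004 by task swapper/0/1
"""

# Hypothetical patterns for this sketch; other KASAN report variants may differ.
BUG_RE = re.compile(
    r"BUG: KASAN: (?P<kind>[\w-]+) in (?P<func>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/"
)
ACCESS_RE = re.compile(
    r"(?P<access>Read|Write) of size (?P<size>\d+) at addr (?P<addr>[0-9a-f]+)"
)


def summarize_kasan(text):
    """Return the key fields of a KASAN report header, or None if not found."""
    bug = BUG_RE.search(text)
    access = ACCESS_RE.search(text)
    if not bug or not access:
        return None
    return {
        "kind": bug.group("kind"),
        "function": bug.group("func"),
        "offset": int(bug.group("offset"), 16),
        "access": access.group("access"),
        "size": int(access.group("size")),
        "addr": int(access.group("addr"), 16),
    }


summary = summarize_kasan(REPORT)
print(summary["kind"], summary["function"], summary["access"], summary["size"])
# use-after-free ibft_init Read 4
```

The low 12 bits of the reported address are 0x004, so the faulting read is 4 bytes into a page.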
Because the failure occurs during boot, syzkaller did not expose this bug.
George
thanks,
greg k-h