On 10/1/24 09:20, Vlastimil Babka wrote:
Guenter Roeck reports that the new slub kunit tests added by commit 4e1c44b3db79 ("kunit, slub: add test_kfree_rcu() and test_leak_destroy()") cause a lockup on boot on several architectures when the kunit tests are configured to be built-in and not modules.
The test_kfree_rcu test invokes kfree_rcu(), and inspection of the boot sequence showed that the runner for built-in kunit tests, kunit_run_all_tests(), is called before system_state is set to SYSTEM_RUNNING and before rcu_end_inkernel_boot() is called, so this seems a likely cause. While I was unable to reproduce the problem myself, skipping the test when the slub_kunit module is built-in should avoid the issue.
An alternative fix, moving the call to kunit_run_all_tests() slightly later in the boot sequence, was tried, but it broke tests with functions marked __init because free_initmem() had already been done by then.
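The skip decision described above can be modelled in a few lines of userspace C. This is only a sketch and not the patch itself: the types, the function names and the runtime `builtin` flag are all hypothetical. In the kernel the check would presumably be made at compile time (for example with IS_BUILTIN() on the test's config symbol) and reported through kunit_skip(). The skip reason string and the result line format are taken from the output quoted in this mail.

```c
#include <stdbool.h>
#include <stdio.h>

/* Outcome of one test case in this userspace model (hypothetical types). */
enum case_status { CASE_PASS, CASE_SKIP };

struct case_result {
	enum case_status status;
	const char *reason;	/* non-NULL only when the case was skipped */
};

/*
 * Models the decision from the patch description: if the test is built
 * into the kernel image, bail out before kfree_rcu() would be reached.
 * The runtime `builtin` argument is an assumption made for illustration;
 * the kernel would decide this at compile time.
 */
static struct case_result model_test_kfree_rcu(bool builtin)
{
	struct case_result r = { CASE_PASS, NULL };

	if (builtin) {
		r.status = CASE_SKIP;
		r.reason = "can't do kfree_rcu() when test is built-in";
		return r;
	}

	/* The real test body (allocate, kfree_rcu(), destroy cache) is omitted. */
	return r;
}

/* Formats a KTAP-style result line into buf; returns snprintf()'s result. */
static int format_ktap_line(char *buf, size_t len, int index,
			    const char *name, struct case_result r)
{
	if (r.status == CASE_SKIP)
		return snprintf(buf, len, "ok %d %s # SKIP %s",
				index, name, r.reason);
	return snprintf(buf, len, "ok %d %s", index, name);
}
```

Note that a skipped case is still reported as "ok", with the reason carried in the "# SKIP" directive, which is why the suite summary counts it under skip rather than fail.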
Fixes: 4e1c44b3db79 ("kunit, slub: add test_kfree_rcu() and test_leak_destroy()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/6fcb1252-7990-4f0d-8027-5e83f0fb9409@roeck-us.ne...
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: rcu@vger.kernel.org
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Cc: linux-kselftest@vger.kernel.org
Cc: kunit-dev@googlegroups.com
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
This results in:
    KTAP version 1
    # Subtest: slub_test
    # module: slub_kunit
    1..8
    # test_clobber_zone: pass:1 fail:0 skip:0 total:1
    ok 1 test_clobber_zone
    # test_next_pointer: pass:1 fail:0 skip:0 total:1
    ok 2 test_next_pointer
    # test_first_word: pass:1 fail:0 skip:0 total:1
    ok 3 test_first_word
    # test_clobber_50th_byte: pass:1 fail:0 skip:0 total:1
    ok 4 test_clobber_50th_byte
    # test_clobber_redzone_free: pass:1 fail:0 skip:0 total:1
    ok 5 test_clobber_redzone_free
    # test_kmalloc_redzone_access: pass:1 fail:0 skip:0 total:1
    ok 6 test_kmalloc_redzone_access
    # test_kfree_rcu: pass:0 fail:0 skip:1 total:1
    ok 7 test_kfree_rcu # SKIP can't do kfree_rcu() when test is built-in
    # test_leak_destroy: pass:1 fail:0 skip:0 total:1
    ok 8 test_leak_destroy
    # slub_test: pass:7 fail:0 skip:1 total:8
Tested-by: Guenter Roeck <linux@roeck-us.net>
Thanks,
Guenter