June 2023 - Linux-kselftest-mirror

[PATCH v3 00/11] A minor flurry of selftest/mm fixes

by John Hubbard

Hi,

Changes since v2 [1]:

* Added a new patch (sent separately earlier) at the end, to error out
  if "make headers" has not yet been run.

* Reworked and simplified the uffd movement patch. Now it only moves
  some uffd*() routines, not all, and doesn't have to touch the Makefile
  at all. This lighter touch also allowed me to drop the "move psize(),
  pshift() into vm_utils.c" patch entirely. I expect Peter Xu will be a
  little happier with this new approach.

* Fixed the commit description for the MADV_COLLAPSE patch.

* Added more Reviewed-by tags from David Hildenbrand and Peter Xu.

[1] https://lore.kernel.org/all/20230603021558.95299-1-jhubbard@nvidia.com/

John Hubbard (11):
  selftests/mm: fix uffd-stress unused function warning
  selftests/mm: fix unused variable warnings in hugetlb-madvise.c, migration.c
  selftests/mm: fix "warning: expression which evaluates to zero..." in mlock2-tests.c
  selftests/mm: fix invocation of tests that are run via shell scripts
  selftests/mm: .gitignore: add mkdirty, va_high_addr_switch
  selftests/mm: fix two -Wformat-security warnings in uffd builds
  selftests/mm: fix a "possibly uninitialized" warning in pkey-x86.h
  selftests/mm: fix build failures due to missing MADV_COLLAPSE
  selftests/mm: move certain uffd*() routines from vm_util.c to uffd-common.c
  Documentation: kselftest: "make headers" is a prerequisite
  selftests: error out if kernel header files are not yet built

 Documentation/dev-tools/kselftest.rst        |  1 +
 tools/testing/selftests/lib.mk               | 36 +++++++++++-
 tools/testing/selftests/mm/.gitignore        |  2 +
 tools/testing/selftests/mm/cow.c             |  7 ---
 tools/testing/selftests/mm/hugetlb-madvise.c |  8 ++-
 tools/testing/selftests/mm/khugepaged.c      | 10 ----
 tools/testing/selftests/mm/migration.c       |  5 +-
 tools/testing/selftests/mm/mlock2-tests.c    |  1 -
 tools/testing/selftests/mm/pkey-x86.h        |  2 +-
 tools/testing/selftests/mm/run_vmtests.sh    |  6 +-
 tools/testing/selftests/mm/uffd-common.c     | 59 ++++++++++++++++++++
 tools/testing/selftests/mm/uffd-common.h     |  5 ++
 tools/testing/selftests/mm/uffd-stress.c     | 10 ----
 tools/testing/selftests/mm/uffd-unit-tests.c | 16 ++----
 tools/testing/selftests/mm/vm_util.c         | 59 --------------------
 tools/testing/selftests/mm/vm_util.h         | 14 +++--
 16 files changed, 130 insertions(+), 111 deletions(-)

base-commit: f8dba31b0a826e691949cd4fdfa5c30defaac8c5
--
2.40.1

1 year, 12 months


[PATCH v1 0/9] x86/resctrl: Use soft RMIDs for reliable MBM on AMD

by Peter Newman

Hi Reinette, Fenghua,

This series introduces a new mount option enabling an alternate mode for
MBM to work around an issue on present AMD implementations and any other
resctrl implementation where there are more RMIDs (or equivalent) than
hardware counters.

The L3 External Bandwidth Monitoring feature of the AMD PQoS
extension[1] only guarantees that RMIDs currently assigned to a
processor will be tracked by hardware. The counters of any other RMIDs
which are no longer being tracked will be reset to zero. The MBM event
counters return "Unavailable" to indicate when this has happened.

An interval for effectively measuring memory bandwidth typically needs
to be multiple seconds long. In Google's workloads, it is not feasible
to bound the number of jobs with different RMIDs which will run in a
cache domain over any period of time. Consequently, on a
fully-committed system where all RMIDs are allocated, few groups'
counters return non-zero values.

To demonstrate the underlying issue, the first patch provides a test
case in tools/testing/selftests/resctrl/test_rmids.sh.

On an AMD EPYC 7B12 64-Core Processor with the default behavior:

 # ./test_rmids.sh
 Created 255 monitoring groups.
 g1: mbm_total_bytes: Unavailable -> Unavailable (FAIL)
 g2: mbm_total_bytes: Unavailable -> Unavailable (FAIL)
 g3: mbm_total_bytes: Unavailable -> Unavailable (FAIL)
[..]
 g238: mbm_total_bytes: Unavailable -> Unavailable (FAIL)
 g239: mbm_total_bytes: Unavailable -> Unavailable (FAIL)
 g240: mbm_total_bytes: Unavailable -> Unavailable (FAIL)
 g241: mbm_total_bytes: Unavailable -> 660497472
 g242: mbm_total_bytes: Unavailable -> 660793344
 g243: mbm_total_bytes: Unavailable -> 660477312
 g244: mbm_total_bytes: Unavailable -> 660495360
 g245: mbm_total_bytes: Unavailable -> 660775360
 g246: mbm_total_bytes: Unavailable -> 660645504
 g247: mbm_total_bytes: Unavailable -> 660696128
 g248: mbm_total_bytes: Unavailable -> 660605248
 g249: mbm_total_bytes: Unavailable -> 660681280
 g250: mbm_total_bytes: Unavailable -> 660834240
 g251: mbm_total_bytes: Unavailable -> 660440064
 g252: mbm_total_bytes: Unavailable -> 660501504
 g253: mbm_total_bytes: Unavailable -> 660590720
 g254: mbm_total_bytes: Unavailable -> 660548352
 g255: mbm_total_bytes: Unavailable -> 660607296

 255 groups, 0 returned counts in first pass, 15 in second
 successfully measured bandwidth from 15/255 groups

To compare, here is the output from an Intel(R) Xeon(R) Platinum 8173M
CPU:

 # ./test_rmids.sh
 Created 223 monitoring groups.
 g1: mbm_total_bytes: 0 -> 606126080
 g2: mbm_total_bytes: 0 -> 613236736
 g3: mbm_total_bytes: 0 -> 610254848
[..]
 g221: mbm_total_bytes: 0 -> 584679424
 g222: mbm_total_bytes: 0 -> 588808192
 g223: mbm_total_bytes: 0 -> 587317248

 223 groups, 223 returned counts in first pass, 223 in second
 successfully measured bandwidth from 223/223 groups

To make better use of the hardware in such a use case, this patchset
introduces a "soft" RMID implementation, where each CPU is permanently
assigned a "hard" RMID. On context switches which change the current
soft RMID, the difference between each CPU's current event counts and
most recent counts is added to the totals for the current or outgoing
soft RMID.

This technique does not work for cache occupancy counters, so this patch
series disables cache occupancy events when soft RMIDs are enabled.

This series adds the "mbm_soft_rmid" mount option to allow users to opt
in to the functionality when they deem it helpful.

When the same system from the earlier AMD example enables the
mbm_soft_rmid mount option:

 # ./test_rmids.sh
 Created 255 monitoring groups.
 g1: mbm_total_bytes: 0 -> 686560576
 g2: mbm_total_bytes: 0 -> 668204416
[..]
 g252: mbm_total_bytes: 0 -> 672651200
 g253: mbm_total_bytes: 0 -> 666956800
 g254: mbm_total_bytes: 0 -> 665917056
 g255: mbm_total_bytes: 0 -> 671049600

 255 groups, 255 returned counts in first pass, 255 in second
 successfully measured bandwidth from 255/255 groups

(patches are based on tip/master)

[1] https://www.amd.com/system/files/TechDocs/56375_1.03_PUB.pdf

Peter Newman (8):
  selftests/resctrl: Verify all RMIDs count together
  x86/resctrl: Add resctrl_mbm_flush_cpu() to collect CPUs' MBM events
  x86/resctrl: Flush MBM event counts on soft RMID change
  x86/resctrl: Call mon_event_count() directly for soft RMIDs
  x86/resctrl: Create soft RMID version of __mon_event_count()
  x86/resctrl: Assign HW RMIDs to CPUs for soft RMID
  x86/resctrl: Use mbm_update() to push soft RMID counts
  x86/resctrl: Add mount option to enable soft RMID

Stephane Eranian (1):
  x86/resctrl: Hold a spinlock in __rmid_read() on AMD

 arch/x86/include/asm/resctrl.h                |  29 +++-
 arch/x86/kernel/cpu/resctrl/core.c            |  80 ++++++++-
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c     |   9 +-
 arch/x86/kernel/cpu/resctrl/internal.h        |  19 ++-
 arch/x86/kernel/cpu/resctrl/monitor.c         | 158 +++++++++++++++++-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c        |  52 ++++++
 tools/testing/selftests/resctrl/test_rmids.sh |  93 +++++++++++
 7 files changed, 425 insertions(+), 15 deletions(-)
 create mode 100755 tools/testing/selftests/resctrl/test_rmids.sh

base-commit: dd806e2f030e57dd5bac973372aa252b6c175b73
--
2.40.0.634.g4ca3ef3211-goog
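
The soft RMID accounting described in this cover letter can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the series' implementation: the class name SoftRmidModel, its methods, and the simulated per-CPU byte counters are all hypothetical, and the real patches do this work inside the kernel's context-switch path.

```python
from collections import defaultdict


class SoftRmidModel:
    """Toy model: one hardware counter per CPU, totals kept per soft RMID."""

    def __init__(self, num_cpus):
        # Simulated monotonically increasing hardware byte counters,
        # one per CPU (each CPU permanently owns one "hard" RMID).
        self.hw_count = [0] * num_cpus
        # Last hardware value that was already credited to a soft RMID.
        self.last_seen = [0] * num_cpus
        # Soft RMID currently running on each CPU (None means idle).
        self.current = [None] * num_cpus
        # Accumulated bytes per soft RMID.
        self.totals = defaultdict(int)

    def traffic(self, cpu, nbytes):
        """Simulate memory traffic observed by this CPU's hardware counter."""
        self.hw_count[cpu] += nbytes

    def flush(self, cpu):
        """Credit bytes counted since the last flush to the current soft RMID."""
        delta = self.hw_count[cpu] - self.last_seen[cpu]
        self.last_seen[cpu] = self.hw_count[cpu]
        if self.current[cpu] is not None:
            self.totals[self.current[cpu]] += delta

    def context_switch(self, cpu, new_soft_rmid):
        """On a soft RMID change, flush for the outgoing RMID, then switch."""
        if new_soft_rmid != self.current[cpu]:
            self.flush(cpu)
            self.current[cpu] = new_soft_rmid

    def read(self, soft_rmid):
        """Flush every CPU so the total includes recent counts, then read."""
        for cpu in range(len(self.hw_count)):
            self.flush(cpu)
        return self.totals[soft_rmid]
```

In this model the number of soft RMIDs is unbounded while the number of hardware counters stays equal to the number of CPUs, which is the property the cover letter relies on. It also hints at why the approach does not carry over to cache occupancy: a per-CPU delta has no meaning for a value that is not a running byte count.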

2 years


selftests: gpio: crash on arm64

by Naresh Kamboju

The following kernel warnings and crash were noticed on an arm64 Rpi4 device while running selftests: gpio on the Linux mainline 6.3.0-rc1 kernel and on Linux next.

Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>

Please refer to the test log links below for the detailed test plan and kernel crash logs.

The problem is reproducible on arm64 juno-r2, Rpi4, Qualcomm dragonboard 410c and qemu-arm64.

Test log:
-----------
kselftest: Running tests in gpio
TAP version 13
1..2
# selftests: gpio: gpio-mockup.sh
# 1. Module load tests
[ 61.176149] =============================================================================
[ 61.176802]
[ 61.176807] ======================================================
[ 61.176809] WARNING: possible circular locking dependency detected
[ 61.176811] 6.3.0-rc1-next-20230307 #1 Not tainted
[ 61.176814] ------------------------------------------------------
[ 61.176816] modprobe/510 is trying to acquire lock:
[ 61.176818] ffff80000b2284e8 (console_owner){..-.}-{0:0}, at: console_flush_all (kernel/printk/printk.c:2879 kernel/printk/printk.c:2942)
[ 61.176846]
[ 61.176846] but task is already holding lock:
[ 61.176848] ffff000040000698 (&n->list_lock){-.-.}-{2:2}, at: get_partial_node.part.0 (mm/slub.c:2271)
[ 61.176861]
[ 61.176861] which lock already depends on the new lock.
[ 61.176861] [ 61.176863] [ 61.176863] the existing dependency chain (in reverse order) is: [ 61.176864] [ 61.176864] -> #2 (&n->list_lock){-.-.}-{2:2}: [ 61.176871] lock_acquire (kernel/locking/lockdep.c:5673) [ 61.176879] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) [ 61.176885] get_partial_node.part.0 (mm/slub.c:2271) [ 61.176890] ___slab_alloc (mm/slub.c:2268 mm/slub.c:2386 mm/slub.c:3188) [ 61.176894] __slab_alloc.constprop.0 (mm/slub.c:3292) [ 61.176899] __kmem_cache_alloc_node (mm/slub.c:3345 mm/slub.c:3442 mm/slub.c:3491) [ 61.176903] __kmalloc (mm/slab_common.c:968 mm/slab_common.c:980) [ 61.176908] tty_buffer_alloc (drivers/tty/tty_buffer.c:182) [ 61.176914] __tty_buffer_request_room (drivers/tty/tty_buffer.c:279) [ 61.176919] __tty_insert_flip_char (drivers/tty/tty_buffer.c:398) [ 61.176924] uart_insert_char (drivers/tty/serial/serial_core.c:3341) [ 61.176929] pl011_fifo_to_tty.isra.0 (drivers/tty/serial/amba-pl011.c:314) [ 61.176934] pl011_int (include/linux/spinlock.h:390 drivers/tty/serial/amba-pl011.c:1396 drivers/tty/serial/amba-pl011.c:1571) [ 61.176937] __handle_irq_event_percpu (kernel/irq/handle.c:158) [ 61.176941] handle_irq_event (kernel/irq/handle.c:193 kernel/irq/handle.c:210) [ 61.176944] handle_fasteoi_irq (kernel/irq/chip.c:716) [ 61.176950] generic_handle_domain_irq (kernel/irq/irqdesc.c:652 kernel/irq/irqdesc.c:707) [ 61.176953] gic_handle_irq (arch/arm64/include/asm/io.h:75 include/asm-generic/io.h:335 drivers/irqchip/irq-gic.c:344) [ 61.176958] call_on_irq_stack (arch/arm64/kernel/entry.S:905) [ 61.176962] do_interrupt_handler (arch/arm64/kernel/entry-common.c:274) [ 61.176968] el1_interrupt (arch/arm64/kernel/entry-common.c:472 arch/arm64/kernel/entry-common.c:486) [ 61.176971] el1h_64_irq_handler (arch/arm64/kernel/entry-common.c:492) [ 61.176975] el1h_64_irq (arch/arm64/kernel/entry.S:587) [ 61.176978] __kmem_cache_alloc_node (mm/slub.c:3490) [ 61.176983] kmalloc_trace 
(mm/slab_common.c:1064 (discriminator 4)) [ 61.176986] inet6_dump_fib (net/ipv6/ip6_fib.c:657) [ 61.176991] rtnl_dump_all (net/core/rtnetlink.c:3964) [ 61.176997] netlink_dump (net/netlink/af_netlink.c:2296) [ 61.177004] netlink_recvmsg (net/netlink/af_netlink.c:2024) [ 61.177009] ____sys_recvmsg (net/socket.c:1015 net/socket.c:1036 net/socket.c:2723) [ 61.177014] ___sys_recvmsg (net/socket.c:2765) [ 61.177019] __sys_recvmsg (include/linux/file.h:31 net/socket.c:2797) [ 61.177025] __arm64_sys_recvmsg (net/socket.c:2802) [ 61.177030] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:57) [ 61.177037] el0_svc_common.constprop.0 (arch/arm64/kernel/syscall.c:149) [ 61.177043] do_el0_svc (arch/arm64/kernel/syscall.c:194) [ 61.177049] el0_svc (arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:638) [ 61.177052] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:656) [ 61.177055] el0t_64_sync (arch/arm64/kernel/entry.S:591) [ 61.177058] [ 61.177058] -> #1 (&port_lock_key){-.-.}-{2:2}: [ 61.177065] lock_acquire (kernel/locking/lockdep.c:5673) [ 61.177071] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) [ 61.177074] serial8250_console_write (drivers/tty/serial/8250/8250_port.c:3394) [ 61.177082] univ8250_console_write (drivers/tty/serial/8250/8250_core.c:585) [ 61.177087] console_flush_all (kernel/printk/printk.c:2888 kernel/printk/printk.c:2942) [ 61.177093] console_unlock.part.0 (kernel/printk/printk.c:3017) [ 61.177098] vprintk_emit (kernel/printk/printk.c:2317) [ 61.177104] vprintk_default (kernel/printk/printk.c:2328) [ 61.177110] vprintk (kernel/printk/printk_safe.c:50) [ 61.177116] _printk (kernel/printk/printk.c:2341) [ 61.177121] register_console (kernel/printk/printk.c:3468) [ 61.177126] uart_add_one_port (drivers/tty/serial/serial_core.c:2579 drivers/tty/serial/serial_core.c:3100) [ 61.177130] 
serial8250_register_8250_port (drivers/tty/serial/8250/8250_core.c:1093) [ 61.177135] bcm2835aux_serial_probe (drivers/tty/serial/8250/8250_bcm2835aux.c:184) [ 61.177141] platform_probe (drivers/base/platform.c:1405) [ 61.177148] really_probe (drivers/base/dd.c:552 drivers/base/dd.c:631) [ 61.177152] __driver_probe_device (drivers/base/dd.c:768) [ 61.177157] driver_probe_device (drivers/base/dd.c:798) [ 61.177161] __driver_attach (drivers/base/dd.c:1185) [ 61.177166] bus_for_each_dev (drivers/base/bus.c:368) [ 61.177170] driver_attach (drivers/base/dd.c:1202) [ 61.177173] bus_add_driver (drivers/base/bus.c:673) [ 61.177177] driver_register (drivers/base/driver.c:246) [ 61.177182] __platform_driver_register (drivers/base/platform.c:868) [ 61.177188] bcm2835aux_serial_driver_init (drivers/tty/serial/8250/8250_bcm2835aux.c:233) [ 61.177195] do_one_initcall (init/main.c:1306) [ 61.177199] kernel_init_freeable (init/main.c:1378 init/main.c:1395 init/main.c:1414 init/main.c:1634) [ 61.177207] kernel_init (init/main.c:1524) [ 61.177212] ret_from_fork (arch/arm64/kernel/entry.S:871) [ 61.177216] [ 61.177216] -> #0 (console_owner){..-.}-{0:0}: [ 61.177222] __lock_acquire (kernel/locking/lockdep.c:3099 kernel/locking/lockdep.c:3217 kernel/locking/lockdep.c:3832 kernel/locking/lockdep.c:5056) [ 61.177228] lock_acquire.part.0 (arch/arm64/include/asm/percpu.h:40 kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5671) [ 61.177233] lock_acquire (kernel/locking/lockdep.c:5673) [ 61.177238] console_flush_all (kernel/printk/printk.c:2883 kernel/printk/printk.c:2942) [ 61.177244] console_unlock.part.0 (kernel/printk/printk.c:3017) [ 61.177250] vprintk_emit (kernel/printk/printk.c:2317) [ 61.177255] vprintk_default (kernel/printk/printk.c:2328) [ 61.177261] vprintk (kernel/printk/printk_safe.c:50) [ 61.177267] _printk (kernel/printk/printk.c:2341) [ 61.177271] slab_bug (mm/slub.c:892) [ 61.177274] check_bytes_and_report (mm/slub.c:1054) [ 61.177279] check_object (mm/slub.c:1196 
(discriminator 2)) [ 61.177283] alloc_debug_processing (mm/slub.c:1415 mm/slub.c:1425) [ 61.177287] get_partial_node.part.0 (mm/slub.c:2146 mm/slub.c:2279) [ 61.177291] ___slab_alloc (mm/slub.c:2268 mm/slub.c:2386 mm/slub.c:3188) [ 61.177295] __slab_alloc.constprop.0 (mm/slub.c:3292) [ 61.177300] __kmem_cache_alloc_node (mm/slub.c:3345 mm/slub.c:3442 mm/slub.c:3491) [ 61.177304] kmalloc_trace (mm/slab_common.c:1064 (discriminator 4)) [ 61.177308] device_add (drivers/base/core.c:3436 drivers/base/core.c:3486) [ 61.177311] platform_device_add (drivers/base/platform.c:717) [ 61.177317] platform_device_register_full (drivers/base/platform.c:844) [ 61.177323] gpio_mockup_register_chip+0x1ec/0x2b8 gpio_mockup [ 61.177342] gpio_mockup_init+0xf0/0xd40 gpio_mockup [ 61.177352] do_one_initcall (init/main.c:1306) [ 61.177356] do_init_module (kernel/module/main.c:2457) [ 61.177363] load_module (kernel/module/main.c:2859) [ 61.177369] __do_sys_finit_module (kernel/module/main.c:2961) [ 61.177375] __arm64_sys_finit_module (kernel/module/main.c:2928) [ 61.177381] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:57) [ 61.177387] el0_svc_common.constprop.0 (arch/arm64/kernel/syscall.c:149) [ 61.177393] do_el0_svc (arch/arm64/kernel/syscall.c:194) [ 61.177398] el0_svc (arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:638) [ 61.177402] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:656) [ 61.177405] el0t_64_sync (arch/arm64/kernel/entry.S:591) [ 61.177408] [ 61.177408] other info that might help us debug this: [ 61.177408] [ 61.177410] Chain exists of: [ 61.177410] console_owner --> &port_lock_key --> &n->list_lock [ 61.177410] [ 61.177417] Possible unsafe locking scenario: [ 61.177417] [ 61.177418] CPU0 CPU1 [ 61.177419] ---- ---- [ 61.177420] lock(&n->list_lock); [ 61.177423] lock(&port_lock_key); [ 61.177426] lock(&n->list_lock); [ 61.177429] 
lock(console_owner); [ 61.177432] [ 61.177432] *** DEADLOCK *** [ 61.177432] [ 61.177434] 3 locks held by modprobe/510: [ 61.177436] #0: ffff000040000698 (&n->list_lock){-.-.}-{2:2}, at: get_partial_node.part.0 (mm/slub.c:2271) [ 61.177448] #1: ffff80000b227f18 (console_lock){+.+.}-{0:0}, at: vprintk_emit (kernel/printk/printk.c:1936 kernel/printk/printk.c:2315) [ 61.177460] #2: ffff80000b228388 (console_srcu){....}-{0:0}, at: console_flush_all (include/linux/srcu.h:200 kernel/printk/printk.c:290 kernel/printk/printk.c:2934) [ 61.177471] [ 61.177471] stack backtrace: [ 61.177474] CPU: 3 PID: 510 Comm: modprobe Not tainted 6.3.0-rc1-next-20230307 #1 [ 61.177479] Hardware name: Raspberry Pi 4 Model B (DT) [ 61.177482] Call trace: [ 61.177483] dump_backtrace (arch/arm64/kernel/stacktrace.c:160) [ 61.177487] show_stack (arch/arm64/kernel/stacktrace.c:167) [ 61.177490] dump_stack_lvl (lib/dump_stack.c:107) [ 61.177498] dump_stack (lib/dump_stack.c:114) [ 61.177504] print_circular_bug (kernel/locking/lockdep.c:2057) [ 61.177509] check_noncircular (kernel/locking/lockdep.c:2181) [ 61.177514] __lock_acquire (kernel/locking/lockdep.c:3099 kernel/locking/lockdep.c:3217 kernel/locking/lockdep.c:3832 kernel/locking/lockdep.c:5056) [ 61.177520] lock_acquire.part.0 (arch/arm64/include/asm/percpu.h:40 kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5671) [ 61.177525] lock_acquire (kernel/locking/lockdep.c:5673) [ 61.177530] console_flush_all (kernel/printk/printk.c:2883 kernel/printk/printk.c:2942) [ 61.177536] console_unlock.part.0 (kernel/printk/printk.c:3017) [ 61.177542] vprintk_emit (kernel/printk/printk.c:2317) [ 61.177547] vprintk_default (kernel/printk/printk.c:2328) [ 61.177553] vprintk (kernel/printk/printk_safe.c:50) [ 61.177559] _printk (kernel/printk/printk.c:2341) [ 61.177564] slab_bug (mm/slub.c:892) [ 61.177567] check_bytes_and_report (mm/slub.c:1054) [ 61.177571] check_object (mm/slub.c:1196 (discriminator 2)) [ 61.177575] alloc_debug_processing 
(mm/slub.c:1415 mm/slub.c:1425) [ 61.177579] get_partial_node.part.0 (mm/slub.c:2146 mm/slub.c:2279) [ 61.177583] ___slab_alloc (mm/slub.c:2268 mm/slub.c:2386 mm/slub.c:3188) [ 61.177587] __slab_alloc.constprop.0 (mm/slub.c:3292) [ 61.177592] __kmem_cache_alloc_node (mm/slub.c:3345 mm/slub.c:3442 mm/slub.c:3491) [ 61.177596] kmalloc_trace (mm/slab_common.c:1064 (discriminator 4)) [ 61.177600] device_add (drivers/base/core.c:3436 drivers/base/core.c:3486) [ 61.177603] platform_device_add (drivers/base/platform.c:717) [ 61.177609] platform_device_register_full (drivers/base/platform.c:844) [ 61.177615] gpio_mockup_register_chip+0x1ec/0x2b8 gpio_mockup [ 61.177625] gpio_mockup_init+0xf0/0xd40 gpio_mockup [ 61.177634] do_one_initcall (init/main.c:1306) [ 61.177638] do_init_module (kernel/module/main.c:2457) [ 61.177644] load_module (kernel/module/main.c:2859) [ 61.177650] __do_sys_finit_module (kernel/module/main.c:2961) [ 61.177656] __arm64_sys_finit_module (kernel/module/main.c:2928) [ 61.177662] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:57) [ 61.177668] el0_svc_common.constprop.0 (arch/arm64/kernel/syscall.c:149) [ 61.177674] do_el0_svc (arch/arm64/kernel/syscall.c:194) [ 61.177680] el0_svc (arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:638) [ 61.177683] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:656) [ 61.177686] el0t_64_sync (arch/arm64/kernel/entry.S:591) [ 62.011685] BUG kmalloc-512 (Not tainted): Poison overwritten [ 62.017513] ----------------------------------------------------------------------------- [ 62.017513] [ 62.027300] 0xffff00004ecb7a38-0xffff00004ecb7a47 @offset=31288. 
First byte 0x6a instead of 0x6b [ 62.036210] Allocated in swnode_register+0x40/0x218 age=808 cpu=3 pid=386 [ 62.043101] __kmem_cache_alloc_node (mm/slub.c:3345 mm/slub.c:3442 mm/slub.c:3491) [ 62.047784] kmalloc_trace (mm/slab_common.c:1064 (discriminator 4)) [ 62.051406] swnode_register (drivers/base/swnode.c:776) [ 62.055293] fwnode_create_software_node (drivers/base/swnode.c:934 (discriminator 4)) [ 62.060238] gpio_mockup_register_chip+0x1c4/0x2b8 gpio_mockup [ 62.066337] gpio_mockup_init+0xf0/0xd40 gpio_mockup [ 62.071551] do_one_initcall (init/main.c:1306) [ 62.075437] do_init_module (kernel/module/main.c:2457) [ 62.079238] load_module (kernel/module/main.c:2859) [ 62.083037] __do_sys_finit_module (kernel/module/main.c:2961) [ 62.087455] __arm64_sys_finit_module (kernel/module/main.c:2928) [ 62.092048] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:57) [ 62.095848] el0_svc_common.constprop.0 (arch/arm64/kernel/syscall.c:149) [ 62.100793] do_el0_svc (arch/arm64/kernel/syscall.c:194) [ 62.104151] el0_svc (arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:638) [ 62.107244] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:656) [ 62.111570] Freed in software_node_release+0xdc/0x108 age=632 cpu=0 pid=428 [ 62.118633] __kmem_cache_free (mm/slub.c:3732 mm/slub.c:3788 mm/slub.c:3800) [ 62.122784] kfree (mm/slab_common.c:1020) [ 62.125788] software_node_release (drivers/base/swnode.c:761) [ 62.130204] kobject_put (lib/kobject.c:685 lib/kobject.c:712 include/linux/kref.h:65 lib/kobject.c:729) [ 62.133739] software_node_notify_remove (drivers/base/swnode.c:1093) [ 62.138597] device_del (drivers/base/core.c:2265 drivers/base/core.c:3778) [ 62.142134] platform_device_del.part.0 (drivers/base/platform.c:753) [ 62.146903] platform_device_unregister (drivers/base/platform.c:551 drivers/base/platform.c:794) [ 62.151672] 
gpio_mockup_exit+0x54/0x280 gpio_mockup [ 62.156888] __arm64_sys_delete_module (kernel/module/main.c:756 kernel/module/main.c:698 kernel/module/main.c:698) [ 62.161745] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:57) [ 62.165545] el0_svc_common.constprop.0 (arch/arm64/kernel/syscall.c:149) [ 62.170490] do_el0_svc (arch/arm64/kernel/syscall.c:194) [ 62.173850] el0_svc (arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:638) [ 62.176941] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:656) [ 62.181267] el0t_64_sync (arch/arm64/kernel/entry.S:591) [ 62.184975] Slab 0xfffffc00013b2c00 objects=21 used=7 fp=0xffff00004ecb7400 flags=0x7fffc0000010200(slab|head|node=0|zone=1|lastcpupid=0xffff) [ 62.197943] Object 0xffff00004ecb7a00 @offset=31232 fp=0xffff00004ecb7400 [ 62.197943] [ 62.206325] Redzone ffff00004ecb7800: ... [ 63.089597] CPU: 3 PID: 510 Comm: modprobe Not tainted 6.3.0-rc1-next-20230307 #1 [ 63.097186] Hardware name: Raspberry Pi 4 Model B (DT) [ 63.102392] Call trace: [ 63.104865] dump_backtrace (arch/arm64/kernel/stacktrace.c:160) [ 63.108665] show_stack (arch/arm64/kernel/stacktrace.c:167) [ 63.112021] dump_stack_lvl (lib/dump_stack.c:107) [ 63.115734] dump_stack (lib/dump_stack.c:114) [ 63.119093] print_trailer (mm/slub.c:953) [ 63.122892] check_bytes_and_report (mm/slub.c:1058) [ 63.127395] check_object (mm/slub.c:1196 (discriminator 2)) [ 63.131104] alloc_debug_processing (mm/slub.c:1415 mm/slub.c:1425) [ 63.135606] get_partial_node.part.0 (mm/slub.c:2146 mm/slub.c:2279) [ 63.140286] ___slab_alloc (mm/slub.c:2268 mm/slub.c:2386 mm/slub.c:3188) [ 63.144084] __slab_alloc.constprop.0 (mm/slub.c:3292) [ 63.148674] __kmem_cache_alloc_node (mm/slub.c:3345 mm/slub.c:3442 mm/slub.c:3491) [ 63.153354] kmalloc_trace (mm/slab_common.c:1064 (discriminator 4)) [ 63.156974] device_add (drivers/base/core.c:3436 
drivers/base/core.c:3486)
[ 63.160508] platform_device_add (drivers/base/platform.c:717)
[ 63.164837] platform_device_register_full (drivers/base/platform.c:844)
[ 63.169959] gpio_mockup_register_chip+0x1ec/0x2b8 gpio_mockup
[ 63.176057] gpio_mockup_init+0xf0/0xd40 gpio_mockup
[ 63.181269] do_one_initcall (init/main.c:1306)
[ 63.185155] do_init_module (kernel/module/main.c:2457)
[ 63.188956] load_module (kernel/module/main.c:2859)
[ 63.192755] __do_sys_finit_module (kernel/module/main.c:2961)
[ 63.197171] __arm64_sys_finit_module (kernel/module/main.c:2928)
[ 63.201765] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:57)
[ 63.205565] el0_svc_common.constprop.0 (arch/arm64/kernel/syscall.c:149)
[ 63.210510] do_el0_svc (arch/arm64/kernel/syscall.c:194)
[ 63.213869] el0_svc (arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:638)
[ 63.216961] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:656)
[ 63.221287] el0t_64_sync (arch/arm64/kernel/entry.S:591)
[ 63.224998] FIX kmalloc-512: Restoring Poison 0xffff00004ecb7a38-0xffff00004ecb7a47=0x6b
[ 63.233202] FIX kmalloc-512: Marking all objects used
[ 63.399213] =============================================================================

links to the crash:
- https://lkft.validation.linaro.org/scheduler/job/6224830#L1291
- https://lkft.validation.linaro.org/scheduler/job/6224742#L1202
- https://lkft.validation.linaro.org/scheduler/job/6224784#L3415
- https://lkft.validation.linaro.org/scheduler/job/6224810#L2029

metadata:
  git_ref: master
  git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
  git_sha: 709c6adf19dc558e44ab5c01659b09a16a2d3c82
  git_describe: next-20230307
  kernel_version: 6.3.0-rc1
  kernel-config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2MfXESbRAbSUj9oic6d8…
  build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/798095907
  artifact-location: https://storage.tuxsuite.com/public/linaro/lkft/builds/2MfXESbRAbSUj9oic6d8…
  toolchain: gcc-11

--
Linaro LKFT
https://lkft.linaro.org
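
The lockdep warning in this report boils down to a cycle in the lock-ordering graph. The following Python sketch is a simplified illustration of that idea and not lockdep's actual algorithm: the function name creates_cycle and the dictionary representation are assumptions made for this example; only the lock names and the dependency chain are taken from the log.

```python
def creates_cycle(held_before, held, wanted):
    """Return True if taking `wanted` while holding `held` closes a cycle.

    `held_before` maps a lock to the set of locks that were acquired
    while it was held (edge A -> B means "B was taken with A held").
    A cycle appears if `held` is already reachable from `wanted`.
    """
    stack, seen = [wanted], set()
    while stack:
        lock = stack.pop()
        if lock == held:
            return True
        if lock in seen:
            continue
        seen.add(lock)
        stack.extend(held_before.get(lock, ()))
    return False


# Dependency chain reported in the log:
#   console_owner --> &port_lock_key --> &n->list_lock
deps = {
    "console_owner": {"&port_lock_key"},
    "&port_lock_key": {"&n->list_lock"},
}
```

With this graph, creates_cycle(deps, "&n->list_lock", "console_owner") models modprobe holding &n->list_lock in get_partial_node() and then reaching console_flush_all() through slab_bug() and _printk(), which is the new edge the report flags.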

2 years


[PATCH] selftests/ftrace: Test toplevel-enable for instance

by Zheng Yejian

'available_events' is actually not required by
'test.d/event/toplevel-enable.tc', and its existence is already tested in
'test.d/00basic/basic4.tc'.

So the requirement on 'available_events' can be dropped, and we can then
add the 'instance' flag so that 'test.d/event/toplevel-enable.tc' is also
run for instances.

The test results are shown below:

 # ./ftracetest test.d/event/toplevel-enable.tc
 === Ftrace unit tests ===
 [1] event tracing - enable/disable with top level files [PASS]
 [2] (instance) event tracing - enable/disable with top level files [PASS]

 # of passed: 2
 # of failed: 0
 # of unresolved: 0
 # of untested: 0
 # of unsupported: 0
 # of xfailed: 0
 # of undefined(test bug): 0

Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com>
---
 tools/testing/selftests/ftrace/test.d/event/toplevel-enable.tc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/ftrace/test.d/event/toplevel-enable.tc b/tools/testing/selftests/ftrace/test.d/event/toplevel-enable.tc
index 93c10ea42a68..8b8e1aea985b 100644
--- a/tools/testing/selftests/ftrace/test.d/event/toplevel-enable.tc
+++ b/tools/testing/selftests/ftrace/test.d/event/toplevel-enable.tc
@@ -1,7 +1,8 @@
 #!/bin/sh
 # SPDX-License-Identifier: GPL-2.0
 # description: event tracing - enable/disable with top level files
-# requires: available_events set_event events/enable
+# requires: set_event events/enable
+# flags: instance

 do_reset() {
     echo > set_event
--
2.25.1

2 years, 3 months


[PATCH] selftests/ftrace: Correctly enable event in instance-event.tc

by Zheng Yejian

Function instance_set() is expected to enable the event 'sched_switch', so
we should write 1 to its 'enable' file.

The testcase passes after this patch:

 # ./ftracetest test.d/instances/instance-event.tc
 === Ftrace unit tests ===
 [1] Test creation and deletion of trace instances while setting an event [PASS]

 # of passed: 1
 # of failed: 0
 # of unresolved: 0
 # of untested: 0
 # of unsupported: 0
 # of xfailed: 0
 # of undefined(test bug): 0

Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com>
---
 .../testing/selftests/ftrace/test.d/instances/instance-event.tc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/ftrace/test.d/instances/instance-event.tc b/tools/testing/selftests/ftrace/test.d/instances/instance-event.tc
index 0eb47fbb3f44..42422e425107 100644
--- a/tools/testing/selftests/ftrace/test.d/instances/instance-event.tc
+++ b/tools/testing/selftests/ftrace/test.d/instances/instance-event.tc
@@ -39,7 +39,7 @@ instance_read() {

 instance_set() {
         while :; do
-                echo 1 > foo/events/sched/sched_switch
+                echo 1 > foo/events/sched/sched_switch/enable
         done 2> /dev/null
 }

--
2.25.1
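
As an illustration of the fix, the per-event control file sits at events/<subsystem>/<event>/enable under an instance directory, and writing 1 to it enables the event. The sketch below is a Python illustration only: the helper names event_enable_path and set_event_enabled are hypothetical, and the demo uses a temporary directory as a stand-in for a real tracefs instance (which would need root and a mounted tracefs).

```python
import os
import tempfile


def event_enable_path(instance_dir, subsystem, event):
    """Build the path of an event's 'enable' control file."""
    return os.path.join(instance_dir, "events", subsystem, event, "enable")


def set_event_enabled(instance_dir, subsystem, event, enabled=True):
    """Write '1' or '0' to the event's 'enable' file and return its path."""
    path = event_enable_path(instance_dir, subsystem, event)
    with open(path, "w") as f:
        f.write("1" if enabled else "0")
    return path


def demo():
    """Exercise the helpers against a fake instance tree (not real tracefs)."""
    with tempfile.TemporaryDirectory() as root:
        instance = os.path.join(root, "foo")
        os.makedirs(os.path.join(instance, "events", "sched", "sched_switch"))
        path = set_event_enabled(instance, "sched", "sched_switch")
        with open(path) as f:
            return os.path.relpath(path, root), f.read()
```

The line removed by the patch redirected into the event directory itself rather than the file inside it; since the loop discards stderr with "2> /dev/null", such a failure would presumably have gone unnoticed.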

2 years, 3 months


[PATCH v3 0/3] TDX Guest Quote generation support

by Kuppuswamy Sathyanarayanan

Hi All,

In a TDX guest, the attestation process is used to verify the TDX
guest's trustworthiness to other entities before provisioning secrets to
the guest.

The TDX guest attestation process consists of two steps:

1. TDREPORT generation
2. Quote generation.

The first step (TDREPORT generation) involves getting the TDX guest
measurement data in the format of TDREPORT, which is further used to
validate the authenticity of the TDX guest. The second step involves
sending the TDREPORT to a Quoting Enclave (QE) server to generate a
remotely verifiable Quote. TDREPORT by design can only be verified on
the local platform. To support remote verification of the TDREPORT,
TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT locally
and convert it to a remotely verifiable Quote. Although attestation
software can use communication methods like TCP/IP or vsock to send the
TDREPORT to QE, not all platforms support these communication models.
So the TDX GHCI specification [1] defines a method for Quote generation
via hypercalls. Please check the discussions from Google [2] and
Alibaba [3], which clarify the need for hypercall based Quote generation
support. This patch set adds this support.

Support for TDREPORT generation already exists in the TDX guest driver.
This patchset extends the same driver to add the Quote generation
support.

Following are the details of the patch set:

Patch 1/3 -> Adds event notification IRQ support.
Patch 2/3 -> Adds Quote generation support.
Patch 3/3 -> Adds selftest support for Quote generation feature.

[1] https://cdrdv2.intel.com/v1/dl/getContent/726790, section titled "TDG.VP.VMCALL<GetQuote>".
[2] https://lore.kernel.org/lkml/CAAYXXYxxs2zy_978GJDwKfX5Hud503gPc8=1kQ-+JwG_k…
[3] https://lore.kernel.org/lkml/a69faebb-11e8-b386-d591-dbd08330b008@linux.ali…

Kuppuswamy Sathyanarayanan (3):
  x86/tdx: Add TDX Guest event notify interrupt support
  virt: tdx-guest: Add Quote generation support
  selftests/tdx: Test GetQuote TDX attestation feature

 Documentation/virt/coco/tdx-guest.rst        |  11 ++
 arch/x86/coco/tdx/tdx.c                      | 194 +++++++++++++++++++
 arch/x86/include/asm/tdx.h                   |   8 +
 drivers/virt/coco/tdx-guest/tdx-guest.c      | 175 ++++++++++++++++-
 include/uapi/linux/tdx-guest.h               |  44 +++++
 tools/testing/selftests/tdx/tdx_guest_test.c |  65 ++++++-
 6 files changed, 490 insertions(+), 7 deletions(-)

--
2.34.1

2 years, 3 months

10
38

[PATCH 0/2] v2: F_OFD_GETLK extension to read lock info

by Stas Sergeev

This extension allows using F_UNLCK on query, which currently returns EINVAL. Instead, it can be used to query the locks on a particular fd, something that is not currently possible. The basic idea is that on F_OFD_GETLK, F_UNLCK would "conflict" with (or query) any type of lock on the same fd, and ignore any locks on other fds.

Use-cases:

1. CRIU-like scenario, where you want to read the locking info from an fd for later reconstruction. This can now be done by setting l_start and l_len to 0 to cover the entire file range, and doing F_OFD_GETLK. In a loop you need to advance l_start past the returned lock ranges, to eventually collect all locked ranges.

2. Implementing a lock checking/enforcing policy. Say you want to implement an "auditor" module in your program that checks that I/O is done only after the proper locking is applied to a file region. In this case you need to know whether the particular region is locked on that fd, and if so, with what type of lock. If you did that currently (without this extension), you could only check for write locks, and for that you would need to probe the lock on your fd and then open the same file via another fd and probe there. That way you can identify the write lock on a particular fd, but such a trick is non-atomic and complex. As for finding out the read lock on a particular fd: impossible. This extension allows doing such queries without any extra effort.

3. Implementing a mandatory locking policy. Suppose you want to make a policy where the write lock inhibits any unlocked readers and writers. Currently you need to check whether the write lock is present on some other fd, and if it is not there, allow the I/O operation. But because the write lock can appear at any moment, you need to do that under some global lock, which can be released only when the I/O operation is finished. With the proposed extension you can instead just check the write lock on your own fd first, and if it is there, allow the I/O operation on that fd without using any global lock. Only if there is no write lock on this fd do you need to take the global lock and check for a write lock on other fds.

The second patch adds a test-case for OFD locks. It tests both the generic things and the proposed extension.

The third patch is a proposed man page update for fcntl(2) (not for the linux source tree).

Changes in v2:
- Dropped the l_pid extension patch and updated the test-case accordingly.

Stas Sergeev (2):
  fs/locks: F_UNLCK extension for F_OFD_GETLK
  selftests: add OFD lock tests

 fs/locks.c                                 |  23 +++-
 tools/testing/selftests/locking/Makefile   |   2 +
 tools/testing/selftests/locking/ofdlocks.c | 132 +++++++++++++++++++++
 3 files changed, 154 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/locking/ofdlocks.c

CC: Jeff Layton <jlayton(a)kernel.org>
CC: Chuck Lever <chuck.lever(a)oracle.com>
CC: Alexander Viro <viro(a)zeniv.linux.org.uk>
CC: Christian Brauner <brauner(a)kernel.org>
CC: linux-fsdevel(a)vger.kernel.org
CC: linux-kernel(a)vger.kernel.org
CC: Shuah Khan <shuah(a)kernel.org>
CC: linux-kselftest(a)vger.kernel.org
CC: linux-api(a)vger.kernel.org

--
2.39.2
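The collection loop from use-case 1 could look roughly like the sketch below. It is only an illustration under assumptions: the helper names ofd_collect_locks() and lock_type_name() are made up here and are not taken from the patch or its selftest, and the F_UNLCK query only works on a kernel that carries the proposed extension. On other kernels the fcntl() call is expected to fail with EINVAL, as the cover letter states.

```c
#define _GNU_SOURCE /* makes glibc's <fcntl.h> expose the F_OFD_* commands */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

#ifndef F_OFD_GETLK
#define F_OFD_GETLK 36 /* Linux asm-generic value, fallback if libc hides it */
#endif

/* Hypothetical helper: human-readable name for a returned l_type. */
static const char *lock_type_name(short type)
{
	return type == F_RDLCK ? "read" : type == F_WRLCK ? "write" : "none";
}

/*
 * Hypothetical helper: collect up to 'max' locks held through the open
 * file description behind 'fd'.
 *
 * Returns the number of locks stored in 'out', or -1 with errno set.
 * errno is EINVAL on kernels without the proposed F_UNLCK query.
 */
static int ofd_collect_locks(int fd, struct flock *out, int max)
{
	struct flock fl;
	off_t start = 0;
	int n = 0;

	assert(out != NULL && max > 0);

	while (n < max) {
		memset(&fl, 0, sizeof(fl));
		fl.l_type = F_UNLCK;     /* "query" mode, per the proposal */
		fl.l_whence = SEEK_SET;
		fl.l_start = start;
		fl.l_len = 0;            /* 0 means "up to end of file" */
		fl.l_pid = 0;            /* must be 0 for OFD commands */

		if (fcntl(fd, F_OFD_GETLK, &fl) == -1)
			return -1;
		if (fl.l_type == F_UNLCK)
			break;           /* nothing locked from 'start' onwards */

		out[n++] = fl;
		if (fl.l_len == 0)
			break;           /* this lock already runs to EOF */
		start = fl.l_start + fl.l_len; /* advance past returned range */
	}
	return n;
}
```

A caller would pass an fd it holds OFD locks on, then walk the returned array and print l_start, l_len and lock_type_name(l_type) for each entry. Treating EINVAL as "extension not available" keeps the caller usable on kernels without the patch.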

2 years, 4 months

6
15

[PATCH 0/4] RSEQ selftests updates

by Mathieu Desnoyers

Hi,

You will find in this series updates to the rseq selftests, mainly bringing fixes from the librseq project back into the RSEQ selftests.

Thanks,

Mathieu

Mathieu Desnoyers (4):
  selftests/rseq: Fix CID_ID typo in Makefile
  selftests/rseq: Implement rseq_unqual_scalar_typeof
  selftests/rseq: Fix arm64 buggy load-acquire/store-release macros
  selftests/rseq: Use rseq_unqual_scalar_typeof in macros

 tools/testing/selftests/rseq/Makefile     |  2 +-
 tools/testing/selftests/rseq/compiler.h   | 26 ++++++++++
 tools/testing/selftests/rseq/rseq-arm.h   |  4 +-
 tools/testing/selftests/rseq/rseq-arm64.h | 58 ++++++++++++-----------
 tools/testing/selftests/rseq/rseq-mips.h  |  4 +-
 tools/testing/selftests/rseq/rseq-ppc.h   |  4 +-
 tools/testing/selftests/rseq/rseq-riscv.h |  6 +--
 tools/testing/selftests/rseq/rseq-s390.h  |  4 +-
 tools/testing/selftests/rseq/rseq-x86.h   |  4 +-
 9 files changed, 70 insertions(+), 42 deletions(-)

--
2.25.1

2 years, 4 months

2
8

[PATCH v2 0/7] Split a folio to any lower order folios

by Zi Yan

From: Zi Yan <ziy(a)nvidia.com>

Hi all,

File folios support any order, and people would like to support flexible orders for anonymous folios [1] too. Currently, split_huge_page() only splits a huge page to order-0 pages, but splitting to orders higher than 0 is also useful. This patchset adds support for splitting a huge page to any lower order pages and uses it during folio truncate operations.

The patchset is on top of mm-everything-2023-03-27-21-20.

Changelog from v1
===
1. Changed split_page_memcg() and split_page_owner() parameter to use order
2. Used folio_test_pmd_mappable() in place of the equivalent code

Details
===
* Patch 1 changes split_page_memcg() to use order instead of nr_pages
* Patch 2 changes split_page_owner() to use order instead of nr_pages
* Patch 3 and 4 add a new_order parameter to split_page_memcg() and split_page_owner() and prepare for upcoming changes.
* Patch 5 adds split_huge_page_to_list_to_order() to split a huge page to any lower order. The original split_huge_page_to_list() calls split_huge_page_to_list_to_order() with new_order = 0.
* Patch 6 uses split_huge_page_to_list_to_order() in large pagecache folio truncation instead of splitting the large folio all the way down to order-0.
* Patch 7 adds a test API to debugfs and test cases in split_huge_page_test selftests.

Comments and/or suggestions are welcome.

[1] https://lore.kernel.org/linux-mm/Y%2FblF0GIunm+pRIC@casper.infradead.org/

Zi Yan (7):
  mm/memcg: use order instead of nr in split_page_memcg()
  mm/page_owner: use order instead of nr in split_page_owner()
  mm: memcg: make memcg huge page split support any order split.
  mm: page_owner: add support for splitting to any order in split page_owner.
  mm: thp: split huge page to any lower order pages.
  mm: truncate: split huge page cache page to a non-zero order if possible.
  mm: huge_memory: enable debugfs to split huge pages to any order.

 include/linux/huge_mm.h                       |  10 +-
 include/linux/memcontrol.h                    |   4 +-
 include/linux/page_owner.h                    |  10 +-
 mm/huge_memory.c                              | 137 ++++++++---
 mm/memcontrol.c                               |  10 +-
 mm/page_alloc.c                               |   8 +-
 mm/page_owner.c                               |  10 +-
 mm/truncate.c                                 |  21 +-
 .../selftests/mm/split_huge_page_test.c       | 225 +++++++++++++++++-
 9 files changed, 366 insertions(+), 69 deletions(-)

--
2.39.2
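For readers less used to thinking in orders, the arithmetic behind "split to any lower order" can be sketched as below. This is plain userspace C for illustration only; folios_after_split() and pages_in_order() are hypothetical names, not kernel functions, and nothing here is taken from the patches themselves. The only facts assumed are that an order-n folio spans 2^n base pages and that splitting keeps the total page count unchanged.

```c
#include <assert.h>
#include <stdio.h>

/* Base pages covered by one folio of the given order (2^order). */
static unsigned long pages_in_order(unsigned int order)
{
	return 1UL << order;
}

/*
 * Number of order-'new_order' folios produced by splitting a single
 * order-'old_order' folio: 2^(old_order - new_order).
 */
static unsigned long folios_after_split(unsigned int old_order,
					unsigned int new_order)
{
	assert(new_order <= old_order); /* can only split to a lower order */
	return 1UL << (old_order - new_order);
}

/* Print one line describing a split, e.g. for a quick sanity check. */
static void describe_split(unsigned int old_order, unsigned int new_order)
{
	printf("order-%u -> %lu folio(s) of order-%u (%lu pages each)\n",
	       old_order, folios_after_split(old_order, new_order),
	       new_order, pages_in_order(new_order));
}
```

For example, an order-9 folio (a PMD-sized huge page with 4 KiB base pages on x86-64) split with new_order = 0 gives 512 single pages, which is what split_huge_page() does today, while new_order = 2 gives 128 folios of 4 pages each.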

2 years, 4 months

4
13

[PATCH 0/1] Possible bug in zram on ppc64le on vfat

by Petr Vorel

Hi all,

the following patch tries to work around an error on ppc64le, where the zram01.sh LTP test (there is also the kernel selftest tools/testing/selftests/zram/zram01.sh, but the LTP test got further updates) often has mem_used_total 0 although zram is already filled.

The patch repeatedly reads /sys/block/zram*/mm_stat for 1 sec, waiting for mem_used_total > 0.

The question is whether this is expected and should be worked around, or a bug which should be fixed.

REPRODUCE THE ISSUE

Quickest way to install only the zram tests and their dependencies:

    make autotools && ./configure && \
    for i in testcases/lib/ testcases/kernel/device-drivers/zram/; do \
        cd $i && make -j$(getconf _NPROCESSORS_ONLN) && make install && cd -; \
    done

Run the test (only on vfat):

    PATH="/opt/ltp/testcases/bin:$PATH" LTP_SINGLE_FS_TYPE=vfat zram01.sh

Petr Vorel (1):
  zram01.sh: Workaround division by 0 on vfat on ppc64le

 .../kernel/device-drivers/zram/zram01.sh | 27 +++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

--
2.38.0
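The retry described above can be sketched in C as follows. This is not the LTP patch (which is a shell script) but a minimal illustration of the same idea. The function names, the return-code convention, and the tries/interval parameters are assumptions made for this example. It also assumes that mem_used_total is the third whitespace-separated field of mm_stat, as described in the kernel's zram documentation.

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/*
 * Extract mem_used_total (assumed to be the third field) from one
 * mm_stat line. Returns 0 on success, -1 if the line cannot be parsed.
 */
static int parse_mem_used_total(const char *line, unsigned long long *out)
{
	unsigned long long orig, compr, used;

	if (sscanf(line, "%llu %llu %llu", &orig, &compr, &used) != 3)
		return -1;
	*out = used;
	return 0;
}

/*
 * Re-read 'path' up to 'tries' times, sleeping 'interval_ms' between
 * attempts, until mem_used_total becomes non-zero.
 *
 * Returns 0 when a non-zero value was read (stored in *out),
 *         1 when the value was still 0 after all attempts,
 *        -1 when the file could not be read or parsed.
 */
static int wait_mem_used_total(const char *path, int tries, long interval_ms,
			       unsigned long long *out)
{
	struct timespec ts = {
		.tv_sec = interval_ms / 1000,
		.tv_nsec = (interval_ms % 1000) * 1000000L,
	};
	char line[256];

	assert(tries > 0 && out != NULL);

	for (int i = 0; i < tries; i++) {
		FILE *f = fopen(path, "r");
		char *ok;

		if (!f)
			return -1;
		ok = fgets(line, sizeof(line), f);
		fclose(f);
		if (!ok || parse_mem_used_total(line, out) != 0)
			return -1;
		if (*out > 0)
			return 0;
		nanosleep(&ts, NULL); /* give the counter time to update */
	}
	return 1;
}
```

With tries = 10 and interval_ms = 100 this polls for about one second in total, matching the window mentioned in the cover letter. A caller would only divide by mem_used_total when the function returns 0.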

2 years, 4 months

5
19
