On Thu, Oct 14, 2021 at 10:11:46AM +0800, Ming Lei wrote:
On Thu, Oct 14, 2021 at 09:55:48AM +0800, Ming Lei wrote:
On Mon, Sep 27, 2021 at 09:38:04AM -0700, Luis Chamberlain wrote:
...
Hello Luis,
Can you test the following patch and see if the issue can be addressed?
Please see the idea from the inline comment.
Also, unlike your patch, this does not need zram_index_mutex in the zram disk's store() handlers, so the deadlock you are addressing in this series is avoided.
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index fcaf2750f68f..3c17927d23a7 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1985,11 +1985,17 @@ static int zram_remove(struct zram *zram)
 	/* Make sure all the pending I/O are finished */
 	fsync_bdev(bdev);
-	zram_reset_device(zram);
 	pr_info("Removed device: %s\n", zram->disk->disk_name);
 	del_gendisk(zram->disk);
+	/*
+	 * reset device after gendisk is removed, so any change from sysfs
+	 * store won't come in, then we can really reset device here
+	 */
+	zram_reset_device(zram);
+
 	blk_cleanup_disk(zram->disk);
 	kfree(zram);
 	return 0;
@@ -2073,7 +2079,12 @@ static int zram_remove_cb(int id, void *ptr, void *data)
 static void destroy_devices(void)
 {
 	class_unregister(&zram_control_class);
+	/* hold the global lock so new device can't be added */
+	mutex_lock(&zram_index_mutex);
 	idr_for_each(&zram_index_idr, &zram_remove_cb, NULL);
+	mutex_unlock(&zram_index_mutex);
Actually, zram_index_mutex isn't needed when calling zram_remove_cb(): the zram-control sysfs interface has already been removed by then, so userspace can't add new devices any more. The issue should then be fixed by the following one-line change; please test it:
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index fcaf2750f68f..96dd641de233 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1985,11 +1985,17 @@ static int zram_remove(struct zram *zram)
 	/* Make sure all the pending I/O are finished */
 	fsync_bdev(bdev);
-	zram_reset_device(zram);
 	pr_info("Removed device: %s\n", zram->disk->disk_name);
 	del_gendisk(zram->disk);
+	/*
+	 * reset device after gendisk is removed, so any change from sysfs
+	 * store won't come in, then we can really reset device here
+	 */
+	zram_reset_device(zram);
+
 	blk_cleanup_disk(zram->disk);
 	kfree(zram);
 	return 0;
Sorry but nope, the cpu multistate issue is still present and we end up eventually with page faults. I tried with both patches.
Oct 14 20:21:34 kdevops kernel: ------------[ cut here ]------------
Oct 14 20:21:34 kdevops kernel: Error: Removing state 65 which has instances left.
Oct 14 20:21:34 kdevops kernel: WARNING: CPU: 4 PID: 3358 at kernel/cpu.c:2151 __cpuhp_remove_state_cpuslocked+0xf9/0x100
Oct 14 20:21:34 kdevops kernel: Modules linked in: zram(E-) zstd(E) zsmalloc(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) >
Oct 14 20:21:34 kdevops kernel: CPU: 4 PID: 3358 Comm: rmmod Tainted: G E 5.15.0-rc3-next-20210927+ #89
Oct 14 20:21:34 kdevops kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Oct 14 20:21:34 kdevops kernel: RIP: 0010:__cpuhp_remove_state_cpuslocked+0xf9/0x100
Oct 14 20:21:34 kdevops kernel: Code: 21 00 48 c7 43 18 00 00 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f e9 d8 17 84 00 0f 0b 44 89 e6 48 c7 c7 78 0c 8b ad e8 56 92 7f 00 <0f> 0b >
Oct 14 20:21:34 kdevops kernel: RSP: 0018:ffffaac980a1fe90 EFLAGS: 00010286
Oct 14 20:21:34 kdevops kernel: RAX: 0000000000000000 RBX: ffffffffada3e208 RCX: 0000000000000000
Oct 14 20:21:34 kdevops kernel: RDX: 0000000000000001 RSI: ffffffffad8efdb6 RDI: 00000000ffffffff
Oct 14 20:21:34 kdevops kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffffaac980a1fcc0
Oct 14 20:21:34 kdevops kernel: R10: ffffaac980a1fcb8 R11: ffffffffadac3c68 R12: 0000000000000041
Oct 14 20:21:34 kdevops kernel: R13: 0000000000000a28 R14: 0000000000000000 R15: 0000000000000000
Oct 14 20:21:34 kdevops kernel: FS: 00007fc0c2882580(0000) GS:ffff9ed6f7d00000(0000) knlGS:0000000000000000
Oct 14 20:21:34 kdevops kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 14 20:21:34 kdevops kernel: CR2: 00005621b0490b78 CR3: 000000011a538005 CR4: 0000000000370ee0
Oct 14 20:21:34 kdevops kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 14 20:21:34 kdevops kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Oct 14 20:21:34 kdevops kernel: Call Trace:
Oct 14 20:21:34 kdevops kernel: <TASK>
Oct 14 20:21:34 kdevops kernel: __cpuhp_remove_state+0x4d/0xc0
Oct 14 20:21:34 kdevops kernel: __do_sys_delete_module+0x18d/0x2a0
Oct 14 20:21:34 kdevops kernel: ? fpregs_assert_state_consistent+0x1e/0x40
Oct 14 20:21:34 kdevops kernel: ? exit_to_user_mode_prepare+0x3a/0x180
Oct 14 20:21:34 kdevops kernel: do_syscall_64+0x38/0xc0
Oct 14 20:21:34 kdevops kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Oct 14 20:21:34 kdevops kernel: RIP: 0033:0x7fc0c29a84a7
<etc>
Oct 14 20:21:35 kdevops kernel: sysfs: cannot create duplicate filename '/devices/virtual/block/zram0'
Oct 14 20:21:35 kdevops kernel: CPU: 5 PID: 3388 Comm: modprobe Tainted: G W E 5.15.0-rc3-next-20210927+ #89
Oct 14 20:21:35 kdevops kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Oct 14 20:21:35 kdevops kernel: Call Trace:
Oct 14 20:21:35 kdevops kernel: <TASK>
Oct 14 20:21:35 kdevops kernel: dump_stack_lvl+0x48/0x5e
Oct 14 20:21:35 kdevops kernel: sysfs_warn_dup.cold+0x17/0x24
Oct 14 20:21:35 kdevops kernel: sysfs_create_dir_ns+0xbc/0xd0
Oct 14 20:21:35 kdevops kernel: kobject_add_internal+0xbd/0x2b0
Oct 14 20:21:35 kdevops kernel: kobject_add+0x7e/0xb0
Oct 14 20:21:35 kdevops kernel: ? _raw_spin_unlock_irqrestore+0x25/0x40
Oct 14 20:21:35 kdevops kernel: ? preempt_count_add+0x68/0xa0
Oct 14 20:21:35 kdevops kernel: device_add+0x11a/0x980
Oct 14 20:21:35 kdevops kernel: ? dev_set_name+0x53/0x70
Oct 14 20:21:35 kdevops kernel: device_add_disk+0x9d/0x3a0
Oct 14 20:21:35 kdevops kernel: zram_add+0x1ad/0x200 [zram]
Oct 14 20:21:35 kdevops kernel: ? 0xffffffffc0c10000
Oct 14 20:21:35 kdevops kernel: zram_init+0xd7/0x1000 [zram]
Oct 14 20:21:35 kdevops kernel: do_one_initcall+0x41/0x200
Oct 14 20:21:35 kdevops kernel: ? _raw_spin_unlock_irqrestore+0x25/0x40
Oct 14 20:21:35 kdevops kernel: ? kmem_cache_alloc_trace+0x2ab/0x420
Oct 14 20:21:35 kdevops kernel: do_init_module+0x5c/0x270
Oct 14 20:21:35 kdevops kernel: __do_sys_finit_module+0xae/0x110
Oct 14 20:21:35 kdevops kernel: do_syscall_64+0x38/0xc0
Oct 14 20:21:35 kdevops kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Oct 14 20:21:35 kdevops kernel: RIP: 0033:0x7fca3aa555e9
Oct 14 20:21:35 kdevops kernel: Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d >
Oct 14 20:21:35 kdevops kernel: RSP: 002b:00007fff142417b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
Oct 14 20:21:35 kdevops kernel: RAX: ffffffffffffffda RBX: 0000558ba9491bd0 RCX: 00007fca3aa555e9
Oct 14 20:21:35 kdevops kernel: RDX: 0000000000000000 RSI: 0000558ba9491f60 RDI: 0000000000000003
Oct 14 20:21:35 kdevops kernel: RBP: 0000000000040000 R08: 0000000000000000 R09: 0000558ba9491db0
Oct 14 20:21:35 kdevops kernel: R10: 0000000000000003 R11: 0000000000000246 R12: 0000558ba9491f60
Oct 14 20:21:35 kdevops kernel: R13: 0000000000000000 R14: 0000558ba9491d00 R15: 0000558ba9491bd0
Oct 14 20:21:35 kdevops kernel: </TASK>
<etc>
Oct 14 20:21:35 kdevops kernel: kobject_add_internal failed for zram0 with -EEXIST, don't try to register things with the same name in the same directory.
Oct 14 20:21:35 kdevops kernel: ------------[ cut here ]------------
Oct 14 20:21:35 kdevops kernel: WARNING: CPU: 5 PID: 3388 at block/genhd.c:537 device_add_disk+0x1b9/0x3a0
Oct 14 20:21:35 kdevops kernel: Modules linked in: zram(E+) zstd(E) zsmalloc(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) >
Oct 14 20:21:35 kdevops kernel: CPU: 5 PID: 3388 Comm: modprobe Tainted: G W E 5.15.0-rc3-next-20210927+ #89
Oct 14 20:21:35 kdevops kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Oct 14 20:21:35 kdevops kernel: RIP: 0010:device_add_disk+0x1b9/0x3a0
Oct 14 20:21:35 kdevops kernel: Code: 00 03 01 00 00 0f 85 32 ff ff ff e9 1e ff ff ff 0f 0b 41 bc ea ff ff ff e9 29 ff ff ff 4c 89 ff e8 5c 45 1c 00 e9 ef fe ff ff <0f> 0b >
Oct 14 20:21:35 kdevops kernel: RSP: 0018:ffffaac980607d90 EFLAGS: 00010287
Oct 14 20:21:35 kdevops kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000023005
Oct 14 20:21:35 kdevops kernel: RDX: 0000000000022e05 RSI: ffffffffacc4b710 RDI: 0000000000000000
Oct 14 20:21:35 kdevops kernel: RBP: ffff9ed5d788a600 R08: 0000000000000000 R09: ffffaac980607a98
Oct 14 20:21:35 kdevops kernel: R10: ffff9ed5c795ef00 R11: ffffffffadac3c68 R12: 00000000ffffffef
Oct 14 20:21:35 kdevops kernel: R13: ffff9ed5d5600000 R14: ffffffffc0a52100 R15: ffff9ed5d5600040
Oct 14 20:21:35 kdevops kernel: FS: 00007fca3a935580(0000) GS:ffff9ed6f7d40000(0000) knlGS:0000000000000000
Oct 14 20:21:35 kdevops kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 14 20:21:35 kdevops kernel: CR2: 00007fff1423e6d8 CR3: 0000000136752002 CR4: 0000000000370ee0
Oct 14 20:21:35 kdevops kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 14 20:21:35 kdevops kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Oct 14 20:21:35 kdevops kernel: Call Trace:
Oct 14 20:21:35 kdevops kernel: <TASK>
Oct 14 20:21:35 kdevops kernel: zram_add+0x1ad/0x200 [zram]
Oct 14 20:21:35 kdevops kernel: ? 0xffffffffc0c10000
Oct 14 20:21:35 kdevops kernel: zram_init+0xd7/0x1000 [zram]
Oct 14 20:21:35 kdevops kernel: do_one_initcall+0x41/0x200
Oct 14 20:21:35 kdevops kernel: ? _raw_spin_unlock_irqrestore+0x25/0x40
Oct 14 20:21:35 kdevops kernel: ? kmem_cache_alloc_trace+0x2ab/0x420
Oct 14 20:21:35 kdevops kernel: do_init_module+0x5c/0x270
Oct 14 20:21:35 kdevops kernel: __do_sys_finit_module+0xae/0x110
Oct 14 20:21:35 kdevops kernel: do_syscall_64+0x38/0xc0
Oct 14 20:21:35 kdevops kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Oct 14 20:21:35 kdevops kernel: RIP: 0033:0x7fca3aa555e9
<etc>
Oct 14 20:21:35 kdevops kernel: ------------[ cut here ]------------
Oct 14 20:21:35 kdevops kernel: WARNING: CPU: 2 PID: 3457 at block/genhd.c:564 del_gendisk+0x1a2/0x1d0
Oct 14 20:21:35 kdevops kernel: Modules linked in: 842(E) 842_decompress(E) 842_compress(E) zram(E-) zstd(E) zsmalloc(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E>
Oct 14 20:21:35 kdevops kernel: CPU: 2 PID: 3457 Comm: rmmod Tainted: G W E 5.15.0-rc3-next-20210927+ #89
Oct 14 20:21:35 kdevops kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Oct 14 20:21:35 kdevops kernel: RIP: 0010:del_gendisk+0x1a2/0x1d0
Oct 14 20:21:35 kdevops kernel: Code: 48 8d 78 40 e8 8f 87 1d 00 48 8b 7b 40 5b 5d 41 5c 48 83 c7 40 e9 4e 47 1c 00 48 8b 70 40 eb ce f6 43 61 04 0f 85 85 fe ff ff <0f> 0b >
Oct 14 20:21:35 kdevops kernel: RSP: 0018:ffffaac9807cfe30 EFLAGS: 00010246
Oct 14 20:21:35 kdevops kernel: RAX: ffff9ed5d5600380 RBX: ffff9ed5d788a600 RCX: 0000000000000000
Oct 14 20:21:35 kdevops kernel: RDX: 0000000000000000 RSI: ffffffffad8efdb6 RDI: ffff9ed5d788a600
Oct 14 20:21:35 kdevops kernel: RBP: ffff9ed5d788b600 R08: 0000000000000000 R09: ffffaac9807cfc88
Oct 14 20:21:35 kdevops kernel: R10: ffffaac9807cfc80 R11: ffffffffadac3c68 R12: ffff9ed5d5600000
Oct 14 20:21:35 kdevops kernel: R13: 0000000000000000 R14: ffffffffc0a52360 R15: ffff9ed5c4a87b78
Oct 14 20:21:35 kdevops kernel: FS: 00007f292a2bb580(0000) GS:ffff9ed6f7c80000(0000) knlGS:0000000000000000
Oct 14 20:21:35 kdevops kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 14 20:21:35 kdevops kernel: CR2: 000056161b453b78 CR3: 000000013213e002 CR4: 0000000000370ee0
Oct 14 20:21:35 kdevops kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 14 20:21:35 kdevops kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Oct 14 20:21:35 kdevops kernel: Call Trace:
Oct 14 20:21:35 kdevops kernel: <TASK>
Oct 14 20:21:35 kdevops kernel: zram_remove+0x96/0xc0 [zram]
Oct 14 20:21:35 kdevops kernel: ? hot_remove_store+0xe0/0xe0 [zram]
Oct 14 20:21:35 kdevops kernel: zram_remove_cb+0xd/0x10 [zram]
Oct 14 20:21:35 kdevops kernel: idr_for_each+0x5b/0xd0
Oct 14 20:21:35 kdevops kernel: destroy_devices+0x32/0x68 [zram]
Oct 14 20:21:35 kdevops kernel: __do_sys_delete_module+0x18d/0x2a0
Oct 14 20:21:35 kdevops kernel: ? fpregs_assert_state_consistent+0x1e/0x40
Oct 14 20:21:35 kdevops kernel: ? exit_to_user_mode_prepare+0x3a/0x180
Oct 14 20:21:35 kdevops kernel: do_syscall_64+0x38/0xc0
Oct 14 20:21:35 kdevops kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Oct 14 20:21:35 kdevops kernel: RIP: 0033:0x7f292a3e14a7
<etc>
Oct 14 20:21:35 kdevops kernel: BUG: unable to handle page fault for address: ffffffffc0a4e0ae
Oct 14 20:21:35 kdevops kernel: #PF: supervisor instruction fetch in kernel mode
Oct 14 20:21:35 kdevops kernel: #PF: error_code(0x0010) - not-present page
Oct 14 20:21:35 kdevops kernel: PGD 3ba0e067 P4D 3ba0e067 PUD 3ba10067 PMD 10526c067 PTE 0
Oct 14 20:21:35 kdevops kernel: Oops: 0010 [#1] PREEMPT SMP NOPTI
Oct 14 20:21:35 kdevops kernel: CPU: 6 PID: 3655 Comm: zram02.sh Tainted: G W E 5.15.0-rc3-next-20210927+ #89
Oct 14 20:21:35 kdevops kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Oct 14 20:21:35 kdevops kernel: RIP: 0010:0xffffffffc0a4e0ae
Oct 14 20:21:35 kdevops kernel: Code: Unable to access opcode bytes at RIP 0xffffffffc0a4e084.
Oct 14 20:21:35 kdevops kernel: RSP: 0018:ffffaac980687da8 EFLAGS: 00010286
Oct 14 20:21:35 kdevops kernel: RAX: 0000000000000000 RBX: ffff9ed5c40be400 RCX: 0000000080400035
Oct 14 20:21:35 kdevops kernel: RDX: 0000000080400036 RSI: fffffa3544561080 RDI: 0000000040000000
Oct 14 20:21:35 kdevops kernel: RBP: 0000000001900000 R08: ffff9ed5d5842cc0 R09: 0000000080400035
Oct 14 20:21:35 kdevops kernel: R10: ffff9ed5d5842c00 R11: ffff9ed5f1341350 R12: 0000000001900000
Oct 14 20:21:35 kdevops kernel: R13: ffff9ed5d5666c00 R14: ffff9ed5c40be420 R15: ffff9ed5dfa8c8c0
Oct 14 20:21:35 kdevops kernel: FS: 00007f978fe2d5c0(0000) GS:ffff9ed6f7d80000(0000) knlGS:0000000000000000
Oct 14 20:21:35 kdevops kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 14 20:21:35 kdevops kernel: CR2: ffffffffc0a4e084 CR3: 0000000133fd4006 CR4: 0000000000370ee0
Oct 14 20:21:35 kdevops kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 14 20:21:35 kdevops kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Oct 14 20:21:35 kdevops kernel: Call Trace:
Oct 14 20:21:35 kdevops kernel: <TASK>
Oct 14 20:21:35 kdevops kernel: ? kernfs_fop_write_iter+0x177/0x220
Oct 14 20:21:35 kdevops kernel: ? new_sync_write+0x11c/0x1b0
Oct 14 20:21:35 kdevops kernel: ? vfs_write+0x20d/0x2a0
Oct 14 20:21:35 kdevops kernel: ? ksys_write+0x5f/0xe0
Oct 14 20:21:35 kdevops kernel: ? do_syscall_64+0x38/0xc0
Oct 14 20:21:35 kdevops kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xae
Oct 14 20:21:35 kdevops kernel: </TASK>
<etc, etc, etc, this goes on and on>
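For readers unfamiliar with the first warning in the log: the CPU hotplug core prints "Removing state 65 which has instances left" when a multi-instance state is removed while something is still registered against it (R12 in that trace is 0x41, i.e. 65). A userspace toy model of that bookkeeping is sketched below. The toy_* names are hypothetical and the structure is an assumption for illustration, not the layout used in kernel/cpu.c; only the warning text is taken from the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a multi-instance hotplug state. */
struct toy_state {
	int id;			/* state number, e.g. 65 in the log */
	int instances;		/* users still registered on this state */
	bool registered;	/* whether the state itself is set up */
};

static void toy_state_setup(struct toy_state *st, int id)
{
	st->id = id;
	st->instances = 0;
	st->registered = true;
}

static void toy_add_instance(struct toy_state *st)
{
	assert(st->registered);
	st->instances++;
}

static void toy_remove_instance(struct toy_state *st)
{
	assert(st->instances > 0);
	st->instances--;
}

/*
 * Tear the state down. Returns true when teardown was clean, false when
 * instances were still attached, which is the case the log complains about.
 */
static bool toy_remove_state(struct toy_state *st)
{
	bool clean = (st->instances == 0);

	if (!clean)
		fprintf(stderr,
			"Error: Removing state %d which has instances left.\n",
			st->id);
	st->registered = false;
	return clean;
}
```

In this model, module exit is only clean if every toy_add_instance() was paired with a toy_remove_instance() before toy_remove_state() runs; the warning in the log suggests that pairing did not hold at rmmod time.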
Luis