The following commit fixes freezes in virtio device drivers when KVM
is nested under VMware Workstation/ESXi or Hyper-V:

  d391f1207067268261add0485f0f34503539c5b0

I've encountered problems running KVM inside VMware since upgrading to
Debian 9 (currently testing 4.9.88-1+deb9u1). The same issue affects
4.4.y as well. A git-bisect within my environment stopped at
e9ea5069d9e569c32ab913c39467df32e056b3a7, the commit that added the KVM
capability QEMU checks before enabling fast mmio.
Thanks,
Mike
From: Dan Carpenter <dan.carpenter(a)oracle.com>
[ Upstream commit 1376b0a2160319125c3a2822e8c09bd283cd8141 ]
There is a '>' vs '<' typo, so this loop is a no-op.
Fixes: d35dcc89fc93 ("staging: comedi: quatech_daqp_cs: fix daqp_ao_insn_write()")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Ian Abbott <abbotti(a)mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/comedi/drivers/quatech_daqp_cs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/comedi/drivers/quatech_daqp_cs.c b/drivers/staging/comedi/drivers/quatech_daqp_cs.c
index b3bbec0a0d23..f89a863ea04c 100644
--- a/drivers/staging/comedi/drivers/quatech_daqp_cs.c
+++ b/drivers/staging/comedi/drivers/quatech_daqp_cs.c
@@ -649,7 +649,7 @@ static int daqp_ao_insn_write(struct comedi_device *dev,
/* Make sure D/A update mode is direct update */
outb(0, dev->iobase + DAQP_AUX);
- for (i = 0; i > insn->n; i++) {
+ for (i = 0; i < insn->n; i++) {
val = data[0];
val &= 0x0fff;
val ^= 0x0800; /* Flip the sign */
--
2.18.0
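For illustration, a minimal standalone sketch (not from the driver; "n"
stands in for insn->n) of why the original comparison made the loop a
no-op: with i starting at 0, "i > n" is false on the very first test,
so the body never executes.

#include <stdio.h>

int main(void)
{
	unsigned int i, n = 4;	/* n plays the role of insn->n */

	for (i = 0; i > n; i++)	/* condition false immediately: no-op */
		printf("broken loop: iteration %u\n", i);

	for (i = 0; i < n; i++)	/* runs n times, as intended */
		printf("fixed loop: iteration %u\n", i);

	return 0;
}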
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2d204ee9d671327915260071c19350d84344e096 Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Mon, 10 Sep 2018 14:12:07 +0300
Subject: [PATCH] cifs: integer overflow in SMB2_ioctl()
The "le32_to_cpu(rsp->OutputOffset) + *plen" addition can overflow and
wrap around to a smaller value which looks like it would lead to an
information leak.
Fixes: 4a72dafa19ba ("SMB2 FSCTL and IOCTL worker function")
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel(a)suse.com>
CC: Stable <stable(a)vger.kernel.org>
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 6f0e6b42599c..f54d07bda067 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -2459,14 +2459,14 @@ SMB2_ioctl(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid,
/* We check for obvious errors in the output buffer length and offset */
if (*plen == 0)
goto ioctl_exit; /* server returned no data */
- else if (*plen > 0xFF00) {
+ else if (*plen > rsp_iov.iov_len || *plen > 0xFF00) {
cifs_dbg(VFS, "srv returned invalid ioctl length: %d\n", *plen);
*plen = 0;
rc = -EIO;
goto ioctl_exit;
}
- if (rsp_iov.iov_len < le32_to_cpu(rsp->OutputOffset) + *plen) {
+ if (rsp_iov.iov_len - *plen < le32_to_cpu(rsp->OutputOffset)) {
cifs_dbg(VFS, "Malformed ioctl resp: len %d offset %d\n", *plen,
le32_to_cpu(rsp->OutputOffset));
*plen = 0;
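To see the wraparound the patch guards against, here is a minimal
standalone sketch (not part of the patch; the values are made up, and
32-bit unsigned arithmetic is assumed, which is what the u32
on-the-wire fields give you):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t iov_len = 0x1000;	/* length of the received response */
	uint32_t offset = 0xffffff00;	/* server-controlled OutputOffset */
	uint32_t len = 0x0200;		/* server-controlled output length,
					 * already checked to be <= iov_len */

	/* Old check: offset + len wraps around to 0x100, which is less
	 * than iov_len, so the malformed response is NOT caught. */
	printf("old check catches it: %d\n", iov_len < offset + len);

	/* New check: since len <= iov_len was verified first, the
	 * subtraction cannot wrap, and the bogus offset is caught. */
	printf("new check catches it: %d\n", iov_len - len < offset);

	return 0;
}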
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 00ee8b60102862f4daf0814d12a2ea2744fc0b9b Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard(a)nod.at>
Date: Mon, 11 Jun 2018 23:41:09 +0200
Subject: [PATCH] ubifs: Fix directory size calculation for symlinks
We have to account for the length of the symlink's name, not the length
of its target.
Fixes: ca7f85be8d6c ("ubifs: Add support for encrypted symlinks")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Richard Weinberger <richard(a)nod.at>
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index 9da224d4f2da..e8616040bffc 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -1123,8 +1123,7 @@ static int ubifs_symlink(struct inode *dir, struct dentry *dentry,
struct ubifs_inode *ui;
struct ubifs_inode *dir_ui = ubifs_inode(dir);
struct ubifs_info *c = dir->i_sb->s_fs_info;
- int err, len = strlen(symname);
- int sz_change = CALC_DENT_SIZE(len);
+ int err, sz_change, len = strlen(symname);
struct fscrypt_str disk_link;
struct ubifs_budget_req req = { .new_ino = 1, .new_dent = 1,
.new_ino_d = ALIGN(len, 8),
@@ -1151,6 +1150,8 @@ static int ubifs_symlink(struct inode *dir, struct dentry *dentry,
if (err)
goto out_budg;
+ sz_change = CALC_DENT_SIZE(fname_len(&nm));
+
inode = ubifs_new_inode(c, dir, S_IFLNK | S_IRWXUGO);
if (IS_ERR(inode)) {
err = PTR_ERR(inode);
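Concretely (illustrative numbers only): for "ln -s /quite/long/target n",
the old code sized the directory entry from the 18-character target
string, while the entry actually carries the 1-character name "n"; with
encrypted symlinks, the on-disk name produced by fscrypt can differ from
both, which is why the size is now computed from fname_len(&nm) after
the fscrypt name setup.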
Now, an MTD device emulated by a UBI volume doesn't allocate
wbuf_verify in jffs2_ubivol_setup(), because UBI can do the
verification itself. So when CONFIG_JFFS2_FS_WBUF_VERIFY is enabled and
an MTD device emulated by a UBI volume is used, an Oops will occur, as
shown in the following trace:
general protection fault: 0000 [#1] SMP KASAN PTI
CPU: 6 PID: 404 Comm: kworker/6:1 Not tainted 4.19.0-rc8
Workqueue: events_long delayed_wbuf_sync
RIP: 0010:ubi_io_read+0x156/0x650
Call Trace:
ubi_eba_read_leb+0x57d/0xba0
ubi_leb_read+0xe5/0x1b0
gluebi_read+0x10c/0x1a0
mtd_read+0x112/0x340
jffs2_verify_write+0xef/0x440
__jffs2_flush_wbuf+0x3fa/0x3540
jffs2_flush_wbuf_gc+0x1b1/0x2e0
process_one_work+0x58b/0x11e0
worker_thread+0x8f/0xfe0
kthread+0x2ae/0x3a0
ret_from_fork+0x35/0x40
Fix the problem by checking the validity of wbuf_verify before
using it in jffs2_verify_write().
Cc: stable(a)vger.kernel.org
Fixes: 0029da3bf430 ("JFFS2: add UBI support")
Signed-off-by: Hou Tao <houtao1(a)huawei.com>
---
fs/jffs2/wbuf.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/jffs2/wbuf.c b/fs/jffs2/wbuf.c
index c6821a509481..3de45f4559d1 100644
--- a/fs/jffs2/wbuf.c
+++ b/fs/jffs2/wbuf.c
@@ -234,6 +234,13 @@ static int jffs2_verify_write(struct jffs2_sb_info *c, unsigned char *buf,
size_t retlen;
char *eccstr;
+ /*
+ * An MTD device emulated by a UBI volume doesn't allocate wbuf_verify,
+ * because UBI can do the verification itself.
+ */
+ if (!c->wbuf_verify)
+ return 0;
+
ret = mtd_read(c->mtd, ofs, c->wbuf_pagesize, &retlen, c->wbuf_verify);
if (ret && ret != -EUCLEAN && ret != -EBADMSG) {
pr_warn("%s(): Read back of page at %08x failed: %d\n",
--
2.16.2.dirty
From: Coly Li <colyli(a)suse.de>
Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle")
allows the writeback rate to be faster if there is no I/O request on a
bcache device. It works well if there is only one bcache device attached
to the cache set. If there are many bcache devices attached to a cache
set, it may introduce a performance regression, because the multiple
faster writeback threads of the idle bcache devices will compete for the
btree-level locks with the bcache devices that have I/O requests coming
in.

This patch fixes the above issue by only permitting fast writeback when
all bcache devices attached to the cache set are idle. If one of the
bcache devices gets a new I/O request, all writeback throughput is
minimized immediately and the PI controller __update_writeback_rate()
decides the upcoming writeback rate for each bcache device.

Also, when all bcache devices are idle, limiting the writeback rate to a
small number wastes throughput, especially when the backing devices are
slower non-rotational devices (e.g. SATA SSDs). This patch sets a
maximum writeback rate for each backing device when the whole cache set
is idle. A faster writeback rate in idle time means new I/Os may find
more available space for dirty data, so people may observe better write
performance then.

Please note that bcache may change its cache mode at run time, and this
patch still works if the cache mode is switched away from writeback mode
while there is still dirty data on the cache.

Fixes: b1092c9af9ed ("bcache: allow quick writeback when backing idle")
Cc: stable(a)vger.kernel.org #4.16+
Signed-off-by: Coly Li <colyli(a)suse.de>
Tested-by: Kai Krakow <kai(a)kaishome.de>
Tested-by: Stefan Priebe <s.priebe(a)profihost.ag>
Cc: Michael Lyle <mlyle(a)lyle.org>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
(cherry picked from commit ea8c5356d39048bc94bae068228f51ddbecc6b89)
Signed-off-by: Kai Krakow <kai(a)kaishome.de>
---
drivers/md/bcache/bcache.h | 10 ++---
drivers/md/bcache/request.c | 54 ++++++++++++++++++++++++-
drivers/md/bcache/super.c | 4 ++
drivers/md/bcache/sysfs.c | 14 +++++--
drivers/md/bcache/util.c | 2 +-
drivers/md/bcache/util.h | 2 +-
drivers/md/bcache/writeback.c | 91 +++++++++++++++++++++++++++++--------------
7 files changed, 133 insertions(+), 44 deletions(-)
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index d6bf294f3907..6ba41887664a 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -328,13 +328,6 @@ struct cached_dev {
*/
atomic_t has_dirty;
- /*
- * Set to zero by things that touch the backing volume-- except
- * writeback. Incremented by writeback. Used to determine when to
- * accelerate idle writeback.
- */
- atomic_t backing_idle;
-
struct bch_ratelimit writeback_rate;
struct delayed_work writeback_rate_update;
@@ -514,6 +507,8 @@ struct cache_set {
struct cache_accounting accounting;
unsigned long flags;
+ atomic_t idle_counter;
+ atomic_t at_max_writeback_rate;
struct cache_sb sb;
@@ -523,6 +518,7 @@ struct cache_set {
struct bcache_device **devices;
unsigned devices_max_used;
+ atomic_t attached_dev_nr;
struct list_head cached_devs;
uint64_t cached_dev_sectors;
struct closure caching;
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index ae67f5fa8047..6e08eb89abee 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -1102,6 +1102,44 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio)
generic_make_request(bio);
}
+static void quit_max_writeback_rate(struct cache_set *c,
+ struct cached_dev *this_dc)
+{
+ int i;
+ struct bcache_device *d;
+ struct cached_dev *dc;
+
+ /*
+ * The mutex bch_register_lock may be contended by other parallel
+ * requesters, or by attach/detach operations on other backing
+ * devices. Waiting for the mutex lock may increase I/O request
+ * latency for seconds or more. To avoid such a situation, if
+ * mutex_trylock() fails, only the writeback rate of the current
+ * cached device is set to 1, and __update_writeback_rate() will
+ * decide the writeback rate of the other cached devices (remember
+ * that c->idle_counter is already 0).
+ */
+ if (mutex_trylock(&bch_register_lock)) {
+ for (i = 0; i < c->devices_max_used; i++) {
+ if (!c->devices[i])
+ continue;
+
+ if (UUID_FLASH_ONLY(&c->uuids[i]))
+ continue;
+
+ d = c->devices[i];
+ dc = container_of(d, struct cached_dev, disk);
+ /*
+ * set writeback rate to default minimum value,
+ * then let update_writeback_rate() decide the
+ * upcoming rate.
+ */
+ atomic_long_set(&dc->writeback_rate.rate, 1);
+ }
+ mutex_unlock(&bch_register_lock);
+ } else
+ atomic_long_set(&this_dc->writeback_rate.rate, 1);
+}
+
/* Cached devices - read & write stuff */
static blk_qc_t cached_dev_make_request(struct request_queue *q,
@@ -1119,7 +1157,21 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
return BLK_QC_T_NONE;
}
- atomic_set(&dc->backing_idle, 0);
+ if (likely(d->c)) {
+ if (atomic_read(&d->c->idle_counter))
+ atomic_set(&d->c->idle_counter, 0);
+ /*
+ * If at_max_writeback_rate of the cache set is true and new I/O
+ * comes, quit the max writeback rate of all cached devices
+ * attached to this cache set, and set at_max_writeback_rate
+ * to false.
+ */
+ if (unlikely(atomic_read(&d->c->at_max_writeback_rate) == 1)) {
+ atomic_set(&d->c->at_max_writeback_rate, 0);
+ quit_max_writeback_rate(d->c, dc);
+ }
+ }
+
generic_start_io_acct(q, rw, bio_sectors(bio), &d->disk->part0);
bio_set_dev(bio, dc->bdev);
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index fa4058e43202..dc7b6131ddbb 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -696,6 +696,8 @@ static void bcache_device_detach(struct bcache_device *d)
{
lockdep_assert_held(&bch_register_lock);
+ atomic_dec(&d->c->attached_dev_nr);
+
if (test_bit(BCACHE_DEV_DETACHING, &d->flags)) {
struct uuid_entry *u = d->c->uuids + d->id;
@@ -1138,6 +1140,7 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
bch_cached_dev_run(dc);
bcache_device_link(&dc->disk, c, "bdev");
+ atomic_inc(&c->attached_dev_nr);
/* Allow the writeback thread to proceed */
up_write(&dc->writeback_lock);
@@ -1687,6 +1690,7 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
c->block_bits = ilog2(sb->block_size);
c->nr_uuids = bucket_bytes(c) / sizeof(struct uuid_entry);
c->devices_max_used = 0;
+ atomic_set(&c->attached_dev_nr, 0);
c->btree_pages = bucket_pages(c);
if (c->btree_pages > BTREE_MAX_PAGES)
c->btree_pages = max_t(int, c->btree_pages / 4,
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 225b15aa0340..a56067e80b10 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -170,7 +170,8 @@ SHOW(__bch_cached_dev)
var_printf(writeback_running, "%i");
var_print(writeback_delay);
var_print(writeback_percent);
- sysfs_hprint(writeback_rate, dc->writeback_rate.rate << 9);
+ sysfs_hprint(writeback_rate,
+ atomic_long_read(&dc->writeback_rate.rate) << 9);
sysfs_hprint(io_errors, atomic_read(&dc->io_errors));
sysfs_printf(io_error_limit, "%i", dc->error_limit);
sysfs_printf(io_disable, "%i", dc->io_disable);
@@ -188,7 +189,8 @@ SHOW(__bch_cached_dev)
char change[20];
s64 next_io;
- bch_hprint(rate, dc->writeback_rate.rate << 9);
+ bch_hprint(rate,
+ atomic_long_read(&dc->writeback_rate.rate) << 9);
bch_hprint(dirty, bcache_dev_sectors_dirty(&dc->disk) << 9);
bch_hprint(target, dc->writeback_rate_target << 9);
bch_hprint(proportional,dc->writeback_rate_proportional << 9);
@@ -255,8 +257,12 @@ STORE(__cached_dev)
sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent, 0, 40);
- sysfs_strtoul_clamp(writeback_rate,
- dc->writeback_rate.rate, 1, INT_MAX);
+ if (attr == &sysfs_writeback_rate) {
+ int v;
+
+ sysfs_strtoul_clamp(writeback_rate, v, 1, INT_MAX);
+ atomic_long_set(&dc->writeback_rate.rate, v);
+ }
sysfs_strtoul_clamp(writeback_rate_update_seconds,
dc->writeback_rate_update_seconds,
diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index fc479b026d6d..b15256bcf0e7 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -200,7 +200,7 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
{
uint64_t now = local_clock();
- d->next += div_u64(done * NSEC_PER_SEC, d->rate);
+ d->next += div_u64(done * NSEC_PER_SEC, atomic_long_read(&d->rate));
/* Bound the time. Don't let us fall further than 2 seconds behind
* (this prevents unnecessary backlog that would make it impossible
diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h
index cced87f8eb27..f7b0133c9d2f 100644
--- a/drivers/md/bcache/util.h
+++ b/drivers/md/bcache/util.h
@@ -442,7 +442,7 @@ struct bch_ratelimit {
* Rate at which we want to do work, in units per second
* The units here correspond to the units passed to bch_next_delay()
*/
- uint32_t rate;
+ atomic_long_t rate;
};
static inline void bch_ratelimit_reset(struct bch_ratelimit *d)
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index ad45ebe1a74b..9f5e33324d1d 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -104,11 +104,56 @@ static void __update_writeback_rate(struct cached_dev *dc)
dc->writeback_rate_proportional = proportional_scaled;
dc->writeback_rate_integral_scaled = integral_scaled;
- dc->writeback_rate_change = new_rate - dc->writeback_rate.rate;
- dc->writeback_rate.rate = new_rate;
+ dc->writeback_rate_change = new_rate -
+ atomic_long_read(&dc->writeback_rate.rate);
+ atomic_long_set(&dc->writeback_rate.rate, new_rate);
dc->writeback_rate_target = target;
}
+static bool set_at_max_writeback_rate(struct cache_set *c,
+ struct cached_dev *dc)
+{
+ /*
+ * idle_counter is increased every time update_writeback_rate() is
+ * called. If all backing devices attached to the same cache set have
+ * identical dc->writeback_rate_update_seconds values, it takes about 6
+ * rounds of update_writeback_rate() on each backing device before
+ * c->at_max_writeback_rate is set to 1, and then the max writeback
+ * rate is set in each dc->writeback_rate.rate.
+ * In order to avoid the extra locking cost of counting the exact
+ * number of dirty cached devices, c->attached_dev_nr is used to
+ * calculate the idle threshold. It might be bigger if not all cached
+ * devices are in writeback mode, but it still works well with a
+ * limited number of extra rounds of update_writeback_rate().
+ */
+ if (atomic_inc_return(&c->idle_counter) <
+ atomic_read(&c->attached_dev_nr) * 6)
+ return false;
+
+ if (atomic_read(&c->at_max_writeback_rate) != 1)
+ atomic_set(&c->at_max_writeback_rate, 1);
+
+ atomic_long_set(&dc->writeback_rate.rate, INT_MAX);
+
+ /* keep writeback_rate_target as existing value */
+ dc->writeback_rate_proportional = 0;
+ dc->writeback_rate_integral_scaled = 0;
+ dc->writeback_rate_change = 0;
+
+ /*
+ * Check c->idle_counter and c->at_max_writeback_rate again in case
+ * new I/O arrived before set_at_max_writeback_rate() returns. In
+ * that case the writeback rate has been set to 1, and its new value
+ * should be decided via __update_writeback_rate().
+ */
+ if ((atomic_read(&c->idle_counter) <
+ atomic_read(&c->attached_dev_nr) * 6) ||
+ !atomic_read(&c->at_max_writeback_rate))
+ return false;
+
+ return true;
+}
+
static void update_writeback_rate(struct work_struct *work)
{
struct cached_dev *dc = container_of(to_delayed_work(work),
@@ -136,13 +181,20 @@ static void update_writeback_rate(struct work_struct *work)
return;
}
- down_read(&dc->writeback_lock);
+ if (atomic_read(&dc->has_dirty) && dc->writeback_percent) {
+ /*
+ * If the whole cache set is idle, set_at_max_writeback_rate()
+ * will set the writeback rate to a max number. Then it is
+ * unnecessary to update the writeback rate for an idle cache
+ * set that is already at the maximum writeback rate.
+ */
+ if (!set_at_max_writeback_rate(c, dc)) {
+ down_read(&dc->writeback_lock);
+ __update_writeback_rate(dc);
+ up_read(&dc->writeback_lock);
+ }
+ }
- if (atomic_read(&dc->has_dirty) &&
- dc->writeback_percent)
- __update_writeback_rate(dc);
-
- up_read(&dc->writeback_lock);
/*
* CACHE_SET_IO_DISABLE might be set via sysfs interface,
@@ -422,27 +474,6 @@ static void read_dirty(struct cached_dev *dc)
delay = writeback_delay(dc, size);
- /* If the control system would wait for at least half a
- * second, and there's been no reqs hitting the backing disk
- * for awhile: use an alternate mode where we have at most
- * one contiguous set of writebacks in flight at a time. If
- * someone wants to do IO it will be quick, as it will only
- * have to contend with one operation in flight, and we'll
- * be round-tripping data to the backing disk as quickly as
- * it can accept it.
- */
- if (delay >= HZ / 2) {
- /* 3 means at least 1.5 seconds, up to 7.5 if we
- * have slowed way down.
- */
- if (atomic_inc_return(&dc->backing_idle) >= 3) {
- /* Wait for current I/Os to finish */
- closure_sync(&cl);
- /* And immediately launch a new set. */
- delay = 0;
- }
- }
-
while (!kthread_should_stop() &&
!test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) &&
delay) {
@@ -715,7 +746,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc)
dc->writeback_running = true;
dc->writeback_percent = 10;
dc->writeback_delay = 30;
- dc->writeback_rate.rate = 1024;
+ atomic_long_set(&dc->writeback_rate.rate, 1024);
dc->writeback_rate_minimum = 8;
dc->writeback_rate_update_seconds = WRITEBACK_RATE_UPDATE_SECS_DEFAULT;
--
2.16.4
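As a rough worked example of the idle heuristic in the hunks above:
with three backing devices attached (c->attached_dev_nr == 3), each
periodic update_writeback_rate() call increments c->idle_counter, so
the threshold of 3 * 6 = 18 increments is crossed after about six quiet
update periods per device. A minimal user-space sketch of the same
lock-free pattern (C11 atomics standing in for the kernel's atomic_t;
names are illustrative, not bcache's):

#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative stand-ins for the cache set fields used by the patch. */
struct cache_set_sketch {
	atomic_int idle_counter;
	atomic_int at_max_writeback_rate;
	atomic_int attached_dev_nr;
};

/* Called from each device's periodic rate update, mirroring
 * set_at_max_writeback_rate(): declare the whole set idle only after
 * roughly 6 quiet rounds per attached device. */
static bool try_enter_max_rate(struct cache_set_sketch *c)
{
	if (atomic_fetch_add(&c->idle_counter, 1) + 1 <
	    atomic_load(&c->attached_dev_nr) * 6)
		return false;

	atomic_store(&c->at_max_writeback_rate, 1);

	/* Re-check: new I/O may have zeroed the counters meanwhile. */
	if (atomic_load(&c->idle_counter) <
	    atomic_load(&c->attached_dev_nr) * 6 ||
	    !atomic_load(&c->at_max_writeback_rate))
		return false;

	return true;
}

/* Called on every incoming I/O, mirroring cached_dev_make_request(). */
static void note_incoming_io(struct cache_set_sketch *c)
{
	atomic_store(&c->idle_counter, 0);
	atomic_store(&c->at_max_writeback_rate, 0);
}

int main(void)
{
	struct cache_set_sketch c = {0};

	atomic_store(&c.attached_dev_nr, 3);

	/* 18 quiet update rounds (3 devices * 6) reach the threshold. */
	for (int i = 0; i < 18; i++)
		try_enter_max_rate(&c);

	note_incoming_io(&c);	/* new I/O drops back to the PI controller */
	return 0;
}

The double check at the end of try_enter_max_rate() mirrors the race
handling in the patch: if I/O sneaks in between the first test and the
store, the caller falls back to __update_writeback_rate().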
Commit a9c8088c7988 ("i2c: i801: Don't restore config registers on
runtime PM") nullified the runtime PM suspend/resume callback pointers
while keeping runtime PM enabled.

This causes the SMBus PCI device to stay in the D0 power state, and
sysfs /sys/bus/pci/devices/[SMBus PCI ID]/power/runtime_status to show
"error" when the runtime PM framework attempts to autosuspend the
device. This is due to PCI bus runtime PM, which checks for driver
runtime PM callbacks and returns -ENOSYS if they are not set.

Since i2c-i801.c doesn't need to do anything device-specific beyond PCI
device power state management, Jean Delvare proposed that this be fixed
at the PCI subsystem core level rather than by adding dummy runtime PM
callback functions to the PCI drivers.

Change the pci_pm_runtime_suspend()/pci_pm_runtime_resume() semantics
so that they allow changing the PCI device power state during runtime
PM transitions even if no runtime PM callback functions are defined.

This change fixes the runtime PM regression on i2c-i801.c.

It is not obvious why the code had a hard requirement for the runtime
PM callbacks. The test has been there since the code was introduced by
commit 6cbf82148ff2 ("PCI PM: Run-time callbacks for PCI bus type").
On the other hand, a similar change was made to the generic runtime PM
callbacks way back in commit 05aa55dddb9e ("PM / Runtime: Lenient
generic runtime pm callbacks").
Fixes: a9c8088c7988 ("i2c: i801: Don't restore config registers on runtime PM")
Reported-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # 4.18+
Signed-off-by: Jarkko Nikula <jarkko.nikula(a)linux.intel.com>
---
I Cc'ed stable since this fixes the regression on i2c-i801.c. But we
probably want to get some test coverage first before applying it to
stable. Queueing it for v4.20 sounds reasonable to me.
v2:
The previous version had a potential NULL dereference in the
WARN_ONCE() statement, noted by Jean Delvare. It is now covered by the
pm && pm->runtime_suspend test. The handling of the error code from
pm->runtime_suspend() has also moved into the same code block where the
callback is called.
v1:
This is related to my i2c-i801.c fix thread back in June, which I
completely forgot about till now: https://lkml.org/lkml/2018/6/27/642
The discussion back then was that it should be handled in the PCI PM
core instead of having dummy functions in the drivers. I wanted to
respin it as a patch.
---
drivers/pci/pci-driver.c | 27 ++++++++++++---------------
1 file changed, 12 insertions(+), 15 deletions(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index bef17c3fca67..33f3f475e5c6 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1251,30 +1251,29 @@ static int pci_pm_runtime_suspend(struct device *dev)
return 0;
}
- if (!pm || !pm->runtime_suspend)
- return -ENOSYS;
-
pci_dev->state_saved = false;
- error = pm->runtime_suspend(dev);
- if (error) {
+ if (pm && pm->runtime_suspend) {
+ error = pm->runtime_suspend(dev);
/*
* -EBUSY and -EAGAIN is used to request the runtime PM core
* to schedule a new suspend, so log the event only with debug
* log level.
*/
- if (error == -EBUSY || error == -EAGAIN)
+ if (error == -EBUSY || error == -EAGAIN) {
dev_dbg(dev, "can't suspend now (%pf returned %d)\n",
pm->runtime_suspend, error);
- else
+ return error;
+ } else if (error) {
dev_err(dev, "can't suspend (%pf returned %d)\n",
pm->runtime_suspend, error);
-
- return error;
+ return error;
+ }
}
pci_fixup_device(pci_fixup_suspend, pci_dev);
- if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0
+ if (pm && pm->runtime_suspend
+ && !pci_dev->state_saved && pci_dev->current_state != PCI_D0
&& pci_dev->current_state != PCI_UNKNOWN) {
WARN_ONCE(pci_dev->current_state != prev,
"PCI PM: State of device not saved by %pF\n",
@@ -1292,7 +1291,7 @@ static int pci_pm_runtime_suspend(struct device *dev)
static int pci_pm_runtime_resume(struct device *dev)
{
- int rc;
+ int rc = 0;
struct pci_dev *pci_dev = to_pci_dev(dev);
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
@@ -1306,14 +1305,12 @@ static int pci_pm_runtime_resume(struct device *dev)
if (!pci_dev->driver)
return 0;
- if (!pm || !pm->runtime_resume)
- return -ENOSYS;
-
pci_fixup_device(pci_fixup_resume_early, pci_dev);
pci_enable_wake(pci_dev, PCI_D0, false);
pci_fixup_device(pci_fixup_resume, pci_dev);
- rc = pm->runtime_resume(dev);
+ if (pm && pm->runtime_resume)
+ rc = pm->runtime_resume(dev);
pci_dev->runtime_d3cold = false;
--
2.19.1
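The net effect is the "lenient callback" pattern: the driver callback
becomes optional, and bus-level power state handling runs either way. A
condensed user-space sketch of the pattern (illustrative types and
names, not the actual pci-driver.c code):

#include <stdio.h>

/* Illustrative stand-in for struct dev_pm_ops. */
struct pm_ops_sketch {
	int (*runtime_suspend)(void *dev);
};

static int runtime_suspend_lenient(void *dev,
				   const struct pm_ops_sketch *pm)
{
	int error = 0;

	/* Before the patch: a missing callback meant -ENOSYS and the
	 * device stayed in D0. Now the callback is simply optional. */
	if (pm && pm->runtime_suspend)
		error = pm->runtime_suspend(dev);
	if (error)
		return error;

	/* Bus-level power state management happens regardless of
	 * whether the driver supplied a callback. */
	printf("placing device in a low-power state\n");
	return 0;
}

int main(void)
{
	/* A driver with no runtime PM callbacks can now suspend. */
	return runtime_suspend_lenient(NULL, NULL);
}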