Commit 4aae4388165a2611fa4206363ccb243c1622446c ("nvme: fix hang in remove path"), which was introduced in Linux 4.9.94, changed nvme_kill_queues() to also forcibly start admin queues in order to avoid getting stuck during device removal.
If a device is being removed because it did not respond during device initialization (e.g., if it is not ready yet at boot time), we will end up trying to start an admin queue that has not yet been set up at all. This attempt will lead to a NULL pointer dereference.
To avoid hitting this bug, we add a sanity check around the invocation of blk_mq_start_hw_queues() to ensure that the admin queue has actually been set up already.
Upstream already has this check in place since commit 7dd1ab163c17e11473a65b11f7e748db30618ebb ("nvme: validate admin queue before unquiesce"), and thus 4.14 contains it as well. Linux 4.4 is not affected by this particular issue since it does not have the force-start behavior yet.
Fixes: 4aae4388165a2611fa42 ("nvme: fix hang in remove path")
Signed-off-by: Simon Veith <sveith@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 drivers/nvme/host/core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c823e93..8a30478 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2041,8 +2041,10 @@ void nvme_kill_queues(struct nvme_ctrl *ctrl)
 	mutex_lock(&ctrl->namespaces_mutex);
 
-	/* Forcibly start all queues to avoid having stuck requests */
-	blk_mq_start_hw_queues(ctrl->admin_q);
+	if (ctrl->admin_q) {
+		/* Forcibly start all queues to avoid having stuck requests */
+		blk_mq_start_hw_queues(ctrl->admin_q);
+	}
 
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
 		/*
On Wed, Jul 11, 2018 at 03:21:52PM +0200, Simon Veith wrote:
> Commit 4aae4388165a2611fa4206363ccb243c1622446c ("nvme: fix hang in remove path"), which was introduced in Linux 4.9.94, changed nvme_kill_queues() to also forcibly start admin queues in order to avoid getting stuck during device removal.
> 
> If a device is being removed because it did not respond during device initialization (e.g., if it is not ready yet at boot time), we will end up trying to start an admin queue that has not yet been set up at all. This attempt will lead to a NULL pointer dereference.
> 
> To avoid hitting this bug, we add a sanity check around the invocation of blk_mq_start_hw_queues() to ensure that the admin queue has actually been set up already.
> 
> Upstream already has this check in place since commit 7dd1ab163c17e11473a65b11f7e748db30618ebb ("nvme: validate admin queue before unquiesce"), and thus 4.14 contains it as well. Linux 4.4 is not affected by this particular issue since it does not have the force-start behavior yet.
> 
> Fixes: 4aae4388165a2611fa42 ("nvme: fix hang in remove path")
> Signed-off-by: Simon Veith <sveith@amazon.de>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  drivers/nvme/host/core.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index c823e93..8a30478 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2041,8 +2041,10 @@ void nvme_kill_queues(struct nvme_ctrl *ctrl)
>  	mutex_lock(&ctrl->namespaces_mutex);
>  
> -	/* Forcibly start all queues to avoid having stuck requests */
> -	blk_mq_start_hw_queues(ctrl->admin_q);
> +	if (ctrl->admin_q) {
> +		/* Forcibly start all queues to avoid having stuck requests */
> +		blk_mq_start_hw_queues(ctrl->admin_q);
> +	}
Why have you rewritten commit 7dd1ab163c17 ("nvme: validate admin queue before unquiesce") here? Why not just backport it directly?
confused,
greg k-h
On Wed, 2018-07-11 at 15:41 +0200, Greg KH wrote:
> Why have you rewritten commit 7dd1ab163c17 ("nvme: validate admin queue before unquiesce") here? Why not just backport it directly?
Initially, it seemed clearer to me to rewrite it as a standalone patch: it did not apply cleanly because the functions being patched differ, and 4.9 lacks commit 443bd90f2cca ("nvme: host: unquiesce queue in nvme_kill_queues()"), which the upstream commit's Fixes tag refers to, yet 4.9 is affected nonetheless.
7dd1ab163c17e114 ("nvme: validate admin queue before unquiesce") patches the blk_mq_unquiesce_queue() call, which upstream has but 4.9 doesn't -- that call is added in 443bd90f2cca9dec ("nvme: host: unquiesce queue in nvme_kill_queues()").
My proposed patch adds the check before the blk_mq_start_hw_queues() call, which 4.9 has, but upstream no longer does since 8d7b8fafad87c340 ("nvme: kick requeue list when requeueing a request instead of when starting the queues").
I agree that both patches essentially do the same thing and fix the same error symptom. I will prepare a backported version of 7dd1ab163c17e114 instead to preserve the history from the upstream commit.
Simon

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
From: Scott Bauer <scott.bauer@intel.com>
commit 7dd1ab163c17e11473a65b11f7e748db30618ebb upstream.
With a misbehaving controller it's possible we'll never enter the live state and create an admin queue. When we fail out of reset work it's possible we failed out early enough without setting up the admin queue. We tear down queues after a failed reset, but needed to do some more sanitization.
Fixes 443bd90f2cca: "nvme: host: unquiesce queue in nvme_kill_queues()"
[  189.650995] nvme nvme1: pci function 0000:0b:00.0
[  317.680055] nvme nvme0: Device not ready; aborting reset
[  317.680183] nvme nvme0: Removing after probe failure status: -19
[  317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access
[  317.681397] general protection fault: 0000 [#1] SMP KASAN
[  317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5
[  317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016
[  317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
[  317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000
[  317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0
[  317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282
[  317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3
[  317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000
[  317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000
[  317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0
[  317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0
[  317.684371] FS:  0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000
[  317.684519] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0
[  317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  317.685018] Call Trace:
[  317.685084]  nvme_kill_queues+0x4d/0x170 [nvme_core]
[  317.685185]  nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme]
[  317.685289]  process_one_work+0x771/0x1170
[  317.685372]  worker_thread+0xde/0x11e0
[  317.685452]  ? pci_mmcfg_check_reserved+0x110/0x110
[  317.685550]  kthread+0x2d3/0x3d0
[  317.685617]  ? process_one_work+0x1170/0x1170
[  317.685704]  ? kthread_create_on_node+0xc0/0xc0
[  317.685785]  ret_from_fork+0x25/0x30
[  317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3
[  317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40
[  317.685908] ---[ end trace a3f8704150b1e8b4 ]---
Signed-off-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[ adapted for 4.9: added check around blk_mq_start_hw_queues() call instead of upstream blk_mq_unquiesce_queue() ]
Fixes: 4aae4388165a2611fa42 ("nvme: fix hang in remove path")
Signed-off-by: Simon Veith <sveith@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Amit Shah <aams@amazon.com>
---
 drivers/nvme/host/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c823e93..979c6ec 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2042,7 +2042,8 @@ void nvme_kill_queues(struct nvme_ctrl *ctrl)
 	mutex_lock(&ctrl->namespaces_mutex);
 
 	/* Forcibly start all queues to avoid having stuck requests */
-	blk_mq_start_hw_queues(ctrl->admin_q);
+	if (ctrl->admin_q)
+		blk_mq_start_hw_queues(ctrl->admin_q);
 
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
 		/*