From: Yu Kuai yukuai3@huawei.com
One of our products reported an I/O hang; it turns out the problem can be fixed by this patch.
I'm not sure why this patch has not been backported yet; please consider it for 4.19 LTS.
Ming Lei (1):
  scsi: core: Fix race between handling STS_RESOURCE and completion

 drivers/scsi/scsi_lib.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
From: Ming Lei ming.lei@redhat.com
commit 673235f915318ced5d7ec4b2bfd8cb909e6a4a55 upstream.
When queuing I/O request to LLD, STS_RESOURCE may be returned because:
- Host is in recovery or blocked
- Target queue throttling or target is blocked
- LLD rejection
In these scenarios BLK_STS_DEV_RESOURCE is returned to the block layer to avoid an unnecessary re-run of the queue. However, all of the requests queued to this SCSI device may complete immediately after reading 'sdev->device_busy' and BLK_STS_DEV_RESOURCE is returned to block layer. In that case the current I/O won't get a chance to get queued since it is invisible at that time for both scsi_run_queue_async() and blk-mq's RESTART.
Fix the issue by not returning BLK_STS_DEV_RESOURCE in this situation.
Link: https://lore.kernel.org/r/20201202100419.525144-1-ming.lei@redhat.com
Fixes: 86ff7c2a80cd ("blk-mq: introduce BLK_STS_DEV_RESOURCE")
Cc: Hannes Reinecke hare@suse.com
Cc: Sumit Saxena sumit.saxena@broadcom.com
Cc: Kashyap Desai kashyap.desai@broadcom.com
Cc: Bart Van Assche bvanassche@acm.org
Cc: Ewan Milne emilne@redhat.com
Cc: Long Li longli@microsoft.com
Reported-by: John Garry john.garry@huawei.com
Tested-by: "chenxiang (M)" chenxiang66@hisilicon.com
Signed-off-by: Ming Lei ming.lei@redhat.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Yu Kuai yukuai3@huawei.com
---
 drivers/scsi/scsi_lib.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0191708c9dd4..ace4a7230bcf 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2157,8 +2157,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 	case BLK_STS_OK:
 		break;
 	case BLK_STS_RESOURCE:
-		if (atomic_read(&sdev->device_busy) ||
-		    scsi_device_blocked(sdev))
+		if (scsi_device_blocked(sdev))
 			ret = BLK_STS_DEV_RESOURCE;
 		break;
 	default:
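For readers following the thread, the race described in the commit message can be sketched as a small user-space model. This is only an illustration, not kernel code: the enum `model_status` and the functions `status_before_patch()`, `status_after_patch()` and `request_stalls()` are hypothetical names invented here. The model assumes that a request returned with the DEV_RESOURCE status is only re-run by a later completion on the same device, and that a blocked device is re-run through a separate path that this patch does not touch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two blk_status_t values involved. */
enum model_status {
	MODEL_STS_RESOURCE,     /* block layer re-runs the queue on its own */
	MODEL_STS_DEV_RESOURCE, /* block layer relies on a later completion */
};

/* Before the patch: a non-zero busy count also selected DEV_RESOURCE. */
static enum model_status status_before_patch(int device_busy, bool device_blocked)
{
	if (device_busy || device_blocked)
		return MODEL_STS_DEV_RESOURCE;
	return MODEL_STS_RESOURCE;
}

/* After the patch: only a blocked device selects DEV_RESOURCE. */
static enum model_status status_after_patch(int device_busy, bool device_blocked)
{
	(void)device_busy; /* the busy count is no longer consulted */
	if (device_blocked)
		return MODEL_STS_DEV_RESOURCE;
	return MODEL_STS_RESOURCE;
}

/*
 * Model assumption: the request stalls when DEV_RESOURCE was chosen from a
 * stale busy count, every in-flight command completed before the status
 * reached the block layer, and the device is not blocked. No completion is
 * left to re-run the queue in that case.
 */
static bool request_stalls(enum model_status st, int busy_when_returned,
			   bool device_blocked)
{
	return st == MODEL_STS_DEV_RESOURCE &&
	       busy_when_returned == 0 && !device_blocked;
}
```

In this model, `request_stalls(status_before_patch(1, false), 0, false)` is true, which corresponds to the reported hang: the busy count was read as 1, then dropped to 0 before the status was returned. With `status_after_patch()` the same inputs yield the plain RESOURCE status, so the model reports no stall.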
On Sat, Jul 30, 2022 at 04:46:50PM +0800, Yu Kuai wrote:
> From: Yu Kuai yukuai3@huawei.com
> One of our products reported an I/O hang; it turns out the problem can be fixed by this patch.
> I'm not sure why this patch has not been backported yet; please consider it for 4.19 LTS.
It was not backported as it did not apply as-is. Can you also provide a version for 5.4.y so that if someone were to upgrade to a newer kernel version, they would not have a regression? Once we have that, then we can accept this 4.19.y version.
thanks,
greg k-h
Hi, Greg
On 2022/07/31 18:46, Greg KH wrote:
> On Sat, Jul 30, 2022 at 04:46:50PM +0800, Yu Kuai wrote:
>> From: Yu Kuai yukuai3@huawei.com
>> One of our products reported an I/O hang; it turns out the problem can be fixed by this patch.
>> I'm not sure why this patch has not been backported yet; please consider it for 4.19 LTS.
> It was not backported as it did not apply as-is. Can you also provide a version for 5.4.y so that if someone were to upgrade to a newer kernel version, they would not have a regression? Once we have that, then we can accept this 4.19.y version.
Of course, thanks for the notice.
Kuai
> thanks,
> greg k-h
linux-stable-mirror@lists.linaro.org