On Mon, May 18, 2020 at 07:36:34PM +0200, Greg Kroah-Hartman wrote:
From: Christoph Hellwig <hch@lst.de>
commit 287922eb0b186e2a5bf54fdd04b734c25c90035c upstream.
I notice 287922eb0b18 is referenced in the Fixes: tag of mainline commit 5480e299b5ae ("scsi: iscsi: Fix a potential deadlock in the timeout handler"). Please consider whether backporting 5480e299b5ae together with this 4.4 version of 287922eb0b18 is also relevant.
Thanks, -- Henri
Timer context is not very useful for drivers to perform any meaningful abort action from. So instead of calling the driver from this useless context defer it to a workqueue as soon as possible.
Note that while a delayed_work item would seem the right thing here I didn't dare to use it due to the magic in blk_add_timer that pokes deep into timer internals. But maybe this encourages Tejun to add a sensible API for that to the workqueue API and we'll all be fine in the end :)
Contains a major update from Keith Bush:
"This patch removes synchronizing the timeout work so that the timer can start a freeze on its own queue. The timer enters the queue, so timer context can only start a freeze, but not wait for frozen."
NOTE: Back-ported to 4.4.y.
The only parts of the upstream commit that have been kept are various locking changes, none of which were mentioned in the original commit message which therefore describes this change not at all.
Timeout callbacks continue to be run via a timer. Both blk_mq_rq_timer and blk_rq_timed_out_timer will return without doing any work if they cannot acquire the queue (without waiting).
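
For readers who want the "return without doing any work if the queue cannot be acquired without waiting" behaviour spelled out, here is a minimal userspace sketch of that pattern. It is only a model, not the kernel code: struct toy_queue, toy_queue_try_enter(), toy_queue_exit() and toy_timeout_timer() are hypothetical names invented for illustration, and the frozen flag plus user counter are an assumed simplification of how the real queue tracks entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request queue; NOT the kernel's struct. */
struct toy_queue {
	atomic_bool frozen;      /* set while the queue is frozen */
	atomic_int users;        /* contexts currently inside the queue */
	int timeouts_handled;    /* how many timer runs did real work */
};

static void toy_queue_init(struct toy_queue *q)
{
	atomic_init(&q->frozen, false);
	atomic_init(&q->users, 0);
	q->timeouts_handled = 0;
}

/* Try to enter the queue; never sleeps. Returns false if it is frozen. */
static bool toy_queue_try_enter(struct toy_queue *q)
{
	atomic_fetch_add(&q->users, 1);
	if (atomic_load(&q->frozen)) {
		/* Back out: a timer context must not wait for an unfreeze. */
		atomic_fetch_sub(&q->users, 1);
		return false;
	}
	return true;
}

static void toy_queue_exit(struct toy_queue *q)
{
	int prev = atomic_fetch_sub(&q->users, 1);

	assert(prev > 0);
}

/* Model of a timeout timer callback (assumed shape, for illustration). */
static void toy_timeout_timer(struct toy_queue *q)
{
	if (!toy_queue_try_enter(q))
		return;          /* queue unavailable: do no work at all */

	q->timeouts_handled++;   /* placeholder for scanning expired requests */
	toy_queue_exit(q);
}
```

Calling toy_timeout_timer() on an unfrozen queue does its work once; after frozen is set, further calls return immediately and leave the user count at zero.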