On Fri, Dec 15, 2023 at 11:30 AM Alexander Atanasov <alexander.atanasov@virtuozzo.com> wrote:
In commit 8930a6c20791 ("scsi: core: add support for request batching"), the blk-mq "last" flag was mapped to SCMD_LAST and used as an indicator to send the batch for drivers that implement batching, but the error handling code was not updated.
scsi_send_eh_cmnd(...) is used to send error handling commands and request sense. The problem is that request sense comes as a single command that gets into the batch queue and times out. As a result, the device goes offline after several failed resets. This was observed during a virtio_scsi device resize operation.
[ 496.316946] sd 0:0:4:0: [sdd] tag#117 scsi_eh_0: requesting sense
[ 506.786356] sd 0:0:4:0: [sdd] tag#117 scsi_send_eh_cmnd timeleft: 0
[ 506.787981] sd 0:0:4:0: [sdd] tag#117 abort
To fix this, always set the SCMD_LAST flag in scsi_send_eh_cmnd(...) and scsi_reset_ioctl(...).
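As a rough illustration of the failure mode and the fix described above, here is a small userspace C model. It is not the kernel code: every model_* name and MODEL_SCMD_LAST are hypothetical stand-ins, and the "device" is just a flag on the command. It only shows why a lone command without the last flag sits in the batch unseen, and why forcing the flag in the error handling path gets it submitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's SCMD_LAST; the value is illustrative. */
#define MODEL_SCMD_LAST (1u << 0)
#define MODEL_BATCH_MAX 8

struct model_cmd {
    const char *name;
    unsigned int flags;
    bool submitted;              /* set once the "device" has been notified */
};

struct model_queue {
    struct model_cmd *pending[MODEL_BATCH_MAX];
    size_t npending;             /* commands queued but not yet kicked */
};

/* Notify the "device": everything queued so far becomes visible to it. */
static void model_kick(struct model_queue *q)
{
    for (size_t i = 0; i < q->npending; i++)
        q->pending[i]->submitted = true;
    q->npending = 0;
}

/*
 * Model of a batching driver: queue the command and kick only when it
 * is marked as the last one of the batch. Returns false if the batch
 * is already full.
 */
static bool model_queuecommand(struct model_queue *q, struct model_cmd *cmd)
{
    assert(q != NULL && cmd != NULL);
    if (q->npending == MODEL_BATCH_MAX)
        return false;
    q->pending[q->npending++] = cmd;
    if (cmd->flags & MODEL_SCMD_LAST)
        model_kick(q);
    return true;
}

/*
 * Model of the error handling send path. With force_last set, the
 * command is always marked last, which is the behaviour the fix asks
 * for. Returns whether the command reached the "device"; false stands
 * in for the timeout seen in the log.
 */
static bool model_send_eh_cmnd(struct model_queue *q, struct model_cmd *cmd,
                               bool force_last)
{
    if (force_last)
        cmd->flags |= MODEL_SCMD_LAST;
    if (!model_queuecommand(q, cmd))
        return false;
    return cmd->submitted;
}
```

In this model, calling model_send_eh_cmnd() with force_last == false on an empty queue leaves the request sense command pending, while force_last == true submits it immediately.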
Fixes: 8930a6c20791 ("scsi: core: add support for request batching")
Signed-off-by: Alexander Atanasov <alexander.atanasov@virtuozzo.com>
 drivers/scsi/scsi_error.c | 2 ++
 1 file changed, 2 insertions(+)
v1->v2: fix it globally, not only for virtio_scsi, as suggested by Paolo Bonzini, to avoid reintroducing the same bug.
Alexander,
The patch looks good to me, but please resend it including linux-scsi@vger.kernel.org.
A similar patch was also sent yesterday: https://lore.kernel.org/linux-scsi/ZXvdX6lWbdG+uqz8@infradead.org/T/#t, but yours is more complete.
Paolo