commit da9de5f8527f4b9efc82f967d29a583318c034c7 upstream.
sdma_progress() is called outside the wait lock.

In this case, there is a race condition where sdma_progress() can return false and the sdma_engine can idle before the packet queue is added to the engine's dmawait list. If that happens, there will be no more sdma interrupts to cause the wakeup, and the user_sdma xmit will hang.
Fix by moving the lock to enclose the sdma_progress() call.
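To make the lost-wakeup window concrete, here is a minimal userspace sketch (not the driver code): a pthread mutex stands in for dev->iowait_lock, and made_progress(), enqueue_waiter(), and wake_waiters() are hypothetical stand-ins for sdma_progress(), the dmawait list add, and the interrupt-driven flush.

/*
 * Userspace model of the race. All names below are stand-ins,
 * not hfi1 driver symbols. Build with: gcc -pthread race.c
 */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t iowait_lock = PTHREAD_MUTEX_INITIALIZER;
static bool queued;  /* models pq->busy.list sitting on sde->dmawait */

static bool made_progress(void) { return false; }   /* engine stalled */
static void enqueue_waiter(void) { queued = true; }

/* Interrupt-side analogue: the engine goes idle and wakes only the
 * waiters that are queued at that moment. */
static void wake_waiters(void)
{
	pthread_mutex_lock(&iowait_lock);
	queued = false;                 /* current waiters get woken */
	pthread_mutex_unlock(&iowait_lock);
}

/* Old shape: the progress check runs outside the lock, so
 * wake_waiters() can run in the marked window, and the waiter
 * enqueued afterwards is never woken -> the xmit hangs. */
static int defer_old(void)
{
	if (made_progress())
		return -1;              /* -EAGAIN analogue: retry  */
	/* <-- engine may idle and run wake_waiters() right here   */
	pthread_mutex_lock(&iowait_lock);
	enqueue_waiter();
	pthread_mutex_unlock(&iowait_lock);
	return 1;                       /* -EBUSY analogue: sleep   */
}

/* Fixed shape: the check and the enqueue share one critical
 * section, so the idle/wakeup path cannot slip between them. */
static int defer_fixed(void)
{
	int ret = 1;

	pthread_mutex_lock(&iowait_lock);
	if (made_progress())
		ret = -1;
	else
		enqueue_waiter();
	pthread_mutex_unlock(&iowait_lock);
	return ret;
}

int main(void)
{
	(void)defer_old;
	(void)wake_waiters;
	return defer_fixed() == 1 ? 0 : 1;
}

This is also why the hunk below releases the lock on both exit paths: the eagain label gains a write_sequnlock() to match the write_seqlock() now taken before the progress check.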
Also, delete busycount. The need for it was removed by commit bcad29137a97 ("IB/hfi1: Serve the most starved iowait entry first").
Ported to linux-4.9.y.
Cc: stable@vger.kernel.org
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/infiniband/hw/hfi1/user_sdma.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 4c11116..098296a 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -260,7 +260,6 @@ struct user_sdma_txreq {
 	struct list_head list;
 	struct user_sdma_request *req;
 	u16 flags;
-	unsigned busycount;
 	u64 seqnum;
 };
 
@@ -323,25 +322,22 @@ static int defer_packet_queue(
 	struct hfi1_user_sdma_pkt_q *pq =
 		container_of(wait, struct hfi1_user_sdma_pkt_q, busy);
 	struct hfi1_ibdev *dev = &pq->dd->verbs_dev;
-	struct user_sdma_txreq *tx =
-		container_of(txreq, struct user_sdma_txreq, txreq);
 
-	if (sdma_progress(sde, seq, txreq)) {
-		if (tx->busycount++ < MAX_DEFER_RETRY_COUNT)
-			goto eagain;
-	}
+	write_seqlock(&dev->iowait_lock);
+	if (sdma_progress(sde, seq, txreq))
+		goto eagain;
 	/*
 	 * We are assuming that if the list is enqueued somewhere, it
 	 * is to the dmawait list since that is the only place where
 	 * it is supposed to be enqueued.
 	 */
 	xchg(&pq->state, SDMA_PKT_Q_DEFERRED);
-	write_seqlock(&dev->iowait_lock);
 	if (list_empty(&pq->busy.list))
 		list_add_tail(&pq->busy.list, &sde->dmawait);
 	write_sequnlock(&dev->iowait_lock);
 	return -EBUSY;
 eagain:
+	write_sequnlock(&dev->iowait_lock);
 	return -EAGAIN;
 }
@@ -925,7 +921,6 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, unsigned maxpkts)
 
 		tx->flags = 0;
 		tx->req = req;
-		tx->busycount = 0;
 		INIT_LIST_HEAD(&tx->list);
 
 		if (req->seqnum == req->info.npkts - 1)
On Mon, Jun 24, 2019 at 04:15:37PM -0400, Mike Marciniszyn wrote:
> commit da9de5f8527f4b9efc82f967d29a583318c034c7 upstream.
>
> sdma_progress() is called outside the wait lock.
>
> In this case, there is a race condition where sdma_progress() can
> return false and the sdma_engine can idle before the packet queue is
> added to the engine's dmawait list. If that happens, there will be no
> more sdma interrupts to cause the wakeup, and the user_sdma xmit will
> hang.
>
> Fix by moving the lock to enclose the sdma_progress() call.
>
> Also, delete busycount. The need for it was removed by commit
> bcad29137a97 ("IB/hfi1: Serve the most starved iowait entry first").
>
> Ported to linux-4.9.y.
Now applied, thanks.
Note, this is already in 4.14.132 and 4.19.57, so I didn't need the backports for those kernels.
greg k-h