 
From: Maurizio Lombardi <mlombard@redhat.com>

[ Upstream commit 5a58ac9bfc412a58c3cf26c6a7e54d4308e9d109 ]
Update the nvme_tcp_start_tls() function to use dev_err() instead of dev_dbg() when a TLS error is detected. This ensures that handshake failures are visible by default, aiding in debugging.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a Real Debugging Issue**: The commit addresses a genuine problem where TLS handshake failures were being logged at debug level (`dev_dbg()`), making them invisible in production environments unless debug logging is explicitly enabled. This change promotes error visibility by using `dev_err()` for actual error conditions.
2. **Small and Contained Change**: The patch is minimal. It only changes logging behavior by:
   - Adding a conditional check `if (queue->tls_err)`
   - Moving the error case from `dev_dbg()` to `dev_err()`
   - Keeping successful handshakes at debug level

   The diff is 8 insertions and 3 deletions in a single file, and the value returned by `nvme_tcp_start_tls()` is unchanged.
3. **No Architectural Changes**: This is purely a logging improvement that doesn't alter any control flow, data structures, or protocol behavior. It simply makes existing errors more visible.
4. **Important for Production Debugging**: TLS handshake failures in NVMe-TCP can be caused by various issues (certificate problems, key mismatches, network issues) that are critical to diagnose in production. Having these errors hidden at debug level severely hampers troubleshooting.
5. **Mature Feature Area**: NVMe-TCP TLS support was introduced in kernel 6.5 (around August 2023 based on commit be8e82caa685), making it a relatively mature feature that's likely deployed in production systems requiring proper error visibility.
6. **No Risk of Regression**: The change only affects logging output and cannot introduce functional regressions. The worst case is slightly more verbose kernel logs when TLS errors occur, which is the intended behavior.
7. **Follows Stable Rules**: This meets the stable kernel criteria as it's a small fix that improves debuggability of an existing feature without introducing new functionality or risks.
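
The logging decision described above can be sketched in userspace as follows. This is only an illustration, not the kernel code: `enum log_level` and `tls_done_msg()` are hypothetical stand-ins for the `dev_err()`/`dev_dbg()` split, and `tls_err` is assumed to mirror `queue->tls_err` (zero on success, nonzero on failure).

```c
#include <stdio.h>

/* Hypothetical userspace stand-ins for the two kernel log levels involved. */
enum log_level { LOG_ERR, LOG_DBG };

/*
 * Format the handshake-completion message for queue `qid` into `buf` and
 * return the level it would be logged at. `tls_err` is assumed to mirror
 * queue->tls_err: zero means success, nonzero means the handshake failed.
 */
static enum log_level tls_done_msg(char *buf, size_t len, int qid, int tls_err)
{
	if (tls_err) {
		/* Failure: visible by default, like dev_err(). */
		snprintf(buf, len,
			 "queue %d: TLS handshake complete, error %d",
			 qid, tls_err);
		return LOG_ERR;
	}

	/* Success: stays at debug level, like dev_dbg(). */
	snprintf(buf, len, "queue %d: TLS handshake complete", qid);
	return LOG_DBG;
}
```

Only the level and the message text depend on `tls_err`; the caller would still propagate the error value exactly as before, which is why the patch carries no functional risk.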

 drivers/nvme/host/tcp.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 25e486e6e805..83a6b18b01ad 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1777,9 +1777,14 @@ static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
 			qid, ret);
 		tls_handshake_cancel(queue->sock->sk);
 	} else {
-		dev_dbg(nctrl->device,
-			"queue %d: TLS handshake complete, error %d\n",
-			qid, queue->tls_err);
+		if (queue->tls_err) {
+			dev_err(nctrl->device,
+				"queue %d: TLS handshake complete, error %d\n",
+				qid, queue->tls_err);
+		} else {
+			dev_dbg(nctrl->device,
+				"queue %d: TLS handshake complete\n", qid);
+		}
 		ret = queue->tls_err;
 	}
 	return ret;