Although NVM Express specification 1.3 requires a controller claiming to be 1.3 or higher to implement Identify CNS 03h (Namespace Identification Descriptor list), the driver doesn't really need this identification in order to use a namespace. The code had already documented in comments that an error from this command is not to be treated as a failure.
Return success if the controller provided any response to a namespace identification descriptors command.
Fixes: 538af88ea7d9de24 ("nvme: make nvme_report_ns_ids propagate error back")
Reported-by: Ingo Brunberg <ingo_brunberg@web.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: stable@vger.kernel.org # 5.4+
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/nvme/host/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e6ee34376c5e..2a84e1402244 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1735,6 +1735,8 @@ static int nvme_report_ns_ids(struct nvme_ctrl *ctrl, unsigned int nsid,
 		if (ret)
 			dev_warn(ctrl->device,
 				 "Identify Descriptors failed (%d)\n", ret);
+		if (ret > 0)
+			ret = 0;
 	}
 	return ret;
 }
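To make the effect of the two added lines concrete, here is a minimal userspace sketch in C. It assumes the return convention this code path appears to rely on: 0 means success, a positive value is a status code returned by the controller, and a negative value is a host-side errno. The names fake_identify_ns_descs() and report_ns_ids_sketch() are hypothetical stand-ins for illustration, not driver functions.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the driver's Identify Descriptors call.
 * Assumed convention: 0 is success, a positive value is a status code
 * returned by the controller, a negative value is a host-side errno.
 */
static int fake_identify_ns_descs(int simulated_result)
{
	return simulated_result;
}

/*
 * Mirrors the patched logic: warn on any failure, but only propagate
 * host-side (negative) errors back to the caller.
 */
static int report_ns_ids_sketch(int simulated_result)
{
	int ret = fake_identify_ns_descs(simulated_result);

	if (ret)
		fprintf(stderr, "Identify Descriptors failed (%d)\n", ret);
	if (ret > 0)
		ret = 0;	/* the controller answered, even if with an error status */
	return ret;
}
```

Under these assumptions, report_ns_ids_sketch(0x2) logs the warning and returns 0, while report_ns_ids_sketch(-ENOMEM) still returns -ENOMEM, so allocation or transport failures on the host side are not hidden.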
On Tue, Dec 03, 2019 at 12:56:11AM +0900, Keith Busch wrote:
> Although NVM Express specification 1.3 requires a controller claiming to be 1.3 or higher to implement Identify CNS 03h (Namespace Identification Descriptor list), the driver doesn't really need this identification in order to use a namespace. The code had already documented in comments that an error from this command is not to be treated as a failure.
> Return success if the controller provided any response to a namespace identification descriptors command.
> Fixes: 538af88ea7d9de24 ("nvme: make nvme_report_ns_ids propagate error back")
> Reported-by: Ingo Brunberg <ingo_brunberg@web.de>
Why would we ignore the error? Do you have a buggy controller messing this up?
On Mon, Dec 02, 2019 at 05:15:45PM +0100, Christoph Hellwig wrote:
> On Tue, Dec 03, 2019 at 12:56:11AM +0900, Keith Busch wrote:
> > Although NVM Express specification 1.3 requires a controller claiming to be 1.3 or higher to implement Identify CNS 03h (Namespace Identification Descriptor list), the driver doesn't really need this identification in order to use a namespace. The code had already documented in comments that an error from this command is not to be treated as a failure.
> > Return success if the controller provided any response to a namespace identification descriptors command.
> > Fixes: 538af88ea7d9de24 ("nvme: make nvme_report_ns_ids propagate error back")
> > Reported-by: Ingo Brunberg <ingo_brunberg@web.de>
> Why would we ignore the error? Do you have a buggy controller messing this up?
I don't have such a controller, but many apparently do. The regression was reported here:
http://lists.infradead.org/pipermail/linux-nvme/2019-December/028223.html
And of course it's the SMI controller ...
On 12/2/2019 9:22 AM, Keith Busch wrote:
> On Mon, Dec 02, 2019 at 05:15:45PM +0100, Christoph Hellwig wrote:
> > On Tue, Dec 03, 2019 at 12:56:11AM +0900, Keith Busch wrote:
> > > Although NVM Express specification 1.3 requires a controller claiming to be 1.3 or higher to implement Identify CNS 03h (Namespace Identification Descriptor list), the driver doesn't really need this identification in order to use a namespace. The code had already documented in comments that an error from this command is not to be treated as a failure.
> > > Return success if the controller provided any response to a namespace identification descriptors command.
> > > Fixes: 538af88ea7d9de24 ("nvme: make nvme_report_ns_ids propagate error back")
> > > Reported-by: Ingo Brunberg <ingo_brunberg@web.de>
> > Why would we ignore the error? Do you have a buggy controller messing this up?
> I don't have such a controller, but many apparently do. The regression was reported here:
> http://lists.infradead.org/pipermail/linux-nvme/2019-December/028223.html
> And of course it's the SMI controller ...
Does 5.4 show the exact error code? Perhaps we should selectively allow just for that case?
Ed
On Mon, Dec 02, 2019 at 09:27:14AM -0700, Nadolski, Edmund wrote:
> > I don't have such a controller, but many apparently do. The regression was reported here:
> > http://lists.infradead.org/pipermail/linux-nvme/2019-December/028223.html
> > And of course it's the SMI controller ...
> Does 5.4 show the exact error code? Perhaps we should selectively allow just for that case?
They'll find other ways to f***ck up. Looks like at least the controller in the bug report also doesn't have a subnqn and the nguid/eui64 are bogus. I wonder if we actually do users a favour by allowing that.
On 12/2/2019 9:29 AM, Christoph Hellwig wrote:
> On Mon, Dec 02, 2019 at 09:27:14AM -0700, Nadolski, Edmund wrote:
> > > I don't have such a controller, but many apparently do. The regression was reported here:
> > > http://lists.infradead.org/pipermail/linux-nvme/2019-December/028223.html
> > > And of course it's the SMI controller ...
> > Does 5.4 show the exact error code? Perhaps we should selectively allow just for that case?
> They'll find other ways to f***ck up. Looks like at least the controller in the bug report also doesn't have a subnqn and the nguid/eui64 are bogus. I wonder if we actually do users a favour by allowing that.
Agreed, though since it looks like a regression, does it make sense to treat it as a quirk?
Ed
On Mon, Dec 02, 2019 at 05:29:05PM +0100, Christoph Hellwig wrote:
> On Mon, Dec 02, 2019 at 09:27:14AM -0700, Nadolski, Edmund wrote:
> > > I don't have such a controller, but many apparently do. The regression was reported here:
> > > http://lists.infradead.org/pipermail/linux-nvme/2019-December/028223.html
> > > And of course it's the SMI controller ...
> > Does 5.4 show the exact error code? Perhaps we should selectively allow just for that case?
> They'll find other ways to f***ck up. Looks like at least the controller in the bug report also doesn't have a subnqn and the nguid/eui64 are bogus.
Indeed, they will find creative ways to break it.
Customer or OEM requirements are poorly written, like "Must report NVMe version 1.3". Nobody bothers to mention that the device must also be compliant with that version, or even realizes they never cared for those features in the first place.
Compliance testing, like that from UNH, should have caught this before shipping such a device, but it's a cheap device, so maybe they skip that step.
> I wonder if we actually do users a favour by allowing that.
I think it's too late now. We did successfully use such namespaces before 5.4, even if they're fundamentally broken.
Johannes also commented *not* to consider these errors when this identification was originally implemented, so either he knew vendors screwed this up, or had the forethought to know they would.
On Tue, Dec 03, 2019 at 01:49:03AM +0900, Keith Busch wrote:
> Customer or OEM requirements are poorly written, like "Must report NVMe version 1.3". Nobody bothers to mention that the device must also be compliant with that version, or even realizes they never cared for those features in the first place.
> Compliance testing, like that from UNH, should have caught this before shipping such a device, but it's a cheap device, so maybe they skip that step.
> > I wonder if we actually do users a favour by allowing that.
> I think it's too late now. We did successfully use such namespaces before 5.4, even if they're fundamentally broken.
> Johannes also commented *not* to consider these errors when this identification was originally implemented, so either he knew vendors screwed this up, or had the forethought to know they would.
Yes. I guess your patch is the best thing for now:
Reviewed-by: Christoph Hellwig <hch@lst.de>
But I think we might need a new kernel taint flag or something like it for devices that are so obviously broken in their identifiers.
On Mon, Dec 02, 2019 at 06:34:14PM +0100, Christoph Hellwig wrote:
> Yes. I guess your patch is the best thing for now:
> Reviewed-by: Christoph Hellwig <hch@lst.de>
Thanks, applied for-5.5
> But I think we might need a new kernel taint flag or something like it for devices that are so obviously broken in their identifiers.
That's fine with me. We currently just log a warning when an error is returned here; we can add_taint() too. Which flag do you think? TAINT_FIRMWARE_WORKAROUND, TAINT_CRAP, or something else?
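As a rough illustration of the taint idea discussed here, the following userspace sketch in C records a taint bit whenever the controller answers with an error status, in addition to logging the warning. Everything in it is a hypothetical stand-in: sketch_add_taint(), sketch_tainted_mask and SKETCH_TAINT_FIRMWARE_WORKAROUND only mimic what the kernel's add_taint() and a taint flag might do, and no flag was actually chosen in this thread.

```c
#include <stdio.h>

/* Hypothetical taint flag; the real choice of flag is left open above. */
enum sketch_taint { SKETCH_TAINT_FIRMWARE_WORKAROUND = 0 };

/* Hypothetical userspace stand-in for the kernel's global taint state. */
static unsigned long sketch_tainted_mask;

static void sketch_add_taint(enum sketch_taint flag)
{
	sketch_tainted_mask |= 1UL << flag;
}

/*
 * Same return convention as assumed for the patch: 0 is success, a
 * positive value is a controller status, a negative value is a
 * host-side errno.
 */
static int report_ns_ids_with_taint(int ret)
{
	if (ret)
		fprintf(stderr, "Identify Descriptors failed (%d)\n", ret);
	if (ret > 0) {
		/* The controller responded with an error: remember it, then go on. */
		sketch_add_taint(SKETCH_TAINT_FIRMWARE_WORKAROUND);
		ret = 0;
	}
	return ret;
}
```

In this sketch a host-side error such as -5 is returned unchanged and leaves the mask untouched, while a positive controller status sets the bit and returns 0, so the namespace stays usable but the broken device leaves a trace.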