On Thu, Feb 24, 2022 at 9:04 PM Eric Biggers ebiggers@kernel.org wrote:
On Thu, Feb 24, 2022 at 09:07:47AM +0200, Gilad Ben-Yossef wrote:
Hi Eric,
On Thu, Feb 24, 2022 at 3:53 AM Eric Biggers ebiggers@kernel.org wrote:
On Wed, Feb 23, 2022 at 05:47:25PM -0800, Eric Biggers wrote:
On Wed, Feb 23, 2022 at 10:04:00AM +0200, Gilad Ben-Yossef wrote:
the drbg code was binding the same buffer to two different scatter gather lists and submitting those as source and destination to a crypto api operation, thus potentially causing HW crypto drivers to perform overlapping DMA mappings which are not aware it is the same buffer.
This can have serious consequences of data corruption of internal DRBG buffers and wrong RNG output.
Fix this by reusing the same scatter gatther list for both src and dst.
Signed-off-by: Gilad Ben-Yossef gilad@benyossef.com Reported-by: Corentin Labbe clabbe.montjoie@gmail.com Tested-by: Corentin Labbe clabbe.montjoie@gmail.com Tested-on: r8a7795-salvator-x Tested-on: xilinx-zc706 Fixes: 43490e8046b5d ("crypto: drbg - in-place cipher operation for CTR") Cc: stable@vger.kernel.org
Where is it documented and tested that the API doesn't allow this? I wasn't aware of this case; it sounds perfectly allowed to me. There might be a lot of other users who do this, not just drbg.c.
Just quickly looking through the code I maintain, there is another place that uses scatterlists like this: in fscrypt_crypt_block() in fs/crypto/crypto.c, the source and destination can be the same. That's just the code I maintain; I'm sure if you looked through the whole kernel you'd find a lot more.
This sounds more like a driver bug, and a case we need to add self-tests for.
Thank you for the feedback. That is a very good question. Indeed, I agree with you that in an ideal world the internal implementation details of DMA mapping would not pop up and interfere with higher level layer logic.
Let me describe my point of view and I would be very happy to hear where I am wrong:
The root cause underlying this is that, of course, hardware crypto drivers map the sglists passed to them for DMA . Indeed, we require input to crypto API as sglists of DMAable buffers (and not, say stack allocated buffers) because of this. So far I am just stating the obvious...
Now, it looks like the DMA api, accessed via dma_map_sg(), does not like overlapping mappings. The bug report that triggered this patch (see: https://lkml.org/lkml/2022/2/20/240) was an oops message including this warning: "DMA-API: ccree e6601000.crypto: cacheline tracking EEXIST, overlapping mappings aren't supported".
The messages comes from add_dma_entry() in kernel.dma/debug.c, because, as stated in the commit message that added this check in May 2021:
"Since, overlapping mappings are not supported by the DMA API we should report an error if active_cacheline_insert returns -EEXIST." (https://lkml.org/lkml/2021/5/18/572)
For now, I will take it at a given that this is proper and you do not consider this an issue in the DMA API.
Now, driver writers are of course aware of this DMA API limitation and thus we check if the src sglist is the same as the dst sglist and if so only map once. However, the underlying assumption is that the buffers pointed by different sglists do not overlap. We do not iterate over all the sglist trying to find overlaps.
When I see "we", it is because this behavior is not unique to the ccree driver:
Here is the same logic from a marvell cesa driver: https://elixir.bootlin.com/linux/latest/source/drivers/crypto/marvell/cesa/c...
Here it is again in the camm driver: https://elixir.bootlin.com/linux/latest/source/drivers/crypto/caam/caamalg.c...
I do believe that at least all crypto HW drivers apply the same logic.
Of course, we can ask that every HW crypto driver (and possibly any other sglist using HW driver) will add logic that scans each sglist for overlapping buffers and if found use a more sophisticated mapping (easy for a simple sglist that has one buffer identical to some other sglist, maybe more complicated if the overlap is not identity). The storage drivers sort of already do on some level, although I think on a higher abstraction layer than the drivers themselves if I'm not mistaken, though for performance reasons. This is certainly DOABLE in the sense that it can be achieved.
However, I don't think this is desirable. This will add non trivial code with non trivial runtime costs just to spot these cases. And we will need to fix ALL the hw drivers, because, to the best of my knowledge, none of them do this right now.
The remaining option is to enforce the rule of no overlap between different sglists passed to the crypto API. This seems much easier to me. Indeed, the fix I sent is a one liner. I suspect all other fixes are similar and I assume (but did not check) that there are not many of those. Indeed, I think it is much easier to impose the required limitation at the API caller level. It is not pretty, nor "just", but easier, I think.
I hope I've managed to explain my logic here.
I will note that even if we decide to follow the other route, we do need to document and fix probably every hw crypto (and possibly others) driver out there, because AFAIK, no one is taking into account this possibility right now.
Decryption in dm-crypt is another example where different scatterlists are used for in-place data. (This is because like the fscrypt case, it has a helper function which handles both in-place and out-of-place data.)
I don't think it is reasonable to "fix" all these users who are using the crypto API in a perfectly reasonable way.
I understand what you are saying, but assuming the issue is real, the alternative seems to be to fix all the HW crypto drivers by adding code that will impact their performance, so it's a matter of choosing the lesser of two evils.
Are you saying that dm-crypt, fscrypt, drbg, etc. never worked with any hardware crypto driver? How could that possibly be the case? Perhaps something changed in the DMA API recently that is causing this. Or maybe it is specific to the implementation of the DMA API on the platform you are testing.
That is a very good question. I became aware of this via a bug report of Corentin Labbe which saw the problem on a Salvator-X board. I have never seen the specific issue anywhere else.
Having said that, the following is also true: - The code that checks for this condition was only added in 2021. - I am not sure why the DMA api prohibits aliased mappings, but I can guess, and if my guess is correct this situation will only happen on a platform that both: 1. uses a crypto HW accelerator driven by DMA and 2. uses some form of IO MMU. I guess the combination might have been very rare in the past.
I think the right thing to do right now is to verify that we indeed have a general issue and not something specific to one singular platform So the question becomes - do indeed the DMA api forbits aliased mappings and if so, under what conditions?
Any ideas on how to check this?
Gilad
- Eric