On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
The optimized strscpy() and dentry_string_cmp() routines will read 8 unaligned bytes at a time via the function read_word_at_a_time(), but this is incompatible with MTE which will fault on a partially invalid read. The attributes on read_word_at_a_time() that disable KASAN are invisible to the CPU so they have no effect on MTE. Let's fix the bug for now by disabling the optimizations if the kernel is built with HW tag-based KASAN and consider improvements for followup changes.
Why is faulting on a partially invalid read a problem? It's still invalid, so ... it should fault, yes? What am I missing?
read_word_at_a_time() is used to read 8 bytes, potentially unaligned and beyond the end of the string. The has_zero() function is then used to check where the string ends. For this use, I think we can go with load_unaligned_zeropad(), which handles a potential fault and pads the rest with zeroes.
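For illustration, here is a minimal userspace sketch of the word-at-a-time pattern described above. It is not the kernel's code: has_zero_byte() and wordwise_strlen() are hypothetical names, and the sketch assumes the caller's buffer is padded to a multiple of 8 bytes so that the over-read stays in bounds. The point is that the 8-byte load can cover bytes past the terminating NUL, which is exactly the access that MTE checks against the neighbouring granule's tag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/* Nonzero if and only if at least one byte of w is zero. */
static uint64_t has_zero_byte(uint64_t w)
{
	return (w - ONES) & ~w & HIGHS;
}

/*
 * Hypothetical helper: strlen() that scans 8 bytes at a time.
 * Assumption: the buffer holding s is padded to a multiple of 8 bytes,
 * so the word loads never leave the buffer.
 */
static size_t wordwise_strlen(const char *s)
{
	size_t n = 0;

	assert(s != NULL);

	for (;;) {
		uint64_t w;

		/* This load may include bytes that follow the NUL. */
		memcpy(&w, s + n, sizeof(w));
		if (has_zero_byte(w))
			break;
		n += sizeof(w);
	}

	/* The terminator is somewhere in this word; find it bytewise. */
	while (s[n] != '\0')
		n++;

	return n;
}
```

In the kernel there is no such padding guarantee, which is why the tail of the load can land in a differently tagged object and why a zero-padding, fault-tolerant load is attractive here.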
If we only care about synchronous and asymmetric modes, that should be possible, but that won't work in asynchronous mode. In asynchronous mode the fault will accumulate into TFSR and will be detected later asynchronously where it cannot be related to its source and fixed up.
That means that both read_word_at_a_time() and load_unaligned_zeropad() are dodgy in async mode.
load_unaligned_zeropad() has a __mte_enable_tco_async() call to set PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is indeed busted and I've had Vincenzo's patches for a couple of years already; they just never made it to the list.
Can we somehow hang this off ARCH_HAS_SUBPAGE_FAULTS?
We could, though that was mostly for user-space faults, while in-kernel we'd only need something similar if KASAN_HW_TAGS is enabled.
... and is there anything else that deliberately makes accesses that could straddle objects?
So far we only came across load_unaligned_zeropad() and read_word_at_a_time(). I'm not aware of anything else.