On Fri, Apr 21, 2023 at 5:24 AM Catalin Marinas catalin.marinas@arm.com wrote:
On Thu, Apr 20, 2023 at 02:09:45PM -0700, Peter Collingbourne wrote:
Consider the following sequence of events:
- A page in a PROT_READ|PROT_WRITE VMA is faulted.
- Page migration allocates a page with the KASAN allocator, causing it to receive a non-match-all tag, and uses it to replace the page faulted in 1.
- The program uses mprotect() to enable PROT_MTE on the page faulted in 1.
Ah, so there is no race here, it's simply because the page allocation for migration has a non-match-all kasan tag in page->flags.
How do we handle the non-migration case with mprotect()? IIRC post_alloc_hook() always resets the page->flags since GFP_HIGHUSER_MOVABLE has the __GFP_SKIP_KASAN_UNPOISON flag.
Yes, that's how it normally works.
As a result of step 3, we are left with a non-match-all tag for a page with tags accessible to userspace, which can lead to the same kind of tag check faults that commit e74a68468062 ("arm64: Reset KASAN tag in copy_highpage with HW tags only") intended to fix.
The general invariant that we have for pages in a VMA with VM_MTE_ALLOWED is that they cannot have a non-match-all tag. As a result of step 2, the invariant is broken. This means that the fix in the referenced commit was incomplete and we also need to reset the tag for pages without PG_mte_tagged.
Fixes: e5b8d9218951 ("arm64: mte: reset the page tag in page->flags")
This commit was reverted in 20794545c146 (arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"). It looks a bit strange to fix it up.
It does seem strange but I think it is correct because that is when the bug (resetting tag only if PG_mte_tagged) was introduced. The revert preserved the bug because it did not account for the migration case, which means that it didn't account for migration+mprotect either.
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c index 4aadcfb01754..a7bb20055ce0 100644 --- a/arch/arm64/mm/copypage.c +++ b/arch/arm64/mm/copypage.c @@ -21,9 +21,10 @@ void copy_highpage(struct page *to, struct page *from)
copy_page(kto, kfrom);
if (kasan_hw_tags_enabled())
page_kasan_tag_reset(to);
if (system_supports_mte() && page_mte_tagged(from)) {
if (kasan_hw_tags_enabled())
page_kasan_tag_reset(to);
This should work but can we not do this at allocation time like we do for the source page and remove any page_kasan_tag_reset() here altogether?
That would be difficult because of the number of different ways that the page can be allocated. That's why we also decided to reset it here in commit e74a68468062.
Peter