Running this test program on ARMv4 a few times (sometimes just once) reproduces the bug.
int main() { unsigned i; char paragon[SIZE]; void* ptr;
memset(paragon, 0xAA, SIZE); ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_ANON | MAP_SHARED, -1, 0); if (ptr == MAP_FAILED) return 1; printf("ptr = %p\n", ptr); for (i=0;i<10000;i++){ memset(ptr, 0xAA, SIZE); if (memcmp(ptr, paragon, SIZE)) { printf("Unexpected bytes on iteration %u!!!\n", i); break; } } munmap(ptr, SIZE); }
In the "ptr" buffer there appear runs of zero bytes which are aligned by 16 and their lengths are multiple of 16.
Linux v5.11 does not have the bug, "git bisect" finds the first bad commit: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Before the commit update_mmu_cache() was called during a call to filemap_map_pages() as well as finish_fault(). After the commit finish_fault() lacks it.
Bring back update_mmu_cache() to finish_fault() to fix the bug. Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more closely reproduce the code of alloc_set_pte() function that existed before the commit.
On many platforms update_mmu_cache() is nop: x86, see arch/x86/include/asm/pgtable ARMv6+, see arch/arm/include/asm/tlbflush.h So, it seems, few users ran into this bug.
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") Signed-off-by: Sergei Antonov saproj@gmail.com Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com --- mm/memory.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c index 4ba73f5aa8bb..a78814413ac0 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4386,14 +4386,20 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl); - ret = 0; + /* Re-check under ptl */ - if (likely(!vmf_pte_changed(vmf))) + if (likely(!vmf_pte_changed(vmf))) { do_set_pte(vmf, page, vmf->address); - else + + /* no need to invalidate: a not-present page won't be cached */ + update_mmu_cache(vma, vmf->address, vmf->pte); + + ret = 0; + } else { + update_mmu_tlb(vma, vmf->address, vmf->pte); ret = VM_FAULT_NOPAGE; + }
- update_mmu_tlb(vma, vmf->address, vmf->pte); pte_unmap_unlock(vmf->pte, vmf->ptl); return ret; }
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
Rule: 'Cc: stable@vger.kernel.org' or 'commit <sha1> upstream.' Subject: [PATCH] mm: bring back update_mmu_cache() to finish_fault() Link: https://lore.kernel.org/stable/20220908204809.2012451-1-saproj%40gmail.com
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
On Thu, Sep 08, 2022 at 11:48:09PM +0300, Sergei Antonov wrote:
Running this test program on ARMv4 a few times (sometimes just once) reproduces the bug.
int main() { unsigned i; char paragon[SIZE]; void* ptr;
memset(paragon, 0xAA, SIZE); ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_ANON | MAP_SHARED, -1, 0); if (ptr == MAP_FAILED) return 1; printf("ptr = %p\n", ptr); for (i=0;i<10000;i++){ memset(ptr, 0xAA, SIZE); if (memcmp(ptr, paragon, SIZE)) { printf("Unexpected bytes on iteration %u!!!\n", i); break; } } munmap(ptr, SIZE);
}
In the "ptr" buffer there appear runs of zero bytes which are aligned by 16 and their lengths are multiple of 16.
Linux v5.11 does not have the bug, "git bisect" finds the first bad commit: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Before the commit update_mmu_cache() was called during a call to filemap_map_pages() as well as finish_fault(). After the commit finish_fault() lacks it.
Bring back update_mmu_cache() to finish_fault() to fix the bug. Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more closely reproduce the code of alloc_set_pte() function that existed before the commit.
On many platforms update_mmu_cache() is nop: x86, see arch/x86/include/asm/pgtable ARMv6+, see arch/arm/include/asm/tlbflush.h So, it seems, few users ran into this bug.
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") Signed-off-by: Sergei Antonov saproj@gmail.com Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com
+Will.
Seems I confused update_mmu_tlb() with update_mmu_cache() :/
Looks good to me:
Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com
mm/memory.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c index 4ba73f5aa8bb..a78814413ac0 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4386,14 +4386,20 @@ vm_fault_t finish_fault(struct vm_fault *vmf) vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl);
- ret = 0;
- /* Re-check under ptl */
- if (likely(!vmf_pte_changed(vmf)))
- if (likely(!vmf_pte_changed(vmf))) { do_set_pte(vmf, page, vmf->address);
- else
/* no need to invalidate: a not-present page won't be cached */
update_mmu_cache(vma, vmf->address, vmf->pte);
ret = 0;
- } else {
ret = VM_FAULT_NOPAGE;update_mmu_tlb(vma, vmf->address, vmf->pte);
- }
- update_mmu_tlb(vma, vmf->address, vmf->pte); pte_unmap_unlock(vmf->pte, vmf->ptl); return ret;
}
2.34.1
On Fri, Sep 09, 2022 at 01:24:10AM +0300, Kirill A. Shutemov wrote:
On Thu, Sep 08, 2022 at 11:48:09PM +0300, Sergei Antonov wrote:
Running this test program on ARMv4 a few times (sometimes just once) reproduces the bug.
int main() { unsigned i; char paragon[SIZE]; void* ptr;
memset(paragon, 0xAA, SIZE); ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_ANON | MAP_SHARED, -1, 0); if (ptr == MAP_FAILED) return 1; printf("ptr = %p\n", ptr); for (i=0;i<10000;i++){ memset(ptr, 0xAA, SIZE); if (memcmp(ptr, paragon, SIZE)) { printf("Unexpected bytes on iteration %u!!!\n", i); break; } } munmap(ptr, SIZE);
}
In the "ptr" buffer there appear runs of zero bytes which are aligned by 16 and their lengths are multiple of 16.
Linux v5.11 does not have the bug, "git bisect" finds the first bad commit: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Before the commit update_mmu_cache() was called during a call to filemap_map_pages() as well as finish_fault(). After the commit finish_fault() lacks it.
Bring back update_mmu_cache() to finish_fault() to fix the bug. Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more closely reproduce the code of alloc_set_pte() function that existed before the commit.
On many platforms update_mmu_cache() is nop: x86, see arch/x86/include/asm/pgtable ARMv6+, see arch/arm/include/asm/tlbflush.h So, it seems, few users ran into this bug.
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") Signed-off-by: Sergei Antonov saproj@gmail.com Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com
+Will.
Seems I confused update_mmu_tlb() with update_mmu_cache() :/
Urgh, that thing is pretty horrible! But anyway, I agree that this change looks correct based on the other callers in the file.
Looks good to me:
Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com
I'm assuming Andrew will pick this up. Otherwise, please let me know if I should route it via the arm64 tree.
Will
On Thu, Sep 08, 2022 at 11:48:09PM +0300, Sergei Antonov wrote:
Running this test program on ARMv4 a few times (sometimes just once) reproduces the bug.
int main() { unsigned i; char paragon[SIZE]; void* ptr;
memset(paragon, 0xAA, SIZE); ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_ANON | MAP_SHARED, -1, 0); if (ptr == MAP_FAILED) return 1; printf("ptr = %p\n", ptr); for (i=0;i<10000;i++){ memset(ptr, 0xAA, SIZE); if (memcmp(ptr, paragon, SIZE)) { printf("Unexpected bytes on iteration %u!!!\n", i); break; } } munmap(ptr, SIZE);
}
In the "ptr" buffer there appear runs of zero bytes which are aligned by 16 and their lengths are multiple of 16.
Linux v5.11 does not have the bug, "git bisect" finds the first bad commit: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Before the commit update_mmu_cache() was called during a call to filemap_map_pages() as well as finish_fault(). After the commit finish_fault() lacks it.
Bring back update_mmu_cache() to finish_fault() to fix the bug. Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more closely reproduce the code of alloc_set_pte() function that existed before the commit.
On many platforms update_mmu_cache() is nop: x86, see arch/x86/include/asm/pgtable ARMv6+, see arch/arm/include/asm/tlbflush.h So, it seems, few users ran into this bug.
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") Signed-off-by: Sergei Antonov saproj@gmail.com Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com
mm/memory.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
<formletter>
This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly.
</formletter>
linux-stable-mirror@lists.linaro.org