I found a report from syzbot [1].

When __folio_test_movable() is called in migrate_folio_unmap() to read folio->mapping, a data race occurs because folio->mapping is read without holding the folio lock.

This can cause unintended behavior, because folio->mapping can concurrently be reset to NULL. Therefore, I think it is appropriate to call __folio_test_movable() under the protection of the folio lock to prevent the data race.
[1]
==================================================================
BUG: KCSAN: data-race in __filemap_remove_folio / migrate_pages_batch

write to 0xffffea0004b81dd8 of 8 bytes by task 6348 on cpu 0:
 page_cache_delete mm/filemap.c:153 [inline]
 __filemap_remove_folio+0x1ac/0x2c0 mm/filemap.c:233
 filemap_remove_folio+0x6b/0x1f0 mm/filemap.c:265
 truncate_inode_folio+0x42/0x50 mm/truncate.c:178
 shmem_undo_range+0x25b/0xa70 mm/shmem.c:1028
 shmem_truncate_range mm/shmem.c:1144 [inline]
 shmem_evict_inode+0x14d/0x530 mm/shmem.c:1272
 evict+0x2f0/0x580 fs/inode.c:731
 iput_final fs/inode.c:1883 [inline]
 iput+0x42a/0x5b0 fs/inode.c:1909
 dentry_unlink_inode+0x24f/0x260 fs/dcache.c:412
 __dentry_kill+0x18b/0x4c0 fs/dcache.c:615
 dput+0x5c/0xd0 fs/dcache.c:857
 __fput+0x3fb/0x6d0 fs/file_table.c:439
 ____fput+0x1c/0x30 fs/file_table.c:459
 task_work_run+0x13a/0x1a0 kernel/task_work.c:228
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
 exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
 __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
 syscall_exit_to_user_mode+0xbe/0x130 kernel/entry/common.c:218
 do_syscall_64+0xd6/0x1c0 arch/x86/entry/common.c:89
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffffea0004b81dd8 of 8 bytes by task 6342 on cpu 1:
 __folio_test_movable include/linux/page-flags.h:699 [inline]
 migrate_folio_unmap mm/migrate.c:1199 [inline]
 migrate_pages_batch+0x24c/0x1940 mm/migrate.c:1797
 migrate_pages_sync mm/migrate.c:1963 [inline]
 migrate_pages+0xff1/0x1820 mm/migrate.c:2072
 do_mbind mm/mempolicy.c:1390 [inline]
 kernel_mbind mm/mempolicy.c:1533 [inline]
 __do_sys_mbind mm/mempolicy.c:1607 [inline]
 __se_sys_mbind+0xf76/0x1160 mm/mempolicy.c:1603
 __x64_sys_mbind+0x78/0x90 mm/mempolicy.c:1603
 x64_sys_call+0x2b4d/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:238
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0xffff888127601078 -> 0x0000000000000000
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable@vger.kernel.org
Fixes: 7e2a5e5ab217 ("mm: migrate: use __folio_test_movable()")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
---
 mm/migrate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index 923ea80ba744..e62dac12406b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1118,7 +1118,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
 	int rc = -EAGAIN;
 	int old_page_state = 0;
 	struct anon_vma *anon_vma = NULL;
-	bool is_lru = !__folio_test_movable(src);
+	bool is_lru;
 	bool locked = false;
 	bool dst_locked = false;

@@ -1172,6 +1172,7 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
 	locked = true;
 	if (folio_test_mlocked(src))
 		old_page_state |= PAGE_WAS_MLOCKED;
+	is_lru = !__folio_test_movable(src);

 	if (folio_test_writeback(src)) {
 		/*
--
On 22.09.24 17:17, Jeongjun Park wrote:
We hold a folio reference, would we really see PAGE_MAPPING_MOVABLE flip? Hmm
Even a racing __ClearPageMovable() would still leave PAGE_MAPPING_MOVABLE set.
value changed: 0xffff888127601078 -> 0x0000000000000000
Note that this doesn't flip PAGE_MAPPING_MOVABLE, just some unrelated bits.
Looks straightforward, though
Acked-by: David Hildenbrand <david@redhat.com>
On Mon, Sep 23, 2024 at 05:56:40PM +0200, David Hildenbrand wrote:
We hold a folio reference, would we really see PAGE_MAPPING_MOVABLE flip? Hmm
No; this shows a page cache folio getting truncated. It's fine; really a false alarm from the tool. I don't think the proposed patch introduces any problems, but it's all a bit meh.
Matthew Wilcox <willy@infradead.org> wrote:
We hold a folio reference, would we really see PAGE_MAPPING_MOVABLE flip? Hmm
No; this shows a page cache folio getting truncated. It's fine; really a false alarm from the tool. I don't think the proposed patch introduces any problems, but it's all a bit meh.
Well, I still don't understand why it's okay to read folio->mapping without folio_lock. Since migrate_folio_unmap() already takes folio_lock, I think it is definitely necessary to fix it so that folio->mapping is read under folio_lock protection. If it really is okay to call __folio_test_movable() without folio_lock, then we could annotate the data race instead, but I'm not sure that is the better approach.
Regards, Jeongjun Park
On Tue, Sep 24, 2024 at 09:28:44AM +0900, Jeongjun Park wrote:
Well, I still don't understand why it's okay to read folio->mapping without folio_lock.
Because it can't be changed in a way which changes the value of __folio_test_movable(). We have a refcount on the folio at this point, so it can't be freed. And __folio_set_movable() happens at allocation.
Matthew Wilcox <willy@infradead.org> wrote:
Because it can't be changed in a way which changes the value of __folio_test_movable(). We have a refcount on the folio at this point, so it can't be freed. And __folio_set_movable() happens at allocation.
Thanks for the explanation. Then it seems appropriate to annotate the data race in __folio_test_movable() so that KCSAN ignores it.
I will apply the change and send a new patch.
Regards, Jeongjun Park