At the beginning of the function, folio queues with marks3==0 are skipped, but after that, the `marks3` field is ignored. If one such queue is found, `slot` is set to 64 (because `__ffs(0)==64`), leading to a buffer overflow in the folioq_folio() call. The resulting crash may look like this:
BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 11 UID: 0 PID: 2909 Comm: kworker/u262:1 Not tainted 6.13.1-cm4all2-vm #415 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Workqueue: events_unbound netfs_read_termination_worker RIP: 0010:netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0 Code: 48 85 c0 48 89 44 24 08 0f 84 24 01 00 00 48 8b 80 40 01 00 00 48 8b 7c 24 08 f3 48 0f bc c0 89 44 24 18 89 c0 48 8b 74 c7 08 <48> 8b 06 48 c7 04 24 00 10 00 00 a8 40 74 10 0f b6 4e 40 b8 00 10 RSP: 0018:ffffbbc440effe18 EFLAGS: 00010203 RAX: 0000000000000040 RBX: ffff96f8fc034000 RCX: 0000000000000000 RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffff96f8fc036400 RBP: 0000000000001000 R08: ffff96f9132bb400 R09: 0000000000001000 R10: ffff96f8c1263c80 R11: 0000000000000003 R12: 0000000000001000 R13: ffff96f8fb75ade8 R14: fffffaaf5ca90000 R15: ffff96f8fb75ad00 FS: 0000000000000000(0000) GS:ffff9703cf0c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000010c9ca003 CR4: 00000000001706b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __die+0x1f/0x60 ? page_fault_oops+0x158/0x450 ? search_extable+0x22/0x30 ? netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0 ? search_module_extables+0xe/0x40 ? exc_page_fault+0x62/0x120 ? asm_exc_page_fault+0x22/0x30 ? netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0 ? netfs_pgpriv2_write_to_the_cache+0xf6/0x3f0 netfs_read_termination_worker+0x1f/0x60 process_one_work+0x138/0x2d0 worker_thread+0x2a5/0x3b0 ? __pfx_worker_thread+0x10/0x10 kthread+0xba/0xe0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK>
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Cc: stable@vger.kernel.org Signed-off-by: Max Kellermann max.kellermann@ionos.com --- Note this patch doesn't apply to v6.14 as it was obsoleted by commit e2d46f2ec332 ("netfs: Change the read result collector to only use one work item"). --- fs/netfs/read_pgpriv2.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c index 54d5004fec18..e72f5e674834 100644 --- a/fs/netfs/read_pgpriv2.c +++ b/fs/netfs/read_pgpriv2.c @@ -181,16 +181,17 @@ void netfs_pgpriv2_write_to_the_cache(struct netfs_io_request *rreq) break;
folioq_unmark3(folioq, slot); - if (!folioq->marks3) { + while (!folioq->marks3) { folioq = folioq->next; if (!folioq) - break; + goto end_of_queue; }
slot = __ffs(folioq->marks3); folio = folioq_folio(folioq, slot); }
+end_of_queue: netfs_issue_write(wreq, &wreq->io_streams[1]); smp_wmb(); /* Write lists before ALL_QUEUED. */ set_bit(NETFS_RREQ_ALL_QUEUED, &wreq->flags);
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#opti...
Rule: The upstream commit ID must be specified with a separate line above the commit text. Subject: [PATCH v6.13] fs/netfs/read_pgpriv2: skip folio queues without `marks3` Link: https://lore.kernel.org/stable/20250210223144.3481766-1-max.kellermann%40ion...
Please ignore this mail if the patch is not relevant for upstream.
On Mon, Feb 10, 2025 at 11:31:44PM +0100, Max Kellermann wrote:
At the beginning of the function, folio queues with marks3==0 are skipped, but after that, the `marks3` field is ignored. If one such queue is found, `slot` is set to 64 (because `__ffs(0)==64`), leading to a buffer overflow in the folioq_folio() call. The resulting crash may look like this:
BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 11 UID: 0 PID: 2909 Comm: kworker/u262:1 Not tainted 6.13.1-cm4all2-vm #415 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Workqueue: events_unbound netfs_read_termination_worker RIP: 0010:netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0 Code: 48 85 c0 48 89 44 24 08 0f 84 24 01 00 00 48 8b 80 40 01 00 00 48 8b 7c 24 08 f3 48 0f bc c0 89 44 24 18 89 c0 48 8b 74 c7 08 <48> 8b 06 48 c7 04 24 00 10 00 00 a8 40 74 10 0f b6 4e 40 b8 00 10 RSP: 0018:ffffbbc440effe18 EFLAGS: 00010203 RAX: 0000000000000040 RBX: ffff96f8fc034000 RCX: 0000000000000000 RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffff96f8fc036400 RBP: 0000000000001000 R08: ffff96f9132bb400 R09: 0000000000001000 R10: ffff96f8c1263c80 R11: 0000000000000003 R12: 0000000000001000 R13: ffff96f8fb75ade8 R14: fffffaaf5ca90000 R15: ffff96f8fb75ad00 FS: 0000000000000000(0000) GS:ffff9703cf0c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000010c9ca003 CR4: 00000000001706b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace:
<TASK> ? __die+0x1f/0x60 ? page_fault_oops+0x158/0x450 ? search_extable+0x22/0x30 ? netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0 ? search_module_extables+0xe/0x40 ? exc_page_fault+0x62/0x120 ? asm_exc_page_fault+0x22/0x30 ? netfs_pgpriv2_write_to_the_cache+0x15a/0x3f0 ? netfs_pgpriv2_write_to_the_cache+0xf6/0x3f0 netfs_read_termination_worker+0x1f/0x60 process_one_work+0x138/0x2d0 worker_thread+0x2a5/0x3b0 ? __pfx_worker_thread+0x10/0x10 kthread+0xba/0xe0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK>
Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Cc: stable@vger.kernel.org Signed-off-by: Max Kellermann max.kellermann@ionos.com
Note this patch doesn't apply to v6.14 as it was obsoleted by commit e2d46f2ec332 ("netfs: Change the read result collector to only use one work item").
Why can't we just take what is upstream instead?
Diverging from that ALWAYS ends up being more work and problems in the end. Only do so if you have no other choice.
thanks,
greg k-h
On Tue, Feb 11, 2025 at 7:30 AM Greg KH gregkh@linuxfoundation.org wrote:
Note this patch doesn't apply to v6.14 as it was obsoleted by commit e2d46f2ec332 ("netfs: Change the read result collector to only use one work item").
Why can't we just take what is upstream instead?
Diverging from that ALWAYS ends up being more work and problems in the end. Only do so if you have no other choice.
Usually I agree with that, and I trust that you will make the right decision.
Before you decide, let me point out that netfs has been extremely unstable since 6.10 (July 2024), ever since commit 2e9d7e4b984a ("mm: Remove the PG_fscache alias for PG_private_2). All of our web servers have been crashing since 6.10 all the time (see https://lore.kernel.org/netfs/CAKPOu+_DA8XiMAA2ApMj7Pyshve_YWknw8Hdt1=zCy9Y8... for one of several bug reports I posted), and I went through considerable trouble by resisting the pressure from people asking me to downgrade to 6.6. (I want the bugs fixed, I don't want to go back.) For several months, 6.12 had been crashing instantly on boot due to yet another netfs regression (https://lore.kernel.org/netfs/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY...) which wasn't fixed until 6.12.11, so our production servers are still on 6.11.11 today. Before these bugs got ironed out, v6.11 commit ee4cdf7ba857 ("netfs: Speed up buffered reading") wreaked more havoc, leading to 8 "Fixes" commits so far, plus the 2 I posted yesterday.
I wrote that e2d46f2ec332 has obsoleted my patch (actually both patches), but I don't know if it really fixes both bugs. The code that was buggy does not exist anymore in the form that my patch addresses, but I don't know if it was just refactored and the bugs were kept (or maybe yet more bugs sneaked in).
Max
On Tue, Feb 11, 2025 at 09:05:44AM +0100, Max Kellermann wrote:
On Tue, Feb 11, 2025 at 7:30 AM Greg KH gregkh@linuxfoundation.org wrote:
Note this patch doesn't apply to v6.14 as it was obsoleted by commit e2d46f2ec332 ("netfs: Change the read result collector to only use one work item").
Why can't we just take what is upstream instead?
Diverging from that ALWAYS ends up being more work and problems in the end. Only do so if you have no other choice.
Usually I agree with that, and I trust that you will make the right decision.
I defer to the maintainers of the subsystem here to make that decision, as they are the ones that will be getting my "FAILED" backport emails when things fail to backport due to this being out-of-tree :)
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org