On Tue, 2020-05-19 at 16:00 -0400, Daniel Jordan wrote:
Hello Ben,
On Tue, May 19, 2020 at 02:53:05PM +0100, Ben Hutchings wrote:
I noticed that commit 07928d9bfc81 "padata: Remove broken queue flushing" has been backported to most stable branches, but commit 6fc4dbcf0276 "padata: Replace delayed timer with immediate workqueue in padata_reorder" has not.
Is this correct? What prevents the parallel_data ref-count from dropping to 0 while the timer is scheduled?
Doesn't seem like anything does, looking at 4.19.
OK, so it looks like the following commits should be backported:
[3.16-4.9] 119a0798dc42 padata: Remove unused but set variables [3.16] de5540d088fe padata: avoid race in reordering [3.16-4.9] 69b348449bda padata: get_next is never NULL [3.16-4.14] cf5868c8a22d padata: ensure the reorder timer callback runs on the correct CPU [3.16-4.14] 350ef88e7e92 padata: ensure padata_do_serial() runs on the correct CPU [3.16-4.19] 6fc4dbcf0276 padata: Replace delayed timer with immediate workqueue in padata_reorder [3.16-4.19] ec9c7d19336e padata: initialize pd->cpu with effective cpumask [3.16-4.19] 065cf577135a padata: purge get_cpu and reorder_via_wq from padata_do_serial
Ben.
I can see a race where the timer function uses a parallel_data after free whether or not the refcount goes to 0. Don't think it's likely to happen in practice because of how small the window is between the serial callback finishing and the timer being deactivated.
task1: padata_reorder task2: padata_do_serial // object arrives in reorder queue // sees reorder_objects > 0, // set timer for 1 second mod_timer return padata_reorder // queue serial work, which finishes // (now possibly no more objects // left) | task1: | // pd is freed one of two ways: | // 1) pcrypt is unloaded | // 2) padata_replace triggered | // from userspace | (small window) | task3: | padata_reorder_timer | // uses pd after free | | del_timer // too late
If I got this right we might want to backport the commit you mentioned to be on the safe side.