Hello Ben,
On Tue, May 19, 2020 at 02:53:05PM +0100, Ben Hutchings wrote:
> I noticed that commit 07928d9bfc81 "padata: Remove broken queue
> flushing" has been backported to most stable branches, but commit
> 6fc4dbcf0276 "padata: Replace delayed timer with immediate workqueue in
> padata_reorder" has not.
>
> Is this correct?  What prevents the parallel_data ref-count from
> dropping to 0 while the timer is scheduled?

Doesn't seem like anything does, looking at 4.19.
I can see a race where the timer function uses a parallel_data after free whether or not the refcount goes to 0. Don't think it's likely to happen in practice because of how small the window is between the serial callback finishing and the timer being deactivated.

   task1:
   padata_reorder
                                      task2:
                                      padata_do_serial
                                        // object arrives in reorder queue
     // sees reorder_objects > 0,
     //   set timer for 1 second
     mod_timer
     return
                                        padata_reorder
                                          // queue serial work, which finishes
                                          //   (now possibly no more objects
                                          //    left)
                                          |
   task1:                                 |
   // pd is freed one of two ways:        |
   //   1) pcrypt is unloaded             |
   //   2) padata_replace triggered       |
   //      from userspace                 | (small window)
                                          |
   task3:                                 |
   padata_reorder_timer                   |
     // uses pd after free                |
                                          |
                                          del_timer  // too late

If I got this right, we might want to backport the commit you mentioned
(6fc4dbcf0276) to be on the safe side.