On Mon, Apr 29, 2019 at 02:05:42PM +0200, Salvatore Bonaccorso wrote:
Hi Jan, hi Greg,
On Wed, Mar 20, 2019 at 01:58:06PM +0100, Jan Kara wrote:
Hello,
commit 310ca162d77 "block/loop: Use global lock for ioctl() operation." has been pushed to multiple stable trees. This patch is part of a larger series that overhauls the locking inside the loopback device upstream, and for the 4.4, 4.9, and 4.14 stable trees only this patch from the series was applied. Our testing has now shown [1] that the patch alone makes existing deadlocks inside the loopback driver more likely (the openqa test in our infrastructure didn't hit the deadlock before, whereas with the new kernel it hits it reliably every time). So I would suggest we revert 310ca162d77 from the 4.4, 4.9, and 4.14 kernels.
A Debian user reported this issue [1], which showed up after the recent update to 4.9.168-1 in Debian stretch (based on upstream v4.9.168), and provided the following testcase:
dd if=/dev/zero of=/tmp/ff1.raw bs=1G seek=8 count=0
sync
sleep 1
parted /tmp/ff1.raw mklabel msdos
parted -s /tmp/ff1.raw mkpart primary linux-swap 1 100
parted -s -- /tmp/ff1.raw mkpart primary ext2 101 -1
parted -s -- /tmp/ff1.raw set 2 boot on
sleep 5
losetup -Pf /tmp/ff1.raw --show
I have verified that the same happens with v4.9.171, where the mentioned commit was not reverted, and bisecting with the testcase showed that the problem was introduced with 3ae3d167f5ec2c7bb5fcd12b7772cfadc93b2305 (v4.9.152~9), which is the backport of 310ca162d77 for 4.9.
Reverting 3ae3d167f5ec2c7bb5fcd12b7772cfadc93b2305 on top of v4.9.171 fixed the issue.
Can this commit be reverted in the meantime, or is there ongoing work on integrating the followup fixes mentioned in https://lore.kernel.org/stable/20190321104110.GF29086@quack2.suse.cz/ ?
Sorry for the delay here. No, I didn't find any time for the followup stuff here, and Jan is right, this should just be dropped.
I've now reverted it from 3.18.y, 4.4.y, 4.9.y, and 4.14.y.
thanks,
greg k-h