On Mon, Dec 16, 2019 at 10:18:48AM +0100, Andre Tomt wrote:
On 16.12.2019 08:42, Jack Wang wrote:
On Sat, Dec 14, 2019 at 3:24 PM, Andre Tomt <andre@tomt.net> wrote:
4.19.87, 4.19.88, 4.19.89 all lock up frequently on some of my systems. The same systems run 5.4.3 fine, so the newer trees are probably OK. Reverting this commit on top of 4.19.87 makes everything stable.
To trigger it, all I have to do is re-rsync a directory tree with some file churn; it will usually crash within 10 to 30 minutes.
The crashing systems have an ext4 filesystem on a two-SSD md RAID1, mounted with the discard mount option. When it is mounted without discard, the crashes no longer seem to occur.
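To check whether a machine matches the configuration described above, a minimal sketch along these lines could list ext4 filesystems mounted with discard. It assumes the standard /proc/mounts field layout (device, mount point, type, options); the function name ext4_discard_mounts is hypothetical and not part of any tool mentioned in this thread.

```python
def ext4_discard_mounts(mounts_text):
    """Return (device, mountpoint) pairs for ext4 mounts using 'discard'.

    mounts_text is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        # Match the exact option name so that "nodiscard" is not counted.
        if fstype == "ext4" and "discard" in options.split(","):
            hits.append((device, mountpoint))
    return hits

# Hypothetical usage on a live Linux system:
#   with open("/proc/mounts") as f:
#       print(ext4_discard_mounts(f.read()))
```

This only reports the mount option; it does not tell whether the running kernel contains the affected commit or the fix.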
No oops/panic made it to the IPMI console. I suspect the console is just misbehaving and it didn't really livelock. At one point, one line of the crash made it to the console (kernel BUG at block/blk-core.c:1776), and it was enough to pinpoint this commit. Note that the line number might be off, as I was attempting a bisect at the time.
This commit also made it to 4.14.x, but I have not tested it.
Hi Andre,
I noticed that one discard merge fix is missing from 4.19.y: 2a5cf35cd6c5 ("block: fix single range discard merge").
Can you check whether it helps? Just run "git cherry-pick 2a5cf35cd6c5".
Indeed, adding this commit on top of a clean 4.19.89 fixes the issue. So far it has survived about an hour of rsync file churn.
Great!
Thanks Jack for finding the fix and Andre for reporting this. I'll go queue this fix up right now.
greg k-h