On Mon, Sep 10, 2018 at 02:45:31PM +0200, Lars Ellenberg wrote:
On Sat, Sep 08, 2018 at 09:34:32AM +0200, Valentin Vidic wrote:
On Fri, Sep 07, 2018 at 07:14:59PM +0200, Valentin Vidic wrote:
In fact, the first one is the original code path before I modified blkback. The problem is that it is executed asynchronously from a workqueue, so it might not always run before the call to drbdadm secondary.
As the DRBD device gets released only when the last IO request has finished, I found a way to check and wait for this in the block-drbd script:
--- block-drbd.orig 2018-09-08 09:07:23.499648515 +0200
+++ block-drbd 2018-09-08 09:28:12.892193649 +0200
@@ -230,6 +230,24 @@
 and so cannot be mounted ${m2}${when}."
 }
+wait_for_inflight()
+{
+  local dev="$1"
+  local inflight="/sys/block/${dev#/dev/}/inflight"
+  local rd wr
+  if ! [ -f "$inflight" ]; then
+    return
+  fi
+  while true; do
+    read rd wr < $inflight
+    if [ "$rd" = "0" -a "$wr" = "0" ]; then
If it is "idle" now, but still "open", this will not sleep, and still fail the demotion below.
True, but in this case blkback is holding it open until all the writes have finished, and the last write closes the device. Since fuser can't check blkback, this is an approximation that seems to work: I don't get any failed drbdadm calls now.
You try to help it by "waiting forever until it appears to be idle". I suggest at least limiting the retries, by iteration count or by time. And also (or instead, though you'd potentially get a number of "scary messages" in the logs) add something like:
Ok, should I open a PR to discuss this change further?
Or, well, yes, fix blkback to not "defer" the final close "too long", if at all possible.
blkback needs to finish the writes on shutdown, or I get fsck errors on the next boot. Ideally XenbusStateClosed should be delayed until the device release, but currently that does not seem possible without breaking other things.
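
For reference, a minimal sketch of how the retry limit suggested above might look in wait_for_inflight. This is not the patch as posted: the second argument (max_tries), the SYSFS_BLOCK override and the INFLIGHT_POLL_INTERVAL variable are hypothetical names introduced here for illustration, and the defaults of 30 tries and 1 second are arbitrary.

```shell
# Sketch only: a bounded variant of wait_for_inflight.
# max_tries, SYSFS_BLOCK and INFLIGHT_POLL_INTERVAL are illustrative
# assumptions, not part of the original block-drbd patch.
wait_for_inflight()
{
  local dev="$1"
  local max_tries="${2:-30}"
  local inflight="${SYSFS_BLOCK:-/sys/block}/${dev#/dev/}/inflight"
  local rd wr
  local tries=0

  # No inflight counter for this device: nothing to wait for.
  [ -f "$inflight" ] || return 0

  while [ "$tries" -lt "$max_tries" ]; do
    # The file holds two counters: reads and writes currently in flight.
    read -r rd wr < "$inflight"
    if [ "$rd" = "0" ] && [ "$wr" = "0" ]; then
      return 0
    fi
    tries=$((tries + 1))
    sleep "${INFLIGHT_POLL_INTERVAL:-1}"
  done

  # Still busy after max_tries polls: let the caller decide what to do.
  return 1
}
```

A caller could check the return status before running drbdadm secondary and log a warning if the device never became idle. As noted earlier in the thread, an idle counter does not prove the device has been closed, so this only bounds the wait; it does not remove the race.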