The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9a6f209e36500efac51528132a3e3083586eda5f Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Mon, 19 Nov 2018 14:15:36 +0000
Subject: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent
snapshot creation
If the quota enable and snapshot creation ioctls are called concurrently
we can get into a deadlock where the task enabling quotas will deadlock
on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
twice, or the task creating a snapshot tries to commit the transaction
while the task enabling quota waits for the former task to commit the
transaction while holding the mutex. The following time diagrams show how
both cases happen.
First scenario:
CPU 0 CPU 1
btrfs_ioctl()
btrfs_ioctl_quota_ctl()
btrfs_quota_enable()
mutex_lock(fs_info->qgroup_ioctl_lock)
btrfs_start_transaction()
btrfs_ioctl()
btrfs_ioctl_snap_create_v2
create_snapshot()
--> adds snapshot to the
list pending_snapshots
of the current
transaction
btrfs_commit_transaction()
create_pending_snapshots()
create_pending_snapshot()
qgroup_account_snapshot()
btrfs_qgroup_inherit()
mutex_lock(fs_info->qgroup_ioctl_lock)
--> deadlock, mutex already locked
by this task at
btrfs_quota_enable()
Second scenario:
CPU 0 CPU 1
btrfs_ioctl()
btrfs_ioctl_quota_ctl()
btrfs_quota_enable()
mutex_lock(fs_info->qgroup_ioctl_lock)
btrfs_start_transaction()
btrfs_ioctl()
btrfs_ioctl_snap_create_v2
create_snapshot()
--> adds snapshot to the
list pending_snapshots
of the current
transaction
btrfs_commit_transaction()
--> waits for task at
CPU 0 to release
its transaction
handle
btrfs_commit_transaction()
--> sees another task started
the transaction commit first
--> releases its transaction
handle
--> waits for the transaction
commit to be completed by
the task at CPU 1
create_pending_snapshot()
qgroup_account_snapshot()
btrfs_qgroup_inherit()
mutex_lock(fs_info->qgroup_ioctl_lock)
--> deadlock, task at CPU 0
has the mutex locked but
it is waiting for us to
finish the transaction
commit
So fix this by setting the quota enabled flag in fs_info after committing
the transaction at btrfs_quota_enable(). This ends up serializing quota
enable and snapshot creation as if the snapshot creation happened just
before the quota enable request. The quota rescan task, scheduled after
committing the transaction in btrfs_quote_enable(), will do the accounting.
Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating snapshot")
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index b53d2e7b938d..2272419ade7e 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1013,16 +1013,22 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
btrfs_abort_transaction(trans, ret);
goto out_free_path;
}
- spin_lock(&fs_info->qgroup_lock);
- fs_info->quota_root = quota_root;
- set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
- spin_unlock(&fs_info->qgroup_lock);
ret = btrfs_commit_transaction(trans);
trans = NULL;
if (ret)
goto out_free_path;
+ /*
+ * Set quota enabled flag after committing the transaction, to avoid
+ * deadlocks on fs_info->qgroup_ioctl_lock with concurrent snapshot
+ * creation.
+ */
+ spin_lock(&fs_info->qgroup_lock);
+ fs_info->quota_root = quota_root;
+ set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
+ spin_unlock(&fs_info->qgroup_lock);
+
ret = qgroup_rescan_init(fs_info, 0, 1);
if (!ret) {
qgroup_rescan_zero_tracking(fs_info);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9a6f209e36500efac51528132a3e3083586eda5f Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Mon, 19 Nov 2018 14:15:36 +0000
Subject: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent
snapshot creation
If the quota enable and snapshot creation ioctls are called concurrently
we can get into a deadlock where the task enabling quotas will deadlock
on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
twice, or the task creating a snapshot tries to commit the transaction
while the task enabling quota waits for the former task to commit the
transaction while holding the mutex. The following time diagrams show how
both cases happen.
First scenario:
CPU 0 CPU 1
btrfs_ioctl()
btrfs_ioctl_quota_ctl()
btrfs_quota_enable()
mutex_lock(fs_info->qgroup_ioctl_lock)
btrfs_start_transaction()
btrfs_ioctl()
btrfs_ioctl_snap_create_v2
create_snapshot()
--> adds snapshot to the
list pending_snapshots
of the current
transaction
btrfs_commit_transaction()
create_pending_snapshots()
create_pending_snapshot()
qgroup_account_snapshot()
btrfs_qgroup_inherit()
mutex_lock(fs_info->qgroup_ioctl_lock)
--> deadlock, mutex already locked
by this task at
btrfs_quota_enable()
Second scenario:
CPU 0 CPU 1
btrfs_ioctl()
btrfs_ioctl_quota_ctl()
btrfs_quota_enable()
mutex_lock(fs_info->qgroup_ioctl_lock)
btrfs_start_transaction()
btrfs_ioctl()
btrfs_ioctl_snap_create_v2
create_snapshot()
--> adds snapshot to the
list pending_snapshots
of the current
transaction
btrfs_commit_transaction()
--> waits for task at
CPU 0 to release
its transaction
handle
btrfs_commit_transaction()
--> sees another task started
the transaction commit first
--> releases its transaction
handle
--> waits for the transaction
commit to be completed by
the task at CPU 1
create_pending_snapshot()
qgroup_account_snapshot()
btrfs_qgroup_inherit()
mutex_lock(fs_info->qgroup_ioctl_lock)
--> deadlock, task at CPU 0
has the mutex locked but
it is waiting for us to
finish the transaction
commit
So fix this by setting the quota enabled flag in fs_info after committing
the transaction at btrfs_quota_enable(). This ends up serializing quota
enable and snapshot creation as if the snapshot creation happened just
before the quota enable request. The quota rescan task, scheduled after
committing the transaction in btrfs_quote_enable(), will do the accounting.
Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating snapshot")
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index b53d2e7b938d..2272419ade7e 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1013,16 +1013,22 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
btrfs_abort_transaction(trans, ret);
goto out_free_path;
}
- spin_lock(&fs_info->qgroup_lock);
- fs_info->quota_root = quota_root;
- set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
- spin_unlock(&fs_info->qgroup_lock);
ret = btrfs_commit_transaction(trans);
trans = NULL;
if (ret)
goto out_free_path;
+ /*
+ * Set quota enabled flag after committing the transaction, to avoid
+ * deadlocks on fs_info->qgroup_ioctl_lock with concurrent snapshot
+ * creation.
+ */
+ spin_lock(&fs_info->qgroup_lock);
+ fs_info->quota_root = quota_root;
+ set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
+ spin_unlock(&fs_info->qgroup_lock);
+
ret = qgroup_rescan_init(fs_info, 0, 1);
if (!ret) {
qgroup_rescan_zero_tracking(fs_info);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9a6f209e36500efac51528132a3e3083586eda5f Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Mon, 19 Nov 2018 14:15:36 +0000
Subject: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent
snapshot creation
If the quota enable and snapshot creation ioctls are called concurrently
we can get into a deadlock where the task enabling quotas will deadlock
on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
twice, or the task creating a snapshot tries to commit the transaction
while the task enabling quota waits for the former task to commit the
transaction while holding the mutex. The following time diagrams show how
both cases happen.
First scenario:
CPU 0 CPU 1
btrfs_ioctl()
btrfs_ioctl_quota_ctl()
btrfs_quota_enable()
mutex_lock(fs_info->qgroup_ioctl_lock)
btrfs_start_transaction()
btrfs_ioctl()
btrfs_ioctl_snap_create_v2
create_snapshot()
--> adds snapshot to the
list pending_snapshots
of the current
transaction
btrfs_commit_transaction()
create_pending_snapshots()
create_pending_snapshot()
qgroup_account_snapshot()
btrfs_qgroup_inherit()
mutex_lock(fs_info->qgroup_ioctl_lock)
--> deadlock, mutex already locked
by this task at
btrfs_quota_enable()
Second scenario:
CPU 0 CPU 1
btrfs_ioctl()
btrfs_ioctl_quota_ctl()
btrfs_quota_enable()
mutex_lock(fs_info->qgroup_ioctl_lock)
btrfs_start_transaction()
btrfs_ioctl()
btrfs_ioctl_snap_create_v2
create_snapshot()
--> adds snapshot to the
list pending_snapshots
of the current
transaction
btrfs_commit_transaction()
--> waits for task at
CPU 0 to release
its transaction
handle
btrfs_commit_transaction()
--> sees another task started
the transaction commit first
--> releases its transaction
handle
--> waits for the transaction
commit to be completed by
the task at CPU 1
create_pending_snapshot()
qgroup_account_snapshot()
btrfs_qgroup_inherit()
mutex_lock(fs_info->qgroup_ioctl_lock)
--> deadlock, task at CPU 0
has the mutex locked but
it is waiting for us to
finish the transaction
commit
So fix this by setting the quota enabled flag in fs_info after committing
the transaction at btrfs_quota_enable(). This ends up serializing quota
enable and snapshot creation as if the snapshot creation happened just
before the quota enable request. The quota rescan task, scheduled after
committing the transaction in btrfs_quote_enable(), will do the accounting.
Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating snapshot")
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index b53d2e7b938d..2272419ade7e 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1013,16 +1013,22 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
btrfs_abort_transaction(trans, ret);
goto out_free_path;
}
- spin_lock(&fs_info->qgroup_lock);
- fs_info->quota_root = quota_root;
- set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
- spin_unlock(&fs_info->qgroup_lock);
ret = btrfs_commit_transaction(trans);
trans = NULL;
if (ret)
goto out_free_path;
+ /*
+ * Set quota enabled flag after committing the transaction, to avoid
+ * deadlocks on fs_info->qgroup_ioctl_lock with concurrent snapshot
+ * creation.
+ */
+ spin_lock(&fs_info->qgroup_lock);
+ fs_info->quota_root = quota_root;
+ set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
+ spin_unlock(&fs_info->qgroup_lock);
+
ret = qgroup_rescan_init(fs_info, 0, 1);
if (!ret) {
qgroup_rescan_zero_tracking(fs_info);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5a8067c0d17feb7579db0476191417b441a8996e Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Mon, 19 Nov 2018 09:48:12 +0000
Subject: [PATCH] Btrfs: fix access to available allocation bits when starting
balance
The available allocation bits members from struct btrfs_fs_info are
protected by a sequence lock, and when starting balance we access them
incorrectly in two different ways:
1) In the read sequence lock loop at btrfs_balance() we use the values we
read from fs_info->avail_*_alloc_bits and we can immediately do actions
that have side effects and can not be undone (printing a message and
jumping to a label). This is wrong because a retry might be needed, so
our actions must not have side effects and must be repeatable as long
as read_seqretry() returns a non-zero value. In other words, we were
essentially ignoring the sequence lock;
2) Right below the read sequence lock loop, we were reading the values
from avail_metadata_alloc_bits and avail_data_alloc_bits without any
protection from concurrent writers, that is, reading them outside of
the read sequence lock critical section.
So fix this by making sure we only read the available allocation bits
while in a read sequence lock critical section and that what we do in the
critical section is repeatable (has nothing that can not be undone) so
that any eventual retry that is needed is handled properly.
Fixes: de98ced9e743 ("Btrfs: use seqlock to protect fs_info->avail_{data, metadata, system}_alloc_bits")
Fixes: 14506127979a ("btrfs: fix a bogus warning when converting only data or metadata")
Reviewed-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a89596c3de56..2ebc827bed88 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3938,6 +3938,7 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
int ret;
u64 num_devices;
unsigned seq;
+ bool reducing_integrity;
if (btrfs_fs_closing(fs_info) ||
atomic_read(&fs_info->balance_pause_req) ||
@@ -4017,24 +4018,30 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
!(bctl->sys.target & allowed)) ||
((bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
(fs_info->avail_metadata_alloc_bits & allowed) &&
- !(bctl->meta.target & allowed))) {
- if (bctl->flags & BTRFS_BALANCE_FORCE) {
- btrfs_info(fs_info,
- "balance: force reducing metadata integrity");
- } else {
- btrfs_err(fs_info,
- "balance: reduces metadata integrity, use --force if you want this");
- ret = -EINVAL;
- goto out;
- }
- }
+ !(bctl->meta.target & allowed)))
+ reducing_integrity = true;
+ else
+ reducing_integrity = false;
+
+ /* if we're not converting, the target field is uninitialized */
+ meta_target = (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
+ bctl->meta.target : fs_info->avail_metadata_alloc_bits;
+ data_target = (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
+ bctl->data.target : fs_info->avail_data_alloc_bits;
} while (read_seqretry(&fs_info->profiles_lock, seq));
- /* if we're not converting, the target field is uninitialized */
- meta_target = (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
- bctl->meta.target : fs_info->avail_metadata_alloc_bits;
- data_target = (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
- bctl->data.target : fs_info->avail_data_alloc_bits;
+ if (reducing_integrity) {
+ if (bctl->flags & BTRFS_BALANCE_FORCE) {
+ btrfs_info(fs_info,
+ "balance: force reducing metadata integrity");
+ } else {
+ btrfs_err(fs_info,
+ "balance: reduces metadata integrity, use --force if you want this");
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+
if (btrfs_get_num_tolerated_disk_barrier_failures(meta_target) <
btrfs_get_num_tolerated_disk_barrier_failures(data_target)) {
int meta_index = btrfs_bg_flags_to_raid_index(meta_target);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5a8067c0d17feb7579db0476191417b441a8996e Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Mon, 19 Nov 2018 09:48:12 +0000
Subject: [PATCH] Btrfs: fix access to available allocation bits when starting
balance
The available allocation bits members from struct btrfs_fs_info are
protected by a sequence lock, and when starting balance we access them
incorrectly in two different ways:
1) In the read sequence lock loop at btrfs_balance() we use the values we
read from fs_info->avail_*_alloc_bits and we can immediately do actions
that have side effects and can not be undone (printing a message and
jumping to a label). This is wrong because a retry might be needed, so
our actions must not have side effects and must be repeatable as long
as read_seqretry() returns a non-zero value. In other words, we were
essentially ignoring the sequence lock;
2) Right below the read sequence lock loop, we were reading the values
from avail_metadata_alloc_bits and avail_data_alloc_bits without any
protection from concurrent writers, that is, reading them outside of
the read sequence lock critical section.
So fix this by making sure we only read the available allocation bits
while in a read sequence lock critical section and that what we do in the
critical section is repeatable (has nothing that can not be undone) so
that any eventual retry that is needed is handled properly.
Fixes: de98ced9e743 ("Btrfs: use seqlock to protect fs_info->avail_{data, metadata, system}_alloc_bits")
Fixes: 14506127979a ("btrfs: fix a bogus warning when converting only data or metadata")
Reviewed-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a89596c3de56..2ebc827bed88 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3938,6 +3938,7 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
int ret;
u64 num_devices;
unsigned seq;
+ bool reducing_integrity;
if (btrfs_fs_closing(fs_info) ||
atomic_read(&fs_info->balance_pause_req) ||
@@ -4017,24 +4018,30 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
!(bctl->sys.target & allowed)) ||
((bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
(fs_info->avail_metadata_alloc_bits & allowed) &&
- !(bctl->meta.target & allowed))) {
- if (bctl->flags & BTRFS_BALANCE_FORCE) {
- btrfs_info(fs_info,
- "balance: force reducing metadata integrity");
- } else {
- btrfs_err(fs_info,
- "balance: reduces metadata integrity, use --force if you want this");
- ret = -EINVAL;
- goto out;
- }
- }
+ !(bctl->meta.target & allowed)))
+ reducing_integrity = true;
+ else
+ reducing_integrity = false;
+
+ /* if we're not converting, the target field is uninitialized */
+ meta_target = (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
+ bctl->meta.target : fs_info->avail_metadata_alloc_bits;
+ data_target = (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
+ bctl->data.target : fs_info->avail_data_alloc_bits;
} while (read_seqretry(&fs_info->profiles_lock, seq));
- /* if we're not converting, the target field is uninitialized */
- meta_target = (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
- bctl->meta.target : fs_info->avail_metadata_alloc_bits;
- data_target = (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
- bctl->data.target : fs_info->avail_data_alloc_bits;
+ if (reducing_integrity) {
+ if (bctl->flags & BTRFS_BALANCE_FORCE) {
+ btrfs_info(fs_info,
+ "balance: force reducing metadata integrity");
+ } else {
+ btrfs_err(fs_info,
+ "balance: reduces metadata integrity, use --force if you want this");
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+
if (btrfs_get_num_tolerated_disk_barrier_failures(meta_target) <
btrfs_get_num_tolerated_disk_barrier_failures(data_target)) {
int meta_index = btrfs_bg_flags_to_raid_index(meta_target);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5a8067c0d17feb7579db0476191417b441a8996e Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Mon, 19 Nov 2018 09:48:12 +0000
Subject: [PATCH] Btrfs: fix access to available allocation bits when starting
balance
The available allocation bits members from struct btrfs_fs_info are
protected by a sequence lock, and when starting balance we access them
incorrectly in two different ways:
1) In the read sequence lock loop at btrfs_balance() we use the values we
read from fs_info->avail_*_alloc_bits and we can immediately do actions
that have side effects and can not be undone (printing a message and
jumping to a label). This is wrong because a retry might be needed, so
our actions must not have side effects and must be repeatable as long
as read_seqretry() returns a non-zero value. In other words, we were
essentially ignoring the sequence lock;
2) Right below the read sequence lock loop, we were reading the values
from avail_metadata_alloc_bits and avail_data_alloc_bits without any
protection from concurrent writers, that is, reading them outside of
the read sequence lock critical section.
So fix this by making sure we only read the available allocation bits
while in a read sequence lock critical section and that what we do in the
critical section is repeatable (has nothing that can not be undone) so
that any eventual retry that is needed is handled properly.
Fixes: de98ced9e743 ("Btrfs: use seqlock to protect fs_info->avail_{data, metadata, system}_alloc_bits")
Fixes: 14506127979a ("btrfs: fix a bogus warning when converting only data or metadata")
Reviewed-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a89596c3de56..2ebc827bed88 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3938,6 +3938,7 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
int ret;
u64 num_devices;
unsigned seq;
+ bool reducing_integrity;
if (btrfs_fs_closing(fs_info) ||
atomic_read(&fs_info->balance_pause_req) ||
@@ -4017,24 +4018,30 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
!(bctl->sys.target & allowed)) ||
((bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
(fs_info->avail_metadata_alloc_bits & allowed) &&
- !(bctl->meta.target & allowed))) {
- if (bctl->flags & BTRFS_BALANCE_FORCE) {
- btrfs_info(fs_info,
- "balance: force reducing metadata integrity");
- } else {
- btrfs_err(fs_info,
- "balance: reduces metadata integrity, use --force if you want this");
- ret = -EINVAL;
- goto out;
- }
- }
+ !(bctl->meta.target & allowed)))
+ reducing_integrity = true;
+ else
+ reducing_integrity = false;
+
+ /* if we're not converting, the target field is uninitialized */
+ meta_target = (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
+ bctl->meta.target : fs_info->avail_metadata_alloc_bits;
+ data_target = (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
+ bctl->data.target : fs_info->avail_data_alloc_bits;
} while (read_seqretry(&fs_info->profiles_lock, seq));
- /* if we're not converting, the target field is uninitialized */
- meta_target = (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
- bctl->meta.target : fs_info->avail_metadata_alloc_bits;
- data_target = (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
- bctl->data.target : fs_info->avail_data_alloc_bits;
+ if (reducing_integrity) {
+ if (bctl->flags & BTRFS_BALANCE_FORCE) {
+ btrfs_info(fs_info,
+ "balance: force reducing metadata integrity");
+ } else {
+ btrfs_err(fs_info,
+ "balance: reduces metadata integrity, use --force if you want this");
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+
if (btrfs_get_num_tolerated_disk_barrier_failures(meta_target) <
btrfs_get_num_tolerated_disk_barrier_failures(data_target)) {
int meta_index = btrfs_bg_flags_to_raid_index(meta_target);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 53290432145a8eb143fe29e06e9c1465d43dc723 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon(a)arm.com>
Date: Thu, 3 Jan 2019 18:00:39 +0000
Subject: [PATCH] arm64: compat: Don't pull syscall number from regs in
arm_compat_syscall
The syscall number may have been changed by a tracer, so we should pass
the actual number in from the caller instead of pulling it from the
saved r7 value directly.
Cc: <stable(a)vger.kernel.org>
Cc: Pi-Hsun Shih <pihsun(a)chromium.org>
Reviewed-by: Dave Martin <Dave.Martin(a)arm.com>
Signed-off-by: Will Deacon <will.deacon(a)arm.com>
diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c
index a79db4e485a6..bc348ab3dd6b 100644
--- a/arch/arm64/kernel/sys_compat.c
+++ b/arch/arm64/kernel/sys_compat.c
@@ -66,12 +66,11 @@ do_compat_cache_op(unsigned long start, unsigned long end, int flags)
/*
* Handle all unrecognised system calls.
*/
-long compat_arm_syscall(struct pt_regs *regs)
+long compat_arm_syscall(struct pt_regs *regs, int scno)
{
- unsigned int no = regs->regs[7];
void __user *addr;
- switch (no) {
+ switch (scno) {
/*
* Flush a region from virtual address 'r0' to virtual address 'r1'
* _exclusive_. There is no alignment requirement on either address;
@@ -107,7 +106,7 @@ long compat_arm_syscall(struct pt_regs *regs)
* way the calling program can gracefully determine whether
* a feature is supported.
*/
- if (no < __ARM_NR_COMPAT_END)
+ if (scno < __ARM_NR_COMPAT_END)
return -ENOSYS;
break;
}
@@ -116,6 +115,6 @@ long compat_arm_syscall(struct pt_regs *regs)
(compat_thumb_mode(regs) ? 2 : 4);
arm64_notify_die("Oops - bad compat syscall(2)", regs,
- SIGILL, ILL_ILLTRP, addr, no);
+ SIGILL, ILL_ILLTRP, addr, scno);
return 0;
}
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 032d22312881..5610ac01c1ec 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -13,16 +13,15 @@
#include <asm/thread_info.h>
#include <asm/unistd.h>
-long compat_arm_syscall(struct pt_regs *regs);
-
+long compat_arm_syscall(struct pt_regs *regs, int scno);
long sys_ni_syscall(void);
-asmlinkage long do_ni_syscall(struct pt_regs *regs)
+static long do_ni_syscall(struct pt_regs *regs, int scno)
{
#ifdef CONFIG_COMPAT
long ret;
if (is_compat_task()) {
- ret = compat_arm_syscall(regs);
+ ret = compat_arm_syscall(regs, scno);
if (ret != -ENOSYS)
return ret;
}
@@ -47,7 +46,7 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
syscall_fn = syscall_table[array_index_nospec(scno, sc_nr)];
ret = __invoke_syscall(regs, syscall_fn);
} else {
- ret = do_ni_syscall(regs);
+ ret = do_ni_syscall(regs, scno);
}
regs->regs[0] = ret;
Commit fb544d1ca65a89f7a3895f7531221ceeed74ada7 upstream.
We recently addressed a VMID generation race by introducing a read/write
lock around accesses and updates to the vmid generation values.
However, kvm_arch_vcpu_ioctl_run() also calls need_new_vmid_gen() but
does so without taking the read lock.
As far as I can tell, this can lead to the same kind of race:
VM 0, VCPU 0 VM 0, VCPU 1
------------ ------------
update_vttbr (vmid 254)
update_vttbr (vmid 1) // roll over
read_lock(kvm_vmid_lock);
force_vm_exit()
local_irq_disable
need_new_vmid_gen == false //because vmid gen matches
enter_guest (vmid 254)
kvm_arch.vttbr = <PGD>:<VMID 1>
read_unlock(kvm_vmid_lock);
enter_guest (vmid 1)
Which results in running two VCPUs in the same VM with different VMIDs
and (even worse) other VCPUs from other VMs could now allocate clashing
VMID 254 from the new generation as long as VCPU 0 is not exiting.
Attempt to solve this by making sure vttbr is updated before another CPU
can observe the updated VMID generation.
Change-Id: I40aae6e89a3c8a496e13fcd8ae6bb663d16b057c
Cc: stable(a)vger.kernel.org # v4.19
Fixes: f0cf47d939d0 "KVM: arm/arm64: Close VMID generation race"
Reviewed-by: Julien Thierry <julien.thierry(a)arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall(a)arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier(a)arm.com>
---
virt/kvm/arm/arm.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 8fb31a7cc22c..91495045ad5a 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -66,7 +66,7 @@ static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
static u32 kvm_next_vmid;
static unsigned int kvm_vmid_bits __read_mostly;
-static DEFINE_RWLOCK(kvm_vmid_lock);
+static DEFINE_SPINLOCK(kvm_vmid_lock);
static bool vgic_present;
@@ -482,7 +482,9 @@ void force_vm_exit(const cpumask_t *mask)
*/
static bool need_new_vmid_gen(struct kvm *kvm)
{
- return unlikely(kvm->arch.vmid_gen != atomic64_read(&kvm_vmid_gen));
+ u64 current_vmid_gen = atomic64_read(&kvm_vmid_gen);
+ smp_rmb(); /* Orders read of kvm_vmid_gen and kvm->arch.vmid */
+ return unlikely(READ_ONCE(kvm->arch.vmid_gen) != current_vmid_gen);
}
/**
@@ -497,16 +499,11 @@ static void update_vttbr(struct kvm *kvm)
{
phys_addr_t pgd_phys;
u64 vmid;
- bool new_gen;
- read_lock(&kvm_vmid_lock);
- new_gen = need_new_vmid_gen(kvm);
- read_unlock(&kvm_vmid_lock);
-
- if (!new_gen)
+ if (!need_new_vmid_gen(kvm))
return;
- write_lock(&kvm_vmid_lock);
+ spin_lock(&kvm_vmid_lock);
/*
* We need to re-check the vmid_gen here to ensure that if another vcpu
@@ -514,7 +511,7 @@ static void update_vttbr(struct kvm *kvm)
* use the same vmid.
*/
if (!need_new_vmid_gen(kvm)) {
- write_unlock(&kvm_vmid_lock);
+ spin_unlock(&kvm_vmid_lock);
return;
}
@@ -537,7 +534,6 @@ static void update_vttbr(struct kvm *kvm)
kvm_call_hyp(__kvm_flush_vm_context);
}
- kvm->arch.vmid_gen = atomic64_read(&kvm_vmid_gen);
kvm->arch.vmid = kvm_next_vmid;
kvm_next_vmid++;
kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
@@ -548,7 +544,10 @@ static void update_vttbr(struct kvm *kvm)
vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
- write_unlock(&kvm_vmid_lock);
+ smp_wmb();
+ WRITE_ONCE(kvm->arch.vmid_gen, atomic64_read(&kvm_vmid_gen));
+
+ spin_unlock(&kvm_vmid_lock);
}
static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
--
2.18.0
This is a note to let you know that I've just added the patch titled
staging: rtl8188eu: Add device code for D-Link DWA-121 rev B1
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 5f74a8cbb38d10615ed46bc3e37d9a4c9af8045a Mon Sep 17 00:00:00 2001
From: Michael Straube <straube.linux(a)gmail.com>
Date: Mon, 7 Jan 2019 18:28:58 +0100
Subject: staging: rtl8188eu: Add device code for D-Link DWA-121 rev B1
This device was added to the stand-alone driver on github.
Add it to the staging driver as well.
Link: https://github.com/lwfinger/rtl8188eu/commit/a0619a07cd1e
Signed-off-by: Michael Straube <straube.linux(a)gmail.com>
Acked-by: Larry Finger <Larry.Finger(a)lwfinger.net>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/rtl8188eu/os_dep/usb_intf.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/staging/rtl8188eu/os_dep/usb_intf.c b/drivers/staging/rtl8188eu/os_dep/usb_intf.c
index 28cbd6b3d26c..dfee6985efa6 100644
--- a/drivers/staging/rtl8188eu/os_dep/usb_intf.c
+++ b/drivers/staging/rtl8188eu/os_dep/usb_intf.c
@@ -35,6 +35,7 @@ static const struct usb_device_id rtw_usb_id_tbl[] = {
{USB_DEVICE(0x2001, 0x330F)}, /* DLink DWA-125 REV D1 */
{USB_DEVICE(0x2001, 0x3310)}, /* Dlink DWA-123 REV D1 */
{USB_DEVICE(0x2001, 0x3311)}, /* DLink GO-USB-N150 REV B1 */
+ {USB_DEVICE(0x2001, 0x331B)}, /* D-Link DWA-121 rev B1 */
{USB_DEVICE(0x2357, 0x010c)}, /* TP-Link TL-WN722N v2 */
{USB_DEVICE(0x0df6, 0x0076)}, /* Sitecom N150 v2 */
{USB_DEVICE(USB_VENDER_ID_REALTEK, 0xffef)}, /* Rosewill RNX-N150NUB */
--
2.20.1
https://bugzilla.kernel.org/show_bug.cgi?id=202133
Hi,
After upgrading kernel from 4.14.40 to 4.14.88,I found that 'HP FlexFabric 10Gb 2-port 554FLB Adapter' device is not in use. There are erros in dmesg log.
The Server is 'HP FlexServer B390'.
Device info:
lspci -n | grep 04:00
04:00.2 0c04: 19a2:0714 (rev 01)
...
04:00.2 Fibre Channel: Emulex Corporation OneConnect 10Gb FCoE Initiator (be3) (rev 01)
Subsystem: Hewlett-Packard Company NC554FLB 10Gb 2-port FlexFabric Converged Network Adapter
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin C routed to IRQ 95
...
Kernel driver in use: lpfc
The error info:
[ 1046.980480] lpfc 0000:04:00.3: 1:1303 Link Up Event x1 received Data: x1 x0 x4 x0 x0 x0 0
[ 1046.980482] lpfc 0000:04:00.3: 1:(0):2753 PLOGI failure DID:020009 Status:x3/x103
[ 1050.435167] lpfc 0000:04:00.2: 0:(0):2753 PLOGI failure DID:010012 Status:x3/x103
[ 1065.713327] lpfc 0000:04:00.3: 1:(0):2753 PLOGI failure DID:040002 Status:x3/x103
[ 1072.331933] lpfc 0000:04:00.2: 0:(0):2753 PLOGI failure DID:030003 Status:x3/x103
[ 1137.628132] lpfc 0000:04:00.2: 0:(0):0748 abort handler timed out waiting for aborting I/O (xri:x64) to complete: ret 0x2003, ID 2, LUN 0
[ 1137.644257] lpfc 0000:04:00.2: 0:(0):0713 SCSI layer issued Device Reset (2, 0) return x2002
[ 1139.676124] lpfc 0000:04:00.3: 1:(0):0748 abort handler timed out waiting for aborting I/O (xri:x464) to complete: ret 0x2003, ID 4, LUN 0
[ 1139.692242] lpfc 0000:04:00.3: 1:(0):0713 SCSI layer issued Device Reset (4, 0) return x2002
[ 1197.664150] lpfc 0000:04:00.2: 0:(0):0724 I/O flush failure for context LUN : cnt x1
[ 1197.664344] lpfc 0000:04:00.2: 0:(0):0723 SCSI layer issued Target Reset (2, 0) return x2002
[ 1199.704116] lpfc 0000:04:00.3: 1:(0):0724 I/O flush failure for context LUN : cnt x1
[ 1199.704368] lpfc 0000:04:00.3: 1:(0):0723 SCSI layer issued Target Reset (4, 0) return x2002
At the beginning, I thought the lpfc driver itself is the cause of the error.But,the error is still seen when 'lpfc driver' updates to the latest version.
To find the root cause and fix it, we checked the kernel version from 4.14.41 to 4.14.88, built and tested the kernel for booting.
I fount that the commit resulted in this error after bisect is ef86f3a72adb8a7931f67335560740a7ad696d1d,when I removed the commit the issue went away.
Removed commit info:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
During the test, I also found another issue that the system of 'HP FlexServer B390' server failed to boot by "hpsa driver timeout",basically, looks like the hpsa didn't detect the hard drives and udevd is stalled.
After upgrading from v4.14.54 to 4.14.55, hp system didn't boot ,but the system is ok when using the v4.14.55 kernel that has removed the commit.
The commit of ef86f3a72adb8a7931f67335560740a7ad696d1d also resulted in the error of HP Smart Array P220i RAID device. Because the v4.14.88 kernel is ok, I think that subsequent commits may have fixed the hpsa driver issue, but lpfc driver issue is not.
If there is any more info I can provide, just ask what would be useful. Any suggestions?
Thanks
Liang