The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 0d836392cadd5535f4184d46d901a82eb276ed62 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Fri, 20 Jul 2018 10:59:06 +0100
Subject: [PATCH] Btrfs: fix mount failure after fsync due to hard link
recreation
If we end up with logging an inode reference item which has the same name
but different index from the one we have persisted, we end up failing when
replaying the log with an errno value of -EEXIST. The error comes from
btrfs_add_link(), which is called from add_inode_ref(), when we are
replaying an inode reference item.
Example scenario where this happens:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ touch /mnt/foo
$ ln /mnt/foo /mnt/bar
$ sync
# Rename the first hard link (foo) to a new name and rename the second
# hard link (bar) to the old name of the first hard link (foo).
$ mv /mnt/foo /mnt/qwerty
$ mv /mnt/bar /mnt/foo
# Create a new file, in the same parent directory, with the old name of
# the second hard link (bar) and fsync this new file.
# We do this instead of calling fsync on foo/qwerty because if we did
# that the fsync resulted in a full transaction commit, not triggering
# the problem.
$ touch /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
mount: mount /dev/sdb on /mnt failed: File exists
So fix this by checking if a conflicting inode reference exists (same
name, same parent but different index), removing it (and the associated
dir index entries from the parent inode) if it exists, before attempting
to add the new reference.
A test case for fstests follows soon.
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 10f6a4223897..033aeebbe9de 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -1290,6 +1290,46 @@ again:
return ret;
}
+static int btrfs_inode_ref_exists(struct inode *inode, struct inode *dir,
+ const u8 ref_type, const char *name,
+ const int namelen)
+{
+ struct btrfs_key key;
+ struct btrfs_path *path;
+ const u64 parent_id = btrfs_ino(BTRFS_I(dir));
+ int ret;
+
+ path = btrfs_alloc_path();
+ if (!path)
+ return -ENOMEM;
+
+ key.objectid = btrfs_ino(BTRFS_I(inode));
+ key.type = ref_type;
+ if (key.type == BTRFS_INODE_REF_KEY)
+ key.offset = parent_id;
+ else
+ key.offset = btrfs_extref_hash(parent_id, name, namelen);
+
+ ret = btrfs_search_slot(NULL, BTRFS_I(inode)->root, &key, path, 0, 0);
+ if (ret < 0)
+ goto out;
+ if (ret > 0) {
+ ret = 0;
+ goto out;
+ }
+ if (key.type == BTRFS_INODE_EXTREF_KEY)
+ ret = btrfs_find_name_in_ext_backref(path->nodes[0],
+ path->slots[0], parent_id,
+ name, namelen, NULL);
+ else
+ ret = btrfs_find_name_in_backref(path->nodes[0], path->slots[0],
+ name, namelen, NULL);
+
+out:
+ btrfs_free_path(path);
+ return ret;
+}
+
/*
* replay one inode back reference item found in the log tree.
* eb, slot and key refer to the buffer and key found in the log tree.
@@ -1399,6 +1439,32 @@ static noinline int add_inode_ref(struct btrfs_trans_handle *trans,
}
}
+ /*
+ * If a reference item already exists for this inode
+ * with the same parent and name, but different index,
+ * drop it and the corresponding directory index entries
+ * from the parent before adding the new reference item
+ * and dir index entries, otherwise we would fail with
+ * -EEXIST returned from btrfs_add_link() below.
+ */
+ ret = btrfs_inode_ref_exists(inode, dir, key->type,
+ name, namelen);
+ if (ret > 0) {
+ ret = btrfs_unlink_inode(trans, root,
+ BTRFS_I(dir),
+ BTRFS_I(inode),
+ name, namelen);
+ /*
+ * If we dropped the link count to 0, bump it so
+ * that later the iput() on the inode will not
+ * free it. We will fixup the link count later.
+ */
+ if (!ret && inode->i_nlink == 0)
+ inc_nlink(inode);
+ }
+ if (ret < 0)
+ goto out;
+
/* insert our name */
ret = btrfs_add_link(trans, BTRFS_I(dir),
BTRFS_I(inode),
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 46b2f4590aab71d31088a265c86026b1e96c9de4 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 24 Jul 2018 11:54:04 +0100
Subject: [PATCH] Btrfs: fix send failure when root has deleted files still
open
The more common use case of send involves creating a RO snapshot and then
use it for a send operation. In this case it's not possible to have inodes
in the snapshot that have a link count of zero (inode with an orphan item)
since during snapshot creation we do the orphan cleanup. However, other
less common use cases for send can end up seeing inodes with a link count
of zero and in this case the send operation fails with a ENOENT error
because any attempt to generate a path for the inode, with the purpose
of creating it or updating it at the receiver, fails since there are no
inode reference items. One use case it to use a regular subvolume for
a send operation after turning it to RO mode or turning a RW snapshot
into RO mode and then using it for a send operation. In both cases, if a
file gets all its hard links deleted while there is an open file
descriptor before turning the subvolume/snapshot into RO mode, the send
operation will encounter an inode with a link count of zero and then
fail with errno ENOENT.
Example using a full send with a subvolume:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ btrfs subvolume create /mnt/sv1
$ touch /mnt/sv1/foo
$ touch /mnt/sv1/bar
# keep an open file descriptor on file bar
$ exec 73</mnt/sv1/bar
$ unlink /mnt/sv1/bar
# Turn the subvolume to RO mode and use it for a full send, while
# holding the open file descriptor.
$ btrfs property set /mnt/sv1 ro true
$ btrfs send -f /tmp/full.send /mnt/sv1
At subvol /mnt/sv1
ERROR: send ioctl failed with -2: No such file or directory
Example using an incremental send with snapshots:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ btrfs subvolume create /mnt/sv1
$ touch /mnt/sv1/foo
$ touch /mnt/sv1/bar
$ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap1
$ echo "hello world" >> /mnt/sv1/bar
$ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap2
# Turn the second snapshot to RW mode and delete file foo while
# holding an open file descriptor on it.
$ btrfs property set /mnt/snap2 ro false
$ exec 73</mnt/snap2/foo
$ unlink /mnt/snap2/foo
# Set the second snapshot back to RO mode and do an incremental send.
$ btrfs property set /mnt/snap2 ro true
$ btrfs send -f /tmp/inc.send -p /mnt/snap1 /mnt/snap2
At subvol /mnt/snap2
ERROR: send ioctl failed with -2: No such file or directory
So fix this by ignoring inodes with a link count of zero if we are either
doing a full send or if they do not exist in the parent snapshot (they
are new in the send snapshot), and unlink all paths found in the parent
snapshot when doing an incremental send (and ignoring all other inode
items, such as xattrs and extents).
A test case for fstests follows soon.
CC: stable(a)vger.kernel.org # 4.4+
Reported-by: Martin Wilck <martin.wilck(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 42e04cd3cd95..551294a6c9e2 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -100,6 +100,7 @@ struct send_ctx {
u64 cur_inode_rdev;
u64 cur_inode_last_extent;
u64 cur_inode_next_write_offset;
+ bool ignore_cur_inode;
u64 send_progress;
@@ -5796,6 +5797,9 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
int pending_move = 0;
int refs_processed = 0;
+ if (sctx->ignore_cur_inode)
+ return 0;
+
ret = process_recorded_refs_if_needed(sctx, at_end, &pending_move,
&refs_processed);
if (ret < 0)
@@ -5914,6 +5918,93 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
return ret;
}
+struct parent_paths_ctx {
+ struct list_head *refs;
+ struct send_ctx *sctx;
+};
+
+static int record_parent_ref(int num, u64 dir, int index, struct fs_path *name,
+ void *ctx)
+{
+ struct parent_paths_ctx *ppctx = ctx;
+
+ return record_ref(ppctx->sctx->parent_root, dir, name, ppctx->sctx,
+ ppctx->refs);
+}
+
+/*
+ * Issue unlink operations for all paths of the current inode found in the
+ * parent snapshot.
+ */
+static int btrfs_unlink_all_paths(struct send_ctx *sctx)
+{
+ LIST_HEAD(deleted_refs);
+ struct btrfs_path *path;
+ struct btrfs_key key;
+ struct parent_paths_ctx ctx;
+ int ret;
+
+ path = alloc_path_for_send();
+ if (!path)
+ return -ENOMEM;
+
+ key.objectid = sctx->cur_ino;
+ key.type = BTRFS_INODE_REF_KEY;
+ key.offset = 0;
+ ret = btrfs_search_slot(NULL, sctx->parent_root, &key, path, 0, 0);
+ if (ret < 0)
+ goto out;
+
+ ctx.refs = &deleted_refs;
+ ctx.sctx = sctx;
+
+ while (true) {
+ struct extent_buffer *eb = path->nodes[0];
+ int slot = path->slots[0];
+
+ if (slot >= btrfs_header_nritems(eb)) {
+ ret = btrfs_next_leaf(sctx->parent_root, path);
+ if (ret < 0)
+ goto out;
+ else if (ret > 0)
+ break;
+ continue;
+ }
+
+ btrfs_item_key_to_cpu(eb, &key, slot);
+ if (key.objectid != sctx->cur_ino)
+ break;
+ if (key.type != BTRFS_INODE_REF_KEY &&
+ key.type != BTRFS_INODE_EXTREF_KEY)
+ break;
+
+ ret = iterate_inode_ref(sctx->parent_root, path, &key, 1,
+ record_parent_ref, &ctx);
+ if (ret < 0)
+ goto out;
+
+ path->slots[0]++;
+ }
+
+ while (!list_empty(&deleted_refs)) {
+ struct recorded_ref *ref;
+
+ ref = list_first_entry(&deleted_refs, struct recorded_ref, list);
+ ret = send_unlink(sctx, ref->full_path);
+ if (ret < 0)
+ goto out;
+ fs_path_free(ref->full_path);
+ list_del(&ref->list);
+ kfree(ref);
+ }
+ ret = 0;
+out:
+ btrfs_free_path(path);
+ if (ret)
+ __free_recorded_refs(&deleted_refs);
+ return ret;
+}
+
static int changed_inode(struct send_ctx *sctx,
enum btrfs_compare_tree_result result)
{
@@ -5928,6 +6019,7 @@ static int changed_inode(struct send_ctx *sctx,
sctx->cur_inode_new_gen = 0;
sctx->cur_inode_last_extent = (u64)-1;
sctx->cur_inode_next_write_offset = 0;
+ sctx->ignore_cur_inode = false;
/*
* Set send_progress to current inode. This will tell all get_cur_xxx
@@ -5968,6 +6060,33 @@ static int changed_inode(struct send_ctx *sctx,
sctx->cur_inode_new_gen = 1;
}
+ /*
+ * Normally we do not find inodes with a link count of zero (orphans)
+ * because the most common case is to create a snapshot and use it
+ * for a send operation. However other less common use cases involve
+ * using a subvolume and send it after turning it to RO mode just
+ * after deleting all hard links of a file while holding an open
+ * file descriptor against it or turning a RO snapshot into RW mode,
+ * keep an open file descriptor against a file, delete it and then
+ * turn the snapshot back to RO mode before using it for a send
+ * operation. So if we find such cases, ignore the inode and all its
+ * items completely if it's a new inode, or if it's a changed inode
+ * make sure all its previous paths (from the parent snapshot) are all
+ * unlinked and all other the inode items are ignored.
+ */
+ if (result == BTRFS_COMPARE_TREE_NEW ||
+ result == BTRFS_COMPARE_TREE_CHANGED) {
+ u32 nlinks;
+
+ nlinks = btrfs_inode_nlink(sctx->left_path->nodes[0], left_ii);
+ if (nlinks == 0) {
+ sctx->ignore_cur_inode = true;
+ if (result == BTRFS_COMPARE_TREE_CHANGED)
+ ret = btrfs_unlink_all_paths(sctx);
+ goto out;
+ }
+ }
+
if (result == BTRFS_COMPARE_TREE_NEW) {
sctx->cur_inode_gen = left_gen;
sctx->cur_inode_new = 1;
@@ -6306,15 +6425,17 @@ static int changed_cb(struct btrfs_path *left_path,
key->objectid == BTRFS_FREE_SPACE_OBJECTID)
goto out;
- if (key->type == BTRFS_INODE_ITEM_KEY)
+ if (key->type == BTRFS_INODE_ITEM_KEY) {
ret = changed_inode(sctx, result);
- else if (key->type == BTRFS_INODE_REF_KEY ||
- key->type == BTRFS_INODE_EXTREF_KEY)
- ret = changed_ref(sctx, result);
- else if (key->type == BTRFS_XATTR_ITEM_KEY)
- ret = changed_xattr(sctx, result);
- else if (key->type == BTRFS_EXTENT_DATA_KEY)
- ret = changed_extent(sctx, result);
+ } else if (!sctx->ignore_cur_inode) {
+ if (key->type == BTRFS_INODE_REF_KEY ||
+ key->type == BTRFS_INODE_EXTREF_KEY)
+ ret = changed_ref(sctx, result);
+ else if (key->type == BTRFS_XATTR_ITEM_KEY)
+ ret = changed_xattr(sctx, result);
+ else if (key->type == BTRFS_EXTENT_DATA_KEY)
+ ret = changed_extent(sctx, result);
+ }
out:
return ret;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 46b2f4590aab71d31088a265c86026b1e96c9de4 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 24 Jul 2018 11:54:04 +0100
Subject: [PATCH] Btrfs: fix send failure when root has deleted files still
open
The more common use case of send involves creating a RO snapshot and then
use it for a send operation. In this case it's not possible to have inodes
in the snapshot that have a link count of zero (inode with an orphan item)
since during snapshot creation we do the orphan cleanup. However, other
less common use cases for send can end up seeing inodes with a link count
of zero and in this case the send operation fails with a ENOENT error
because any attempt to generate a path for the inode, with the purpose
of creating it or updating it at the receiver, fails since there are no
inode reference items. One use case it to use a regular subvolume for
a send operation after turning it to RO mode or turning a RW snapshot
into RO mode and then using it for a send operation. In both cases, if a
file gets all its hard links deleted while there is an open file
descriptor before turning the subvolume/snapshot into RO mode, the send
operation will encounter an inode with a link count of zero and then
fail with errno ENOENT.
Example using a full send with a subvolume:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ btrfs subvolume create /mnt/sv1
$ touch /mnt/sv1/foo
$ touch /mnt/sv1/bar
# keep an open file descriptor on file bar
$ exec 73</mnt/sv1/bar
$ unlink /mnt/sv1/bar
# Turn the subvolume to RO mode and use it for a full send, while
# holding the open file descriptor.
$ btrfs property set /mnt/sv1 ro true
$ btrfs send -f /tmp/full.send /mnt/sv1
At subvol /mnt/sv1
ERROR: send ioctl failed with -2: No such file or directory
Example using an incremental send with snapshots:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ btrfs subvolume create /mnt/sv1
$ touch /mnt/sv1/foo
$ touch /mnt/sv1/bar
$ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap1
$ echo "hello world" >> /mnt/sv1/bar
$ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap2
# Turn the second snapshot to RW mode and delete file foo while
# holding an open file descriptor on it.
$ btrfs property set /mnt/snap2 ro false
$ exec 73</mnt/snap2/foo
$ unlink /mnt/snap2/foo
# Set the second snapshot back to RO mode and do an incremental send.
$ btrfs property set /mnt/snap2 ro true
$ btrfs send -f /tmp/inc.send -p /mnt/snap1 /mnt/snap2
At subvol /mnt/snap2
ERROR: send ioctl failed with -2: No such file or directory
So fix this by ignoring inodes with a link count of zero if we are either
doing a full send or if they do not exist in the parent snapshot (they
are new in the send snapshot), and unlink all paths found in the parent
snapshot when doing an incremental send (and ignoring all other inode
items, such as xattrs and extents).
A test case for fstests follows soon.
CC: stable(a)vger.kernel.org # 4.4+
Reported-by: Martin Wilck <martin.wilck(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 42e04cd3cd95..551294a6c9e2 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -100,6 +100,7 @@ struct send_ctx {
u64 cur_inode_rdev;
u64 cur_inode_last_extent;
u64 cur_inode_next_write_offset;
+ bool ignore_cur_inode;
u64 send_progress;
@@ -5796,6 +5797,9 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
int pending_move = 0;
int refs_processed = 0;
+ if (sctx->ignore_cur_inode)
+ return 0;
+
ret = process_recorded_refs_if_needed(sctx, at_end, &pending_move,
&refs_processed);
if (ret < 0)
@@ -5914,6 +5918,93 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
return ret;
}
+struct parent_paths_ctx {
+ struct list_head *refs;
+ struct send_ctx *sctx;
+};
+
+static int record_parent_ref(int num, u64 dir, int index, struct fs_path *name,
+ void *ctx)
+{
+ struct parent_paths_ctx *ppctx = ctx;
+
+ return record_ref(ppctx->sctx->parent_root, dir, name, ppctx->sctx,
+ ppctx->refs);
+}
+
+/*
+ * Issue unlink operations for all paths of the current inode found in the
+ * parent snapshot.
+ */
+static int btrfs_unlink_all_paths(struct send_ctx *sctx)
+{
+ LIST_HEAD(deleted_refs);
+ struct btrfs_path *path;
+ struct btrfs_key key;
+ struct parent_paths_ctx ctx;
+ int ret;
+
+ path = alloc_path_for_send();
+ if (!path)
+ return -ENOMEM;
+
+ key.objectid = sctx->cur_ino;
+ key.type = BTRFS_INODE_REF_KEY;
+ key.offset = 0;
+ ret = btrfs_search_slot(NULL, sctx->parent_root, &key, path, 0, 0);
+ if (ret < 0)
+ goto out;
+
+ ctx.refs = &deleted_refs;
+ ctx.sctx = sctx;
+
+ while (true) {
+ struct extent_buffer *eb = path->nodes[0];
+ int slot = path->slots[0];
+
+ if (slot >= btrfs_header_nritems(eb)) {
+ ret = btrfs_next_leaf(sctx->parent_root, path);
+ if (ret < 0)
+ goto out;
+ else if (ret > 0)
+ break;
+ continue;
+ }
+
+ btrfs_item_key_to_cpu(eb, &key, slot);
+ if (key.objectid != sctx->cur_ino)
+ break;
+ if (key.type != BTRFS_INODE_REF_KEY &&
+ key.type != BTRFS_INODE_EXTREF_KEY)
+ break;
+
+ ret = iterate_inode_ref(sctx->parent_root, path, &key, 1,
+ record_parent_ref, &ctx);
+ if (ret < 0)
+ goto out;
+
+ path->slots[0]++;
+ }
+
+ while (!list_empty(&deleted_refs)) {
+ struct recorded_ref *ref;
+
+ ref = list_first_entry(&deleted_refs, struct recorded_ref, list);
+ ret = send_unlink(sctx, ref->full_path);
+ if (ret < 0)
+ goto out;
+ fs_path_free(ref->full_path);
+ list_del(&ref->list);
+ kfree(ref);
+ }
+ ret = 0;
+out:
+ btrfs_free_path(path);
+ if (ret)
+ __free_recorded_refs(&deleted_refs);
+ return ret;
+}
+
static int changed_inode(struct send_ctx *sctx,
enum btrfs_compare_tree_result result)
{
@@ -5928,6 +6019,7 @@ static int changed_inode(struct send_ctx *sctx,
sctx->cur_inode_new_gen = 0;
sctx->cur_inode_last_extent = (u64)-1;
sctx->cur_inode_next_write_offset = 0;
+ sctx->ignore_cur_inode = false;
/*
* Set send_progress to current inode. This will tell all get_cur_xxx
@@ -5968,6 +6060,33 @@ static int changed_inode(struct send_ctx *sctx,
sctx->cur_inode_new_gen = 1;
}
+ /*
+ * Normally we do not find inodes with a link count of zero (orphans)
+ * because the most common case is to create a snapshot and use it
+ * for a send operation. However other less common use cases involve
+ * using a subvolume and send it after turning it to RO mode just
+ * after deleting all hard links of a file while holding an open
+ * file descriptor against it or turning a RO snapshot into RW mode,
+ * keep an open file descriptor against a file, delete it and then
+ * turn the snapshot back to RO mode before using it for a send
+ * operation. So if we find such cases, ignore the inode and all its
+ * items completely if it's a new inode, or if it's a changed inode
+ * make sure all its previous paths (from the parent snapshot) are all
+ * unlinked and all other the inode items are ignored.
+ */
+ if (result == BTRFS_COMPARE_TREE_NEW ||
+ result == BTRFS_COMPARE_TREE_CHANGED) {
+ u32 nlinks;
+
+ nlinks = btrfs_inode_nlink(sctx->left_path->nodes[0], left_ii);
+ if (nlinks == 0) {
+ sctx->ignore_cur_inode = true;
+ if (result == BTRFS_COMPARE_TREE_CHANGED)
+ ret = btrfs_unlink_all_paths(sctx);
+ goto out;
+ }
+ }
+
if (result == BTRFS_COMPARE_TREE_NEW) {
sctx->cur_inode_gen = left_gen;
sctx->cur_inode_new = 1;
@@ -6306,15 +6425,17 @@ static int changed_cb(struct btrfs_path *left_path,
key->objectid == BTRFS_FREE_SPACE_OBJECTID)
goto out;
- if (key->type == BTRFS_INODE_ITEM_KEY)
+ if (key->type == BTRFS_INODE_ITEM_KEY) {
ret = changed_inode(sctx, result);
- else if (key->type == BTRFS_INODE_REF_KEY ||
- key->type == BTRFS_INODE_EXTREF_KEY)
- ret = changed_ref(sctx, result);
- else if (key->type == BTRFS_XATTR_ITEM_KEY)
- ret = changed_xattr(sctx, result);
- else if (key->type == BTRFS_EXTENT_DATA_KEY)
- ret = changed_extent(sctx, result);
+ } else if (!sctx->ignore_cur_inode) {
+ if (key->type == BTRFS_INODE_REF_KEY ||
+ key->type == BTRFS_INODE_EXTREF_KEY)
+ ret = changed_ref(sctx, result);
+ else if (key->type == BTRFS_XATTR_ITEM_KEY)
+ ret = changed_xattr(sctx, result);
+ else if (key->type == BTRFS_EXTENT_DATA_KEY)
+ ret = changed_extent(sctx, result);
+ }
out:
return ret;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 46b2f4590aab71d31088a265c86026b1e96c9de4 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 24 Jul 2018 11:54:04 +0100
Subject: [PATCH] Btrfs: fix send failure when root has deleted files still
open
The more common use case of send involves creating a RO snapshot and then
use it for a send operation. In this case it's not possible to have inodes
in the snapshot that have a link count of zero (inode with an orphan item)
since during snapshot creation we do the orphan cleanup. However, other
less common use cases for send can end up seeing inodes with a link count
of zero and in this case the send operation fails with a ENOENT error
because any attempt to generate a path for the inode, with the purpose
of creating it or updating it at the receiver, fails since there are no
inode reference items. One use case it to use a regular subvolume for
a send operation after turning it to RO mode or turning a RW snapshot
into RO mode and then using it for a send operation. In both cases, if a
file gets all its hard links deleted while there is an open file
descriptor before turning the subvolume/snapshot into RO mode, the send
operation will encounter an inode with a link count of zero and then
fail with errno ENOENT.
Example using a full send with a subvolume:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ btrfs subvolume create /mnt/sv1
$ touch /mnt/sv1/foo
$ touch /mnt/sv1/bar
# keep an open file descriptor on file bar
$ exec 73</mnt/sv1/bar
$ unlink /mnt/sv1/bar
# Turn the subvolume to RO mode and use it for a full send, while
# holding the open file descriptor.
$ btrfs property set /mnt/sv1 ro true
$ btrfs send -f /tmp/full.send /mnt/sv1
At subvol /mnt/sv1
ERROR: send ioctl failed with -2: No such file or directory
Example using an incremental send with snapshots:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ btrfs subvolume create /mnt/sv1
$ touch /mnt/sv1/foo
$ touch /mnt/sv1/bar
$ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap1
$ echo "hello world" >> /mnt/sv1/bar
$ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap2
# Turn the second snapshot to RW mode and delete file foo while
# holding an open file descriptor on it.
$ btrfs property set /mnt/snap2 ro false
$ exec 73</mnt/snap2/foo
$ unlink /mnt/snap2/foo
# Set the second snapshot back to RO mode and do an incremental send.
$ btrfs property set /mnt/snap2 ro true
$ btrfs send -f /tmp/inc.send -p /mnt/snap1 /mnt/snap2
At subvol /mnt/snap2
ERROR: send ioctl failed with -2: No such file or directory
So fix this by ignoring inodes with a link count of zero if we are either
doing a full send or if they do not exist in the parent snapshot (they
are new in the send snapshot), and unlink all paths found in the parent
snapshot when doing an incremental send (and ignoring all other inode
items, such as xattrs and extents).
A test case for fstests follows soon.
CC: stable(a)vger.kernel.org # 4.4+
Reported-by: Martin Wilck <martin.wilck(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 42e04cd3cd95..551294a6c9e2 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -100,6 +100,7 @@ struct send_ctx {
u64 cur_inode_rdev;
u64 cur_inode_last_extent;
u64 cur_inode_next_write_offset;
+ bool ignore_cur_inode;
u64 send_progress;
@@ -5796,6 +5797,9 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
int pending_move = 0;
int refs_processed = 0;
+ if (sctx->ignore_cur_inode)
+ return 0;
+
ret = process_recorded_refs_if_needed(sctx, at_end, &pending_move,
&refs_processed);
if (ret < 0)
@@ -5914,6 +5918,93 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
return ret;
}
+struct parent_paths_ctx {
+ struct list_head *refs;
+ struct send_ctx *sctx;
+};
+
+static int record_parent_ref(int num, u64 dir, int index, struct fs_path *name,
+ void *ctx)
+{
+ struct parent_paths_ctx *ppctx = ctx;
+
+ return record_ref(ppctx->sctx->parent_root, dir, name, ppctx->sctx,
+ ppctx->refs);
+}
+
+/*
+ * Issue unlink operations for all paths of the current inode found in the
+ * parent snapshot.
+ */
+static int btrfs_unlink_all_paths(struct send_ctx *sctx)
+{
+ LIST_HEAD(deleted_refs);
+ struct btrfs_path *path;
+ struct btrfs_key key;
+ struct parent_paths_ctx ctx;
+ int ret;
+
+ path = alloc_path_for_send();
+ if (!path)
+ return -ENOMEM;
+
+ key.objectid = sctx->cur_ino;
+ key.type = BTRFS_INODE_REF_KEY;
+ key.offset = 0;
+ ret = btrfs_search_slot(NULL, sctx->parent_root, &key, path, 0, 0);
+ if (ret < 0)
+ goto out;
+
+ ctx.refs = &deleted_refs;
+ ctx.sctx = sctx;
+
+ while (true) {
+ struct extent_buffer *eb = path->nodes[0];
+ int slot = path->slots[0];
+
+ if (slot >= btrfs_header_nritems(eb)) {
+ ret = btrfs_next_leaf(sctx->parent_root, path);
+ if (ret < 0)
+ goto out;
+ else if (ret > 0)
+ break;
+ continue;
+ }
+
+ btrfs_item_key_to_cpu(eb, &key, slot);
+ if (key.objectid != sctx->cur_ino)
+ break;
+ if (key.type != BTRFS_INODE_REF_KEY &&
+ key.type != BTRFS_INODE_EXTREF_KEY)
+ break;
+
+ ret = iterate_inode_ref(sctx->parent_root, path, &key, 1,
+ record_parent_ref, &ctx);
+ if (ret < 0)
+ goto out;
+
+ path->slots[0]++;
+ }
+
+ while (!list_empty(&deleted_refs)) {
+ struct recorded_ref *ref;
+
+ ref = list_first_entry(&deleted_refs, struct recorded_ref, list);
+ ret = send_unlink(sctx, ref->full_path);
+ if (ret < 0)
+ goto out;
+ fs_path_free(ref->full_path);
+ list_del(&ref->list);
+ kfree(ref);
+ }
+ ret = 0;
+out:
+ btrfs_free_path(path);
+ if (ret)
+ __free_recorded_refs(&deleted_refs);
+ return ret;
+}
+
static int changed_inode(struct send_ctx *sctx,
enum btrfs_compare_tree_result result)
{
@@ -5928,6 +6019,7 @@ static int changed_inode(struct send_ctx *sctx,
sctx->cur_inode_new_gen = 0;
sctx->cur_inode_last_extent = (u64)-1;
sctx->cur_inode_next_write_offset = 0;
+ sctx->ignore_cur_inode = false;
/*
* Set send_progress to current inode. This will tell all get_cur_xxx
@@ -5968,6 +6060,33 @@ static int changed_inode(struct send_ctx *sctx,
sctx->cur_inode_new_gen = 1;
}
+ /*
+ * Normally we do not find inodes with a link count of zero (orphans)
+ * because the most common case is to create a snapshot and use it
+ * for a send operation. However other less common use cases involve
+ * using a subvolume and send it after turning it to RO mode just
+ * after deleting all hard links of a file while holding an open
+ * file descriptor against it or turning a RO snapshot into RW mode,
+ * keep an open file descriptor against a file, delete it and then
+ * turn the snapshot back to RO mode before using it for a send
+ * operation. So if we find such cases, ignore the inode and all its
+ * items completely if it's a new inode, or if it's a changed inode
+ * make sure all its previous paths (from the parent snapshot) are all
+ * unlinked and all other the inode items are ignored.
+ */
+ if (result == BTRFS_COMPARE_TREE_NEW ||
+ result == BTRFS_COMPARE_TREE_CHANGED) {
+ u32 nlinks;
+
+ nlinks = btrfs_inode_nlink(sctx->left_path->nodes[0], left_ii);
+ if (nlinks == 0) {
+ sctx->ignore_cur_inode = true;
+ if (result == BTRFS_COMPARE_TREE_CHANGED)
+ ret = btrfs_unlink_all_paths(sctx);
+ goto out;
+ }
+ }
+
if (result == BTRFS_COMPARE_TREE_NEW) {
sctx->cur_inode_gen = left_gen;
sctx->cur_inode_new = 1;
@@ -6306,15 +6425,17 @@ static int changed_cb(struct btrfs_path *left_path,
key->objectid == BTRFS_FREE_SPACE_OBJECTID)
goto out;
- if (key->type == BTRFS_INODE_ITEM_KEY)
+ if (key->type == BTRFS_INODE_ITEM_KEY) {
ret = changed_inode(sctx, result);
- else if (key->type == BTRFS_INODE_REF_KEY ||
- key->type == BTRFS_INODE_EXTREF_KEY)
- ret = changed_ref(sctx, result);
- else if (key->type == BTRFS_XATTR_ITEM_KEY)
- ret = changed_xattr(sctx, result);
- else if (key->type == BTRFS_EXTENT_DATA_KEY)
- ret = changed_extent(sctx, result);
+ } else if (!sctx->ignore_cur_inode) {
+ if (key->type == BTRFS_INODE_REF_KEY ||
+ key->type == BTRFS_INODE_EXTREF_KEY)
+ ret = changed_ref(sctx, result);
+ else if (key->type == BTRFS_XATTR_ITEM_KEY)
+ ret = changed_xattr(sctx, result);
+ else if (key->type == BTRFS_EXTENT_DATA_KEY)
+ ret = changed_extent(sctx, result);
+ }
out:
return ret;
The patch below does not apply to the 3.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3c4276936f6fbe52884b4ea4e6cc120b890a0f9f Mon Sep 17 00:00:00 2001
From: Josef Bacik <jbacik(a)fb.com>
Date: Fri, 20 Jul 2018 11:46:10 -0700
Subject: [PATCH] Btrfs: fix btrfs_write_inode vs delayed iput deadlock
We recently ran into the following deadlock involving
btrfs_write_inode():
[ +0.005066] __schedule+0x38e/0x8c0
[ +0.007144] schedule+0x36/0x80
[ +0.006447] bit_wait+0x11/0x60
[ +0.006446] __wait_on_bit+0xbe/0x110
[ +0.007487] ? bit_wait_io+0x60/0x60
[ +0.007319] __inode_wait_for_writeback+0x96/0xc0
[ +0.009568] ? autoremove_wake_function+0x40/0x40
[ +0.009565] inode_wait_for_writeback+0x21/0x30
[ +0.009224] evict+0xb0/0x190
[ +0.006099] iput+0x1a8/0x210
[ +0.006103] btrfs_run_delayed_iputs+0x73/0xc0
[ +0.009047] btrfs_commit_transaction+0x799/0x8c0
[ +0.009567] btrfs_write_inode+0x81/0xb0
[ +0.008008] __writeback_single_inode+0x267/0x320
[ +0.009569] writeback_sb_inodes+0x25b/0x4e0
[ +0.008702] wb_writeback+0x102/0x2d0
[ +0.007487] wb_workfn+0xa4/0x310
[ +0.006794] ? wb_workfn+0xa4/0x310
[ +0.007143] process_one_work+0x150/0x410
[ +0.008179] worker_thread+0x6d/0x520
[ +0.007490] kthread+0x12c/0x160
[ +0.006620] ? put_pwq_unlocked+0x80/0x80
[ +0.008185] ? kthread_park+0xa0/0xa0
[ +0.007484] ? do_syscall_64+0x53/0x150
[ +0.007837] ret_from_fork+0x29/0x40
Writeback calls:
btrfs_write_inode
btrfs_commit_transaction
btrfs_run_delayed_iputs
If iput() is called on that same inode, evict() will wait for writeback
forever.
btrfs_write_inode() was originally added way back in 4730a4bc5bf3
("btrfs_dirty_inode") to support O_SYNC writes. However, ->write_inode()
hasn't been used for O_SYNC since 148f948ba877 ("vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode"), so
btrfs_write_inode() is actually unnecessary (and leads to a bunch of
unnecessary commits). Get rid of it, which also gets rid of the
deadlock.
CC: stable(a)vger.kernel.org # 3.2+
Signed-off-by: Josef Bacik <jbacik(a)fb.com>
[Omar: new commit message]
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4955e04da4c8..472457795486 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6021,32 +6021,6 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
return ret;
}
-int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc)
-{
- struct btrfs_root *root = BTRFS_I(inode)->root;
- struct btrfs_trans_handle *trans;
- int ret = 0;
- bool nolock = false;
-
- if (test_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags))
- return 0;
-
- if (btrfs_fs_closing(root->fs_info) &&
- btrfs_is_free_space_inode(BTRFS_I(inode)))
- nolock = true;
-
- if (wbc->sync_mode == WB_SYNC_ALL) {
- if (nolock)
- trans = btrfs_join_transaction_nolock(root);
- else
- trans = btrfs_join_transaction(root);
- if (IS_ERR(trans))
- return PTR_ERR(trans);
- ret = btrfs_commit_transaction(trans);
- }
- return ret;
-}
-
/*
* This is somewhat expensive, updating the tree every time the
* inode changes. But, it is most likely to find the inode in cache.
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index efe8b03ce380..67de3c0fc85b 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2344,7 +2344,6 @@ static const struct super_operations btrfs_super_ops = {
.sync_fs = btrfs_sync_fs,
.show_options = btrfs_show_options,
.show_devname = btrfs_show_devname,
- .write_inode = btrfs_write_inode,
.alloc_inode = btrfs_alloc_inode,
.destroy_inode = btrfs_destroy_inode,
.statfs = btrfs_statfs,
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3c4276936f6fbe52884b4ea4e6cc120b890a0f9f Mon Sep 17 00:00:00 2001
From: Josef Bacik <jbacik(a)fb.com>
Date: Fri, 20 Jul 2018 11:46:10 -0700
Subject: [PATCH] Btrfs: fix btrfs_write_inode vs delayed iput deadlock
We recently ran into the following deadlock involving
btrfs_write_inode():
[ +0.005066] __schedule+0x38e/0x8c0
[ +0.007144] schedule+0x36/0x80
[ +0.006447] bit_wait+0x11/0x60
[ +0.006446] __wait_on_bit+0xbe/0x110
[ +0.007487] ? bit_wait_io+0x60/0x60
[ +0.007319] __inode_wait_for_writeback+0x96/0xc0
[ +0.009568] ? autoremove_wake_function+0x40/0x40
[ +0.009565] inode_wait_for_writeback+0x21/0x30
[ +0.009224] evict+0xb0/0x190
[ +0.006099] iput+0x1a8/0x210
[ +0.006103] btrfs_run_delayed_iputs+0x73/0xc0
[ +0.009047] btrfs_commit_transaction+0x799/0x8c0
[ +0.009567] btrfs_write_inode+0x81/0xb0
[ +0.008008] __writeback_single_inode+0x267/0x320
[ +0.009569] writeback_sb_inodes+0x25b/0x4e0
[ +0.008702] wb_writeback+0x102/0x2d0
[ +0.007487] wb_workfn+0xa4/0x310
[ +0.006794] ? wb_workfn+0xa4/0x310
[ +0.007143] process_one_work+0x150/0x410
[ +0.008179] worker_thread+0x6d/0x520
[ +0.007490] kthread+0x12c/0x160
[ +0.006620] ? put_pwq_unlocked+0x80/0x80
[ +0.008185] ? kthread_park+0xa0/0xa0
[ +0.007484] ? do_syscall_64+0x53/0x150
[ +0.007837] ret_from_fork+0x29/0x40
Writeback calls:
btrfs_write_inode
btrfs_commit_transaction
btrfs_run_delayed_iputs
If iput() is called on that same inode, evict() will wait for writeback
forever.
btrfs_write_inode() was originally added way back in 4730a4bc5bf3
("btrfs_dirty_inode") to support O_SYNC writes. However, ->write_inode()
hasn't been used for O_SYNC since 148f948ba877 ("vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode"), so
btrfs_write_inode() is actually unnecessary (and leads to a bunch of
unnecessary commits). Get rid of it, which also gets rid of the
deadlock.
CC: stable(a)vger.kernel.org # 3.2+
Signed-off-by: Josef Bacik <jbacik(a)fb.com>
[Omar: new commit message]
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4955e04da4c8..472457795486 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6021,32 +6021,6 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
return ret;
}
-int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc)
-{
- struct btrfs_root *root = BTRFS_I(inode)->root;
- struct btrfs_trans_handle *trans;
- int ret = 0;
- bool nolock = false;
-
- if (test_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags))
- return 0;
-
- if (btrfs_fs_closing(root->fs_info) &&
- btrfs_is_free_space_inode(BTRFS_I(inode)))
- nolock = true;
-
- if (wbc->sync_mode == WB_SYNC_ALL) {
- if (nolock)
- trans = btrfs_join_transaction_nolock(root);
- else
- trans = btrfs_join_transaction(root);
- if (IS_ERR(trans))
- return PTR_ERR(trans);
- ret = btrfs_commit_transaction(trans);
- }
- return ret;
-}
-
/*
* This is somewhat expensive, updating the tree every time the
* inode changes. But, it is most likely to find the inode in cache.
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index efe8b03ce380..67de3c0fc85b 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2344,7 +2344,6 @@ static const struct super_operations btrfs_super_ops = {
.sync_fs = btrfs_sync_fs,
.show_options = btrfs_show_options,
.show_devname = btrfs_show_devname,
- .write_inode = btrfs_write_inode,
.alloc_inode = btrfs_alloc_inode,
.destroy_inode = btrfs_destroy_inode,
.statfs = btrfs_statfs,
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3c4276936f6fbe52884b4ea4e6cc120b890a0f9f Mon Sep 17 00:00:00 2001
From: Josef Bacik <jbacik(a)fb.com>
Date: Fri, 20 Jul 2018 11:46:10 -0700
Subject: [PATCH] Btrfs: fix btrfs_write_inode vs delayed iput deadlock
We recently ran into the following deadlock involving
btrfs_write_inode():
[ +0.005066] __schedule+0x38e/0x8c0
[ +0.007144] schedule+0x36/0x80
[ +0.006447] bit_wait+0x11/0x60
[ +0.006446] __wait_on_bit+0xbe/0x110
[ +0.007487] ? bit_wait_io+0x60/0x60
[ +0.007319] __inode_wait_for_writeback+0x96/0xc0
[ +0.009568] ? autoremove_wake_function+0x40/0x40
[ +0.009565] inode_wait_for_writeback+0x21/0x30
[ +0.009224] evict+0xb0/0x190
[ +0.006099] iput+0x1a8/0x210
[ +0.006103] btrfs_run_delayed_iputs+0x73/0xc0
[ +0.009047] btrfs_commit_transaction+0x799/0x8c0
[ +0.009567] btrfs_write_inode+0x81/0xb0
[ +0.008008] __writeback_single_inode+0x267/0x320
[ +0.009569] writeback_sb_inodes+0x25b/0x4e0
[ +0.008702] wb_writeback+0x102/0x2d0
[ +0.007487] wb_workfn+0xa4/0x310
[ +0.006794] ? wb_workfn+0xa4/0x310
[ +0.007143] process_one_work+0x150/0x410
[ +0.008179] worker_thread+0x6d/0x520
[ +0.007490] kthread+0x12c/0x160
[ +0.006620] ? put_pwq_unlocked+0x80/0x80
[ +0.008185] ? kthread_park+0xa0/0xa0
[ +0.007484] ? do_syscall_64+0x53/0x150
[ +0.007837] ret_from_fork+0x29/0x40
Writeback calls:
btrfs_write_inode
btrfs_commit_transaction
btrfs_run_delayed_iputs
If iput() is called on that same inode, evict() will wait for writeback
forever.
btrfs_write_inode() was originally added way back in 4730a4bc5bf3
("btrfs_dirty_inode") to support O_SYNC writes. However, ->write_inode()
hasn't been used for O_SYNC since 148f948ba877 ("vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode"), so
btrfs_write_inode() is actually unnecessary (and leads to a bunch of
unnecessary commits). Get rid of it, which also gets rid of the
deadlock.
CC: stable(a)vger.kernel.org # 3.2+
Signed-off-by: Josef Bacik <jbacik(a)fb.com>
[Omar: new commit message]
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4955e04da4c8..472457795486 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6021,32 +6021,6 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
return ret;
}
-int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc)
-{
- struct btrfs_root *root = BTRFS_I(inode)->root;
- struct btrfs_trans_handle *trans;
- int ret = 0;
- bool nolock = false;
-
- if (test_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags))
- return 0;
-
- if (btrfs_fs_closing(root->fs_info) &&
- btrfs_is_free_space_inode(BTRFS_I(inode)))
- nolock = true;
-
- if (wbc->sync_mode == WB_SYNC_ALL) {
- if (nolock)
- trans = btrfs_join_transaction_nolock(root);
- else
- trans = btrfs_join_transaction(root);
- if (IS_ERR(trans))
- return PTR_ERR(trans);
- ret = btrfs_commit_transaction(trans);
- }
- return ret;
-}
-
/*
* This is somewhat expensive, updating the tree every time the
* inode changes. But, it is most likely to find the inode in cache.
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index efe8b03ce380..67de3c0fc85b 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2344,7 +2344,6 @@ static const struct super_operations btrfs_super_ops = {
.sync_fs = btrfs_sync_fs,
.show_options = btrfs_show_options,
.show_devname = btrfs_show_devname,
- .write_inode = btrfs_write_inode,
.alloc_inode = btrfs_alloc_inode,
.destroy_inode = btrfs_destroy_inode,
.statfs = btrfs_statfs,
Hi !
this is also the wrong version of the patch - the proper version is
below. This has been posted to lkml https://lkml.org/lkml/2018/7/18/191
"[PATCH V3] drm: handle error values properly" but there was no review yet
The version you have here though is for sure broken. So maybe this should
be simply dropped until the above, presumably correct fix, is confirmed.
thx!
hofrat
----- Forwarded message from gregkh(a)linuxfoundation.org -----
Date: Tue, 28 Aug 2018 16:09:59 +0200
From: gregkh(a)linuxfoundation.org
To: 1531571532-22733-1-git-send-email-hofrat(a)osadl.org, alexander.levin(a)microsoft.com, gregkh(a)linuxfoundation.org, hofrat(a)osadl.org,
seanpaul(a)chromium.org
Cc: stable-commits(a)vger.kernel.org
Subject: Patch "drm: re-enable error handling" has been added to the 3.18-stable tree
This is a note to let you know that I've just added the patch titled
drm: re-enable error handling
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-re-enable-error-handling.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Aug 28 16:08:28 CEST 2018
From: Nicholas Mc Guire <hofrat(a)osadl.org>
Date: Sat, 14 Jul 2018 14:32:12 +0200
Subject: drm: re-enable error handling
From: Nicholas Mc Guire <hofrat(a)osadl.org>
[ Upstream commit d530b5f1ca0bb66958a2b714bebe40a1248b9c15 ]
drm_legacy_ctxbitmap_next() returns idr_alloc() which can return
-ENOMEM, -EINVAL or -ENOSPC none of which are -1 . but the call sites
of drm_legacy_ctxbitmap_next() seem to be assuming that the error case
would be -1 (original return of drm_ctxbitmap_next() prior to 2.6.23
was actually -1). Thus reenable error handling by checking for < 0.
Signed-off-by: Nicholas Mc Guire <hofrat(a)osadl.org>
Fixes: 62968144e673 ("drm: convert drm context code to use Linux idr")
Signed-off-by: Sean Paul <seanpaul(a)chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1531571532-22733-1-git-send-e…
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/drm_context.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/drm_context.c
+++ b/drivers/gpu/drm/drm_context.c
@@ -341,7 +341,7 @@ int drm_legacy_addctx(struct drm_device
ctx->handle = drm_legacy_ctxbitmap_next(dev);
}
DRM_DEBUG("%d\n", ctx->handle);
- if (ctx->handle == -1) {
+ if (ctx->handle < 0) {
DRM_DEBUG("Not enough free contexts.\n");
/* Should this return -EBUSY instead? */
return -ENOMEM;
Patches currently in stable-queue which might be from hofrat(a)osadl.org are
queue-3.18/drm-re-enable-error-handling.patch
queue-3.18/can-mpc5xxx_can-check-of_iomap-return-before-use.patch
----- End forwarded message -----
In NMI context, we might be in the middle of context switching or in
the middle of switch_mm_irqs_off(). In either case, CR3 might not
match current->mm, which could cause copy_from_user_nmi() and
friends to read the wrong memory.
Fix it by adding a new nmi_uaccess_okay() helper and checking it in
copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.
Cc: stable(a)vger.kernel.org
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
---
The 0day bot is still chewing on this, but I've tested it a bit locally
and it seems to do the right thing.
I've never observed the bug it fixes, but it does appear to fix a bug
unless I've missed something. It's also a prerequisite for Nadav's
fixmap bugfix.
arch/x86/events/core.c | 2 +-
arch/x86/include/asm/tlbflush.h | 16 ++++++++++++++++
arch/x86/lib/usercopy.c | 5 +++++
arch/x86/mm/tlb.c | 3 +++
4 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 5f4829f10129..dfb2f7c0d019 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2465,7 +2465,7 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
perf_callchain_store(entry, regs->ip);
- if (!current->mm)
+ if (!nmi_uaccess_okay())
return;
if (perf_callchain_user32(regs, entry))
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 89a73bc31622..b23b2625793b 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -230,6 +230,22 @@ struct tlb_state {
};
DECLARE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate);
+/*
+ * Blindly accessing user memory from NMI context can be dangerous
+ * if we're in the middle of switching the current user task or
+ * switching the loaded mm. It can also be dangerous if we
+ * interrupted some kernel code that was temporarily using a
+ * different mm.
+ */
+static inline bool nmi_uaccess_okay(void)
+{
+ struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+ struct mm_struct *current_mm = current->mm;
+
+ return current_mm && loaded_mm == current_mm &&
+ loaded_mm->pgd == __va(read_cr3_pa());
+}
+
/* Initialize cr4 shadow for this CPU. */
static inline void cr4_init_shadow(void)
{
diff --git a/arch/x86/lib/usercopy.c b/arch/x86/lib/usercopy.c
index c8c6ad0d58b8..3f435d7fca5e 100644
--- a/arch/x86/lib/usercopy.c
+++ b/arch/x86/lib/usercopy.c
@@ -7,6 +7,8 @@
#include <linux/uaccess.h>
#include <linux/export.h>
+#include <asm/tlbflush.h>
+
/*
* We rely on the nested NMI work to allow atomic faults from the NMI path; the
* nested NMI paths are careful to preserve CR2.
@@ -19,6 +21,9 @@ copy_from_user_nmi(void *to, const void __user *from, unsigned long n)
if (__range_not_ok(from, n, TASK_SIZE))
return n;
+ if (!nmi_uaccess_okay())
+ return n;
+
/*
* Even though this function is typically called from NMI/IRQ context
* disable pagefaults so that its behaviour is consistent even when
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 457b281b9339..f4b41d5a93dd 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -345,6 +345,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
*/
trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
} else {
+ /* Let NMI code know that CR3 may not match expectations. */
+ this_cpu_write(cpu_tlbstate.loaded_mm, NULL);
+
/* The new ASID is already up to date. */
load_new_mm_cr3(next->pgd, new_asid, false);
--
2.17.1
The patch
regulator: bd71837: Disable voltage monitoring for LDO3/4
has been applied to the regulator tree at
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
>From 823f18f8b860526fc099c222619a126d57d2ad8c Mon Sep 17 00:00:00 2001
From: Matti Vaittinen <matti.vaittinen(a)fi.rohmeurope.com>
Date: Wed, 29 Aug 2018 15:36:10 +0300
Subject: [PATCH] regulator: bd71837: Disable voltage monitoring for LDO3/4
There is a HW quirk in BD71837. The shutdown sequence timings for
bucks/LDOs which are enabled via register interface are changed.
At PMIC poweroff the voltage for BUCK6/7 is cut immediately at the
beginning of shut-down sequence. This causes LDO5/6 voltage
monitoring to detect under voltage and force PMIC to emergency
state instead of poweroff. Disable voltage monitoring for LDO5 and
LDO6 at probe to avoid this.
Signed-off-by: Matti Vaittinen <matti.vaittinen(a)fi.rohmeurope.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
drivers/regulator/bd71837-regulator.c | 19 +++++++++++++++
include/linux/mfd/rohm-bd718x7.h | 33 ++++++++++++++++++++++++---
2 files changed, 49 insertions(+), 3 deletions(-)
diff --git a/drivers/regulator/bd71837-regulator.c b/drivers/regulator/bd71837-regulator.c
index 0f8ac8dec3e1..a1bd8aaf4d98 100644
--- a/drivers/regulator/bd71837-regulator.c
+++ b/drivers/regulator/bd71837-regulator.c
@@ -569,6 +569,25 @@ static int bd71837_probe(struct platform_device *pdev)
BD71837_REG_REGLOCK);
}
+ /*
+ * There is a HW quirk in BD71837. The shutdown sequence timings for
+ * bucks/LDOs which are controlled via register interface are changed.
+ * At PMIC poweroff the voltage for BUCK6/7 is cut immediately at the
+ * beginning of shut-down sequence. As bucks 6 and 7 are parent
+ * supplies for LDO5 and LDO6 - this causes LDO5/6 voltage
+ * monitoring to errorneously detect under voltage and force PMIC to
+ * emergency state instead of poweroff. In order to avoid this we
+ * disable voltage monitoring for LDO5 and LDO6
+ */
+ err = regmap_update_bits(pmic->mfd->regmap, BD718XX_REG_MVRFLTMASK2,
+ BD718XX_LDO5_VRMON80 | BD718XX_LDO6_VRMON80,
+ BD718XX_LDO5_VRMON80 | BD718XX_LDO6_VRMON80);
+ if (err) {
+ dev_err(&pmic->pdev->dev,
+ "Failed to disable voltage monitoring\n");
+ goto err;
+ }
+
for (i = 0; i < ARRAY_SIZE(pmic_regulator_inits); i++) {
struct regulator_desc *desc;
diff --git a/include/linux/mfd/rohm-bd718x7.h b/include/linux/mfd/rohm-bd718x7.h
index a528747f8aed..e8338e5dc10b 100644
--- a/include/linux/mfd/rohm-bd718x7.h
+++ b/include/linux/mfd/rohm-bd718x7.h
@@ -78,9 +78,9 @@ enum {
BD71837_REG_TRANS_COND0 = 0x1F,
BD71837_REG_TRANS_COND1 = 0x20,
BD71837_REG_VRFAULTEN = 0x21,
- BD71837_REG_MVRFLTMASK0 = 0x22,
- BD71837_REG_MVRFLTMASK1 = 0x23,
- BD71837_REG_MVRFLTMASK2 = 0x24,
+ BD718XX_REG_MVRFLTMASK0 = 0x22,
+ BD718XX_REG_MVRFLTMASK1 = 0x23,
+ BD718XX_REG_MVRFLTMASK2 = 0x24,
BD71837_REG_RCVCFG = 0x25,
BD71837_REG_RCVNUM = 0x26,
BD71837_REG_PWRONCONFIG0 = 0x27,
@@ -159,6 +159,33 @@ enum {
#define BUCK8_MASK 0x3F
#define BUCK8_DEFAULT 0x1E
+/* BD718XX Voltage monitoring masks */
+#define BD718XX_BUCK1_VRMON80 0x1
+#define BD718XX_BUCK1_VRMON130 0x2
+#define BD718XX_BUCK2_VRMON80 0x4
+#define BD718XX_BUCK2_VRMON130 0x8
+#define BD718XX_1ST_NODVS_BUCK_VRMON80 0x1
+#define BD718XX_1ST_NODVS_BUCK_VRMON130 0x2
+#define BD718XX_2ND_NODVS_BUCK_VRMON80 0x4
+#define BD718XX_2ND_NODVS_BUCK_VRMON130 0x8
+#define BD718XX_3RD_NODVS_BUCK_VRMON80 0x10
+#define BD718XX_3RD_NODVS_BUCK_VRMON130 0x20
+#define BD718XX_4TH_NODVS_BUCK_VRMON80 0x40
+#define BD718XX_4TH_NODVS_BUCK_VRMON130 0x80
+#define BD718XX_LDO1_VRMON80 0x1
+#define BD718XX_LDO2_VRMON80 0x2
+#define BD718XX_LDO3_VRMON80 0x4
+#define BD718XX_LDO4_VRMON80 0x8
+#define BD718XX_LDO5_VRMON80 0x10
+#define BD718XX_LDO6_VRMON80 0x20
+
+/* BD71837 specific voltage monitoring masks */
+#define BD71837_BUCK3_VRMON80 0x10
+#define BD71837_BUCK3_VRMON130 0x20
+#define BD71837_BUCK4_VRMON80 0x40
+#define BD71837_BUCK4_VRMON130 0x80
+#define BD71837_LDO7_VRMON80 0x40
+
/* BD71837_REG_IRQ bits */
#define IRQ_SWRST 0x40
#define IRQ_PWRON_S 0x20
--
2.18.0
[BUG]
fstrim on some btrfs only trims the unallocated space, not trimming any
space in existing block groups.
[CAUSE]
Before fstrim_range passed to btrfs_trim_fs(), it get truncated to
range [0, super->total_bytes).
So later btrfs_trim_fs() will only be able to trim block groups in range
[0, super->total_bytes).
While for btrfs, any bytenr aligned to sector size is valid, since btrfs use
its logical address space, there is nothing limiting the location where
we put block groups.
For btrfs with routine balance, it's quite easy to relocate all
block groups and bytenr of block groups will start beyond super->total_bytes.
In that case, btrfs will not trim existing block groups.
[FIX]
Just remove the truncation in btrfs_ioctl_fitrim(), so btrfs_trim_fs()
can get the unmodified range, which is normally set to [0, U64_MAX].
Reported-by: Chris Murphy <lists(a)colorremedies.com>
Fixes: f4c697e6406d ("btrfs: return EINVAL if start > total_bytes in fitrim ioctl")
Cc: <stable(a)vger.kernel.org> # v4.0+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/extent-tree.c | 10 +---------
fs/btrfs/ioctl.c | 11 +++++++----
2 files changed, 8 insertions(+), 13 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 7768f206196a..d1478d66c7a5 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10851,21 +10851,13 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
u64 start;
u64 end;
u64 trimmed = 0;
- u64 total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
u64 bg_failed = 0;
u64 dev_failed = 0;
int bg_ret = 0;
int dev_ret = 0;
int ret = 0;
- /*
- * try to trim all FS space, our block group may start from non-zero.
- */
- if (range->len == total_bytes)
- cache = btrfs_lookup_first_block_group(fs_info, range->start);
- else
- cache = btrfs_lookup_block_group(fs_info, range->start);
-
+ cache = btrfs_lookup_first_block_group(fs_info, range->start);
for (; cache; cache = next_block_group(fs_info, cache)) {
if (cache->key.objectid >= (range->start + range->len)) {
btrfs_put_block_group(cache);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 63600dc2ac4c..8165a4bfa579 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -491,7 +491,6 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
struct fstrim_range range;
u64 minlen = ULLONG_MAX;
u64 num_devices = 0;
- u64 total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
int ret;
if (!capable(CAP_SYS_ADMIN))
@@ -515,11 +514,15 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
return -EOPNOTSUPP;
if (copy_from_user(&range, arg, sizeof(range)))
return -EFAULT;
- if (range.start > total_bytes ||
- range.len < fs_info->sb->s_blocksize)
+
+ /*
+ * NOTE: Don't truncate the range using super->total_bytes.
+ * Bytenr of btrfs block group is in btrfs logical address space,
+ * which can be any sector size aligned bytenr in [0, U64_MAX].
+ */
+ if (range.len < fs_info->sb->s_blocksize)
return -EINVAL;
- range.len = min(range.len, total_bytes - range.start);
range.minlen = max(range.minlen, minlen);
ret = btrfs_trim_fs(fs_info, &range);
if (ret < 0)
--
2.18.0
Like d88b6d04: "cdrom: information leak in cdrom_ioctl_media_changed()"
There is another cast from unsigned long to int which causes
a bounds check to fail with specially crafted input. The value is
then used as an index in the slot array in cdrom_slot_status().
Signed-off-by: Scott Bauer <scott.bauer(a)intel.com>
Signed-off-by: Scott Bauer <sbauer(a)plzdonthack.me>
Cc: stable(a)vger.kernel.org
---
drivers/cdrom/cdrom.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index bfc566d3f31a..8cfa10ab7abc 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -2542,7 +2542,7 @@ static int cdrom_ioctl_drive_status(struct cdrom_device_info *cdi,
if (!CDROM_CAN(CDC_SELECT_DISC) ||
(arg == CDSL_CURRENT || arg == CDSL_NONE))
return cdi->ops->drive_status(cdi, CDSL_CURRENT);
- if (((int)arg >= cdi->capacity))
+ if (arg >= cdi->capacity)
return -EINVAL;
return cdrom_slot_status(cdi, arg);
}
--
2.14.1
We're using imx_v6_v7_defconfig on our Wandboards. After upgrading to
v4.14.67, reboot no longer works (or, well, takes a very long time when
the watchdog is configured).
v4.14.66 works fine, the breakage bisects to
2059e527a659cf16d6bb709f1c8509f7a7623fc4 (ARM: imx_v6_v7_defconfig:
Select ULPI support), and reverting that on top of v4.14.67 again works.
Rasmus
The patch
spi: spi-fsl-dspi: fix broken DSPI_EOQ_MODE
has been applied to the spi tree at
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
>From 5223c9c1cbfc0cd4d0a1b50758e0949af3290fa1 Mon Sep 17 00:00:00 2001
From: Angelo Dureghello <angelo(a)sysam.it>
Date: Sat, 18 Aug 2018 01:51:58 +0200
Subject: [PATCH] spi: spi-fsl-dspi: fix broken DSPI_EOQ_MODE
This patch fixes the dspi_eoq_write function used by the
ColdFire mcf5441x family. The 16 bit cmd part must be re-set at
each data transfer.
Also, now that fifo_size variables are used for eoq_read/write,
a proper fifo size must be set (16 slots for the ColdFire dspi
module version).
Signed-off-by: Angelo Dureghello <angelo(a)sysam.it>
Acked-by: Esben Haabendal <esben(a)haabendal.dk>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
drivers/spi/spi-fsl-dspi.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 7cb3ab0a35a0..3082e72e4f6c 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -30,7 +30,11 @@
#define DRIVER_NAME "fsl-dspi"
+#ifdef CONFIG_M5441x
+#define DSPI_FIFO_SIZE 16
+#else
#define DSPI_FIFO_SIZE 4
+#endif
#define DSPI_DMA_BUFSIZE (DSPI_FIFO_SIZE * 1024)
#define SPI_MCR 0x00
@@ -623,9 +627,11 @@ static void dspi_tcfq_read(struct fsl_dspi *dspi)
static void dspi_eoq_write(struct fsl_dspi *dspi)
{
int fifo_size = DSPI_FIFO_SIZE;
+ u16 xfer_cmd = dspi->tx_cmd;
/* Fill TX FIFO with as many transfers as possible */
while (dspi->len && fifo_size--) {
+ dspi->tx_cmd = xfer_cmd;
/* Request EOQF for last transfer in FIFO */
if (dspi->len == dspi->bytes_per_word || fifo_size == 0)
dspi->tx_cmd |= SPI_PUSHR_CMD_EOQ;
--
2.18.0
Subject of the patch: iommu/arm-smmu: Error out only if not enough
context interrupts
Commit ID: 42b88ec6530fd76f1ae06de7f09830bcbca5bbd6
Why: ARM SMMU does not initialize for Broadcom Stingray SoC if
bootloader reserves few contexts. Kernel should not error out if there
are enough context interrupts.
Patch
Sehr geehrte Damen und Herren.
Nachdem wir Ihre Webseite besucht und das Profil Ihrer Geschäftstätigkeit analysiert haben, wissen wir schon, wie Sie neue Kunden ab sofort gewinnen können.
Wir verfügen über mehr als 100 000 Adressenangaben der potentiellen Kunden. Diese Daten wurden nach Branchen gegliedert.
http://www.gbc-at.net/?page=catalog
***
1. Österreich 2018 ( 104 000 ) - 149 EUR ( bis zum 29.08.2018 )
***
Bitte informieren Sie sich über die weiteren Details einmal unverbindlich auf unseren Webseite:
http://www.gbc-at.net/?page=catalog
Die Adressensortierung je nach der Branche findet KOSTENLOS im beigelegten DataManager-Programm statt.
Zusätzlich bieten wir KOSTENLOS das Tool zum automatischen Verschicken der Angebote an.
MfG
Thomas Weber
GC-Team
Some modems now use the Android Debug Bridge to provide a debugging
interface, and some phones can also export serial ports managed by the
"option" driver.
The ADB daemon running in userspace tries to use USB interfaces with
bDeviceClass=0xFF, bDeviceSubClass=0x42, bDeviceProtocol=1
Prevent the option driver from binding to those interfaces, as they
will not be serial ports.
This can fix issues like:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=781256
Signed-off-by: Romain Izard <romain.izard.pro(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
---
drivers/usb/serial/option.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index 664e61f16b6a..f98943a57ff0 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -1987,6 +1987,12 @@ static int option_probe(struct usb_serial *serial,
if (iface_desc->bInterfaceClass == USB_CLASS_MASS_STORAGE)
return -ENODEV;
+ /* Do not bind Android Debug Bridge interfaces */
+ if (iface_desc->bInterfaceClass == USB_CLASS_VENDOR_SPEC &&
+ iface_desc->bInterfaceSubClass == 0x42 &&
+ iface_desc->bInterfaceProtocol == 1)
+ return -ENODEV;
+
/*
* Don't bind reserved interfaces (like network ones) which often have
* the same class/subclass/protocol as the serial interfaces. Look at
--
2.17.1
Hi !
this is also the wrong version of the patch - the proper version is
below. This has been posted to lkml https://lkml.org/lkml/2018/7/18/191
but there was no review yet - the version you have though is for sure
broken. So maybe this should be simply dropped until the fix is confirmed
thx!
hofrat
----- Forwarded message from gregkh(a)linuxfoundation.org -----
Date: Tue, 28 Aug 2018 16:11:46 +0200
From: gregkh(a)linuxfoundation.org
To: 1531571532-22733-1-git-send-email-hofrat(a)osadl.org, alexander.levin(a)microsoft.com, gregkh(a)linuxfoundation.org, hofrat(a)osadl.org,
seanpaul(a)chromium.org
Cc: stable-commits(a)vger.kernel.org
Subject: Patch "drm: re-enable error handling" has been added to the 4.4-stable tree
This is a note to let you know that I've just added the patch titled
drm: re-enable error handling
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-re-enable-error-handling.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Aug 28 16:10:37 CEST 2018
From: Nicholas Mc Guire <hofrat(a)osadl.org>
Date: Sat, 14 Jul 2018 14:32:12 +0200
Subject: drm: re-enable error handling
From: Nicholas Mc Guire <hofrat(a)osadl.org>
[ Upstream commit d530b5f1ca0bb66958a2b714bebe40a1248b9c15 ]
drm_legacy_ctxbitmap_next() returns idr_alloc() which can return
-ENOMEM, -EINVAL or -ENOSPC none of which are -1 . but the call sites
of drm_legacy_ctxbitmap_next() seem to be assuming that the error case
would be -1 (original return of drm_ctxbitmap_next() prior to 2.6.23
was actually -1). Thus reenable error handling by checking for < 0.
Signed-off-by: Nicholas Mc Guire <hofrat(a)osadl.org>
Fixes: 62968144e673 ("drm: convert drm context code to use Linux idr")
Signed-off-by: Sean Paul <seanpaul(a)chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1531571532-22733-1-git-send-e…
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/drm_context.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/drm_context.c
+++ b/drivers/gpu/drm/drm_context.c
@@ -372,7 +372,7 @@ int drm_legacy_addctx(struct drm_device
ctx->handle = drm_legacy_ctxbitmap_next(dev);
}
DRM_DEBUG("%d\n", ctx->handle);
- if (ctx->handle == -1) {
+ if (ctx->handle < 0) {
DRM_DEBUG("Not enough free contexts.\n");
/* Should this return -EBUSY instead? */
return -ENOMEM;
Patches currently in stable-queue which might be from hofrat(a)osadl.org are
queue-4.4/drm-re-enable-error-handling.patch
queue-4.4/can-mpc5xxx_can-check-of_iomap-return-before-use.patch
----- End forwarded message -----
Two users have reported [1] that they have an "extremely unlikely" system
with more than MAX_PA/2 memory and L1TF mitigation is not effective. In fact
it's a CPU with 36bits phys limit (64GB) and 32GB memory, but due to holes
in the e820 map, the main region is almost 500MB over the 32GB limit:
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000081effffff] usable
Suggestions to use 'mem=32G' to prefer L1TF mitigation while losing the 500MB
revealed, that there's an off-by-one error in the check in
l1tf_select_mitigation(). l1tf_pfn_limit() returns the last usable pfn
(inclusive), but it's more common and hopefully less error-prone to return the
first pfn that's over limit, so this patch changes that and updates the other
callers.
[1] https://bugzilla.suse.com/show_bug.cgi?id=1105536
Reported-by: George Anchev <studio(a)anchev.net>
Reported-by: Christopher Snowhill <kode54(a)gmail.com>
Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Cc: stable(a)vger.kernel.org
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
---
arch/x86/include/asm/processor.h | 2 +-
arch/x86/mm/init.c | 2 +-
arch/x86/mm/mmap.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a0a52274cb4a..c24297268ebc 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -183,7 +183,7 @@ extern void cpu_detect(struct cpuinfo_x86 *c);
static inline unsigned long long l1tf_pfn_limit(void)
{
- return BIT_ULL(boot_cpu_data.x86_phys_bits - 1 - PAGE_SHIFT) - 1;
+ return BIT_ULL(boot_cpu_data.x86_phys_bits - 1 - PAGE_SHIFT);
}
extern void early_cpu_init(void);
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 02de3d6065c4..63a6f9fcaf20 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -923,7 +923,7 @@ unsigned long max_swapfile_size(void)
if (boot_cpu_has_bug(X86_BUG_L1TF)) {
/* Limit the swap file size to MAX_PA/2 for L1TF workaround */
- unsigned long long l1tf_limit = l1tf_pfn_limit() + 1;
+ unsigned long long l1tf_limit = l1tf_pfn_limit();
/*
* We encode swap offsets also with 3 bits below those for pfn
* which makes the usable limit higher.
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index f40ab8185d94..1e95d57760cf 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -257,7 +257,7 @@ bool pfn_modify_allowed(unsigned long pfn, pgprot_t prot)
/* If it's real memory always allow */
if (pfn_valid(pfn))
return true;
- if (pfn > l1tf_pfn_limit() && !capable(CAP_SYS_ADMIN))
+ if (pfn >= l1tf_pfn_limit() && !capable(CAP_SYS_ADMIN))
return false;
return true;
}
--
2.18.0
Hi,
The upstream kernel has supported IIO free-running counters on Skylake
server. As of Skylake Server, there are a number of free running
counters in each IIO Box that collect counts of per-box IO clocks and
per-port Input/Output x BW/Utilization.
There are three types of IIO free-running counters on Skylake server:
1. IO CLOCKS counter: a clock of IIO box.
2. BANDWIDTH counters: count inbound(PCIe->CPU)/outbound(CPU->PCIe)
bandwidth.
3. UTILIZATION counters: count input/output utilization.
With these IIO free-running counters, we can get good observation for
IIO traffic on Skylake server. For example, we can see the IIO inbound
bandwidth (PCIe->CPU).
root@skx /sys/devices# perf stat -a -e
uncore_iio_free_running_2/bw_in_port0/
^C
Performance counter stats for 'system wide':
153.19 MiB uncore_iio_free_running_2/bw_in_port0/
8.037701069 seconds time elapsed
I propose to backport the patches which support IIO free-running
counters to 4.14 stable kernel.
perf/x86/intel/uncore: Introduce customized event_read() for client IMC
uncore
2da331465f44f9618abe8837d1a68405d550b66e
perf/x86/intel/uncore: Add new data structures for free running counters
927b2deb067b8b4753fc09c7a42092f43fc0c1f6
perf/x86/intel/uncore: Add infrastructure for free running counters
0e0162dfcd1fbe4c711ee86f24f966c318999603
perf/x86/intel/uncore: Support IIO free-running counters on SKX
0f519f0352e37e7d71bdce5559517c74a35f6e33
perf/x86/intel/uncore: Expose uncore_pmu_event*() functions
5a6c9d94e9ed7410142bc6fcb638a4db1895aa0c
perf/x86/intel/uncore: Clean up client IMC uncore
9aae1780e7e81e54edfb70ba33ead5b0b48be009
Thanks
Jin Yao
From: Randy Dunlap <rdunlap(a)infradead.org>
When $DEPMOD is not found, only print a warning instead of exiting
with an error message and error status.
Warning: 'make modules_install' requires /sbin/depmod. Please install it.
This is probably in the kmod package.
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Fixes: 934193a654c1 ("kbuild: verify that $DEPMOD is installed")
Cc: stable(a)vger.kernel.org
Cc: Lucas De Marchi <lucas.demarchi(a)profusion.mobi>
Cc: Lucas De Marchi <lucas.de.marchi(a)gmail.com>
Cc: Michal Marek <michal.lkml(a)markovi.net>
Cc: Jessica Yu <jeyu(a)kernel.org>
Cc: Chih-Wei Huang <cwhuang(a)linux.org.tw>
Cc: H. Nikolaus Schaller <hns(a)goldelico.com>
---
v2: add missing "exit 0" and update the commit message (no Error).
v3: add Fixes: and Cc: stable
scripts/depmod.sh | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- lnx-418.orig/scripts/depmod.sh
+++ lnx-418/scripts/depmod.sh
@@ -15,9 +15,9 @@ if ! test -r System.map ; then
fi
if [ -z $(command -v $DEPMOD) ]; then
- echo "'make modules_install' requires $DEPMOD. Please install it." >&2
+ echo "Warning: 'make modules_install' requires $DEPMOD. Please install it." >&2
echo "This is probably in the kmod package." >&2
- exit 1
+ exit 0
fi
# older versions of depmod require the version string to start with three
The patch titled
Subject: uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name
has been added to the -mm tree. Its filename is
uapi-linux-keyctlh-dont-use-c-reserved-keyword-as-a-struct-member-name.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/uapi-linux-keyctlh-dont-use-c-rese…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/uapi-linux-keyctlh-dont-use-c-rese…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Randy Dunlap <rdunlap(a)infradead.org>
Subject: uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name
Since this header is in "include/uapi/linux/", apparently people want to
use it in userspace programs -- even in C++ ones. However, the header
uses a C++ reserved keyword ("private"), so change that to "dh_private"
instead to allow the header file to be used in C++ userspace.
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=191051
Link: http://lkml.kernel.org/r/0db6c314-1ef4-9bfa-1baa-7214dd2ee061@infradead.org
Fixes: ddbb41148724 ("KEYS: Add KEYCTL_DH_COMPUTE command")
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Howells <dhowells(a)redhat.com>
Cc: James Morris <jmorris(a)namei.org>
Cc: "Serge E. Hallyn" <serge(a)hallyn.com>
Cc: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/uapi/linux/keyctl.h | 2 +-
security/keys/dh.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--- a/include/uapi/linux/keyctl.h~uapi-linux-keyctlh-dont-use-c-reserved-keyword-as-a-struct-member-name
+++ a/include/uapi/linux/keyctl.h
@@ -65,7 +65,7 @@
/* keyctl structures */
struct keyctl_dh_params {
- __s32 private;
+ __s32 dh_private;
__s32 prime;
__s32 base;
};
--- a/security/keys/dh.c~uapi-linux-keyctlh-dont-use-c-reserved-keyword-as-a-struct-member-name
+++ a/security/keys/dh.c
@@ -300,7 +300,7 @@ long __keyctl_dh_compute(struct keyctl_d
}
dh_inputs.g_size = dlen;
- dlen = dh_data_from_key(pcopy.private, &dh_inputs.key);
+ dlen = dh_data_from_key(pcopy.dh_private, &dh_inputs.key);
if (dlen < 0) {
ret = dlen;
goto out2;
_
Patches currently in -mm which might be from rdunlap(a)infradead.org are
uapi-linux-keyctlh-dont-use-c-reserved-keyword-as-a-struct-member-name.patch
The patch titled
Subject: memory_hotplug: fix kernel_panic on offline page processing
has been added to the -mm tree. Its filename is
memory_hotplug-fix-kernel_panic-on-offline-page-processing.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/memory_hotplug-fix-kernel_panic-on…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/memory_hotplug-fix-kernel_panic-on…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Mikhail Zaslonko <zaslonko(a)linux.ibm.com>
Subject: memory_hotplug: fix kernel_panic on offline page processing
Within show_valid_zones() the function test_pages_in_a_zone() should be
called for online memory blocks only. Otherwise it might lead to the
VM_BUG_ON due to uninitialized struct pages (when CONFIG_DEBUG_VM_PGFLAGS
kernel option is set):
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
------------[ cut here ]------------
Call Trace:
([<000000000038f91e>] test_pages_in_a_zone+0xe6/0x168)
[<0000000000923472>] show_valid_zones+0x5a/0x1a8
[<0000000000900284>] dev_attr_show+0x3c/0x78
[<000000000046f6f0>] sysfs_kf_seq_show+0xd0/0x150
[<00000000003ef662>] seq_read+0x212/0x4b8
[<00000000003bf202>] __vfs_read+0x3a/0x178
[<00000000003bf3ca>] vfs_read+0x8a/0x148
[<00000000003bfa3a>] ksys_read+0x62/0xb8
[<0000000000bc2220>] system_call+0xdc/0x2d8
That VM_BUG_ON was triggered by the page poisoning introduced in
mm/sparse.c with the git commit d0dc12e86b31 ("mm/memory_hotplug: optimize
memory hotplug") With the same commit the new 'nid' field has been added
to the struct memory_block in order to store and later on derive the node
id for offline pages (instead of accessing struct page which might be
uninitialized). But one reference to nid in show_valid_zones() function
has been overlooked. Fixed with current commit. Also, nr_pages will not
be used any more after test_pages_in_a_zone() call, do not update it.
Link: http://lkml.kernel.org/r/20180828090539.41491-1-zaslonko@linux.ibm.com
Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
Signed-off-by: Mikhail Zaslonko <zaslonko(a)linux.ibm.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin(a)microsoft.com>
Cc: <stable(a)vger.kernel.org> [4.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/base/memory.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)
--- a/drivers/base/memory.c~memory_hotplug-fix-kernel_panic-on-offline-page-processing
+++ a/drivers/base/memory.c
@@ -417,25 +417,23 @@ static ssize_t show_valid_zones(struct d
int nid;
/*
- * The block contains more than one zone can not be offlined.
- * This can happen e.g. for ZONE_DMA and ZONE_DMA32
- */
- if (!test_pages_in_a_zone(start_pfn, start_pfn + nr_pages, &valid_start_pfn, &valid_end_pfn))
- return sprintf(buf, "none\n");
-
- start_pfn = valid_start_pfn;
- nr_pages = valid_end_pfn - start_pfn;
-
- /*
* Check the existing zone. Make sure that we do that only on the
* online nodes otherwise the page_zone is not reliable
*/
if (mem->state == MEM_ONLINE) {
+ /*
+ * The block contains more than one zone can not be offlined.
+ * This can happen e.g. for ZONE_DMA and ZONE_DMA32
+ */
+ if (!test_pages_in_a_zone(start_pfn, start_pfn + nr_pages,
+ &valid_start_pfn, &valid_end_pfn))
+ return sprintf(buf, "none\n");
+ start_pfn = valid_start_pfn;
strcat(buf, page_zone(pfn_to_page(start_pfn))->name);
goto out;
}
- nid = pfn_to_nid(start_pfn);
+ nid = mem->nid;
default_zone = zone_for_pfn_range(MMOP_ONLINE_KEEP, nid, start_pfn, nr_pages);
strcat(buf, default_zone->name);
_
Patches currently in -mm which might be from zaslonko(a)linux.ibm.com are
memory_hotplug-fix-kernel_panic-on-offline-page-processing.patch