- Linux-stable-mirror - lists.linaro.org

FAILED: patch "[PATCH] Btrfs: fix mount failure after fsync due to hard link" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 0d836392cadd5535f4184d46d901a82eb276ed62 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Fri, 20 Jul 2018 10:59:06 +0100
Subject: [PATCH] Btrfs: fix mount failure after fsync due to hard link
 recreation

If we end up with logging an inode reference item which has the same name
but different index from the one we have persisted, we end up failing when
replaying the log with an errno value of -EEXIST. The error comes from
btrfs_add_link(), which is called from add_inode_ref(), when we are
replaying an inode reference item.

Example scenario where this happens:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ touch /mnt/foo
  $ ln /mnt/foo /mnt/bar

  $ sync

  # Rename the first hard link (foo) to a new name and rename the second
  # hard link (bar) to the old name of the first hard link (foo).
  $ mv /mnt/foo /mnt/qwerty
  $ mv /mnt/bar /mnt/foo

  # Create a new file, in the same parent directory, with the old name of
  # the second hard link (bar) and fsync this new file.
  # We do this instead of calling fsync on foo/qwerty because if we did
  # that the fsync resulted in a full transaction commit, not triggering
  # the problem.
  $ touch /mnt/bar
  $ xfs_io -c "fsync" /mnt/bar

  <power fail>

  $ mount /dev/sdb /mnt
  mount: mount /dev/sdb on /mnt failed: File exists

So fix this by checking if a conflicting inode reference exists (same
name, same parent but different index), removing it (and the associated
dir index entries from the parent inode) if it exists, before attempting
to add the new reference.

A test case for fstests follows soon.

CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 10f6a4223897..033aeebbe9de 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -1290,6 +1290,46 @@ again:
 	return ret;
 }
 
+static int btrfs_inode_ref_exists(struct inode *inode, struct inode *dir,
+				  const u8 ref_type, const char *name,
+				  const int namelen)
+{
+	struct btrfs_key key;
+	struct btrfs_path *path;
+	const u64 parent_id = btrfs_ino(BTRFS_I(dir));
+	int ret;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		return -ENOMEM;
+
+	key.objectid = btrfs_ino(BTRFS_I(inode));
+	key.type = ref_type;
+	if (key.type == BTRFS_INODE_REF_KEY)
+		key.offset = parent_id;
+	else
+		key.offset = btrfs_extref_hash(parent_id, name, namelen);
+
+	ret = btrfs_search_slot(NULL, BTRFS_I(inode)->root, &key, path, 0, 0);
+	if (ret < 0)
+		goto out;
+	if (ret > 0) {
+		ret = 0;
+		goto out;
+	}
+	if (key.type == BTRFS_INODE_EXTREF_KEY)
+		ret = btrfs_find_name_in_ext_backref(path->nodes[0],
+						     path->slots[0], parent_id,
+						     name, namelen, NULL);
+	else
+		ret = btrfs_find_name_in_backref(path->nodes[0], path->slots[0],
+						 name, namelen, NULL);
+
+out:
+	btrfs_free_path(path);
+	return ret;
+}
+
 /*
  * replay one inode back reference item found in the log tree.
  * eb, slot and key refer to the buffer and key found in the log tree.
@@ -1399,6 +1439,32 @@ static noinline int add_inode_ref(struct btrfs_trans_handle *trans,
 				}
 			}
 
+			/*
+			 * If a reference item already exists for this inode
+			 * with the same parent and name, but different index,
+			 * drop it and the corresponding directory index entries
+			 * from the parent before adding the new reference item
+			 * and dir index entries, otherwise we would fail with
+			 * -EEXIST returned from btrfs_add_link() below.
+			 */
+			ret = btrfs_inode_ref_exists(inode, dir, key->type,
+						     name, namelen);
+			if (ret > 0) {
+				ret = btrfs_unlink_inode(trans, root,
+							 BTRFS_I(dir),
+							 BTRFS_I(inode),
+							 name, namelen);
+				/*
+				 * If we dropped the link count to 0, bump it so
+				 * that later the iput() on the inode will not
+				 * free it. We will fixup the link count later.
+				 */
+				if (!ret && inode->i_nlink == 0)
+					inc_nlink(inode);
+			}
+			if (ret < 0)
+				goto out;
+
 			/* insert our name */
 			ret = btrfs_add_link(trans, BTRFS_I(dir),
 					BTRFS_I(inode),
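The replay failure and the fix described in this commit message can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not kernel code: `toy_dir`, `toy_lookup()`, `toy_add_link()` and `toy_replay_ref()` are hypothetical names invented for illustration, and a directory is reduced to a fixed array of (name, inode, index) entries. The only behaviour taken from the patch is the ordering: look for an existing reference with the same name but a different index, drop it, then add the new one.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Toy model (hypothetical, not kernel code): one parent directory. */
#define TOY_MAX_ENTRIES 8

struct toy_entry {
	char name[16];
	unsigned long ino;	/* inode the name points to */
	unsigned long index;	/* dir index of this name */
	int used;
};

struct toy_dir {
	struct toy_entry entries[TOY_MAX_ENTRIES];
};

/* Return the slot holding `name`, or -1 if the name is not present. */
static int toy_lookup(const struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < TOY_MAX_ENTRIES; i++)
		if (dir->entries[i].used &&
		    strcmp(dir->entries[i].name, name) == 0)
			return i;
	return -1;
}

/* Stand-in for btrfs_add_link(): refuses a name that already exists. */
static int toy_add_link(struct toy_dir *dir, const char *name,
			unsigned long ino, unsigned long index)
{
	assert(strlen(name) < sizeof(dir->entries[0].name));
	if (toy_lookup(dir, name) >= 0)
		return -EEXIST;
	for (int i = 0; i < TOY_MAX_ENTRIES; i++) {
		struct toy_entry *e = &dir->entries[i];

		if (e->used)
			continue;
		snprintf(e->name, sizeof(e->name), "%s", name);
		e->ino = ino;
		e->index = index;
		e->used = 1;
		return 0;
	}
	return -ENOSPC;
}

/*
 * Replay one logged reference. With with_fix set, a persisted entry for
 * the same inode and name but a different index is dropped first, which
 * mirrors the check-then-unlink step the patch adds.
 */
static int toy_replay_ref(struct toy_dir *dir, const char *name,
			  unsigned long ino, unsigned long index, int with_fix)
{
	int slot = toy_lookup(dir, name);

	if (with_fix && slot >= 0 && dir->entries[slot].ino == ino &&
	    dir->entries[slot].index != index)
		dir->entries[slot].used = 0;
	return toy_add_link(dir, name, ino, index);
}
```

In this model, a directory that persisted `foo` at index 2 for some inode rejects a replayed `foo` at index 5 for the same inode with `-EEXIST` when `with_fix` is 0, and accepts it when `with_fix` is 1. The inode number and index values used to exercise it are arbitrary.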


FAILED: patch "[PATCH] Btrfs: fix send failure when root has deleted files still" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 46b2f4590aab71d31088a265c86026b1e96c9de4 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 24 Jul 2018 11:54:04 +0100
Subject: [PATCH] Btrfs: fix send failure when root has deleted files still
 open

The more common use case of send involves creating a RO snapshot and then
use it for a send operation. In this case it's not possible to have inodes
in the snapshot that have a link count of zero (inode with an orphan item)
since during snapshot creation we do the orphan cleanup. However, other
less common use cases for send can end up seeing inodes with a link count
of zero and in this case the send operation fails with a ENOENT error
because any attempt to generate a path for the inode, with the purpose of
creating it or updating it at the receiver, fails since there are no inode
reference items. One use case it to use a regular subvolume for a send
operation after turning it to RO mode or turning a RW snapshot into RO
mode and then using it for a send operation. In both cases, if a file gets
all its hard links deleted while there is an open file descriptor before
turning the subvolume/snapshot into RO mode, the send operation will
encounter an inode with a link count of zero and then fail with errno
ENOENT.

Example using a full send with a subvolume:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ btrfs subvolume create /mnt/sv1
  $ touch /mnt/sv1/foo
  $ touch /mnt/sv1/bar

  # keep an open file descriptor on file bar
  $ exec 73</mnt/sv1/bar
  $ unlink /mnt/sv1/bar

  # Turn the subvolume to RO mode and use it for a full send, while
  # holding the open file descriptor.
  $ btrfs property set /mnt/sv1 ro true

  $ btrfs send -f /tmp/full.send /mnt/sv1
  At subvol /mnt/sv1
  ERROR: send ioctl failed with -2: No such file or directory

Example using an incremental send with snapshots:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ btrfs subvolume create /mnt/sv1
  $ touch /mnt/sv1/foo
  $ touch /mnt/sv1/bar

  $ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap1

  $ echo "hello world" >> /mnt/sv1/bar

  $ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap2

  # Turn the second snapshot to RW mode and delete file foo while
  # holding an open file descriptor on it.
  $ btrfs property set /mnt/snap2 ro false
  $ exec 73</mnt/snap2/foo
  $ unlink /mnt/snap2/foo

  # Set the second snapshot back to RO mode and do an incremental send.
  $ btrfs property set /mnt/snap2 ro true

  $ btrfs send -f /tmp/inc.send -p /mnt/snap1 /mnt/snap2
  At subvol /mnt/snap2
  ERROR: send ioctl failed with -2: No such file or directory

So fix this by ignoring inodes with a link count of zero if we are either
doing a full send or if they do not exist in the parent snapshot (they
are new in the send snapshot), and unlink all paths found in the parent
snapshot when doing an incremental send (and ignoring all other inode
items, such as xattrs and extents).

A test case for fstests follows soon.

CC: stable(a)vger.kernel.org # 4.4+
Reported-by: Martin Wilck <martin.wilck(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 42e04cd3cd95..551294a6c9e2 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -100,6 +100,7 @@ struct send_ctx {
 	u64 cur_inode_rdev;
 	u64 cur_inode_last_extent;
 	u64 cur_inode_next_write_offset;
+	bool ignore_cur_inode;
 
 	u64 send_progress;
 
@@ -5796,6 +5797,9 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
 	int pending_move = 0;
 	int refs_processed = 0;
 
+	if (sctx->ignore_cur_inode)
+		return 0;
+
 	ret = process_recorded_refs_if_needed(sctx, at_end, &pending_move,
 					      &refs_processed);
 	if (ret < 0)
@@ -5914,6 +5918,93 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
 	return ret;
 }
 
+struct parent_paths_ctx {
+	struct list_head *refs;
+	struct send_ctx *sctx;
+};
+
+static int record_parent_ref(int num, u64 dir, int index, struct fs_path *name,
+			     void *ctx)
+{
+	struct parent_paths_ctx *ppctx = ctx;
+
+	return record_ref(ppctx->sctx->parent_root, dir, name, ppctx->sctx,
+			  ppctx->refs);
+}
+
+/*
+ * Issue unlink operations for all paths of the current inode found in the
+ * parent snapshot.
+ */
+static int btrfs_unlink_all_paths(struct send_ctx *sctx)
+{
+	LIST_HEAD(deleted_refs);
+	struct btrfs_path *path;
+	struct btrfs_key key;
+	struct parent_paths_ctx ctx;
+	int ret;
+
+	path = alloc_path_for_send();
+	if (!path)
+		return -ENOMEM;
+
+	key.objectid = sctx->cur_ino;
+	key.type = BTRFS_INODE_REF_KEY;
+	key.offset = 0;
+	ret = btrfs_search_slot(NULL, sctx->parent_root, &key, path, 0, 0);
+	if (ret < 0)
+		goto out;
+
+	ctx.refs = &deleted_refs;
+	ctx.sctx = sctx;
+
+	while (true) {
+		struct extent_buffer *eb = path->nodes[0];
+		int slot = path->slots[0];
+
+		if (slot >= btrfs_header_nritems(eb)) {
+			ret = btrfs_next_leaf(sctx->parent_root, path);
+			if (ret < 0)
+				goto out;
+			else if (ret > 0)
+				break;
+			continue;
+		}
+
+		btrfs_item_key_to_cpu(eb, &key, slot);
+		if (key.objectid != sctx->cur_ino)
+			break;
+		if (key.type != BTRFS_INODE_REF_KEY &&
+		    key.type != BTRFS_INODE_EXTREF_KEY)
+			break;
+
+		ret = iterate_inode_ref(sctx->parent_root, path, &key, 1,
+					record_parent_ref, &ctx);
+		if (ret < 0)
+			goto out;
+
+		path->slots[0]++;
+	}
+
+	while (!list_empty(&deleted_refs)) {
+		struct recorded_ref *ref;
+
+		ref = list_first_entry(&deleted_refs, struct recorded_ref, list);
+		ret = send_unlink(sctx, ref->full_path);
+		if (ret < 0)
+			goto out;
+		fs_path_free(ref->full_path);
+		list_del(&ref->list);
+		kfree(ref);
+	}
+	ret = 0;
+out:
+	btrfs_free_path(path);
+	if (ret)
+		__free_recorded_refs(&deleted_refs);
+	return ret;
+}
+
 static int changed_inode(struct send_ctx *sctx,
 			 enum btrfs_compare_tree_result result)
 {
@@ -5928,6 +6019,7 @@ static int changed_inode(struct send_ctx *sctx,
 	sctx->cur_inode_new_gen = 0;
 	sctx->cur_inode_last_extent = (u64)-1;
 	sctx->cur_inode_next_write_offset = 0;
+	sctx->ignore_cur_inode = false;
 
 	/*
 	 * Set send_progress to current inode. This will tell all get_cur_xxx
@@ -5968,6 +6060,33 @@ static int changed_inode(struct send_ctx *sctx,
 			sctx->cur_inode_new_gen = 1;
 	}
 
+	/*
+	 * Normally we do not find inodes with a link count of zero (orphans)
+	 * because the most common case is to create a snapshot and use it
+	 * for a send operation. However other less common use cases involve
+	 * using a subvolume and send it after turning it to RO mode just
+	 * after deleting all hard links of a file while holding an open
+	 * file descriptor against it or turning a RO snapshot into RW mode,
+	 * keep an open file descriptor against a file, delete it and then
+	 * turn the snapshot back to RO mode before using it for a send
+	 * operation. So if we find such cases, ignore the inode and all its
+	 * items completely if it's a new inode, or if it's a changed inode
+	 * make sure all its previous paths (from the parent snapshot) are all
+	 * unlinked and all other the inode items are ignored.
+	 */
+	if (result == BTRFS_COMPARE_TREE_NEW ||
+	    result == BTRFS_COMPARE_TREE_CHANGED) {
+		u32 nlinks;
+
+		nlinks = btrfs_inode_nlink(sctx->left_path->nodes[0], left_ii);
+		if (nlinks == 0) {
+			sctx->ignore_cur_inode = true;
+			if (result == BTRFS_COMPARE_TREE_CHANGED)
+				ret = btrfs_unlink_all_paths(sctx);
+			goto out;
+		}
+	}
+
 	if (result == BTRFS_COMPARE_TREE_NEW) {
 		sctx->cur_inode_gen = left_gen;
 		sctx->cur_inode_new = 1;
@@ -6306,15 +6425,17 @@ static int changed_cb(struct btrfs_path *left_path,
 	    key->objectid == BTRFS_FREE_SPACE_OBJECTID)
 		goto out;
 
-	if (key->type == BTRFS_INODE_ITEM_KEY)
+	if (key->type == BTRFS_INODE_ITEM_KEY) {
 		ret = changed_inode(sctx, result);
-	else if (key->type == BTRFS_INODE_REF_KEY ||
-		 key->type == BTRFS_INODE_EXTREF_KEY)
-		ret = changed_ref(sctx, result);
-	else if (key->type == BTRFS_XATTR_ITEM_KEY)
-		ret = changed_xattr(sctx, result);
-	else if (key->type == BTRFS_EXTENT_DATA_KEY)
-		ret = changed_extent(sctx, result);
+	} else if (!sctx->ignore_cur_inode) {
+		if (key->type == BTRFS_INODE_REF_KEY ||
+		    key->type == BTRFS_INODE_EXTREF_KEY)
+			ret = changed_ref(sctx, result);
+		else if (key->type == BTRFS_XATTR_ITEM_KEY)
+			ret = changed_xattr(sctx, result);
+		else if (key->type == BTRFS_EXTENT_DATA_KEY)
+			ret = changed_extent(sctx, result);
+	}
 
 out:
 	return ret;
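The precondition this commit message relies on, an inode whose link count is zero while a file descriptor still holds it open, is ordinary POSIX behaviour and can be observed from userspace. The helper below is a minimal sketch, not part of the patch: `nlink_after_unlink()` is a hypothetical name, the `/tmp` path is an assumption, and the value returned depends on the filesystem backing that directory. Common local Linux filesystems are expected to report 0, but some others (network filesystems, for example) may not.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a temporary file, remove its only name while keeping the
 * descriptor open, and return the link count that fstat() reports.
 * Returns -1 if any step fails.
 */
static long nlink_after_unlink(void)
{
	char path[] = "/tmp/orphan-demo-XXXXXX";	/* assumed writable */
	struct stat st;
	long nlink = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	/* The file has no name after unlink(), but the inode is still open. */
	if (unlink(path) == 0 && fstat(fd, &st) == 0)
		nlink = (long)st.st_nlink;
	close(fd);
	return nlink;
}
```

This is the same state the shell reproducers create with `exec 73<...` followed by `unlink`: the inode stays alive because of the open descriptor, yet it has no name from which send could build a path.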


FAILED: patch "[PATCH] Btrfs: fix send failure when root has deleted files still" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 46b2f4590aab71d31088a265c86026b1e96c9de4 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 24 Jul 2018 11:54:04 +0100
Subject: [PATCH] Btrfs: fix send failure when root has deleted files still
 open

The more common use case of send involves creating a RO snapshot and then
use it for a send operation. In this case it's not possible to have inodes
in the snapshot that have a link count of zero (inode with an orphan item)
since during snapshot creation we do the orphan cleanup. However, other
less common use cases for send can end up seeing inodes with a link count
of zero and in this case the send operation fails with a ENOENT error
because any attempt to generate a path for the inode, with the purpose of
creating it or updating it at the receiver, fails since there are no inode
reference items. One use case it to use a regular subvolume for a send
operation after turning it to RO mode or turning a RW snapshot into RO
mode and then using it for a send operation. In both cases, if a file gets
all its hard links deleted while there is an open file descriptor before
turning the subvolume/snapshot into RO mode, the send operation will
encounter an inode with a link count of zero and then fail with errno
ENOENT.

Example using a full send with a subvolume:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ btrfs subvolume create /mnt/sv1
  $ touch /mnt/sv1/foo
  $ touch /mnt/sv1/bar

  # keep an open file descriptor on file bar
  $ exec 73</mnt/sv1/bar
  $ unlink /mnt/sv1/bar

  # Turn the subvolume to RO mode and use it for a full send, while
  # holding the open file descriptor.
  $ btrfs property set /mnt/sv1 ro true

  $ btrfs send -f /tmp/full.send /mnt/sv1
  At subvol /mnt/sv1
  ERROR: send ioctl failed with -2: No such file or directory

Example using an incremental send with snapshots:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ btrfs subvolume create /mnt/sv1
  $ touch /mnt/sv1/foo
  $ touch /mnt/sv1/bar

  $ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap1

  $ echo "hello world" >> /mnt/sv1/bar

  $ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap2

  # Turn the second snapshot to RW mode and delete file foo while
  # holding an open file descriptor on it.
  $ btrfs property set /mnt/snap2 ro false
  $ exec 73</mnt/snap2/foo
  $ unlink /mnt/snap2/foo

  # Set the second snapshot back to RO mode and do an incremental send.
  $ btrfs property set /mnt/snap2 ro true

  $ btrfs send -f /tmp/inc.send -p /mnt/snap1 /mnt/snap2
  At subvol /mnt/snap2
  ERROR: send ioctl failed with -2: No such file or directory

So fix this by ignoring inodes with a link count of zero if we are either
doing a full send or if they do not exist in the parent snapshot (they
are new in the send snapshot), and unlink all paths found in the parent
snapshot when doing an incremental send (and ignoring all other inode
items, such as xattrs and extents).

A test case for fstests follows soon.

CC: stable(a)vger.kernel.org # 4.4+
Reported-by: Martin Wilck <martin.wilck(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 42e04cd3cd95..551294a6c9e2 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -100,6 +100,7 @@ struct send_ctx {
 	u64 cur_inode_rdev;
 	u64 cur_inode_last_extent;
 	u64 cur_inode_next_write_offset;
+	bool ignore_cur_inode;
 
 	u64 send_progress;
 
@@ -5796,6 +5797,9 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
 	int pending_move = 0;
 	int refs_processed = 0;
 
+	if (sctx->ignore_cur_inode)
+		return 0;
+
 	ret = process_recorded_refs_if_needed(sctx, at_end, &pending_move,
 					      &refs_processed);
 	if (ret < 0)
@@ -5914,6 +5918,93 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
 	return ret;
 }
 
+struct parent_paths_ctx {
+	struct list_head *refs;
+	struct send_ctx *sctx;
+};
+
+static int record_parent_ref(int num, u64 dir, int index, struct fs_path *name,
+			     void *ctx)
+{
+	struct parent_paths_ctx *ppctx = ctx;
+
+	return record_ref(ppctx->sctx->parent_root, dir, name, ppctx->sctx,
+			  ppctx->refs);
+}
+
+/*
+ * Issue unlink operations for all paths of the current inode found in the
+ * parent snapshot.
+ */
+static int btrfs_unlink_all_paths(struct send_ctx *sctx)
+{
+	LIST_HEAD(deleted_refs);
+	struct btrfs_path *path;
+	struct btrfs_key key;
+	struct parent_paths_ctx ctx;
+	int ret;
+
+	path = alloc_path_for_send();
+	if (!path)
+		return -ENOMEM;
+
+	key.objectid = sctx->cur_ino;
+	key.type = BTRFS_INODE_REF_KEY;
+	key.offset = 0;
+	ret = btrfs_search_slot(NULL, sctx->parent_root, &key, path, 0, 0);
+	if (ret < 0)
+		goto out;
+
+	ctx.refs = &deleted_refs;
+	ctx.sctx = sctx;
+
+	while (true) {
+		struct extent_buffer *eb = path->nodes[0];
+		int slot = path->slots[0];
+
+		if (slot >= btrfs_header_nritems(eb)) {
+			ret = btrfs_next_leaf(sctx->parent_root, path);
+			if (ret < 0)
+				goto out;
+			else if (ret > 0)
+				break;
+			continue;
+		}
+
+		btrfs_item_key_to_cpu(eb, &key, slot);
+		if (key.objectid != sctx->cur_ino)
+			break;
+		if (key.type != BTRFS_INODE_REF_KEY &&
+		    key.type != BTRFS_INODE_EXTREF_KEY)
+			break;
+
+		ret = iterate_inode_ref(sctx->parent_root, path, &key, 1,
+					record_parent_ref, &ctx);
+		if (ret < 0)
+			goto out;
+
+		path->slots[0]++;
+	}
+
+	while (!list_empty(&deleted_refs)) {
+		struct recorded_ref *ref;
+
+		ref = list_first_entry(&deleted_refs, struct recorded_ref, list);
+		ret = send_unlink(sctx, ref->full_path);
+		if (ret < 0)
+			goto out;
+		fs_path_free(ref->full_path);
+		list_del(&ref->list);
+		kfree(ref);
+	}
+	ret = 0;
+out:
+	btrfs_free_path(path);
+	if (ret)
+		__free_recorded_refs(&deleted_refs);
+	return ret;
+}
+
 static int changed_inode(struct send_ctx *sctx,
 			 enum btrfs_compare_tree_result result)
 {
@@ -5928,6 +6019,7 @@ static int changed_inode(struct send_ctx *sctx,
 	sctx->cur_inode_new_gen = 0;
 	sctx->cur_inode_last_extent = (u64)-1;
 	sctx->cur_inode_next_write_offset = 0;
+	sctx->ignore_cur_inode = false;
 
 	/*
 	 * Set send_progress to current inode. This will tell all get_cur_xxx
@@ -5968,6 +6060,33 @@ static int changed_inode(struct send_ctx *sctx,
 			sctx->cur_inode_new_gen = 1;
 	}
 
+	/*
+	 * Normally we do not find inodes with a link count of zero (orphans)
+	 * because the most common case is to create a snapshot and use it
+	 * for a send operation. However other less common use cases involve
+	 * using a subvolume and send it after turning it to RO mode just
+	 * after deleting all hard links of a file while holding an open
+	 * file descriptor against it or turning a RO snapshot into RW mode,
+	 * keep an open file descriptor against a file, delete it and then
+	 * turn the snapshot back to RO mode before using it for a send
+	 * operation. So if we find such cases, ignore the inode and all its
+	 * items completely if it's a new inode, or if it's a changed inode
+	 * make sure all its previous paths (from the parent snapshot) are all
+	 * unlinked and all other the inode items are ignored.
+	 */
+	if (result == BTRFS_COMPARE_TREE_NEW ||
+	    result == BTRFS_COMPARE_TREE_CHANGED) {
+		u32 nlinks;
+
+		nlinks = btrfs_inode_nlink(sctx->left_path->nodes[0], left_ii);
+		if (nlinks == 0) {
+			sctx->ignore_cur_inode = true;
+			if (result == BTRFS_COMPARE_TREE_CHANGED)
+				ret = btrfs_unlink_all_paths(sctx);
+			goto out;
+		}
+	}
+
 	if (result == BTRFS_COMPARE_TREE_NEW) {
 		sctx->cur_inode_gen = left_gen;
 		sctx->cur_inode_new = 1;
@@ -6306,15 +6425,17 @@ static int changed_cb(struct btrfs_path *left_path,
 	    key->objectid == BTRFS_FREE_SPACE_OBJECTID)
 		goto out;
 
-	if (key->type == BTRFS_INODE_ITEM_KEY)
+	if (key->type == BTRFS_INODE_ITEM_KEY) {
 		ret = changed_inode(sctx, result);
-	else if (key->type == BTRFS_INODE_REF_KEY ||
-		 key->type == BTRFS_INODE_EXTREF_KEY)
-		ret = changed_ref(sctx, result);
-	else if (key->type == BTRFS_XATTR_ITEM_KEY)
-		ret = changed_xattr(sctx, result);
-	else if (key->type == BTRFS_EXTENT_DATA_KEY)
-		ret = changed_extent(sctx, result);
+	} else if (!sctx->ignore_cur_inode) {
+		if (key->type == BTRFS_INODE_REF_KEY ||
+		    key->type == BTRFS_INODE_EXTREF_KEY)
+			ret = changed_ref(sctx, result);
+		else if (key->type == BTRFS_XATTR_ITEM_KEY)
+			ret = changed_xattr(sctx, result);
+		else if (key->type == BTRFS_EXTENT_DATA_KEY)
+			ret = changed_extent(sctx, result);
+	}
 
 out:
 	return ret;


FAILED: patch "[PATCH] Btrfs: fix send failure when root has deleted files still" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 46b2f4590aab71d31088a265c86026b1e96c9de4 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 24 Jul 2018 11:54:04 +0100
Subject: [PATCH] Btrfs: fix send failure when root has deleted files still
 open

The more common use case of send involves creating a RO snapshot and then
use it for a send operation. In this case it's not possible to have inodes
in the snapshot that have a link count of zero (inode with an orphan item)
since during snapshot creation we do the orphan cleanup. However, other
less common use cases for send can end up seeing inodes with a link count
of zero and in this case the send operation fails with a ENOENT error
because any attempt to generate a path for the inode, with the purpose of
creating it or updating it at the receiver, fails since there are no inode
reference items. One use case it to use a regular subvolume for a send
operation after turning it to RO mode or turning a RW snapshot into RO
mode and then using it for a send operation. In both cases, if a file gets
all its hard links deleted while there is an open file descriptor before
turning the subvolume/snapshot into RO mode, the send operation will
encounter an inode with a link count of zero and then fail with errno
ENOENT.

Example using a full send with a subvolume:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ btrfs subvolume create /mnt/sv1
  $ touch /mnt/sv1/foo
  $ touch /mnt/sv1/bar

  # keep an open file descriptor on file bar
  $ exec 73</mnt/sv1/bar
  $ unlink /mnt/sv1/bar

  # Turn the subvolume to RO mode and use it for a full send, while
  # holding the open file descriptor.
  $ btrfs property set /mnt/sv1 ro true

  $ btrfs send -f /tmp/full.send /mnt/sv1
  At subvol /mnt/sv1
  ERROR: send ioctl failed with -2: No such file or directory

Example using an incremental send with snapshots:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ btrfs subvolume create /mnt/sv1
  $ touch /mnt/sv1/foo
  $ touch /mnt/sv1/bar

  $ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap1

  $ echo "hello world" >> /mnt/sv1/bar

  $ btrfs subvolume snapshot -r /mnt/sv1 /mnt/snap2

  # Turn the second snapshot to RW mode and delete file foo while
  # holding an open file descriptor on it.
  $ btrfs property set /mnt/snap2 ro false
  $ exec 73</mnt/snap2/foo
  $ unlink /mnt/snap2/foo

  # Set the second snapshot back to RO mode and do an incremental send.
  $ btrfs property set /mnt/snap2 ro true

  $ btrfs send -f /tmp/inc.send -p /mnt/snap1 /mnt/snap2
  At subvol /mnt/snap2
  ERROR: send ioctl failed with -2: No such file or directory

So fix this by ignoring inodes with a link count of zero if we are either
doing a full send or if they do not exist in the parent snapshot (they
are new in the send snapshot), and unlink all paths found in the parent
snapshot when doing an incremental send (and ignoring all other inode
items, such as xattrs and extents).

A test case for fstests follows soon.

CC: stable(a)vger.kernel.org # 4.4+
Reported-by: Martin Wilck <martin.wilck(a)suse.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 42e04cd3cd95..551294a6c9e2 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -100,6 +100,7 @@ struct send_ctx {
 	u64 cur_inode_rdev;
 	u64 cur_inode_last_extent;
 	u64 cur_inode_next_write_offset;
+	bool ignore_cur_inode;
 
 	u64 send_progress;
 
@@ -5796,6 +5797,9 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
 	int pending_move = 0;
 	int refs_processed = 0;
 
+	if (sctx->ignore_cur_inode)
+		return 0;
+
 	ret = process_recorded_refs_if_needed(sctx, at_end, &pending_move,
 					      &refs_processed);
 	if (ret < 0)
@@ -5914,6 +5918,93 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end)
 	return ret;
 }
 
+struct parent_paths_ctx {
+	struct list_head *refs;
+	struct send_ctx *sctx;
+};
+
+static int record_parent_ref(int num, u64 dir, int index, struct fs_path *name,
+			     void *ctx)
+{
+	struct parent_paths_ctx *ppctx = ctx;
+
+	return record_ref(ppctx->sctx->parent_root, dir, name, ppctx->sctx,
+			  ppctx->refs);
+}
+
+/*
+ * Issue unlink operations for all paths of the current inode found in the
+ * parent snapshot.
+ */
+static int btrfs_unlink_all_paths(struct send_ctx *sctx)
+{
+	LIST_HEAD(deleted_refs);
+	struct btrfs_path *path;
+	struct btrfs_key key;
+	struct parent_paths_ctx ctx;
+	int ret;
+
+	path = alloc_path_for_send();
+	if (!path)
+		return -ENOMEM;
+
+	key.objectid = sctx->cur_ino;
+	key.type = BTRFS_INODE_REF_KEY;
+	key.offset = 0;
+	ret = btrfs_search_slot(NULL, sctx->parent_root, &key, path, 0, 0);
+	if (ret < 0)
+		goto out;
+
+	ctx.refs = &deleted_refs;
+	ctx.sctx = sctx;
+
+	while (true) {
+		struct extent_buffer *eb = path->nodes[0];
+		int slot = path->slots[0];
+
+		if (slot >= btrfs_header_nritems(eb)) {
+			ret = btrfs_next_leaf(sctx->parent_root, path);
+			if (ret < 0)
+				goto out;
+			else if (ret > 0)
+				break;
+			continue;
+		}
+
+		btrfs_item_key_to_cpu(eb, &key, slot);
+		if (key.objectid != sctx->cur_ino)
+			break;
+		if (key.type != BTRFS_INODE_REF_KEY &&
+		    key.type != BTRFS_INODE_EXTREF_KEY)
+			break;
+
+		ret = iterate_inode_ref(sctx->parent_root, path, &key, 1,
+					record_parent_ref, &ctx);
+		if (ret < 0)
+			goto out;
+
+		path->slots[0]++;
+	}
+
+	while (!list_empty(&deleted_refs)) {
+		struct recorded_ref *ref;
+
+		ref = list_first_entry(&deleted_refs, struct recorded_ref, list);
+		ret = send_unlink(sctx, ref->full_path);
+		if (ret < 0)
+			goto out;
+		fs_path_free(ref->full_path);
+		list_del(&ref->list);
+		kfree(ref);
+	}
+	ret = 0;
+out:
+	btrfs_free_path(path);
+	if (ret)
+		__free_recorded_refs(&deleted_refs);
+	return ret;
+}
+
 static int changed_inode(struct send_ctx *sctx,
 			 enum btrfs_compare_tree_result result)
 {
@@ -5928,6 +6019,7 @@ static int changed_inode(struct send_ctx *sctx,
 	sctx->cur_inode_new_gen = 0;
 	sctx->cur_inode_last_extent = (u64)-1;
 	sctx->cur_inode_next_write_offset = 0;
+	sctx->ignore_cur_inode = false;
 
 	/*
 	 * Set send_progress to current inode. This will tell all get_cur_xxx
@@ -5968,6 +6060,33 @@ static int changed_inode(struct send_ctx *sctx,
 			sctx->cur_inode_new_gen = 1;
 	}
 
+	/*
+	 * Normally we do not find inodes with a link count of zero (orphans)
+	 * because the most common case is to create a snapshot and use it
+	 * for a send operation. However other less common use cases involve
+	 * using a subvolume and send it after turning it to RO mode just
+	 * after deleting all hard links of a file while holding an open
+	 * file descriptor against it or turning a RO snapshot into RW mode,
+	 * keep an open file descriptor against a file, delete it and then
+	 * turn the snapshot back to RO mode before using it for a send
+	 * operation. So if we find such cases, ignore the inode and all its
+	 * items completely if it's a new inode, or if it's a changed inode
+	 * make sure all its previous paths (from the parent snapshot) are all
+	 * unlinked and all other the inode items are ignored.
+	 */
+	if (result == BTRFS_COMPARE_TREE_NEW ||
+	    result == BTRFS_COMPARE_TREE_CHANGED) {
+		u32 nlinks;
+
+		nlinks = btrfs_inode_nlink(sctx->left_path->nodes[0], left_ii);
+		if (nlinks == 0) {
+			sctx->ignore_cur_inode = true;
+			if (result == BTRFS_COMPARE_TREE_CHANGED)
+				ret = btrfs_unlink_all_paths(sctx);
+			goto out;
+		}
+	}
+
 	if (result == BTRFS_COMPARE_TREE_NEW) {
 		sctx->cur_inode_gen = left_gen;
 		sctx->cur_inode_new = 1;
@@ -6306,15 +6425,17 @@ static int changed_cb(struct btrfs_path *left_path,
 	    key->objectid == BTRFS_FREE_SPACE_OBJECTID)
 		goto out;
 
-	if (key->type == BTRFS_INODE_ITEM_KEY)
+	if (key->type == BTRFS_INODE_ITEM_KEY) {
 		ret = changed_inode(sctx, result);
-	else if (key->type == BTRFS_INODE_REF_KEY ||
-		 key->type == BTRFS_INODE_EXTREF_KEY)
-		ret = changed_ref(sctx, result);
-	else if (key->type == BTRFS_XATTR_ITEM_KEY)
-		ret = changed_xattr(sctx, result);
-	else if (key->type == BTRFS_EXTENT_DATA_KEY)
-		ret = changed_extent(sctx, result);
+	} else if (!sctx->ignore_cur_inode) {
+		if (key->type == BTRFS_INODE_REF_KEY ||
+		    key->type == BTRFS_INODE_EXTREF_KEY)
+			ret = changed_ref(sctx, result);
+		else if (key->type == BTRFS_XATTR_ITEM_KEY)
+			ret = changed_xattr(sctx, result);
+		else if (key->type == BTRFS_EXTENT_DATA_KEY)
+			ret = changed_extent(sctx, result);
+	}
 
 out:
 	return ret;


FAILED: patch "[PATCH] Btrfs: fix btrfs_write_inode vs delayed iput deadlock" failed to apply to 3.18-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 3.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 3c4276936f6fbe52884b4ea4e6cc120b890a0f9f Mon Sep 17 00:00:00 2001
From: Josef Bacik <jbacik(a)fb.com>
Date: Fri, 20 Jul 2018 11:46:10 -0700
Subject: [PATCH] Btrfs: fix btrfs_write_inode vs delayed iput deadlock

We recently ran into the following deadlock involving
btrfs_write_inode():

[  +0.005066]  __schedule+0x38e/0x8c0
[  +0.007144]  schedule+0x36/0x80
[  +0.006447]  bit_wait+0x11/0x60
[  +0.006446]  __wait_on_bit+0xbe/0x110
[  +0.007487]  ? bit_wait_io+0x60/0x60
[  +0.007319]  __inode_wait_for_writeback+0x96/0xc0
[  +0.009568]  ? autoremove_wake_function+0x40/0x40
[  +0.009565]  inode_wait_for_writeback+0x21/0x30
[  +0.009224]  evict+0xb0/0x190
[  +0.006099]  iput+0x1a8/0x210
[  +0.006103]  btrfs_run_delayed_iputs+0x73/0xc0
[  +0.009047]  btrfs_commit_transaction+0x799/0x8c0
[  +0.009567]  btrfs_write_inode+0x81/0xb0
[  +0.008008]  __writeback_single_inode+0x267/0x320
[  +0.009569]  writeback_sb_inodes+0x25b/0x4e0
[  +0.008702]  wb_writeback+0x102/0x2d0
[  +0.007487]  wb_workfn+0xa4/0x310
[  +0.006794]  ? wb_workfn+0xa4/0x310
[  +0.007143]  process_one_work+0x150/0x410
[  +0.008179]  worker_thread+0x6d/0x520
[  +0.007490]  kthread+0x12c/0x160
[  +0.006620]  ? put_pwq_unlocked+0x80/0x80
[  +0.008185]  ? kthread_park+0xa0/0xa0
[  +0.007484]  ? do_syscall_64+0x53/0x150
[  +0.007837]  ret_from_fork+0x29/0x40

Writeback calls:

btrfs_write_inode
  btrfs_commit_transaction
    btrfs_run_delayed_iputs

If iput() is called on that same inode, evict() will wait for writeback
forever.

btrfs_write_inode() was originally added way back in 4730a4bc5bf3
("btrfs_dirty_inode") to support O_SYNC writes. However, ->write_inode()
hasn't been used for O_SYNC since 148f948ba877 ("vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode"), so
btrfs_write_inode() is actually unnecessary (and leads to a bunch of
unnecessary commits). Get rid of it, which also gets rid of the
deadlock.

CC: stable(a)vger.kernel.org # 3.2+
Signed-off-by: Josef Bacik <jbacik(a)fb.com>
[Omar: new commit message]
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4955e04da4c8..472457795486 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6021,32 +6021,6 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
 	return ret;
 }
 
-int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc)
-{
-	struct btrfs_root *root = BTRFS_I(inode)->root;
-	struct btrfs_trans_handle *trans;
-	int ret = 0;
-	bool nolock = false;
-
-	if (test_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags))
-		return 0;
-
-	if (btrfs_fs_closing(root->fs_info) &&
-			btrfs_is_free_space_inode(BTRFS_I(inode)))
-		nolock = true;
-
-	if (wbc->sync_mode == WB_SYNC_ALL) {
-		if (nolock)
-			trans = btrfs_join_transaction_nolock(root);
-		else
-			trans = btrfs_join_transaction(root);
-		if (IS_ERR(trans))
-			return PTR_ERR(trans);
-		ret = btrfs_commit_transaction(trans);
-	}
-	return ret;
-}
-
 /*
  * This is somewhat expensive, updating the tree every time the
  * inode changes.  But, it is most likely to find the inode in cache.
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index efe8b03ce380..67de3c0fc85b 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2344,7 +2344,6 @@ static const struct super_operations btrfs_super_ops = {
 	.sync_fs	= btrfs_sync_fs,
 	.show_options	= btrfs_show_options,
 	.show_devname	= btrfs_show_devname,
-	.write_inode	= btrfs_write_inode,
 	.alloc_inode	= btrfs_alloc_inode,
 	.destroy_inode	= btrfs_destroy_inode,
 	.statfs		= btrfs_statfs,

7 years, 3 months


FAILED: patch "[PATCH] Btrfs: fix btrfs_write_inode vs delayed iput deadlock" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 3c4276936f6fbe52884b4ea4e6cc120b890a0f9f Mon Sep 17 00:00:00 2001
From: Josef Bacik <jbacik(a)fb.com>
Date: Fri, 20 Jul 2018 11:46:10 -0700
Subject: [PATCH] Btrfs: fix btrfs_write_inode vs delayed iput deadlock

We recently ran into the following deadlock involving
btrfs_write_inode():

[ +0.005066] __schedule+0x38e/0x8c0
[ +0.007144] schedule+0x36/0x80
[ +0.006447] bit_wait+0x11/0x60
[ +0.006446] __wait_on_bit+0xbe/0x110
[ +0.007487] ? bit_wait_io+0x60/0x60
[ +0.007319] __inode_wait_for_writeback+0x96/0xc0
[ +0.009568] ? autoremove_wake_function+0x40/0x40
[ +0.009565] inode_wait_for_writeback+0x21/0x30
[ +0.009224] evict+0xb0/0x190
[ +0.006099] iput+0x1a8/0x210
[ +0.006103] btrfs_run_delayed_iputs+0x73/0xc0
[ +0.009047] btrfs_commit_transaction+0x799/0x8c0
[ +0.009567] btrfs_write_inode+0x81/0xb0
[ +0.008008] __writeback_single_inode+0x267/0x320
[ +0.009569] writeback_sb_inodes+0x25b/0x4e0
[ +0.008702] wb_writeback+0x102/0x2d0
[ +0.007487] wb_workfn+0xa4/0x310
[ +0.006794] ? wb_workfn+0xa4/0x310
[ +0.007143] process_one_work+0x150/0x410
[ +0.008179] worker_thread+0x6d/0x520
[ +0.007490] kthread+0x12c/0x160
[ +0.006620] ? put_pwq_unlocked+0x80/0x80
[ +0.008185] ? kthread_park+0xa0/0xa0
[ +0.007484] ? do_syscall_64+0x53/0x150
[ +0.007837] ret_from_fork+0x29/0x40

Writeback calls:

btrfs_write_inode
  btrfs_commit_transaction
    btrfs_run_delayed_iputs

If iput() is called on that same inode, evict() will wait for writeback
forever.

btrfs_write_inode() was originally added way back in 4730a4bc5bf3
("btrfs_dirty_inode") to support O_SYNC writes. However, ->write_inode()
hasn't been used for O_SYNC since 148f948ba877 ("vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode"), so
btrfs_write_inode() is actually unnecessary (and leads to a bunch of
unnecessary commits). Get rid of it, which also gets rid of the
deadlock.

CC: stable(a)vger.kernel.org # 3.2+
Signed-off-by: Josef Bacik <jbacik(a)fb.com>
[Omar: new commit message]
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4955e04da4c8..472457795486 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6021,32 +6021,6 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
 	return ret;
 }
 
-int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc)
-{
-	struct btrfs_root *root = BTRFS_I(inode)->root;
-	struct btrfs_trans_handle *trans;
-	int ret = 0;
-	bool nolock = false;
-
-	if (test_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags))
-		return 0;
-
-	if (btrfs_fs_closing(root->fs_info) &&
-			btrfs_is_free_space_inode(BTRFS_I(inode)))
-		nolock = true;
-
-	if (wbc->sync_mode == WB_SYNC_ALL) {
-		if (nolock)
-			trans = btrfs_join_transaction_nolock(root);
-		else
-			trans = btrfs_join_transaction(root);
-		if (IS_ERR(trans))
-			return PTR_ERR(trans);
-		ret = btrfs_commit_transaction(trans);
-	}
-	return ret;
-}
-
 /*
  * This is somewhat expensive, updating the tree every time the
  * inode changes. But, it is most likely to find the inode in cache.
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index efe8b03ce380..67de3c0fc85b 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2344,7 +2344,6 @@ static const struct super_operations btrfs_super_ops = {
 	.sync_fs	= btrfs_sync_fs,
 	.show_options	= btrfs_show_options,
 	.show_devname	= btrfs_show_devname,
-	.write_inode	= btrfs_write_inode,
 	.alloc_inode	= btrfs_alloc_inode,
 	.destroy_inode	= btrfs_destroy_inode,
 	.statfs		= btrfs_statfs,

7 years, 3 months


FAILED: patch "[PATCH] Btrfs: fix btrfs_write_inode vs delayed iput deadlock" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 3c4276936f6fbe52884b4ea4e6cc120b890a0f9f Mon Sep 17 00:00:00 2001
From: Josef Bacik <jbacik(a)fb.com>
Date: Fri, 20 Jul 2018 11:46:10 -0700
Subject: [PATCH] Btrfs: fix btrfs_write_inode vs delayed iput deadlock

We recently ran into the following deadlock involving
btrfs_write_inode():

[ +0.005066] __schedule+0x38e/0x8c0
[ +0.007144] schedule+0x36/0x80
[ +0.006447] bit_wait+0x11/0x60
[ +0.006446] __wait_on_bit+0xbe/0x110
[ +0.007487] ? bit_wait_io+0x60/0x60
[ +0.007319] __inode_wait_for_writeback+0x96/0xc0
[ +0.009568] ? autoremove_wake_function+0x40/0x40
[ +0.009565] inode_wait_for_writeback+0x21/0x30
[ +0.009224] evict+0xb0/0x190
[ +0.006099] iput+0x1a8/0x210
[ +0.006103] btrfs_run_delayed_iputs+0x73/0xc0
[ +0.009047] btrfs_commit_transaction+0x799/0x8c0
[ +0.009567] btrfs_write_inode+0x81/0xb0
[ +0.008008] __writeback_single_inode+0x267/0x320
[ +0.009569] writeback_sb_inodes+0x25b/0x4e0
[ +0.008702] wb_writeback+0x102/0x2d0
[ +0.007487] wb_workfn+0xa4/0x310
[ +0.006794] ? wb_workfn+0xa4/0x310
[ +0.007143] process_one_work+0x150/0x410
[ +0.008179] worker_thread+0x6d/0x520
[ +0.007490] kthread+0x12c/0x160
[ +0.006620] ? put_pwq_unlocked+0x80/0x80
[ +0.008185] ? kthread_park+0xa0/0xa0
[ +0.007484] ? do_syscall_64+0x53/0x150
[ +0.007837] ret_from_fork+0x29/0x40

Writeback calls:

btrfs_write_inode
  btrfs_commit_transaction
    btrfs_run_delayed_iputs

If iput() is called on that same inode, evict() will wait for writeback
forever.

btrfs_write_inode() was originally added way back in 4730a4bc5bf3
("btrfs_dirty_inode") to support O_SYNC writes. However, ->write_inode()
hasn't been used for O_SYNC since 148f948ba877 ("vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode"), so
btrfs_write_inode() is actually unnecessary (and leads to a bunch of
unnecessary commits). Get rid of it, which also gets rid of the
deadlock.

CC: stable(a)vger.kernel.org # 3.2+
Signed-off-by: Josef Bacik <jbacik(a)fb.com>
[Omar: new commit message]
Signed-off-by: Omar Sandoval <osandov(a)fb.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4955e04da4c8..472457795486 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6021,32 +6021,6 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
 	return ret;
 }
 
-int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc)
-{
-	struct btrfs_root *root = BTRFS_I(inode)->root;
-	struct btrfs_trans_handle *trans;
-	int ret = 0;
-	bool nolock = false;
-
-	if (test_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags))
-		return 0;
-
-	if (btrfs_fs_closing(root->fs_info) &&
-			btrfs_is_free_space_inode(BTRFS_I(inode)))
-		nolock = true;
-
-	if (wbc->sync_mode == WB_SYNC_ALL) {
-		if (nolock)
-			trans = btrfs_join_transaction_nolock(root);
-		else
-			trans = btrfs_join_transaction(root);
-		if (IS_ERR(trans))
-			return PTR_ERR(trans);
-		ret = btrfs_commit_transaction(trans);
-	}
-	return ret;
-}
-
 /*
  * This is somewhat expensive, updating the tree every time the
  * inode changes. But, it is most likely to find the inode in cache.
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index efe8b03ce380..67de3c0fc85b 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2344,7 +2344,6 @@ static const struct super_operations btrfs_super_ops = {
 	.sync_fs	= btrfs_sync_fs,
 	.show_options	= btrfs_show_options,
 	.show_devname	= btrfs_show_devname,
-	.write_inode	= btrfs_write_inode,
 	.alloc_inode	= btrfs_alloc_inode,
 	.destroy_inode	= btrfs_destroy_inode,
 	.statfs		= btrfs_statfs,

7 years, 3 months


[gregkh@linuxfoundation.org: Patch "drm: re-enable error handling" has been added to the 3.18-stable tree]

by Nicholas Mc Guire

Hi!

This is also the wrong version of the patch; the proper version is
the one referenced below. It has been posted to lkml
 https://lkml.org/lkml/2018/7/18/191
 "[PATCH V3] drm: handle error values properly"
but there has been no review yet.

The version you have here, though, is for sure broken. So maybe this
should simply be dropped until the above, presumably correct, fix is
confirmed.

thx!
hofrat

----- Forwarded message from gregkh(a)linuxfoundation.org -----

Date: Tue, 28 Aug 2018 16:09:59 +0200
From: gregkh(a)linuxfoundation.org
To: 1531571532-22733-1-git-send-email-hofrat(a)osadl.org, alexander.levin(a)microsoft.com, gregkh(a)linuxfoundation.org, hofrat(a)osadl.org, seanpaul(a)chromium.org
Cc: stable-commits(a)vger.kernel.org
Subject: Patch "drm: re-enable error handling" has been added to the 3.18-stable tree

This is a note to let you know that I've just added the patch titled

    drm: re-enable error handling

to the 3.18-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…

The filename of the patch is:
     drm-re-enable-error-handling.patch
and it can be found in the queue-3.18 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.

From foo@baz Tue Aug 28 16:08:28 CEST 2018
From: Nicholas Mc Guire <hofrat(a)osadl.org>
Date: Sat, 14 Jul 2018 14:32:12 +0200
Subject: drm: re-enable error handling

From: Nicholas Mc Guire <hofrat(a)osadl.org>

[ Upstream commit d530b5f1ca0bb66958a2b714bebe40a1248b9c15 ]

drm_legacy_ctxbitmap_next() returns idr_alloc() which can return
-ENOMEM, -EINVAL or -ENOSPC none of which are -1 . but the call sites
of drm_legacy_ctxbitmap_next() seem to be assuming that the error case
would be -1 (original return of drm_ctxbitmap_next() prior to 2.6.23
was actually -1). Thus reenable error handling by checking for < 0.

Signed-off-by: Nicholas Mc Guire <hofrat(a)osadl.org>
Fixes: 62968144e673 ("drm: convert drm context code to use Linux idr")
Signed-off-by: Sean Paul <seanpaul(a)chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1531571532-22733-1-git-send-e…
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
 drivers/gpu/drm/drm_context.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/drm_context.c
+++ b/drivers/gpu/drm/drm_context.c
@@ -341,7 +341,7 @@ int drm_legacy_addctx(struct drm_device
 		ctx->handle = drm_legacy_ctxbitmap_next(dev);
 	}
 	DRM_DEBUG("%d\n", ctx->handle);
-	if (ctx->handle == -1) {
+	if (ctx->handle < 0) {
 		DRM_DEBUG("Not enough free contexts.\n");
 		/* Should this return -EBUSY instead? */
 		return -ENOMEM;

Patches currently in stable-queue which might be from hofrat(a)osadl.org are

queue-3.18/drm-re-enable-error-handling.patch
queue-3.18/can-mpc5xxx_can-check-of_iomap-return-before-use.patch

----- End forwarded message -----
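
To illustrate the point made in the quoted commit message, here is a minimal userspace sketch (not kernel code) of why comparing against -1 misses negative errno returns. The names alloc_handle(), failed_old() and failed_new() are hypothetical stand-ins invented for this example; only the two comparisons mirror the quoted diff.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for an allocator that, like idr_alloc(), returns a
 * non-negative handle on success or a negative errno value on failure.
 */
static int alloc_handle(bool fail_with_enospc)
{
	return fail_with_enospc ? -ENOSPC : 7;
}

/* Check as written before the patch: only -1 counts as a failure. */
static bool failed_old(int handle)
{
	return handle == -1;
}

/* Check as written after the patch: any negative value is a failure. */
static bool failed_new(int handle)
{
	return handle < 0;
}
```

With a failing allocation, failed_old() reports success because -ENOSPC is not -1, while failed_new() catches it. Whether this is the right fix for the DRM call sites is exactly what the thread above is still debating.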

7 years, 3 months


[PATCH] x86/nmi: Fix some races in NMI uaccess

by Andy Lutomirski

In NMI context, we might be in the middle of context switching or in
the middle of switch_mm_irqs_off(). In either case, CR3 might not
match current->mm, which could cause copy_from_user_nmi() and
friends to read the wrong memory.

Fix it by adding a new nmi_uaccess_okay() helper and checking it in
copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.

Cc: stable(a)vger.kernel.org
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
---

The 0day bot is still chewing on this, but I've tested it a bit locally
and it seems to do the right thing. I've never observed the bug it
fixes, but it does appear to fix a bug unless I've missed something.
It's also a prerequisite for Nadav's fixmap bugfix.

 arch/x86/events/core.c          |  2 +-
 arch/x86/include/asm/tlbflush.h | 16 ++++++++++++++++
 arch/x86/lib/usercopy.c         |  5 +++++
 arch/x86/mm/tlb.c               |  3 +++
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 5f4829f10129..dfb2f7c0d019 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2465,7 +2465,7 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
 
 	perf_callchain_store(entry, regs->ip);
 
-	if (!current->mm)
+	if (!nmi_uaccess_okay())
 		return;
 
 	if (perf_callchain_user32(regs, entry))
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 89a73bc31622..b23b2625793b 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -230,6 +230,22 @@ struct tlb_state {
 };
 DECLARE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate);
 
+/*
+ * Blindly accessing user memory from NMI context can be dangerous
+ * if we're in the middle of switching the current user task or
+ * switching the loaded mm. It can also be dangerous if we
+ * interrupted some kernel code that was temporarily using a
+ * different mm.
+ */
+static inline bool nmi_uaccess_okay(void)
+{
+	struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+	struct mm_struct *current_mm = current->mm;
+
+	return current_mm && loaded_mm == current_mm &&
+		loaded_mm->pgd == __va(read_cr3_pa());
+}
+
 /* Initialize cr4 shadow for this CPU. */
 static inline void cr4_init_shadow(void)
 {
diff --git a/arch/x86/lib/usercopy.c b/arch/x86/lib/usercopy.c
index c8c6ad0d58b8..3f435d7fca5e 100644
--- a/arch/x86/lib/usercopy.c
+++ b/arch/x86/lib/usercopy.c
@@ -7,6 +7,8 @@
 #include <linux/uaccess.h>
 #include <linux/export.h>
 
+#include <asm/tlbflush.h>
+
 /*
  * We rely on the nested NMI work to allow atomic faults from the NMI path; the
  * nested NMI paths are careful to preserve CR2.
@@ -19,6 +21,9 @@ copy_from_user_nmi(void *to, const void __user *from, unsigned long n)
 	if (__range_not_ok(from, n, TASK_SIZE))
 		return n;
 
+	if (!nmi_uaccess_okay())
+		return n;
+
 	/*
 	 * Even though this function is typically called from NMI/IRQ context
 	 * disable pagefaults so that its behaviour is consistent even when
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 457b281b9339..f4b41d5a93dd 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -345,6 +345,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		 */
 		trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
 	} else {
+		/* Let NMI code know that CR3 may not match expectations. */
+		this_cpu_write(cpu_tlbstate.loaded_mm, NULL);
+
 		/* The new ASID is already up to date. */
 		load_new_mm_cr3(next->pgd, new_asid, false);
 
-- 
2.17.1
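
The three-part condition in nmi_uaccess_okay() can be modelled in plain userspace C. The following is only a sketch under stated assumptions: struct mm, struct cpu_model and uaccess_okay() are hypothetical names made up for illustration, with loaded_mm standing in for cpu_tlbstate.loaded_mm and cr3 standing in for the page-table root read from CR3.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an address space: just its page-table root. */
struct mm {
	const void *pgd;
};

/* Hypothetical model of the per-CPU state the helper inspects. */
struct cpu_model {
	const struct mm *loaded_mm;	/* models cpu_tlbstate.loaded_mm */
	const void *cr3;		/* models the root currently in CR3 */
};

/*
 * User access is considered safe only if the task has an mm, that mm is
 * the one the CPU believes is loaded, and the hardware root matches it.
 */
static bool uaccess_okay(const struct cpu_model *cpu,
			 const struct mm *current_mm)
{
	return current_mm && cpu->loaded_mm == current_mm &&
	       cpu->loaded_mm->pgd == cpu->cr3;
}
```

In this model, setting loaded_mm to NULL (as the tlb.c hunk does in the middle of a switch) makes the check fail for every task, which is the behaviour the patch description asks for.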

7 years, 3 months


Applied "regulator: bd71837: Disable voltage monitoring for LDO3/4" to the regulator tree

by Mark Brown

The patch

   regulator: bd71837: Disable voltage monitoring for LDO3/4

has been applied to the regulator tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

From 823f18f8b860526fc099c222619a126d57d2ad8c Mon Sep 17 00:00:00 2001
From: Matti Vaittinen <matti.vaittinen(a)fi.rohmeurope.com>
Date: Wed, 29 Aug 2018 15:36:10 +0300
Subject: [PATCH] regulator: bd71837: Disable voltage monitoring for LDO3/4

There is a HW quirk in BD71837. The shutdown sequence timings for
bucks/LDOs which are enabled via register interface are changed.
At PMIC poweroff the voltage for BUCK6/7 is cut immediately at the
beginning of shut-down sequence. This causes LDO5/6 voltage
monitoring to detect under voltage and force PMIC to emergency
state instead of poweroff. Disable voltage monitoring for LDO5 and
LDO6 at probe to avoid this.

Signed-off-by: Matti Vaittinen <matti.vaittinen(a)fi.rohmeurope.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
 drivers/regulator/bd71837-regulator.c | 19 +++++++++++++++
 include/linux/mfd/rohm-bd718x7.h      | 33 ++++++++++++++++++++++++---
 2 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/drivers/regulator/bd71837-regulator.c b/drivers/regulator/bd71837-regulator.c
index 0f8ac8dec3e1..a1bd8aaf4d98 100644
--- a/drivers/regulator/bd71837-regulator.c
+++ b/drivers/regulator/bd71837-regulator.c
@@ -569,6 +569,25 @@ static int bd71837_probe(struct platform_device *pdev)
 			BD71837_REG_REGLOCK);
 	}
 
+	/*
+	 * There is a HW quirk in BD71837. The shutdown sequence timings for
+	 * bucks/LDOs which are controlled via register interface are changed.
+	 * At PMIC poweroff the voltage for BUCK6/7 is cut immediately at the
+	 * beginning of shut-down sequence. As bucks 6 and 7 are parent
+	 * supplies for LDO5 and LDO6 - this causes LDO5/6 voltage
+	 * monitoring to errorneously detect under voltage and force PMIC to
+	 * emergency state instead of poweroff. In order to avoid this we
+	 * disable voltage monitoring for LDO5 and LDO6
+	 */
+	err = regmap_update_bits(pmic->mfd->regmap, BD718XX_REG_MVRFLTMASK2,
+				 BD718XX_LDO5_VRMON80 | BD718XX_LDO6_VRMON80,
+				 BD718XX_LDO5_VRMON80 | BD718XX_LDO6_VRMON80);
+	if (err) {
+		dev_err(&pmic->pdev->dev,
+			"Failed to disable voltage monitoring\n");
+		goto err;
+	}
+
 	for (i = 0; i < ARRAY_SIZE(pmic_regulator_inits); i++) {
 
 		struct regulator_desc *desc;
diff --git a/include/linux/mfd/rohm-bd718x7.h b/include/linux/mfd/rohm-bd718x7.h
index a528747f8aed..e8338e5dc10b 100644
--- a/include/linux/mfd/rohm-bd718x7.h
+++ b/include/linux/mfd/rohm-bd718x7.h
@@ -78,9 +78,9 @@ enum {
 	BD71837_REG_TRANS_COND0 = 0x1F,
 	BD71837_REG_TRANS_COND1 = 0x20,
 	BD71837_REG_VRFAULTEN = 0x21,
-	BD71837_REG_MVRFLTMASK0 = 0x22,
-	BD71837_REG_MVRFLTMASK1 = 0x23,
-	BD71837_REG_MVRFLTMASK2 = 0x24,
+	BD718XX_REG_MVRFLTMASK0 = 0x22,
+	BD718XX_REG_MVRFLTMASK1 = 0x23,
+	BD718XX_REG_MVRFLTMASK2 = 0x24,
 	BD71837_REG_RCVCFG = 0x25,
 	BD71837_REG_RCVNUM = 0x26,
 	BD71837_REG_PWRONCONFIG0 = 0x27,
@@ -159,6 +159,33 @@ enum {
 #define BUCK8_MASK		0x3F
 #define BUCK8_DEFAULT		0x1E
 
+/* BD718XX Voltage monitoring masks */
+#define BD718XX_BUCK1_VRMON80	0x1
+#define BD718XX_BUCK1_VRMON130	0x2
+#define BD718XX_BUCK2_VRMON80	0x4
+#define BD718XX_BUCK2_VRMON130	0x8
+#define BD718XX_1ST_NODVS_BUCK_VRMON80	0x1
+#define BD718XX_1ST_NODVS_BUCK_VRMON130	0x2
+#define BD718XX_2ND_NODVS_BUCK_VRMON80	0x4
+#define BD718XX_2ND_NODVS_BUCK_VRMON130	0x8
+#define BD718XX_3RD_NODVS_BUCK_VRMON80	0x10
+#define BD718XX_3RD_NODVS_BUCK_VRMON130	0x20
+#define BD718XX_4TH_NODVS_BUCK_VRMON80	0x40
+#define BD718XX_4TH_NODVS_BUCK_VRMON130	0x80
+#define BD718XX_LDO1_VRMON80	0x1
+#define BD718XX_LDO2_VRMON80	0x2
+#define BD718XX_LDO3_VRMON80	0x4
+#define BD718XX_LDO4_VRMON80	0x8
+#define BD718XX_LDO5_VRMON80	0x10
+#define BD718XX_LDO6_VRMON80	0x20
+
+/* BD71837 specific voltage monitoring masks */
+#define BD71837_BUCK3_VRMON80	0x10
+#define BD71837_BUCK3_VRMON130	0x20
+#define BD71837_BUCK4_VRMON80	0x40
+#define BD71837_BUCK4_VRMON130	0x80
+#define BD71837_LDO7_VRMON80	0x40
+
 /* BD71837_REG_IRQ bits */
 #define IRQ_SWRST		0x40
 #define IRQ_PWRON_S		0x20
-- 
2.18.0
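
For readers unfamiliar with the read-modify-write performed by regmap_update_bits() in the probe hunk above, here is a small userspace sketch of the bit arithmetic only. update_bits() is a hypothetical helper written for this example, not the regmap implementation; the two mask values are copied from the header hunk, and the starting register value 0x03 is made up.

```c
#include <stdint.h>

/* Mask values copied from the header hunk in the patch above. */
#define LDO5_VRMON80	0x10
#define LDO6_VRMON80	0x20

/*
 * Hypothetical model of an update-bits operation on an 8-bit register:
 * bits selected by mask take their value from val, all others are kept.
 */
static uint8_t update_bits(uint8_t old, uint8_t mask, uint8_t val)
{
	return (uint8_t)((old & ~mask) | (val & mask));
}
```

Passing the same value as both mask and val, as the patch does, sets exactly the LDO5 and LDO6 bits and leaves the remaining bits of the fault-mask register untouched.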

7 years, 3 months


[PATCH v3 2/2] btrfs: Ensure btrfs_trim_fs can trim the whole fs

by Qu Wenruo

[BUG]
fstrim on some btrfs only trims the unallocated space, not trimming any
space in existing block groups.

[CAUSE]
Before fstrim_range passed to btrfs_trim_fs(), it get truncated to
range [0, super->total_bytes).
So later btrfs_trim_fs() will only be able to trim block groups in range
[0, super->total_bytes).

While for btrfs, any bytenr aligned to sector size is valid, since btrfs
use its logical address space, there is nothing limiting the location
where we put block groups.

For btrfs with routine balance, it's quite easy to relocate all
block groups and bytenr of block groups will start beyond
super->total_bytes.

In that case, btrfs will not trim existing block groups.

[FIX]
Just remove the truncation in btrfs_ioctl_fitrim(), so btrfs_trim_fs()
can get the unmodified range, which is normally set to [0, U64_MAX].

Reported-by: Chris Murphy <lists(a)colorremedies.com>
Fixes: f4c697e6406d ("btrfs: return EINVAL if start > total_bytes in fitrim ioctl")
Cc: <stable(a)vger.kernel.org> # v4.0+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
 fs/btrfs/extent-tree.c | 10 +---------
 fs/btrfs/ioctl.c       | 11 +++++++----
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 7768f206196a..d1478d66c7a5 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10851,21 +10851,13 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	u64 start;
 	u64 end;
 	u64 trimmed = 0;
-	u64 total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
 	u64 bg_failed = 0;
 	u64 dev_failed = 0;
 	int bg_ret = 0;
 	int dev_ret = 0;
 	int ret = 0;
 
-	/*
-	 * try to trim all FS space, our block group may start from non-zero.
-	 */
-	if (range->len == total_bytes)
-		cache = btrfs_lookup_first_block_group(fs_info, range->start);
-	else
-		cache = btrfs_lookup_block_group(fs_info, range->start);
-
+	cache = btrfs_lookup_first_block_group(fs_info, range->start);
 	for (; cache; cache = next_block_group(fs_info, cache)) {
 		if (cache->key.objectid >= (range->start + range->len)) {
 			btrfs_put_block_group(cache);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 63600dc2ac4c..8165a4bfa579 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -491,7 +491,6 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
 	struct fstrim_range range;
 	u64 minlen = ULLONG_MAX;
 	u64 num_devices = 0;
-	u64 total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
 	int ret;
 
 	if (!capable(CAP_SYS_ADMIN))
@@ -515,11 +514,15 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
 		return -EOPNOTSUPP;
 	if (copy_from_user(&range, arg, sizeof(range)))
 		return -EFAULT;
-	if (range.start > total_bytes ||
-	    range.len < fs_info->sb->s_blocksize)
+
+	/*
+	 * NOTE: Don't truncate the range using super->total_bytes.
+	 * Bytenr of btrfs block group is in btrfs logical address space,
+	 * which can be any sector size aligned bytenr in [0, U64_MAX].
+	 */
+	if (range.len < fs_info->sb->s_blocksize)
 		return -EINVAL;
 
-	range.len = min(range.len, total_bytes - range.start);
 	range.minlen = max(range.minlen, minlen);
 	ret = btrfs_trim_fs(fs_info, &range);
 	if (ret < 0)
-- 
2.18.0
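
The effect described under [CAUSE] can be reproduced with a few lines of userspace arithmetic. This is a sketch under stated assumptions, not btrfs code: struct trim_range, clamp_old() and covers() are hypothetical names, clamp_old() mirrors the removed min() line (assuming start <= total_bytes, which the old code checked first), and covers() mirrors the loop's exit test on the block group start.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the start/len part of a trim request. */
struct trim_range {
	uint64_t start;
	uint64_t len;
};

/* Old behaviour: len is clamped so the range ends at total_bytes. */
static struct trim_range clamp_old(struct trim_range r, uint64_t total_bytes)
{
	uint64_t max_len = total_bytes - r.start;

	if (r.len > max_len)
		r.len = max_len;
	return r;
}

/*
 * Would a block group starting at logical address bg_start be visited?
 * Written as a subtraction so that start + len cannot overflow.
 */
static bool covers(struct trim_range r, uint64_t bg_start)
{
	return bg_start >= r.start && bg_start - r.start < r.len;
}
```

For example, on a 10 GiB filesystem whose only block group has been relocated to logical address 20 GiB, the clamped range [0, 10 GiB) never reaches it, while the unmodified range [0, U64_MAX] does.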

7 years, 3 months


[PATCH] cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status

by Scott Bauer

Like d88b6d04: "cdrom: information leak in cdrom_ioctl_media_changed()"

There is another cast from unsigned long to int which causes
a bounds check to fail with specially crafted input. The value is
then used as an index in the slot array in cdrom_slot_status().

Signed-off-by: Scott Bauer <scott.bauer(a)intel.com>
Signed-off-by: Scott Bauer <sbauer(a)plzdonthack.me>
Cc: stable(a)vger.kernel.org
---
 drivers/cdrom/cdrom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index bfc566d3f31a..8cfa10ab7abc 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -2542,7 +2542,7 @@ static int cdrom_ioctl_drive_status(struct cdrom_device_info *cdi,
 	if (!CDROM_CAN(CDC_SELECT_DISC) ||
 	    (arg == CDSL_CURRENT || arg == CDSL_NONE))
 		return cdi->ops->drive_status(cdi, CDSL_CURRENT);
-	if (((int)arg >= cdi->capacity))
+	if (arg >= cdi->capacity)
 		return -EINVAL;
 	return cdrom_slot_status(cdi, arg);
 }
-- 
2.14.1
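
The failing bounds check can be demonstrated outside the kernel. The sketch below is a userspace model only: check_slot_old() and check_slot_new() are hypothetical names, and the old variant relies on the out-of-range unsigned long to int conversion wrapping to a negative value, which is implementation-defined in C but is what GCC and Clang do on common targets.

```c
#include <errno.h>
#include <limits.h>

/*
 * Check as written before the patch: the cast turns large values of arg
 * into negative ints, which compare as smaller than capacity.
 */
static int check_slot_old(unsigned long arg, int capacity)
{
	if ((int)arg >= capacity)
		return -EINVAL;
	return 0;
}

/*
 * Check as written after the patch: the comparison happens in unsigned
 * long. The explicit cast of capacity only makes the implicit conversion
 * visible and assumes capacity is non-negative.
 */
static int check_slot_new(unsigned long arg, int capacity)
{
	if (arg >= (unsigned long)capacity)
		return -EINVAL;
	return 0;
}
```

With arg set to INT_MAX + 1 and a capacity of 4, the old check lets the value through and the new one rejects it; ordinary in-range and slightly out-of-range slot numbers behave the same in both.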

7 years, 3 months


reboot on wandboard fails with v4.14.67 (bisected to 2059e527a6)

by Rasmus Villemoes

We're using imx_v6_v7_defconfig on our Wandboards. After upgrading to v4.14.67, reboot no longer works (or, well, takes a very long time when the watchdog is configured). v4.14.66 works fine, the breakage bisects to 2059e527a659cf16d6bb709f1c8509f7a7623fc4 (ARM: imx_v6_v7_defconfig: Select ULPI support), and reverting that on top of v4.14.67 again works. Rasmus

7 years, 3 months


[PATCH v2 1/3] x86/mm: Restructure sme_encrypt_kernel()

by Brijesh Singh

Re-arrange the sme_encrypt_kernel() by moving the workarea map/unmap
logic in a separate static function. There are no logical changes in this
patch. The restructuring will allow us to expand the sme_encrypt_kernel
in future.

Signed-off-by: Brijesh Singh <brijesh.singh(a)amd.com>
Cc: stable(a)vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: kvm(a)vger.kernel.org
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: linux-kernel(a)vger.kernel.org
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Sean Christopherson <sean.j.christopherson(a)intel.com>
Cc: kvm(a)vger.kernel.org
Cc: "Radim Krčmář" <rkrcmar(a)redhat.com>
---
 arch/x86/mm/mem_encrypt_identity.c | 160 ++++++++++++++++++++++++-------------
 1 file changed, 104 insertions(+), 56 deletions(-)

diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 7ae3686..bf6097e 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -72,6 +72,22 @@ struct sme_populate_pgd_data {
 	unsigned long vaddr_end;
 };
 
+struct sme_workarea_data {
+	unsigned long kernel_start;
+	unsigned long kernel_end;
+	unsigned long kernel_len;
+
+	unsigned long initrd_start;
+	unsigned long initrd_end;
+	unsigned long initrd_len;
+
+	unsigned long workarea_start;
+	unsigned long workarea_end;
+	unsigned long workarea_len;
+
+	unsigned long decrypted_base;
+};
+
 static char sme_cmdline_arg[] __initdata = "mem_encrypt";
 static char sme_cmdline_on[] __initdata = "on";
 static char sme_cmdline_off[] __initdata = "off";
@@ -266,19 +282,17 @@ static unsigned long __init sme_pgtable_calc(unsigned long len)
 	return entries + tables;
 }
 
-void __init sme_encrypt_kernel(struct boot_params *bp)
+static void __init build_workarea_map(struct boot_params *bp,
+				      struct sme_workarea_data *wa,
+				      struct sme_populate_pgd_data *ppd)
 {
 	unsigned long workarea_start, workarea_end, workarea_len;
 	unsigned long execute_start, execute_end, execute_len;
 	unsigned long kernel_start, kernel_end, kernel_len;
 	unsigned long initrd_start, initrd_end, initrd_len;
-	struct sme_populate_pgd_data ppd;
 	unsigned long pgtable_area_len;
 	unsigned long decrypted_base;
 
-	if (!sme_active())
-		return;
-
 	/*
 	 * Prepare for encrypting the kernel and initrd by building new
 	 * pagetables with the necessary attributes needed to encrypt the
@@ -358,17 +372,17 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 	 * pagetables and when the new encrypted and decrypted kernel
 	 * mappings are populated.
 	 */
-	ppd.pgtable_area = (void *)execute_end;
+	ppd->pgtable_area = (void *)execute_end;
 
 	/*
 	 * Make sure the current pagetable structure has entries for
 	 * addressing the workarea.
 	 */
-	ppd.pgd = (pgd_t *)native_read_cr3_pa();
-	ppd.paddr = workarea_start;
-	ppd.vaddr = workarea_start;
-	ppd.vaddr_end = workarea_end;
-	sme_map_range_decrypted(&ppd);
+	ppd->pgd = (pgd_t *)native_read_cr3_pa();
+	ppd->paddr = workarea_start;
+	ppd->vaddr = workarea_start;
+	ppd->vaddr_end = workarea_end;
+	sme_map_range_decrypted(ppd);
 
 	/* Flush the TLB - no globals so cr3 is enough */
 	native_write_cr3(__native_read_cr3());
@@ -379,9 +393,9 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 	 * then be populated with new PUDs and PMDs as the encrypted and
 	 * decrypted kernel mappings are created.
 	 */
-	ppd.pgd = ppd.pgtable_area;
-	memset(ppd.pgd, 0, sizeof(pgd_t) * PTRS_PER_PGD);
-	ppd.pgtable_area += sizeof(pgd_t) * PTRS_PER_PGD;
+	ppd->pgd = ppd->pgtable_area;
+	memset(ppd->pgd, 0, sizeof(pgd_t) * PTRS_PER_PGD);
+	ppd->pgtable_area += sizeof(pgd_t) * PTRS_PER_PGD;
 
 	/*
 	 * A different PGD index/entry must be used to get different
@@ -399,75 +413,109 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
 	decrypted_base <<= PGDIR_SHIFT;
 
 	/* Add encrypted kernel (identity) mappings */
-	ppd.paddr = kernel_start;
-	ppd.vaddr = kernel_start;
-	ppd.vaddr_end = kernel_end;
-	sme_map_range_encrypted(&ppd);
+	ppd->paddr = kernel_start;
+	ppd->vaddr = kernel_start;
+	ppd->vaddr_end = kernel_end;
+	sme_map_range_encrypted(ppd);
 
 	/* Add decrypted, write-protected kernel (non-identity) mappings */
-	ppd.paddr = kernel_start;
-	ppd.vaddr = kernel_start + decrypted_base;
-	ppd.vaddr_end = kernel_end + decrypted_base;
-	sme_map_range_decrypted_wp(&ppd);
+	ppd->paddr = kernel_start;
+	ppd->vaddr = kernel_start + decrypted_base;
+	ppd->vaddr_end = kernel_end + decrypted_base;
+	sme_map_range_decrypted_wp(ppd);
 
 	if (initrd_len) {
 		/* Add encrypted initrd (identity) mappings */
-		ppd.paddr = initrd_start;
-		ppd.vaddr = initrd_start;
-		ppd.vaddr_end = initrd_end;
-		sme_map_range_encrypted(&ppd);
+		ppd->paddr = initrd_start;
+		ppd->vaddr = initrd_start;
+		ppd->vaddr_end = initrd_end;
+		sme_map_range_encrypted(ppd);
 		/*
 		 * Add decrypted, write-protected initrd (non-identity) mappings
 		 */
-		ppd.paddr = initrd_start;
-		ppd.vaddr = initrd_start + decrypted_base;
-		ppd.vaddr_end = initrd_end + decrypted_base;
-		sme_map_range_decrypted_wp(&ppd);
+		ppd->paddr = initrd_start;
+		ppd->vaddr = initrd_start + decrypted_base;
+		ppd->vaddr_end = initrd_end + decrypted_base;
+		sme_map_range_decrypted_wp(ppd);
 	}
 
 	/* Add decrypted workarea mappings to both kernel mappings */
-	ppd.paddr = workarea_start;
-	ppd.vaddr = workarea_start;
-	ppd.vaddr_end = workarea_end;
-	sme_map_range_decrypted(&ppd);
+	ppd->paddr = workarea_start;
+	ppd->vaddr = workarea_start;
+	ppd->vaddr_end = workarea_end;
+	sme_map_range_decrypted(ppd);
 
-	ppd.paddr = workarea_start;
-	ppd.vaddr = workarea_start + decrypted_base;
-	ppd.vaddr_end = workarea_end + decrypted_base;
-	sme_map_range_decrypted(&ppd);
+	ppd->paddr = workarea_start;
+	ppd->vaddr = workarea_start + decrypted_base;
+	ppd->vaddr_end = workarea_end + decrypted_base;
+	sme_map_range_decrypted(ppd);
 
-	/* Perform the encryption */
-	sme_encrypt_execute(kernel_start, kernel_start + decrypted_base,
-			    kernel_len, workarea_start, (unsigned long)ppd.pgd);
+	wa->kernel_start = kernel_start;
+	wa->kernel_end = kernel_end;
+	wa->kernel_len = kernel_len;
 
-	if (initrd_len)
-		sme_encrypt_execute(initrd_start, initrd_start + decrypted_base,
-				    initrd_len, workarea_start,
-				    (unsigned long)ppd.pgd);
+	wa->initrd_start = initrd_start;
+	wa->initrd_end = initrd_end;
+	wa->initrd_len = initrd_len;
+
+	wa->workarea_start = workarea_start;
+	wa->workarea_end = workarea_end;
+	wa->workarea_len = workarea_len;
+
+	wa->decrypted_base = decrypted_base;
+}
 
+static void __init remove_workarea_map(struct sme_workarea_data *wa,
+				       struct sme_populate_pgd_data *ppd)
+{
 	/*
 	 * At this point we are running encrypted. Remove the mappings for
 	 * the decrypted areas - all that is needed for this is to remove
 	 * the PGD entry/entries.
 	 */
-	ppd.vaddr = kernel_start + decrypted_base;
-	ppd.vaddr_end = kernel_end + decrypted_base;
-	sme_clear_pgd(&ppd);
-
-	if (initrd_len) {
-		ppd.vaddr = initrd_start + decrypted_base;
-		ppd.vaddr_end = initrd_end + decrypted_base;
-		sme_clear_pgd(&ppd);
+	ppd->vaddr = wa->kernel_start + wa->decrypted_base;
+	ppd->vaddr_end = wa->kernel_end + wa->decrypted_base;
+	sme_clear_pgd(ppd);
+
+	if (wa->initrd_len) {
+		ppd->vaddr = wa->initrd_start + wa->decrypted_base;
+		ppd->vaddr_end = wa->initrd_end + wa->decrypted_base;
+		sme_clear_pgd(ppd);
 	}
 
-	ppd.vaddr = workarea_start + decrypted_base;
-	ppd.vaddr_end = workarea_end + decrypted_base;
-	sme_clear_pgd(&ppd);
+	ppd->vaddr = wa->workarea_start + wa->decrypted_base;
+	ppd->vaddr_end = wa->workarea_end + wa->decrypted_base;
+	sme_clear_pgd(ppd);
 
 	/* Flush the TLB - no globals so cr3 is enough */
 	native_write_cr3(__native_read_cr3());
 }
 
+void __init sme_encrypt_kernel(struct boot_params *bp)
+{
+	struct sme_populate_pgd_data ppd;
+	struct sme_workarea_data wa;
+
+	if (!sme_active())
+		return;
+
+	build_workarea_map(bp, &wa, &ppd);
+
+	/* When SEV is active, encrypt kernel and initrd */
+	sme_encrypt_execute(wa.kernel_start,
+			    wa.kernel_start + wa.decrypted_base,
+			    wa.kernel_len, wa.workarea_start,
+			    (unsigned long)ppd.pgd);
+
+	if (wa.initrd_len)
+		sme_encrypt_execute(wa.initrd_start,
+				    wa.initrd_start + wa.decrypted_base,
+				    wa.initrd_len, wa.workarea_start,
+				    (unsigned long)ppd.pgd);
+
+	remove_workarea_map(&wa, &ppd);
+}
+
 void __init sme_enable(struct boot_params *bp)
 {
 	const char *cmdline_ptr, *cmdline_arg, *cmdline_on, *cmdline_off;
-- 
2.7.4

7 years, 3 months


Applied "spi: spi-fsl-dspi: fix broken DSPI_EOQ_MODE" to the spi tree

by Mark Brown

The patch

   spi: spi-fsl-dspi: fix broken DSPI_EOQ_MODE

has been applied to the spi tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

From 5223c9c1cbfc0cd4d0a1b50758e0949af3290fa1 Mon Sep 17 00:00:00 2001
From: Angelo Dureghello <angelo(a)sysam.it>
Date: Sat, 18 Aug 2018 01:51:58 +0200
Subject: [PATCH] spi: spi-fsl-dspi: fix broken DSPI_EOQ_MODE

This patch fixes the dspi_eoq_write function used by the ColdFire
mcf5441x family. The 16 bit cmd part must be re-set at each data
transfer. Also, now that fifo_size variables are used for
eoq_read/write, a proper fifo size must be set (16 slots for the
ColdFire dspi module version).

Signed-off-by: Angelo Dureghello <angelo(a)sysam.it>
Acked-by: Esben Haabendal <esben(a)haabendal.dk>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
 drivers/spi/spi-fsl-dspi.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 7cb3ab0a35a0..3082e72e4f6c 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -30,7 +30,11 @@

 #define DRIVER_NAME "fsl-dspi"

+#ifdef CONFIG_M5441x
+#define DSPI_FIFO_SIZE 16
+#else
 #define DSPI_FIFO_SIZE 4
+#endif
 #define DSPI_DMA_BUFSIZE (DSPI_FIFO_SIZE * 1024)

 #define SPI_MCR 0x00
@@ -623,9 +627,11 @@ static void dspi_tcfq_read(struct fsl_dspi *dspi)
 static void dspi_eoq_write(struct fsl_dspi *dspi)
 {
 	int fifo_size = DSPI_FIFO_SIZE;
+	u16 xfer_cmd = dspi->tx_cmd;

 	/* Fill TX FIFO with as many transfers as possible */
 	while (dspi->len && fifo_size--) {
+		dspi->tx_cmd = xfer_cmd;
 		/* Request EOQF for last transfer in FIFO */
 		if (dspi->len == dspi->bytes_per_word || fifo_size == 0)
 			dspi->tx_cmd |= SPI_PUSHR_CMD_EOQ;
-- 
2.18.0
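
The fix hinges on one detail: the command half-word is OR-ed inside the loop, so a bit set for one FIFO word carries over to every later word unless the value is restored first. The C sketch below models that outside the kernel. It is a simplified illustration under stated assumptions, not the driver's code: `sketch_burst`, `SKETCH_CMD_EOQ` and `SKETCH_CMD_FIRST` are hypothetical names, the bit values are arbitrary, and the "first word only" flag is an assumed example of a per-word bit.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CMD_EOQ   0x0800u /* stand-in for SPI_PUSHR_CMD_EOQ, value is illustrative */
#define SKETCH_CMD_FIRST 0x0400u /* hypothetical flag intended for the first word only */

/*
 * Model one FIFO fill of `len` one-byte words and return how many words
 * were pushed with SKETCH_CMD_FIRST set. With reset_cmd != 0 the command
 * is restored at every iteration, as the patch does with xfer_cmd.
 */
static int sketch_burst(int fifo_size, int len, int reset_cmd)
{
	const int fifo_total = fifo_size;
	uint16_t tx_cmd = 0;
	const uint16_t xfer_cmd = tx_cmd; /* pristine command, saved up front */
	int flagged = 0;

	assert(fifo_size > 0 && len > 0);

	while (len && fifo_size--) {
		if (reset_cmd)
			tx_cmd = xfer_cmd;
		/* Request end-of-queue on the last word of this fill */
		if (len == 1 || fifo_size == 0)
			tx_cmd |= SKETCH_CMD_EOQ;
		/* Flag that should only accompany the first word */
		if (fifo_size == fifo_total - 1)
			tx_cmd |= SKETCH_CMD_FIRST;
		if (tx_cmd & SKETCH_CMD_FIRST)
			flagged++;
		len--;
	}
	return flagged;
}
```

Calling `sketch_burst(16, 16, 0)` reports the first-word flag on all 16 words, while `sketch_burst(16, 16, 1)` reports it on exactly one, which is the behaviour the per-iteration reset is meant to restore.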

7 years, 3 months


Please apply commit d1e20222d537 to 4.14.y and 4.18.y

by Jitendra Bhivare

Subject of the patch: iommu/arm-smmu: Error out only if not enough context interrupts
Commit ID: 42b88ec6530fd76f1ae06de7f09830bcbca5bbd6
Why: The ARM SMMU does not initialize on the Broadcom Stingray SoC if the bootloader reserves a few contexts. The kernel should not error out if there are enough context interrupts.
Patch

7 years, 3 months


Profile.

by Thomas Weber

Dear Sir or Madam,

Having visited your website and analyzed the profile of your business activities, we already know how you can start winning new customers right away. We have more than 100,000 address records of potential customers. This data is organized by industry.

http://www.gbc-at.net/?page=catalog

*** 1. Austria 2018 ( 104 000 ) - 149 EUR ( until 29.08.2018 ) ***

Please find further details, without obligation, on our website: http://www.gbc-at.net/?page=catalog

Sorting the addresses by industry is done FREE OF CHARGE in the included DataManager program. In addition, we offer FREE OF CHARGE the tool for automatically sending out offers.

Kind regards,
Thomas Weber
GC-Team

7 years, 3 months


[PATCH] drm/i915/gvt: move intel_runtime_pm_get out of spin_lock in stop_schedule

by hang.yuan＠linux.intel.com

From: Hang Yuan <hang.yuan(a)linux.intel.com>

pm_runtime_get_sync in intel_runtime_pm_get might sleep if the i915
device is not active. When stopping vgpu scheduling, the device may be
inactive, so runtime_pm_get needs to be moved out of the
spin_lock/unlock section.

Fixes: b24881e0b0b6 ("drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Hang Yuan <hang.yuan(a)linux.intel.com>
Signed-off-by: Xiong Zhang <xiong.y.zhang(a)intel.com>
---
 drivers/gpu/drm/i915/gvt/mmio_context.c | 2 --
 drivers/gpu/drm/i915/gvt/sched_policy.c | 3 +++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index 7e702c6..10e63ee 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -549,11 +549,9 @@ void intel_gvt_switch_mmio(struct intel_vgpu *pre,
 	 * performace for batch mmio read/write, so we need
 	 * handle forcewake mannually.
 	 */
-	intel_runtime_pm_get(dev_priv);
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 	switch_mmio(pre, next, ring_id);
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
-	intel_runtime_pm_put(dev_priv);
 }

 /**
diff --git a/drivers/gpu/drm/i915/gvt/sched_policy.c b/drivers/gpu/drm/i915/gvt/sched_policy.c
index 09d7bb7..985fe81 100644
--- a/drivers/gpu/drm/i915/gvt/sched_policy.c
+++ b/drivers/gpu/drm/i915/gvt/sched_policy.c
@@ -426,6 +426,7 @@ void intel_vgpu_stop_schedule(struct intel_vgpu *vgpu)
 		&vgpu->gvt->scheduler;
 	int ring_id;
 	struct vgpu_sched_data *vgpu_data = vgpu->sched_data;
+	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;

 	if (!vgpu_data->active)
 		return;
@@ -444,6 +445,7 @@ void intel_vgpu_stop_schedule(struct intel_vgpu *vgpu)
 		scheduler->current_vgpu = NULL;
 	}

+	intel_runtime_pm_get(dev_priv);
 	spin_lock_bh(&scheduler->mmio_context_lock);
 	for (ring_id = 0; ring_id < I915_NUM_ENGINES; ring_id++) {
 		if (scheduler->engine_owner[ring_id] == vgpu) {
@@ -452,5 +454,6 @@ void intel_vgpu_stop_schedule(struct intel_vgpu *vgpu)
 		}
 	}
 	spin_unlock_bh(&scheduler->mmio_context_lock);
+	intel_runtime_pm_put(dev_priv);
 	mutex_unlock(&vgpu->gvt->sched_lock);
 }
-- 
2.7.4

7 years, 3 months


[PATCH] option: Do not try to bind to ADB interfaces

by Romain Izard

Some modems now use the Android Debug Bridge to provide a debugging
interface, and some phones can also export serial ports managed by the
"option" driver.

The ADB daemon running in userspace tries to use USB interfaces with
bDeviceClass=0xFF, bDeviceSubClass=0x42, bDeviceProtocol=1

Prevent the option driver from binding to those interfaces, as they
will not be serial ports.

This can fix issues like:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=781256

Signed-off-by: Romain Izard <romain.izard.pro(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
---
 drivers/usb/serial/option.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index 664e61f16b6a..f98943a57ff0 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -1987,6 +1987,12 @@ static int option_probe(struct usb_serial *serial,
 	if (iface_desc->bInterfaceClass == USB_CLASS_MASS_STORAGE)
 		return -ENODEV;

+	/* Do not bind Android Debug Bridge interfaces */
+	if (iface_desc->bInterfaceClass == USB_CLASS_VENDOR_SPEC &&
+	    iface_desc->bInterfaceSubClass == 0x42 &&
+	    iface_desc->bInterfaceProtocol == 1)
+		return -ENODEV;
+
 	/*
 	 * Don't bind reserved interfaces (like network ones) which often have
 	 * the same class/subclass/protocol as the serial interfaces.  Look at
-- 
2.17.1
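
The probe check in this patch amounts to a three-field match on the interface descriptor. A minimal userspace sketch of that predicate follows, assuming a cut-down descriptor struct; `sketch_iface_desc`, `sketch_is_adb_interface` and `SKETCH_CLASS_VENDOR_SPEC` are hypothetical names introduced here, and only the values 0xFF / 0x42 / 1 come from the message itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CLASS_VENDOR_SPEC 0xff /* same value as USB_CLASS_VENDOR_SPEC */

/* Hypothetical, cut-down stand-in for the USB interface descriptor. */
struct sketch_iface_desc {
	uint8_t bInterfaceClass;
	uint8_t bInterfaceSubClass;
	uint8_t bInterfaceProtocol;
};

/* True when the triple matches the ADB signature quoted in the patch. */
static bool sketch_is_adb_interface(const struct sketch_iface_desc *d)
{
	assert(d != NULL);
	return d->bInterfaceClass == SKETCH_CLASS_VENDOR_SPEC &&
	       d->bInterfaceSubClass == 0x42 &&
	       d->bInterfaceProtocol == 1;
}
```

All three fields must match: a vendor-specific interface with a different subclass or protocol is left alone, so ordinary vendor-specific serial ports would still be eligible for binding under this sketch.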

7 years, 3 months


[gregkh@linuxfoundation.org: Patch "drm: re-enable error handling" has been added to the 4.4-stable tree]

by Nicholas Mc Guire

Hi !

this is also the wrong version of the patch - the proper version is below.
This has been posted to lkml https://lkml.org/lkml/2018/7/18/191 but there
was no review yet - the version you have though is for sure broken. So maybe
this should be simply dropped until the fix is confirmed

thx!
hofrat

----- Forwarded message from gregkh(a)linuxfoundation.org -----

Date: Tue, 28 Aug 2018 16:11:46 +0200
From: gregkh(a)linuxfoundation.org
To: 1531571532-22733-1-git-send-email-hofrat(a)osadl.org, alexander.levin(a)microsoft.com, gregkh(a)linuxfoundation.org, hofrat(a)osadl.org, seanpaul(a)chromium.org
Cc: stable-commits(a)vger.kernel.org
Subject: Patch "drm: re-enable error handling" has been added to the 4.4-stable tree

This is a note to let you know that I've just added the patch titled

    drm: re-enable error handling

to the 4.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…

The filename of the patch is:
     drm-re-enable-error-handling.patch
and it can be found in the queue-4.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.

From foo@baz Tue Aug 28 16:10:37 CEST 2018
From: Nicholas Mc Guire <hofrat(a)osadl.org>
Date: Sat, 14 Jul 2018 14:32:12 +0200
Subject: drm: re-enable error handling

From: Nicholas Mc Guire <hofrat(a)osadl.org>

[ Upstream commit d530b5f1ca0bb66958a2b714bebe40a1248b9c15 ]

drm_legacy_ctxbitmap_next() returns idr_alloc() which can return
-ENOMEM, -EINVAL or -ENOSPC none of which are -1 . but the call sites
of drm_legacy_ctxbitmap_next() seem to be assuming that the error case
would be -1 (original return of drm_ctxbitmap_next() prior to 2.6.23
was actually -1). Thus reenable error handling by checking for < 0.

Signed-off-by: Nicholas Mc Guire <hofrat(a)osadl.org>
Fixes: 62968144e673 ("drm: convert drm context code to use Linux idr")
Signed-off-by: Sean Paul <seanpaul(a)chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1531571532-22733-1-git-send-e…
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
 drivers/gpu/drm/drm_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/drm_context.c
+++ b/drivers/gpu/drm/drm_context.c
@@ -372,7 +372,7 @@ int drm_legacy_addctx(struct drm_device
 		ctx->handle = drm_legacy_ctxbitmap_next(dev);
 	}
 	DRM_DEBUG("%d\n", ctx->handle);
-	if (ctx->handle == -1) {
+	if (ctx->handle < 0) {
 		DRM_DEBUG("Not enough free contexts.\n");
 		/* Should this return -EBUSY instead? */
 		return -ENOMEM;

Patches currently in stable-queue which might be from hofrat(a)osadl.org are

queue-4.4/drm-re-enable-error-handling.patch
queue-4.4/can-mpc5xxx_can-check-of_iomap-return-before-use.patch

----- End forwarded message -----
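
The commit message argues that comparing against -1 misses negative errno values. A small C sketch of that argument, assuming a hypothetical allocator `sketch_alloc_handle` in place of drm_legacy_ctxbitmap_next() and hypothetical helper names for the two checks:

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for an idr-style allocator: returns a handle
 * >= 0 on success or a negative errno when no slot is free.
 */
static int sketch_alloc_handle(int next_free, int max_handles)
{
	assert(max_handles > 0);
	if (next_free >= max_handles)
		return -ENOSPC;
	return next_free;
}

/* Old call-site check: only recognises the literal value -1. */
static int sketch_failed_old(int handle)
{
	return handle == -1;
}

/* Patched call-site check: any negative value is an error. */
static int sketch_failed_new(int handle)
{
	return handle < 0;
}
```

With the allocator exhausted, `sketch_failed_old()` lets the -ENOSPC value through as if it were a valid handle, while `sketch_failed_new()` rejects it; successful handles pass both checks.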

7 years, 3 months


[PATCH] x86/speculation/l1tf: fix off-by-one error when warning that system has too much RAM

by Vlastimil Babka

Two users have reported [1] that they have an "extremely unlikely" system
with more than MAX_PA/2 memory and L1TF mitigation is not effective. In fact
it's a CPU with 36bits phys limit (64GB) and 32GB memory, but due to holes in
the e820 map, the main region is almost 500MB over the 32GB limit:

[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000081effffff] usable

Suggestions to use 'mem=32G' to prefer L1TF mitigation while losing the 500MB
revealed, that there's an off-by-one error in the check in
l1tf_select_mitigation().

l1tf_pfn_limit() returns the last usable pfn (inclusive), but it's more
common and hopefully less error-prone to return the first pfn that's over
limit, so this patch changes that and updates the other callers.

[1] https://bugzilla.suse.com/show_bug.cgi?id=1105536

Reported-by: George Anchev <studio(a)anchev.net>
Reported-by: Christopher Snowhill <kode54(a)gmail.com>
Fixes: 17dbca119312 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Cc: stable(a)vger.kernel.org
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
---
 arch/x86/include/asm/processor.h | 2 +-
 arch/x86/mm/init.c               | 2 +-
 arch/x86/mm/mmap.c               | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a0a52274cb4a..c24297268ebc 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -183,7 +183,7 @@ extern void cpu_detect(struct cpuinfo_x86 *c);

 static inline unsigned long long l1tf_pfn_limit(void)
 {
-	return BIT_ULL(boot_cpu_data.x86_phys_bits - 1 - PAGE_SHIFT) - 1;
+	return BIT_ULL(boot_cpu_data.x86_phys_bits - 1 - PAGE_SHIFT);
 }

 extern void early_cpu_init(void);
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 02de3d6065c4..63a6f9fcaf20 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -923,7 +923,7 @@ unsigned long max_swapfile_size(void)

 	if (boot_cpu_has_bug(X86_BUG_L1TF)) {
 		/* Limit the swap file size to MAX_PA/2 for L1TF workaround */
-		unsigned long long l1tf_limit = l1tf_pfn_limit() + 1;
+		unsigned long long l1tf_limit = l1tf_pfn_limit();
 		/*
 		 * We encode swap offsets also with 3 bits below those for pfn
 		 * which makes the usable limit higher.
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index f40ab8185d94..1e95d57760cf 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -257,7 +257,7 @@ bool pfn_modify_allowed(unsigned long pfn, pgprot_t prot)
 	/* If it's real memory always allow */
 	if (pfn_valid(pfn))
 		return true;
-	if (pfn > l1tf_pfn_limit() && !capable(CAP_SYS_ADMIN))
+	if (pfn >= l1tf_pfn_limit() && !capable(CAP_SYS_ADMIN))
 		return false;
 	return true;
 }
-- 
2.18.0
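
The arithmetic behind the off-by-one can be checked outside the kernel. The sketch below is a simplified model rather than the code of l1tf_select_mitigation(): it assumes 4 KiB pages, and `sketch_limit_inclusive`, `sketch_limit_exclusive` and `sketch_ram_over_limit` are hypothetical helpers that treat "RAM exceeds the limit" as a plain comparison of the end-of-RAM address with the limit converted to bytes.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */

/* Old semantics: last usable pfn (inclusive). */
static uint64_t sketch_limit_inclusive(unsigned int phys_bits)
{
	assert(phys_bits > SKETCH_PAGE_SHIFT + 1 && phys_bits <= 64);
	return (1ULL << (phys_bits - 1 - SKETCH_PAGE_SHIFT)) - 1;
}

/* New semantics: first pfn that is over the limit (exclusive). */
static uint64_t sketch_limit_exclusive(unsigned int phys_bits)
{
	assert(phys_bits > SKETCH_PAGE_SHIFT + 1 && phys_bits <= 64);
	return 1ULL << (phys_bits - 1 - SKETCH_PAGE_SHIFT);
}

/* Simplified check: does RAM ending at ram_end_bytes reach past the limit? */
static int sketch_ram_over_limit(uint64_t ram_end_bytes, uint64_t limit_pfn)
{
	return ram_end_bytes > (limit_pfn << SKETCH_PAGE_SHIFT);
}
```

For a 36-bit CPU the exclusive limit is pfn 0x800000, i.e. 32 GiB. In this model, RAM capped at exactly 32 GiB (as with 'mem=32G') is still flagged when the inclusive value is used as a byte boundary, and is accepted with the exclusive value; the reported region ending at 0x81effffff is over the limit either way.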

7 years, 3 months


perf/x86/intel/uncore: propose to support IIO free-running counters in 4.14

by Jin, Yao

Hi,

The upstream kernel has supported IIO free-running counters on Skylake
server.

As of Skylake Server, there are a number of free running counters in each
IIO Box that collect counts of per-box IO clocks and per-port Input/Output
x BW/Utilization.

There are three types of IIO free-running counters on Skylake server:

1. IO CLOCKS counter: a clock of IIO box.
2. BANDWIDTH counters: count inbound(PCIe->CPU)/outbound(CPU->PCIe)
   bandwidth.
3. UTILIZATION counters: count input/output utilization.

With these IIO free-running counters, we can get good observation for IIO
traffic on Skylake server. For example, we can see the IIO inbound
bandwidth (PCIe->CPU).

root@skx /sys/devices# perf stat -a -e uncore_iio_free_running_2/bw_in_port0/
^C
 Performance counter stats for 'system wide':

            153.19 MiB  uncore_iio_free_running_2/bw_in_port0/

       8.037701069 seconds time elapsed

I propose to backport the patches which support IIO free-running counters
to 4.14 stable kernel.

perf/x86/intel/uncore: Introduce customized event_read() for client IMC uncore
2da331465f44f9618abe8837d1a68405d550b66e

perf/x86/intel/uncore: Add new data structures for free running counters
927b2deb067b8b4753fc09c7a42092f43fc0c1f6

perf/x86/intel/uncore: Add infrastructure for free running counters
0e0162dfcd1fbe4c711ee86f24f966c318999603

perf/x86/intel/uncore: Support IIO free-running counters on SKX
0f519f0352e37e7d71bdce5559517c74a35f6e33

perf/x86/intel/uncore: Expose uncore_pmu_event*() functions
5a6c9d94e9ed7410142bc6fcb638a4db1895aa0c

perf/x86/intel/uncore: Clean up client IMC uncore
9aae1780e7e81e54edfb70ba33ead5b0b48be009

Thanks
Jin Yao

7 years, 3 months


[PATCH v3] modules_install: make missing $DEPMOD a warning instead of an error

by Randy Dunlap

From: Randy Dunlap <rdunlap(a)infradead.org>

When $DEPMOD is not found, only print a warning instead of exiting
with an error message and error status.

  Warning: 'make modules_install' requires /sbin/depmod. Please install it.
  This is probably in the kmod package.

Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Fixes: 934193a654c1 ("kbuild: verify that $DEPMOD is installed")
Cc: stable(a)vger.kernel.org
Cc: Lucas De Marchi <lucas.demarchi(a)profusion.mobi>
Cc: Lucas De Marchi <lucas.de.marchi(a)gmail.com>
Cc: Michal Marek <michal.lkml(a)markovi.net>
Cc: Jessica Yu <jeyu(a)kernel.org>
Cc: Chih-Wei Huang <cwhuang(a)linux.org.tw>
Cc: H. Nikolaus Schaller <hns(a)goldelico.com>
---
v2: add missing "exit 0" and update the commit message (no Error).
v3: add Fixes: and Cc: stable

 scripts/depmod.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- lnx-418.orig/scripts/depmod.sh
+++ lnx-418/scripts/depmod.sh
@@ -15,9 +15,9 @@ if ! test -r System.map ; then
 fi

 if [ -z $(command -v $DEPMOD) ]; then
-	echo "'make modules_install' requires $DEPMOD. Please install it." >&2
+	echo "Warning: 'make modules_install' requires $DEPMOD. Please install it." >&2
 	echo "This is probably in the kmod package." >&2
-	exit 1
+	exit 0
 fi

 # older versions of depmod require the version string to start with three

7 years, 3 months


+ uapi-linux-keyctlh-dont-use-c-reserved-keyword-as-a-struct-member-name.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name
has been added to the -mm tree. Its filename is
     uapi-linux-keyctlh-dont-use-c-reserved-keyword-as-a-struct-member-name.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/uapi-linux-keyctlh-dont-use-c-rese…
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/uapi-linux-keyctlh-dont-use-c-rese…

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Randy Dunlap <rdunlap(a)infradead.org>
Subject: uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name

Since this header is in "include/uapi/linux/", apparently people want to
use it in userspace programs -- even in C++ ones. However, the header
uses a C++ reserved keyword ("private"), so change that to "dh_private"
instead to allow the header file to be used in C++ userspace.

Fixes https://bugzilla.kernel.org/show_bug.cgi?id=191051
Link: http://lkml.kernel.org/r/0db6c314-1ef4-9bfa-1baa-7214dd2ee061@infradead.org
Fixes: ddbb41148724 ("KEYS: Add KEYCTL_DH_COMPUTE command")
Signed-off-by: Randy Dunlap <rdunlap(a)infradead.org>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: David Howells <dhowells(a)redhat.com>
Cc: James Morris <jmorris(a)namei.org>
Cc: "Serge E. Hallyn" <serge(a)hallyn.com>
Cc: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 include/uapi/linux/keyctl.h | 2 +-
 security/keys/dh.c          | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/include/uapi/linux/keyctl.h~uapi-linux-keyctlh-dont-use-c-reserved-keyword-as-a-struct-member-name
+++ a/include/uapi/linux/keyctl.h
@@ -65,7 +65,7 @@

 /* keyctl structures */
 struct keyctl_dh_params {
-	__s32 private;
+	__s32 dh_private;
 	__s32 prime;
 	__s32 base;
 };
--- a/security/keys/dh.c~uapi-linux-keyctlh-dont-use-c-reserved-keyword-as-a-struct-member-name
+++ a/security/keys/dh.c
@@ -300,7 +300,7 @@ long __keyctl_dh_compute(struct keyctl_d
 	}
 	dh_inputs.g_size = dlen;

-	dlen = dh_data_from_key(pcopy.private, &dh_inputs.key);
+	dlen = dh_data_from_key(pcopy.dh_private, &dh_inputs.key);
 	if (dlen < 0) {
 		ret = dlen;
 		goto out2;
_

Patches currently in -mm which might be from rdunlap(a)infradead.org are

uapi-linux-keyctlh-dont-use-c-reserved-keyword-as-a-struct-member-name.patch

7 years, 3 months


+ memory_hotplug-fix-kernel_panic-on-offline-page-processing.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: memory_hotplug: fix kernel_panic on offline page processing
has been added to the -mm tree. Its filename is
     memory_hotplug-fix-kernel_panic-on-offline-page-processing.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/memory_hotplug-fix-kernel_panic-on…
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/memory_hotplug-fix-kernel_panic-on…

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Mikhail Zaslonko <zaslonko(a)linux.ibm.com>
Subject: memory_hotplug: fix kernel_panic on offline page processing

Within show_valid_zones() the function test_pages_in_a_zone() should be
called for online memory blocks only. Otherwise it might lead to the
VM_BUG_ON due to uninitialized struct pages (when CONFIG_DEBUG_VM_PGFLAGS
kernel option is set):

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
------------[ cut here ]------------
Call Trace:
([<000000000038f91e>] test_pages_in_a_zone+0xe6/0x168)
 [<0000000000923472>] show_valid_zones+0x5a/0x1a8
 [<0000000000900284>] dev_attr_show+0x3c/0x78
 [<000000000046f6f0>] sysfs_kf_seq_show+0xd0/0x150
 [<00000000003ef662>] seq_read+0x212/0x4b8
 [<00000000003bf202>] __vfs_read+0x3a/0x178
 [<00000000003bf3ca>] vfs_read+0x8a/0x148
 [<00000000003bfa3a>] ksys_read+0x62/0xb8
 [<0000000000bc2220>] system_call+0xdc/0x2d8

That VM_BUG_ON was triggered by the page poisoning introduced in
mm/sparse.c with the git commit d0dc12e86b31 ("mm/memory_hotplug: optimize
memory hotplug")

With the same commit the new 'nid' field has been added to the struct
memory_block in order to store and later on derive the node id for offline
pages (instead of accessing struct page which might be uninitialized). But
one reference to nid in show_valid_zones() function has been overlooked.
Fixed with current commit. Also, nr_pages will not be used any more after
test_pages_in_a_zone() call, do not update it.

Link: http://lkml.kernel.org/r/20180828090539.41491-1-zaslonko@linux.ibm.com
Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
Signed-off-by: Mikhail Zaslonko <zaslonko(a)linux.ibm.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin(a)microsoft.com>
Cc: <stable(a)vger.kernel.org>	[4.17+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 drivers/base/memory.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

--- a/drivers/base/memory.c~memory_hotplug-fix-kernel_panic-on-offline-page-processing
+++ a/drivers/base/memory.c
@@ -417,25 +417,23 @@ static ssize_t show_valid_zones(struct d
 	int nid;

 	/*
-	 * The block contains more than one zone can not be offlined.
-	 * This can happen e.g. for ZONE_DMA and ZONE_DMA32
-	 */
-	if (!test_pages_in_a_zone(start_pfn, start_pfn + nr_pages, &valid_start_pfn, &valid_end_pfn))
-		return sprintf(buf, "none\n");
-
-	start_pfn = valid_start_pfn;
-	nr_pages = valid_end_pfn - start_pfn;
-
-	/*
 	 * Check the existing zone. Make sure that we do that only on the
 	 * online nodes otherwise the page_zone is not reliable
 	 */
 	if (mem->state == MEM_ONLINE) {
+		/*
+		 * The block contains more than one zone can not be offlined.
+		 * This can happen e.g. for ZONE_DMA and ZONE_DMA32
+		 */
+		if (!test_pages_in_a_zone(start_pfn, start_pfn + nr_pages,
+					  &valid_start_pfn, &valid_end_pfn))
+			return sprintf(buf, "none\n");
+		start_pfn = valid_start_pfn;
 		strcat(buf, page_zone(pfn_to_page(start_pfn))->name);
 		goto out;
 	}

-	nid = pfn_to_nid(start_pfn);
+	nid = mem->nid;
 	default_zone = zone_for_pfn_range(MMOP_ONLINE_KEEP, nid, start_pfn, nr_pages);
 	strcat(buf, default_zone->name);

_

Patches currently in -mm which might be from zaslonko(a)linux.ibm.com are

memory_hotplug-fix-kernel_panic-on-offline-page-processing.patch

7 years, 3 months

