The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dc7a10ddee0c56c6d891dd18de5c4ee9869545e0 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <jaegeuk(a)kernel.org>
Date: Fri, 30 Mar 2018 17:58:13 -0700
Subject: [PATCH] f2fs: truncate preallocated blocks in error case
If a write fails, we must deallocate the blocks that we couldn't write.
Cc: stable(a)vger.kernel.org
Reviewed-by: Chao Yu <yuchao0(a)huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 8068b015ece5..6b94f19b3fa8 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2911,6 +2911,8 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
ret = generic_write_checks(iocb, from);
if (ret > 0) {
+ bool preallocated = false;
+ size_t target_size = 0;
int err;
if (iov_iter_fault_in_readable(from, iov_iter_count(from)))
@@ -2927,6 +2929,9 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
}
} else {
+ preallocated = true;
+ target_size = iocb->ki_pos + iov_iter_count(from);
+
err = f2fs_preallocate_blocks(iocb, from);
if (err) {
clear_inode_flag(inode, FI_NO_PREALLOC);
@@ -2939,6 +2944,10 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
blk_finish_plug(&plug);
clear_inode_flag(inode, FI_NO_PREALLOC);
+ /* if we couldn't write data, we should deallocate blocks. */
+ if (preallocated && i_size_read(inode) < target_size)
+ f2fs_truncate(inode);
+
if (ret > 0)
f2fs_update_iostat(F2FS_I_SB(inode), APP_WRITE_IO, ret);
}
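The bookkeeping this patch adds can be illustrated outside the kernel. What follows is only a minimal user-space sketch, assuming a toy file model: struct toy_file and every toy_* function are hypothetical names invented for illustration and are not f2fs code. Unlike the real function, the toy always preallocates, so the "preallocated" flag from the patch is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; not a kernel structure. */
struct toy_file {
	size_t size;      /* bytes actually written, playing the role of i_size */
	size_t allocated; /* bytes reserved, including preallocation */
};

/* Reserve space up to target_size ahead of the write. */
static void toy_preallocate(struct toy_file *f, size_t target_size)
{
	if (f->allocated < target_size)
		f->allocated = target_size;
}

/* Release everything reserved beyond the current size. */
static void toy_truncate(struct toy_file *f)
{
	f->allocated = f->size;
}

/*
 * Try to write len bytes at pos. Only `done` bytes land, which lets the
 * caller simulate a failed or short write.
 */
size_t toy_write(struct toy_file *f, size_t pos, size_t len, size_t done)
{
	size_t target_size = pos + len;

	if (done > len)
		done = len;

	toy_preallocate(f, target_size);

	/* The size only grows by what was really written. */
	if (done > 0 && pos + done > f->size)
		f->size = pos + done;

	/* If we couldn't write the data, give the blocks back. */
	if (f->size < target_size)
		toy_truncate(f);

	return done;
}
```

Calling toy_write() with done smaller than len models the error case: the reservation made up front is dropped again because the size never reached target_size.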
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 31286a8484a85e8b4e91ddb0f5415aee8a416827 Mon Sep 17 00:00:00 2001
From: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Date: Thu, 5 Apr 2018 16:23:05 -0700
Subject: [PATCH] mm: hwpoison: disable memory error handling on 1GB hugepage
Recently the following BUG was reported:
Injecting memory failure for pfn 0x3c0000 at process virtual address 0x7fe300000000
Memory failure: 0x3c0000: recovery action for huge page: Recovered
BUG: unable to handle kernel paging request at ffff8dfcc0003000
IP: gup_pgd_range+0x1f0/0xc20
PGD 17ae72067 P4D 17ae72067 PUD 0
Oops: 0000 [#1] SMP PTI
...
CPU: 3 PID: 5467 Comm: hugetlb_1gb Not tainted 4.15.0-rc8-mm1-abc+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
You can easily reproduce this by calling madvise(MADV_HWPOISON) twice on
a 1GB hugepage. This happens because get_user_pages_fast() is not aware
of a migration entry on pud that was created in the 1st madvise() event.
I think that conversion to pud-aligned migration entry is working, but
other MM code walking over page table isn't prepared for it. We need
some time and effort to make all this work properly, so this patch
avoids the reported bug by just disabling error handling for 1GB
hugepage.
[n-horiguchi(a)ah.jp.nec.com: v2]
Link: http://lkml.kernel.org/r/1517284444-18149-1-git-send-email-n-horiguchi@ah.j…
Link: http://lkml.kernel.org/r/1517207283-15769-1-git-send-email-n-horiguchi@ah.j…
Signed-off-by: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: Punit Agrawal <punit.agrawal(a)arm.com>
Tested-by: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Anshuman Khandual <khandual(a)linux.vnet.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2e40a44a1fae..2e2be527642a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2613,6 +2613,7 @@ enum mf_action_page_type {
MF_MSG_POISONED_HUGE,
MF_MSG_HUGE,
MF_MSG_FREE_HUGE,
+ MF_MSG_NON_PMD_HUGE,
MF_MSG_UNMAP_FAILED,
MF_MSG_DIRTY_SWAPCACHE,
MF_MSG_CLEAN_SWAPCACHE,
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 8291b75f42c8..2d4bf647cf01 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -502,6 +502,7 @@ static const char * const action_page_types[] = {
[MF_MSG_POISONED_HUGE] = "huge page already hardware poisoned",
[MF_MSG_HUGE] = "huge page",
[MF_MSG_FREE_HUGE] = "free huge page",
+ [MF_MSG_NON_PMD_HUGE] = "non-pmd-sized huge page",
[MF_MSG_UNMAP_FAILED] = "unmapping failed page",
[MF_MSG_DIRTY_SWAPCACHE] = "dirty swapcache page",
[MF_MSG_CLEAN_SWAPCACHE] = "clean swapcache page",
@@ -1084,6 +1085,21 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
return 0;
}
+ /*
+ * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so
+ * simply disable it. In order to make it work properly, we need
+ * make sure that:
+ * - conversion of a pud that maps an error hugetlb into hwpoison
+ * entry properly works, and
+ * - other mm code walking over page table is aware of pud-aligned
+ * hwpoison entries.
+ */
+ if (huge_page_size(page_hstate(head)) > PMD_SIZE) {
+ action_result(pfn, MF_MSG_NON_PMD_HUGE, MF_IGNORED);
+ res = -EBUSY;
+ goto out;
+ }
+
if (!hwpoison_user_mappings(p, pfn, flags, &head)) {
action_result(pfn, MF_MSG_UNMAP_FAILED, MF_IGNORED);
res = -EBUSY;
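The new early exit is a plain size gate: anything larger than a PMD-sized huge page is reported and ignored. A minimal user-space sketch of that gate follows, assuming x86-64 sizes (2 MiB PMD, 1 GiB PUD). The TOY_* constants and toy_memory_failure_hugetlb() are hypothetical names for illustration only; the real function does considerably more before and after this check.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed sizes for x86-64 with 4 KiB base pages. */
#define TOY_PMD_SIZE (UINT64_C(1) << 21) /* 2 MiB */
#define TOY_PUD_SIZE (UINT64_C(1) << 30) /* 1 GiB */

/*
 * Return 0 when error handling would proceed, or -EBUSY when the huge
 * page is larger than PMD size and handling is skipped.
 */
int toy_memory_failure_hugetlb(uint64_t huge_page_size)
{
	if (huge_page_size > TOY_PMD_SIZE)
		return -EBUSY;

	return 0;
}
```

With these assumed sizes, a 2 MiB huge page passes the gate and a 1 GiB huge page is rejected with -EBUSY.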
Please consider the upstream commit below for the 4.14.y branch.
Without the fix the configuration mentioned in the commit
message crashes every time immediately at boot. What's even
worse, at least in our setup this crash is completely silent
and the computer just seems to hang, so the user gets no
hint about what actually happened.
commit 10d94ff4d558b96bfc4f55bb0051ae4d938246fe
Author: Rakib Mullick <rakib.mullick(a)gmail.com>
Date: Wed Nov 1 10:14:51 2017 +0600
irq/core: Fix boot crash when the irqaffinity= boot parameter is passed on CPUMASK_OFFSTACK=y kernels(v1)
Jiri Slaby noticed that the backport of upstream commit 25cc72a33835
("mlxsw: spectrum: Forbid linking to devices that have uppers") to
kernel 4.9.y introduced the same check twice in the same function
instead of in two different places.
Fix this by relocating one of the checks to its intended place, thus
preventing unsupported configurations as described in the original
commit.
Fixes: 73ee5a73e75f ("mlxsw: spectrum: Forbid linking to devices that have uppers")
Signed-off-by: Ido Schimmel <idosch(a)mellanox.com>
Reported-by: Jiri Slaby <jslaby(a)suse.cz>
---
Greg, didn't hear from you, so posting v2. Removed the "commit <sha1>
upstream" line from the changelog which I think is what caused the
confusion. Please let me know if further changes are required. Thanks.
---
drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index d50350c7adc4..22a5916e477e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4187,10 +4187,6 @@ static int mlxsw_sp_netdevice_port_upper_event(struct net_device *dev,
if (netif_is_lag_port(dev) && is_vlan_dev(upper_dev) &&
!netif_is_lag_master(vlan_dev_real_dev(upper_dev)))
return -EINVAL;
- if (!info->linking)
- break;
- if (netdev_has_any_upper_dev(upper_dev))
- return -EINVAL;
break;
case NETDEV_CHANGEUPPER:
upper_dev = info->upper_dev;
@@ -4566,6 +4562,8 @@ static int mlxsw_sp_netdevice_vport_event(struct net_device *dev,
return -EINVAL;
if (!info->linking)
break;
+ if (netdev_has_any_upper_dev(upper_dev))
+ return -EINVAL;
/* We can't have multiple VLAN interfaces configured on
* the same port and being members in the same bridge.
*/
--
2.14.4
Please consider the two upstream commits below for the 4.14.y
branch.
As a part of an automated test setup, we deploy a disk image into
various types of hardware. With the current 4.14.y kernel and
certain hardware configurations, the first attempt to write the
image to the disk always fails with 'Remote I/O error'. Retrying
the exact same command then always succeeds. The second patch
below fixes this issue, allowing the first attempt to work. It
requires the first patch in order to compile without errors.
commit 425a4dba7953e35ffd096771973add6d2f40d2ed
Author: Ilya Dryomov <idryomov(a)gmail.com>
Date: Mon Oct 16 15:59:09 2017 +0200
block: factor out __blkdev_issue_zero_pages()
commit d5ce4c31d6df518dd8f63bbae20d7423c5018a6c
Author: Ilya Dryomov <idryomov(a)gmail.com>
Date: Mon Oct 16 15:59:10 2017 +0200
block: cope with WRITE ZEROES failing in blkdev_issue_zeroout()
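The second commit title says blkdev_issue_zeroout() should cope with a failing WRITE ZEROES. One plausible reading, assumed here and not spelled out in this message, is that the offloaded request is tried first and explicit zero-filled writes are used as a fallback. The sketch below models only that fallback idea in user space; struct toy_dev and the toy_* functions are hypothetical, and -EIO stands in for whatever error the device reports.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical block device model; not a kernel structure. */
struct toy_dev {
	int offload_works;     /* nonzero if the offloaded zeroing succeeds */
	unsigned char *data;   /* backing storage */
	size_t len;            /* size of backing storage in bytes */
	int offload_calls;     /* how often the offload path was tried */
	int manual_calls;      /* how often the manual path was used */
};

/* Offloaded zeroing: may fail on some hardware. */
static int toy_write_zeroes_offload(struct toy_dev *d, size_t off, size_t n)
{
	d->offload_calls++;
	if (!d->offload_works)
		return -EIO;
	memset(d->data + off, 0, n);
	return 0;
}

/* Manual zeroing: write zero-filled data explicitly. */
static int toy_write_zero_pages(struct toy_dev *d, size_t off, size_t n)
{
	d->manual_calls++;
	memset(d->data + off, 0, n);
	return 0;
}

/* Zero a range, falling back to manual writes if the offload fails. */
int toy_issue_zeroout(struct toy_dev *d, size_t off, size_t n)
{
	int ret;

	if (off > d->len || n > d->len - off)
		return -EINVAL;

	ret = toy_write_zeroes_offload(d, off, n);
	if (ret == 0)
		return 0;

	/* The offload failed: retry the same range by hand. */
	return toy_write_zero_pages(d, off, n);
}
```

In this model the caller sees success on the first attempt even when the offload path fails, which is the behaviour the report above asks for.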
The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From db6516a5e7ddb6dc72d167b920f2f272596ea22d Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il(a)gmail.com>
Date: Sun, 13 May 2018 22:54:44 -0400
Subject: [PATCH] ext4: do not update s_last_mounted of a frozen fs
If fs is frozen after mount and before the first file open, the
update of s_last_mounted bypasses freeze protection and prints out
a WARNING splat:
$ mount /vdf
$ fsfreeze -f /vdf
$ cat /vdf/foo
[ 31.578555] WARNING: CPU: 1 PID: 1415 at
fs/ext4/ext4_jbd2.c:53 ext4_journal_check_start+0x48/0x82
[ 31.614016] Call Trace:
[ 31.614997] __ext4_journal_start_sb+0xe4/0x1a4
[ 31.616771] ? ext4_file_open+0xb6/0x189
[ 31.618094] ext4_file_open+0xb6/0x189
If fs is frozen, skip s_last_mounted update.
[backport hint: to apply to stable tree, need to apply also patches
vfs: add the sb_start_intwrite_trylock() helper
ext4: factor out helper ext4_sample_last_mounted()]
Cc: stable(a)vger.kernel.org
Fixes: bc0b0d6d69ee ("ext4: update the s_last_mounted field in the superblock")
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Reviewed-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index c48ea76b63e4..7f8023340eb8 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -393,7 +393,7 @@ static int ext4_sample_last_mounted(struct super_block *sb,
if (likely(sbi->s_mount_flags & EXT4_MF_MNTDIR_SAMPLED))
return 0;
- if (sb_rdonly(sb))
+ if (sb_rdonly(sb) || !sb_start_intwrite_trylock(sb))
return 0;
sbi->s_mount_flags |= EXT4_MF_MNTDIR_SAMPLED;
@@ -407,21 +407,25 @@ static int ext4_sample_last_mounted(struct super_block *sb,
path.mnt = mnt;
path.dentry = mnt->mnt_root;
cp = d_path(&path, buf, sizeof(buf));
+ err = 0;
if (IS_ERR(cp))
- return 0;
+ goto out;
handle = ext4_journal_start_sb(sb, EXT4_HT_MISC, 1);
+ err = PTR_ERR(handle);
if (IS_ERR(handle))
- return PTR_ERR(handle);
+ goto out;
BUFFER_TRACE(sbi->s_sbh, "get_write_access");
err = ext4_journal_get_write_access(handle, sbi->s_sbh);
if (err)
- goto out;
+ goto out_journal;
strlcpy(sbi->s_es->s_last_mounted, cp,
sizeof(sbi->s_es->s_last_mounted));
ext4_handle_dirty_super(handle, sb);
-out:
+out_journal:
ext4_journal_stop(handle);
+out:
+ sb_end_intwrite(sb);
return err;
}
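The control flow after this patch is: try to take internal write access, skip the update entirely if the filesystem is frozen, and release the access on every exit path once it has been taken. A minimal user-space sketch of that shape follows. struct toy_sb and the toy_* functions are hypothetical stand-ins, not ext4 or VFS code, and the journal is reduced to an error code passed in by the caller.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical superblock model; not a kernel structure. */
struct toy_sb {
	bool read_only;
	bool frozen;
	bool sampled;           /* plays the role of EXT4_MF_MNTDIR_SAMPLED */
	int  intwrite_holders;  /* outstanding internal write references */
	char last_mounted[64];
};

/* Succeeds only when the filesystem is not frozen; never blocks. */
static bool toy_start_intwrite_trylock(struct toy_sb *sb)
{
	if (sb->frozen)
		return false;
	sb->intwrite_holders++;
	return true;
}

static void toy_end_intwrite(struct toy_sb *sb)
{
	sb->intwrite_holders--;
}

/*
 * Record the mount path once. journal_err simulates a failure to start
 * the journal handle (0 means success).
 */
int toy_sample_last_mounted(struct toy_sb *sb, const char *path,
			    int journal_err)
{
	int err = 0;

	if (sb->sampled)
		return 0;

	/* Frozen or read-only: skip the update without touching anything. */
	if (sb->read_only || !toy_start_intwrite_trylock(sb))
		return 0;

	sb->sampled = true;

	if (!path)
		goto out;

	err = journal_err;
	if (err)
		goto out;

	snprintf(sb->last_mounted, sizeof(sb->last_mounted), "%s", path);
out:
	/* Every path that took write access releases it here. */
	toy_end_intwrite(sb);
	return err;
}
```

The early returns are safe because they happen before the trylock succeeds; everything after it funnels through the out label, which mirrors why the patch turns the remaining "return" statements into "goto out".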
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8bc1379b82b8e809eef77a9fedbb75c6c297be19 Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso(a)mit.edu>
Date: Sat, 16 Jun 2018 23:41:59 -0400
Subject: [PATCH] ext4: avoid running out of journal credits when appending to
an inline file
Use a separate journal transaction if it turns out that we need to
convert an inline file to use a data block. Otherwise we could end
up failing due to not having journal credits.
This addresses CVE-2018-10883.
https://bugzilla.kernel.org/show_bug.cgi?id=200071
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)kernel.org
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 856b6a54d82b..859d6433dcc1 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3013,9 +3013,6 @@ extern int ext4_inline_data_fiemap(struct inode *inode,
struct iomap;
extern int ext4_inline_data_iomap(struct inode *inode, struct iomap *iomap);
-extern int ext4_try_to_evict_inline_data(handle_t *handle,
- struct inode *inode,
- int needed);
extern int ext4_inline_data_truncate(struct inode *inode, int *has_inline);
extern int ext4_convert_inline_data(struct inode *inode);
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index d79115d8d716..851bc552d849 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -887,11 +887,11 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
flags |= AOP_FLAG_NOFS;
if (ret == -ENOSPC) {
+ ext4_journal_stop(handle);
ret = ext4_da_convert_inline_data_to_extent(mapping,
inode,
flags,
fsdata);
- ext4_journal_stop(handle);
if (ret == -ENOSPC &&
ext4_should_retry_alloc(inode->i_sb, &retries))
goto retry_journal;
@@ -1891,42 +1891,6 @@ int ext4_inline_data_fiemap(struct inode *inode,
return (error < 0 ? error : 0);
}
-/*
- * Called during xattr set, and if we can sparse space 'needed',
- * just create the extent tree evict the data to the outer block.
- *
- * We use jbd2 instead of page cache to move data to the 1st block
- * so that the whole transaction can be committed as a whole and
- * the data isn't lost because of the delayed page cache write.
- */
-int ext4_try_to_evict_inline_data(handle_t *handle,
- struct inode *inode,
- int needed)
-{
- int error;
- struct ext4_xattr_entry *entry;
- struct ext4_inode *raw_inode;
- struct ext4_iloc iloc;
-
- error = ext4_get_inode_loc(inode, &iloc);
- if (error)
- return error;
-
- raw_inode = ext4_raw_inode(&iloc);
- entry = (struct ext4_xattr_entry *)((void *)raw_inode +
- EXT4_I(inode)->i_inline_off);
- if (EXT4_XATTR_LEN(entry->e_name_len) +
- EXT4_XATTR_SIZE(le32_to_cpu(entry->e_value_size)) < needed) {
- error = -ENOSPC;
- goto out;
- }
-
- error = ext4_convert_inline_data_nolock(handle, inode, &iloc);
-out:
- brelse(iloc.bh);
- return error;
-}
-
int ext4_inline_data_truncate(struct inode *inode, int *has_inline)
{
handle_t *handle;
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 72377b77fbd7..723df14f4084 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -2212,23 +2212,8 @@ int ext4_xattr_ibody_inline_set(handle_t *handle, struct inode *inode,
if (EXT4_I(inode)->i_extra_isize == 0)
return -ENOSPC;
error = ext4_xattr_set_entry(i, s, handle, inode, false /* is_block */);
- if (error) {
- if (error == -ENOSPC &&
- ext4_has_inline_data(inode)) {
- error = ext4_try_to_evict_inline_data(handle, inode,
- EXT4_XATTR_LEN(strlen(i->name) +
- EXT4_XATTR_SIZE(i->value_len)));
- if (error)
- return error;
- error = ext4_xattr_ibody_find(inode, i, is);
- if (error)
- return error;
- error = ext4_xattr_set_entry(i, s, handle, inode,
- false /* is_block */);
- }
- if (error)
- return error;
- }
+ if (error)
+ return error;
header = IHDR(inode, ext4_raw_inode(&is->iloc));
if (!IS_LAST_ENTRY(s->first)) {
header->h_magic = cpu_to_le32(EXT4_XATTR_MAGIC);