The patch titled
Subject: hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race-v3
has been removed from the -mm tree. Its filename was
hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race-v3.patch
This patch was dropped because it was folded into hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race.patch
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race-v3
Incorporated suggestions from Kirill. Code change to hold i_mmap_rwsem
for duration of copy in copy_hugetlb_page_range. Took i_mmap_rwsem in
hugetlbfs_evict_inode to be consistent with other callers. Other changes
were to documentation/comments.
Link: http://lkml.kernel.org/r/20181222223013.22193-3-mike.kravetz@oracle.com
Cc: <stable(a)vger.kernel.org>
Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Prakash Sangappa <prakash.sangappa(a)oracle.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/fs/hugetlbfs/inode.c~hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race-v3
+++ a/fs/hugetlbfs/inode.c
@@ -462,9 +462,20 @@ static void remove_inode_hugepages(struc
static void hugetlbfs_evict_inode(struct inode *inode)
{
+ struct address_space *mapping = inode->i_mapping;
struct resv_map *resv_map;
+ /*
+ * The vfs layer guarantees that there are no other users of this
+ * inode. Therefore, it would be safe to call remove_inode_hugepages
+ * without holding i_mmap_rwsem. We acquire and hold here to be
+ * consistent with other callers. Since there will be no contention
+ * on the semaphore, overhead is negligible.
+ */
+ i_mmap_lock_write(mapping);
remove_inode_hugepages(inode, 0, LLONG_MAX);
+ i_mmap_unlock_write(mapping);
+
resv_map = (struct resv_map *)inode->i_mapping->private_data;
/* root inode doesn't have the resv_map, so we should check it */
if (resv_map)
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization.patch
hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race.patch
The patch titled
Subject: hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race-v3
has been added to the -mm tree. Its filename is
hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race-v3.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/hugetlbfs-use-i_mmap_rwsem-to-fix-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/hugetlbfs-use-i_mmap_rwsem-to-fix-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race-v3
Incorporated suggestions from Kirill. Code change to hold i_mmap_rwsem
for duration of copy in copy_hugetlb_page_range. Took i_mmap_rwsem in
hugetlbfs_evict_inode to be consistent with other callers. Other changes
were to documentation/comments.
Link: http://lkml.kernel.org/r/20181222223013.22193-3-mike.kravetz@oracle.com
Cc: <stable(a)vger.kernel.org>
Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Prakash Sangappa <prakash.sangappa(a)oracle.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/fs/hugetlbfs/inode.c~hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race-v3
+++ a/fs/hugetlbfs/inode.c
@@ -462,9 +462,20 @@ static void remove_inode_hugepages(struc
static void hugetlbfs_evict_inode(struct inode *inode)
{
+ struct address_space *mapping = inode->i_mapping;
struct resv_map *resv_map;
+ /*
+ * The vfs layer guarantees that there are no other users of this
+ * inode. Therefore, it would be safe to call remove_inode_hugepages
+ * without holding i_mmap_rwsem. We acquire and hold here to be
+ * consistent with other callers. Since there will be no contention
+ * on the semaphore, overhead is negligible.
+ */
+ i_mmap_lock_write(mapping);
remove_inode_hugepages(inode, 0, LLONG_MAX);
+ i_mmap_unlock_write(mapping);
+
resv_map = (struct resv_map *)inode->i_mapping->private_data;
/* root inode doesn't have the resv_map, so we should check it */
if (resv_map)
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization.patch
hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization-fix.patch
hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race.patch
hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race-v3.patch
Hi,
Static analysis with CoverityScan on linux-next detected a potential
null pointer dereference with the following commit:
>From d8a1051ed4ba55679ef24e838a1942c9c40f0a14 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Date: Sat, 22 Dec 2018 10:55:57 +1100
Subject: [PATCH] hugetlbfs: use i_mmap_rwsem for more pmd sharing
The earlier check implies that "mapping" may be a null pointer:
var_compare_op: Comparing mapping to null implies that mapping might be
null.
1008 if (!(flags & MF_MUST_KILL) && !PageDirty(hpage) && mapping &&
1009 mapping_cap_writeback_dirty(mapping)) {
..however later "mapper" is dereferenced when it may be potentially null:
1034 /*
1035 * For hugetlb pages, try_to_unmap could potentially
call
1036 * huge_pmd_unshare. Because of this, take semaphore in
1037 * write mode here and set TTU_RMAP_LOCKED to
indicate we
1038 * have taken the lock at this higer level.
1039 */
CID 1476097 (#1 of 1): Dereference after null check (FORWARD_NULL)
var_deref_model: Passing null pointer mapping to
i_mmap_lock_write, which dereferences it.
1040 i_mmap_lock_write(mapping);
1041 unmap_success = try_to_unmap(hpage,
ttu|TTU_RMAP_LOCKED);
1042 i_mmap_unlock_write(mapping);
Colin
The patch titled
Subject: memcg, oom: notify on oom killer invocation from the charge path
has been added to the -mm tree. Its filename is
memcg-oom-notify-on-oom-killer-invocation-from-the-charge-path.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/memcg-oom-notify-on-oom-killer-inv…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/memcg-oom-notify-on-oom-killer-inv…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Michal Hocko <mhocko(a)suse.com>
Subject: memcg, oom: notify on oom killer invocation from the charge path
Burt Holzman has noticed that memcg v1 doesn't notify about OOM events via
eventfd anymore. The reason is that 29ef680ae7c2 ("memcg, oom: move
out_of_memory back to the charge path") has moved the oom handling back to
the charge path. While doing so the notification was left behind in
mem_cgroup_oom_synchronize.
Fix the issue by replicating the oom hierarchy locking and the
notification.
Link: http://lkml.kernel.org/r/20181224091107.18354-1-mhocko@kernel.org
Fixes: 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the charge path")
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Reported-by: Burt Holzman <burt(a)fnal.gov>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com
Cc: <stable(a)vger.kernel.org> [4.19+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/memcontrol.c~memcg-oom-notify-on-oom-killer-invocation-from-the-charge-path
+++ a/mm/memcontrol.c
@@ -1673,6 +1673,9 @@ enum oom_status {
static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
{
+ enum oom_status ret;
+ bool locked;
+
if (order > PAGE_ALLOC_COSTLY_ORDER)
return OOM_SKIPPED;
@@ -1707,10 +1710,23 @@ static enum oom_status mem_cgroup_oom(st
return OOM_ASYNC;
}
+ mem_cgroup_mark_under_oom(memcg);
+
+ locked = mem_cgroup_oom_trylock(memcg);
+
+ if (locked)
+ mem_cgroup_oom_notify(memcg);
+
+ mem_cgroup_unmark_under_oom(memcg);
if (mem_cgroup_out_of_memory(memcg, mask, order))
- return OOM_SUCCESS;
+ ret = OOM_SUCCESS;
+ else
+ ret = OOM_FAILED;
+
+ if (locked)
+ mem_cgroup_oom_unlock(memcg);
- return OOM_FAILED;
+ return ret;
}
/**
_
Patches currently in -mm which might be from mhocko(a)suse.com are
mm-memcg-fix-reclaim-deadlock-with-writeback.patch
mm-print-more-information-about-mapping-in-__dump_page.patch
mm-lower-the-printk-loglevel-for-__dump_page-messages.patch
mm-memory_hotplug-drop-pointless-block-alignment-checks-from-__offline_pages.patch
mm-memory_hotplug-print-reason-for-the-offlining-failure.patch
mm-memory_hotplug-be-more-verbose-for-memory-offline-failures.patch
mm-memory_hotplug-be-more-verbose-for-memory-offline-failures-update.patch
mm-only-report-isolation-failures-when-offlining-memory.patch
mm-memory_hotplug-do-not-clear-numa_node-association-after-hot_remove.patch
hwpoison-memory_hotplug-allow-hwpoisoned-pages-to-be-offlined.patch
mm-proc-be-more-verbose-about-unstable-vma-flags-in-proc-pid-smaps.patch
mm-thp-proc-report-thp-eligibility-for-each-vma.patch
mm-proc-report-pr_set_thp_disable-in-proc.patch
mm-memory_hotplug-try-to-migrate-full-pfn-range.patch
mm-memory_hotplug-deobfuscate-migration-part-of-offlining.patch
mm-fault_around-do-not-take-a-reference-to-a-locked-page.patch
memory_hotplug-add-missing-newlines-to-debugging-output.patch
memcg-oom-notify-on-oom-killer-invocation-from-the-charge-path.patch
Martin,
Right now you get a message
"BUG: non-zero pgtables_bytes on freeing mm: -16384"
for EVERY process that exits in 4.19.5 and later.
bisect points to
commit 4136161d676a93fc8df6bdb80d720c15522d6c24
Author: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Date: Mon Oct 15 11:09:16 2018 +0200
s390/mm: fix mis-accounting of pgtable_bytes
[ Upstream commit e12e4044aede97974f2222eb7f0ed726a5179a32 ]
Turns out that this patch requires several dependencies so the autoselection of this
patch was missing that.
Can we either revert this patch or add the dependencies?
Christian
From: Richard Weinberger <richard(a)nod.at>
commit e58725d51fa8da9133f3f1c54170aa2e43056b91 upstream.
UBIFS's recovery code strictly assumes that a deleted inode will never
come back, therefore it removes all data which belongs to that inode
as soon it faces an inode with link count 0 in the replay list.
Before O_TMPFILE this assumption was perfectly fine. With O_TMPFILE
it can lead to data loss upon a power-cut.
Consider a journal with entries like:
0: inode X (nlink = 0) /* O_TMPFILE was created */
1: data for inode X /* Someone writes to the temp file */
2: inode X (nlink = 0) /* inode was changed, xattr, chmod, … */
3: inode X (nlink = 1) /* inode was re-linked via linkat() */
Upon replay of entry #2 UBIFS will drop all data that belongs to inode X,
this will lead to an empty file after mounting.
As solution for this problem, scan the replay list for a re-link entry
before dropping data.
Fixes: 474b93704f32 ("ubifs: Implement O_TMPFILE")
Cc: stable(a)vger.kernel.org # 4.9-4.18
Cc: Russell Senior <russell(a)personaltelco.net>
Cc: Rafał Miłecki <zajec5(a)gmail.com>
Reported-by: Russell Senior <russell(a)personaltelco.net>
Reported-by: Rafał Miłecki <zajec5(a)gmail.com>
Tested-by: Rafał Miłecki <rafal(a)milecki.pl>
Signed-off-by: Richard Weinberger <richard(a)nod.at>
[rmilecki: update ubifs_assert() calls to compile with 4.18 and older]
Signed-off-by: Rafał Miłecki <rafal(a)milecki.pl>
(cherry picked from commit e58725d51fa8da9133f3f1c54170aa2e43056b91)
---
fs/ubifs/replay.c | 37 +++++++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/fs/ubifs/replay.c b/fs/ubifs/replay.c
index ae5c02f22f3e..d998fbf7de30 100644
--- a/fs/ubifs/replay.c
+++ b/fs/ubifs/replay.c
@@ -210,6 +210,38 @@ static int trun_remove_range(struct ubifs_info *c, struct replay_entry *r)
}
/**
+ * inode_still_linked - check whether inode in question will be re-linked.
+ * @c: UBIFS file-system description object
+ * @rino: replay entry to test
+ *
+ * O_TMPFILE files can be re-linked, this means link count goes from 0 to 1.
+ * This case needs special care, otherwise all references to the inode will
+ * be removed upon the first replay entry of an inode with link count 0
+ * is found.
+ */
+static bool inode_still_linked(struct ubifs_info *c, struct replay_entry *rino)
+{
+ struct replay_entry *r;
+
+ ubifs_assert(rino->deletion);
+ ubifs_assert(key_type(c, &rino->key) == UBIFS_INO_KEY);
+
+ /*
+ * Find the most recent entry for the inode behind @rino and check
+ * whether it is a deletion.
+ */
+ list_for_each_entry_reverse(r, &c->replay_list, list) {
+ ubifs_assert(r->sqnum >= rino->sqnum);
+ if (key_inum(c, &r->key) == key_inum(c, &rino->key))
+ return r->deletion == 0;
+
+ }
+
+ ubifs_assert(0);
+ return false;
+}
+
+/**
* apply_replay_entry - apply a replay entry to the TNC.
* @c: UBIFS file-system description object
* @r: replay entry to apply
@@ -239,6 +271,11 @@ static int apply_replay_entry(struct ubifs_info *c, struct replay_entry *r)
{
ino_t inum = key_inum(c, &r->key);
+ if (inode_still_linked(c, r)) {
+ err = 0;
+ break;
+ }
+
err = ubifs_tnc_remove_ino(c, inum);
break;
}
--
2.13.7
Hello,
even though the subject sounds harmless it fixes a real bug.
(Yes, I'm aware that commit isn't in Linus' tree yet, but I assume your
tracking of patches targetting stable is better than mine, so I didn't
wait :-)
For backporting you also need its parent commit (i.e. e697271c4e29
("spi: imx: add a device specific prepare_message callback")). I have a
local backport to v4.14 here which isn't entirely trivial. So just tell
me if you need help.
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
在 2018-12-27四的 22:34 +0800,Icenowy Zheng写道:
> The SMI SM3350 USB-UFS bridge controller cannot handle long sense
> request
> correctly and will make the chip refuse to do read/write when
> requested
> long sense.
>
> Add a bad sense quirk for it.
>
> Signed-off-by: Icenowy Zheng <icenowy(a)aosc.io>
> ---
I forgot to:
Cc: stable(a)vger.kernel.org
> drivers/usb/storage/unusual_devs.h | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/usb/storage/unusual_devs.h
> b/drivers/usb/storage/unusual_devs.h
> index f7f83b21dc74..ea0d27a94afe 100644
> --- a/drivers/usb/storage/unusual_devs.h
> +++ b/drivers/usb/storage/unusual_devs.h
> @@ -1265,6 +1265,18 @@ UNUSUAL_DEV( 0x090c, 0x1132, 0x0000, 0xffff,
> USB_SC_DEVICE, USB_PR_DEVICE, NULL,
> US_FL_FIX_CAPACITY ),
>
> +/*
> + * Reported by Icenowy Zheng <icenowy(a)aosc.io>
> + * The SMI SM3350 USB-UFS bridge controller will enter a wrong state
> + * that do not process read/write command if a long sense is
> requested,
> + * so force to use 18-byte sense.
> + */
> +UNUSUAL_DEV( 0x090c, 0x3350, 0x0000, 0xffff,
> + "SMI",
> + "SM3350 UFS-to-USB-Mass-Storage bridge",
> + USB_SC_DEVICE, USB_PR_DEVICE, NULL,
> + US_FL_BAD_SENSE ),
> +
> /*
> * Reported by Paul Hartman <paul.hartman+linux(a)gmail.com>
> * This card reader returns "Illegal Request, Logical Block Address
在 2018-12-27四的 22:34 +0800,Icenowy Zheng写道:
> Currently the code will set US_FL_SANE_SENSE flag unconditionally if
> device claims SPC3+, however we should allow US_FL_BAD_SENSE flag to
> prevent this behavior, because SMI SM3350 UFS-USB bridge controller,
> which claims SPC4, will show strange behavior with 96-byte sense
> (put the chip into a wrong state that cannot read/write anything).
>
> Check the presence of US_FL_BAD_SENSE when assuming US_FL_SANE_SENSE
> on
> SPC4+ devices.
>
> Signed-off-by: Icenowy Zheng <icenowy(a)aosc.io>
> ---
I forgot to:
Cc: stable(a)vger.kernel.org
> drivers/usb/storage/scsiglue.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/usb/storage/scsiglue.c
> b/drivers/usb/storage/scsiglue.c
> index fde2e71a6ade..699fe9557127 100644
> --- a/drivers/usb/storage/scsiglue.c
> +++ b/drivers/usb/storage/scsiglue.c
> @@ -236,7 +236,8 @@ static int slave_configure(struct scsi_device
> *sdev)
> sdev->try_rc_10_first = 1;
>
> /* assume SPC3 or latter devices support sense size >
> 18 */
> - if (sdev->scsi_level > SCSI_SPC_2)
> + if (sdev->scsi_level > SCSI_SPC_2 &&
> + !(us->fflags & US_FL_BAD_SENSE))
> us->fflags |= US_FL_SANE_SENSE;
>
> /*