- Linux-stable-mirror - lists.linaro.org

+ slab-alien-caches-must-not-be-initialized-if-the-allocation-of-the-alien-cache-failed.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: slab: alien caches must not be initialized if the allocation of the alien cache failed has been added to the -mm tree. Its filename is slab-alien-caches-must-not-be-initialized-if-the-allocation-of-the-alien-cache-failed.patch This patch should soon appear at http://ozlabs.org/~akpm/mmots/broken-out/slab-alien-caches-must-not-be-init… and later at http://ozlabs.org/~akpm/mmotm/broken-out/slab-alien-caches-must-not-be-init… Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next and is updated there every 3-4 working days ------------------------------------------------------ From: Christoph Lameter <cl(a)linux.com> Subject: slab: alien caches must not be initialized if the allocation of the alien cache failed Callers of __alloc_alien() check for NULL. We must do the same check in __alloc_alien_cache to avoid NULL pointer dereferences on allocation failures. Link: http://lkml.kernel.org/r/010001680f42f192-82b4e12e-1565-4ee0-ae1f-1e9897490… Signed-off-by: Christoph Lameter <cl(a)linux.com> Reported-by: syzbot+d6ed4ec679652b4fd4e4(a)syzkaller.appspotmail.com Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org> Cc: Pekka Enberg <penberg(a)kernel.org> Cc: David Rientjes <rientjes(a)google.com> Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/slab.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) --- a/mm/slab.c~slab-alien-caches-must-not-be-initialized-if-the-allocation-of-the-alien-cache-failed +++ a/mm/slab.c @@ -666,8 +666,10 @@ static struct alien_cache *__alloc_alien struct alien_cache *alc = NULL; alc = kmalloc_node(memsize, gfp, node); - init_arraycache(&alc->ac, entries, batch); - spin_lock_init(&alc->lock); + if (alc) { + init_arraycache(&alc->ac, entries, batch); + spin_lock_init(&alc->lock); + } return alc; } _ Patches currently in -mm which might be from cl(a)linux.com are slab-alien-caches-must-not-be-initialized-if-the-allocation-of-the-alien-cache-failed.patch

6 years, 8 months

1
0
0 0

+ fork-memcg-fix-cached_stacks-case.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: fork, memcg: fix cached_stacks case has been added to the -mm tree. Its filename is fork-memcg-fix-cached_stacks-case.patch This patch should soon appear at http://ozlabs.org/~akpm/mmots/broken-out/fork-memcg-fix-cached_stacks-case.… and later at http://ozlabs.org/~akpm/mmotm/broken-out/fork-memcg-fix-cached_stacks-case.… Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next and is updated there every 3-4 working days ------------------------------------------------------ From: Shakeel Butt <shakeelb(a)google.com> Subject: fork, memcg: fix cached_stacks case 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge fail") fixes a crash caused due to failed memcg charge of the kernel stack. However the fix misses the cached_stacks case which this patch fixes. So, the same crash can happen if the memcg charge of a cached stack is failed. Link: http://lkml.kernel.org/r/20190102180145.57406-1-shakeelb@google.com Fixes: 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge fail") Signed-off-by: Shakeel Butt <shakeelb(a)google.com> Cc: Rik van Riel <riel(a)surriel.com> Cc: Roman Gushchin <guro(a)fb.com> Cc: Michal Hocko <mhocko(a)suse.com> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Tejun Heo <tj(a)kernel.org> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- kernel/fork.c | 1 + 1 file changed, 1 insertion(+) --- a/kernel/fork.c~fork-memcg-fix-cached_stacks-case +++ a/kernel/fork.c @@ -221,6 +221,7 @@ static unsigned long *alloc_thread_stack memset(s->addr, 0, THREAD_SIZE); tsk->stack_vm_area = s; + tsk->stack = s->addr; return s->addr; } _ Patches currently in -mm which might be from shakeelb(a)google.com are fork-memcg-fix-cached_stacks-case.patch

6 years, 8 months

1
0
0 0

[PATCH 4.14 0/4] netfilter: xt_connlimit: backport upstream fixes for race in connection counting

by Mauricio Faria de Oliveira

Recently, Alakesh Haloi reported the following issue [1] with stable/4.14: """ An iptable rule like the following on a multicore systems will result in accepting more connections than set in the rule. iptables -A INPUT -p tcp -m tcp --syn --dport 7777 -m connlimit \ --connlimit-above 2000 --connlimit-mask 0 -j DROP """ And proposed a fix that is not in Linus's tree. The discussion went on to confirm whether the issue was still reproducible with mainline/nf.git tip, and to either identify the upstream fix or re-submit the non-upstream fix. Alakesh eventually was able to test with upstream, and reported that issue was still reproducible [2]. On that, our findinds diverge, at least in my test environment: First, I verified that the suggested mainline fix for the issue [3] indeed fixes it, by testing with it applied and reverted on v4.18, a clean revert. (The issue is reproducible with the commit reverted). Then, with a consistent reproducer, I moved to nf.git, with HEAD on commit a007232 ("netfilter: nf_conncount: fix argument order to find_next_bit"), and the issues was not reproducible (even with 20+ threads on client side, the number Alakesh reported to achieve 2150+ connections [4], and I tried spreading the network interface IRQ affinity over more and more CPUs too.) Either way, the suggested mainline fix does actually fix the issue in 4.14 for at least one environment. So, it might well be the case that Alakesh's test environment has differences/subtleties that leads to more connections accepted, and more commits are needed for that particular environment type. But for now, with one bare-metal environment (24-core server, 4-core client) verified, I thought of submitting the patches for review/comments/testing, then looking for additional fixes for that environment separately. The fix is PATCH 4/4, and PATCHes 1-3/4 are helpers for a cleaner backport. All backports are simple, and essentially consist of refresh context lines and use older struct/file names. Reviews from netfilter maintainers are very appreciated, as I've no previous experience in this area, and although the backports look simple and build/run correctly, there's usually stuff that only more experienced people may notice. Thanks, Mauricio Links: ===== [1] https://www.spinics.net/lists/stable/msg270040.html [2] https://www.spinics.net/lists/stable/msg273669.html [3] https://www.spinics.net/lists/stable/msg271300.html [4] https://www.spinics.net/lists/stable/msg273669.html Test-case: ========= - v4.14.91 (original): client achieves 2000+ connections (6000 target) with 3 threads. server # iptables -F server # iptables -A INPUT -p tcp -m tcp --syn --dport 7777 -m connlimit --connlimit-above 2000 --connlimit-mask 0 -j DROP server # iptables -L Chain INPUT (policy ACCEPT) target prot opt source destination DROP tcp -- anywhere anywhere tcp dpt:7777 flags:FIN,SYN,RST,ACK/SYN #conn src/0 > 2000 Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination server # ulimit -SHn 65000 server # ruby server.rb <... listening ...> client # ulimit -SHn 65000 client # ruby client.rb 10.230.56.100 7777 6000 3 Connecting to ["10.230.56.100"]:7777 6000 times with 3 1 2 3 <...> 2000 <...> 6000 Target reached. Thread finishing 6001 Target reached. Thread finishing 6002 Target reached. Thread finishing Threads done. 6002 connections press enter to exit - v4.14.91 + patches: client only achieved 2000 connections. server # (same procedure) client # (same procedure) Connecting to ["10.230.56.100"]:7777 6000 times with 3 1 2 3 <...> 2000 <... blocked for a while...> failed to create connection: Connection timed out - connect(2) for "10.230.56.100" port 7777 failed to create connection: Connection timed out - connect(2) for "10.230.56.100" port 7777 failed to create connection: Connection timed out - connect(2) for "10.230.56.100" port 7777 Threads done. 2000 connections press enter to exit Florian Westphal (2): netfilter: xt_connlimit: don't store address in the conn nodes netfilter: nf_conncount: fix garbage collection confirm race Pablo Neira Ayuso (1): netfilter: nf_conncount: expose connection list interface Yi-Hung Wei (1): netfilter: nf_conncount: Fix garbage collection with zones include/net/netfilter/nf_conntrack_count.h | 15 +++++ net/netfilter/xt_connlimit.c | 99 +++++++++++++++++++++++------- 2 files changed, 91 insertions(+), 23 deletions(-) create mode 100644 include/net/netfilter/nf_conntrack_count.h -- 2.7.4

6 years, 8 months

2
8
0 0

[merged] lib-dont-depend-on-linux-headers-being-installed.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: lib/gen_crc64table.c: don't depend on linux headers being installed has been removed from the -mm tree. Its filename was lib-dont-depend-on-linux-headers-being-installed.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: NeilBrown <neil(a)brown.name> Subject: lib/gen_crc64table.c: don't depend on linux headers being installed gen_crc64table requires linux include files to be installed in /usr/include/linux. This is a new requrement so hosts that could previously build the kernel, now cannot. gen_crc64table makes this requirement by including <linux/swab.h>, but nothing from that header is actually used. So remove the #include, so that the linux headers no longer need to be installed. Link: http://lkml.kernel.org/r/87y3899gzi.fsf@notabene.neil.brown.name Fixes: feba04fd2cf8 ("lib: add crc64 calculation routines") Signed-off-by: NeilBrown <neil(a)brown.name> Acked-by: Coly Li <colyli(a)suse.de> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- --- a/lib/gen_crc64table.c~lib-dont-depend-on-linux-headers-being-installed +++ a/lib/gen_crc64table.c @@ -16,8 +16,6 @@ #include <inttypes.h> #include <stdio.h> -#include <linux/swab.h> - #define CRC64_ECMA182_POLY 0x42F0E1EBA9EA3693ULL static uint64_t crc64_table[256] = {0}; _ Patches currently in -mm which might be from neil(a)brown.name are

6 years, 8 months

1
0
0 0

[merged] memcg-oom-notify-on-oom-killer-invocation-from-the-charge-path.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: memcg, oom: notify on oom killer invocation from the charge path has been removed from the -mm tree. Its filename was memcg-oom-notify-on-oom-killer-invocation-from-the-charge-path.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Michal Hocko <mhocko(a)suse.com> Subject: memcg, oom: notify on oom killer invocation from the charge path Burt Holzman has noticed that memcg v1 doesn't notify about OOM events via eventfd anymore. The reason is that 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the charge path") has moved the oom handling back to the charge path. While doing so the notification was left behind in mem_cgroup_oom_synchronize. Fix the issue by replicating the oom hierarchy locking and the notification. Link: http://lkml.kernel.org/r/20181224091107.18354-1-mhocko@kernel.org Fixes: 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the charge path") Signed-off-by: Michal Hocko <mhocko(a)suse.com> Reported-by: Burt Holzman <burt(a)fnal.gov> Acked-by: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com Cc: <stable(a)vger.kernel.org> [4.19+] Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- --- a/mm/memcontrol.c~memcg-oom-notify-on-oom-killer-invocation-from-the-charge-path +++ a/mm/memcontrol.c @@ -1673,6 +1673,9 @@ enum oom_status { static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order) { + enum oom_status ret; + bool locked; + if (order > PAGE_ALLOC_COSTLY_ORDER) return OOM_SKIPPED; @@ -1707,10 +1710,23 @@ static enum oom_status mem_cgroup_oom(st return OOM_ASYNC; } + mem_cgroup_mark_under_oom(memcg); + + locked = mem_cgroup_oom_trylock(memcg); + + if (locked) + mem_cgroup_oom_notify(memcg); + + mem_cgroup_unmark_under_oom(memcg); if (mem_cgroup_out_of_memory(memcg, mask, order)) - return OOM_SUCCESS; + ret = OOM_SUCCESS; + else + ret = OOM_FAILED; + + if (locked) + mem_cgroup_oom_unlock(memcg); - return OOM_FAILED; + return ret; } /** _ Patches currently in -mm which might be from mhocko(a)suse.com are mm-memcg-fix-reclaim-deadlock-with-writeback.patch

6 years, 8 months

1
0
0 0

[merged] mm-swap-fix-swapoff-with-ksm-pages.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: mm, swap: fix swapoff with KSM pages has been removed from the -mm tree. Its filename was mm-swap-fix-swapoff-with-ksm-pages.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Huang Ying <ying.huang(a)intel.com> Subject: mm, swap: fix swapoff with KSM pages KSM pages may be mapped to the multiple VMAs that cannot be reached from one anon_vma. So during swapin, a new copy of the page need to be generated if a different anon_vma is needed, please refer to comments of ksm_might_need_to_copy() for details. During swapoff, unuse_vma() uses anon_vma (if available) to locate VMA and virtual address mapped to the page, so not all mappings to a swapped out KSM page could be found. So in try_to_unuse(), even if the swap count of a swap entry isn't zero, the page needs to be deleted from swap cache, so that, in the next round a new page could be allocated and swapin for the other mappings of the swapped out KSM page. But this contradicts with the THP swap support. Where the THP could be deleted from swap cache only after the swap count of every swap entry in the huge swap cluster backing the THP has reach 0. So try_to_unuse() is changed in commit e07098294adf ("mm, THP, swap: support to reclaim swap space for THP swapped out") to check that before delete a page from swap cache, but this has broken KSM swapoff too. Fortunately, KSM is for the normal pages only, so the original behavior for KSM pages could be restored easily via checking PageTransCompound(). That is how this patch works. The bug is introduced by e07098294adf ("mm, THP, swap: support to reclaim swap space for THP swapped out"), which is merged by v4.14-rc1. So I think we should backport the fix to from 4.14 on. But Hugh thinks it may be rare for the KSM pages being in the swap device when swapoff, so nobody reports the bug so far. Link: http://lkml.kernel.org/r/20181226051522.28442-1-ying.huang@intel.com Fixes: e07098294adf ("mm, THP, swap: support to reclaim swap space for THP swapped out") Signed-off-by: "Huang, Ying" <ying.huang(a)intel.com> Reported-by: Hugh Dickins <hughd(a)google.com> Tested-by: Hugh Dickins <hughd(a)google.com> Acked-by: Hugh Dickins <hughd(a)google.com> Cc: Rik van Riel <riel(a)redhat.com> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Minchan Kim <minchan(a)kernel.org> Cc: Shaohua Li <shli(a)kernel.org> Cc: Daniel Jordan <daniel.m.jordan(a)oracle.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- --- a/mm/swapfile.c~mm-swap-fix-swapoff-with-ksm-pages +++ a/mm/swapfile.c @@ -2197,7 +2197,8 @@ int try_to_unuse(unsigned int type, bool */ if (PageSwapCache(page) && likely(page_private(page) == entry.val) && - !page_swapped(page)) + (!PageTransCompound(page) || + !swap_page_trans_huge_swapped(si, entry))) delete_from_swap_cache(compound_head(page)); /* _ Patches currently in -mm which might be from ying.huang(a)intel.com are mm-swap-fix-race-between-swapoff-and-some-swap-operations.patch mm-swap-fix-race-between-swapoff-and-some-swap-operations-v6.patch mm-fix-race-between-swapoff-and-mincore.patch

6 years, 8 months

1
0
0 0

[merged] hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race has been removed from the -mm tree. Its filename was hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Mike Kravetz <mike.kravetz(a)oracle.com> Subject: hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race hugetlbfs page faults can race with truncate and hole punch operations. Current code in the page fault path attempts to handle this by 'backing out' operations if we encounter the race. One obvious omission in the current code is removing a page newly added to the page cache. This is pretty straight forward to address, but there is a more subtle and difficult issue of backing out hugetlb reservations. To handle this correctly, the 'reservation state' before page allocation needs to be noted so that it can be properly backed out. There are four distinct possibilities for reservation state: shared/reserved, shared/no-resv, private/reserved and private/no-resv. Backing out a reservation may require memory allocation which could fail so that needs to be taken into account as well. Instead of writing the required complicated code for this rare occurrence, just eliminate the race. i_mmap_rwsem is now held in read mode for the duration of page fault processing. Hold i_mmap_rwsem longer in truncation and hold punch code to cover the call to remove_inode_hugepages. With this modification, code in remove_inode_hugepages checking for races becomes 'dead' as it can not longer happen. Remove the dead code and expand comments to explain reasoning. Similarly, checks for races with truncation in the page fault path can be simplified and removed. [mike.kravetz(a)oracle.com: incorporat suggestions from Kirill] Link: http://lkml.kernel.org/r/20181222223013.22193-3-mike.kravetz@oracle.com Link: http://lkml.kernel.org/r/20181218223557.5202-3-mike.kravetz@oracle.com Fixes: ebed4bfc8da8 ("hugetlb: fix absurd HugePages_Rsvd") Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com> Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com> Cc: Michal Hocko <mhocko(a)kernel.org> Cc: Hugh Dickins <hughd(a)google.com> Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar(a)linux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange(a)redhat.com> Cc: Davidlohr Bueso <dave(a)stgolabs.net> Cc: Prakash Sangappa <prakash.sangappa(a)oracle.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- --- a/fs/hugetlbfs/inode.c~hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race +++ a/fs/hugetlbfs/inode.c @@ -383,17 +383,16 @@ hugetlb_vmdelete_list(struct rb_root_cac * truncation is indicated by end of range being LLONG_MAX * In this case, we first scan the range and release found pages. * After releasing pages, hugetlb_unreserve_pages cleans up region/reserv - * maps and global counts. Page faults can not race with truncation - * in this routine. hugetlb_no_page() prevents page faults in the - * truncated range. It checks i_size before allocation, and again after - * with the page table lock for the page held. The same lock must be - * acquired to unmap a page. + * maps and global counts. * hole punch is indicated if end is not LLONG_MAX * In the hole punch case we scan the range and release found pages. * Only when releasing a page is the associated region/reserv map * deleted. The region/reserv map for ranges without associated - * pages are not modified. Page faults can race with hole punch. - * This is indicated if we find a mapped page. + * pages are not modified. + * + * Callers of this routine must hold the i_mmap_rwsem in write mode to prevent + * races with page faults. + * * Note: If the passed end of range value is beyond the end of file, but * not LLONG_MAX this routine still performs a hole punch operation. */ @@ -423,32 +422,14 @@ static void remove_inode_hugepages(struc for (i = 0; i < pagevec_count(&pvec); ++i) { struct page *page = pvec.pages[i]; - u32 hash; index = page->index; - hash = hugetlb_fault_mutex_hash(h, current->mm, - &pseudo_vma, - mapping, index, 0); - mutex_lock(&hugetlb_fault_mutex_table[hash]); - /* - * If page is mapped, it was faulted in after being - * unmapped in caller. Unmap (again) now after taking - * the fault mutex. The mutex will prevent faults - * until we finish removing the page. - * - * This race can only happen in the hole punch case. - * Getting here in a truncate operation is a bug. + * A mapped page is impossible as callers should unmap + * all references before calling. And, i_mmap_rwsem + * prevents the creation of additional mappings. */ - if (unlikely(page_mapped(page))) { - BUG_ON(truncate_op); - - i_mmap_lock_write(mapping); - hugetlb_vmdelete_list(&mapping->i_mmap, - index * pages_per_huge_page(h), - (index + 1) * pages_per_huge_page(h)); - i_mmap_unlock_write(mapping); - } + VM_BUG_ON(page_mapped(page)); lock_page(page); /* @@ -470,7 +451,6 @@ static void remove_inode_hugepages(struc } unlock_page(page); - mutex_unlock(&hugetlb_fault_mutex_table[hash]); } huge_pagevec_release(&pvec); cond_resched(); @@ -482,9 +462,20 @@ static void remove_inode_hugepages(struc static void hugetlbfs_evict_inode(struct inode *inode) { + struct address_space *mapping = inode->i_mapping; struct resv_map *resv_map; + /* + * The vfs layer guarantees that there are no other users of this + * inode. Therefore, it would be safe to call remove_inode_hugepages + * without holding i_mmap_rwsem. We acquire and hold here to be + * consistent with other callers. Since there will be no contention + * on the semaphore, overhead is negligible. + */ + i_mmap_lock_write(mapping); remove_inode_hugepages(inode, 0, LLONG_MAX); + i_mmap_unlock_write(mapping); + resv_map = (struct resv_map *)inode->i_mapping->private_data; /* root inode doesn't have the resv_map, so we should check it */ if (resv_map) @@ -505,8 +496,8 @@ static int hugetlb_vmtruncate(struct ino i_mmap_lock_write(mapping); if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)) hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0); - i_mmap_unlock_write(mapping); remove_inode_hugepages(inode, offset, LLONG_MAX); + i_mmap_unlock_write(mapping); return 0; } @@ -540,8 +531,8 @@ static long hugetlbfs_punch_hole(struct hugetlb_vmdelete_list(&mapping->i_mmap, hole_start >> PAGE_SHIFT, hole_end >> PAGE_SHIFT); - i_mmap_unlock_write(mapping); remove_inode_hugepages(inode, hole_start, hole_end); + i_mmap_unlock_write(mapping); inode_unlock(inode); } @@ -624,7 +615,11 @@ static long hugetlbfs_fallocate(struct f /* addr is the offset within the file (zero based) */ addr = index * hpage_size; - /* mutex taken here, fault path and hole punch */ + /* + * fault mutex taken here, protects against fault path + * and hole punch. inode_lock previously taken protects + * against truncation. + */ hash = hugetlb_fault_mutex_hash(h, mm, &pseudo_vma, mapping, index, addr); mutex_lock(&hugetlb_fault_mutex_table[hash]); --- a/mm/hugetlb.c~hugetlbfs-use-i_mmap_rwsem-to-fix-page-fault-truncate-race +++ a/mm/hugetlb.c @@ -3755,16 +3755,16 @@ static vm_fault_t hugetlb_no_page(struct } /* - * Use page lock to guard against racing truncation - * before we get page_table_lock. + * We can not race with truncation due to holding i_mmap_rwsem. + * Check once here for faults beyond end of file. */ + size = i_size_read(mapping->host) >> huge_page_shift(h); + if (idx >= size) + goto out; + retry: page = find_lock_page(mapping, idx); if (!page) { - size = i_size_read(mapping->host) >> huge_page_shift(h); - if (idx >= size) - goto out; - /* * Check for page in userfault range */ @@ -3854,9 +3854,6 @@ retry: } ptl = huge_pte_lock(h, mm, ptep); - size = i_size_read(mapping->host) >> huge_page_shift(h); - if (idx >= size) - goto backout; ret = 0; if (!huge_pte_none(huge_ptep_get(ptep))) @@ -3959,8 +3956,10 @@ vm_fault_t hugetlb_fault(struct mm_struc /* * Acquire i_mmap_rwsem before calling huge_pte_alloc and hold - * until finished with ptep. This prevents huge_pmd_unshare from - * being called elsewhere and making the ptep no longer valid. + * until finished with ptep. This serves two purposes: + * 1) It prevents huge_pmd_unshare from being called elsewhere + * and making the ptep no longer valid. + * 2) It synchronizes us with file truncation. * * ptep could have already be assigned via huge_pte_offset. That * is OK, as huge_pte_alloc will return the same value unless _ Patches currently in -mm which might be from mike.kravetz(a)oracle.com are

6 years, 8 months

1
0
0 0

[merged] hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization has been removed from the -mm tree. Its filename was hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Mike Kravetz <mike.kravetz(a)oracle.com> Subject: hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization While looking at BUGs associated with invalid huge page map counts, it was discovered and observed that a huge pte pointer could become 'invalid' and point to another task's page table. Consider the following: A task takes a page fault on a shared hugetlbfs file and calls huge_pte_alloc to get a ptep. Suppose the returned ptep points to a shared pmd. Now, another task truncates the hugetlbfs file. As part of truncation, it unmaps everyone who has the file mapped. If the range being truncated is covered by a shared pmd, huge_pmd_unshare will be called. For all but the last user of the shared pmd, huge_pmd_unshare will clear the pud pointing to the pmd. If the task in the middle of the page fault is not the last user, the ptep returned by huge_pte_alloc now points to another task's page table or worse. This leads to bad things such as incorrect page map/reference counts or invalid memory references. To fix, expand the use of i_mmap_rwsem as follows: - i_mmap_rwsem is held in read mode whenever huge_pmd_share is called. huge_pmd_share is only called via huge_pte_alloc, so callers of huge_pte_alloc take i_mmap_rwsem before calling. In addition, callers of huge_pte_alloc continue to hold the semaphore until finished with the ptep. - i_mmap_rwsem is held in write mode whenever huge_pmd_unshare is called. [mike.kravetz(a)oracle.com: add explicit check for mapping != null] Link: http://lkml.kernel.org/r/20181218223557.5202-2-mike.kravetz@oracle.com Fixes: 39dde65c9940 ("shared page table for hugetlb page") Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com> Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com> Cc: Michal Hocko <mhocko(a)kernel.org> Cc: Hugh Dickins <hughd(a)google.com> Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar(a)linux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange(a)redhat.com> Cc: Davidlohr Bueso <dave(a)stgolabs.net> Cc: Prakash Sangappa <prakash.sangappa(a)oracle.com> Cc: Colin Ian King <colin.king(a)canonical.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- --- a/mm/hugetlb.c~hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization +++ a/mm/hugetlb.c @@ -3238,6 +3238,7 @@ int copy_hugetlb_page_range(struct mm_st struct page *ptepage; unsigned long addr; int cow; + struct address_space *mapping = vma->vm_file->f_mapping; struct hstate *h = hstate_vma(vma); unsigned long sz = huge_page_size(h); struct mmu_notifier_range range; @@ -3249,13 +3250,23 @@ int copy_hugetlb_page_range(struct mm_st mmu_notifier_range_init(&range, src, vma->vm_start, vma->vm_end); mmu_notifier_invalidate_range_start(&range); + } else { + /* + * For shared mappings i_mmap_rwsem must be held to call + * huge_pte_alloc, otherwise the returned ptep could go + * away if part of a shared pmd and another thread calls + * huge_pmd_unshare. + */ + i_mmap_lock_read(mapping); } for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) { spinlock_t *src_ptl, *dst_ptl; + src_pte = huge_pte_offset(src, addr, sz); if (!src_pte) continue; + dst_pte = huge_pte_alloc(dst, addr, sz); if (!dst_pte) { ret = -ENOMEM; @@ -3326,6 +3337,8 @@ int copy_hugetlb_page_range(struct mm_st if (cow) mmu_notifier_invalidate_range_end(&range); + else + i_mmap_unlock_read(mapping); return ret; } @@ -3771,14 +3784,18 @@ retry: }; /* - * hugetlb_fault_mutex must be dropped before - * handling userfault. Reacquire after handling - * fault to make calling code simpler. + * hugetlb_fault_mutex and i_mmap_rwsem must be + * dropped before handling userfault. Reacquire + * after handling fault to make calling code simpler. */ hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping, idx, haddr); mutex_unlock(&hugetlb_fault_mutex_table[hash]); + i_mmap_unlock_read(mapping); + ret = handle_userfault(&vmf, VM_UFFD_MISSING); + + i_mmap_lock_read(mapping); mutex_lock(&hugetlb_fault_mutex_table[hash]); goto out; } @@ -3926,6 +3943,11 @@ vm_fault_t hugetlb_fault(struct mm_struc ptep = huge_pte_offset(mm, haddr, huge_page_size(h)); if (ptep) { + /* + * Since we hold no locks, ptep could be stale. That is + * OK as we are only making decisions based on content and + * not actually modifying content here. + */ entry = huge_ptep_get(ptep); if (unlikely(is_hugetlb_entry_migration(entry))) { migration_entry_wait_huge(vma, mm, ptep); @@ -3933,20 +3955,31 @@ vm_fault_t hugetlb_fault(struct mm_struc } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) return VM_FAULT_HWPOISON_LARGE | VM_FAULT_SET_HINDEX(hstate_index(h)); - } else { - ptep = huge_pte_alloc(mm, haddr, huge_page_size(h)); - if (!ptep) - return VM_FAULT_OOM; } + /* + * Acquire i_mmap_rwsem before calling huge_pte_alloc and hold + * until finished with ptep. This prevents huge_pmd_unshare from + * being called elsewhere and making the ptep no longer valid. + * + * ptep could have already be assigned via huge_pte_offset. That + * is OK, as huge_pte_alloc will return the same value unless + * something changed. + */ mapping = vma->vm_file->f_mapping; - idx = vma_hugecache_offset(h, vma, haddr); + i_mmap_lock_read(mapping); + ptep = huge_pte_alloc(mm, haddr, huge_page_size(h)); + if (!ptep) { + i_mmap_unlock_read(mapping); + return VM_FAULT_OOM; + } /* * Serialize hugepage allocation and instantiation, so that we don't * get spurious allocation failures if two CPUs race to instantiate * the same page in the page cache. */ + idx = vma_hugecache_offset(h, vma, haddr); hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping, idx, haddr); mutex_lock(&hugetlb_fault_mutex_table[hash]); @@ -4034,6 +4067,7 @@ out_ptl: } out_mutex: mutex_unlock(&hugetlb_fault_mutex_table[hash]); + i_mmap_unlock_read(mapping); /* * Generally it's safe to hold refcount during waiting page lock. But * here we just wait to defer the next page fault to avoid busy loop and @@ -4638,10 +4672,12 @@ void adjust_range_if_pmd_sharing_possibl * Search for a shareable pmd page for hugetlb. In any case calls pmd_alloc() * and returns the corresponding pte. While this is not necessary for the * !shared pmd case because we can allocate the pmd later as well, it makes the - * code much cleaner. pmd allocation is essential for the shared case because - * pud has to be populated inside the same i_mmap_rwsem section - otherwise - * racing tasks could either miss the sharing (see huge_pte_offset) or select a - * bad pmd for sharing. + * code much cleaner. + * + * This routine must be called with i_mmap_rwsem held in at least read mode. + * For hugetlbfs, this prevents removal of any page table entries associated + * with the address space. This is important as we are setting up sharing + * based on existing page table entries (mappings). */ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud) { @@ -4658,7 +4694,6 @@ pte_t *huge_pmd_share(struct mm_struct * if (!vma_shareable(vma, addr)) return (pte_t *)pmd_alloc(mm, pud, addr); - i_mmap_lock_write(mapping); vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) { if (svma == vma) continue; @@ -4688,7 +4723,6 @@ pte_t *huge_pmd_share(struct mm_struct * spin_unlock(ptl); out: pte = (pte_t *)pmd_alloc(mm, pud, addr); - i_mmap_unlock_write(mapping); return pte; } @@ -4699,7 +4733,7 @@ out: * indicated by page_count > 1, unmap is achieved by clearing pud and * decrementing the ref count. If count == 1, the pte page is not shared. * - * called with page table lock held. + * Called with page table lock held and i_mmap_rwsem held in write mode. * * returns: 1 successfully unmapped a shared pte page * 0 the underlying pte page is not shared, or it is the last user --- a/mm/memory-failure.c~hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization +++ a/mm/memory-failure.c @@ -966,7 +966,7 @@ static bool hwpoison_user_mappings(struc enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS; struct address_space *mapping; LIST_HEAD(tokill); - bool unmap_success; + bool unmap_success = true; int kill = 1, forcekill; struct page *hpage = *hpagep; bool mlocked = PageMlocked(hpage); @@ -1028,7 +1028,19 @@ static bool hwpoison_user_mappings(struc if (kill) collect_procs(hpage, &tokill, flags & MF_ACTION_REQUIRED); - unmap_success = try_to_unmap(hpage, ttu); + if (!PageHuge(hpage)) { + unmap_success = try_to_unmap(hpage, ttu); + } else if (mapping) { + /* + * For hugetlb pages, try_to_unmap could potentially call + * huge_pmd_unshare. Because of this, take semaphore in + * write mode here and set TTU_RMAP_LOCKED to indicate we + * have taken the lock at this higer level. + */ + i_mmap_lock_write(mapping); + unmap_success = try_to_unmap(hpage, ttu|TTU_RMAP_LOCKED); + i_mmap_unlock_write(mapping); + } if (!unmap_success) pr_err("Memory failure: %#lx: failed to unmap page (mapcount=%d)\n", pfn, page_mapcount(hpage)); --- a/mm/migrate.c~hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization +++ a/mm/migrate.c @@ -1324,8 +1324,19 @@ static int unmap_and_move_huge_page(new_ goto put_anon; if (page_mapped(hpage)) { + struct address_space *mapping = page_mapping(hpage); + + /* + * try_to_unmap could potentially call huge_pmd_unshare. + * Because of this, take semaphore in write mode here and + * set TTU_RMAP_LOCKED to let lower levels know we have + * taken the lock. + */ + i_mmap_lock_write(mapping); try_to_unmap(hpage, - TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS); + TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS| + TTU_RMAP_LOCKED); + i_mmap_unlock_write(mapping); page_was_mapped = 1; } --- a/mm/rmap.c~hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization +++ a/mm/rmap.c @@ -25,6 +25,7 @@ * page->flags PG_locked (lock_page) * hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share) * mapping->i_mmap_rwsem + * hugetlb_fault_mutex (hugetlbfs specific page fault mutex) * anon_vma->rwsem * mm->page_table_lock or pte_lock * zone_lru_lock (in mark_page_accessed, isolate_lru_page) @@ -1378,6 +1379,9 @@ static bool try_to_unmap_one(struct page /* * If sharing is possible, start and end will be adjusted * accordingly. + * + * If called for a huge page, caller must hold i_mmap_rwsem + * in write mode as it is possible to call huge_pmd_unshare. */ adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end); --- a/mm/userfaultfd.c~hugetlbfs-use-i_mmap_rwsem-for-more-pmd-sharing-synchronization +++ a/mm/userfaultfd.c @@ -267,10 +267,14 @@ retry: VM_BUG_ON(dst_addr & ~huge_page_mask(h)); /* - * Serialize via hugetlb_fault_mutex + * Serialize via i_mmap_rwsem and hugetlb_fault_mutex. + * i_mmap_rwsem ensures the dst_pte remains valid even + * in the case of shared pmds. fault mutex prevents + * races with other faulting threads. */ - idx = linear_page_index(dst_vma, dst_addr); mapping = dst_vma->vm_file->f_mapping; + i_mmap_lock_read(mapping); + idx = linear_page_index(dst_vma, dst_addr); hash = hugetlb_fault_mutex_hash(h, dst_mm, dst_vma, mapping, idx, dst_addr); mutex_lock(&hugetlb_fault_mutex_table[hash]); @@ -279,6 +283,7 @@ retry: dst_pte = huge_pte_alloc(dst_mm, dst_addr, huge_page_size(h)); if (!dst_pte) { mutex_unlock(&hugetlb_fault_mutex_table[hash]); + i_mmap_unlock_read(mapping); goto out_unlock; } @@ -286,6 +291,7 @@ retry: dst_pteval = huge_ptep_get(dst_pte); if (!huge_pte_none(dst_pteval)) { mutex_unlock(&hugetlb_fault_mutex_table[hash]); + i_mmap_unlock_read(mapping); goto out_unlock; } @@ -293,6 +299,7 @@ retry: dst_addr, src_addr, &page); mutex_unlock(&hugetlb_fault_mutex_table[hash]); + i_mmap_unlock_read(mapping); vm_alloc_shared = vm_shared; cond_resched(); _ Patches currently in -mm which might be from mike.kravetz(a)oracle.com are

6 years, 8 months

1
0
0 0

[merged] hwpoison-memory_hotplug-allow-hwpoisoned-pages-to-be-offlined.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined has been removed from the -mm tree. Its filename was hwpoison-memory_hotplug-allow-hwpoisoned-pages-to-be-offlined.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Michal Hocko <mhocko(a)suse.com> Subject: hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined We have received a bug report that an injected MCE about faulty memory prevents memory offline to succeed on 4.4 base kernel. The underlying reason was that the HWPoison page has an elevated reference count and the migration keeps failing. There are two problems with that. First of all it is dubious to migrate the poisoned page because we know that accessing that memory is possible to fail. Secondly it doesn't make any sense to migrate a potentially broken content and preserve the memory corruption over to a new location. Oscar has found out that 4.4 and the current upstream kernels behave slightly differently with his simply testcase === int main(void) { int ret; int i; int fd; char *array = malloc(4096); char *array_locked = malloc(4096); fd = open("/tmp/data", O_RDONLY); read(fd, array, 4095); for (i = 0; i < 4096; i++) array_locked[i] = 'd'; ret = mlock((void *)PAGE_ALIGN((unsigned long)array_locked), sizeof(array_locked)); if (ret) perror("mlock"); sleep (20); ret = madvise((void *)PAGE_ALIGN((unsigned long)array_locked), 4096, MADV_HWPOISON); if (ret) perror("madvise"); for (i = 0; i < 4096; i++) array_locked[i] = 'd'; return 0; } === + offline this memory. In 4.4 kernels he saw the hwpoisoned page to be returned back to the LRU list kernel: [<ffffffff81019ac9>] dump_trace+0x59/0x340 kernel: [<ffffffff81019e9a>] show_stack_log_lvl+0xea/0x170 kernel: [<ffffffff8101ac71>] show_stack+0x21/0x40 kernel: [<ffffffff8132bb90>] dump_stack+0x5c/0x7c kernel: [<ffffffff810815a1>] warn_slowpath_common+0x81/0xb0 kernel: [<ffffffff811a275c>] __pagevec_lru_add_fn+0x14c/0x160 kernel: [<ffffffff811a2eed>] pagevec_lru_move_fn+0xad/0x100 kernel: [<ffffffff811a334c>] __lru_cache_add+0x6c/0xb0 kernel: [<ffffffff81195236>] add_to_page_cache_lru+0x46/0x70 kernel: [<ffffffffa02b4373>] extent_readpages+0xc3/0x1a0 [btrfs] kernel: [<ffffffff811a16d7>] __do_page_cache_readahead+0x177/0x200 kernel: [<ffffffff811a18c8>] ondemand_readahead+0x168/0x2a0 kernel: [<ffffffff8119673f>] generic_file_read_iter+0x41f/0x660 kernel: [<ffffffff8120e50d>] __vfs_read+0xcd/0x140 kernel: [<ffffffff8120e9ea>] vfs_read+0x7a/0x120 kernel: [<ffffffff8121404b>] kernel_read+0x3b/0x50 kernel: [<ffffffff81215c80>] do_execveat_common.isra.29+0x490/0x6f0 kernel: [<ffffffff81215f08>] do_execve+0x28/0x30 kernel: [<ffffffff81095ddb>] call_usermodehelper_exec_async+0xfb/0x130 kernel: [<ffffffff8161c045>] ret_from_fork+0x55/0x80 And that latter confuses the hotremove path because an LRU page is attempted to be migrated and that fails due to an elevated reference count. It is quite possible that the reuse of the HWPoisoned page is some kind of fixed race condition but I am not really sure about that. With the upstream kernel the failure is slightly different. The page doesn't seem to have LRU bit set but isolate_movable_page simply fails and do_migrate_range simply puts all the isolated pages back to LRU and therefore no progress is made and scan_movable_pages finds same set of pages over and over again. Fix both cases by explicitly checking HWPoisoned pages before we even try to get reference on the page, try to unmap it if it is still mapped. As explained by Naoya: : Hwpoison code never unmapped those for no big reason because : Ksm pages never dominate memory, so we simply didn't have strong : motivation to save the pages. Also put WARN_ON(PageLRU) in case there is a race and we can hit LRU HWPoison pages which shouldn't happen but I couldn't convince myself about that. Naoya has noted the following: : Theoretically no such gurantee, because try_to_unmap() doesn't have a : guarantee of success and then memory_failure() returns immediately : when hwpoison_user_mappings fails. : Or the following code (comes after hwpoison_user_mappings block) also impli= : es : that the target page can still have PageLRU flag. : : /* : * Torn down by someone else? : */ : if (PageLRU(p) && !PageSwapCache(p) && p->mapping =3D=3D NULL) { : action_result(pfn, MF_MSG_TRUNCATED_LRU, MF_IGNORED); : res =3D -EBUSY; : goto out; : } : : So I think it's OK to keep "if (WARN_ON(PageLRU(page)))" block in : current version of your patch. Link: http://lkml.kernel.org/r/20181206120135.14079-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko(a)suse.com> Reviewed-by: Oscar Salvador <osalvador(a)suse.com> Debugged-by: Oscar Salvador <osalvador(a)suse.com> Tested-by: Oscar Salvador <osalvador(a)suse.com> Acked-by: David Hildenbrand <david(a)redhat.com> Acked-by: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- --- a/mm/memory_hotplug.c~hwpoison-memory_hotplug-allow-hwpoisoned-pages-to-be-offlined +++ a/mm/memory_hotplug.c @@ -34,6 +34,7 @@ #include <linux/hugetlb.h> #include <linux/memblock.h> #include <linux/compaction.h> +#include <linux/rmap.h> #include <asm/tlbflush.h> @@ -1368,6 +1369,21 @@ do_migrate_range(unsigned long start_pfn pfn = page_to_pfn(compound_head(page)) + hpage_nr_pages(page) - 1; + /* + * HWPoison pages have elevated reference counts so the migration would + * fail on them. It also doesn't make any sense to migrate them in the + * first place. Still try to unmap such a page in case it is still mapped + * (e.g. current hwpoison implementation doesn't unmap KSM pages but keep + * the unmap as the catch all safety net). + */ + if (PageHWPoison(page)) { + if (WARN_ON(PageLRU(page))) + isolate_lru_page(page); + if (page_mapped(page)) + try_to_unmap(page, TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS); + continue; + } + if (!get_page_unless_zero(page)) continue; /* _ Patches currently in -mm which might be from mhocko(a)suse.com are mm-memcg-fix-reclaim-deadlock-with-writeback.patch

6 years, 8 months

1
0
0 0

[merged] zram-fix-double-free-backing-device.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: zram: fix double free backing device has been removed from the -mm tree. Its filename was zram-fix-double-free-backing-device.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Minchan Kim <minchan(a)kernel.org> Subject: zram: fix double free backing device If blkdev_get fails, we shouldn't do blkdev_put. Otherwise, kernel emits below log. This patch fixes it. [ 31.073006] WARNING: CPU: 0 PID: 1893 at fs/block_dev.c:1828 blkdev_put+0x105/0x120 [ 31.075104] Modules linked in: [ 31.075898] CPU: 0 PID: 1893 Comm: swapoff Not tainted 4.19.0+ #453 [ 31.077484] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 31.079589] RIP: 0010:blkdev_put+0x105/0x120 [ 31.080606] Code: 48 c7 80 a0 00 00 00 00 00 00 00 48 c7 c7 40 e7 40 96 e8 6e 47 73 00 48 8b bb e0 00 00 00 e9 2c ff ff ff 0f 0b e9 75 ff ff ff <0f> 0b e9 5a ff ff ff 48 c7 80 a0 00 00 00 00 00 00 00 eb 87 0f 1f [ 31.085080] RSP: 0018:ffffb409005c7ed0 EFLAGS: 00010297 [ 31.086383] RAX: ffff9779fe5a8040 RBX: ffff9779fbc17300 RCX: 00000000b9fc37a4 [ 31.088105] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffff9640e740 [ 31.089850] RBP: ffff9779fbc17318 R08: ffffffff95499a89 R09: 0000000000000004 [ 31.091201] R10: ffffb409005c7e50 R11: 7a9ef6088ff4d4a1 R12: 0000000000000083 [ 31.092276] R13: ffff9779fe607b98 R14: 0000000000000000 R15: ffff9779fe607a38 [ 31.093355] FS: 00007fc118d9b840(0000) GS:ffff9779fc600000(0000) knlGS:0000000000000000 [ 31.094582] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 31.095541] CR2: 00007fc11894b8dc CR3: 00000000339f6001 CR4: 0000000000160ef0 [ 31.096781] Call Trace: [ 31.097212] __x64_sys_swapoff+0x46d/0x490 [ 31.097914] do_syscall_64+0x5a/0x190 [ 31.098550] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 31.099402] RIP: 0033:0x7fc11843ec27 [ 31.100013] Code: 73 01 c3 48 8b 0d 71 62 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 a8 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 62 2c 00 f7 d8 64 89 01 48 [ 31.103149] RSP: 002b:00007ffdf69be648 EFLAGS: 00000206 ORIG_RAX: 00000000000000a8 [ 31.104425] RAX: ffffffffffffffda RBX: 00000000011d98c0 RCX: 00007fc11843ec27 [ 31.105627] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 00000000011d98c0 [ 31.106847] RBP: 0000000000000001 R08: 00007ffdf69be690 R09: 0000000000000001 [ 31.108038] R10: 00000000000002b1 R11: 0000000000000206 R12: 0000000000000001 [ 31.109231] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 31.110433] irq event stamp: 4466 [ 31.111001] hardirqs last enabled at (4465): [<ffffffff953ebd43>] __free_pages_ok+0x1e3/0x490 [ 31.112437] hardirqs last disabled at (4466): [<ffffffff95201b7a>] trace_hardirqs_off_thunk+0x1a/0x1c [ 31.113973] softirqs last enabled at (3420): [<ffffffff95e00333>] __do_softirq+0x333/0x446 [ 31.115364] softirqs last disabled at (3407): [<ffffffff9527aee1>] irq_exit+0xd1/0xe0 Link: http://lkml.kernel.org/r/20181127055429.251614-3-minchan@kernel.org Signed-off-by: Minchan Kim <minchan(a)kernel.org> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com> Reviewed-by: Joey Pabalinas <joeypabalinas(a)gmail.com> Cc: <stable(a)vger.kernel.org> [4.14+] Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- --- a/drivers/block/zram/zram_drv.c~zram-fix-double-free-backing-device +++ a/drivers/block/zram/zram_drv.c @@ -387,8 +387,10 @@ static ssize_t backing_dev_store(struct bdev = bdgrab(I_BDEV(inode)); err = blkdev_get(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL, zram); - if (err < 0) + if (err < 0) { + bdev = NULL; goto out; + } nr_pages = i_size_read(inode) >> PAGE_SHIFT; bitmap_sz = BITS_TO_LONGS(nr_pages) * sizeof(long); _ Patches currently in -mm which might be from minchan(a)kernel.org are zram-idle-writeback-fixes-and-cleanup.patch

6 years, 8 months

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror