From: Jann Horn <jannh(a)google.com>
Subject: mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly
5dd0b16cdaff ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even
on UP") made the availability of the NR_TLB_REMOTE_FLUSH* counters inside
the kernel unconditional to reduce #ifdef soup, but (either to avoid
showing dummy zero counters to userspace, or because that code was missed)
didn't update the vmstat_text array, meaning that all following counters would
be shown with incorrect values.
This only affects kernel builds with
CONFIG_VM_EVENT_COUNTERS=y && CONFIG_DEBUG_TLBFLUSH=y && CONFIG_SMP=n.
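The failure mode is easy to see in miniature: the counters are indexed by
an enum, and vmstat_text is a parallel array of names, so dropping entries
from one but not the other shifts every later name. A toy illustration
(not kernel code; all identifiers here are invented for the sketch):

    #include <stdio.h>

    /* counters always exist (like vm_event_item after 5dd0b16cdaff) */
    enum counter { FOO, TLB_REMOTE_FLUSH, TLB_REMOTE_FLUSH_RECEIVED,
                   TLB_LOCAL_FLUSH_ALL, NR_COUNTERS };

    /* names table missing the TLB entries (like vmstat_text on UP) */
    static const char *const names[] = {
            "foo",
            /* "nr_tlb_remote_flush*" entries omitted! */
            "nr_tlb_local_flush_all",
    };

    int main(void)
    {
            unsigned long stats[NR_COUNTERS] = { 1, 2, 3, 4 };
            unsigned int i;

            for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
                    printf("%s %lu\n", names[i], stats[i]);
            /* prints "nr_tlb_local_flush_all 2", but 2 belongs to
             * TLB_REMOTE_FLUSH; the real value, 4, is never shown */
            return 0;
    }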
Link: http://lkml.kernel.org/r/20181001143138.95119-2-jannh@google.com
Fixes: 5dd0b16cdaff ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP")
Signed-off-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Christoph Lameter <clameter(a)sgi.com>
Cc: Kemi Wang <kemi.wang(a)intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmstat.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/vmstat.c~mm-vmstat-skip-nr_tlb_remote_flush-properly
+++ a/mm/vmstat.c
@@ -1275,6 +1275,9 @@ const char * const vmstat_text[] = {
#ifdef CONFIG_SMP
"nr_tlb_remote_flush",
"nr_tlb_remote_flush_received",
+#else
+ "", /* nr_tlb_remote_flush */
+ "", /* nr_tlb_remote_flush_received */
#endif /* CONFIG_SMP */
"nr_tlb_local_flush_all",
"nr_tlb_local_flush_one",
_
From: Jann Horn <jannh(a)google.com>
Subject: proc: restrict kernel stack dumps to root
Currently, you can use /proc/self/task/*/stack to cause a stack walk on
a task you control while it is running on another CPU. That means that
the stack can change under the stack walker. The stack walker does
have guards against going completely off the rails and into random
kernel memory, but it can interpret random data from your kernel stack
as instruction pointers and stack pointers. This can cause exposure of
kernel stack contents to userspace.
Restrict the ability to inspect kernel stacks of arbitrary tasks to root
in order to prevent a local attacker from exploiting racy stack unwinding
to leak kernel task stack contents. See the added comment for a longer
rationale.
There don't seem to be any users of this userspace API that can't
gracefully bail out if reading from the file fails. Therefore, I believe
that this change is unlikely to break things. In the case that this patch
does end up needing a revert, the next-best solution might be to fake a
single-entry stack based on wchan.
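For what it's worth, a consumer of this file can detect the restriction and
bail out along these lines; a minimal sketch, not taken from any real tool
(note that with this patch the open still succeeds and the read itself
fails, so both paths need checking):

    #include <stdio.h>
    #include <errno.h>
    #include <string.h>

    /* returns 0 on success, -1 if the file is absent (e.g.
     * CONFIG_STACKTRACE=n) or the kernel refused the read
     * (-EACCES after this patch) */
    static int dump_task_stack(int pid, int tid)
    {
            char path[64], line[256];
            FILE *f;

            snprintf(path, sizeof(path), "/proc/%d/task/%d/stack",
                     pid, tid);
            f = fopen(path, "r");
            if (!f)
                    return -1;

            while (fgets(line, sizeof(line), f))
                    fputs(line, stdout);

            if (ferror(f)) {        /* read failed: fall back gracefully */
                    fprintf(stderr, "stack unavailable: %s\n",
                            strerror(errno));
                    fclose(f);
                    return -1;
            }
            fclose(f);
            return 0;
    }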
Link: http://lkml.kernel.org/r/20180927153316.200286-1-jannh@google.com
Fixes: 2ec220e27f50 ("proc: add /proc/*/stack")
Signed-off-by: Jann Horn <jannh(a)google.com>
Acked-by: Kees Cook <keescook(a)chromium.org>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: Ken Chen <kenchen(a)google.com>
Cc: Will Deacon <will.deacon(a)arm.com>
Cc: Laura Abbott <labbott(a)redhat.com>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: "H . Peter Anvin" <hpa(a)zytor.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/base.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--- a/fs/proc/base.c~proc-restrict-kernel-stack-dumps-to-root
+++ a/fs/proc/base.c
@@ -407,6 +407,20 @@ static int proc_pid_stack(struct seq_fil
unsigned long *entries;
int err;
+ /*
+ * The ability to racily run the kernel stack unwinder on a running task
+ * and then observe the unwinder output is scary; while it is useful for
+ * debugging kernel issues, it can also allow an attacker to leak kernel
+ * stack contents.
+ * Doing this in a manner that is at least safe from races would require
+ * some work to ensure that the remote task can not be scheduled; and
+ * even then, this would still expose the unwinder as local attack
+ * surface.
+ * Therefore, this interface is restricted to root.
+ */
+ if (!file_ns_capable(m->file, &init_user_ns, CAP_SYS_ADMIN))
+ return -EACCES;
+
entries = kmalloc_array(MAX_STACK_TRACE_DEPTH, sizeof(*entries),
GFP_KERNEL);
if (!entries)
_
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Subject: mm, thp: fix mlocking THP page with migration enabled
A transparent huge page is represented by a single entry on an LRU list.
Therefore, we can only make unevictable an entire compound page, not
individual subpages.
If a user tries to mlock() part of a huge page, we want the rest of the
page to be reclaimable.
We handle this by keeping PTE-mapped huge pages on normal LRU lists: the
PMD on the border of a VM_LOCKED VMA will be split into a PTE table.
The introduction of THP migration breaks[1] the rules around mlocking THP
pages. If we have a single PMD mapping of the page in a mlocked VMA, the
page will get mlocked, regardless of PTE mappings of the page.
For tmpfs/shmem this is easy to fix by checking PageDoubleMap() in
remove_migration_pmd().
Anon THP pages can only be shared between processes via fork(). An mlocked
page can only be shared if the parent mlocked it before forking; otherwise
CoW will be triggered on mlock().
For anon THP, we can fix the issue by munlocking the page on removing the
PTE migration entry for the page. PTEs for the page will always come after
the mlocked PMD: rmap walks VMAs from oldest to newest.
Test-case:

    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <linux/mempolicy.h>
    #include <numaif.h>

    int main(void)
    {
            unsigned long nodemask = 4;
            void *addr;

            addr = mmap((void *)0x20000000UL, 2UL << 20,
                        PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);

            if (fork()) {
                    wait(NULL);
                    return 0;
            }

            mlock(addr, 4UL << 10);
            mbind(addr, 2UL << 20, MPOL_PREFERRED | MPOL_F_RELATIVE_NODES,
                  &nodemask, 4, MPOL_MF_MOVE);

            return 0;
    }
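(mbind() here is the wrapper declared in libnuma's <numaif.h>, so the
test case presumably needs to be linked with -lnuma, or changed to invoke
the raw syscall instead.)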
[1] https://lkml.kernel.org/r/CAOMGZ=G52R-30rZvhGxEbkTw7rLLwBGadVYeo--iizcD3upL…
Link: http://lkml.kernel.org/r/20180917133816.43995-1-kirill.shutemov@linux.intel…
Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: Vegard Nossum <vegard.nossum(a)oracle.com>
Reviewed-by: Zi Yan <zi.yan(a)cs.rutgers.edu>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 2 +-
mm/migrate.c | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
--- a/mm/huge_memory.c~mm-thp-fix-mlocking-thp-page-with-migration-enabled
+++ a/mm/huge_memory.c
@@ -2931,7 +2931,7 @@ void remove_migration_pmd(struct page_vm
else
page_add_file_rmap(new, true);
set_pmd_at(mm, mmun_start, pvmw->pmd, pmde);
- if (vma->vm_flags & VM_LOCKED)
+ if ((vma->vm_flags & VM_LOCKED) && !PageDoubleMap(new))
mlock_vma_page(new);
update_mmu_cache_pmd(vma, address, pvmw->pmd);
}
--- a/mm/migrate.c~mm-thp-fix-mlocking-thp-page-with-migration-enabled
+++ a/mm/migrate.c
@@ -275,6 +275,9 @@ static bool remove_migration_pte(struct
if (vma->vm_flags & VM_LOCKED && !PageTransCompound(new))
mlock_vma_page(new);
+ if (PageTransHuge(page) && PageMlocked(page))
+ clear_page_mlock(page);
+
/* No need to invalidate - it was non-present before */
update_mmu_cache(vma, pvmw.address, pvmw.pte);
}
_
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: mm: migration: fix migration of huge PMD shared pages
The page migration code employs try_to_unmap() to try and unmap the source
page. This is accomplished by using rmap_walk to find all vmas where the
page is mapped. This search stops when page mapcount is zero. For shared
PMD huge pages, the page map count is always 1 no matter the number of
mappings. Shared mappings are tracked via the reference count of the PMD
page. Therefore, try_to_unmap stops prematurely and does not completely
unmap all mappings of the source page.
This problem can result in data corruption, as writes to the original
source page can happen after contents of the page are copied to the target
page. Hence, data is lost.
This problem was originally seen as DB corruption of shared global areas
after a huge page was soft offlined due to ECC memory errors. DB
developers noticed they could reproduce the issue by (hotplug) offlining
memory used to back huge pages. A simple testcase can reproduce the
problem by creating a shared PMD mapping (note that this must be at least
PUD_SIZE in size and PUD_SIZE aligned; 1GB on x86), and using
migrate_pages() to migrate process pages between nodes while continually
writing to the huge pages being migrated.
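That description maps onto a reproducer roughly like the following sketch
(heavily hedged: the shared-anonymous-hugetlb setup, the node numbers, the
mapping hint and the iteration count are all assumptions, and the machine
needs 1GB of 2MB huge pages reserved plus two NUMA nodes; migrate_pages()
is libnuma's wrapper, so link with -lnuma):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <signal.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <numaif.h>

    #define PUD_SIZE (1UL << 30)        /* 1GB on x86_64 */

    int main(void)
    {
            unsigned long from = 1, to = 2; /* node 0 <-> node 1 */
            pid_t child;
            int i;

            /* shared hugetlb mapping, PUD_SIZE big at a PUD-aligned
             * hint, so parent and child can share the PMD page */
            char *p = mmap((void *)(16UL * PUD_SIZE), PUD_SIZE,
                           PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB,
                           -1, 0);
            if (p == MAP_FAILED) {
                    perror("mmap");
                    return 1;
            }

            child = fork();
            if (child == 0)
                    for (;;)            /* keep dirtying the pages */
                            memset(p, 0xab, PUD_SIZE);

            for (i = 0; i < 1000; i++) {        /* bounce between nodes */
                    migrate_pages(child, 8 * sizeof(from), &from, &to);
                    migrate_pages(child, 8 * sizeof(from), &to, &from);
            }

            kill(child, SIGKILL);
            wait(NULL);
            return 0;
    }

Without the fix, some of the child's writes race with the page copy and
are lost; actually detecting the corruption (e.g. verifying the memset
pattern afterwards) is omitted from the sketch.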
To fix, have the try_to_unmap_one routine check for huge PMD sharing by
calling huge_pmd_unshare for hugetlbfs huge pages. If it is a shared
mapping it will be 'unshared' which removes the page table entry and drops
the reference on the PMD page. After this, flush caches and TLB.
The mmu notifiers are called before locking page tables, but we cannot be
sure of PMD sharing until the page tables are locked. Therefore, check for
the possibility of PMD sharing before locking so that notifiers can
prepare for the worst possible case.
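To see what "worst possible case" means numerically, here is the rounding
that adjust_range_if_pmd_sharing_possible() performs, worked through with
made-up addresses (a standalone sketch, not kernel code):

    #include <stdio.h>

    #define PUD_SIZE (1UL << 30)
    #define PUD_MASK (~(PUD_SIZE - 1))

    int main(void)
    {
            /* one 2MB huge page inside a VM_MAYSHARE hugetlb VMA that
             * spans the whole aligned region [0x40000000, 0x80000000) */
            unsigned long start = 0x40200000UL, end = 0x40400000UL;
            unsigned long a_start = start & PUD_MASK;   /* 0x40000000 */
            unsigned long a_end = a_start + PUD_SIZE;   /* 0x80000000 */

            /* a shared PMD page can map the entire PUD, so the flush/
             * notifier range must grow from 2MB to the full 1GB */
            if (a_start < start)
                    start = a_start;
            if (a_end > end)
                    end = a_end;
            printf("adjusted range: [%#lx, %#lx)\n", start, end);
            return 0;
    }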
Link: http://lkml.kernel.org/r/20180823205917.16297-2-mike.kravetz@oracle.com
[mike.kravetz(a)oracle.com: make _range_in_vma() a static inline]
Link: http://lkml.kernel.org/r/6063f215-a5c8-2f0c-465a-2c515ddc952d@oracle.com
Fixes: 39dde65c9940 ("shared page table for hugetlb page")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Jerome Glisse <jglisse(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/hugetlb.h | 14 ++++++++++++
include/linux/mm.h | 6 +++++
mm/hugetlb.c | 37 +++++++++++++++++++++++++++++++--
mm/rmap.c | 42 +++++++++++++++++++++++++++++++++++---
4 files changed, 94 insertions(+), 5 deletions(-)
--- a/include/linux/hugetlb.h~mm-migration-fix-migration-of-huge-pmd-shared-pages
+++ a/include/linux/hugetlb.h
@@ -140,6 +140,8 @@ pte_t *huge_pte_alloc(struct mm_struct *
pte_t *huge_pte_offset(struct mm_struct *mm,
unsigned long addr, unsigned long sz);
int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep);
+void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end);
struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address,
int write);
struct page *follow_huge_pd(struct vm_area_struct *vma,
@@ -170,6 +172,18 @@ static inline unsigned long hugetlb_tota
return 0;
}
+static inline int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr,
+ pte_t *ptep)
+{
+ return 0;
+}
+
+static inline void adjust_range_if_pmd_sharing_possible(
+ struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end)
+{
+}
+
#define follow_hugetlb_page(m,v,p,vs,a,b,i,w,n) ({ BUG(); 0; })
#define follow_huge_addr(mm, addr, write) ERR_PTR(-EINVAL)
#define copy_hugetlb_page_range(src, dst, vma) ({ BUG(); 0; })
--- a/include/linux/mm.h~mm-migration-fix-migration-of-huge-pmd-shared-pages
+++ a/include/linux/mm.h
@@ -2455,6 +2455,12 @@ static inline struct vm_area_struct *fin
return vma;
}
+static inline bool range_in_vma(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end)
+{
+ return (vma && vma->vm_start <= start && end <= vma->vm_end);
+}
+
#ifdef CONFIG_MMU
pgprot_t vm_get_page_prot(unsigned long vm_flags);
void vma_set_page_prot(struct vm_area_struct *vma);
--- a/mm/hugetlb.c~mm-migration-fix-migration-of-huge-pmd-shared-pages
+++ a/mm/hugetlb.c
@@ -4545,13 +4545,41 @@ static bool vma_shareable(struct vm_area
/*
* check on proper vm_flags and page table alignment
*/
- if (vma->vm_flags & VM_MAYSHARE &&
- vma->vm_start <= base && end <= vma->vm_end)
+ if (vma->vm_flags & VM_MAYSHARE && range_in_vma(vma, base, end))
return true;
return false;
}
/*
+ * Determine if start,end range within vma could be mapped by shared pmd.
+ * If yes, adjust start and end to cover range associated with possible
+ * shared pmd mappings.
+ */
+void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end)
+{
+ unsigned long check_addr = *start;
+
+ if (!(vma->vm_flags & VM_MAYSHARE))
+ return;
+
+ for (check_addr = *start; check_addr < *end; check_addr += PUD_SIZE) {
+ unsigned long a_start = check_addr & PUD_MASK;
+ unsigned long a_end = a_start + PUD_SIZE;
+
+ /*
+ * If sharing is possible, adjust start/end if necessary.
+ */
+ if (range_in_vma(vma, a_start, a_end)) {
+ if (a_start < *start)
+ *start = a_start;
+ if (a_end > *end)
+ *end = a_end;
+ }
+ }
+}
+
+/*
* Search for a shareable pmd page for hugetlb. In any case calls pmd_alloc()
* and returns the corresponding pte. While this is not necessary for the
* !shared pmd case because we can allocate the pmd later as well, it makes the
@@ -4648,6 +4676,11 @@ int huge_pmd_unshare(struct mm_struct *m
{
return 0;
}
+
+void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end)
+{
+}
#define want_pmd_share() (0)
#endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
--- a/mm/rmap.c~mm-migration-fix-migration-of-huge-pmd-shared-pages
+++ a/mm/rmap.c
@@ -1362,11 +1362,21 @@ static bool try_to_unmap_one(struct page
}
/*
- * We have to assume the worse case ie pmd for invalidation. Note that
- * the page can not be free in this function as call of try_to_unmap()
- * must hold a reference on the page.
+ * For THP, we have to assume the worse case ie pmd for invalidation.
+ * For hugetlb, it could be much worse if we need to do pud
+ * invalidation in the case of pmd sharing.
+ *
+ * Note that the page can not be free in this function as call of
+ * try_to_unmap() must hold a reference on the page.
*/
end = min(vma->vm_end, start + (PAGE_SIZE << compound_order(page)));
+ if (PageHuge(page)) {
+ /*
+ * If sharing is possible, start and end will be adjusted
+ * accordingly.
+ */
+ adjust_range_if_pmd_sharing_possible(vma, &start, &end);
+ }
mmu_notifier_invalidate_range_start(vma->vm_mm, start, end);
while (page_vma_mapped_walk(&pvmw)) {
@@ -1409,6 +1419,32 @@ static bool try_to_unmap_one(struct page
subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
address = pvmw.address;
+ if (PageHuge(page)) {
+ if (huge_pmd_unshare(mm, &address, pvmw.pte)) {
+ /*
+ * huge_pmd_unshare unmapped an entire PMD
+ * page. There is no way of knowing exactly
+ * which PMDs may be cached for this mm, so
+ * we must flush them all. start/end were
+ * already adjusted above to cover this range.
+ */
+ flush_cache_range(vma, start, end);
+ flush_tlb_range(vma, start, end);
+ mmu_notifier_invalidate_range(mm, start, end);
+
+ /*
+ * The ref count of the PMD page was dropped
+ * which is part of the way map counting
+ * is done for shared PMDs. Return 'true'
+ * here. When there is no other sharing,
+ * huge_pmd_unshare returns false and we will
+ * unmap the actual page and drop map count
+ * to zero.
+ */
+ page_vma_mapped_walk_done(&pvmw);
+ break;
+ }
+ }
if (IS_ENABLED(CONFIG_MIGRATION) &&
(flags & TTU_MIGRATION) &&
_
Per ARC TLS ABI, r25 is designated TP (thread pointer register).
However, so far the kernel didn't do any special treatment, like setting
up usermode r25, even for CLONE_SETTLS. We instead relied on the libc
runtime to do this, e.g. in the clone libc wrapper [1]. This was
deliberate to keep the kernel ABI agnostic (userspace could potentially
change TP, especially for a different ARC ISA, say ARCompact vs. ARCv2
with different spare registers etc.).
However, userspace setting up r25 after the clone syscall opens a race: if
the child is not scheduled and gets a signal instead, it starts off in
userspace not in clone but in a signal handler, and anything TP-specific
there, such as pthread_self(), fails. This showed up with the uClibc
testsuite nptl/tst-kill6 [2].
Fix this by having the kernel populate r25 with the TP value. This locks
in the ABI, but it was not going to change anyway, and for what it's worth
it is the same for both ARCompact (ARC700 cores) and ARCv2 (HS3x cores).
[1] https://cgit.uclibc-ng.org/cgi/cgit/uclibc-ng.git/tree/libc/sysdeps/linux/a…
[2] https://github.com/wbx-github/uclibc-ng-test/blob/master/test/nptl/tst-kill…
Fixes: ARC STAR 9001378481
Cc: stable(a)vger.kernel.org
Reported-by: Nikita Sobolev <sobolev(a)synopsys.com>
Signed-off-by: Vineet Gupta <vgupta(a)synopsys.com>
---
arch/arc/kernel/process.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/arch/arc/kernel/process.c b/arch/arc/kernel/process.c
index 4674541eba3f..c29fa8ceb2d6 100644
--- a/arch/arc/kernel/process.c
+++ b/arch/arc/kernel/process.c
@@ -241,6 +241,26 @@ int copy_thread(unsigned long clone_flags,
task_thread_info(current)->thr_ptr;
}
+
+ /*
+ * setup usermode thread pointer #1:
+ * when child is picked by scheduler, __switch_to() uses @c_callee to
+ * populate usermode callee regs: this is fine even despite being in a
+ * kernel function since special return path for child @ret_from_fork()
+ * ensures those regs are not clobbered all the way to RTIE to usermode
+ */
+ c_callee->r25 = task_thread_info(p)->thr_ptr;
+
+#ifdef CONFIG_ARC_CURR_IN_REG
+ /*
+ * setup usermode thread pointer #2:
+ * however for this special use of r25 in kernel, __switch_to() sets
+ * r25 for kernel needs and only in the final return path is usermode
+ * r25 setup, from pt_regs->user_r25. So set that up as well
+ */
+ c_regs->user_r25 = c_callee->r25;
+#endif
+
return 0;
}
--
2.7.4
From: Arnd Bergmann <arnd(a)arndb.de>
Subject: kbuild: fix kernel/bounds.c 'W=1' warning
Building any configuration with 'make W=1' produces a warning:
kernel/bounds.c:16:6: warning: no previous prototype for 'foo' [-Wmissing-prototypes]
When also passing -Werror, this prevents us from building any other files.
Nobody ever calls the function, but we can't make it 'static' either
since we want the compiler output.
Calling it 'main' instead, however, avoids the warning, because gcc does
not insist on having a declaration for main.
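The behaviour is easy to check outside the kernel tree (a standalone
illustration, compiled with `gcc -Wmissing-prototypes -c demo.c`):

    void foo(void)              /* warning: no previous prototype for 'foo' */
    {
    }

    int main(void)              /* no warning: gcc exempts main */
    {
            return 0;
    }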
Link: http://lkml.kernel.org/r/20181005083313.2088252-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Reported-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Cc: David Laight <David.Laight(a)ACULAB.COM>
Cc: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/bounds.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/kernel/bounds.c~kbuild-fix-kernel-boundsc-w=1-warning
+++ a/kernel/bounds.c
@@ -13,7 +13,7 @@
#include <linux/log2.h>
#include <linux/spinlock_types.h>
-void foo(void)
+int main(void)
{
/* The enum constants to put into include/generated/bounds.h */
DEFINE(NR_PAGEFLAGS, __NR_PAGEFLAGS);
@@ -23,4 +23,6 @@ void foo(void)
#endif
DEFINE(SPINLOCK_SIZE, sizeof(spinlock_t));
/* End of constants */
+
+ return 0;
}
_
The previous patch introduced very large kernel stack usage and a Makefile
change to hide the warning about it.
From what I can tell, a number of things went wrong here:
- The BCH_MAX_T constant was set to the maximum value for 'n',
not the maximum for 't', which is much smaller.
- The stack usage is actually larger than the entire kernel stack
on some architectures that can use 4KB stacks (m68k, sh, c6x), which
leads to an immediate overrun.
- The justification in the patch description claimed that nothing
changed; however, that is not the case even without the two points above:
the configuration is machine specific, and most boards never use the
maximum BCH_ECC_WORDS() length but instead have something much smaller.
That maximum would only apply to machines that use both the maximum
block size and the maximum ECC strength.
The largest value for 't' that I could find is '32', which in turn leads
to a 60 byte array instead of 2048 bytes. With that changed, the warning
can be enabled again.
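For reference, the bound works out as follows under the
non-CONFIG_BCH_CONST_PARAMS values (a worked check of the macros in the
hunk below):

    /* old:  BCH_MAX_T = ((1 << 15) - 1) / 15 = 2184
     *       BCH_ECC_MAX_WORDS = DIV_ROUND_UP(15 * 2184, 32) = 1024 words
     *       -> a multi-KB uint32_t array on the stack
     * new:  BCH_MAX_T = 32
     *       BCH_ECC_MAX_WORDS = DIV_ROUND_UP(15 * 32, 32)   = 15 words
     *       -> 60 bytes, as noted above
     */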
Fixes: 02361bc77888 ("lib/bch: Remove VLA usage")
Reported-by: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
Please review carefully to ensure my logic is correct. I spent a
long time trying to understand what is going on here, but I'm not
too familiar with this algorithm, and it's possible I still got it
wrong as well.
In particular, I'm not 100% sure if '32' is the maximum ECC strength
we can support, or if a larger one (which?) might be possible.
---
lib/Makefile | 1 -
lib/bch.c | 10 ++++++----
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/lib/Makefile b/lib/Makefile
index 8c9647fa271a..12c479dd46e0 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -122,7 +122,6 @@ obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate/
obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/
obj-$(CONFIG_REED_SOLOMON) += reed_solomon/
obj-$(CONFIG_BCH) += bch.o
-CFLAGS_bch.o := $(call cc-option,-Wframe-larger-than=4500)
obj-$(CONFIG_LZO_COMPRESS) += lzo/
obj-$(CONFIG_LZO_DECOMPRESS) += lzo/
obj-$(CONFIG_LZ4_COMPRESS) += lz4/
diff --git a/lib/bch.c b/lib/bch.c
index 7b0f2006698b..3ef1a3467e7b 100644
--- a/lib/bch.c
+++ b/lib/bch.c
@@ -79,20 +79,19 @@
#define GF_T(_p) (CONFIG_BCH_CONST_T)
#define GF_N(_p) ((1 << (CONFIG_BCH_CONST_M))-1)
#define BCH_MAX_M (CONFIG_BCH_CONST_M)
+#define BCH_MAX_T (CONFIG_BCH_CONST_T)
#else
#define GF_M(_p) ((_p)->m)
#define GF_T(_p) ((_p)->t)
#define GF_N(_p) ((_p)->n)
-#define BCH_MAX_M 15
+#define BCH_MAX_M 15 /* 2KB */
+#define BCH_MAX_T 32 /* 32 bit correction */
#endif
-#define BCH_MAX_T (((1 << BCH_MAX_M) - 1) / BCH_MAX_M)
-
#define BCH_ECC_WORDS(_p) DIV_ROUND_UP(GF_M(_p)*GF_T(_p), 32)
#define BCH_ECC_BYTES(_p) DIV_ROUND_UP(GF_M(_p)*GF_T(_p), 8)
#define BCH_ECC_MAX_WORDS DIV_ROUND_UP(BCH_MAX_M * BCH_MAX_T, 32)
-#define BCH_ECC_MAX_BYTES DIV_ROUND_UP(BCH_MAX_M * BCH_MAX_T, 8)
#ifndef dbg
#define dbg(_fmt, args...) do {} while (0)
@@ -202,6 +201,9 @@ void encode_bch(struct bch_control *bch, const uint8_t *data,
const uint32_t * const tab3 = tab2 + 256*(l+1);
const uint32_t *pdata, *p0, *p1, *p2, *p3;
+ if (WARN_ON(r_bytes > sizeof(r)))
+ return;
+
if (ecc) {
/* load ecc parity bytes into internal 32-bit buffer */
load_ecc8(bch, bch->ecc_buf, ecc);
--
2.18.0