In the presence of multi-order entries the typical
pagevec_lookup_entries() pattern may loop forever:
	while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
				min(end - index, (pgoff_t)PAGEVEC_SIZE),
				indices)) {
		...
		for (i = 0; i < pagevec_count(&pvec); i++) {
			index = indices[i];
			...
		}
		index++; /* BUG */
	}
The loop updates 'index' to each index found and then increments it to
continue the lookup at the next possible page. However, if the last
entry in the pagevec is a multi-order entry, the next possible page
index is more than one page away. Fix this locally for the
filesystem-dax case by checking for dax multi-order entries. Going
forward, new users of multi-order entries need to be similarly careful,
or we need a generic way to report the page increment in the radix
iterator.
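To see the livelock concretely: with, say, an order-9 (2MB on x86)
entry at index 0, a lookup starting anywhere in 0..511 returns that
same entry with indices[i] == 0, so 'index' is pulled back to 0 and
index++ can never step past the entry. A sketch of the fixed shape
(mirroring the diff below; dax_radix_order() is 0 for PTE entries, so
order-0 entries still advance by one page):

	while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
				min(end - index, (pgoff_t)PAGEVEC_SIZE),
				indices)) {
		pgoff_t nr_pages = 1;

		for (i = 0; i < pagevec_count(&pvec); i++) {
			index = indices[i];
			...
			/* entry obtained under the i_pages lock */
			nr_pages = 1UL << dax_radix_order(entry);
		}
		index += nr_pages; /* step over a multi-order entry */
	}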
Fixes: 5fac7408d828 ("mm, fs, dax: handle layout changes to pinned dax...")
Cc: <stable(a)vger.kernel.org>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Ross Zwisler <zwisler(a)kernel.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
fs/dax.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 4becbf168b7f..c1472eede1f7 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -666,6 +666,8 @@ struct page *dax_layout_busy_page(struct address_space *mapping)
 	while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
 				min(end - index, (pgoff_t)PAGEVEC_SIZE),
 				indices)) {
+		pgoff_t nr_pages = 1;
+
 		for (i = 0; i < pagevec_count(&pvec); i++) {
 			struct page *pvec_ent = pvec.pages[i];
 			void *entry;
@@ -680,8 +682,11 @@ struct page *dax_layout_busy_page(struct address_space *mapping)
 			xa_lock_irq(&mapping->i_pages);
 			entry = get_unlocked_mapping_entry(mapping, index, NULL);
-			if (entry)
+			if (entry) {
 				page = dax_busy_page(entry);
+				/* account for multi-order entries */
+				nr_pages = 1UL << dax_radix_order(entry);
+			}
 			put_unlocked_mapping_entry(mapping, index, entry);
 			xa_unlock_irq(&mapping->i_pages);
 			if (page)
@@ -696,7 +701,7 @@ struct page *dax_layout_busy_page(struct address_space *mapping)
 		 */
 		pagevec_remove_exceptionals(&pvec);
 		pagevec_release(&pvec);
-		index++;
+		index += nr_pages;
 		if (page)
 			break;
From: "H. Peter Anvin (Intel)" <hpa(a)zytor.com>
It turns out that Alpha is the only architecture that never
implemented BOTHER and IBSHIFT, features that are otherwise ages old.
This is one thing that has held up glibc support for this feature
(all other architectures have supported these for about a decade,
since well before the current glibc cutoff of kernel 3.2).
Furthermore, in the process of dealing with this, I discovered that
the current code in tty_baudrate.c can read past the end of the
baud_table[] on Alpha and PowerPC. The second patch in this series
fixes that; it also cleans up the code substantially by
auto-generating the table and, since all architectures now define
BOTHER and IBSHIFT, removing all the conditionals on their existence.
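The overrun is the classic unchecked-table-walk shape. A hedged sketch
of the safe lookup (names mirror tty_baudrate.c, but this is
illustrative rather than the literal patch):

	static const speed_t baud_table[] = {
		0, 50, 75, 110, 134, 150, 200, 300, /* ... */
	};
	static const unsigned int n_baud_table = ARRAY_SIZE(baud_table);

	/* clamp the decoded CBAUD index before dereferencing the table */
	if (cbaud >= n_baud_table)
		return 0;	/* unknown code: report B0 rather than read
				   past the end of baud_table[] */
	return baud_table[cbaud];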
Tagging for stable because these are concrete problems. I have a much
bigger update in progress, nearly done, which will clean up a lot of
duplicated code and make the uapi headers usable for libc, but that is
not critical at the same level.
Tested on x86, compile-tested on Alpha.
Signed-off-by: H. Peter Anvin (Intel) <hpa(a)zytor.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Jiri Slaby <jslaby(a)suse.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Richard Henderson <rth(a)twiddle.net>
Cc: Ivan Kokshaysky <ink(a)jurassic.park.msu.ru>
Cc: Matt Turner <mattst88(a)gmail.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Kate Stewart <kstewart(a)linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne(a)nexb.com>
Cc: Eugene Syromiatnikov <esyr(a)redhat.com>
Cc: <linux-alpha(a)vger.kernel.org>
Cc: <linux-serial(a)vger.kernel.org>
Cc: Johan Hovold <johan(a)kernel.org>
Cc: Alan Cox <alan(a)lxorguk.ukuu.org.uk>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Paul Mackerras <paulus(a)samba.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: <stable(a)vger.kernel.org>
---
 arch/alpha/include/asm/termios.h       |   8 +-
 arch/alpha/include/uapi/asm/ioctls.h   |   5 +
 arch/alpha/include/uapi/asm/termbits.h |  17 +++
 drivers/tty/.gitignore                 |   1 +
 drivers/tty/Makefile                   |  16 +++
 drivers/tty/bmacros.c                  |   2 +
 drivers/tty/tty_baudrate.c             | 190 +++++++++++++--------------
 7 files changed, 122 insertions(+), 117 deletions(-)
The code that cleans a transaction's lists of checkpoint buffers has a
bug: it increases the bh refcount only after releasing
journal->j_list_lock. Thus the following race is possible:
CPU0                                    CPU1
jbd2_log_do_checkpoint()
                                        jbd2_journal_try_to_free_buffers()
                                          __journal_try_to_free_buffer(bh)
  ...
  while (transaction->t_checkpoint_io_list)
  ...
    if (buffer_locked(bh)) {

<-- IO completes now, buffer gets unlocked -->

      spin_unlock(&journal->j_list_lock);
                                            spin_lock(&journal->j_list_lock);
                                            __jbd2_journal_remove_checkpoint(jh);
                                            spin_unlock(&journal->j_list_lock);
                                          try_to_free_buffers(page);
      get_bh(bh) <-- accesses freed bh
Fix the problem by grabbing bh reference before unlocking
journal->j_list_lock.
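The ordering rule, as a sketch matching both hunks below: take the
reference while the lock still guarantees the buffer is on the list,
then drop the lock:

	spin_lock(&journal->j_list_lock);
	...
	if (buffer_locked(bh)) {
		get_bh(bh);		/* pin bh first... */
		spin_unlock(&journal->j_list_lock);	/* ...then unlock */
		wait_on_buffer(bh);
		/* the journal_head may have gone by now */
		BUFFER_TRACE(bh, "brelse");
		...
	}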
Fixes: dc6e8d669cf5cb3ff84707c372c0a2a8a5e80845
Fixes: be1158cc615fd723552f0d9912087423c7cadda5
Reported-by: syzbot+7f4a27091759e2fe7453(a)syzkaller.appspotmail.com
CC: stable(a)vger.kernel.org
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/jbd2/checkpoint.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index c125d662777c..26f8d7e46462 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -251,8 +251,8 @@ int jbd2_log_do_checkpoint(journal_t *journal)
 		bh = jh2bh(jh);
 		if (buffer_locked(bh)) {
-			spin_unlock(&journal->j_list_lock);
 			get_bh(bh);
+			spin_unlock(&journal->j_list_lock);
 			wait_on_buffer(bh);
 			/* the journal_head may have gone by now */
 			BUFFER_TRACE(bh, "brelse");
@@ -333,8 +333,8 @@ int jbd2_log_do_checkpoint(journal_t *journal)
 		jh = transaction->t_checkpoint_io_list;
 		bh = jh2bh(jh);
 		if (buffer_locked(bh)) {
-			spin_unlock(&journal->j_list_lock);
 			get_bh(bh);
+			spin_unlock(&journal->j_list_lock);
 			wait_on_buffer(bh);
 			/* the journal_head may have gone by now */
 			BUFFER_TRACE(bh, "brelse");
--
2.16.4
From: Ashish Samant <ashish.samant(a)oracle.com>
Subject: ocfs2: fix locking for res->tracking and dlm->tracking_list
In dlm_init_lockres() we access and modify res->tracking and
dlm->tracking_list without holding dlm->track_lock. This can cause
list corruption and end up in a kernel panic.
Fix this by protecting res->tracking and dlm->tracking_list with
dlm->track_lock instead of dlm->spinlock.
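As a sketch of the invariant being restored, every access to
dlm->tracking_list takes dlm->track_lock (the removal-side snippet is
paraphrased from the existing release path, not part of this patch):

	/* insertion, dlm_init_lockres(), after the fix */
	spin_lock(&dlm->track_lock);
	list_add_tail(&res->tracking, &dlm->tracking_list);
	spin_unlock(&dlm->track_lock);

	/* removal, under the same lock */
	spin_lock(&dlm->track_lock);
	if (!list_empty(&res->tracking))
		list_del_init(&res->tracking);
	spin_unlock(&dlm->track_lock);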
Link: http://lkml.kernel.org/r/1529951192-4686-1-git-send-email-ashish.samant@ora…
Signed-off-by: Ashish Samant <ashish.samant(a)oracle.com>
Reviewed-by: Changwei Ge <ge.changwei(a)h3c.com>
Acked-by: Joseph Qi <jiangqi903(a)gmail.com>
Acked-by: Jun Piao <piaojun(a)huawei.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <ge.changwei(a)h3c.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/dlm/dlmmaster.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/ocfs2/dlm/dlmmaster.c~ocfs2-fix-locking-for-res-tracking-and-dlm-tracking_list
+++ a/fs/ocfs2/dlm/dlmmaster.c
@@ -584,9 +584,9 @@ static void dlm_init_lockres(struct dlm_
 	res->last_used = 0;
-	spin_lock(&dlm->spinlock);
+	spin_lock(&dlm->track_lock);
 	list_add_tail(&res->tracking, &dlm->tracking_list);
-	spin_unlock(&dlm->spinlock);
+	spin_unlock(&dlm->track_lock);
 	memset(res->lvb, 0, DLM_LVB_LEN);
 	memset(res->refmap, 0, sizeof(res->refmap));
_
From: Jann Horn <jannh(a)google.com>
Subject: mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly
5dd0b16cdaff ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even
on UP") made the availability of the NR_TLB_REMOTE_FLUSH* counters inside
the kernel unconditional to reduce #ifdef soup, but (either to avoid
showing dummy zero counters to userspace, or because that code was missed)
didn't update the vmstat_text array, meaning that all following counters
would be shown with incorrect values.
This only affects kernel builds with
CONFIG_VM_EVENT_COUNTERS=y && CONFIG_DEBUG_TLBFLUSH=y && CONFIG_SMP=n.
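Why every later counter shifts: /proc/vmstat pairs names and values
purely by position, roughly (a simplified sketch of the vmstat.c logic,
not the literal code):

	unsigned long *v = ...;	/* counter snapshot, in enum order */
	int i;

	/* one string per counter slot; a missing string shifts the rest */
	for (i = 0; i < ARRAY_SIZE(vmstat_text); i++)
		seq_printf(m, "%s %lu\n", vmstat_text[i], v[i]);

Hence the fix below adds empty placeholder strings for the two counters
that are compiled in but not exposed on !SMP.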
Link: http://lkml.kernel.org/r/20181001143138.95119-2-jannh@google.com
Fixes: 5dd0b16cdaff ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP")
Signed-off-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Christoph Lameter <clameter(a)sgi.com>
Cc: Kemi Wang <kemi.wang(a)intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmstat.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/vmstat.c~mm-vmstat-skip-nr_tlb_remote_flush-properly
+++ a/mm/vmstat.c
@@ -1275,6 +1275,9 @@ const char * const vmstat_text[] = {
 #ifdef CONFIG_SMP
 	"nr_tlb_remote_flush",
 	"nr_tlb_remote_flush_received",
+#else
+	"", /* nr_tlb_remote_flush */
+	"", /* nr_tlb_remote_flush_received */
 #endif /* CONFIG_SMP */
 	"nr_tlb_local_flush_all",
 	"nr_tlb_local_flush_one",
_