This is the start of the stable review cycle for the 3.18.109 release.
There are 23 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed May 16 06:46:49 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.109-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.18.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 3.18.109-rc1
Masami Hiramatsu <mhiramat(a)kernel.org>
tracing/uprobe_event: Fix strncpy corner case
Jimmy Assarsson <extja(a)kvaser.com>
can: kvaser_usb: Increase correct stats counter in kvaser_usb_rx_can_msg()
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
tracing: Fix regex_match_front() to not over compare the test string
Hans de Goede <hdegoede(a)redhat.com>
libata: Apply NOLPM quirk for SanDisk SD7UB3Q*G1001 SSDs
Johan Hovold <johan(a)kernel.org>
rfkill: gpio: fix memory leak in probe error path
Eric Dumazet <edumazet(a)google.com>
tcp: fix TCP_REPAIR_QUEUE bound checking
Jiri Olsa <jolsa(a)kernel.org>
perf: Remove superfluous allocation error check
Eric Dumazet <edumazet(a)google.com>
soreuseport: initialise timewait reuseport field
Eric Dumazet <edumazet(a)google.com>
net: fix uninit-value in __hw_addr_add_ex()
Eric Dumazet <edumazet(a)google.com>
net: initialize skb->peeked when cloning
Eric Dumazet <edumazet(a)google.com>
net: fix rtnh_ok()
Eric Dumazet <edumazet(a)google.com>
netlink: fix uninit-value in netlink_sendmsg
Bin Liu <b-liu(a)ti.com>
usb: musb: host: fix potential NULL pointer dereference
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
USB: serial: visor: handle potential invalid device configuration
SZ Lin (林上智) <sz.lin(a)moxa.com>
NET: usb: qmi_wwan: add support for ublox R410M PID 0x90b2
Leon Romanovsky <leonro(a)mellanox.com>
RDMA/mlx5: Protect from shift operand overflow
Takashi Iwai <tiwai(a)suse.de>
ALSA: aloop: Add missing cable lock to ctl API callbacks
Robert Rosengren <robert.rosengren(a)axis.com>
ALSA: aloop: Mark paused device as inactive
Takashi Iwai <tiwai(a)suse.de>
ALSA: seq: Fix races at MIDI encoding in snd_virmidi_output_trigger()
Takashi Iwai <tiwai(a)suse.de>
ALSA: pcm: Check PCM state at xfern compat ioctl
Murilo Opsfelder Araujo <muriloo(a)linux.ibm.com>
perf session: Fix undeclared 'oe'
Tan Xiaojun <tanxiaojun(a)huawei.com>
perf/core: Fix the perf_cpu_time_max_percent check
Tejun Heo <tj(a)kernel.org>
percpu: include linux/sched.h for cond_resched()
-------------
Diffstat:
Makefile | 4 +--
drivers/ata/libata-core.c | 3 ++
drivers/infiniband/hw/mlx5/qp.c | 4 +++
drivers/net/can/usb/kvaser_usb.c | 2 +-
drivers/net/usb/qmi_wwan.c | 1 +
drivers/usb/musb/musb_host.c | 4 ++-
drivers/usb/serial/visor.c | 69 +++++++++++++++++++-------------------
include/net/inet_timewait_sock.h | 1 +
include/net/nexthop.h | 2 +-
kernel/events/callchain.c | 10 ++----
kernel/events/core.c | 2 +-
kernel/trace/trace_events_filter.c | 3 ++
kernel/trace/trace_uprobe.c | 2 ++
mm/percpu.c | 1 +
net/core/dev_addr_lists.c | 4 +--
net/core/skbuff.c | 1 +
net/ipv4/inet_timewait_sock.c | 1 +
net/ipv4/tcp.c | 2 +-
net/netlink/af_netlink.c | 2 ++
net/rfkill/rfkill-gpio.c | 7 +++-
sound/core/pcm_compat.c | 2 ++
sound/core/seq/seq_virmidi.c | 4 +--
sound/drivers/aloop.c | 29 +++++++++++++---
tools/perf/util/session.c | 1 +
24 files changed, 102 insertions(+), 59 deletions(-)
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Doing an audit of trace events, I discovered two trace events in the xen
subsystem that use a hack to create zero data size trace events. This is not
what trace events are for. Trace events add memory footprint overhead, and
if all you need to do is see if a function is hit or not, simply make that
function noinline and use function tracer filtering.
Worse yet, the hack used was:
__array(char, x, 0)
Which creates a static string of zero in length. There's assumptions about
such constructs in ftrace that this is a dynamic string that is nul
terminated. This is not the case with these tracepoints and can cause
problems in various parts of ftrace.
Nuke the trace events!
Cc: stable(a)vger.kernel.org
Fixes: 95a7d76897c1e ("xen/mmu: Use Xen specific TLB flush instead of the generic one.")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
arch/x86/xen/mmu.c | 4 +---
arch/x86/xen/mmu_pv.c | 4 +---
include/trace/events/xen.h | 16 ----------------
3 files changed, 2 insertions(+), 22 deletions(-)
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index d33e7dbe3129..2d76106788a3 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -42,13 +42,11 @@ xmaddr_t arbitrary_virt_to_machine(void *vaddr)
}
EXPORT_SYMBOL_GPL(arbitrary_virt_to_machine);
-static void xen_flush_tlb_all(void)
+static noinline void xen_flush_tlb_all(void)
{
struct mmuext_op *op;
struct multicall_space mcs;
- trace_xen_mmu_flush_tlb_all(0);
-
preempt_disable();
mcs = xen_mc_entry(sizeof(*op));
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 486c0a34d00b..2c30cabfda90 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -1310,13 +1310,11 @@ unsigned long xen_read_cr2_direct(void)
return this_cpu_read(xen_vcpu_info.arch.cr2);
}
-static void xen_flush_tlb(void)
+static noinline void xen_flush_tlb(void)
{
struct mmuext_op *op;
struct multicall_space mcs;
- trace_xen_mmu_flush_tlb(0);
-
preempt_disable();
mcs = xen_mc_entry(sizeof(*op));
diff --git a/include/trace/events/xen.h b/include/trace/events/xen.h
index 7dd8f34c37df..fdcf88bcf0ea 100644
--- a/include/trace/events/xen.h
+++ b/include/trace/events/xen.h
@@ -352,22 +352,6 @@ DECLARE_EVENT_CLASS(xen_mmu_pgd,
DEFINE_XEN_MMU_PGD_EVENT(xen_mmu_pgd_pin);
DEFINE_XEN_MMU_PGD_EVENT(xen_mmu_pgd_unpin);
-TRACE_EVENT(xen_mmu_flush_tlb_all,
- TP_PROTO(int x),
- TP_ARGS(x),
- TP_STRUCT__entry(__array(char, x, 0)),
- TP_fast_assign((void)x),
- TP_printk("%s", "")
- );
-
-TRACE_EVENT(xen_mmu_flush_tlb,
- TP_PROTO(int x),
- TP_ARGS(x),
- TP_STRUCT__entry(__array(char, x, 0)),
- TP_fast_assign((void)x),
- TP_printk("%s", "")
- );
-
TRACE_EVENT(xen_mmu_flush_tlb_one_user,
TP_PROTO(unsigned long addr),
TP_ARGS(addr),
--
2.13.6
The patch titled
Subject: mm, oom: fix concurrent munlock and oom reaper unmap, v3
has been removed from the -mm tree. Its filename was
mm-oom-fix-concurrent-munlock-and-oom-reaper-unmap.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: David Rientjes <rientjes(a)google.com>
Subject: mm, oom: fix concurrent munlock and oom reaper unmap, v3
Since exit_mmap() is done without the protection of mm->mmap_sem, it is
possible for the oom reaper to concurrently operate on an mm until
MMF_OOM_SKIP is set.
This allows munlock_vma_pages_all() to concurrently run while the oom
reaper is operating on a vma. Since munlock_vma_pages_range() depends on
clearing VM_LOCKED from vm_flags before actually doing the munlock to
determine if any other vmas are locking the same memory, the check for
VM_LOCKED in the oom reaper is racy.
This is especially noticeable on architectures such as powerpc where
clearing a huge pmd requires serialize_against_pte_lookup(). If the pmd
is zapped by the oom reaper during follow_page_mask() after the check for
pmd_none() is bypassed, this ends up deferencing a NULL ptl or a kernel
oops.
Fix this by manually freeing all possible memory from the mm before doing
the munlock and then setting MMF_OOM_SKIP. The oom reaper can not run on
the mm anymore so the munlock is safe to do in exit_mmap(). It also
matches the logic that the oom reaper currently uses for determining when
to set MMF_OOM_SKIP itself, so there's no new risk of excessive oom
killing.
This issue fixes CVE-2018-1000200.
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.cor…
Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: David Rientjes <rientjes(a)google.com>
Suggested-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/oom.h | 2 +
mm/mmap.c | 44 +++++++++++++---------
mm/oom_kill.c | 81 ++++++++++++++++++++++--------------------
3 files changed, 71 insertions(+), 56 deletions(-)
diff -puN include/linux/oom.h~mm-oom-fix-concurrent-munlock-and-oom-reaper-unmap include/linux/oom.h
--- a/include/linux/oom.h~mm-oom-fix-concurrent-munlock-and-oom-reaper-unmap
+++ a/include/linux/oom.h
@@ -95,6 +95,8 @@ static inline int check_stable_address_s
return 0;
}
+void __oom_reap_task_mm(struct mm_struct *mm);
+
extern unsigned long oom_badness(struct task_struct *p,
struct mem_cgroup *memcg, const nodemask_t *nodemask,
unsigned long totalpages);
diff -puN mm/mmap.c~mm-oom-fix-concurrent-munlock-and-oom-reaper-unmap mm/mmap.c
--- a/mm/mmap.c~mm-oom-fix-concurrent-munlock-and-oom-reaper-unmap
+++ a/mm/mmap.c
@@ -3024,6 +3024,32 @@ void exit_mmap(struct mm_struct *mm)
/* mm's last user has gone, and its about to be pulled down */
mmu_notifier_release(mm);
+ if (unlikely(mm_is_oom_victim(mm))) {
+ /*
+ * Manually reap the mm to free as much memory as possible.
+ * Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
+ * this mm from further consideration. Taking mm->mmap_sem for
+ * write after setting MMF_OOM_SKIP will guarantee that the oom
+ * reaper will not run on this mm again after mmap_sem is
+ * dropped.
+ *
+ * Nothing can be holding mm->mmap_sem here and the above call
+ * to mmu_notifier_release(mm) ensures mmu notifier callbacks in
+ * __oom_reap_task_mm() will not block.
+ *
+ * This needs to be done before calling munlock_vma_pages_all(),
+ * which clears VM_LOCKED, otherwise the oom reaper cannot
+ * reliably test it.
+ */
+ mutex_lock(&oom_lock);
+ __oom_reap_task_mm(mm);
+ mutex_unlock(&oom_lock);
+
+ set_bit(MMF_OOM_SKIP, &mm->flags);
+ down_write(&mm->mmap_sem);
+ up_write(&mm->mmap_sem);
+ }
+
if (mm->locked_vm) {
vma = mm->mmap;
while (vma) {
@@ -3045,24 +3071,6 @@ void exit_mmap(struct mm_struct *mm)
/* update_hiwater_rss(mm) here? but nobody should be looking */
/* Use -1 here to ensure all VMAs in the mm are unmapped */
unmap_vmas(&tlb, vma, 0, -1);
-
- if (unlikely(mm_is_oom_victim(mm))) {
- /*
- * Wait for oom_reap_task() to stop working on this
- * mm. Because MMF_OOM_SKIP is already set before
- * calling down_read(), oom_reap_task() will not run
- * on this "mm" post up_write().
- *
- * mm_is_oom_victim() cannot be set from under us
- * either because victim->mm is already set to NULL
- * under task_lock before calling mmput and oom_mm is
- * set not NULL by the OOM killer only if victim->mm
- * is found not NULL while holding the task_lock.
- */
- set_bit(MMF_OOM_SKIP, &mm->flags);
- down_write(&mm->mmap_sem);
- up_write(&mm->mmap_sem);
- }
free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
tlb_finish_mmu(&tlb, 0, -1);
diff -puN mm/oom_kill.c~mm-oom-fix-concurrent-munlock-and-oom-reaper-unmap mm/oom_kill.c
--- a/mm/oom_kill.c~mm-oom-fix-concurrent-munlock-and-oom-reaper-unmap
+++ a/mm/oom_kill.c
@@ -469,7 +469,6 @@ bool process_shares_mm(struct task_struc
return false;
}
-
#ifdef CONFIG_MMU
/*
* OOM Reaper kernel thread which tries to reap the memory used by the OOM
@@ -480,16 +479,54 @@ static DECLARE_WAIT_QUEUE_HEAD(oom_reape
static struct task_struct *oom_reaper_list;
static DEFINE_SPINLOCK(oom_reaper_lock);
-static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
+void __oom_reap_task_mm(struct mm_struct *mm)
{
- struct mmu_gather tlb;
struct vm_area_struct *vma;
+
+ /*
+ * Tell all users of get_user/copy_from_user etc... that the content
+ * is no longer stable. No barriers really needed because unmapping
+ * should imply barriers already and the reader would hit a page fault
+ * if it stumbled over a reaped memory.
+ */
+ set_bit(MMF_UNSTABLE, &mm->flags);
+
+ for (vma = mm->mmap ; vma; vma = vma->vm_next) {
+ if (!can_madv_dontneed_vma(vma))
+ continue;
+
+ /*
+ * Only anonymous pages have a good chance to be dropped
+ * without additional steps which we cannot afford as we
+ * are OOM already.
+ *
+ * We do not even care about fs backed pages because all
+ * which are reclaimable have already been reclaimed and
+ * we do not want to block exit_mmap by keeping mm ref
+ * count elevated without a good reason.
+ */
+ if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED)) {
+ const unsigned long start = vma->vm_start;
+ const unsigned long end = vma->vm_end;
+ struct mmu_gather tlb;
+
+ tlb_gather_mmu(&tlb, mm, start, end);
+ mmu_notifier_invalidate_range_start(mm, start, end);
+ unmap_page_range(&tlb, vma, start, end, NULL);
+ mmu_notifier_invalidate_range_end(mm, start, end);
+ tlb_finish_mmu(&tlb, start, end);
+ }
+ }
+}
+
+static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
+{
bool ret = true;
/*
* We have to make sure to not race with the victim exit path
* and cause premature new oom victim selection:
- * __oom_reap_task_mm exit_mm
+ * oom_reap_task_mm exit_mm
* mmget_not_zero
* mmput
* atomic_dec_and_test
@@ -534,39 +571,8 @@ static bool __oom_reap_task_mm(struct ta
trace_start_task_reaping(tsk->pid);
- /*
- * Tell all users of get_user/copy_from_user etc... that the content
- * is no longer stable. No barriers really needed because unmapping
- * should imply barriers already and the reader would hit a page fault
- * if it stumbled over a reaped memory.
- */
- set_bit(MMF_UNSTABLE, &mm->flags);
-
- for (vma = mm->mmap ; vma; vma = vma->vm_next) {
- if (!can_madv_dontneed_vma(vma))
- continue;
+ __oom_reap_task_mm(mm);
- /*
- * Only anonymous pages have a good chance to be dropped
- * without additional steps which we cannot afford as we
- * are OOM already.
- *
- * We do not even care about fs backed pages because all
- * which are reclaimable have already been reclaimed and
- * we do not want to block exit_mmap by keeping mm ref
- * count elevated without a good reason.
- */
- if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED)) {
- const unsigned long start = vma->vm_start;
- const unsigned long end = vma->vm_end;
-
- tlb_gather_mmu(&tlb, mm, start, end);
- mmu_notifier_invalidate_range_start(mm, start, end);
- unmap_page_range(&tlb, vma, start, end, NULL);
- mmu_notifier_invalidate_range_end(mm, start, end);
- tlb_finish_mmu(&tlb, start, end);
- }
- }
pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
task_pid_nr(tsk), tsk->comm,
K(get_mm_counter(mm, MM_ANONPAGES)),
@@ -587,14 +593,13 @@ static void oom_reap_task(struct task_st
struct mm_struct *mm = tsk->signal->oom_mm;
/* Retry the down_read_trylock(mmap_sem) a few times */
- while (attempts++ < MAX_OOM_REAP_RETRIES && !__oom_reap_task_mm(tsk, mm))
+ while (attempts++ < MAX_OOM_REAP_RETRIES && !oom_reap_task_mm(tsk, mm))
schedule_timeout_idle(HZ/10);
if (attempts <= MAX_OOM_REAP_RETRIES ||
test_bit(MMF_OOM_SKIP, &mm->flags))
goto done;
-
pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
task_pid_nr(tsk), tsk->comm);
debug_show_all_locks();
_
Patches currently in -mm which might be from rientjes(a)google.com are
The patch titled
Subject: mm: sections are not offlined during memory hotremove
has been removed from the -mm tree. Its filename was
mm-sections-are-not-offlined-during-memory-hotremove.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Subject: mm: sections are not offlined during memory hotremove
Memory hotplug and hotremove operate with per-block granularity. If the
machine has a large amount of memory (more than 64G), the size of a memory
block can span multiple sections. By mistake, during hotremove we set
only the first section to offline state.
The bug was discovered because kernel selftest started to fail:
https://lkml.kernel.org/r/20180423011247.GK5563@yexl-desktop
After commit, "mm/memory_hotplug: optimize probe routine". But, the bug
is older than this commit. In this optimization we also added a check for
sections to be in a proper state during hotplug operation.
Link: http://lkml.kernel.org/r/20180427145257.15222-1-pasha.tatashin@oracle.com
Fixes: 2d070eab2e82 ("mm: consider zone which is not fully populated to have holes")
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Steven Sistare <steven.sistare(a)oracle.com>
Cc: Daniel Jordan <daniel.m.jordan(a)oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/sparse.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN mm/sparse.c~mm-sections-are-not-offlined-during-memory-hotremove mm/sparse.c
--- a/mm/sparse.c~mm-sections-are-not-offlined-during-memory-hotremove
+++ a/mm/sparse.c
@@ -629,7 +629,7 @@ void offline_mem_sections(unsigned long
unsigned long pfn;
for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
- unsigned long section_nr = pfn_to_section_nr(start_pfn);
+ unsigned long section_nr = pfn_to_section_nr(pfn);
struct mem_section *ms;
/*
_
Patches currently in -mm which might be from pasha.tatashin(a)oracle.com are
mm-allow-deferred-page-init-for-vmemmap-only.patch
sparc64-ng4-memset-32-bits-overflow.patch
The patch titled
Subject: z3fold: fix reclaim lock-ups
has been removed from the -mm tree. Its filename was
z3fold-fix-reclaim-lock-ups.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Vitaly Wool <vitalywool(a)gmail.com>
Subject: z3fold: fix reclaim lock-ups
Do not try to optimize in-page object layout while the page is under
reclaim. This fixes lock-ups on reclaim and improves reclaim performance
at the same time.
[akpm(a)linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20180430125800.444cae9706489f412ad12621@gmail.com
Signed-off-by: Vitaly Wool <vitaly.vul(a)sony.com>
Reported-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Guenter Roeck <linux(a)roeck-us.net>
Cc: <Oleksiy.Avramchenko(a)sony.com>
Cc: Matthew Wilcox <mawilcox(a)microsoft.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/z3fold.c | 42 ++++++++++++++++++++++++++++++------------
1 file changed, 30 insertions(+), 12 deletions(-)
diff -puN mm/z3fold.c~z3fold-fix-reclaim-lock-ups mm/z3fold.c
--- a/mm/z3fold.c~z3fold-fix-reclaim-lock-ups
+++ a/mm/z3fold.c
@@ -144,7 +144,8 @@ enum z3fold_page_flags {
PAGE_HEADLESS = 0,
MIDDLE_CHUNK_MAPPED,
NEEDS_COMPACTING,
- PAGE_STALE
+ PAGE_STALE,
+ UNDER_RECLAIM
};
/*****************
@@ -173,6 +174,7 @@ static struct z3fold_header *init_z3fold
clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
clear_bit(NEEDS_COMPACTING, &page->private);
clear_bit(PAGE_STALE, &page->private);
+ clear_bit(UNDER_RECLAIM, &page->private);
spin_lock_init(&zhdr->page_lock);
kref_init(&zhdr->refcount);
@@ -756,6 +758,10 @@ static void z3fold_free(struct z3fold_po
atomic64_dec(&pool->pages_nr);
return;
}
+ if (test_bit(UNDER_RECLAIM, &page->private)) {
+ z3fold_page_unlock(zhdr);
+ return;
+ }
if (test_and_set_bit(NEEDS_COMPACTING, &page->private)) {
z3fold_page_unlock(zhdr);
return;
@@ -840,6 +846,8 @@ static int z3fold_reclaim_page(struct z3
kref_get(&zhdr->refcount);
list_del_init(&zhdr->buddy);
zhdr->cpu = -1;
+ set_bit(UNDER_RECLAIM, &page->private);
+ break;
}
list_del_init(&page->lru);
@@ -887,25 +895,35 @@ static int z3fold_reclaim_page(struct z3
goto next;
}
next:
- spin_lock(&pool->lock);
if (test_bit(PAGE_HEADLESS, &page->private)) {
if (ret == 0) {
- spin_unlock(&pool->lock);
free_z3fold_page(page);
return 0;
}
- } else if (kref_put(&zhdr->refcount, release_z3fold_page)) {
- atomic64_dec(&pool->pages_nr);
+ spin_lock(&pool->lock);
+ list_add(&page->lru, &pool->lru);
+ spin_unlock(&pool->lock);
+ } else {
+ z3fold_page_lock(zhdr);
+ clear_bit(UNDER_RECLAIM, &page->private);
+ if (kref_put(&zhdr->refcount,
+ release_z3fold_page_locked)) {
+ atomic64_dec(&pool->pages_nr);
+ return 0;
+ }
+ /*
+ * if we are here, the page is still not completely
+ * free. Take the global pool lock then to be able
+ * to add it back to the lru list
+ */
+ spin_lock(&pool->lock);
+ list_add(&page->lru, &pool->lru);
spin_unlock(&pool->lock);
- return 0;
+ z3fold_page_unlock(zhdr);
}
- /*
- * Add to the beginning of LRU.
- * Pool lock has to be kept here to ensure the page has
- * not already been released
- */
- list_add(&page->lru, &pool->lru);
+ /* We started off locked to we need to lock the pool back */
+ spin_lock(&pool->lock);
}
spin_unlock(&pool->lock);
return -EAGAIN;
_
Patches currently in -mm which might be from vitalywool(a)gmail.com are
This is a note to let you know that I've just added the patch titled
serial: 8250: omap: Fix idling of clocks for unused uarts
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 13dc04d0e5fdc25c8f713ad23fdce51cf2bf96ba Mon Sep 17 00:00:00 2001
From: Tony Lindgren <tony(a)atomide.com>
Date: Fri, 4 May 2018 10:44:09 -0700
Subject: serial: 8250: omap: Fix idling of clocks for unused uarts
I noticed that unused UARTs won't necessarily idle properly always
unless at least one byte tx transfer is done first.
After some debugging I narrowed down the problem to the scr register
dma configuration bits that need to be set before softreset for the
clocks to idle. Unless we do this, the module clkctrl idlest bits
may be set to 1 instead of 3 meaning the clock will never idle and
is blocking deeper idle states for the whole domain.
This might be related to the configuration done by the bootloader
or kexec booting where certain configurations cause the 8250 or
the clkctrl clock to jam in a way where setting of the scr bits
and reset is needed to clear it. I've tried diffing the 8250
registers for the various modes, but did not see anything specific.
So far I've only seen this on omap4 but I'm suspecting this might
also happen on the other clkctrl using SoCs considering they
already have a quirk enabled for UART_ERRATA_CLOCK_DISABLE.
Let's fix the issue by configuring scr before reset for basic dma
even if we don't use it. The scr register will be reset when we do
softreset few lines after, and we restore scr on resume. We should
do this for all the SoCs with UART_ERRATA_CLOCK_DISABLE quirk flag
set since the ones with UART_ERRATA_CLOCK_DISABLE are all based
using clkctrl similar to omap4.
Looks like both OMAP_UART_SCR_DMAMODE_1 | OMAP_UART_SCR_DMAMODE_CTL
bits are needed for the clkctrl to idle after a softreset.
And we need to add omap4 to also use the UART_ERRATA_CLOCK_DISABLE
for the related workaround to be enabled. This same compatible
value will also be used for omap5.
Fixes: cdb929e4452a ("serial: 8250_omap: workaround errata around idling UART after using DMA")
Cc: Keerthy <j-keerthy(a)ti.com>
Cc: Matthijs van Duin <matthijsvanduin(a)gmail.com>
Cc: Sekhar Nori <nsekhar(a)ti.com>
Cc: Tero Kristo <t-kristo(a)ti.com>
Signed-off-by: Tony Lindgren <tony(a)atomide.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/8250/8250_omap.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
index 6aaa84355fd1..1b337fee07ed 100644
--- a/drivers/tty/serial/8250/8250_omap.c
+++ b/drivers/tty/serial/8250/8250_omap.c
@@ -1110,13 +1110,14 @@ static int omap8250_no_handle_irq(struct uart_port *port)
return 0;
}
+static const u8 omap4_habit = UART_ERRATA_CLOCK_DISABLE;
static const u8 am3352_habit = OMAP_DMA_TX_KICK | UART_ERRATA_CLOCK_DISABLE;
static const u8 dra742_habit = UART_ERRATA_CLOCK_DISABLE;
static const struct of_device_id omap8250_dt_ids[] = {
{ .compatible = "ti,omap2-uart" },
{ .compatible = "ti,omap3-uart" },
- { .compatible = "ti,omap4-uart" },
+ { .compatible = "ti,omap4-uart", .data = &omap4_habit, },
{ .compatible = "ti,am3352-uart", .data = &am3352_habit, },
{ .compatible = "ti,am4372-uart", .data = &am3352_habit, },
{ .compatible = "ti,dra742-uart", .data = &dra742_habit, },
@@ -1362,6 +1363,19 @@ static int omap8250_soft_reset(struct device *dev)
int sysc;
int syss;
+ /*
+ * At least on omap4, unused uarts may not idle after reset without
+ * a basic scr dma configuration even with no dma in use. The
+ * module clkctrl status bits will be 1 instead of 3 blocking idle
+ * for the whole clockdomain. The softreset below will clear scr,
+ * and we restore it on resume so this is safe to do on all SoCs
+ * needing omap8250_soft_reset() quirk. Do it in two writes as
+ * recommended in the comment for omap8250_update_scr().
+ */
+ serial_out(up, UART_OMAP_SCR, OMAP_UART_SCR_DMAMODE_1);
+ serial_out(up, UART_OMAP_SCR,
+ OMAP_UART_SCR_DMAMODE_1 | OMAP_UART_SCR_DMAMODE_CTL);
+
sysc = serial_in(up, UART_OMAP_SYSC);
/* softreset the UART */
--
2.17.0
This is a note to let you know that I've just added the patch titled
serial: samsung: fix maxburst parameter for DMA transactions
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From aa2f80e752c75e593b3820f42c416ed9458fa73e Mon Sep 17 00:00:00 2001
From: Marek Szyprowski <m.szyprowski(a)samsung.com>
Date: Thu, 10 May 2018 08:41:13 +0200
Subject: serial: samsung: fix maxburst parameter for DMA transactions
The best granularity of residue that DMA engine can report is in the BURST
units, so the serial driver must use MAXBURST = 1 and DMA_SLAVE_BUSWIDTH_1_BYTE
if it relies on exact number of bytes transferred by DMA engine.
Fixes: 62c37eedb74c ("serial: samsung: add dma reqest/release functions")
Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Acked-by: Krzysztof Kozlowski <krzk(a)kernel.org>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/samsung.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/tty/serial/samsung.c b/drivers/tty/serial/samsung.c
index 3f2f8c118ce0..64e96926f1ad 100644
--- a/drivers/tty/serial/samsung.c
+++ b/drivers/tty/serial/samsung.c
@@ -862,15 +862,12 @@ static int s3c24xx_serial_request_dma(struct s3c24xx_uart_port *p)
dma->rx_conf.direction = DMA_DEV_TO_MEM;
dma->rx_conf.src_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
dma->rx_conf.src_addr = p->port.mapbase + S3C2410_URXH;
- dma->rx_conf.src_maxburst = 16;
+ dma->rx_conf.src_maxburst = 1;
dma->tx_conf.direction = DMA_MEM_TO_DEV;
dma->tx_conf.dst_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
dma->tx_conf.dst_addr = p->port.mapbase + S3C2410_UTXH;
- if (dma_get_cache_alignment() >= 16)
- dma->tx_conf.dst_maxburst = 16;
- else
- dma->tx_conf.dst_maxburst = 1;
+ dma->tx_conf.dst_maxburst = 1;
dma->rx_chan = dma_request_chan(p->port.dev, "rx");
--
2.17.0
This is a note to let you know that I've just added the patch titled
tty: pl011: Avoid spuriously stuck-off interrupts
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 4a7e625ce50412a7711efa0f2ef0b96ce3826759 Mon Sep 17 00:00:00 2001
From: Dave Martin <Dave.Martin(a)arm.com>
Date: Thu, 10 May 2018 18:08:23 +0100
Subject: tty: pl011: Avoid spuriously stuck-off interrupts
Commit 9b96fbacda34 ("serial: PL011: clear pending interrupts")
clears the RX and receive timeout interrupts on pl011 startup, to
avoid a screaming-interrupt scenario that can occur when the
firmware or bootloader leaves these interrupts asserted.
This has been noted as an issue when running Linux on qemu [1].
Unfortunately, the above fix seems to lead to potential
misbehaviour if the RX FIFO interrupt is asserted _non_ spuriously
on driver startup, if the RX FIFO is also already full to the
trigger level.
Clearing the RX FIFO interrupt does not change the FIFO fill level.
In this scenario, because the interrupt is now clear and because
the FIFO is already full to the trigger level, no new assertion of
the RX FIFO interrupt can occur unless the FIFO is drained back
below the trigger level. This never occurs because the pl011
driver is waiting for an RX FIFO interrupt to tell it that there is
something to read, and does not read the FIFO at all until that
interrupt occurs.
Thus, simply clearing "spurious" interrupts on startup may be
misguided, since there is no way to be sure that the interrupts are
truly spurious, and things can go wrong if they are not.
This patch instead clears the interrupt condition by draining the
RX FIFO during UART startup, after clearing any potentially
spurious interrupt. This should ensure that an interrupt will
definitely be asserted if the RX FIFO subsequently becomes
sufficiently full.
The drain is done at the point of enabling interrupts only. This
means that it will occur any time the UART is newly opened through
the tty layer. It will not apply to polled-mode use of the UART by
kgdboc: since that scenario cannot use interrupts by design, this
should not matter. kgdboc will interact badly with "normal" use of
the UART in any case: this patch makes no attempt to paper over
such issues.
This patch does not attempt to address the case where the RX FIFO
fills faster than it can be drained: that is a pathological
hardware design problem that is beyond the scope of the driver to
work around. As a failsafe, the number of poll iterations for
draining the FIFO is limited to twice the FIFO size. This will
ensure that the kernel at least boots even if it is impossible to
drain the FIFO for some reason.
[1] [Qemu-devel] [Qemu-arm] [PATCH] pl011: do not put into fifo
before enabled the interruption
https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg06446.html
Reported-by: Wei Xu <xuwei5(a)hisilicon.com>
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: Linus Walleij <linus.walleij(a)linaro.org>
Cc: Peter Maydell <peter.maydell(a)linaro.org>
Fixes: 9b96fbacda34 ("serial: PL011: clear pending interrupts")
Signed-off-by: Dave Martin <Dave.Martin(a)arm.com>
Cc: stable <stable(a)vger.kernel.org>
Tested-by: Wei Xu <xuwei5(a)hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/amba-pl011.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 4b40a5b449ee..ebd33c0232e6 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -1727,10 +1727,26 @@ static int pl011_allocate_irq(struct uart_amba_port *uap)
*/
static void pl011_enable_interrupts(struct uart_amba_port *uap)
{
+ unsigned int i;
+
spin_lock_irq(&uap->port.lock);
/* Clear out any spuriously appearing RX interrupts */
pl011_write(UART011_RTIS | UART011_RXIS, uap, REG_ICR);
+
+ /*
+ * RXIS is asserted only when the RX FIFO transitions from below
+ * to above the trigger threshold. If the RX FIFO is already
+ * full to the threshold this can't happen and RXIS will now be
+ * stuck off. Drain the RX FIFO explicitly to fix this:
+ */
+ for (i = 0; i < uap->fifosize * 2; ++i) {
+ if (pl011_read(uap, REG_FR) & UART01x_FR_RXFE)
+ break;
+
+ pl011_read(uap, REG_DR);
+ }
+
uap->im = UART011_RTIM;
if (!pl011_dma_rx_running(uap))
uap->im |= UART011_RXIM;
--
2.17.0
This is a note to let you know that I've just added the patch titled
driver core: Don't ignore class_dir_create_and_add() failure.
to my driver-core git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git
in the driver-core-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 84d0c27d6233a9ba0578b20f5a09701eb66cee42 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Date: Mon, 7 May 2018 19:10:31 +0900
Subject: driver core: Don't ignore class_dir_create_and_add() failure.
syzbot is hitting WARN() at kernfs_add_one() [1].
This is because kernfs_create_link() is confused by previous device_add()
call which continued without setting dev->kobj.parent field when
get_device_parent() failed by memory allocation fault injection.
Fix this by propagating the error from class_dir_create_and_add() to
the calllers of get_device_parent().
[1] https://syzkaller.appspot.com/bug?id=fae0fb607989ea744526d1c082a5b8de652911…
Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+df47f81c226b31d89fb1(a)syzkaller.appspotmail.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/base/core.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index b610816eb887..d680fd030316 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1467,7 +1467,7 @@ class_dir_create_and_add(struct class *class, struct kobject *parent_kobj)
dir = kzalloc(sizeof(*dir), GFP_KERNEL);
if (!dir)
- return NULL;
+ return ERR_PTR(-ENOMEM);
dir->class = class;
kobject_init(&dir->kobj, &class_dir_ktype);
@@ -1477,7 +1477,7 @@ class_dir_create_and_add(struct class *class, struct kobject *parent_kobj)
retval = kobject_add(&dir->kobj, parent_kobj, "%s", class->name);
if (retval < 0) {
kobject_put(&dir->kobj);
- return NULL;
+ return ERR_PTR(retval);
}
return &dir->kobj;
}
@@ -1784,6 +1784,10 @@ int device_add(struct device *dev)
parent = get_device(dev->parent);
kobj = get_device_parent(dev, parent);
+ if (IS_ERR(kobj)) {
+ error = PTR_ERR(kobj);
+ goto parent_error;
+ }
if (kobj)
dev->kobj.parent = kobj;
@@ -1882,6 +1886,7 @@ int device_add(struct device *dev)
kobject_del(&dev->kobj);
Error:
cleanup_glue_dir(dev, glue_dir);
+parent_error:
put_device(parent);
name_error:
kfree(dev->p);
@@ -2701,6 +2706,11 @@ int device_move(struct device *dev, struct device *new_parent,
device_pm_lock();
new_parent = get_device(new_parent);
new_parent_kobj = get_device_parent(dev, new_parent);
+ if (IS_ERR(new_parent_kobj)) {
+ error = PTR_ERR(new_parent_kobj);
+ put_device(new_parent);
+ goto out;
+ }
pr_debug("device: '%s': %s: moving to '%s'\n", dev_name(dev),
__func__, new_parent ? dev_name(new_parent) : "<NULL>");
--
2.17.0
This is a note to let you know that I've just added the patch titled
1wire: family module autoload fails because of upper/lower case
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 065c09563c872e52813a17218c52cd642be1dca6 Mon Sep 17 00:00:00 2001
From: Ingo Flaschberger <ingo.flaschberger(a)gmail.com>
Date: Tue, 1 May 2018 16:10:33 +0200
Subject: 1wire: family module autoload fails because of upper/lower case
mismatch.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
1wire family module autoload fails because of upper/lower
case mismatch.
Signed-off-by: Ingo Flaschberger <ingo.flaschberger(a)gmail.com>
Acked-by: Evgeniy Polyakov <zbr(a)ioremap.net>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/w1/w1.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/w1/w1.c b/drivers/w1/w1.c
index 80a778b02f28..caef0e0fd817 100644
--- a/drivers/w1/w1.c
+++ b/drivers/w1/w1.c
@@ -751,7 +751,7 @@ int w1_attach_slave_device(struct w1_master *dev, struct w1_reg_num *rn)
/* slave modules need to be loaded in a context with unlocked mutex */
mutex_unlock(&dev->mutex);
- request_module("w1-family-0x%02x", rn->family);
+ request_module("w1-family-0x%02X", rn->family);
mutex_lock(&dev->mutex);
spin_lock(&w1_flock);
--
2.17.0