March 2018 - Linux-stable-mirror

[Linux-stable-mirror] [PATCH] scsi: lpfc: Switch memcpy_fromio() to __ioread32_copy()

by Huacai Chen

In commit bc73905abf770192 ("[SCSI] lpfc 8.3.16: SLI Additions, updates,
and code cleanup"), lpfc_memcpy_to_slim() switched from memcpy_toio() to
__iowrite32_copy() in order to prevent unaligned 64-bit copies. Recently, we
found that lpfc_memcpy_from_slim() has a similar issue, so switch it from
memcpy_fromio() to __ioread32_copy().

Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
 drivers/scsi/lpfc/lpfc_compat.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_compat.h b/drivers/scsi/lpfc/lpfc_compat.h
index 6b32b0a..47d4fad 100644
--- a/drivers/scsi/lpfc/lpfc_compat.h
+++ b/drivers/scsi/lpfc/lpfc_compat.h
@@ -91,8 +91,8 @@ lpfc_memcpy_to_slim( void __iomem *dest, void *src, unsigned int bytes)
 static inline void
 lpfc_memcpy_from_slim( void *dest, void __iomem *src, unsigned int bytes)
 {
-	/* actually returns 1 byte past dest */
-	memcpy_fromio( dest, src, bytes);
+	/* convert bytes in argument list to word count for copy function */
+	__ioread32_copy(dest, src, bytes / sizeof(uint32_t));
 }
 
 #endif	/* __BIG_ENDIAN */
-- 
2.7.0
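The key detail of the change is the conversion from a byte count to a count of 32-bit words. The following is a minimal userspace sketch of that conversion, not the driver code: copy32_from() is a hypothetical stand-in for __ioread32_copy() and operates on ordinary memory rather than __iomem space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for __ioread32_copy(): copies `count`
 * 32-bit words using one 32-bit load and one 32-bit store per word.
 * The real kernel helper reads from __iomem space.
 */
static void copy32_from(void *dest, const volatile void *src, size_t count)
{
	uint32_t *d = dest;
	const volatile uint32_t *s = src;

	while (count--)
		*d++ = *s++;
}

/* Mirrors the shape of the patched lpfc_memcpy_from_slim(). */
static void from_slim_sketch(void *dest, const volatile void *src,
			     unsigned int bytes)
{
	assert(dest != NULL && src != NULL);
	/* Integer division: a trailing remainder (bytes % 4) is not copied. */
	copy32_from(dest, src, bytes / sizeof(uint32_t));
}
```

Because of the integer division, a byte count that is not a multiple of four copies only the whole words; callers in this sketch are assumed to pass word-multiple sizes.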


[PATCH] Revert "mm: page_alloc: skip over regions of invalid pfns where possible"

by Daniel Vacek

This reverts commit b92df1de5d289c0b5d653e72414bf0850b8511e0. The commit is
meant to be a boot init speed up skipping the loop in memmap_init_zone() for
invalid pfns. But given some specific memory mapping on x86_64 (or more
generally theoretically anywhere but on arm with CONFIG_HAVE_ARCH_PFN_VALID)
the implementation also skips valid pfns which is plain wrong and causes
'kernel BUG at mm/page_alloc.c:1389!'

crash> log | grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1
kernel BUG at mm/page_alloc.c:1389!
invalid opcode: 0000 [#1] SMP
--
RIP: 0010:[<ffffffff8118833e>] [<ffffffff8118833e>] move_freepages+0x15e/0x160
RSP: 0018:ffff88054d727688 EFLAGS: 00010087
--
Call Trace:
 [<ffffffff811883b3>] move_freepages_block+0x73/0x80
 [<ffffffff81189e63>] __rmqueue+0x263/0x460
 [<ffffffff8118c781>] get_page_from_freelist+0x7e1/0x9e0
 [<ffffffff8118caf6>] __alloc_pages_nodemask+0x176/0x420
--
RIP [<ffffffff8118833e>] move_freepages+0x15e/0x160
 RSP <ffff88054d727688>

crash> page_init_bug -v | grep RAM
<struct resource 0xffff88067fffd2f8>      1000 -     9bfff System RAM (620.00 KiB)
<struct resource 0xffff88067fffd3a0>    100000 -  430bffff System RAM ( 1.05 GiB = 1071.75 MiB = 1097472.00 KiB)
<struct resource 0xffff88067fffd410>  4b0c8000 -  4bf9cfff System RAM ( 14.83 MiB = 15188.00 KiB)
<struct resource 0xffff88067fffd480>  4bfac000 -  646b1fff System RAM (391.02 MiB = 400408.00 KiB)
<struct resource 0xffff88067fffd560>  7b788000 -  7b7fffff System RAM (480.00 KiB)
<struct resource 0xffff88067fffd640> 100000000 - 67fffffff System RAM ( 22.00 GiB)

crash> page_init_bug | head -6
<struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB)
<struct page 0xffffea0001ede200> 1fffff00000000 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575
<struct page 0xffffea0001ede200> 505736 505344 <struct page 0xffffea0001ed8000> 505855 <struct page 0xffffea0001edffc0>
<struct page 0xffffea0001ed8000> 0 0 <struct pglist_data 0xffff88047ffd9000> 0 <struct zone 0xffff88047ffd9000> DMA 1 4095
<struct page 0xffffea0001edffc0> 1fffff00000400 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575
BUG, zones differ!

crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000
      PAGE        PHYSICAL  MAPPING  INDEX  CNT  FLAGS
ffffea0001e00000  78000000        0      0    0  0
ffffea0001ed7fc0  7b5ff000        0      0    0  0
ffffea0001ed8000  7b600000        0      0    0  0  <<<<
ffffea0001ede1c0  7b787000        0      0    0  0
ffffea0001ede200  7b788000        0      0    1  1fffff00000000

Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx(a)redhat.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Paul Burton <paul.burton(a)imgtec.com>
Cc: stable(a)vger.kernel.org
---
 include/linux/memblock.h |  1 -
 mm/memblock.c            | 28 ----------------------------
 mm/page_alloc.c          | 11 +----------
 3 files changed, 1 insertion(+), 39 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 8be5077efb5f..f92ea7783652 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -187,7 +187,6 @@ int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
 			    unsigned long *end_pfn);
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
 			  unsigned long *out_end_pfn, int *out_nid);
-unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
 
 /**
  * for_each_mem_pfn_range - early memory pfn range iterator
diff --git a/mm/memblock.c b/mm/memblock.c
index b6ba6b7adadc..48376bd33274 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1101,34 +1101,6 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
 		*out_nid = r->nid;
 }
 
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
-						       unsigned long max_pfn)
-{
-	struct memblock_type *type = &memblock.memory;
-	unsigned int right = type->cnt;
-	unsigned int mid, left = 0;
-	phys_addr_t addr = PFN_PHYS(++pfn);
-
-	do {
-		mid = (right + left) / 2;
-
-		if (addr < type->regions[mid].base)
-			right = mid;
-		else if (addr >= (type->regions[mid].base +
-				  type->regions[mid].size))
-			left = mid + 1;
-		else {
-			/* addr is within the region, so pfn is valid */
-			return pfn;
-		}
-	} while (left < right);
-
-	if (right == type->cnt)
-		return -1UL;
-	else
-		return PHYS_PFN(type->regions[right].base);
-}
-
 /**
  * memblock_set_node - set node ID on memblock regions
  * @base: base of area to set node ID for
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 635d7dd29d7f..e4566a3f8083 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5356,17 +5356,8 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		if (context != MEMMAP_EARLY)
 			goto not_early;
 
-		if (!early_pfn_valid(pfn)) {
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
-			/*
-			 * Skip to the pfn preceding the next valid one (or
-			 * end_pfn), such that we hit a valid pfn (or end_pfn)
-			 * on our next iteration of the loop.
-			 */
-			pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
-#endif
+		if (!early_pfn_valid(pfn))
 			continue;
-		}
 		if (!early_pfn_in_nid(pfn, nid))
 			continue;
 		if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
-- 
2.16.2
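One possible reading of how valid pfns get skipped can be sketched in userspace. The sketch assumes that pfn validity is decided per memory section (the section size of 0x8000 pfns is an assumption, not stated in the message), while the removed helper jumps straight to the next memblock region. The region table reuses two of the System RAM ranges from the log; all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct region { unsigned long start_pfn, end_pfn; };	/* [start, end) */

#define PFNS_PER_SECTION 0x8000UL	/* assumed section size in pfns */

/* Two of the System RAM ranges from the log, converted to pfns. */
static const struct region regions[] = {
	{ 0x4bfac, 0x646b2 },
	{ 0x7b788, 0x7b800 },
};
#define NR_REGIONS (sizeof(regions) / sizeof(regions[0]))

/* Binary search modelled on the removed memblock_next_valid_pfn(). */
static unsigned long next_valid_pfn(unsigned long pfn)
{
	size_t left = 0, right = NR_REGIONS, mid;

	pfn++;
	do {
		mid = (left + right) / 2;
		if (pfn < regions[mid].start_pfn)
			right = mid;
		else if (pfn >= regions[mid].end_pfn)
			left = mid + 1;
		else
			return pfn;	/* pfn lies inside a region */
	} while (left < right);

	return right == NR_REGIONS ? -1UL : regions[right].start_pfn;
}

/* Assumed validity rule: a pfn is valid if any region touches its section. */
static bool section_valid(unsigned long pfn)
{
	unsigned long sec_start = pfn & ~(PFNS_PER_SECTION - 1);
	unsigned long sec_end = sec_start + PFNS_PER_SECTION;

	for (size_t i = 0; i < NR_REGIONS; i++)
		if (regions[i].start_pfn < sec_end &&
		    regions[i].end_pfn > sec_start)
			return true;
	return false;
}

/* Count section-valid pfns in [from, to), i.e. pfns a jump would skip. */
static unsigned long count_skipped_valid(unsigned long from, unsigned long to)
{
	unsigned long n = 0;

	assert(from <= to);
	for (unsigned long pfn = from; pfn < to; pfn++)
		if (section_valid(pfn))
			n++;
	return n;
}
```

Under these assumptions, a jump from pfn 0x68000 lands on 0x7b788 and passes over pfns 0x78000 through 0x7b787, which share a section with real memory. That range matches the pages that kmem -p shows with zeroed flags.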


[Linux-stable-mirror] [PATCH] MIPS: Fix arch_trigger_cpumask_backtrace()

by Huacai Chen

SysRq-L and the RCU stall detector call arch_trigger_cpumask_backtrace() to
trigger backtraces on other CPUs, but its behavior is totally broken. The
root cause is that arch_trigger_cpumask_backtrace() uses the call-function
IPI in irq context, which triggers deadlocks in smp_call_function_single()
and smp_call_function_many().

This patch fixes arch_trigger_cpumask_backtrace() by:
1, Using a dedicated IPI (SMP_CPU_BACKTRACE) to trigger backtraces;
2, If the current CPU is in the target cpumask, doing its backtrace locally
   and clearing it from the mask;
3, Using a spinlock to avoid parallel backtrace output;
4, Handling the SMP_CPU_BACKTRACE IPI for Loongson-3.

I have attempted to implement SMP_CPU_BACKTRACE for all MIPS CPUs, but I
failed because some of their IPIs are not extensible. :(

Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
 arch/mips/include/asm/smp.h           |  3 +++
 arch/mips/kernel/process.c            | 23 ++++++++++++++++++-----
 arch/mips/loongson64/loongson-3/smp.c |  6 ++++++
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/mips/include/asm/smp.h b/arch/mips/include/asm/smp.h
index 88ebd83..b0521f4 100644
--- a/arch/mips/include/asm/smp.h
+++ b/arch/mips/include/asm/smp.h
@@ -43,6 +43,7 @@ extern int __cpu_logical_map[NR_CPUS];
 /* Octeon - Tell another core to flush its icache */
 #define SMP_ICACHE_FLUSH	0x4
 #define SMP_ASK_C0COUNT		0x8
+#define SMP_CPU_BACKTRACE	0x10
 
 /* Mask of CPUs which are currently definitely operating coherently */
 extern cpumask_t cpu_coherent_mask;
@@ -81,6 +82,8 @@ static inline void __cpu_die(unsigned int cpu)
 extern void play_dead(void);
 #endif
 
+void arch_dump_stack(void);
+
 /*
  * This function will set up the necessary IPIs for Linux to communicate
  * with the CPUs in mask.
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 57028d4..647e15d 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -655,26 +655,39 @@ unsigned long arch_align_stack(unsigned long sp)
 	return sp & ALMASK;
 }
 
-static void arch_dump_stack(void *info)
+void arch_dump_stack(void)
 {
 	struct pt_regs *regs;
+	static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
 
+	arch_spin_lock(&lock);
 	regs = get_irq_regs();
 
 	if (regs)
 		show_regs(regs);
 
 	dump_stack();
+	arch_spin_unlock(&lock);
 }
 
 void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	long this_cpu = get_cpu();
+	struct cpumask backtrace_mask;
+	extern const struct plat_smp_ops *mp_ops;
+
+	cpumask_copy(&backtrace_mask, mask);
+	if (cpumask_test_cpu(this_cpu, mask)) {
+		if (!exclude_self) {
+			struct pt_regs *regs = get_irq_regs();
+			if (regs)
+				show_regs(regs);
+			dump_stack();
+		}
+		cpumask_clear_cpu(this_cpu, &backtrace_mask);
+	}
 
-	if (cpumask_test_cpu(this_cpu, mask) && !exclude_self)
-		dump_stack();
-
-	smp_call_function_many(mask, arch_dump_stack, NULL, 1);
+	mp_ops->send_ipi_mask(&backtrace_mask, SMP_CPU_BACKTRACE);
 
 	put_cpu();
 }
diff --git a/arch/mips/loongson64/loongson-3/smp.c b/arch/mips/loongson64/loongson-3/smp.c
index 8501109..0655114 100644
--- a/arch/mips/loongson64/loongson-3/smp.c
+++ b/arch/mips/loongson64/loongson-3/smp.c
@@ -291,6 +291,12 @@ void loongson3_ipi_interrupt(struct pt_regs *regs)
 		__wbflush(); /* Let others see the result ASAP */
 	}
 
+	if (action & SMP_CPU_BACKTRACE) {
+		irq_enter();
+		arch_dump_stack();
+		irq_exit();
+	}
+
 	if (irqs) {
 		int irq;
 		while ((irq = ffs(irqs))) {
-- 
2.7.0
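The mask handling in the patched arch_trigger_cpumask_backtrace() can be illustrated with a plain bitmask. This is a minimal sketch in which an unsigned long stands in for struct cpumask and the helper name backtrace_ipi_mask() is hypothetical: the current CPU is handled locally (unless excluded) and is always removed from the set of IPI targets.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns the mask of CPUs that should receive the backtrace IPI.
 * *dump_local is set when the calling CPU should dump its own stack.
 * An unsigned long stands in for struct cpumask in this sketch.
 */
static unsigned long backtrace_ipi_mask(unsigned long mask, int this_cpu,
					bool exclude_self, bool *dump_local)
{
	assert(this_cpu >= 0 && this_cpu < (int)(8 * sizeof(unsigned long)));

	unsigned long self_bit = 1UL << this_cpu;

	/* Dump locally only if we were asked for and not excluded. */
	*dump_local = (mask & self_bit) && !exclude_self;

	/* Never send the IPI to ourselves. */
	return mask & ~self_bit;
}
```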


[PATCH] vmw_balloon: fixing double free when batching mode is off

by Nadav Amit

From: Gil Kupfer <gilkup(a)gmail.com>

The balloon.page field is used for two different purposes depending on
whether batching is on or off. If batching is on, the field points to the
page which is used to communicate with the hypervisor. If it is off,
balloon.page points to the page that is about to be (un)locked.

Unfortunately, this dual purpose of the field introduced a bug: when the
balloon is popped (e.g., when the machine is reset or the balloon driver is
explicitly removed), the balloon driver unconditionally frees the page that
is held in balloon.page. As a result, if batching is disabled, this leads to
double freeing the last page that was sent to the hypervisor.

The following error occurs during rmmod when kernel checkers are on, and the
balloon is not empty:

[ 42.307653] ------------[ cut here ]------------
[ 42.307657] Kernel BUG at ffffffffba1e4b28 [verbose debug info unavailable]
[ 42.307720] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 42.312512] Modules linked in: vmw_vsock_vmci_transport vsock ppdev joydev vmw_balloon(-) input_leds serio_raw vmw_vmci parport_pc shpchp parport i2c_piix4 nfit mac_hid autofs4 vmwgfx drm_kms_helper hid_generic syscopyarea sysfillrect usbhid sysimgblt fb_sys_fops hid ttm mptspi scsi_transport_spi ahci mptscsih drm psmouse vmxnet3 libahci mptbase pata_acpi
[ 42.312766] CPU: 10 PID: 1527 Comm: rmmod Not tainted 4.12.0+ #5
[ 42.312803] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2016
[ 42.313042] task: ffff9bf9680f8000 task.stack: ffffbfefc1638000
[ 42.313290] RIP: 0010:__free_pages+0x38/0x40
[ 42.313510] RSP: 0018:ffffbfefc163be98 EFLAGS: 00010246
[ 42.313731] RAX: 000000000000003e RBX: ffffffffc02b9720 RCX: 0000000000000006
[ 42.313972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9bf97e08e0a0
[ 42.314201] RBP: ffffbfefc163be98 R08: 0000000000000000 R09: 0000000000000000
[ 42.314435] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc02b97e4
[ 42.314505] R13: ffffffffc02b9748 R14: ffffffffc02b9728 R15: 0000000000000200
[ 42.314550] FS: 00007f3af5fec700(0000) GS:ffff9bf97e080000(0000) knlGS:0000000000000000
[ 42.314599] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 42.314635] CR2: 00007f44f6f4ab24 CR3: 00000003a7d12000 CR4: 00000000000006e0
[ 42.314864] Call Trace:
[ 42.315774] vmballoon_pop+0x102/0x130 [vmw_balloon]
[ 42.315816] vmballoon_exit+0x42/0xd64 [vmw_balloon]
[ 42.315853] SyS_delete_module+0x1e2/0x250
[ 42.315891] entry_SYSCALL_64_fastpath+0x23/0xc2
[ 42.315924] RIP: 0033:0x7f3af5b0e8e7
[ 42.315949] RSP: 002b:00007fffe6ce0148 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 42.315996] RAX: ffffffffffffffda RBX: 000055be676401e0 RCX: 00007f3af5b0e8e7
[ 42.316951] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055be67640248
[ 42.317887] RBP: 0000000000000003 R08: 0000000000000000 R09: 1999999999999999
[ 42.318845] R10: 0000000000000883 R11: 0000000000000206 R12: 00007fffe6cdf130
[ 42.319755] R13: 0000000000000000 R14: 0000000000000000 R15: 000055be676401e0
[ 42.320606] Code: c0 74 1c f0 ff 4f 1c 74 02 5d c3 85 f6 74 07 e8 0f d8 ff ff 5d c3 31 f6 e8 c6 fb ff ff 5d c3 48 c7 c6 c8 0f c5 ba e8 58 be 02 00 <0f> 0b 66 0f 1f 44 00 00 66 66 66 66 90 48 85 ff 75 01 c3 55 48
[ 42.323462] RIP: __free_pages+0x38/0x40 RSP: ffffbfefc163be98
[ 42.325735] ---[ end trace 872e008e33f81508 ]---

To solve the bug, we eliminate the dual purpose of balloon.page.

Fixes: 220a80f0c2e7 ("VMware balloon: add batching to the vmw_balloon.")
Cc: stable(a)vger.kernel.org
Reported-by: Oleksandr Natalenko <onatalen(a)redhat.com>
Signed-off-by: Gil Kupfer <gilkup(a)gmail.com>
Signed-off-by: Nadav Amit <namit(a)vmware.com>
Reviewed-by: Xavier Deguillard <xdeguillard(a)vmware.com>
---
 drivers/misc/vmw_balloon.c | 23 +++++++----------------
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 9047c0a529b2..efd733472a35 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -576,15 +576,9 @@ static void vmballoon_pop(struct vmballoon *b)
 		}
 	}
 
-	if (b->batch_page) {
-		vunmap(b->batch_page);
-		b->batch_page = NULL;
-	}
-
-	if (b->page) {
-		__free_page(b->page);
-		b->page = NULL;
-	}
+	/* Clearing the batch_page unconditionally has no adverse effect */
+	free_page((unsigned long)b->batch_page);
+	b->batch_page = NULL;
 }
 
 /*
@@ -991,16 +985,13 @@ static const struct vmballoon_ops vmballoon_batched_ops = {
 
 static bool vmballoon_init_batching(struct vmballoon *b)
 {
-	b->page = alloc_page(VMW_PAGE_ALLOC_NOSLEEP);
-	if (!b->page)
-		return false;
+	struct page *page;
 
-	b->batch_page = vmap(&b->page, 1, VM_MAP, PAGE_KERNEL);
-	if (!b->batch_page) {
-		__free_page(b->page);
+	page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	if (!page)
 		return false;
-	}
 
+	b->batch_page = page_address(page);
 	return true;
 }
 
-- 
2.14.1


[Linux-stable-mirror] [PATCH 1/2] bdi: make sure congestion states are clear on free

by Tejun Heo

FUSE has a bug where it fails to clear congestion states if a connection
gets aborted while congested, which can leave nr_wb_congested[] stuck until
reboot causing wait_iff_congested() to wait spuriously.

While the bdi owner, FUSE, is primarily responsible for clearing congestion
states before destroying bdi_writebacks, the bdi layer can ensure that
congestion states are not leaked beyond the bdi_writeback lifecycle.

Signed-off-by: Tejun Heo <tj(a)kernel.org>
Reported-by: Joshua Miller <joshmiller(a)fb.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Jan Kara <jack(a)suse.cz>
Cc: stable(a)vger.kernel.org
---
 include/linux/backing-dev.h | 14 +++++++++++++-
 mm/backing-dev.c            |  2 +-
 2 files changed, 14 insertions(+), 2 deletions(-)

--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -220,6 +220,18 @@ static inline int bdi_sched_wait(void *w
 	return 0;
 }
 
+static inline void __wb_congested_free(struct bdi_writeback_congested *congested)
+{
+	/*
+	 * Make sure congestion states are cleared before freeing to avoid
+	 * nr_wb_congested() corruption which can lead to misbehaving
+	 * wait_iff_congested().
+	 */
+	clear_wb_congested(congested, BLK_RW_SYNC);
+	clear_wb_congested(congested, BLK_RW_ASYNC);
+	kfree(congested);
+}
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 
 struct bdi_writeback_congested *
@@ -409,7 +421,7 @@ wb_congested_get_create(struct backing_d
 static inline void wb_congested_put(struct bdi_writeback_congested *congested)
 {
 	if (atomic_dec_and_test(&congested->refcnt))
-		kfree(congested);
+		__wb_congested_free(congested);
 }
 
 static inline struct bdi_writeback *wb_find_current(struct backing_dev_info *bdi)
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -509,7 +509,7 @@ void wb_congested_put(struct bdi_writeba
 	}
 
 	spin_unlock_irqrestore(&cgwb_lock, flags);
-	kfree(congested);
+	__wb_congested_free(congested);
 }
 
 static void cgwb_release_workfn(struct work_struct *work)
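The idea of clearing per-object state before freeing, so that a global counter cannot be left elevated, can be sketched in userspace. All names below are hypothetical stand-ins (nr_congested for nr_wb_congested[], congested_sketch for struct bdi_writeback_congested), and the sketch ignores locking and atomics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum { SKETCH_SYNC = 0, SKETCH_ASYNC = 1 };

/* Global per-direction counters, standing in for nr_wb_congested[]. */
static int nr_congested[2];

struct congested_sketch {
	bool state[2];	/* congestion flag per direction */
};

static void set_congested(struct congested_sketch *c, int sync)
{
	if (!c->state[sync]) {
		c->state[sync] = true;
		nr_congested[sync]++;
	}
}

static void clear_congested(struct congested_sketch *c, int sync)
{
	if (c->state[sync]) {
		c->state[sync] = false;
		nr_congested[sync]--;
	}
}

/* Clear both directions before freeing so the counters cannot leak. */
static void congested_free(struct congested_sketch *c)
{
	assert(c != NULL);
	clear_congested(c, SKETCH_SYNC);
	clear_congested(c, SKETCH_ASYNC);
	free(c);
}
```

Calling free() directly on an object that is still marked congested would leave the matching counter at a non-zero value forever, which is the leak the patch guards against.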


[Linux-stable-mirror] [PATCH] mm/kasan: Don't vfree() nonexistent vm_area.

by Andrey Ryabinin

KASAN uses different routines to map shadow for hot-added memory and for
memory obtained during the boot process. An attempt to offline memory that
was onlined by the normal boot process leads to this:

    Trying to vfree() nonexistent vm area (000000005d3b34b9)
    WARNING: CPU: 2 PID: 13215 at mm/vmalloc.c:1525 __vunmap+0x147/0x190

    Call Trace:
     kasan_mem_notifier+0xad/0xb9
     notifier_call_chain+0x166/0x260
     __blocking_notifier_call_chain+0xdb/0x140
     __offline_pages+0x96a/0xb10
     memory_subsys_offline+0x76/0xc0
     device_offline+0xb8/0x120
     store_mem_state+0xfa/0x120
     kernfs_fop_write+0x1d5/0x320
     __vfs_write+0xd4/0x530
     vfs_write+0x105/0x340
     SyS_write+0xb0/0x140

Obviously we can't call vfree() to free memory that wasn't allocated via
vmalloc(). Use find_vm_area() to see if we can call vfree().

Unfortunately it's a bit tricky to properly unmap and free shadow allocated
during boot, so we'll have to keep it. If the memory comes online again,
that shadow will be reused.

Fixes: fa69b5989bb0 ("mm/kasan: add support for memory hotplug")
Reported-by: Paul Menzel <pmenzel+linux-kasan-dev(a)molgen.mpg.de>
Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: <stable(a)vger.kernel.org>
---
 mm/kasan/kasan.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index e13d911251e7..0d9d9d268f32 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -791,6 +791,41 @@ DEFINE_ASAN_SET_SHADOW(f5);
 DEFINE_ASAN_SET_SHADOW(f8);
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+static bool shadow_mapped(unsigned long addr)
+{
+	pgd_t *pgd = pgd_offset_k(addr);
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	if (pgd_none(*pgd))
+		return false;
+	p4d = p4d_offset(pgd, addr);
+	if (p4d_none(*p4d))
+		return false;
+	pud = pud_offset(p4d, addr);
+	if (pud_none(*pud))
+		return false;
+
+	/*
+	 * We can't use pud_large() or pud_huge(), the first one
+	 * is arch-specific, the last one depend on HUGETLB_PAGE.
+	 * So let's abuse pud_bad(), if bud is bad it's has to
+	 * because it's huge.
+	 */
+	if (pud_bad(*pud))
+		return true;
+	pmd = pmd_offset(pud, addr);
+	if (pmd_none(*pmd))
+		return false;
+
+	if (pmd_bad(*pmd))
+		return true;
+	pte = pte_offset_kernel(pmd, addr);
+	return !pte_none(*pte);
+}
+
 static int __meminit kasan_mem_notifier(struct notifier_block *nb,
 			unsigned long action, void *data)
 {
@@ -812,6 +847,14 @@ static int __meminit kasan_mem_notifier(struct notifier_block *nb,
 	case MEM_GOING_ONLINE: {
 		void *ret;
 
+		/*
+		 * If shadow is mapped already than it must have been mapped
+		 * during the boot. This could happen if we onlining previously
+		 * offlined memory.
+		 */
+		if (shadow_mapped(shadow_start))
+			return NOTIFY_OK;
+
 		ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
 					shadow_end, GFP_KERNEL,
 					PAGE_KERNEL, VM_NO_GUARD,
@@ -823,8 +866,18 @@ static int __meminit kasan_mem_notifier(struct notifier_block *nb,
 		kmemleak_ignore(ret);
 		return NOTIFY_OK;
 	}
-	case MEM_OFFLINE:
-		vfree((void *)shadow_start);
+	case MEM_OFFLINE: {
+		struct vm_struct *vm;
+
+		/*
+		 * Only hot-added memory have vm_area. Freeing shadow
+		 * mapped during boot would be tricky, so we'll just
+		 * have to keep it.
+		 */
+		vm = find_vm_area((void *)shadow_start);
+		if (vm)
+			vfree((void *)shadow_start);
+	}
 	}
 
 	return NOTIFY_OK;
-- 
2.13.6


[PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer

by Theodore Ts'o

Previously the jbd2 layer assumed that a file system check would be required
after a journal abort. In the case of the deliberate file system shutdown,
this should not be necessary. Allow the jbd2 layer to distinguish between
these two cases by using the ESHUTDOWN errno.

Also add proper locking to __journal_abort_soft().

Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Cc: stable(a)vger.kernel.org
---
 fs/ext4/ioctl.c   |  4 ++--
 fs/jbd2/journal.c | 25 +++++++++++++++++++------
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 16d3d1325f5b..9ac33a7cbd32 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -493,13 +493,13 @@ static int ext4_shutdown(struct super_block *sb, unsigned long arg)
 		set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
 		if (sbi->s_journal && !is_journal_aborted(sbi->s_journal)) {
 			(void) ext4_force_commit(sb);
-			jbd2_journal_abort(sbi->s_journal, 0);
+			jbd2_journal_abort(sbi->s_journal, -ESHUTDOWN);
 		}
 		break;
 	case EXT4_GOING_FLAGS_NOLOGFLUSH:
 		set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
 		if (sbi->s_journal && !is_journal_aborted(sbi->s_journal))
-			jbd2_journal_abort(sbi->s_journal, 0);
+			jbd2_journal_abort(sbi->s_journal, -ESHUTDOWN);
 		break;
 	default:
 		return -EINVAL;
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 3fbf48ec2188..efa0c72a0b9f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1483,12 +1483,15 @@ static void jbd2_mark_journal_empty(journal_t *journal, int write_op)
 void jbd2_journal_update_sb_errno(journal_t *journal)
 {
 	journal_superblock_t *sb = journal->j_superblock;
+	int errcode;
 
 	read_lock(&journal->j_state_lock);
-	jbd_debug(1, "JBD2: updating superblock error (errno %d)\n",
-		  journal->j_errno);
-	sb->s_errno = cpu_to_be32(journal->j_errno);
+	errcode = journal->j_errno;
 	read_unlock(&journal->j_state_lock);
+	if (errcode == -ESHUTDOWN)
+		errcode = 0;
+	jbd_debug(1, "JBD2: updating superblock error (errno %d)\n", errcode);
+	sb->s_errno = cpu_to_be32(errcode);
 
 	jbd2_write_superblock(journal, REQ_SYNC | REQ_FUA);
 }
@@ -2105,12 +2108,22 @@ void __jbd2_journal_abort_hard(journal_t *journal)
  * but don't do any other IO. */
 static void __journal_abort_soft (journal_t *journal, int errno)
 {
-	if (journal->j_flags & JBD2_ABORT)
-		return;
+	int old_errno;
 
-	if (!journal->j_errno)
+	write_lock(&journal->j_state_lock);
+	old_errno = journal->j_errno;
+	if (!journal->j_errno || errno == -ESHUTDOWN)
 		journal->j_errno = errno;
 
+	if (journal->j_flags & JBD2_ABORT) {
+		write_unlock(&journal->j_state_lock);
+		if (!old_errno && old_errno != -ESHUTDOWN &&
+		    errno == -ESHUTDOWN)
+			jbd2_journal_update_sb_errno(journal);
+		return;
+	}
+	write_unlock(&journal->j_state_lock);
+
 	__jbd2_journal_abort_hard(journal);
 
 	if (errno) {
-- 
2.16.1.72.g5be1f00a9a
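The two errno rules in the diff can be isolated as pure functions. This is a minimal userspace sketch with hypothetical names; it models only how the in-memory error is updated on abort and which value is written to the superblock, and ignores locking and I/O.

```c
#include <assert.h>
#include <errno.h>

/*
 * Value recorded in the on-disk superblock: a deliberate shutdown
 * (-ESHUTDOWN) is stored as 0 so that no file system check is forced.
 */
static int sb_errno_to_record(int j_errno)
{
	assert(j_errno <= 0);	/* kernel-style negative errno or 0 */
	return j_errno == -ESHUTDOWN ? 0 : j_errno;
}

/*
 * In-memory update on abort: the first error sticks, except that
 * -ESHUTDOWN always overrides whatever was recorded before.
 */
static int abort_update_errno(int cur, int incoming)
{
	if (!cur || incoming == -ESHUTDOWN)
		return incoming;
	return cur;
}
```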


[PATCH 1/2] MIPS: memset.S: EVA & fault support for small_memset

by Matt Redfearn

The MIPS kernel memset / bzero implementation includes a small_memset
branch which is used when the region to be set is smaller than a long (4
bytes on 32bit, 8 bytes on 64bit). The current small_memset implementation
uses a simple store byte loop to write the destination. There are 2 issues
with this implementation:

1. When EVA mode is active, user and kernel address spaces may overlap.
Currently the use of the sb instruction means kernel mode addressing is
always used and an intended write to userspace may actually overwrite some
critical kernel data.

2. If the write triggers a page fault, for example by calling
__clear_user(NULL, 2), instead of gracefully handling the fault, an OOPS is
triggered.

Fix these issues by replacing the sb instruction with the EX() macro, which
will emit EVA compatible instructions as required. Additionally implement a
fault fixup for small_memset which sets a2 to the number of bytes that could
not be cleared (as defined by __clear_user).

Reported-by: Chuanhua Lei <chuanhua.lei(a)intel.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Matt Redfearn <matt.redfearn(a)mips.com>
---
 arch/mips/lib/memset.S | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/mips/lib/memset.S b/arch/mips/lib/memset.S
index a1456664d6c2..90bcdf1224ee 100644
--- a/arch/mips/lib/memset.S
+++ b/arch/mips/lib/memset.S
@@ -219,7 +219,7 @@
 1:	PTR_ADDIU	a0, 1			/* fill bytewise */
 	R10KCBARRIER(0(ra))
 	bne		t1, a0, 1b
-	sb		a1, -1(a0)
+	EX(sb, a1, -1(a0), .Lsmall_fixup\@)
 
 2:	jr		ra			/* done */
 	move		a2, zero
@@ -260,6 +260,11 @@
 	jr		ra
 	andi		v1, a2, STORMASK
 
+.Lsmall_fixup\@:
+	PTR_SUBU	a2, t1, a0
+	jr		ra
+	PTR_ADDIU	a2, 1
+
 	.endm
 
 /*
-- 
2.7.4
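The arithmetic of the fixup (a2 = t1 - a0 + 1) can be checked with a small C model of the byte loop. The sketch assumes that t1 holds the end address of the region, which the hunk itself does not show, and it uses offsets instead of pointers; fault_at is a hypothetical first offset at which a store would fault.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Models the small_memset byte loop. a0 is incremented before the store,
 * and the store targets a0 - 1, mirroring "sb a1, -1(a0)" in the branch
 * delay slot. Returns the number of bytes that were NOT set.
 */
static size_t small_memset_sketch(unsigned char *buf, size_t len,
				  unsigned char val, size_t fault_at)
{
	size_t a0 = 0;		/* current offset, plays the role of a0 */
	size_t t1 = len;	/* end offset, assumed role of t1 */

	assert(buf != NULL);
	while (a0 != t1) {
		a0++;				/* PTR_ADDIU a0, 1 */
		if (a0 - 1 >= fault_at)		/* the store would fault */
			return t1 - a0 + 1;	/* fixup: a2 = t1 - a0 + 1 */
		buf[a0 - 1] = val;		/* sb a1, -1(a0) */
	}
	return 0;				/* move a2, zero */
}
```

The "+ 1" compensates for a0 having already been advanced past the byte whose store faulted, so that byte is counted among those not cleared.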


[PATCH v3] PCI / PM: Always check PME wakeup capability for runtime wakeup support

by Kai-Heng Feng

USB controller ASM1042 stops working after commit de3ef1eb1cd0 ("PM / core:
Drop run_wake flag from struct dev_pm_info").

The device in question is not power managed by platform firmware;
furthermore, it only supports PME# from D3cold:

Capabilities: [78] Power Management version 3
	Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
	Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-

Before commit de3ef1eb1cd0, the device never gets runtime suspended. After
that commit, the device gets runtime suspended, so it does not respond to
any PME#.

usb_hcd_pci_probe() mandatorily calls device_wakeup_enable(), hence
device_can_wakeup() in pci_dev_run_wake() always returns true.

So pci_dev_run_wake() needs to check PME wakeup capability as its first
condition.

In addition, change the wakeup flag passed to pci_target_state() from false
to true, because we want to find the deepest state in which the device can
still generate PME#.

Fixes: de3ef1eb1cd0 ("PM / core: Drop run_wake flag from struct dev_pm_info")
Cc: stable(a)vger.kernel.org # 4.13+
Signed-off-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
---
v3: State the reason why the wakeup flag gets changed.
v2: Explicitly check dev->pme_support.

 drivers/pci/pci.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f6a4dd10d9b0..52821a21fc07 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2125,16 +2125,16 @@ bool pci_dev_run_wake(struct pci_dev *dev)
 {
 	struct pci_bus *bus = dev->bus;
 
-	if (device_can_wakeup(&dev->dev))
-		return true;
-
 	if (!dev->pme_support)
 		return false;
 
 	/* PME-capable in principle, but not from the target power state */
-	if (!pci_pme_capable(dev, pci_target_state(dev, false)))
+	if (!pci_pme_capable(dev, pci_target_state(dev, true)))
 		return false;
 
+	if (device_can_wakeup(&dev->dev))
+		return true;
+
 	while (bus->parent) {
 		struct pci_dev *bridge = bus->self;
 
-- 
2.15.1
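The effect of reordering the checks can be seen in a small model of the early part of pci_dev_run_wake(). This is a sketch with hypothetical types and names: target_state is supplied by the caller instead of being computed by pci_target_state(), and the bridge walk at the end of the real function is reduced to a CHECK_BRIDGES verdict.

```c
#include <assert.h>
#include <stdbool.h>

enum sk_state { SK_D0, SK_D1, SK_D2, SK_D3HOT, SK_D3COLD };
enum sk_verdict { SK_NO_WAKE, SK_WAKE, SK_CHECK_BRIDGES };

struct dev_sketch {
	bool can_wakeup;		/* models device_can_wakeup() */
	unsigned int pme_support;	/* bit n set: PME# possible from state n */
	enum sk_state target_state;	/* assumed pci_target_state() result */
};

static bool pme_capable(const struct dev_sketch *d, enum sk_state state)
{
	return (d->pme_support & (1u << state)) != 0;
}

/* Order of checks before the patch. */
static enum sk_verdict run_wake_old(const struct dev_sketch *d)
{
	assert(d != 0);
	if (d->can_wakeup)
		return SK_WAKE;
	if (!d->pme_support)
		return SK_NO_WAKE;
	if (!pme_capable(d, d->target_state))
		return SK_NO_WAKE;
	return SK_CHECK_BRIDGES;
}

/* Order of checks after the patch: PME capability comes first. */
static enum sk_verdict run_wake_new(const struct dev_sketch *d)
{
	assert(d != 0);
	if (!d->pme_support)
		return SK_NO_WAKE;
	if (!pme_capable(d, d->target_state))
		return SK_NO_WAKE;
	if (d->can_wakeup)
		return SK_WAKE;
	return SK_CHECK_BRIDGES;
}
```

For a device that advertises PME# only from D3cold but has wakeup enabled by its driver, the old order reports that it can wake at run time, while the new order does not whenever the target state is shallower than D3cold.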


[PATCH v2 2/2] x86/mm: implement free pmd/pte page interfaces

by Toshi Kani

Implement pud_free_pmd_page() and pmd_free_pte_page() on x86, which clear a
given pud/pmd entry and free up lower level page table(s). The address range
associated with the pud/pmd entry must have been purged by INVLPG.

Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings")
Signed-off-by: Toshi Kani <toshi.kani(a)hpe.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
---
 arch/x86/mm/pgtable.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 1eed7ed518e6..34cda7e0551b 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -712,7 +712,22 @@ int pmd_clear_huge(pmd_t *pmd)
  */
 int pud_free_pmd_page(pud_t *pud)
 {
-	return pud_none(*pud);
+	pmd_t *pmd;
+	int i;
+
+	if (pud_none(*pud))
+		return 1;
+
+	pmd = (pmd_t *)pud_page_vaddr(*pud);
+
+	for (i = 0; i < PTRS_PER_PMD; i++)
+		if (!pmd_free_pte_page(&pmd[i]))
+			return 0;
+
+	pud_clear(pud);
+	free_page((unsigned long)pmd);
+
+	return 1;
 }
 
 /**
@@ -724,6 +739,15 @@ int pud_free_pmd_page(pud_t *pud)
  */
 int pmd_free_pte_page(pmd_t *pmd)
 {
-	return pmd_none(*pmd);
+	pte_t *pte;
+
+	if (pmd_none(*pmd))
+		return 1;
+
+	pte = (pte_t *)pmd_page_vaddr(*pmd);
+	pmd_clear(pmd);
+	free_page((unsigned long)pte);
+
+	return 1;
 }
 #endif	/* CONFIG_HAVE_ARCH_HUGE_VMAP */
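The shape of the two functions, clearing an entry and then freeing the table it pointed to, can be modelled with a toy two-level table in userspace. The types, the fan-out of 4, and the pages_freed counter are all hypothetical; real page tables, TLB purging, and locking are out of scope for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

#define PTRS_SKETCH 4	/* tiny fan-out, for illustration only */

static int pages_freed;	/* counts tables released by the sketch */

struct pmd_sketch { void *pte_page; };			/* NULL: pmd_none() */
struct pud_sketch { struct pmd_sketch *pmd_page; };	/* NULL: pud_none() */

/* Clear the pmd entry, then free the pte table it pointed to. */
static int pmd_free_pte_page_sketch(struct pmd_sketch *pmd)
{
	void *pte;

	if (!pmd->pte_page)
		return 1;

	pte = pmd->pte_page;
	pmd->pte_page = NULL;	/* models pmd_clear() */
	free(pte);
	pages_freed++;
	return 1;
}

/* Free every pte table below the pud, then the pmd table itself. */
static int pud_free_pmd_page_sketch(struct pud_sketch *pud)
{
	struct pmd_sketch *pmd;
	int i;

	assert(pud != NULL);
	if (!pud->pmd_page)
		return 1;

	pmd = pud->pmd_page;
	for (i = 0; i < PTRS_SKETCH; i++)
		if (!pmd_free_pte_page_sketch(&pmd[i]))
			return 0;

	pud->pmd_page = NULL;	/* models pud_clear() */
	free(pmd);
	pages_freed++;
	return 1;
}
```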

