On Tue, Jan 2, 2018 at 12:48 PM, kernelci.org bot <bot(a)kernelci.org> wrote:
Hi Ben,
almost a clean build with kernelci!
> Errors summary:
> 1 drivers/scsi/mpt2sas/mpt2sas_base.c:3550:1: internal compiler error: in extract_constrain_insn, at recog.c:2190
> 1 drivers/scsi/mpt2sas/mpt2sas_base.c:3550:1: error: insn does not satisfy its constraints:
See earlier discussion https://www.spinics.net/lists/stable/msg195996.html
> Warnings summary:
> 54 include/linux/stddef.h:8:14: warning: 'return' with a value, in function returning void
This comes from an incorrect backport of commit
49e67dd17649 ("of: fdt: add missing allocation-failure check")
It's harmless, and stable/linux-3.18.y has the correct version:
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -380,6 +380,6 @@ static void __unflatten_device_tree(void *blob,
/* Allocate memory for the expanded device tree */
mem = dt_alloc(size + 4, __alignof__(struct device_node));
if (!mem)
- return NULL;
+ return;
memset(mem, 0, size);
> 2 ipc/sem.c:377:6: warning: '___p1' may be used uninitialized in this function [-Wmaybe-uninitialized]
This code was last touched in 3.16 by the backport of commit
5864a2fd3088 ("ipc/sem.c: fix complex_count vs. simple op race")
The warning is in "smp_load_acquire(&sma->complex_mode))", and I suspect
that commit 27d7be1801a4 ("ipc/sem.c: avoid using spin_unlock_wait()")
avoided the warning upstream by removing the smp_mb() before it.
The code is way too complex for a fly-by analysis, so I'm adding Manfred
to Cc here. It may be worth comparing the full list of backports that
went into ipc/sem.c in 3.16.y with those in 3.18.y and 4.1.y that don't
have the warning. Here is what I see in the git history:
$ git log --oneline v3.16..stable/linux-3.16.y ipc/sem.c
accb9f16adba ipc/sem.c: fix complex_count vs. simple op race
5b11c133308b ipc: remove use of seq_printf return value
08397b1a5cd4 sysv, ipc: fix security-layer leaking
35cfc2b3a9da ipc/sem.c: fully initialize sem_array before making it visible
69a9a86b645f ipc/sem.c: update/correct memory barriers
30f995ba77ca ipc/sem.c: change memory barrier in sem_lock() to smp_rmb()
76ce4fe19d6b ipc,sem: fix use after free on IPC_RMID after a task
using same semaphore set exits
$ git log --oneline v3.16..stable/linux-3.18.y ipc/sem.c
7dd90826dfba sysv, ipc: fix security-layer leaking
ff12efa03da1 ipc/sem.c: update/correct memory barriers
38b50c47c25e ipc,sem: fix use after free on IPC_RMID after a task
using same semaphore set exits
e8577d1f0329 ipc/sem.c: fully initialize sem_array before making it visible
$ git log --oneline v3.16..stable/linux-4.1.y ipc/sem.c
e2b438fdfa4d sysv, ipc: fix security-layer leaking
b6805da60f01 ipc/sem.c: update/correct memory barriers
7be83cf01024 ipc,sem: fix use after free on IPC_RMID after a task
using same semaphore set exits
7f032d6ef615 ipc: remove use of seq_printf return value
52644c9ab3fa ipc,sem: use current->state helpers
2e094abfd1f2 ipc/sem.c: change memory barrier in sem_lock() to smp_rmb()
e8577d1f0329 ipc/sem.c: fully initialize sem_array before making it visible
$ git log --oneline v3.16..stable/linux-4.4.y ipc/sem.c
f6031d95320d ipc/sem.c: fix complex_count vs. simple op race
62659f0b9ed7 sysv, ipc: fix security-layer leaking
3ed1f8a99d70 ipc/sem.c: update/correct memory barriers
a97955844807 ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()
602b8593d2b4 ipc,sem: fix use after free on IPC_RMID after a task
using same semaphore set exits
55b7ae50167e ipc: rename ipc_obtain_object
7f032d6ef615 ipc: remove use of seq_printf return value
52644c9ab3fa ipc,sem: use current->state helpers
2e094abfd1f2 ipc/sem.c: change memory barrier in sem_lock() to smp_rmb()
e8577d1f0329 ipc/sem.c: fully initialize sem_array before making it visible
$ git log --oneline v3.16..stable/linux-4.9.y ipc/sem.c
2a1613a586de ipc/sem.c: add cond_resched in exit_sme
5864a2fd3088 ipc/sem.c: fix complex_count vs. simple op race
9b24fef9f041 sysv, ipc: fix security-layer leaking
be3e78449803 locking/spinlock: Update spin_unlock_wait() users
33ac279677dc locking/barriers: Introduce smp_acquire__after_ctrl_dep()
a5f4db877177 ipc/sem: make semctl setting sempid consistent
1d5cfdb07628 tree wide: use kvfree() than conditional kfree()/vfree()
3ed1f8a99d70 ipc/sem.c: update/correct memory barriers
a97955844807 ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()
602b8593d2b4 ipc,sem: fix use after free on IPC_RMID after a task
using same semaphore set exits
55b7ae50167e ipc: rename ipc_obtain_object
7f032d6ef615 ipc: remove use of seq_printf return value
52644c9ab3fa ipc,sem: use current->state helpers
2e094abfd1f2 ipc/sem.c: change memory barrier in sem_lock() to smp_rmb()
e8577d1f0329 ipc/sem.c: fully initialize sem_array before making it visible
> 1 arch/arm/kernel/head-nommu.S:167: Warning: Use of r13 as a source register is deprecated when r15 is the destination register.
Fixed by backporting:
970d96f9a81b ("ARM: 8383/1: nommu: avoid deprecated source register on mov")
Arnd
On Fri, Feb 16, 2018 at 7:21 PM, kbuild test robot
<fengguang.wu(a)intel.com> wrote:
> Hi Andrey,
>
> It's probably a bug fix that unveils the link errors.
>
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-3.16.y
> head: 0b9f4cdd4d75131d8886b919bbf6e0c98906d36e
> commit: 3cb0dc19883f0c69225311d4f76aa8128d3681a4 [2872/3488] module: fix types of device tables aliases
> config: x86_64-allmodconfig (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
> reproduce:
> git checkout 3cb0dc19883f0c69225311d4f76aa8128d3681a4
> # save the attached .config to linux build tree
> make ARCH=x86_64
>
> All errors (new ones prefixed by >>):
>
> arch/x86/kernel/head64.o: In function `_GLOBAL__sub_D_00100_1_early_pmd_flags':
>>> head64.c:(.text.exit+0x5): undefined reference to `__gcov_exit'
> arch/x86/kernel/head.o: In function `_GLOBAL__sub_D_00100_1_reserve_ebda_region':
> head.c:(.text.exit+0x5): undefined reference to `__gcov_exit'
> init/built-in.o: In function `_GLOBAL__sub_D_00100_1___ksymtab_system_state':
> main.c:(.text.exit+0x5): undefined reference to `__gcov_exit'
> init/built-in.o: In function `_GLOBAL__sub_D_00100_1_root_mountflags':
> do_mounts.c:(.text.exit+0x10): undefined reference to `__gcov_exit'
> init/built-in.o: In function `_GLOBAL__sub_D_00100_1_initrd_load':
> do_mounts_initrd.c:(.text.exit+0x1b): undefined reference to `__gcov_exit'
> init/built-in.o:initramfs.c:(.text.exit+0x26): more undefined references to `__gcov_exit' follow
I think this is a result of using a too new compiler with the old 3.16
kernel. In order
to build with gcc-7.3, you need to backport
05384213436a ("gcov: support GCC 7.1")
It's already part of stable-3.18 and later, but not 3.2 and 3.16.
Arnd
In September last year, Ben Hutchings submitted commit [9547837bdccb]
for 3.16.48-rc1 and I informed him that it would be useless without
[3f3752705dbd] (and that maybe [c3883fe06488] would be useful as well).
Ben dropped the patch but suggested I email this list with the
information of the other two patches but I never quite got around to it.
Now I see Sasha Levin is submitting [3f3752705dbd] and [c3883fe06488]
for 4.9, 4.4 and 3.18 it would now make sense to include [9547837bdccb].
This patch fixes a minor problem where a certain USB adapter for Sega
Genesis controllers appears as one input device when it has two ports
for two controllers. I imagine some users of emulator distributions
might use stable kernels and might benefit from this fix.
I'm actually not entirely sure that patch is something suitable for
stable but since it was already submitted once then I don't think it
hurts to bring it up again (despite it breaking stable-kernel-rules as
far as I understand it).
Commits mentioned:
[9547837bdccb]: HID: usbhid: add quirk for innomedia INNEX GENESIS/ATARI adapter
[3f3752705dbd]: HID: reject input outside logical range only if null state is set
[c3883fe06488]: HID: clamp input to logical range if no null state
If the patch [9547837bdccb] is not relevant then feel free to ignore
this email.
Thanks,
--
Tomasz Kramkowski
This is backport of the upstream patch
41c73a49df31151f4ff868f28fe4f129f113fa2c. It should be backported to
stable kernels because this performance problem was seen on the android
4.4 kernel.
commit 41c73a49df31151f4ff868f28fe4f129f113fa2c
Author: Mikulas Patocka <mpatocka(a)redhat.com>
Date: Wed Nov 23 17:04:00 2016 -0500
dm bufio: drop the lock when doing GFP_NOIO allocation
If the first allocation attempt using GFP_NOWAIT fails, drop the lock
and retry using GFP_NOIO allocation (lock is dropped because the
allocation can take some time).
Note that we won't do GFP_NOIO allocation when we loop for the second
time, because the lock shouldn't be dropped between __wait_for_free_buffer
and __get_unclaimed_buffer.
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 9ef88f0e2382..1c2e1dd7ca16 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -822,6 +822,7 @@ enum new_flag {
static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client *c, enum new_flag nf)
{
struct dm_buffer *b;
+ bool tried_noio_alloc = false;
/*
* dm-bufio is resistant to allocation failures (it just keeps
@@ -846,6 +847,15 @@ static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client
if (nf == NF_PREFETCH)
return NULL;
+ if (dm_bufio_cache_size_latch != 1 && !tried_noio_alloc) {
+ dm_bufio_unlock(c);
+ b = alloc_buffer(c, GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
+ dm_bufio_lock(c);
+ if (b)
+ return b;
+ tried_noio_alloc = true;
+ }
+
if (!list_empty(&c->reserved_buffers)) {
b = list_entry(c->reserved_buffers.next,
struct dm_buffer, lru_list);
This is backport of the upstream patch
9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc. It should be backported to
stable kernels because this performance problem was seen on the android
4.4 kernel.
commit 9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc
Author: Douglas Anderson <dianders(a)chromium.org>
Date: Thu Nov 17 11:24:20 2016 -0800
dm bufio: avoid sleeping while holding the dm_bufio lock
We've seen in-field reports showing _lots_ (18 in one case, 41 in
another) of tasks all sitting there blocked on:
mutex_lock+0x4c/0x68
dm_bufio_shrink_count+0x38/0x78
shrink_slab.part.54.constprop.65+0x100/0x464
shrink_zone+0xa8/0x198
In the two cases analyzed, we see one task that looks like this:
Workqueue: kverityd verity_prefetch_io
__switch_to+0x9c/0xa8
__schedule+0x440/0x6d8
schedule+0x94/0xb4
schedule_timeout+0x204/0x27c
schedule_timeout_uninterruptible+0x44/0x50
wait_iff_congested+0x9c/0x1f0
shrink_inactive_list+0x3a0/0x4cc
shrink_lruvec+0x418/0x5cc
shrink_zone+0x88/0x198
try_to_free_pages+0x51c/0x588
__alloc_pages_nodemask+0x648/0xa88
__get_free_pages+0x34/0x7c
alloc_buffer+0xa4/0x144
__bufio_new+0x84/0x278
dm_bufio_prefetch+0x9c/0x154
verity_prefetch_io+0xe8/0x10c
process_one_work+0x240/0x424
worker_thread+0x2fc/0x424
kthread+0x10c/0x114
...and that looks to be the one holding the mutex.
The problem has been reproduced on fairly easily:
0. Be running Chrome OS w/ verity enabled on the root filesystem
1. Pick test patch: http://crosreview.com/412360
2. Install launchBalloons.sh and balloon.arm from
http://crbug.com/468342
...that's just a memory stress test app.
3. On a 4GB rk3399 machine, run
nice ./launchBalloons.sh 4 900 100000
...that tries to eat 4 * 900 MB of memory and keep accessing.
4. Login to the Chrome web browser and restore many tabs
With that, I've seen printouts like:
DOUG: long bufio 90758 ms
...and stack trace always show's we're in dm_bufio_prefetch().
The problem is that we try to allocate memory with GFP_NOIO while
we're holding the dm_bufio lock. Instead we should be using
GFP_NOWAIT. Using GFP_NOIO can cause us to sleep while holding the
lock and that causes the above problems.
The current behavior explained by David Rientjes:
It will still try reclaim initially because __GFP_WAIT (or
__GFP_KSWAPD_RECLAIM) is set by GFP_NOIO. This is the cause of
contention on dm_bufio_lock() that the thread holds. You want to
pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a
mutex that can be contended by a concurrent slab shrinker (if
count_objects didn't use a trylock, this pattern would trivially
deadlock).
This change significantly increases responsiveness of the system while
in this state. It makes a real difference because it unblocks kswapd.
In the bug report analyzed, kswapd was hung:
kswapd0 D ffffffc000204fd8 0 72 2 0x00000000
Call trace:
[<ffffffc000204fd8>] __switch_to+0x9c/0xa8
[<ffffffc00090b794>] __schedule+0x440/0x6d8
[<ffffffc00090bac0>] schedule+0x94/0xb4
[<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44
[<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac
[<ffffffc00090d9d8>] mutex_lock+0x4c/0x68
[<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78
[<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464
[<ffffffc00030dbd8>] shrink_zone+0xa8/0x198
[<ffffffc00030e578>] balance_pgdat+0x328/0x508
[<ffffffc00030eb7c>] kswapd+0x424/0x51c
[<ffffffc00023f06c>] kthread+0x10c/0x114
[<ffffffc000203dd0>] ret_from_fork+0x10/0x40
By unblocking kswapd memory pressure should be reduced.
Suggested-by: David Rientjes <rientjes(a)google.com>
Reviewed-by: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Douglas Anderson <dianders(a)chromium.org>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 125aedc3875f..b5d3428d109e 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -827,7 +827,8 @@ static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client
* dm-bufio is resistant to allocation failures (it just keeps
* one buffer reserved in cases all the allocations fail).
* So set flags to not try too hard:
- * GFP_NOIO: don't recurse into the I/O layer
+ * GFP_NOWAIT: don't wait; if we need to sleep we'll release our
+ * mutex and wait ourselves.
* __GFP_NORETRY: don't retry and rather return failure
* __GFP_NOMEMALLOC: don't use emergency reserves
* __GFP_NOWARN: don't print a warning in case of failure
@@ -837,7 +838,7 @@ static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client
*/
while (1) {
if (dm_bufio_cache_size_latch != 1) {
- b = alloc_buffer(c, GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
+ b = alloc_buffer(c, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
if (b)
return b;
}
On a powerpc 8xx, 'btc' fails as follows:
Entering kdb (current=0x(ptrval), pid 282) due to Keyboard Entry
kdb> btc
btc: cpu status: Currently on cpu 0
Available cpus: 0
kdb_getarea: Bad address 0x0
when booting the kernel with 'debug_boot_weak_hash', it fails as well
Entering kdb (current=0xba99ad80, pid 284) due to Keyboard Entry
kdb> btc
btc: cpu status: Currently on cpu 0
Available cpus: 0
kdb_getarea: Bad address 0xba99ad80
On other platforms, Oopses have been observed too, see
https://github.com/linuxppc/linux/issues/139
This is due to btc calling 'btt' with %p pointer as an argument.
This patch replaces %p by %px to get the real pointer value as
expected by 'btt'
Signed-off-by: Christophe Leroy <christophe.leroy(a)c-s.fr>
Cc: <stable(a)vger.kernel.org> # 4.15+
---
kernel/debug/kdb/kdb_bt.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/debug/kdb/kdb_bt.c b/kernel/debug/kdb/kdb_bt.c
index 6ad4a9fcbd6f..7921ae4fca8d 100644
--- a/kernel/debug/kdb/kdb_bt.c
+++ b/kernel/debug/kdb/kdb_bt.c
@@ -179,14 +179,14 @@ kdb_bt(int argc, const char **argv)
kdb_printf("no process for cpu %ld\n", cpu);
return 0;
}
- sprintf(buf, "btt 0x%p\n", KDB_TSK(cpu));
+ sprintf(buf, "btt 0x%px\n", KDB_TSK(cpu));
kdb_parse(buf);
return 0;
}
kdb_printf("btc: cpu status: ");
kdb_parse("cpu\n");
for_each_online_cpu(cpu) {
- sprintf(buf, "btt 0x%p\n", KDB_TSK(cpu));
+ sprintf(buf, "btt 0x%px\n", KDB_TSK(cpu));
kdb_parse(buf);
touch_nmi_watchdog();
}
--
2.13.3
From: Changwei Ge <ge.changwei(a)h3c.com>
Somehow, file system metadata was corrupted, which causes
ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().
According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry. But there
is obviouse misuse of brelse around related code.
After failure of ocfs2_check_dir_entry(), currunt code just moves to next
position and uses the problematic buffer head again and again during
which the problematic buffer head is released for multiple times. I
suppose, this a serious issue which is long-lived in ocfs2. This may
cause other file systems which is also used in a the same host insane.
So we should also consider about bakcporting this patch into
linux -stable.
Suggested-by: Changkuo Shi <shi.changkuo(a)h3c.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Changwei Ge <ge.changwei(a)h3c.com>
---
fs/ocfs2/dir.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ocfs2/dir.c b/fs/ocfs2/dir.c
index b048d4f..c121abb 100644
--- a/fs/ocfs2/dir.c
+++ b/fs/ocfs2/dir.c
@@ -1897,8 +1897,7 @@ static int ocfs2_dir_foreach_blk_el(struct inode *inode,
/* On error, skip the f_pos to the
next block. */
ctx->pos = (ctx->pos | (sb->s_blocksize - 1)) + 1;
- brelse(bh);
- continue;
+ break;
}
if (le64_to_cpu(de->inode)) {
unsigned char d_type = DT_UNKNOWN;
--
2.7.4