Hagen reported broken strings in python3 tracepoint scripts:
make PYTHON=python3
./perf record -e sched:sched_switch -a -- sleep 5
./perf script --gen-script py
./perf script -s ./perf-script.py
[..]
sched__sched_switch 7 563231.759525792 0 swapper \
prev_comm=bytearray(b'swapper/7\x00\x00\x00\x00\x00\x00\x00'), \
prev_pid=0, prev_prio=120, prev_state=, next_comm=bytearray(b'mutex-thread-co\x00'),
The problem is in is_printable_array function that does not take
zero byte into account and claim such string as not printable,
so the code will create byte array instead of string.
Cc: stable(a)vger.kernel.org
Fixes: 249de6e07458 ("perf script python: Fix string vs byte array resolving")
Tested-by: Hagen Paul Pfeifer <hagen(a)jauu.net>
Signed-off-by: Jiri Olsa <jolsa(a)kernel.org>
---
tools/perf/util/print_binary.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/print_binary.c b/tools/perf/util/print_binary.c
index 599a1543871d..13fdc51c61d9 100644
--- a/tools/perf/util/print_binary.c
+++ b/tools/perf/util/print_binary.c
@@ -50,7 +50,7 @@ int is_printable_array(char *p, unsigned int len)
len--;
- for (i = 0; i < len; i++) {
+ for (i = 0; i < len && p[i]; i++) {
if (!isprint(p[i]) && !isspace(p[i]))
return 0;
}
--
2.26.2
The epoll_wait() syscall has a special version for OABI compat
mode to convert the arguments to the EABI structure layout
of the kernel. However, the later epoll_pwait() syscall was
added in arch/arm in linux-2.6.32 without this conversion.
Use the same kind of handler for both.
Fixes: 369842658a36 ("ARM: 5677/1: ARM support for TIF_RESTORE_SIGMASK/pselect6/ppoll/epoll_pwait")
Cc: stable(a)vger.kernel.org
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
arch/arm/kernel/sys_oabi-compat.c | 37 ++++++++++++++++++++++++++++---
arch/arm/tools/syscall.tbl | 2 +-
2 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c
index 0203e545bbc8..a2b1ae01e5bf 100644
--- a/arch/arm/kernel/sys_oabi-compat.c
+++ b/arch/arm/kernel/sys_oabi-compat.c
@@ -264,9 +264,8 @@ asmlinkage long sys_oabi_epoll_ctl(int epfd, int op, int fd,
return do_epoll_ctl(epfd, op, fd, &kernel, false);
}
-asmlinkage long sys_oabi_epoll_wait(int epfd,
- struct oabi_epoll_event __user *events,
- int maxevents, int timeout)
+static long do_oabi_epoll_wait(int epfd, struct oabi_epoll_event __user *events,
+ int maxevents, int timeout)
{
struct epoll_event *kbuf;
struct oabi_epoll_event e;
@@ -299,6 +298,38 @@ asmlinkage long sys_oabi_epoll_wait(int epfd,
return err ? -EFAULT : ret;
}
+SYSCALL_DEFINE4(oabi_epoll_wait, int, epfd,
+ struct oabi_epoll_event __user *, events,
+ int, maxevents, int, timeout)
+{
+ return do_oabi_epoll_wait(epfd, events, maxevents, timeout);
+}
+
+/*
+ * Implement the event wait interface for the eventpoll file. It is the kernel
+ * part of the user space epoll_pwait(2).
+ */
+SYSCALL_DEFINE6(oabi_epoll_pwait, int, epfd,
+ struct oabi_epoll_event __user *, events, int, maxevents,
+ int, timeout, const sigset_t __user *, sigmask,
+ size_t, sigsetsize)
+{
+ int error;
+
+ /*
+ * If the caller wants a certain signal mask to be set during the wait,
+ * we apply it here.
+ */
+ error = set_user_sigmask(sigmask, sigsetsize);
+ if (error)
+ return error;
+
+ error = do_oabi_epoll_wait(epfd, events, maxevents, timeout);
+ restore_saved_sigmask_unless(error == -EINTR);
+
+ return error;
+}
+
struct oabi_sembuf {
unsigned short sem_num;
short sem_op;
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 171077cbf419..39a24bee7df8 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -360,7 +360,7 @@
343 common vmsplice sys_vmsplice
344 common move_pages sys_move_pages
345 common getcpu sys_getcpu
-346 common epoll_pwait sys_epoll_pwait
+346 common epoll_pwait sys_epoll_pwait sys_oabi_epoll_pwait
347 common kexec_load sys_kexec_load
348 common utimensat sys_utimensat_time32
349 common signalfd sys_signalfd
--
2.27.0
Hello,
My Name is Mrs.Maria Abdullah from Wales United Kingdom,
I would like to work with you to carry out a promise i made to
God by willing all that i have to the charity because i am very
sick now (Covid-19 Virus Infection)
Get back to me for more details via my private email.
Email: mariadullah101(a)allhelas.com
Thanks and God bless you.
Maria
This is a note to let you know that I've just added the patch titled
serial: pl011: Fix lockdep splat when handling magic-sysrq interrupt
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 534cf755d9df99e214ddbe26b91cd4d81d2603e2 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Wed, 30 Sep 2020 13:04:32 +0100
Subject: serial: pl011: Fix lockdep splat when handling magic-sysrq interrupt
Issuing a magic-sysrq via the PL011 causes the following lockdep splat,
which is easily reproducible under QEMU:
| sysrq: Changing Loglevel
| sysrq: Loglevel set to 9
|
| ======================================================
| WARNING: possible circular locking dependency detected
| 5.9.0-rc7 #1 Not tainted
| ------------------------------------------------------
| systemd-journal/138 is trying to acquire lock:
| ffffab133ad950c0 (console_owner){-.-.}-{0:0}, at: console_lock_spinning_enable+0x34/0x70
|
| but task is already holding lock:
| ffff0001fd47b098 (&port_lock_key){-.-.}-{2:2}, at: pl011_int+0x40/0x488
|
| which lock already depends on the new lock.
[...]
| Possible unsafe locking scenario:
|
| CPU0 CPU1
| ---- ----
| lock(&port_lock_key);
| lock(console_owner);
| lock(&port_lock_key);
| lock(console_owner);
|
| *** DEADLOCK ***
The issue being that CPU0 takes 'port_lock' on the irq path in pl011_int()
before taking 'console_owner' on the printk() path, whereas CPU1 takes
the two locks in the opposite order on the printk() path due to setting
the "console_owner" prior to calling into into the actual console driver.
Fix this in the same way as the msm-serial driver by dropping 'port_lock'
before handling the sysrq.
Cc: <stable(a)vger.kernel.org> # 4.19+
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Link: https://lore.kernel.org/r/20200811101313.GA6970@willie-the-truck
Signed-off-by: Peter Zijlstra <peterz(a)infradead.org>
Tested-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
Link: https://lore.kernel.org/r/20200930120432.16551-1-will@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/amba-pl011.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 67498594d7d7..87dc3fc15694 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -308,8 +308,9 @@ static void pl011_write(unsigned int val, const struct uart_amba_port *uap,
*/
static int pl011_fifo_to_tty(struct uart_amba_port *uap)
{
- u16 status;
unsigned int ch, flag, fifotaken;
+ int sysrq;
+ u16 status;
for (fifotaken = 0; fifotaken != 256; fifotaken++) {
status = pl011_read(uap, REG_FR);
@@ -344,10 +345,12 @@ static int pl011_fifo_to_tty(struct uart_amba_port *uap)
flag = TTY_FRAME;
}
- if (uart_handle_sysrq_char(&uap->port, ch & 255))
- continue;
+ spin_unlock(&uap->port.lock);
+ sysrq = uart_handle_sysrq_char(&uap->port, ch & 255);
+ spin_lock(&uap->port.lock);
- uart_insert_char(&uap->port, ch, UART011_DR_OE, ch, flag);
+ if (!sysrq)
+ uart_insert_char(&uap->port, ch, UART011_DR_OE, ch, flag);
}
return fifotaken;
--
2.28.0
This is a note to let you know that I've just added the patch titled
tty: serial: fsl_lpuart: fix lpuart32_poll_get_char
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 29788ab1d2bf26c130de8f44f9553ee78a27e8d5 Mon Sep 17 00:00:00 2001
From: Peng Fan <peng.fan(a)nxp.com>
Date: Tue, 29 Sep 2020 17:55:09 +0800
Subject: tty: serial: fsl_lpuart: fix lpuart32_poll_get_char
The watermark is set to 1, so we need to input two chars to trigger RDRF
using the original logic. With the new logic, we could always get the
char when there is data in FIFO.
Suggested-by: Fugang Duan <fugang.duan(a)nxp.com>
Signed-off-by: Peng Fan <peng.fan(a)nxp.com>
Link: https://lore.kernel.org/r/20200929095509.21680-1-peng.fan@nxp.com
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/fsl_lpuart.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
index 645bbb24b433..590c901a1fc5 100644
--- a/drivers/tty/serial/fsl_lpuart.c
+++ b/drivers/tty/serial/fsl_lpuart.c
@@ -680,7 +680,7 @@ static void lpuart32_poll_put_char(struct uart_port *port, unsigned char c)
static int lpuart32_poll_get_char(struct uart_port *port)
{
- if (!(lpuart32_read(port, UARTSTAT) & UARTSTAT_RDRF))
+ if (!(lpuart32_read(port, UARTWATER) >> UARTWATER_RXCNT_OFF))
return NO_POLL_CHAR;
return lpuart32_read(port, UARTDATA);
--
2.28.0
This is a note to let you know that I've just added the patch titled
tty: serial: lpuart: fix lpuart32_write usage
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 9ea40db477c024dcb87321b7f32bd6ea12130ac2 Mon Sep 17 00:00:00 2001
From: Peng Fan <peng.fan(a)nxp.com>
Date: Tue, 29 Sep 2020 17:19:20 +0800
Subject: tty: serial: lpuart: fix lpuart32_write usage
The 2nd and 3rd parameter were wrongly used, and cause kernel abort when
doing kgdb debug.
Fixes: 1da17d7cf8e2 ("tty: serial: fsl_lpuart: Use appropriate lpuart32_* I/O funcs")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Peng Fan <peng.fan(a)nxp.com>
Link: https://lore.kernel.org/r/20200929091920.22612-1-peng.fan@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/fsl_lpuart.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
index 5a5a22d77841..645bbb24b433 100644
--- a/drivers/tty/serial/fsl_lpuart.c
+++ b/drivers/tty/serial/fsl_lpuart.c
@@ -649,26 +649,24 @@ static int lpuart32_poll_init(struct uart_port *port)
spin_lock_irqsave(&sport->port.lock, flags);
/* Disable Rx & Tx */
- lpuart32_write(&sport->port, UARTCTRL, 0);
+ lpuart32_write(&sport->port, 0, UARTCTRL);
temp = lpuart32_read(&sport->port, UARTFIFO);
/* Enable Rx and Tx FIFO */
- lpuart32_write(&sport->port, UARTFIFO,
- temp | UARTFIFO_RXFE | UARTFIFO_TXFE);
+ lpuart32_write(&sport->port, temp | UARTFIFO_RXFE | UARTFIFO_TXFE, UARTFIFO);
/* flush Tx and Rx FIFO */
- lpuart32_write(&sport->port, UARTFIFO,
- UARTFIFO_TXFLUSH | UARTFIFO_RXFLUSH);
+ lpuart32_write(&sport->port, UARTFIFO_TXFLUSH | UARTFIFO_RXFLUSH, UARTFIFO);
/* explicitly clear RDRF */
if (lpuart32_read(&sport->port, UARTSTAT) & UARTSTAT_RDRF) {
lpuart32_read(&sport->port, UARTDATA);
- lpuart32_write(&sport->port, UARTFIFO, UARTFIFO_RXUF);
+ lpuart32_write(&sport->port, UARTFIFO_RXUF, UARTFIFO);
}
/* Enable Rx and Tx */
- lpuart32_write(&sport->port, UARTCTRL, UARTCTRL_RE | UARTCTRL_TE);
+ lpuart32_write(&sport->port, UARTCTRL_RE | UARTCTRL_TE, UARTCTRL);
spin_unlock_irqrestore(&sport->port.lock, flags);
return 0;
@@ -677,7 +675,7 @@ static int lpuart32_poll_init(struct uart_port *port)
static void lpuart32_poll_put_char(struct uart_port *port, unsigned char c)
{
lpuart32_wait_bit_set(port, UARTSTAT, UARTSTAT_TDRE);
- lpuart32_write(port, UARTDATA, c);
+ lpuart32_write(port, c, UARTDATA);
}
static int lpuart32_poll_get_char(struct uart_port *port)
--
2.28.0
This is a note to let you know that I've just added the patch titled
serial: qcom_geni_serial: To correct QUP Version detection logic
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From c9ca43d42ed8d5fd635d327a664ed1d8579eb2af Mon Sep 17 00:00:00 2001
From: Paras Sharma <parashar(a)codeaurora.org>
Date: Wed, 30 Sep 2020 11:35:26 +0530
Subject: serial: qcom_geni_serial: To correct QUP Version detection logic
For QUP IP versions 2.5 and above the oversampling rate is
halved from 32 to 16.
Commit ce734600545f ("tty: serial: qcom_geni_serial: Update
the oversampling rate") is pushed to handle this scenario.
But the existing logic is failing to classify QUP Version 3.0
into the correct group ( 2.5 and above).
As result Serial Engine clocks are not configured properly for
baud rate and garbage data is sampled to FIFOs from the line.
So, fix the logic to detect QUP with versions 2.5 and above.
Fixes: ce734600545f ("tty: serial: qcom_geni_serial: Update the oversampling rate")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Paras Sharma <parashar(a)codeaurora.org>
Reviewed-by: Akash Asthana <akashast(a)codeaurora.org>
Link: https://lore.kernel.org/r/1601445926-23673-1-git-send-email-parashar@codeau…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/qcom_geni_serial.c | 2 +-
include/linux/qcom-geni-se.h | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 8da2eb2d1090..291649f02821 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1000,7 +1000,7 @@ static void qcom_geni_serial_set_termios(struct uart_port *uport,
sampling_rate = UART_OVERSAMPLING;
/* Sampling rate is halved for IP versions >= 2.5 */
ver = geni_se_get_qup_hw_version(&port->se);
- if (GENI_SE_VERSION_MAJOR(ver) >= 2 && GENI_SE_VERSION_MINOR(ver) >= 5)
+ if (ver >= QUP_SE_VERSION_2_5)
sampling_rate /= 2;
clk_rate = get_clk_div_rate(baud, sampling_rate, &clk_div);
diff --git a/include/linux/qcom-geni-se.h b/include/linux/qcom-geni-se.h
index 8f385fbe5a0e..1c31f26ccc7a 100644
--- a/include/linux/qcom-geni-se.h
+++ b/include/linux/qcom-geni-se.h
@@ -248,6 +248,9 @@ struct geni_se {
#define GENI_SE_VERSION_MINOR(ver) ((ver & HW_VER_MINOR_MASK) >> HW_VER_MINOR_SHFT)
#define GENI_SE_VERSION_STEP(ver) (ver & HW_VER_STEP_MASK)
+/* QUP SE VERSION value for major number 2 and minor number 5 */
+#define QUP_SE_VERSION_2_5 0x20050000
+
/*
* Define bandwidth thresholds that cause the underlying Core 2X interconnect
* clock to run at the named frequency. These baseline values are recommended
--
2.28.0
commit a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab
objects") adds the checks for Slab pages, but the pages don't have
page_count are still missing from the check.
Network layer's sendpage method is not designed to send page_count 0
pages neither, therefore both PageSlab() and page_count() should be
both checked for the sending page. This is exactly what sendpage_ok()
does.
This patch uses sendpage_ok() in do_tcp_sendpages() to detect misused
.sendpage, to make the code more robust.
Fixes: a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab objects")
Suggested-by: Eric Dumazet <eric.dumazet(a)gmail.com>
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Vasily Averin <vvs(a)virtuozzo.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: stable(a)vger.kernel.org
---
net/ipv4/tcp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 31f3b858db81..2135ee7c806d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -970,7 +970,8 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
if (IS_ENABLED(CONFIG_DEBUG_VM) &&
- WARN_ONCE(PageSlab(page), "page must not be a Slab one"))
+ WARN_ONCE(!sendpage_ok(page),
+ "page must not be a Slab one and have page_count > 0"))
return -EINVAL;
/* Wait for a connection to finish. One exception is TCP Fast Open
--
2.26.2
Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to
send slab pages. But for pages allocated by __get_free_pages() without
__GFP_COMP, which also have refcount as 0, they are still sent by
kernel_sendpage() to remote end, this is problematic.
The new introduced helper sendpage_ok() checks both PageSlab tag and
page_count counter, and returns true if the checking page is OK to be
sent by kernel_sendpage().
This patch fixes the page checking issue of nvme_tcp_try_send_data()
with sendpage_ok(). If sendpage_ok() returns true, send this page by
kernel_sendpage(), otherwise use sock_no_sendpage to handle this page.
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Jan Kara <jack(a)suse.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy(a)solarflare.com>
Cc: Philipp Reisner <philipp.reisner(a)linbit.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Vlastimil Babka <vbabka(a)suse.com>
Cc: stable(a)vger.kernel.org
---
drivers/nvme/host/tcp.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 8f4f29f18b8c..d6a3e1487354 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -913,12 +913,11 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
else
flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
- /* can't zcopy slab pages */
- if (unlikely(PageSlab(page))) {
- ret = sock_no_sendpage(queue->sock, page, offset, len,
+ if (sendpage_ok(page)) {
+ ret = kernel_sendpage(queue->sock, page, offset, len,
flags);
} else {
- ret = kernel_sendpage(queue->sock, page, offset, len,
+ ret = sock_no_sendpage(queue->sock, page, offset, len,
flags);
}
if (ret <= 0)
--
2.26.2
The original problem was from nvme-over-tcp code, who mistakenly uses
kernel_sendpage() to send pages allocated by __get_free_pages() without
__GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
tail pages, sending them by kernel_sendpage() may trigger a kernel panic
from a corrupted kernel heap, because these pages are incorrectly freed
in network stack as page_count 0 pages.
This patch introduces a helper sendpage_ok(), it returns true if the
checking page,
- is not slab page: PageSlab(page) is false.
- has page refcount: page_count(page) is not zero
All drivers who want to send page to remote end by kernel_sendpage()
may use this helper to check whether the page is OK. If the helper does
not return true, the driver should try other non sendpage method (e.g.
sock_no_sendpage()) to handle the page.
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Jan Kara <jack(a)suse.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy(a)solarflare.com>
Cc: Philipp Reisner <philipp.reisner(a)linbit.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Vlastimil Babka <vbabka(a)suse.com>
Cc: stable(a)vger.kernel.org
---
include/linux/net.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/linux/net.h b/include/linux/net.h
index d48ff1180879..ae713c851342 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -21,6 +21,7 @@
#include <linux/rcupdate.h>
#include <linux/once.h>
#include <linux/fs.h>
+#include <linux/mm.h>
#include <linux/sockptr.h>
#include <uapi/linux/net.h>
@@ -286,6 +287,21 @@ do { \
#define net_get_random_once_wait(buf, nbytes) \
get_random_once_wait((buf), (nbytes))
+/*
+ * E.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+static inline bool sendpage_ok(struct page *page)
+{
+ return !PageSlab(page) && page_count(page) >= 1;
+}
+
int kernel_sendmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec,
size_t num, size_t len);
int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
--
2.26.2
Although btrfs_calc_avail_data_space() is trying to do an estimation
on how many data chunks it can allocate, the estimation is far from
perfect:
- Metadata over-commit is not considered at all
- Chunk allocation doesn't take RAID5/6 into consideration
This patch will change btrfs_calc_avail_data_space() to use
pre-calculated per-profile available space.
This provides the following benefits:
- Accurate unallocated data space estimation
It's as accurate as chunk allocator, and can handle RAID5/6 and newly
introduced RAID1C3/C4.
For the metadata over-commit part, we don't take that into consideration
yet. As metadata over-commit only happens when we have enough
unallocated space, and under most case we won't use that much metadata
space at all.
And we still have the existing 0-available space check, to prevent us
from reporting too optimistic f_bavail result.
Since we're keeping the old lock-free design, statfs should not experience
any extra delay.
CC: stable(a)vger.kernel.org # 5.4+
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/super.c | 131 +++--------------------------------------------
1 file changed, 7 insertions(+), 124 deletions(-)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 25967ecaaf0a..355e4f6a2fd4 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2016,124 +2016,6 @@ static inline void btrfs_descending_sort_devices(
btrfs_cmp_device_free_bytes, NULL);
}
-/*
- * The helper to calc the free space on the devices that can be used to store
- * file data.
- */
-static inline int btrfs_calc_avail_data_space(struct btrfs_fs_info *fs_info,
- u64 *free_bytes)
-{
- struct btrfs_device_info *devices_info;
- struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
- struct btrfs_device *device;
- u64 type;
- u64 avail_space;
- u64 min_stripe_size;
- int num_stripes = 1;
- int i = 0, nr_devices;
- const struct btrfs_raid_attr *rattr;
-
- /*
- * We aren't under the device list lock, so this is racy-ish, but good
- * enough for our purposes.
- */
- nr_devices = fs_info->fs_devices->open_devices;
- if (!nr_devices) {
- smp_mb();
- nr_devices = fs_info->fs_devices->open_devices;
- ASSERT(nr_devices);
- if (!nr_devices) {
- *free_bytes = 0;
- return 0;
- }
- }
-
- devices_info = kmalloc_array(nr_devices, sizeof(*devices_info),
- GFP_KERNEL);
- if (!devices_info)
- return -ENOMEM;
-
- /* calc min stripe number for data space allocation */
- type = btrfs_data_alloc_profile(fs_info);
- rattr = &btrfs_raid_array[btrfs_bg_flags_to_raid_index(type)];
-
- if (type & BTRFS_BLOCK_GROUP_RAID0)
- num_stripes = nr_devices;
- else if (type & BTRFS_BLOCK_GROUP_RAID1)
- num_stripes = 2;
- else if (type & BTRFS_BLOCK_GROUP_RAID1C3)
- num_stripes = 3;
- else if (type & BTRFS_BLOCK_GROUP_RAID1C4)
- num_stripes = 4;
- else if (type & BTRFS_BLOCK_GROUP_RAID10)
- num_stripes = 4;
-
- /* Adjust for more than 1 stripe per device */
- min_stripe_size = rattr->dev_stripes * BTRFS_STRIPE_LEN;
-
- rcu_read_lock();
- list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
- if (!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
- &device->dev_state) ||
- !device->bdev ||
- test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state))
- continue;
-
- if (i >= nr_devices)
- break;
-
- avail_space = device->total_bytes - device->bytes_used;
-
- /* align with stripe_len */
- avail_space = rounddown(avail_space, BTRFS_STRIPE_LEN);
-
- /*
- * In order to avoid overwriting the superblock on the drive,
- * btrfs starts at an offset of at least 1MB when doing chunk
- * allocation.
- *
- * This ensures we have at least min_stripe_size free space
- * after excluding 1MB.
- */
- if (avail_space <= SZ_1M + min_stripe_size)
- continue;
-
- avail_space -= SZ_1M;
-
- devices_info[i].dev = device;
- devices_info[i].max_avail = avail_space;
-
- i++;
- }
- rcu_read_unlock();
-
- nr_devices = i;
-
- btrfs_descending_sort_devices(devices_info, nr_devices);
-
- i = nr_devices - 1;
- avail_space = 0;
- while (nr_devices >= rattr->devs_min) {
- num_stripes = min(num_stripes, nr_devices);
-
- if (devices_info[i].max_avail >= min_stripe_size) {
- int j;
- u64 alloc_size;
-
- avail_space += devices_info[i].max_avail * num_stripes;
- alloc_size = devices_info[i].max_avail;
- for (j = i + 1 - num_stripes; j <= i; j++)
- devices_info[j].max_avail -= alloc_size;
- }
- i--;
- nr_devices--;
- }
-
- kfree(devices_info);
- *free_bytes = avail_space;
- return 0;
-}
-
/*
* Calculate numbers for 'df', pessimistic in case of mixed raid profiles.
*
@@ -2150,6 +2032,7 @@ static inline int btrfs_calc_avail_data_space(struct btrfs_fs_info *fs_info,
static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
{
struct btrfs_fs_info *fs_info = btrfs_sb(dentry->d_sb);
+ struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
struct btrfs_super_block *disk_super = fs_info->super_copy;
struct btrfs_space_info *found;
u64 total_used = 0;
@@ -2159,7 +2042,7 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
__be32 *fsid = (__be32 *)fs_info->fs_devices->fsid;
unsigned factor = 1;
struct btrfs_block_rsv *block_rsv = &fs_info->global_block_rsv;
- int ret;
+ enum btrfs_raid_types data_type;
u64 thresh = 0;
int mixed = 0;
@@ -2208,11 +2091,11 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_bfree = 0;
spin_unlock(&block_rsv->lock);
- buf->f_bavail = div_u64(total_free_data, factor);
- ret = btrfs_calc_avail_data_space(fs_info, &total_free_data);
- if (ret)
- return ret;
- buf->f_bavail += div_u64(total_free_data, factor);
+ data_type = btrfs_bg_flags_to_raid_index(
+ btrfs_data_alloc_profile(fs_info));
+
+ buf->f_bavail = total_free_data +
+ atomic64_read(&fs_devices->per_profile_avail[data_type]);
buf->f_bavail = buf->f_bavail >> bits;
/*
--
2.28.0
For the following disk layout, can_overcommit() can cause false
confidence in available space:
devid 1 unallocated: 1T
devid 2 unallocated: 10T
metadata type: RAID1
As can_overcommit() simply uses unallocated space with factor to
calculate the allocatable metadata chunk size.
can_overcommit() believes we still have 5.5T for metadata chunks, while
the truth is, we only have 1T available for metadata chunks.
This can lead to ENOSPC at run_delalloc_range() and cause transaction
abort.
Since factor based calculation can't distinguish RAID1/RAID10 and DUP at
all, we need proper chunk-allocator level awareness to do such estimation.
Thankfully, we have per-profile available space already calculated, just
use that facility to avoid such false confidence.
CC: stable(a)vger.kernel.org # 5.4+
Reported-by: Marc Lehmann <schmorp(a)schmorp.de>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
---
fs/btrfs/space-info.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 64b6e1d44f47..4bb4e3c3531f 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -336,25 +336,21 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
struct btrfs_space_info *space_info,
enum btrfs_reserve_flush_enum flush)
{
+ enum btrfs_raid_types index;
u64 profile;
u64 avail;
- int factor;
if (space_info->flags & BTRFS_BLOCK_GROUP_SYSTEM)
profile = btrfs_system_alloc_profile(fs_info);
else
profile = btrfs_metadata_alloc_profile(fs_info);
- avail = atomic64_read(&fs_info->free_chunk_space);
-
/*
- * If we have dup, raid1 or raid10 then only half of the free
- * space is actually usable. For raid56, the space info used
- * doesn't include the parity drive, so we don't have to
- * change the math
+ * Grab avail space from per-profile array which should be as accurate
+ * as chunk allocator.
*/
- factor = btrfs_bg_type_to_factor(profile);
- avail = div_u64(avail, factor);
+ index = btrfs_bg_flags_to_raid_index(profile);
+ avail = atomic64_read(&fs_info->fs_devices->per_profile_avail[index]);
/*
* If we aren't flushing all things, let us overcommit up to
--
2.28.0
Dear stable
Please i was referred to your company for this services and urgently.
Please see attached our company quotation request.
Kindly quote your best prices for this items including shipping cost,
FOB and payment terms.
Expecting your response asap.
Regards,
Shui biun
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
The HDMI vs. not-HDMI check got inverted whem the bogus encoder->type
checks were eliminated. So now we're using 0 as the link rate on DP
and potentially non-zero on HDMI, which is exactly the opposite of
what we want. The original bogus check actually worked more correctly
by accident since if would always evaluate to true. Due to this we
now always use the RBR/HBR1 vswing table and never ever the HBR2+
vswing table. That is probably not a good way to get a high quality
signal at HBR2+ rates. Fix the check so we pick the right table.
Cc: stable(a)vger.kernel.org
Cc: Vandita Kulkarni <vandita.kulkarni(a)intel.com>
Cc: Uma Shankar <uma.shankar(a)intel.com>
Fixes: 94641eb6c696 ("drm/i915/display: Fix the encoder type check")
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_ddi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 4d06178cd76c..cdcb7b1034ae 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2742,7 +2742,7 @@ tgl_dkl_phy_ddi_vswing_sequence(struct intel_encoder *encoder, int link_clock,
u32 n_entries, val, ln, dpcnt_mask, dpcnt_val;
int rate = 0;
- if (type == INTEL_OUTPUT_HDMI) {
+ if (type != INTEL_OUTPUT_HDMI) {
struct intel_dp *intel_dp = enc_to_intel_dp(encoder);
rate = intel_dp->link_rate;
--
2.26.2
Dear stable
Please i was referred to your company for this services and urgently.
Please see attached our company quotation request.
Kindly quote your best prices for this items including shipping cost,
FOB and payment terms.
Expecting your response asap.
Regards,
Shui biun
Hello,
I ran into some packet reordering using a plain vanilla 5.4.49 kernel
and the Amazon AWS ena driver. The external symptom was that every now
and again, one or more larger packets would be delivered to the UDP
socket after some smaller packets, even though the smaller packets were
sent after the larger packets. They were also delivered late to a packet
socket sniffer, which initially made me suspect an RSS bug in the ena
hardware. Eventually, I modified the ena driver to stamp each packet (by
overwriting its ethernet source mac field) with the id of the RSS queue
that delivered it, and with the value of a per-RSS-queue counter that
incremented for each packet received in that queue. That hack showed RSS
functioning properly, and also showed packets received earlier (with a
smaller counter value) being delivered after packets with a larger
counter value. It established that the reordering was in fact happening
inside the kernel network core.
The breakthrough came from realizing that the ena driver defaults its
rx_copybreak to 256, which matched perfectly the boundary between the
delayed large packets and the smaller packets being delivered first.
After changing ena's rx_copybreak to zero, the reordering issue disappeared.
After a lot of hair pulling, I think I figured out what the issue is --
and it's confined to the 5.4 stable tree. Commit 323ebb61e32b4 (present
in 5.4) introduced queueing for GRO_NORMAL packets received via
napi_gro_frags() -> napi_frags_finish(). Commit 6570bc79c0df (NOT
present in 5.4) extended the same GRO_NORMAL queueing to packets
received via napi_gro_receive() -> napi_skb_finish(). Without
6570bc79c0df, packets received via napi_gro_receive() can get delivered
ahead of the earlier, queued up packets received via napi_gro_frags().
And this is precisely what happens in the ena driver with packets
smaller than rx_copybreak vs packets larger than rx_copybreak.
Interestingly, the 5.4 stable tree does contain a backport of the
upstream c80794323e commit, which purports to fix issues introduced by
323ebb61e32b4 and 6570bc79c0df. But 6570bc79c0df itself is missing...
I don't yet have a 5.4-stable patch, since I wanted to first raise the
issue publicly and confirm I'm not missing something obvious. The patch
would probably involve massaging 6570bc79c0df quite a bit, to avoid
colliding with the already-present c80794323e fix.
Thanks,
-Ion
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Subject: mm: replace memmap_context by meminit_context
Patch series "mm: fix memory to node bad links in sysfs", v3.
Sometimes, firmware may expose interleaved memory layout like this:
Early memory node ranges
node 1: [mem 0x0000000000000000-0x000000011fffffff]
node 2: [mem 0x0000000120000000-0x000000014fffffff]
node 1: [mem 0x0000000150000000-0x00000001ffffffff]
node 0: [mem 0x0000000200000000-0x000000048fffffff]
node 2: [mem 0x0000000490000000-0x00000007ffffffff]
In that case, we can see memory blocks assigned to multiple nodes in sysfs:
$ ls -l /sys/devices/system/memory/memory21
total 0
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
-rw-r--r-- 1 root root 65536 Aug 24 05:27 online
-r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device
-r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index
drwxr-xr-x 2 root root 0 Aug 24 05:27 power
-r--r--r-- 1 root root 65536 Aug 24 05:27 removable
-rw-r--r-- 1 root root 65536 Aug 24 05:27 state
lrwxrwxrwx 1 root root 0 Aug 24 05:25 subsystem -> ../../../../bus/memory
-rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent
-r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones
The same applies in the node's directory with a memory21 link in both the
node1 and node2's directory.
This is wrong but doesn't prevent the system to run. However when later,
one of these memory blocks is hot-unplugged and then hot-plugged, the
system is detecting an inconsistency in the sysfs layout and a BUG_ON() is
raised:
------------[ cut here ]------------
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
NIP: c000000000403f34 LR: c000000000403f2c CTR: 0000000000000000
REGS: c0000004876e3660 TRAP: 0700 Not tainted (5.9.0-rc1+)
MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24000448 XER: 20040000
CFAR: c000000000846d20 IRQMASK: 0
GPR00: c000000000403f2c c0000004876e38f0 c0000000012f6f00 ffffffffffffffef
GPR04: 0000000000000227 c0000004805ae680 0000000000000000 00000004886f0000
GPR08: 0000000000000226 0000000000000003 0000000000000002 fffffffffffffffd
GPR12: 0000000088000484 c00000001ec96280 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000004 0000000000000003
GPR20: c00000047814ffe0 c0000007ffff7c08 0000000000000010 c0000000013332c8
GPR24: 0000000000000000 c0000000011f6cc0 0000000000000000 0000000000000000
GPR28: ffffffffffffffef 0000000000000001 0000000150000000 0000000010000000
NIP [c000000000403f34] add_memory_resource+0x244/0x340
LR [c000000000403f2c] add_memory_resource+0x23c/0x340
Call Trace:
[c0000004876e38f0] [c000000000403f2c] add_memory_resource+0x23c/0x340 (unreliable)
[c0000004876e39c0] [c00000000040408c] __add_memory+0x5c/0xf0
[c0000004876e39f0] [c0000000000e2b94] dlpar_add_lmb+0x1b4/0x500
[c0000004876e3ad0] [c0000000000e3888] dlpar_memory+0x1f8/0xb80
[c0000004876e3b60] [c0000000000dc0d0] handle_dlpar_errorlog+0xc0/0x190
[c0000004876e3bd0] [c0000000000dc398] dlpar_store+0x198/0x4a0
[c0000004876e3c90] [c00000000072e630] kobj_attr_store+0x30/0x50
[c0000004876e3cb0] [c00000000051f954] sysfs_kf_write+0x64/0x90
[c0000004876e3cd0] [c00000000051ee40] kernfs_fop_write+0x1b0/0x290
[c0000004876e3d20] [c000000000438dd8] vfs_write+0xe8/0x290
[c0000004876e3d70] [c0000000004391ac] ksys_write+0xdc/0x130
[c0000004876e3dc0] [c000000000034e40] system_call_exception+0x160/0x270
[c0000004876e3e20] [c00000000000d740] system_call_common+0xf0/0x27c
Instruction dump:
48442e35 60000000 0b030000 3cbe0001 7fa3eb78 7bc48402 38a5fffe 7ca5fa14
78a58402 48442db1 60000000 7c7c1b78 <0b030000> 7f23cb78 4bda371d 60000000
---[ end trace 562fd6c109cd0fb2 ]---
This has been seen on PowerPC LPAR.
The root cause of this issue is that when node's memory is registered, the
range used can overlap another node's range, thus the memory block is
registered to multiple nodes in sysfs.
There are 2 issues here:
a. The sysfs memory and node's layouts are broken due to these multiple
links
b. The link errors in link_mem_sections() should not lead to a system
panic.
To address a. register_mem_sect_under_node should not rely on the system
state to detect whether the link operation is triggered by a hot plug
operation or not. This is addressed by the patches 1 and 2 of this series.
The patch 3 is addressing the point b.
This patch (of 2):
The memmap_context enum is used to detect whether a memory operation is due
to a hot-add operation or happening at boot time.
Make it general to the hotplug operation and rename it as meminit_context.
There is no functional change introduced by this patch
Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com
Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J . Wysocki" <rafael(a)kernel.org>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/ia64/mm/init.c | 6 +++---
include/linux/mm.h | 2 +-
include/linux/mmzone.h | 11 ++++++++---
mm/memory_hotplug.c | 2 +-
mm/page_alloc.c | 10 +++++-----
5 files changed, 18 insertions(+), 13 deletions(-)
--- a/arch/ia64/mm/init.c~mm-replace-memmap_context-by-meminit_context
+++ a/arch/ia64/mm/init.c
@@ -538,7 +538,7 @@ virtual_memmap_init(u64 start, u64 end,
if (map_start < map_end)
memmap_init_zone((unsigned long)(map_end - map_start),
args->nid, args->zone, page_to_pfn(map_start),
- MEMMAP_EARLY, NULL);
+ MEMINIT_EARLY, NULL);
return 0;
}
@@ -547,8 +547,8 @@ memmap_init (unsigned long size, int nid
unsigned long start_pfn)
{
if (!vmem_map) {
- memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY,
- NULL);
+ memmap_init_zone(size, nid, zone, start_pfn,
+ MEMINIT_EARLY, NULL);
} else {
struct page *start;
struct memmap_init_callback_data args;
--- a/include/linux/mm.h~mm-replace-memmap_context-by-meminit_context
+++ a/include/linux/mm.h
@@ -2416,7 +2416,7 @@ extern int __meminit __early_pfn_to_nid(
extern void set_dma_reserve(unsigned long new_dma_reserve);
extern void memmap_init_zone(unsigned long, int, unsigned long, unsigned long,
- enum memmap_context, struct vmem_altmap *);
+ enum meminit_context, struct vmem_altmap *);
extern void setup_per_zone_wmarks(void);
extern int __meminit init_per_zone_wmark_min(void);
extern void mem_init(void);
--- a/include/linux/mmzone.h~mm-replace-memmap_context-by-meminit_context
+++ a/include/linux/mmzone.h
@@ -824,10 +824,15 @@ bool zone_watermark_ok(struct zone *z, u
unsigned int alloc_flags);
bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
unsigned long mark, int highest_zoneidx);
-enum memmap_context {
- MEMMAP_EARLY,
- MEMMAP_HOTPLUG,
+/*
+ * Memory initialization context, use to differentiate memory added by
+ * the platform statically or via memory hotplug interface.
+ */
+enum meminit_context {
+ MEMINIT_EARLY,
+ MEMINIT_HOTPLUG,
};
+
extern void init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
unsigned long size);
--- a/mm/memory_hotplug.c~mm-replace-memmap_context-by-meminit_context
+++ a/mm/memory_hotplug.c
@@ -729,7 +729,7 @@ void __ref move_pfn_range_to_zone(struct
* are reserved so nobody should be touching them so we should be safe
*/
memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn,
- MEMMAP_HOTPLUG, altmap);
+ MEMINIT_HOTPLUG, altmap);
set_zone_contiguous(zone);
}
--- a/mm/page_alloc.c~mm-replace-memmap_context-by-meminit_context
+++ a/mm/page_alloc.c
@@ -5975,7 +5975,7 @@ overlap_memmap_init(unsigned long zone,
* done. Non-atomic initialization, single-pass.
*/
void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
- unsigned long start_pfn, enum memmap_context context,
+ unsigned long start_pfn, enum meminit_context context,
struct vmem_altmap *altmap)
{
unsigned long pfn, end_pfn = start_pfn + size;
@@ -6007,7 +6007,7 @@ void __meminit memmap_init_zone(unsigned
* There can be holes in boot-time mem_map[]s handed to this
* function. They do not exist on hotplugged memory.
*/
- if (context == MEMMAP_EARLY) {
+ if (context == MEMINIT_EARLY) {
if (overlap_memmap_init(zone, &pfn))
continue;
if (defer_init(nid, pfn, end_pfn))
@@ -6016,7 +6016,7 @@ void __meminit memmap_init_zone(unsigned
page = pfn_to_page(pfn);
__init_single_page(page, pfn, zone, nid);
- if (context == MEMMAP_HOTPLUG)
+ if (context == MEMINIT_HOTPLUG)
__SetPageReserved(page);
/*
@@ -6099,7 +6099,7 @@ void __ref memmap_init_zone_device(struc
* check here not to call set_pageblock_migratetype() against
* pfn out of zone.
*
- * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
+ * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
* because this is done early in section_activate()
*/
if (!(pfn & (pageblock_nr_pages - 1))) {
@@ -6137,7 +6137,7 @@ void __meminit __weak memmap_init(unsign
if (end_pfn > start_pfn) {
size = end_pfn - start_pfn;
memmap_init_zone(size, nid, zone, start_pfn,
- MEMMAP_EARLY, NULL);
+ MEMINIT_EARLY, NULL);
}
}
}
_
Changes since v8 [1]:
- Rebase on v5.9-rc6
- Fix a performance regression in the x86 copy_mc_to_user()
implementation that was duplicating copies in the "fragile" case.
- Refreshed the cover letter.
[1]: http://lore.kernel.org/r/159630255616.3143511.18110575960499749012.stgit@dw…
---
The motivations to go rework memcpy_mcsafe() are that the benefit of
doing slow and careful copies is obviated on newer CPUs, and that the
current opt-in list of cpus to instrument recovery is broken relative to
those cpus. There is no need to keep an opt-in list up to date on an
ongoing basis if pmem/dax operations are instrumented for recovery by
default. With recovery enabled by default the old "mcsafe_key" opt-in to
careful copying can be made a "fragile" opt-out. Where the "fragile"
list takes steps to not consume poison across cachelines.
The discussion with Linus made clear that the current "_mcsafe" suffix
was imprecise to a fault. The operations that are needed by pmem/dax are
to copy from a source address that might throw #MC to a destination that
may write-fault, if it is a user page. So copy_to_user_mcsafe() becomes
copy_mc_to_user() to indicate the separate precautions taken on source
and destination. copy_mc_to_kernel() is introduced as a version that
does not expect write-faults on the destination, but is still prepared
to abort with an error code upon taking #MC.
These patches have received a kbuild-robot build success notification
across 114 configs, the rebase to v5.9-rc6 did not encounter any
conflicts, and the merge with tip/master is conflict-free.
---
Dan Williams (2):
x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user,kernel}()
x86/copy_mc: Introduce copy_mc_generic()
arch/powerpc/Kconfig | 2
arch/powerpc/include/asm/string.h | 2
arch/powerpc/include/asm/uaccess.h | 40 +++--
arch/powerpc/lib/Makefile | 2
arch/powerpc/lib/copy_mc_64.S | 4
arch/x86/Kconfig | 2
arch/x86/Kconfig.debug | 2
arch/x86/include/asm/copy_mc_test.h | 75 +++++++++
arch/x86/include/asm/mcsafe_test.h | 75 ---------
arch/x86/include/asm/string_64.h | 32 ----
arch/x86/include/asm/uaccess.h | 21 +++
arch/x86/include/asm/uaccess_64.h | 20 --
arch/x86/kernel/cpu/mce/core.c | 8 -
arch/x86/kernel/quirks.c | 9 -
arch/x86/lib/Makefile | 1
arch/x86/lib/copy_mc.c | 65 ++++++++
arch/x86/lib/copy_mc_64.S | 165 ++++++++++++++++++++
arch/x86/lib/memcpy_64.S | 115 --------------
arch/x86/lib/usercopy_64.c | 21 ---
drivers/md/dm-writecache.c | 15 +-
drivers/nvdimm/claim.c | 2
drivers/nvdimm/pmem.c | 6 -
include/linux/string.h | 9 -
include/linux/uaccess.h | 9 +
include/linux/uio.h | 10 +
lib/Kconfig | 7 +
lib/iov_iter.c | 43 +++--
tools/arch/x86/include/asm/mcsafe_test.h | 13 --
tools/arch/x86/lib/memcpy_64.S | 115 --------------
tools/objtool/check.c | 5 -
tools/perf/bench/Build | 1
tools/perf/bench/mem-memcpy-x86-64-lib.c | 24 ---
tools/testing/nvdimm/test/nfit.c | 48 +++---
.../testing/selftests/powerpc/copyloops/.gitignore | 2
tools/testing/selftests/powerpc/copyloops/Makefile | 6 -
.../selftests/powerpc/copyloops/copy_mc_64.S | 1
.../selftests/powerpc/copyloops/memcpy_mcsafe_64.S | 1
37 files changed, 452 insertions(+), 526 deletions(-)
rename arch/powerpc/lib/{memcpy_mcsafe_64.S => copy_mc_64.S} (98%)
create mode 100644 arch/x86/include/asm/copy_mc_test.h
delete mode 100644 arch/x86/include/asm/mcsafe_test.h
create mode 100644 arch/x86/lib/copy_mc.c
create mode 100644 arch/x86/lib/copy_mc_64.S
delete mode 100644 tools/arch/x86/include/asm/mcsafe_test.h
delete mode 100644 tools/perf/bench/mem-memcpy-x86-64-lib.c
create mode 120000 tools/testing/selftests/powerpc/copyloops/copy_mc_64.S
delete mode 120000 tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S
base-commit: ba4f184e126b751d1bffad5897f263108befc780
With suitably crafted reiserfs image and mount command reiserfs will
crash when trying to verify that XATTR_ROOT directory can be looked up
in / as that recurses back to xattr code like:
xattr_lookup+0x24/0x280 fs/reiserfs/xattr.c:395
reiserfs_xattr_get+0x89/0x540 fs/reiserfs/xattr.c:677
reiserfs_get_acl+0x63/0x690 fs/reiserfs/xattr_acl.c:209
get_acl+0x152/0x2e0 fs/posix_acl.c:141
check_acl fs/namei.c:277 [inline]
acl_permission_check fs/namei.c:309 [inline]
generic_permission+0x2ba/0x550 fs/namei.c:353
do_inode_permission fs/namei.c:398 [inline]
inode_permission+0x234/0x4a0 fs/namei.c:463
lookup_one_len+0xa6/0x200 fs/namei.c:2557
reiserfs_lookup_privroot+0x85/0x1e0 fs/reiserfs/xattr.c:972
reiserfs_fill_super+0x2b51/0x3240 fs/reiserfs/super.c:2176
mount_bdev+0x24f/0x360 fs/super.c:1417
Fix the problem by bailing from reiserfs_xattr_get() when xattrs are not
yet initialized.
CC: stable(a)vger.kernel.org
Reported-by: syzbot+9b33c9b118d77ff59b6f(a)syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/reiserfs/xattr.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
index 28b241cd6987..fe63a7c3e0da 100644
--- a/fs/reiserfs/xattr.c
+++ b/fs/reiserfs/xattr.c
@@ -674,6 +674,13 @@ reiserfs_xattr_get(struct inode *inode, const char *name, void *buffer,
if (get_inode_sd_version(inode) == STAT_DATA_V1)
return -EOPNOTSUPP;
+ /*
+ * priv_root needn't be initialized during mount so allow initial
+ * lookups to succeed.
+ */
+ if (!REISERFS_SB(inode->i_sb)->priv_root)
+ return 0;
+
dentry = xattr_lookup(inode, name, XATTR_REPLACE);
if (IS_ERR(dentry)) {
err = PTR_ERR(dentry);
--
2.16.4
When running as Xen dom0 the kernel isn't responsible for selecting the
error handling mode, this should be handled by the hypervisor.
So disable setting FF mode when running as Xen pv guest. Not doing so
might result in boot splats like:
[ 7.509696] HEST: Enabling Firmware First mode for corrected errors.
[ 7.510382] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 2.
[ 7.510383] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 3.
[ 7.510384] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 4.
[ 7.510384] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 5.
[ 7.510385] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 6.
[ 7.510386] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 7.
[ 7.510386] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 8.
Reason is that the HEST ACPI table contains the real number of MCA
banks, while the hypervisor is emulating only 2 banks for guests.
Cc: stable(a)vger.kernel.org
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
---
arch/x86/xen/enlighten_pv.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 22e741e0b10c..351ac1a9a119 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1376,6 +1376,15 @@ asmlinkage __visible void __init xen_start_kernel(void)
x86_init.mpparse.get_smp_config = x86_init_uint_noop;
xen_boot_params_init_edd();
+
+#ifdef CONFIG_ACPI
+ /*
+ * Disable selecting "Firmware First mode" for correctable
+ * memory errors, as this is the duty of the hypervisor to
+ * decide.
+ */
+ acpi_disable_cmcff = 1;
+#endif
}
if (!boot_params.screen_info.orig_video_isVGA)
--
2.26.2
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
The first thing that the ftrace function callback helper functions should do
is to check for recursion. Peter Zijlstra found that when
"rcu_is_watching()" had its notrace removed, it caused perf function tracing
to crash. This is because the call of rcu_is_watching() is tested before
function recursion is checked and and if it is traced, it will cause an
infinite recursion loop.
rcu_is_watching() should still stay notrace, but to prevent this should
never had crashed in the first place. The recursion prevention must be the
first thing done in callback functions.
Link: https://lore.kernel.org/r/20200929112541.GM2628@hirez.programming.kicks-ass…
Cc: stable(a)vger.kernel.org
Cc: Paul McKenney <paulmck(a)kernel.org>
Fixes: c68c0fa293417 ("ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too")
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reported-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/ftrace.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 603255f5f085..541453927c82 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6993,16 +6993,14 @@ static void ftrace_ops_assist_func(unsigned long ip, unsigned long parent_ip,
{
int bit;
- if ((op->flags & FTRACE_OPS_FL_RCU) && !rcu_is_watching())
- return;
-
bit = trace_test_and_set_recursion(TRACE_LIST_START, TRACE_LIST_MAX);
if (bit < 0)
return;
preempt_disable_notrace();
- op->func(ip, parent_ip, op, regs);
+ if (!(op->flags & FTRACE_OPS_FL_RCU) || rcu_is_watching())
+ op->func(ip, parent_ip, op, regs);
preempt_enable_notrace();
trace_clear_recursion(bit);
--
2.28.0
This is a note to let you know that I've just added the patch titled
serial: pl011: Fix lockdep splat when handling magic-sysrq interrupt
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 534cf755d9df99e214ddbe26b91cd4d81d2603e2 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Wed, 30 Sep 2020 13:04:32 +0100
Subject: serial: pl011: Fix lockdep splat when handling magic-sysrq interrupt
Issuing a magic-sysrq via the PL011 causes the following lockdep splat,
which is easily reproducible under QEMU:
| sysrq: Changing Loglevel
| sysrq: Loglevel set to 9
|
| ======================================================
| WARNING: possible circular locking dependency detected
| 5.9.0-rc7 #1 Not tainted
| ------------------------------------------------------
| systemd-journal/138 is trying to acquire lock:
| ffffab133ad950c0 (console_owner){-.-.}-{0:0}, at: console_lock_spinning_enable+0x34/0x70
|
| but task is already holding lock:
| ffff0001fd47b098 (&port_lock_key){-.-.}-{2:2}, at: pl011_int+0x40/0x488
|
| which lock already depends on the new lock.
[...]
| Possible unsafe locking scenario:
|
| CPU0 CPU1
| ---- ----
| lock(&port_lock_key);
| lock(console_owner);
| lock(&port_lock_key);
| lock(console_owner);
|
| *** DEADLOCK ***
The issue being that CPU0 takes 'port_lock' on the irq path in pl011_int()
before taking 'console_owner' on the printk() path, whereas CPU1 takes
the two locks in the opposite order on the printk() path due to setting
the "console_owner" prior to calling into into the actual console driver.
Fix this in the same way as the msm-serial driver by dropping 'port_lock'
before handling the sysrq.
Cc: <stable(a)vger.kernel.org> # 4.19+
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Link: https://lore.kernel.org/r/20200811101313.GA6970@willie-the-truck
Signed-off-by: Peter Zijlstra <peterz(a)infradead.org>
Tested-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
Link: https://lore.kernel.org/r/20200930120432.16551-1-will@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/amba-pl011.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 67498594d7d7..87dc3fc15694 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -308,8 +308,9 @@ static void pl011_write(unsigned int val, const struct uart_amba_port *uap,
*/
static int pl011_fifo_to_tty(struct uart_amba_port *uap)
{
- u16 status;
unsigned int ch, flag, fifotaken;
+ int sysrq;
+ u16 status;
for (fifotaken = 0; fifotaken != 256; fifotaken++) {
status = pl011_read(uap, REG_FR);
@@ -344,10 +345,12 @@ static int pl011_fifo_to_tty(struct uart_amba_port *uap)
flag = TTY_FRAME;
}
- if (uart_handle_sysrq_char(&uap->port, ch & 255))
- continue;
+ spin_unlock(&uap->port.lock);
+ sysrq = uart_handle_sysrq_char(&uap->port, ch & 255);
+ spin_lock(&uap->port.lock);
- uart_insert_char(&uap->port, ch, UART011_DR_OE, ch, flag);
+ if (!sysrq)
+ uart_insert_char(&uap->port, ch, UART011_DR_OE, ch, flag);
}
return fifotaken;
--
2.28.0
This is a note to let you know that I've just added the patch titled
tty: serial: fsl_lpuart: fix lpuart32_poll_get_char
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 29788ab1d2bf26c130de8f44f9553ee78a27e8d5 Mon Sep 17 00:00:00 2001
From: Peng Fan <peng.fan(a)nxp.com>
Date: Tue, 29 Sep 2020 17:55:09 +0800
Subject: tty: serial: fsl_lpuart: fix lpuart32_poll_get_char
The watermark is set to 1, so we need to input two chars to trigger RDRF
using the original logic. With the new logic, we could always get the
char when there is data in FIFO.
Suggested-by: Fugang Duan <fugang.duan(a)nxp.com>
Signed-off-by: Peng Fan <peng.fan(a)nxp.com>
Link: https://lore.kernel.org/r/20200929095509.21680-1-peng.fan@nxp.com
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/fsl_lpuart.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
index 645bbb24b433..590c901a1fc5 100644
--- a/drivers/tty/serial/fsl_lpuart.c
+++ b/drivers/tty/serial/fsl_lpuart.c
@@ -680,7 +680,7 @@ static void lpuart32_poll_put_char(struct uart_port *port, unsigned char c)
static int lpuart32_poll_get_char(struct uart_port *port)
{
- if (!(lpuart32_read(port, UARTSTAT) & UARTSTAT_RDRF))
+ if (!(lpuart32_read(port, UARTWATER) >> UARTWATER_RXCNT_OFF))
return NO_POLL_CHAR;
return lpuart32_read(port, UARTDATA);
--
2.28.0
This is a note to let you know that I've just added the patch titled
tty: serial: lpuart: fix lpuart32_write usage
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 9ea40db477c024dcb87321b7f32bd6ea12130ac2 Mon Sep 17 00:00:00 2001
From: Peng Fan <peng.fan(a)nxp.com>
Date: Tue, 29 Sep 2020 17:19:20 +0800
Subject: tty: serial: lpuart: fix lpuart32_write usage
The 2nd and 3rd parameter were wrongly used, and cause kernel abort when
doing kgdb debug.
Fixes: 1da17d7cf8e2 ("tty: serial: fsl_lpuart: Use appropriate lpuart32_* I/O funcs")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Peng Fan <peng.fan(a)nxp.com>
Link: https://lore.kernel.org/r/20200929091920.22612-1-peng.fan@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/fsl_lpuart.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
index 5a5a22d77841..645bbb24b433 100644
--- a/drivers/tty/serial/fsl_lpuart.c
+++ b/drivers/tty/serial/fsl_lpuart.c
@@ -649,26 +649,24 @@ static int lpuart32_poll_init(struct uart_port *port)
spin_lock_irqsave(&sport->port.lock, flags);
/* Disable Rx & Tx */
- lpuart32_write(&sport->port, UARTCTRL, 0);
+ lpuart32_write(&sport->port, 0, UARTCTRL);
temp = lpuart32_read(&sport->port, UARTFIFO);
/* Enable Rx and Tx FIFO */
- lpuart32_write(&sport->port, UARTFIFO,
- temp | UARTFIFO_RXFE | UARTFIFO_TXFE);
+ lpuart32_write(&sport->port, temp | UARTFIFO_RXFE | UARTFIFO_TXFE, UARTFIFO);
/* flush Tx and Rx FIFO */
- lpuart32_write(&sport->port, UARTFIFO,
- UARTFIFO_TXFLUSH | UARTFIFO_RXFLUSH);
+ lpuart32_write(&sport->port, UARTFIFO_TXFLUSH | UARTFIFO_RXFLUSH, UARTFIFO);
/* explicitly clear RDRF */
if (lpuart32_read(&sport->port, UARTSTAT) & UARTSTAT_RDRF) {
lpuart32_read(&sport->port, UARTDATA);
- lpuart32_write(&sport->port, UARTFIFO, UARTFIFO_RXUF);
+ lpuart32_write(&sport->port, UARTFIFO_RXUF, UARTFIFO);
}
/* Enable Rx and Tx */
- lpuart32_write(&sport->port, UARTCTRL, UARTCTRL_RE | UARTCTRL_TE);
+ lpuart32_write(&sport->port, UARTCTRL_RE | UARTCTRL_TE, UARTCTRL);
spin_unlock_irqrestore(&sport->port.lock, flags);
return 0;
@@ -677,7 +675,7 @@ static int lpuart32_poll_init(struct uart_port *port)
static void lpuart32_poll_put_char(struct uart_port *port, unsigned char c)
{
lpuart32_wait_bit_set(port, UARTSTAT, UARTSTAT_TDRE);
- lpuart32_write(port, UARTDATA, c);
+ lpuart32_write(port, c, UARTDATA);
}
static int lpuart32_poll_get_char(struct uart_port *port)
--
2.28.0
From: Peter Zijlstra <peterz(a)infradead.org>
Issuing a magic-sysrq via the PL011 causes the following lockdep splat,
which is easily reproducible under QEMU:
| sysrq: Changing Loglevel
| sysrq: Loglevel set to 9
|
| ======================================================
| WARNING: possible circular locking dependency detected
| 5.9.0-rc7 #1 Not tainted
| ------------------------------------------------------
| systemd-journal/138 is trying to acquire lock:
| ffffab133ad950c0 (console_owner){-.-.}-{0:0}, at: console_lock_spinning_enable+0x34/0x70
|
| but task is already holding lock:
| ffff0001fd47b098 (&port_lock_key){-.-.}-{2:2}, at: pl011_int+0x40/0x488
|
| which lock already depends on the new lock.
[...]
| Possible unsafe locking scenario:
|
| CPU0 CPU1
| ---- ----
| lock(&port_lock_key);
| lock(console_owner);
| lock(&port_lock_key);
| lock(console_owner);
|
| *** DEADLOCK ***
The issue being that CPU0 takes 'port_lock' on the irq path in pl011_int()
before taking 'console_owner' on the printk() path, whereas CPU1 takes
the two locks in the opposite order on the printk() path due to setting
the "console_owner" prior to calling into into the actual console driver.
Fix this in the same way as the msm-serial driver by dropping 'port_lock'
before handling the sysrq.
Cc: <stable(a)vger.kernel.org> # 4.19+
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Link: https://lore.kernel.org/r/20200811101313.GA6970@willie-the-truck
Signed-off-by: Peter Zijlstra <peterz(a)infradead.org>
Tested-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
---
drivers/tty/serial/amba-pl011.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 67498594d7d7..87dc3fc15694 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -308,8 +308,9 @@ static void pl011_write(unsigned int val, const struct uart_amba_port *uap,
*/
static int pl011_fifo_to_tty(struct uart_amba_port *uap)
{
- u16 status;
unsigned int ch, flag, fifotaken;
+ int sysrq;
+ u16 status;
for (fifotaken = 0; fifotaken != 256; fifotaken++) {
status = pl011_read(uap, REG_FR);
@@ -344,10 +345,12 @@ static int pl011_fifo_to_tty(struct uart_amba_port *uap)
flag = TTY_FRAME;
}
- if (uart_handle_sysrq_char(&uap->port, ch & 255))
- continue;
+ spin_unlock(&uap->port.lock);
+ sysrq = uart_handle_sysrq_char(&uap->port, ch & 255);
+ spin_lock(&uap->port.lock);
- uart_insert_char(&uap->port, ch, UART011_DR_OE, ch, flag);
+ if (!sysrq)
+ uart_insert_char(&uap->port, ch, UART011_DR_OE, ch, flag);
}
return fifotaken;
--
2.28.0.709.gb0816b6eb0-goog
This is a note to let you know that I've just added the patch titled
serial: qcom_geni_serial: To correct QUP Version detection logic
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From dd82044dd138460fbc0c684fca3b2f4cf23d2958 Mon Sep 17 00:00:00 2001
From: Paras Sharma <parashar(a)codeaurora.org>
Date: Wed, 30 Sep 2020 11:35:26 +0530
Subject: serial: qcom_geni_serial: To correct QUP Version detection logic
For QUP IP versions 2.5 and above the oversampling rate is
halved from 32 to 16.
Commit ce734600545f ("tty: serial: qcom_geni_serial: Update
the oversampling rate") is pushed to handle this scenario.
But the existing logic is failing to classify QUP Version 3.0
into the correct group ( 2.5 and above).
As result Serial Engine clocks are not configured properly for
baud rate and garbage data is sampled to FIFOs from the line.
So, fix the logic to detect QUP with versions 2.5 and above.
Fixes: ce734600545f ("tty: serial: qcom_geni_serial: Update the oversampling rate")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Paras Sharma <parashar(a)codeaurora.org>
Link: https://lore.kernel.org/r/1601445926-23673-1-git-send-email-parashar@codeau…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/qcom_geni_serial.c | 2 +-
include/linux/qcom-geni-se.h | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 8da2eb2d1090..291649f02821 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1000,7 +1000,7 @@ static void qcom_geni_serial_set_termios(struct uart_port *uport,
sampling_rate = UART_OVERSAMPLING;
/* Sampling rate is halved for IP versions >= 2.5 */
ver = geni_se_get_qup_hw_version(&port->se);
- if (GENI_SE_VERSION_MAJOR(ver) >= 2 && GENI_SE_VERSION_MINOR(ver) >= 5)
+ if (ver >= QUP_SE_VERSION_2_5)
sampling_rate /= 2;
clk_rate = get_clk_div_rate(baud, sampling_rate, &clk_div);
diff --git a/include/linux/qcom-geni-se.h b/include/linux/qcom-geni-se.h
index 8f385fbe5a0e..1c31f26ccc7a 100644
--- a/include/linux/qcom-geni-se.h
+++ b/include/linux/qcom-geni-se.h
@@ -248,6 +248,9 @@ struct geni_se {
#define GENI_SE_VERSION_MINOR(ver) ((ver & HW_VER_MINOR_MASK) >> HW_VER_MINOR_SHFT)
#define GENI_SE_VERSION_STEP(ver) (ver & HW_VER_STEP_MASK)
+/* QUP SE VERSION value for major number 2 and minor number 5 */
+#define QUP_SE_VERSION_2_5 0x20050000
+
/*
* Define bandwidth thresholds that cause the underlying Core 2X interconnect
* clock to run at the named frequency. These baseline values are recommended
--
2.28.0
This is a note to let you know that I've just added the patch titled
iio: adc: gyroadc: fix leak of device node iterator
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From da4410d4078ba4ead9d6f1027d6db77c5a74ecee Mon Sep 17 00:00:00 2001
From: Tobias Jordan <kernel(a)cdqe.de>
Date: Sat, 26 Sep 2020 18:19:46 +0200
Subject: iio: adc: gyroadc: fix leak of device node iterator
Add missing of_node_put calls when exiting the for_each_child_of_node
loop in rcar_gyroadc_parse_subdevs early.
Also add goto-exception handling for the error paths in that loop.
Fixes: 059c53b32329 ("iio: adc: Add Renesas GyroADC driver")
Signed-off-by: Tobias Jordan <kernel(a)cdqe.de>
Link: https://lore.kernel.org/r/20200926161946.GA10240@agrajag.zerfleddert.de
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/rcar-gyroadc.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/iio/adc/rcar-gyroadc.c b/drivers/iio/adc/rcar-gyroadc.c
index dcaefc108ff6..9f38cf3c7dc2 100644
--- a/drivers/iio/adc/rcar-gyroadc.c
+++ b/drivers/iio/adc/rcar-gyroadc.c
@@ -357,7 +357,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
num_channels = ARRAY_SIZE(rcar_gyroadc_iio_channels_3);
break;
default:
- return -EINVAL;
+ goto err_e_inval;
}
/*
@@ -374,7 +374,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
dev_err(dev,
"Failed to get child reg property of ADC \"%pOFn\".\n",
child);
- return ret;
+ goto err_of_node_put;
}
/* Channel number is too high. */
@@ -382,7 +382,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
dev_err(dev,
"Only %i channels supported with %pOFn, but reg = <%i>.\n",
num_channels, child, reg);
- return -EINVAL;
+ goto err_e_inval;
}
}
@@ -391,7 +391,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
dev_err(dev,
"Channel %i uses different ADC mode than the rest.\n",
reg);
- return -EINVAL;
+ goto err_e_inval;
}
/* Channel is valid, grab the regulator. */
@@ -401,7 +401,8 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
if (IS_ERR(vref)) {
dev_dbg(dev, "Channel %i 'vref' supply not connected.\n",
reg);
- return PTR_ERR(vref);
+ ret = PTR_ERR(vref);
+ goto err_of_node_put;
}
priv->vref[reg] = vref;
@@ -425,8 +426,10 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
* attached to the GyroADC at a time, so if we found it,
* we can stop parsing here.
*/
- if (childmode == RCAR_GYROADC_MODE_SELECT_1_MB88101A)
+ if (childmode == RCAR_GYROADC_MODE_SELECT_1_MB88101A) {
+ of_node_put(child);
break;
+ }
}
if (first) {
@@ -435,6 +438,12 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
}
return 0;
+
+err_e_inval:
+ ret = -EINVAL;
+err_of_node_put:
+ of_node_put(child);
+ return ret;
}
static void rcar_gyroadc_deinit_supplies(struct iio_dev *indio_dev)
--
2.28.0
This is a note to let you know that I've just added the patch titled
iio: adc: at91-sama5d2_adc: fix DMA conversion crash
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 1a198794451449113fa86994ed491d6986802c23 Mon Sep 17 00:00:00 2001
From: Eugen Hristev <eugen.hristev(a)microchip.com>
Date: Wed, 23 Sep 2020 15:17:48 +0300
Subject: iio: adc: at91-sama5d2_adc: fix DMA conversion crash
After the move of the postenable code to preenable, the DMA start was
done before the DMA init, which is not correct.
The DMA is initialized in set_watermark. Because of this, we need to call
the DMA start functions in set_watermark, after the DMA init, instead of
preenable hook, when the DMA is not properly setup yet.
Fixes: f3c034f61775 ("iio: at91-sama5d2_adc: adjust iio_triggered_buffer_{predisable,postenable} positions")
Signed-off-by: Eugen Hristev <eugen.hristev(a)microchip.com>
Link: https://lore.kernel.org/r/20200923121748.49384-1-eugen.hristev@microchip.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/at91-sama5d2_adc.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/iio/adc/at91-sama5d2_adc.c b/drivers/iio/adc/at91-sama5d2_adc.c
index ad7d9819f83c..b917a4714a9c 100644
--- a/drivers/iio/adc/at91-sama5d2_adc.c
+++ b/drivers/iio/adc/at91-sama5d2_adc.c
@@ -884,7 +884,7 @@ static bool at91_adc_current_chan_is_touch(struct iio_dev *indio_dev)
AT91_SAMA5D2_MAX_CHAN_IDX + 1);
}
-static int at91_adc_buffer_preenable(struct iio_dev *indio_dev)
+static int at91_adc_buffer_prepare(struct iio_dev *indio_dev)
{
int ret;
u8 bit;
@@ -901,7 +901,7 @@ static int at91_adc_buffer_preenable(struct iio_dev *indio_dev)
/* we continue with the triggered buffer */
ret = at91_adc_dma_start(indio_dev);
if (ret) {
- dev_err(&indio_dev->dev, "buffer postenable failed\n");
+ dev_err(&indio_dev->dev, "buffer prepare failed\n");
return ret;
}
@@ -989,7 +989,6 @@ static int at91_adc_buffer_postdisable(struct iio_dev *indio_dev)
}
static const struct iio_buffer_setup_ops at91_buffer_setup_ops = {
- .preenable = &at91_adc_buffer_preenable,
.postdisable = &at91_adc_buffer_postdisable,
};
@@ -1563,6 +1562,7 @@ static void at91_adc_dma_disable(struct platform_device *pdev)
static int at91_adc_set_watermark(struct iio_dev *indio_dev, unsigned int val)
{
struct at91_adc_state *st = iio_priv(indio_dev);
+ int ret;
if (val > AT91_HWFIFO_MAX_SIZE)
return -EINVAL;
@@ -1586,7 +1586,15 @@ static int at91_adc_set_watermark(struct iio_dev *indio_dev, unsigned int val)
else if (val > 1)
at91_adc_dma_init(to_platform_device(&indio_dev->dev));
- return 0;
+ /*
+ * We can start the DMA only after setting the watermark and
+ * having the DMA initialization completed
+ */
+ ret = at91_adc_buffer_prepare(indio_dev);
+ if (ret)
+ at91_adc_dma_disable(to_platform_device(&indio_dev->dev));
+
+ return ret;
}
static int at91_adc_update_scan_mode(struct iio_dev *indio_dev,
--
2.28.0
Hi,
Can we have this backport applied to 5.4 stable, its more of a required
fix than an enhancement.
commit df12eb6d6cd920ab2f0e0a43cd6e1c23a05cea91 upstream
The call has a minor api change in 5.4 vs higher, only the pkt arg is
required.
diff --git a/net/vmw_vsock/virtio_transport_common.c
b/net/vmw_vsock/virtio_transport_common.c
index d9f0c9c5425a..2f696124bab6 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -1153,6 +1153,7 @@ void virtio_transport_recv_pkt(struct
virtio_transport *t,
virtio_transport_free_pkt(pkt);
break;
default:
+ (void)virtio_transport_reset_no_sock(pkt);
virtio_transport_free_pkt(pkt);
break;
}
--
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 6c5dee18756b4721ac8518c69b22ee8ac0c9c442 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)wdc.com>
Date: Tue, 15 Sep 2020 16:33:47 +0900
Subject: [PATCH] scsi: sd: sd_zbc: Fix ZBC disk initialization
Make sure to call sd_zbc_init_disk() when the sdkp->zoned field is known,
that is, once sd_read_block_characteristics() is executed in
sd_revalidate_disk(), so that host-aware disks also get initialized. To do
so, move sd_zbc_init_disk() call in sd_zbc_revalidate_zones() and make sure
to execute it for all zoned disks, including for host-aware disks used as
regular disks as these disk zoned model may be changed back to BLK_ZONED_HA
when partitions are deleted.
Link: https://lore.kernel.org/r/20200915073347.832424-3-damien.lemoal@wdc.com
Fixes: 5795eb443060 ("scsi: sd_zbc: emulate ZONE_APPEND commands")
Cc: <stable(a)vger.kernel.org> # v5.8+
Reported-by: Borislav Petkov <bp(a)alien8.de>
Tested-by: Borislav Petkov <bp(a)suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 06286b6aeaec..16503e22691e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3410,10 +3410,6 @@ static int sd_probe(struct device *dev)
sdkp->first_scan = 1;
sdkp->max_medium_access_timeouts = SD_MAX_MEDIUM_TIMEOUTS;
- error = sd_zbc_init_disk(sdkp);
- if (error)
- goto out_free_index;
-
sd_revalidate_disk(gd);
gd->flags = GENHD_FL_EXT_DEVT;
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 7251434100e6..a3aad608bc38 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -215,7 +215,6 @@ static inline int sd_is_zoned(struct scsi_disk *sdkp)
#ifdef CONFIG_BLK_DEV_ZONED
-int sd_zbc_init_disk(struct scsi_disk *sdkp);
void sd_zbc_release_disk(struct scsi_disk *sdkp);
int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buffer);
int sd_zbc_revalidate_zones(struct scsi_disk *sdkp);
@@ -231,11 +230,6 @@ blk_status_t sd_zbc_prepare_zone_append(struct scsi_cmnd *cmd, sector_t *lba,
#else /* CONFIG_BLK_DEV_ZONED */
-static inline int sd_zbc_init_disk(struct scsi_disk *sdkp)
-{
- return 0;
-}
-
static inline void sd_zbc_release_disk(struct scsi_disk *sdkp) {}
static inline int sd_zbc_read_zones(struct scsi_disk *sdkp,
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index a739456dea02..cf07b7f93579 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -651,6 +651,28 @@ static void sd_zbc_print_zones(struct scsi_disk *sdkp)
sdkp->zone_blocks);
}
+static int sd_zbc_init_disk(struct scsi_disk *sdkp)
+{
+ sdkp->zones_wp_offset = NULL;
+ spin_lock_init(&sdkp->zones_wp_offset_lock);
+ sdkp->rev_wp_offset = NULL;
+ mutex_init(&sdkp->rev_mutex);
+ INIT_WORK(&sdkp->zone_wp_offset_work, sd_zbc_update_wp_offset_workfn);
+ sdkp->zone_wp_update_buf = kzalloc(SD_BUF_SIZE, GFP_KERNEL);
+ if (!sdkp->zone_wp_update_buf)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void sd_zbc_release_disk(struct scsi_disk *sdkp)
+{
+ kvfree(sdkp->zones_wp_offset);
+ sdkp->zones_wp_offset = NULL;
+ kfree(sdkp->zone_wp_update_buf);
+ sdkp->zone_wp_update_buf = NULL;
+}
+
static void sd_zbc_revalidate_zones_cb(struct gendisk *disk)
{
struct scsi_disk *sdkp = scsi_disk(disk);
@@ -667,6 +689,19 @@ int sd_zbc_revalidate_zones(struct scsi_disk *sdkp)
u32 max_append;
int ret = 0;
+ /*
+ * For all zoned disks, initialize zone append emulation data if not
+ * already done. This is necessary also for host-aware disks used as
+ * regular disks due to the presence of partitions as these partitions
+ * may be deleted and the disk zoned model changed back from
+ * BLK_ZONED_NONE to BLK_ZONED_HA.
+ */
+ if (sd_is_zoned(sdkp) && !sdkp->zone_wp_update_buf) {
+ ret = sd_zbc_init_disk(sdkp);
+ if (ret)
+ return ret;
+ }
+
/*
* There is nothing to do for regular disks, including host-aware disks
* that have partitions.
@@ -768,28 +803,3 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
return ret;
}
-
-int sd_zbc_init_disk(struct scsi_disk *sdkp)
-{
- if (!sd_is_zoned(sdkp))
- return 0;
-
- sdkp->zones_wp_offset = NULL;
- spin_lock_init(&sdkp->zones_wp_offset_lock);
- sdkp->rev_wp_offset = NULL;
- mutex_init(&sdkp->rev_mutex);
- INIT_WORK(&sdkp->zone_wp_offset_work, sd_zbc_update_wp_offset_workfn);
- sdkp->zone_wp_update_buf = kzalloc(SD_BUF_SIZE, GFP_KERNEL);
- if (!sdkp->zone_wp_update_buf)
- return -ENOMEM;
-
- return 0;
-}
-
-void sd_zbc_release_disk(struct scsi_disk *sdkp)
-{
- kvfree(sdkp->zones_wp_offset);
- sdkp->zones_wp_offset = NULL;
- kfree(sdkp->zone_wp_update_buf);
- sdkp->zone_wp_update_buf = NULL;
-}
The following commit has been merged into the efi/urgent branch of tip:
Commit-ID: d32de9130f6c79533508e2c7879f18997bfbe2a0
Gitweb: https://git.kernel.org/tip/d32de9130f6c79533508e2c7879f18997bfbe2a0
Author: Ard Biesheuvel <ardb(a)kernel.org>
AuthorDate: Sat, 26 Sep 2020 10:52:42 +02:00
Committer: Ard Biesheuvel <ardb(a)kernel.org>
CommitterDate: Tue, 29 Sep 2020 15:41:52 +02:00
efi/arm64: libstub: Deal gracefully with EFI_RNG_PROTOCOL failure
Currently, on arm64, we abort on any failure from efi_get_random_bytes()
other than EFI_NOT_FOUND when it comes to setting the physical seed for
KASLR, but ignore such failures when obtaining the seed for virtual
KASLR or for early seeding of the kernel's entropy pool via the config
table. This is inconsistent, and may lead to unexpected boot failures.
So let's permit any failure for the physical seed, and simply report
the error code if it does not equal EFI_NOT_FOUND.
Cc: <stable(a)vger.kernel.org> # v5.8+
Reported-by: Heinrich Schuchardt <xypron.glpk(a)gmx.de>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
---
drivers/firmware/efi/libstub/arm64-stub.c | 8 +++++---
drivers/firmware/efi/libstub/fdt.c | 4 +---
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index e5bfac7..04f5d79 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -62,10 +62,12 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
status = efi_get_random_bytes(sizeof(phys_seed),
(u8 *)&phys_seed);
if (status == EFI_NOT_FOUND) {
- efi_info("EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+ efi_info("EFI_RNG_PROTOCOL unavailable, KASLR will be disabled\n");
+ efi_nokaslr = true;
} else if (status != EFI_SUCCESS) {
- efi_err("efi_get_random_bytes() failed\n");
- return status;
+ efi_err("efi_get_random_bytes() failed (0x%lx), KASLR will be disabled\n",
+ status);
+ efi_nokaslr = true;
}
} else {
efi_info("KASLR disabled on kernel command line\n");
diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index 11ecf3c..368cd60 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -136,7 +136,7 @@ static efi_status_t update_fdt(void *orig_fdt, unsigned long orig_fdt_size,
if (status)
goto fdt_set_fail;
- if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && !efi_nokaslr) {
efi_status_t efi_status;
efi_status = efi_get_random_bytes(sizeof(fdt_val64),
@@ -145,8 +145,6 @@ static efi_status_t update_fdt(void *orig_fdt, unsigned long orig_fdt_size,
status = fdt_setprop_var(fdt, node, "kaslr-seed", fdt_val64);
if (status)
goto fdt_set_fail;
- } else if (efi_status != EFI_NOT_FOUND) {
- return efi_status;
}
}
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: caa13284791a - clocksource/drivers/timer-ti-dm: Do reset before enable
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
Host 4:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
x86_64:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
When tpm_get_random() was introduced, it defined the following API for the
return value:
1. A positive value tells how many bytes of random data was generated.
2. A negative value on error.
However, in the call sites the API was used incorrectly, i.e. as it would
only return negative values and otherwise zero. Returning he positive read
counts to the user space does not make any possible sense.
Fix this by returning -EIO when tpm_get_random() returns a positive value.
Fixes: 41ab999c80f1 ("tpm: Move tpm_get_random api into the TPM device driver")
Cc: Kent Yoder <key(a)linux.vnet.ibm.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley(a)HansenPartnership.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
security/keys/trusted-keys/trusted_tpm1.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/security/keys/trusted-keys/trusted_tpm1.c b/security/keys/trusted-keys/trusted_tpm1.c
index b9fe02e5f84f..c7b1701cdac5 100644
--- a/security/keys/trusted-keys/trusted_tpm1.c
+++ b/security/keys/trusted-keys/trusted_tpm1.c
@@ -403,9 +403,12 @@ static int osap(struct tpm_buf *tb, struct osapsess *s,
int ret;
ret = tpm_get_random(chip, ononce, TPM_NONCE_SIZE);
- if (ret != TPM_NONCE_SIZE)
+ if (ret < 0)
return ret;
+ if (ret != TPM_NONCE_SIZE)
+ return -EIO;
+
tpm_buf_reset(tb, TPM_TAG_RQU_COMMAND, TPM_ORD_OSAP);
tpm_buf_append_u16(tb, type);
tpm_buf_append_u32(tb, handle);
@@ -496,8 +499,12 @@ static int tpm_seal(struct tpm_buf *tb, uint16_t keytype,
goto out;
ret = tpm_get_random(chip, td->nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0)
+ return ret;
+
if (ret != TPM_NONCE_SIZE)
- goto out;
+ return -EIO;
+
ordinal = htonl(TPM_ORD_SEAL);
datsize = htonl(datalen);
pcrsize = htonl(pcrinfosize);
@@ -601,9 +608,12 @@ static int tpm_unseal(struct tpm_buf *tb,
ordinal = htonl(TPM_ORD_UNSEAL);
ret = tpm_get_random(chip, nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0)
+ return ret;
+
if (ret != TPM_NONCE_SIZE) {
pr_info("trusted_key: tpm_get_random failed (%d)\n", ret);
- return ret;
+ return -EIO;
}
ret = TSS_authhmac(authdata1, keyauth, TPM_NONCE_SIZE,
enonce1, nonceodd, cont, sizeof(uint32_t),
@@ -1013,8 +1023,12 @@ static int trusted_instantiate(struct key *key,
case Opt_new:
key_len = payload->key_len;
ret = tpm_get_random(chip, payload->key, key_len);
+ if (ret < 0)
+ goto out;
+
if (ret != key_len) {
pr_info("trusted_key: key_create failed (%d)\n", ret);
+ ret = -EIO;
goto out;
}
if (tpm2)
--
2.25.1
The patch titled
Subject: mm-khugepaged-recalculate-min_free_kbytes-after-memory-hotplug-as-expected-by-khugepaged-v5
has been added to the -mm tree. Its filename is
mm-khugepaged-recalculate-min_free_kbytes-after-memory-hotplug-as-expected-by-khugepaged-v5.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-khugepaged-recalculate-min_fre…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-khugepaged-recalculate-min_fre…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Vijay Balakrishna <vijayb(a)linux.microsoft.com>
Subject: mm-khugepaged-recalculate-min_free_kbytes-after-memory-hotplug-as-expected-by-khugepaged-v5
made changes to move khugepaged_min_free_kbytes_update into
init_per_zone_wmark_min and rested changes
Link: https://lkml.kernel.org/r/1601398153-5517-1-git-send-email-vijayb@linux.mic…
Fixes: f000565adb77 ("thp: set recommended min free kbytes")
Signed-off-by: Vijay Balakrishna <vijayb(a)linux.microsoft.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Allen Pais <apais(a)microsoft.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory_hotplug.c | 3 ---
mm/page_alloc.c | 3 +++
2 files changed, 3 insertions(+), 3 deletions(-)
--- a/mm/memory_hotplug.c~mm-khugepaged-recalculate-min_free_kbytes-after-memory-hotplug-as-expected-by-khugepaged-v5
+++ a/mm/memory_hotplug.c
@@ -36,7 +36,6 @@
#include <linux/memblock.h>
#include <linux/compaction.h>
#include <linux/rmap.h>
-#include <linux/khugepaged.h>
#include <asm/tlbflush.h>
@@ -858,7 +857,6 @@ int __ref online_pages(unsigned long pfn
zone_pcp_update(zone);
init_per_zone_wmark_min();
- khugepaged_min_free_kbytes_update();
kswapd_run(nid);
kcompactd_run(nid);
@@ -1617,7 +1615,6 @@ static int __ref __offline_pages(unsigne
pgdat_resize_unlock(zone->zone_pgdat, &flags);
init_per_zone_wmark_min();
- khugepaged_min_free_kbytes_update();
if (!populated_zone(zone)) {
zone_pcp_reset(zone);
--- a/mm/page_alloc.c~mm-khugepaged-recalculate-min_free_kbytes-after-memory-hotplug-as-expected-by-khugepaged-v5
+++ a/mm/page_alloc.c
@@ -69,6 +69,7 @@
#include <linux/nmi.h>
#include <linux/psi.h>
#include <linux/padata.h>
+#include <linux/khugepaged.h>
#include <asm/sections.h>
#include <asm/tlbflush.h>
@@ -7891,6 +7892,8 @@ int __meminit init_per_zone_wmark_min(vo
setup_min_slab_ratio();
#endif
+ khugepaged_min_free_kbytes_update();
+
return 0;
}
postcore_initcall(init_per_zone_wmark_min)
_
Patches currently in -mm which might be from vijayb(a)linux.microsoft.com are
mm-khugepaged-recalculate-min_free_kbytes-after-memory-hotplug-as-expected-by-khugepaged.patch
mm-khugepaged-recalculate-min_free_kbytes-after-memory-hotplug-as-expected-by-khugepaged-v5.patch
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 9e8a2521d6e5 - clocksource/drivers/timer-ti-dm: Do reset before enable
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 2bea8b771966 - Linux 5.8.13-rc1
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
x86_64:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
This is a note to let you know that I've just added the patch titled
iio: adc: gyroadc: fix leak of device node iterator
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the staging-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From da4410d4078ba4ead9d6f1027d6db77c5a74ecee Mon Sep 17 00:00:00 2001
From: Tobias Jordan <kernel(a)cdqe.de>
Date: Sat, 26 Sep 2020 18:19:46 +0200
Subject: iio: adc: gyroadc: fix leak of device node iterator
Add missing of_node_put calls when exiting the for_each_child_of_node
loop in rcar_gyroadc_parse_subdevs early.
Also add goto-exception handling for the error paths in that loop.
Fixes: 059c53b32329 ("iio: adc: Add Renesas GyroADC driver")
Signed-off-by: Tobias Jordan <kernel(a)cdqe.de>
Link: https://lore.kernel.org/r/20200926161946.GA10240@agrajag.zerfleddert.de
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/rcar-gyroadc.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/iio/adc/rcar-gyroadc.c b/drivers/iio/adc/rcar-gyroadc.c
index dcaefc108ff6..9f38cf3c7dc2 100644
--- a/drivers/iio/adc/rcar-gyroadc.c
+++ b/drivers/iio/adc/rcar-gyroadc.c
@@ -357,7 +357,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
num_channels = ARRAY_SIZE(rcar_gyroadc_iio_channels_3);
break;
default:
- return -EINVAL;
+ goto err_e_inval;
}
/*
@@ -374,7 +374,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
dev_err(dev,
"Failed to get child reg property of ADC \"%pOFn\".\n",
child);
- return ret;
+ goto err_of_node_put;
}
/* Channel number is too high. */
@@ -382,7 +382,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
dev_err(dev,
"Only %i channels supported with %pOFn, but reg = <%i>.\n",
num_channels, child, reg);
- return -EINVAL;
+ goto err_e_inval;
}
}
@@ -391,7 +391,7 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
dev_err(dev,
"Channel %i uses different ADC mode than the rest.\n",
reg);
- return -EINVAL;
+ goto err_e_inval;
}
/* Channel is valid, grab the regulator. */
@@ -401,7 +401,8 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
if (IS_ERR(vref)) {
dev_dbg(dev, "Channel %i 'vref' supply not connected.\n",
reg);
- return PTR_ERR(vref);
+ ret = PTR_ERR(vref);
+ goto err_of_node_put;
}
priv->vref[reg] = vref;
@@ -425,8 +426,10 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
* attached to the GyroADC at a time, so if we found it,
* we can stop parsing here.
*/
- if (childmode == RCAR_GYROADC_MODE_SELECT_1_MB88101A)
+ if (childmode == RCAR_GYROADC_MODE_SELECT_1_MB88101A) {
+ of_node_put(child);
break;
+ }
}
if (first) {
@@ -435,6 +438,12 @@ static int rcar_gyroadc_parse_subdevs(struct iio_dev *indio_dev)
}
return 0;
+
+err_e_inval:
+ ret = -EINVAL;
+err_of_node_put:
+ of_node_put(child);
+ return ret;
}
static void rcar_gyroadc_deinit_supplies(struct iio_dev *indio_dev)
--
2.28.0
This is a note to let you know that I've just added the patch titled
iio: adc: at91-sama5d2_adc: fix DMA conversion crash
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the staging-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 1a198794451449113fa86994ed491d6986802c23 Mon Sep 17 00:00:00 2001
From: Eugen Hristev <eugen.hristev(a)microchip.com>
Date: Wed, 23 Sep 2020 15:17:48 +0300
Subject: iio: adc: at91-sama5d2_adc: fix DMA conversion crash
After the move of the postenable code to preenable, the DMA start was
done before the DMA init, which is not correct.
The DMA is initialized in set_watermark. Because of this, we need to call
the DMA start functions in set_watermark, after the DMA init, instead of
preenable hook, when the DMA is not properly setup yet.
Fixes: f3c034f61775 ("iio: at91-sama5d2_adc: adjust iio_triggered_buffer_{predisable,postenable} positions")
Signed-off-by: Eugen Hristev <eugen.hristev(a)microchip.com>
Link: https://lore.kernel.org/r/20200923121748.49384-1-eugen.hristev@microchip.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/at91-sama5d2_adc.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/iio/adc/at91-sama5d2_adc.c b/drivers/iio/adc/at91-sama5d2_adc.c
index ad7d9819f83c..b917a4714a9c 100644
--- a/drivers/iio/adc/at91-sama5d2_adc.c
+++ b/drivers/iio/adc/at91-sama5d2_adc.c
@@ -884,7 +884,7 @@ static bool at91_adc_current_chan_is_touch(struct iio_dev *indio_dev)
AT91_SAMA5D2_MAX_CHAN_IDX + 1);
}
-static int at91_adc_buffer_preenable(struct iio_dev *indio_dev)
+static int at91_adc_buffer_prepare(struct iio_dev *indio_dev)
{
int ret;
u8 bit;
@@ -901,7 +901,7 @@ static int at91_adc_buffer_preenable(struct iio_dev *indio_dev)
/* we continue with the triggered buffer */
ret = at91_adc_dma_start(indio_dev);
if (ret) {
- dev_err(&indio_dev->dev, "buffer postenable failed\n");
+ dev_err(&indio_dev->dev, "buffer prepare failed\n");
return ret;
}
@@ -989,7 +989,6 @@ static int at91_adc_buffer_postdisable(struct iio_dev *indio_dev)
}
static const struct iio_buffer_setup_ops at91_buffer_setup_ops = {
- .preenable = &at91_adc_buffer_preenable,
.postdisable = &at91_adc_buffer_postdisable,
};
@@ -1563,6 +1562,7 @@ static void at91_adc_dma_disable(struct platform_device *pdev)
static int at91_adc_set_watermark(struct iio_dev *indio_dev, unsigned int val)
{
struct at91_adc_state *st = iio_priv(indio_dev);
+ int ret;
if (val > AT91_HWFIFO_MAX_SIZE)
return -EINVAL;
@@ -1586,7 +1586,15 @@ static int at91_adc_set_watermark(struct iio_dev *indio_dev, unsigned int val)
else if (val > 1)
at91_adc_dma_init(to_platform_device(&indio_dev->dev));
- return 0;
+ /*
+ * We can start the DMA only after setting the watermark and
+ * having the DMA initialization completed
+ */
+ ret = at91_adc_buffer_prepare(indio_dev);
+ if (ret)
+ at91_adc_dma_disable(to_platform_device(&indio_dev->dev));
+
+ return ret;
}
static int at91_adc_update_scan_mode(struct iio_dev *indio_dev,
--
2.28.0
Backport version to the 5.4-stable tree of:
[ Upstream commit c1d0da83358a2316d9be7f229f26126dbaa07468 ]
Patch series "mm: fix memory to node bad links in sysfs", v3.
Sometimes, firmware may expose interleaved memory layout like this:
Early memory node ranges
node 1: [mem 0x0000000000000000-0x000000011fffffff]
node 2: [mem 0x0000000120000000-0x000000014fffffff]
node 1: [mem 0x0000000150000000-0x00000001ffffffff]
node 0: [mem 0x0000000200000000-0x000000048fffffff]
node 2: [mem 0x0000000490000000-0x00000007ffffffff]
In that case, we can see memory blocks assigned to multiple nodes in
sysfs:
$ ls -l /sys/devices/system/memory/memory21
total 0
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
-rw-r--r-- 1 root root 65536 Aug 24 05:27 online
-r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device
-r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index
drwxr-xr-x 2 root root 0 Aug 24 05:27 power
-r--r--r-- 1 root root 65536 Aug 24 05:27 removable
-rw-r--r-- 1 root root 65536 Aug 24 05:27 state
lrwxrwxrwx 1 root root 0 Aug 24 05:25 subsystem -> ../../../../bus/memory
-rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent
-r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones
The same applies in the node's directory with a memory21 link in both
the node1 and node2's directory.
This is wrong but doesn't prevent the system to run. However when
later, one of these memory blocks is hot-unplugged and then hot-plugged,
the system is detecting an inconsistency in the sysfs layout and a
BUG_ON() is raised:
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
Call Trace:
add_memory_resource+0x23c/0x340 (unreliable)
__add_memory+0x5c/0xf0
dlpar_add_lmb+0x1b4/0x500
dlpar_memory+0x1f8/0xb80
handle_dlpar_errorlog+0xc0/0x190
dlpar_store+0x198/0x4a0
kobj_attr_store+0x30/0x50
sysfs_kf_write+0x64/0x90
kernfs_fop_write+0x1b0/0x290
vfs_write+0xe8/0x290
ksys_write+0xdc/0x130
system_call_exception+0x160/0x270
system_call_common+0xf0/0x27c
This has been seen on PowerPC LPAR.
The root cause of this issue is that when node's memory is registered,
the range used can overlap another node's range, thus the memory block
is registered to multiple nodes in sysfs.
There are two issues here:
(a) The sysfs memory and node's layouts are broken due to these
multiple links
(b) The link errors in link_mem_sections() should not lead to a system
panic.
To address (a) register_mem_sect_under_node should not rely on the
system state to detect whether the link operation is triggered by a hot
plug operation or not. This is addressed by the patches 1 and 2 of this
series.
Issue (b) will be addressed separately.
This patch (of 2):
The memmap_context enum is used to detect whether a memory operation is
due to a hot-add operation or happening at boot time.
Make it general to the hotplug operation and rename it as
meminit_context.
There is no functional change introduced by this patch
Suggested-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J . Wysocki" <rafael(a)kernel.org>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: <stable(a)vger.kernel.org> # 5.4.y
Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com
Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
arch/ia64/mm/init.c | 6 +++---
include/linux/mm.h | 2 +-
include/linux/mmzone.h | 11 ++++++++---
mm/memory_hotplug.c | 2 +-
mm/page_alloc.c | 10 +++++-----
5 files changed, 18 insertions(+), 13 deletions(-)
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index a6dd80a2c939..ee50506d86f4 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -518,7 +518,7 @@ virtual_memmap_init(u64 start, u64 end, void *arg)
if (map_start < map_end)
memmap_init_zone((unsigned long)(map_end - map_start),
args->nid, args->zone, page_to_pfn(map_start),
- MEMMAP_EARLY, NULL);
+ MEMINIT_EARLY, NULL);
return 0;
}
@@ -527,8 +527,8 @@ memmap_init (unsigned long size, int nid, unsigned long zone,
unsigned long start_pfn)
{
if (!vmem_map) {
- memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY,
- NULL);
+ memmap_init_zone(size, nid, zone, start_pfn,
+ MEMINIT_EARLY, NULL);
} else {
struct page *start;
struct memmap_init_callback_data args;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3285dae06c03..34119f393a80 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2208,7 +2208,7 @@ static inline void zero_resv_unavail(void) {}
extern void set_dma_reserve(unsigned long new_dma_reserve);
extern void memmap_init_zone(unsigned long, int, unsigned long, unsigned long,
- enum memmap_context, struct vmem_altmap *);
+ enum meminit_context, struct vmem_altmap *);
extern void setup_per_zone_wmarks(void);
extern int __meminit init_per_zone_wmark_min(void);
extern void mem_init(void);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 85804ba62215..a90aba3d6afb 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -822,10 +822,15 @@ bool zone_watermark_ok(struct zone *z, unsigned int order,
unsigned int alloc_flags);
bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
unsigned long mark, int classzone_idx);
-enum memmap_context {
- MEMMAP_EARLY,
- MEMMAP_HOTPLUG,
+/*
+ * Memory initialization context, use to differentiate memory added by
+ * the platform statically or via memory hotplug interface.
+ */
+enum meminit_context {
+ MEMINIT_EARLY,
+ MEMINIT_HOTPLUG,
};
+
extern void init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
unsigned long size);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 3eb0b311b4a1..6a4b3a01e1b6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -725,7 +725,7 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
* are reserved so nobody should be touching them so we should be safe
*/
memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn,
- MEMMAP_HOTPLUG, altmap);
+ MEMINIT_HOTPLUG, altmap);
set_zone_contiguous(zone);
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 67a9943aa595..373ca5780758 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5875,7 +5875,7 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
* done. Non-atomic initialization, single-pass.
*/
void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
- unsigned long start_pfn, enum memmap_context context,
+ unsigned long start_pfn, enum meminit_context context,
struct vmem_altmap *altmap)
{
unsigned long pfn, end_pfn = start_pfn + size;
@@ -5907,7 +5907,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
* There can be holes in boot-time mem_map[]s handed to this
* function. They do not exist on hotplugged memory.
*/
- if (context == MEMMAP_EARLY) {
+ if (context == MEMINIT_EARLY) {
if (!early_pfn_valid(pfn))
continue;
if (!early_pfn_in_nid(pfn, nid))
@@ -5920,7 +5920,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
page = pfn_to_page(pfn);
__init_single_page(page, pfn, zone, nid);
- if (context == MEMMAP_HOTPLUG)
+ if (context == MEMINIT_HOTPLUG)
__SetPageReserved(page);
/*
@@ -6002,7 +6002,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
* check here not to call set_pageblock_migratetype() against
* pfn out of zone.
*
- * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
+ * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
* because this is done early in section_activate()
*/
if (!(pfn & (pageblock_nr_pages - 1))) {
@@ -6028,7 +6028,7 @@ static void __meminit zone_init_free_lists(struct zone *zone)
void __meminit __weak memmap_init(unsigned long size, int nid,
unsigned long zone, unsigned long start_pfn)
{
- memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY, NULL);
+ memmap_init_zone(size, nid, zone, start_pfn, MEMINIT_EARLY, NULL);
}
static int zone_batchsize(struct zone *zone)
--
2.28.0
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: bff7fa0ebc47 - clocksource/drivers/timer-ti-dm: Do reset before enable
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 4:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
x86_64:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 85bb317be4e5 - dm: fix bio splitting and its bio completion order for regular IO
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
x86_64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
This is the start of the stable review cycle for the 4.19.148 release.
There are 37 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 27 Sep 2020 12:47:02 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.148-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.148-rc1
Lukas Wunner <lukas(a)wunner.de>
serial: 8250: Avoid error message on reprobe
Priyaranjan Jha <priyarjha(a)google.com>
tcp_bbr: adapt cwnd based on ack aggregation estimation
Priyaranjan Jha <priyarjha(a)google.com>
tcp_bbr: refactor bbr_target_cwnd() for general inflight provisioning
Xunlei Pang <xlpang(a)linux.alibaba.com>
mm: memcg: fix memcg reclaim soft lockup
Masahiro Yamada <masahiroy(a)kernel.org>
kbuild: support LLVM=1 to switch the default tools to Clang/LLVM
Masahiro Yamada <masahiroy(a)kernel.org>
kbuild: replace AS=clang with LLVM_IAS=1
Masahiro Yamada <masahiroy(a)kernel.org>
kbuild: remove AS variable
Dmitry Golovin <dima(a)golovin.in>
x86/boot: kbuild: allow readelf executable to be specified
Masahiro Yamada <masahiroy(a)kernel.org>
net: wan: wanxl: use $(M68KCC) instead of $(M68KAS) for rebuilding firmware
Masahiro Yamada <masahiroy(a)kernel.org>
net: wan: wanxl: use allow to pass CROSS_COMPILE_M68k for rebuilding firmware
Fangrui Song <maskray(a)google.com>
Documentation/llvm: fix the name of llvm-size
Nick Desaulniers <ndesaulniers(a)google.com>
Documentation/llvm: add documentation on building w/ Clang/LLVM
Vasily Gorbik <gor(a)linux.ibm.com>
kbuild: add OBJSIZE variable for the size tool
Nick Desaulniers <ndesaulniers(a)google.com>
MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info
David Ahern <dsahern(a)kernel.org>
ipv4: Update exception handling for multipath routes via same device
Eric Dumazet <edumazet(a)google.com>
net: add __must_check to skb_put_padto()
Eric Dumazet <edumazet(a)google.com>
net: qrtr: check skb_put_padto() return value
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: Avoid NPD upon phy_detach() when driver is unbound
Michael Chan <michael.chan(a)broadcom.com>
bnxt_en: Protect bnxt_set_eee() and bnxt_set_pauseparam() with mutex.
Edwin Peer <edwin.peer(a)broadcom.com>
bnxt_en: return proper error codes in bnxt_show_temp
Xin Long <lucien.xin(a)gmail.com>
tipc: use skb_unshare() instead in tipc_buf_append()
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
tipc: fix shutdown() of connection oriented socket
Peilin Ye <yepeilin.cs(a)gmail.com>
tipc: Fix memory leak in tipc_group_create_member()
Jakub Kicinski <kuba(a)kernel.org>
nfp: use correct define to return NONE fec
Yunsheng Lin <linyunsheng(a)huawei.com>
net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC
Linus Walleij <linus.walleij(a)linaro.org>
net: dsa: rtl8366: Properly clear member config
Petr Machata <petrm(a)nvidia.com>
net: DCB: Validate DCB_ATTR_DCB_BUFFER argument
Eric Dumazet <edumazet(a)google.com>
ipv6: avoid lockdep issue in fib6_del()
Wei Wang <weiwan(a)google.com>
ip: fix tos reflection in ack and reset packets
Dan Carpenter <dan.carpenter(a)oracle.com>
hdlc_ppp: add range checks in ppp_cp_parse_cr()
Mark Gray <mark.d.gray(a)redhat.com>
geneve: add transport ports in route lookup for geneve
Ganji Aravind <ganji.aravind(a)chelsio.com>
cxgb4: Fix offset when clearing filter byte counters
Ralph Campbell <rcampbell(a)nvidia.com>
mm/thp: fix __split_huge_pmd_locked() for migration PMD
Muchun Song <songmuchun(a)bytedance.com>
kprobes: fix kill kprobe which has been marked as gone
Rustam Kovhaev <rkovhaev(a)gmail.com>
KVM: fix memory leak in kvm_io_bus_unregister_dev()
Mark Salyzyn <salyzyn(a)android.com>
af_key: pfkey_dump needs parameter validation
-------------
Diffstat:
Documentation/kbuild/llvm.rst | 87 ++++++++++
MAINTAINERS | 9 ++
Makefile | 40 +++--
arch/x86/boot/compressed/Makefile | 2 +-
drivers/net/dsa/rtl8366.c | 20 ++-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 19 ++-
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 31 ++--
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 9 +-
.../net/ethernet/netronome/nfp/nfp_net_ethtool.c | 4 +-
drivers/net/geneve.c | 37 +++--
drivers/net/phy/phy_device.c | 3 +-
drivers/net/wan/Kconfig | 2 +-
drivers/net/wan/Makefile | 12 +-
drivers/net/wan/hdlc_ppp.c | 16 +-
drivers/tty/serial/8250/8250_core.c | 11 +-
include/linux/skbuff.h | 7 +-
include/net/inet_connection_sock.h | 4 +-
kernel/kprobes.c | 9 +-
mm/huge_memory.c | 40 +++--
mm/vmscan.c | 8 +
net/dcb/dcbnl.c | 8 +
net/ipv4/ip_output.c | 3 +-
net/ipv4/route.c | 11 +-
net/ipv4/tcp_bbr.c | 180 ++++++++++++++++++---
net/ipv6/Kconfig | 1 +
net/ipv6/ip6_fib.c | 13 +-
net/key/af_key.c | 7 +
net/qrtr/qrtr.c | 20 +--
net/sched/sch_generic.c | 49 ++++--
net/tipc/group.c | 14 +-
net/tipc/msg.c | 3 +-
net/tipc/socket.c | 5 +-
tools/objtool/Makefile | 6 +
virt/kvm/kvm_main.c | 21 +--
34 files changed, 550 insertions(+), 161 deletions(-)
Be consistent and use unsigned long throughout the chunk copies to
avoid the inherent clumsiness of mixing integer types of different
widths and signs. Failing to take acount of a wider unsigned type when
using min_t can lead to treating it as a negative, only for it flip back
to a large unsigned value after passing a boundary check.
Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria(a)intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria(a)intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v4.9+
---
The alternative would be to use u32 throughout, but that would also mean
keeping the min_t(u32, ...). unsigned long decouples the mechanism from
the API limits, so long as we remember to enforce that the mechanism
copes with the entire range of the API.
---
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 7 +++++--
drivers/gpu/drm/i915/i915_cmd_parser.c | 10 +++++-----
drivers/gpu/drm/i915/i915_drv.h | 4 ++--
3 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 5509946f1a1d..4b09bcd70cf4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2267,8 +2267,8 @@ struct eb_parse_work {
struct i915_vma *batch;
struct i915_vma *shadow;
struct i915_vma *trampoline;
- unsigned int batch_offset;
- unsigned int batch_length;
+ unsigned long batch_offset;
+ unsigned long batch_length;
};
static int __eb_parse(struct dma_fence_work *work)
@@ -2338,6 +2338,9 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
struct eb_parse_work *pw;
int err;
+ GEM_BUG_ON(overflows_type(eb->batch_start_offset, pw->batch_offset));
+ GEM_BUG_ON(overflows_type(eb->batch_len, pw->batch_length));
+
pw = kzalloc(sizeof(*pw), GFP_KERNEL);
if (!pw)
return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 5ac4a999f05a..e88970256e8e 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1136,7 +1136,7 @@ find_reg(const struct intel_engine_cs *engine, u32 addr)
/* Returns a vmap'd pointer to dst_obj, which the caller must unmap */
static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
struct drm_i915_gem_object *src_obj,
- u32 offset, u32 length)
+ unsigned long offset, unsigned long length)
{
bool needs_clflush;
void *dst, *src;
@@ -1166,8 +1166,8 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
}
}
if (IS_ERR(src)) {
+ unsigned long x, n;
void *ptr;
- int x, n;
/*
* We can avoid clflushing partial cachelines before the write
@@ -1184,7 +1184,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
ptr = dst;
x = offset_in_page(offset);
for (n = offset >> PAGE_SHIFT; length; n++) {
- int len = min_t(int, length, PAGE_SIZE - x);
+ int len = min(length, PAGE_SIZE - x);
src = kmap_atomic(i915_gem_object_get_page(src_obj, n));
if (needs_clflush)
@@ -1414,8 +1414,8 @@ static bool shadow_needs_clflush(struct drm_i915_gem_object *obj)
*/
int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct i915_vma *batch,
- u32 batch_offset,
- u32 batch_length,
+ unsigned long batch_offset,
+ unsigned long batch_length,
struct i915_vma *shadow,
bool trampoline)
{
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 72a9449b674e..eef9a821c49c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1949,8 +1949,8 @@ void intel_engine_init_cmd_parser(struct intel_engine_cs *engine);
void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine);
int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct i915_vma *batch,
- u32 batch_offset,
- u32 batch_length,
+ unsigned long batch_offset,
+ unsigned long batch_length,
struct i915_vma *shadow,
bool trampoline);
#define I915_CMD_PARSER_TRAMPOLINE_SIZE 8
--
2.20.1
Fix a newly introduced build breakge in drm-misc-next
Patch 1/2 should be applied to drm-misc-next
Also fix a warning in the same driver - the warning is present in v5.8
Patch 2/2 is a drm-misc-fixes candidate
Sam
Sam Ravnborg (2):
drm/rockchip: fix build due to undefined drm_gem_cma_vm_ops
drm/rockchip: fix warning from cdn_dp_resume
drivers/gpu/drm/rockchip/cdn-dp-core.c | 2 +-
drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
Commit c4ad98e4b72cb5be30ea282fce935248f2300e62 upstream.
KVM currently assumes that an instruction abort can never be a write.
This is in general true, except when the abort is triggered by
a S1PTW on instruction fetch that tries to update the S1 page tables
(to set AF, for example).
This can happen if the page tables have been paged out and brought
back in without seeing a direct write to them (they are thus marked
read only), and the fault handling code will make the PT executable(!)
instead of writable. The guest gets stuck forever.
In these conditions, the permission fault must be considered as
a write so that the Stage-1 update can take place. This is essentially
the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM:
Take S1 walks into account when determining S2 write faults").
Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce
kvm_vcpu_trap_is_exec_fault() that only return true when no faulting
on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to
kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't
specific to data abort.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Reviewed-by: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200915104218.1284701-2-maz@kernel.org
---
arch/arm64/include/asm/kvm_emulate.h | 12 ++++++++++--
arch/arm64/kvm/hyp/switch.c | 2 +-
arch/arm64/kvm/mmio.c | 2 +-
arch/arm64/kvm/mmu.c | 2 +-
4 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 4d0f8ea600ba..e1254e55835b 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -319,7 +319,7 @@ static __always_inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu)
return (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SRT_MASK) >> ESR_ELx_SRT_SHIFT;
}
-static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_abt_iss1tw(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_S1PTW);
}
@@ -327,7 +327,7 @@ static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
static __always_inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WNR) ||
- kvm_vcpu_dabt_iss1tw(vcpu); /* AF/DBM update */
+ kvm_vcpu_abt_iss1tw(vcpu); /* AF/DBM update */
}
static inline bool kvm_vcpu_dabt_is_cm(const struct kvm_vcpu *vcpu)
@@ -356,6 +356,11 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
}
+static inline bool kvm_vcpu_trap_is_exec_fault(const struct kvm_vcpu *vcpu)
+{
+ return kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu);
+}
+
static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
{
return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_FSC;
@@ -393,6 +398,9 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
{
+ if (kvm_vcpu_abt_iss1tw(vcpu))
+ return true;
+
if (kvm_vcpu_trap_is_iabt(vcpu))
return false;
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index ba225e09aaf1..8564742948d3 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -599,7 +599,7 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
kvm_vcpu_dabt_isvalid(vcpu) &&
!kvm_vcpu_dabt_isextabt(vcpu) &&
- !kvm_vcpu_dabt_iss1tw(vcpu);
+ !kvm_vcpu_abt_iss1tw(vcpu);
if (valid) {
int ret = __vgic_v2_perform_cpuif_access(vcpu);
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index 4e0366759726..07e9b6eab59e 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -146,7 +146,7 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
}
/* Page table accesses IO mem: tell guest to fix its TTBR */
- if (kvm_vcpu_dabt_iss1tw(vcpu)) {
+ if (kvm_vcpu_abt_iss1tw(vcpu)) {
kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
return 1;
}
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index d906350d543d..1677107b74de 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1845,7 +1845,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
unsigned long vma_pagesize, flags = 0;
write_fault = kvm_is_write_fault(vcpu);
- exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
+ exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
VM_BUG_ON(write_fault && exec_fault);
if (fault_status == FSC_PERM && !write_fault && !exec_fault) {
--
2.28.0
Dear stable kernel maintainers,
Please consider applying the following patches to 4.14, 4.9, and 4.4
implementing stpcpy, which is required when building w/ clang-12 and
newer. These are manual backports of patches already accepted into
stable earlier today, but for branches which they do not cherry pick
back cleanly, due to us not backporting commit 458a3bf82df4
("lib/string: Add strscpy_pad() function") which landed in v5.2-rc1.
Thanks to Nathan Chancellor for taking the time to backport these.
The 4.9 and 4.4 patches look the same to me (other than the base).
(We've been running them under CI to ensure our builds stay green. I
would like to drop them.)
--
Thanks,
~Nick Desaulniers
This is the start of the stable review cycle for the 4.19.141 release.
There are 92 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.141-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.141-rc1
Sandeep Raghuraman <sandy.8925(a)gmail.com>
drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume
Marius Iacob <themariusus(a)gmail.com>
drm: Added orientation quirk for ASUS tablet model T103HAF
Denis Efremov <efremov(a)linux.com>
drm/radeon: fix fb_div check in ni_init_smc_spll_table()
Tomasz Maciej Nowak <tmn505(a)gmail.com>
arm64: dts: marvell: espressobin: add ethernet alias
Hugh Dickins <hughd(a)google.com>
khugepaged: retract_page_tables() remember to test exit
Geert Uytterhoeven <geert+renesas(a)glider.be>
sh: landisk: Add missing initialization of sh_io_port_base
Daniel Díaz <daniel.diaz(a)linaro.org>
tools build feature: Quote CC and CXX for their arguments
Vincent Whitchurch <vincent.whitchurch(a)axis.com>
perf bench mem: Always memset source before memcpy
Dinghao Liu <dinghao.liu(a)zju.edu.cn>
ALSA: echoaudio: Fix potential Oops in snd_echo_resume()
Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
mfd: dln2: Run event handler loop under spinlock
Tiezhu Yang <yangtiezhu(a)loongson.cn>
test_kmod: avoid potential double free in trigger_config_run_type()
Colin Ian King <colin.king(a)canonical.com>
fs/ufs: avoid potential u32 multiplication overflow
Eric Biggers <ebiggers(a)google.com>
fs/minix: remove expected error message in block_to_path()
Eric Biggers <ebiggers(a)google.com>
fs/minix: fix block limit check for V1 filesystems
Eric Biggers <ebiggers(a)google.com>
fs/minix: set s_maxbytes correctly
Jeffrey Mitchell <jeffrey.mitchell(a)starlab.io>
nfs: Fix getxattr kernel panic and memory overflow
Wang Hai <wanghai38(a)huawei.com>
net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init
Dan Carpenter <dan.carpenter(a)oracle.com>
drm/vmwgfx: Fix two list_for_each loop exit tests
Dan Carpenter <dan.carpenter(a)oracle.com>
drm/vmwgfx: Use correct vmw_legacy_display_unit pointer
Colin Ian King <colin.king(a)canonical.com>
Input: sentelic - fix error return when fsp_reg_write fails
Krzysztof Sobota <krzysztof.sobota(a)nokia.com>
watchdog: initialize device before misc_register
Ewan D. Milne <emilne(a)redhat.com>
scsi: lpfc: nvmet: Avoid hang / use-after-free again when destroying targetport
Stafford Horne <shorne(a)gmail.com>
openrisc: Fix oops caused when dumping stack
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
i2c: rcar: avoid race when unregistering slave
Thomas Hebb <tommyhebb(a)gmail.com>
tools build feature: Use CC and CXX from parent
Rayagonda Kokatanur <rayagonda.kokatanur(a)broadcom.com>
pwm: bcm-iproc: handle clk_get_rate() return
Xu Wang <vulab(a)iscas.ac.cn>
clk: clk-atlas6: fix return value check in atlas6_clk_init()
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
i2c: rcar: slave: only send STOP event when we have been addressed
Liu Yi L <yi.l.liu(a)intel.com>
iommu/vt-d: Enforce PASID devTLB field mask
Colin Ian King <colin.king(a)canonical.com>
iommu/omap: Check for failure of a call to omap_iommu_dump_ctx
Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
selftests/powerpc: ptrace-pkey: Don't update expected UAMOR value
Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
selftests/powerpc: ptrace-pkey: Update the test to mark an invalid pkey correctly
Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
selftests/powerpc: ptrace-pkey: Rename variables to make it easier to follow code
Ming Lei <ming.lei(a)redhat.com>
dm rq: don't call blk_mq_queue_stopped() in dm_stop_queue()
Steve Longerbeam <slongerbeam(a)gmail.com>
gpu: ipu-v3: image-convert: Combine rotate/no-rotate irq handlers
Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
mmc: renesas_sdhi_internal_dmac: clean up the code for dma complete
Johan Hovold <johan(a)kernel.org>
USB: serial: ftdi_sio: fix break and sysrq handling
Johan Hovold <johan(a)kernel.org>
USB: serial: ftdi_sio: clean up receive processing
Johan Hovold <johan(a)kernel.org>
USB: serial: ftdi_sio: make process-packet buffer unsigned
Paul Kocialkowski <paul.kocialkowski(a)bootlin.com>
media: rockchip: rga: Only set output CSC mode for RGB input
Paul Kocialkowski <paul.kocialkowski(a)bootlin.com>
media: rockchip: rga: Introduce color fmt macros and refactor CSC mode logic
Jason Gunthorpe <jgg(a)nvidia.com>
RDMA/ipoib: Fix ABBA deadlock with ipoib_reap_ah()
Kamal Heib <kamalheib1(a)gmail.com>
RDMA/ipoib: Return void from ipoib_ib_dev_stop()
Charles Keepax <ckeepax(a)opensource.cirrus.com>
mfd: arizona: Ensure 32k clock is put on driver unbind and error
Liu Ying <victor.liu(a)nxp.com>
drm/imx: imx-ldb: Disable both channels for split mode in enc->disable()
Sibi Sankar <sibis(a)codeaurora.org>
remoteproc: qcom: q6v5: Update running state before requesting stop
Adrian Hunter <adrian.hunter(a)intel.com>
perf intel-pt: Fix FUP packet state
Kees Cook <keescook(a)chromium.org>
module: Correctly truncate sysfs sections output
Anton Blanchard <anton(a)ozlabs.org>
pseries: Fix 64 bit logical memory block panic
Ahmad Fatoum <a.fatoum(a)pengutronix.de>
watchdog: f71808e_wdt: clear watchdog timeout occurred flag
Ahmad Fatoum <a.fatoum(a)pengutronix.de>
watchdog: f71808e_wdt: remove use of wrong watchdog_info option
Ahmad Fatoum <a.fatoum(a)pengutronix.de>
watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.options
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
tracing: Use trace_sched_process_free() instead of exit() for pid tracing
Kevin Hao <haokexin(a)gmail.com>
tracing/hwlat: Honor the tracing_cpumask
Muchun Song <songmuchun(a)bytedance.com>
kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler
Chengming Zhou <zhouchengming(a)bytedance.com>
ftrace: Setup correct FTRACE_FL_REGS flags for module
Michal Koutný <mkoutny(a)suse.com>
mm/page_counter.c: fix protection usage propagation
Junxiao Bi <junxiao.bi(a)oracle.com>
ocfs2: change slot number type s16 to u16
Mikulas Patocka <mpatocka(a)redhat.com>
ext2: fix missing percpu_counter_inc
Huacai Chen <chenhc(a)lemote.com>
MIPS: CPU#0 is not hotpluggable
Lukas Wunner <lukas(a)wunner.de>
driver core: Avoid binding drivers to dead devices
Johannes Berg <johannes.berg(a)intel.com>
mac80211: fix misplaced while instead of if
Coly Li <colyli(a)suse.de>
bcache: fix overflow in offset_to_stripe()
Coly Li <colyli(a)suse.de>
bcache: allocate meta data pages as compound pages
ChangSyun Peng <allenpeng(a)synology.com>
md/raid5: Fix Force reconstruct-write io stuck in degraded raid5
Kees Cook <keescook(a)chromium.org>
net/compat: Add missing sock updates for SCM_RIGHTS
Jonathan McDowell <noodles(a)earth.li>
net: stmmac: dwmac1000: provide multicast filter fallback
Jonathan McDowell <noodles(a)earth.li>
net: ethernet: stmmac: Disable hardware multicast filter
Eugeniu Rosca <erosca(a)de.adit-jv.com>
media: vsp1: dl: Fix NULL pointer dereference on unbind
Michael Ellerman <mpe(a)ellerman.id.au>
powerpc: Fix circular dependency between percpu.h and mmu.h
Michael Ellerman <mpe(a)ellerman.id.au>
powerpc: Allow 4224 bytes of stack expansion for the signal frame
Paul Aurich <paul(a)darkrain42.org>
cifs: Fix leak when handling lease break for cached root fid
Max Filippov <jcmvbkbc(a)gmail.com>
xtensa: fix xtensa_pmu_setup prototype
Alexandru Ardelean <alexandru.ardelean(a)analog.com>
iio: dac: ad5592r: fix unbalanced mutex unlocks in ad5592r_read_raw()
Christian Eggers <ceggers(a)arri.de>
dt-bindings: iio: io-channel-mux: Fix compatible string in example code
Pavel Machek <pavel(a)denx.de>
btrfs: fix return value mixup in btrfs_get_extent
Filipe Manana <fdmanana(a)suse.com>
btrfs: fix memory leaks after failure to lookup checksums during inode logging
Josef Bacik <josef(a)toxicpanda.com>
btrfs: only search for left_info if there is no right_info in try_merge_free_space
David Sterba <dsterba(a)suse.com>
btrfs: fix messages after changing compression level by remount
Josef Bacik <josef(a)toxicpanda.com>
btrfs: open device without device_list_mutex
Anand Jain <anand.jain(a)oracle.com>
btrfs: don't traverse into the seed devices in show_devname
Tom Rix <trix(a)redhat.com>
btrfs: ref-verify: fix memory leak in add_block_entry
Qu Wenruo <wqu(a)suse.com>
btrfs: don't allocate anonymous block device for user invisible roots
Qu Wenruo <wqu(a)suse.com>
btrfs: free anon block device right after subvolume deletion
Bjorn Helgaas <bhelgaas(a)google.com>
PCI: Probe bridge window attributes once at enumeration-time
Ansuel Smith <ansuelsmth(a)gmail.com>
PCI: qcom: Add support for tx term offset for rev 2.1.0
Ansuel Smith <ansuelsmth(a)gmail.com>
PCI: qcom: Define some PARF params needed for ipq8064 SoC
Rajat Jain <rajatja(a)google.com>
PCI: Add device even if driver attach failed
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context()
Thomas Gleixner <tglx(a)linutronix.de>
genirq/affinity: Make affinity setting if activated opt-in
Steve French <stfrench(a)microsoft.com>
smb3: warn on confusing error scenario with sec=krb5
-------------
Diffstat:
.../bindings/iio/multiplexer/io-channel-mux.txt | 2 +-
Makefile | 4 +-
.../boot/dts/marvell/armada-3720-espressobin.dts | 6 ++
arch/mips/kernel/topology.c | 2 +-
arch/openrisc/kernel/stacktrace.c | 18 +++++-
arch/powerpc/include/asm/percpu.h | 4 +-
arch/powerpc/mm/fault.c | 7 ++-
arch/powerpc/platforms/pseries/hotplug-memory.c | 2 +-
arch/sh/boards/mach-landisk/setup.c | 3 +
arch/x86/kernel/apic/vector.c | 4 ++
arch/xtensa/kernel/perf_event.c | 2 +-
drivers/base/dd.c | 4 +-
drivers/clk/sirf/clk-atlas6.c | 2 +-
drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c | 5 +-
drivers/gpu/drm/drm_panel_orientation_quirks.c | 6 ++
drivers/gpu/drm/imx/imx-ldb.c | 7 ++-
drivers/gpu/drm/radeon/ni_dpm.c | 2 +-
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 8 +--
drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c | 5 +-
drivers/gpu/ipu-v3/ipu-image-convert.c | 58 ++++++-----------
drivers/i2c/busses/i2c-rcar.c | 15 +++--
drivers/iio/dac/ad5592r-base.c | 4 +-
drivers/infiniband/ulp/ipoib/ipoib.h | 2 +-
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 67 +++++++++-----------
drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +
drivers/input/mouse/sentelic.c | 2 +-
drivers/iommu/omap-iommu-debug.c | 3 +
drivers/irqchip/irq-gic-v3-its.c | 5 +-
drivers/md/bcache/bcache.h | 2 +-
drivers/md/bcache/bset.c | 2 +-
drivers/md/bcache/btree.c | 2 +-
drivers/md/bcache/journal.c | 4 +-
drivers/md/bcache/super.c | 2 +-
drivers/md/bcache/writeback.c | 14 +++--
drivers/md/bcache/writeback.h | 19 +++++-
drivers/md/dm-rq.c | 3 -
drivers/md/raid5.c | 3 +-
drivers/media/platform/rockchip/rga/rga-hw.c | 29 +++++----
drivers/media/platform/rockchip/rga/rga-hw.h | 5 ++
drivers/media/platform/vsp1/vsp1_dl.c | 2 +
drivers/mfd/arizona-core.c | 18 ++++++
drivers/mfd/dln2.c | 4 ++
drivers/mmc/host/renesas_sdhi_internal_dmac.c | 18 ++++--
drivers/net/ethernet/qualcomm/emac/emac.c | 17 ++++-
.../net/ethernet/stmicro/stmmac/dwmac-ipq806x.c | 1 +
.../net/ethernet/stmicro/stmmac/dwmac1000_core.c | 3 +
drivers/pci/bus.c | 6 +-
drivers/pci/controller/dwc/pcie-qcom.c | 41 +++++++++++-
drivers/pci/hotplug/acpiphp_glue.c | 14 ++++-
drivers/pci/probe.c | 52 +++++++++++++++
drivers/pci/quirks.c | 5 +-
drivers/pci/setup-bus.c | 45 ++-----------
drivers/pwm/pwm-bcm-iproc.c | 9 ++-
drivers/remoteproc/qcom_q6v5.c | 2 +
drivers/scsi/lpfc/lpfc_nvmet.c | 2 +-
drivers/usb/serial/ftdi_sio.c | 57 ++++++++++-------
drivers/watchdog/f71808e_wdt.c | 13 +++-
drivers/watchdog/watchdog_dev.c | 18 +++---
fs/btrfs/disk-io.c | 13 +++-
fs/btrfs/free-space-cache.c | 4 +-
fs/btrfs/inode.c | 4 +-
fs/btrfs/ref-verify.c | 2 +
fs/btrfs/super.c | 35 +++++------
fs/btrfs/tree-log.c | 8 +--
fs/btrfs/volumes.c | 21 ++++++-
fs/cifs/smb2misc.c | 73 +++++++++++++++-------
fs/cifs/smb2pdu.c | 2 +
fs/ext2/ialloc.c | 3 +-
fs/minix/inode.c | 12 ++--
fs/minix/itree_v1.c | 12 ++--
fs/minix/itree_v2.c | 13 ++--
fs/minix/minix.h | 1 -
fs/nfs/nfs4proc.c | 2 -
fs/nfs/nfs4xdr.c | 6 +-
fs/ocfs2/ocfs2.h | 4 +-
fs/ocfs2/suballoc.c | 4 +-
fs/ocfs2/super.c | 4 +-
fs/ufs/super.c | 2 +-
include/linux/intel-iommu.h | 4 +-
include/linux/irq.h | 13 ++++
include/linux/pci.h | 3 +
include/net/sock.h | 4 ++
kernel/irq/manage.c | 6 +-
kernel/kprobes.c | 7 +++
kernel/module.c | 22 ++++++-
kernel/trace/ftrace.c | 15 +++--
kernel/trace/trace_events.c | 4 +-
kernel/trace/trace_hwlat.c | 5 +-
lib/test_kmod.c | 2 +-
mm/khugepaged.c | 22 ++++---
mm/page_counter.c | 6 +-
net/compat.c | 1 +
net/core/sock.c | 21 +++++++
net/mac80211/sta_info.c | 2 +-
sound/pci/echoaudio/echoaudio.c | 2 -
tools/build/Makefile.feature | 2 +-
tools/build/feature/Makefile | 2 -
tools/perf/bench/mem-functions.c | 21 ++++---
.../perf/util/intel-pt-decoder/intel-pt-decoder.c | 21 +++----
.../testing/selftests/powerpc/ptrace/ptrace-pkey.c | 55 ++++++++--------
100 files changed, 715 insertions(+), 413 deletions(-)
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 3f98abc01b28 - dm: fix bio splitting and its bio completion order for regular IO
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 3:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
x86_64:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
From: Xie He <xie.he.0141(a)gmail.com>
[ Upstream commit 44a049c42681de71c783d75cd6e56b4e339488b0 ]
PVC devices are virtual devices in this driver stacked on top of the
actual HDLC device. They are the devices normal users would use.
PVC devices have two types: normal PVC devices and Ethernet-emulating
PVC devices.
When transmitting data with PVC devices, the ndo_start_xmit function
will prepend a header of 4 or 10 bytes. Currently this driver requests
this headroom to be reserved for normal PVC devices by setting their
hard_header_len to 10. However, this does not work when these devices
are used with AF_PACKET/RAW sockets. Also, this driver does not request
this headroom for Ethernet-emulating PVC devices (but deals with this
problem by reallocating the skb when needed, which is not optimal).
This patch replaces hard_header_len with needed_headroom, and set
needed_headroom for Ethernet-emulating PVC devices, too. This makes
the driver to request headroom for all PVC devices in all cases.
Cc: Krzysztof Halasa <khc(a)pm.waw.pl>
Cc: Martin Schiller <ms(a)dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141(a)gmail.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/wan/hdlc_fr.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c
index 78596e42a3f3f..a09af49cd08b9 100644
--- a/drivers/net/wan/hdlc_fr.c
+++ b/drivers/net/wan/hdlc_fr.c
@@ -1045,7 +1045,7 @@ static void pvc_setup(struct net_device *dev)
{
dev->type = ARPHRD_DLCI;
dev->flags = IFF_POINTOPOINT;
- dev->hard_header_len = 10;
+ dev->hard_header_len = 0;
dev->addr_len = 2;
netif_keep_dst(dev);
}
@@ -1097,6 +1097,7 @@ static int fr_add_pvc(struct net_device *frad, unsigned int dlci, int type)
dev->mtu = HDLC_MAX_MTU;
dev->min_mtu = 68;
dev->max_mtu = HDLC_MAX_MTU;
+ dev->needed_headroom = 10;
dev->priv_flags |= IFF_NO_QUEUE;
dev->ml_priv = pvc;
--
2.25.1
From: Xie He <xie.he.0141(a)gmail.com>
[ Upstream commit 44a049c42681de71c783d75cd6e56b4e339488b0 ]
PVC devices are virtual devices in this driver stacked on top of the
actual HDLC device. They are the devices normal users would use.
PVC devices have two types: normal PVC devices and Ethernet-emulating
PVC devices.
When transmitting data with PVC devices, the ndo_start_xmit function
will prepend a header of 4 or 10 bytes. Currently this driver requests
this headroom to be reserved for normal PVC devices by setting their
hard_header_len to 10. However, this does not work when these devices
are used with AF_PACKET/RAW sockets. Also, this driver does not request
this headroom for Ethernet-emulating PVC devices (but deals with this
problem by reallocating the skb when needed, which is not optimal).
This patch replaces hard_header_len with needed_headroom, and set
needed_headroom for Ethernet-emulating PVC devices, too. This makes
the driver to request headroom for all PVC devices in all cases.
Cc: Krzysztof Halasa <khc(a)pm.waw.pl>
Cc: Martin Schiller <ms(a)dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141(a)gmail.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/net/wan/hdlc_fr.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c
index 038236a9c60ee..67f89917277ce 100644
--- a/drivers/net/wan/hdlc_fr.c
+++ b/drivers/net/wan/hdlc_fr.c
@@ -1044,7 +1044,7 @@ static void pvc_setup(struct net_device *dev)
{
dev->type = ARPHRD_DLCI;
dev->flags = IFF_POINTOPOINT;
- dev->hard_header_len = 10;
+ dev->hard_header_len = 0;
dev->addr_len = 2;
netif_keep_dst(dev);
}
@@ -1096,6 +1096,7 @@ static int fr_add_pvc(struct net_device *frad, unsigned int dlci, int type)
dev->mtu = HDLC_MAX_MTU;
dev->min_mtu = 68;
dev->max_mtu = HDLC_MAX_MTU;
+ dev->needed_headroom = 10;
dev->priv_flags |= IFF_NO_QUEUE;
dev->ml_priv = pvc;
--
2.25.1
From: Guo Ren <guoren(a)linux.alibaba.com>
[ Upstream commit bc6717d55d07110d8f3c6d31ec2af50c11b07091 ]
When the timer counts to the upper limit, an overflow interrupt is
generated, and the count is reset with the value in the TIME_INI
register. But the software expects to start counting from 0 when
the count overflows, so it forces TIME_INI to 0 to solve the
potential interrupt storm problem.
Signed-off-by: Guo Ren <guoren(a)linux.alibaba.com>
Tested-by: Xu Kai <xukai(a)nationalchip.com>
Cc: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Link: https://lore.kernel.org/r/1597735877-71115-1-git-send-email-guoren@kernel.o…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/clocksource/timer-gx6605s.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/clocksource/timer-gx6605s.c b/drivers/clocksource/timer-gx6605s.c
index 80d0939d040b5..8d386adbe8009 100644
--- a/drivers/clocksource/timer-gx6605s.c
+++ b/drivers/clocksource/timer-gx6605s.c
@@ -28,6 +28,7 @@ static irqreturn_t gx6605s_timer_interrupt(int irq, void *dev)
void __iomem *base = timer_of_base(to_timer_of(ce));
writel_relaxed(GX6605S_STATUS_CLR, base + TIMER_STATUS);
+ writel_relaxed(0, base + TIMER_INI);
ce->event_handler(ce);
--
2.25.1
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 05399b344982 - s390/zcrypt: Fix ZCRYPT_PERDEV_REQCNT ioctl
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
ppc64le:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 3:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 4:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
x86_64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ CPU: Frequency Driver Test
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ CPU: Frequency Driver Test
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 5:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ CPU: Frequency Driver Test
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Hi Sasha,
I noticed build error on latest stable-rc/queue/5.4 branch:
util/parse-events.c: In function 'parse_events_add_pmu':
util/parse-events.c:1373:11: error: 'struct perf_evsel_config_term'
has no member named 'free_str'
if (pos->free_str)
^~
In file included from util/parse-events.c:4:
util/parse-events.c:1374:20: error: 'union <anonymous>' has no member
named 'str'
zfree(&pos->val.str);
^
/<<PKGBUILDDIR>>/tools/include/linux/zalloc.h:10:38: note: in
definition of macro 'zfree'
#define zfree(ptr) __zfree((void **)(ptr))
revert both
d6ac1bd91161 ("perf parse-events: Fix incorrect conversion of 'if ()
free()' to 'zfree()'")
b6f48cb0c18b ("perf parse-events: Fix memory leaks found on parse_events")
Fixed the problem.
Thanks!
--
Jinpu Wang
Linux Kernel Developer
Application Support (IONOS Cloud)
1&1 IONOS SE | Greifswalder Str. 207 | 10405 Berlin | Germany
Phone:
E-mail: jinpu.wang(a)cloud.ionos.com | Web: www.ionos.de
Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 24498
Vorstand: Dr. Christian Böing, Hüseyin Dogan, Dr. Martin Endreß,
Hans-Henning Kettler, Arthur Mai, Matthias Steinberg, Achim Weiß
Aufsichtsratsvorsitzender: Markus Kadelke
Member of United Internet
Diese E-Mail kann vertrauliche und/oder gesetzlich geschützte
Informationen enthalten. Wenn Sie nicht der bestimmungsgemäße Adressat
sind oder diese E-Mail irrtümlich erhalten haben, unterrichten Sie
bitte den Absender und vernichten Sie diese E-Mail. Anderen als dem
bestimmungsgemäßen Adressaten ist untersagt, diese E-Mail zu
speichern, weiterzuleiten oder ihren Inhalt auf welche Weise auch
immer zu verwenden.
This e-mail may contain confidential and/or privileged information. If
you are not the intended recipient of this e-mail, you are hereby
notified that saving, distribution or use of the content of this
e-mail in any way is prohibited. If you have received this e-mail in
error, please notify the sender and delete the e-mail.
From: Jia He <justin.he(a)arm.com>
[ Upstream commit 83d116c53058d505ddef051e90ab27f57015b025 ]
When we tested pmdk unit test [1] vmmalloc_fork TEST3 on arm64 guest, there
will be a double page fault in __copy_from_user_inatomic of cow_user_page.
To reproduce the bug, the cmd is as follows after you deployed everything:
make -C src/test/vmmalloc_fork/ TEST_TIME=60m check
Below call trace is from arm64 do_page_fault for debugging purpose:
[ 110.016195] Call trace:
[ 110.016826] do_page_fault+0x5a4/0x690
[ 110.017812] do_mem_abort+0x50/0xb0
[ 110.018726] el1_da+0x20/0xc4
[ 110.019492] __arch_copy_from_user+0x180/0x280
[ 110.020646] do_wp_page+0xb0/0x860
[ 110.021517] __handle_mm_fault+0x994/0x1338
[ 110.022606] handle_mm_fault+0xe8/0x180
[ 110.023584] do_page_fault+0x240/0x690
[ 110.024535] do_mem_abort+0x50/0xb0
[ 110.025423] el0_da+0x20/0x24
The pte info before __copy_from_user_inatomic is (PTE_AF is cleared):
[ffff9b007000] pgd=000000023d4f8003, pud=000000023da9b003,
pmd=000000023d4b3003, pte=360000298607bd3
As told by Catalin: "On arm64 without hardware Access Flag, copying from
user will fail because the pte is old and cannot be marked young. So we
always end up with zeroed page after fork() + CoW for pfn mappings. we
don't always have a hardware-managed access flag on arm64."
This patch fixes it by calling pte_mkyoung. Also, the parameter is
changed because vmf should be passed to cow_user_page()
Add a WARN_ON_ONCE when __copy_from_user_inatomic() returns error
in case there can be some obscure use-case (by Kirill).
[1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork
Signed-off-by: Jia He <justin.he(a)arm.com>
Reported-by: Yibo Cai <Yibo.Cai(a)arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
mm/memory.c | 104 ++++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 89 insertions(+), 15 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index e9bce27bc18c3..07188929a30a1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -117,6 +117,18 @@ int randomize_va_space __read_mostly =
2;
#endif
+#ifndef arch_faults_on_old_pte
+static inline bool arch_faults_on_old_pte(void)
+{
+ /*
+ * Those arches which don't have hw access flag feature need to
+ * implement their own helper. By default, "true" means pagefault
+ * will be hit on old pte.
+ */
+ return true;
+}
+#endif
+
static int __init disable_randmaps(char *s)
{
randomize_va_space = 0;
@@ -2324,32 +2336,82 @@ static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
return same;
}
-static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
+static inline bool cow_user_page(struct page *dst, struct page *src,
+ struct vm_fault *vmf)
{
+ bool ret;
+ void *kaddr;
+ void __user *uaddr;
+ bool force_mkyoung;
+ struct vm_area_struct *vma = vmf->vma;
+ struct mm_struct *mm = vma->vm_mm;
+ unsigned long addr = vmf->address;
+
debug_dma_assert_idle(src);
+ if (likely(src)) {
+ copy_user_highpage(dst, src, addr, vma);
+ return true;
+ }
+
/*
* If the source page was a PFN mapping, we don't have
* a "struct page" for it. We do a best-effort copy by
* just copying from the original user address. If that
* fails, we just zero-fill it. Live with it.
*/
- if (unlikely(!src)) {
- void *kaddr = kmap_atomic(dst);
- void __user *uaddr = (void __user *)(va & PAGE_MASK);
+ kaddr = kmap_atomic(dst);
+ uaddr = (void __user *)(addr & PAGE_MASK);
+
+ /*
+ * On architectures with software "accessed" bits, we would
+ * take a double page fault, so mark it accessed here.
+ */
+ force_mkyoung = arch_faults_on_old_pte() && !pte_young(vmf->orig_pte);
+ if (force_mkyoung) {
+ pte_t entry;
+
+ vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
+ if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
+ /*
+ * Other thread has already handled the fault
+ * and we don't need to do anything. If it's
+ * not the case, the fault will be triggered
+ * again on the same address.
+ */
+ ret = false;
+ goto pte_unlock;
+ }
+ entry = pte_mkyoung(vmf->orig_pte);
+ if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0))
+ update_mmu_cache(vma, addr, vmf->pte);
+ }
+
+ /*
+ * This really shouldn't fail, because the page is there
+ * in the page tables. But it might just be unreadable,
+ * in which case we just give up and fill the result with
+ * zeroes.
+ */
+ if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) {
/*
- * This really shouldn't fail, because the page is there
- * in the page tables. But it might just be unreadable,
- * in which case we just give up and fill the result with
- * zeroes.
+ * Give a warn in case there can be some obscure
+ * use-case
*/
- if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE))
- clear_page(kaddr);
- kunmap_atomic(kaddr);
- flush_dcache_page(dst);
- } else
- copy_user_highpage(dst, src, va, vma);
+ WARN_ON_ONCE(1);
+ clear_page(kaddr);
+ }
+
+ ret = true;
+
+pte_unlock:
+ if (force_mkyoung)
+ pte_unmap_unlock(vmf->pte, vmf->ptl);
+ kunmap_atomic(kaddr);
+ flush_dcache_page(dst);
+
+ return ret;
}
static gfp_t __get_fault_gfp_mask(struct vm_area_struct *vma)
@@ -2503,7 +2565,19 @@ static int wp_page_copy(struct vm_fault *vmf)
vmf->address);
if (!new_page)
goto oom;
- cow_user_page(new_page, old_page, vmf->address, vma);
+
+ if (!cow_user_page(new_page, old_page, vmf)) {
+ /*
+ * COW failed, if the fault was solved by other,
+ * it's fine. If not, userspace would re-fault on
+ * the same address and we will handle the fault
+ * from the second attempt.
+ */
+ put_page(new_page);
+ if (old_page)
+ put_page(old_page);
+ return 0;
+ }
}
if (mem_cgroup_try_charge(new_page, mm, GFP_KERNEL, &memcg, false))
--
2.25.1
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 3df2b9c0d66b - lib/string.c: implement stpcpy
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 4:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
The patch titled
Subject: mm: don't rely on system state to detect hot-plug operations
has been removed from the -mm tree. Its filename was
mm-dont-rely-on-system-state-to-detect-hot-plug-operations.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Subject: mm: don't rely on system state to detect hot-plug operations
In register_mem_sect_under_node() the system_state's value is checked to
detect whether the call is made during boot time or during an hot-plug
operation. Unfortunately, that check against SYSTEM_BOOTING is wrong
because regular memory is registered at SYSTEM_SCHEDULING state. In
addition, memory hot-plug operation can be triggered at this system state
by the ACPI [1]. So checking against the system state is not enough.
The consequence is that on system with interleaved node's ranges like this:
Early memory node ranges
node 1: [mem 0x0000000000000000-0x000000011fffffff]
node 2: [mem 0x0000000120000000-0x000000014fffffff]
node 1: [mem 0x0000000150000000-0x00000001ffffffff]
node 0: [mem 0x0000000200000000-0x000000048fffffff]
node 2: [mem 0x0000000490000000-0x00000007ffffffff]
This can be seen on PowerPC LPAR after multiple memory hot-plug and
hot-unplug operations are done. At the next reboot the node's memory ranges
can be interleaved and since the call to link_mem_sections() is made in
topology_init() while the system is in the SYSTEM_SCHEDULING state, the
node's id is not checked, and the sections registered to multiple nodes:
$ ls -l /sys/devices/system/memory/memory21/node*
total 0
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
In that case, the system is able to boot but if later one of theses memory
blocks is hot-unplugged and then hot-plugged, the sysfs inconsistency is
detected and this is triggering a BUG_ON():
------------[ cut here ]------------
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
NIP: c000000000403f34 LR: c000000000403f2c CTR: 0000000000000000
REGS: c0000004876e3660 TRAP: 0700 Not tainted (5.9.0-rc1+)
MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24000448 XER: 20040000
CFAR: c000000000846d20 IRQMASK: 0
GPR00: c000000000403f2c c0000004876e38f0 c0000000012f6f00 ffffffffffffffef
GPR04: 0000000000000227 c0000004805ae680 0000000000000000 00000004886f0000
GPR08: 0000000000000226 0000000000000003 0000000000000002 fffffffffffffffd
GPR12: 0000000088000484 c00000001ec96280 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000004 0000000000000003
GPR20: c00000047814ffe0 c0000007ffff7c08 0000000000000010 c0000000013332c8
GPR24: 0000000000000000 c0000000011f6cc0 0000000000000000 0000000000000000
GPR28: ffffffffffffffef 0000000000000001 0000000150000000 0000000010000000
NIP [c000000000403f34] add_memory_resource+0x244/0x340
LR [c000000000403f2c] add_memory_resource+0x23c/0x340
Call Trace:
[c0000004876e38f0] [c000000000403f2c] add_memory_resource+0x23c/0x340 (unreliable)
[c0000004876e39c0] [c00000000040408c] __add_memory+0x5c/0xf0
[c0000004876e39f0] [c0000000000e2b94] dlpar_add_lmb+0x1b4/0x500
[c0000004876e3ad0] [c0000000000e3888] dlpar_memory+0x1f8/0xb80
[c0000004876e3b60] [c0000000000dc0d0] handle_dlpar_errorlog+0xc0/0x190
[c0000004876e3bd0] [c0000000000dc398] dlpar_store+0x198/0x4a0
[c0000004876e3c90] [c00000000072e630] kobj_attr_store+0x30/0x50
[c0000004876e3cb0] [c00000000051f954] sysfs_kf_write+0x64/0x90
[c0000004876e3cd0] [c00000000051ee40] kernfs_fop_write+0x1b0/0x290
[c0000004876e3d20] [c000000000438dd8] vfs_write+0xe8/0x290
[c0000004876e3d70] [c0000000004391ac] ksys_write+0xdc/0x130
[c0000004876e3dc0] [c000000000034e40] system_call_exception+0x160/0x270
[c0000004876e3e20] [c00000000000d740] system_call_common+0xf0/0x27c
Instruction dump:
48442e35 60000000 0b030000 3cbe0001 7fa3eb78 7bc48402 38a5fffe 7ca5fa14
78a58402 48442db1 60000000 7c7c1b78 <0b030000> 7f23cb78 4bda371d 60000000
---[ end trace 562fd6c109cd0fb2 ]---
This patch addresses the root cause by not relying on the system_state
value to detect whether the call is due to a hot-plug operation. An extra
parameter is added to link_mem_sections() detailing whether the operation
is due to a hot-plug operation.
[1] According to Oscar Salvador, using this qemu command line, ACPI memory
hotplug operations are raised at SYSTEM_SCHEDULING state:
$QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \
-m size=$MEM,slots=255,maxmem=4294967296k \
-numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \
-object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \
-object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \
-object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \
-object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \
-object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \
-object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \
-object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \
Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com
Fixes: 4fbce633910e ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()")
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/base/node.c | 85 ++++++++++++++++++++++++++---------------
include/linux/node.h | 11 +++--
mm/memory_hotplug.c | 3 -
3 files changed, 64 insertions(+), 35 deletions(-)
--- a/drivers/base/node.c~mm-dont-rely-on-system-state-to-detect-hot-plug-operations
+++ a/drivers/base/node.c
@@ -761,14 +761,36 @@ static int __ref get_nid_for_pfn(unsigne
return pfn_to_nid(pfn);
}
+static int do_register_memory_block_under_node(int nid,
+ struct memory_block *mem_blk)
+{
+ int ret;
+
+ /*
+ * If this memory block spans multiple nodes, we only indicate
+ * the last processed node.
+ */
+ mem_blk->nid = nid;
+
+ ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
+ &mem_blk->dev.kobj,
+ kobject_name(&mem_blk->dev.kobj));
+ if (ret)
+ return ret;
+
+ return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
+ &node_devices[nid]->dev.kobj,
+ kobject_name(&node_devices[nid]->dev.kobj));
+}
+
/* register memory section under specified node if it spans that node */
-static int register_mem_sect_under_node(struct memory_block *mem_blk,
- void *arg)
+static int register_mem_block_under_node_early(struct memory_block *mem_blk,
+ void *arg)
{
unsigned long memory_block_pfns = memory_block_size_bytes() / PAGE_SIZE;
unsigned long start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
unsigned long end_pfn = start_pfn + memory_block_pfns - 1;
- int ret, nid = *(int *)arg;
+ int nid = *(int *)arg;
unsigned long pfn;
for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
@@ -785,39 +807,34 @@ static int register_mem_sect_under_node(
}
/*
- * We need to check if page belongs to nid only for the boot
- * case, during hotplug we know that all pages in the memory
- * block belong to the same node.
- */
- if (system_state == SYSTEM_BOOTING) {
- page_nid = get_nid_for_pfn(pfn);
- if (page_nid < 0)
- continue;
- if (page_nid != nid)
- continue;
- }
-
- /*
- * If this memory block spans multiple nodes, we only indicate
- * the last processed node.
+ * We need to check if page belongs to nid only at the boot
+ * case because node's ranges can be interleaved.
*/
- mem_blk->nid = nid;
-
- ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
- &mem_blk->dev.kobj,
- kobject_name(&mem_blk->dev.kobj));
- if (ret)
- return ret;
+ page_nid = get_nid_for_pfn(pfn);
+ if (page_nid < 0)
+ continue;
+ if (page_nid != nid)
+ continue;
- return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
- &node_devices[nid]->dev.kobj,
- kobject_name(&node_devices[nid]->dev.kobj));
+ return do_register_memory_block_under_node(nid, mem_blk);
}
/* mem section does not span the specified node */
return 0;
}
/*
+ * During hotplug we know that all pages in the memory block belong to the same
+ * node.
+ */
+static int register_mem_block_under_node_hotplug(struct memory_block *mem_blk,
+ void *arg)
+{
+ int nid = *(int *)arg;
+
+ return do_register_memory_block_under_node(nid, mem_blk);
+}
+
+/*
* Unregister a memory block device under the node it spans. Memory blocks
* with multiple nodes cannot be offlined and therefore also never be removed.
*/
@@ -832,11 +849,19 @@ void unregister_memory_block_under_nodes
kobject_name(&node_devices[mem_blk->nid]->dev.kobj));
}
-int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn)
+int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn,
+ enum meminit_context context)
{
+ walk_memory_blocks_func_t func;
+
+ if (context == MEMINIT_HOTPLUG)
+ func = register_mem_block_under_node_hotplug;
+ else
+ func = register_mem_block_under_node_early;
+
return walk_memory_blocks(PFN_PHYS(start_pfn),
PFN_PHYS(end_pfn - start_pfn), (void *)&nid,
- register_mem_sect_under_node);
+ func);
}
#ifdef CONFIG_HUGETLBFS
--- a/include/linux/node.h~mm-dont-rely-on-system-state-to-detect-hot-plug-operations
+++ a/include/linux/node.h
@@ -99,11 +99,13 @@ extern struct node *node_devices[];
typedef void (*node_registration_func_t)(struct node *);
#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA)
-extern int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long end_pfn);
+int link_mem_sections(int nid, unsigned long start_pfn,
+ unsigned long end_pfn,
+ enum meminit_context context);
#else
static inline int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long end_pfn)
+ unsigned long end_pfn,
+ enum meminit_context context)
{
return 0;
}
@@ -128,7 +130,8 @@ static inline int register_one_node(int
if (error)
return error;
/* link memory sections under this node */
- error = link_mem_sections(nid, start_pfn, end_pfn);
+ error = link_mem_sections(nid, start_pfn, end_pfn,
+ MEMINIT_EARLY);
}
return error;
--- a/mm/memory_hotplug.c~mm-dont-rely-on-system-state-to-detect-hot-plug-operations
+++ a/mm/memory_hotplug.c
@@ -1080,7 +1080,8 @@ int __ref add_memory_resource(int nid, s
}
/* link memory sections under this node.*/
- ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1));
+ ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1),
+ MEMINIT_HOTPLUG);
BUG_ON(ret);
/* create new memmap entry */
_
Patches currently in -mm which might be from ldufour(a)linux.ibm.com are
mm-dont-panic-when-links-cant-be-created-in-sysfs.patch
The patch titled
Subject: mm: replace memmap_context by meminit_context
has been removed from the -mm tree. Its filename was
mm-replace-memmap_context-by-meminit_context.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Subject: mm: replace memmap_context by meminit_context
Patch series "mm: fix memory to node bad links in sysfs", v3.
Sometimes, firmware may expose interleaved memory layout like this:
Early memory node ranges
node 1: [mem 0x0000000000000000-0x000000011fffffff]
node 2: [mem 0x0000000120000000-0x000000014fffffff]
node 1: [mem 0x0000000150000000-0x00000001ffffffff]
node 0: [mem 0x0000000200000000-0x000000048fffffff]
node 2: [mem 0x0000000490000000-0x00000007ffffffff]
In that case, we can see memory blocks assigned to multiple nodes in sysfs:
$ ls -l /sys/devices/system/memory/memory21
total 0
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
-rw-r--r-- 1 root root 65536 Aug 24 05:27 online
-r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device
-r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index
drwxr-xr-x 2 root root 0 Aug 24 05:27 power
-r--r--r-- 1 root root 65536 Aug 24 05:27 removable
-rw-r--r-- 1 root root 65536 Aug 24 05:27 state
lrwxrwxrwx 1 root root 0 Aug 24 05:25 subsystem -> ../../../../bus/memory
-rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent
-r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones
The same applies in the node's directory with a memory21 link in both the
node1 and node2's directory.
This is wrong but doesn't prevent the system to run. However when later,
one of these memory blocks is hot-unplugged and then hot-plugged, the
system is detecting an inconsistency in the sysfs layout and a BUG_ON() is
raised:
------------[ cut here ]------------
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
NIP: c000000000403f34 LR: c000000000403f2c CTR: 0000000000000000
REGS: c0000004876e3660 TRAP: 0700 Not tainted (5.9.0-rc1+)
MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24000448 XER: 20040000
CFAR: c000000000846d20 IRQMASK: 0
GPR00: c000000000403f2c c0000004876e38f0 c0000000012f6f00 ffffffffffffffef
GPR04: 0000000000000227 c0000004805ae680 0000000000000000 00000004886f0000
GPR08: 0000000000000226 0000000000000003 0000000000000002 fffffffffffffffd
GPR12: 0000000088000484 c00000001ec96280 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000004 0000000000000003
GPR20: c00000047814ffe0 c0000007ffff7c08 0000000000000010 c0000000013332c8
GPR24: 0000000000000000 c0000000011f6cc0 0000000000000000 0000000000000000
GPR28: ffffffffffffffef 0000000000000001 0000000150000000 0000000010000000
NIP [c000000000403f34] add_memory_resource+0x244/0x340
LR [c000000000403f2c] add_memory_resource+0x23c/0x340
Call Trace:
[c0000004876e38f0] [c000000000403f2c] add_memory_resource+0x23c/0x340 (unreliable)
[c0000004876e39c0] [c00000000040408c] __add_memory+0x5c/0xf0
[c0000004876e39f0] [c0000000000e2b94] dlpar_add_lmb+0x1b4/0x500
[c0000004876e3ad0] [c0000000000e3888] dlpar_memory+0x1f8/0xb80
[c0000004876e3b60] [c0000000000dc0d0] handle_dlpar_errorlog+0xc0/0x190
[c0000004876e3bd0] [c0000000000dc398] dlpar_store+0x198/0x4a0
[c0000004876e3c90] [c00000000072e630] kobj_attr_store+0x30/0x50
[c0000004876e3cb0] [c00000000051f954] sysfs_kf_write+0x64/0x90
[c0000004876e3cd0] [c00000000051ee40] kernfs_fop_write+0x1b0/0x290
[c0000004876e3d20] [c000000000438dd8] vfs_write+0xe8/0x290
[c0000004876e3d70] [c0000000004391ac] ksys_write+0xdc/0x130
[c0000004876e3dc0] [c000000000034e40] system_call_exception+0x160/0x270
[c0000004876e3e20] [c00000000000d740] system_call_common+0xf0/0x27c
Instruction dump:
48442e35 60000000 0b030000 3cbe0001 7fa3eb78 7bc48402 38a5fffe 7ca5fa14
78a58402 48442db1 60000000 7c7c1b78 <0b030000> 7f23cb78 4bda371d 60000000
---[ end trace 562fd6c109cd0fb2 ]---
This has been seen on PowerPC LPAR.
The root cause of this issue is that when node's memory is registered, the
range used can overlap another node's range, thus the memory block is
registered to multiple nodes in sysfs.
There are 2 issues here:
a. The sysfs memory and node's layouts are broken due to these multiple
links
b. The link errors in link_mem_sections() should not lead to a system
panic.
To address a. register_mem_sect_under_node should not rely on the system
state to detect whether the link operation is triggered by a hot plug
operation or not. This is addressed by the patches 1 and 2 of this series.
The patch 3 is addressing the point b.
This patch (of 2):
The memmap_context enum is used to detect whether a memory operation is due
to a hot-add operation or happening at boot time.
Make it general to the hotplug operation and rename it as meminit_context.
There is no functional change introduced by this patch
Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com
Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J . Wysocki" <rafael(a)kernel.org>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/ia64/mm/init.c | 6 +++---
include/linux/mm.h | 2 +-
include/linux/mmzone.h | 11 ++++++++---
mm/memory_hotplug.c | 2 +-
mm/page_alloc.c | 10 +++++-----
5 files changed, 18 insertions(+), 13 deletions(-)
--- a/arch/ia64/mm/init.c~mm-replace-memmap_context-by-meminit_context
+++ a/arch/ia64/mm/init.c
@@ -538,7 +538,7 @@ virtual_memmap_init(u64 start, u64 end,
if (map_start < map_end)
memmap_init_zone((unsigned long)(map_end - map_start),
args->nid, args->zone, page_to_pfn(map_start),
- MEMMAP_EARLY, NULL);
+ MEMINIT_EARLY, NULL);
return 0;
}
@@ -547,8 +547,8 @@ memmap_init (unsigned long size, int nid
unsigned long start_pfn)
{
if (!vmem_map) {
- memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY,
- NULL);
+ memmap_init_zone(size, nid, zone, start_pfn,
+ MEMINIT_EARLY, NULL);
} else {
struct page *start;
struct memmap_init_callback_data args;
--- a/include/linux/mm.h~mm-replace-memmap_context-by-meminit_context
+++ a/include/linux/mm.h
@@ -2416,7 +2416,7 @@ extern int __meminit __early_pfn_to_nid(
extern void set_dma_reserve(unsigned long new_dma_reserve);
extern void memmap_init_zone(unsigned long, int, unsigned long, unsigned long,
- enum memmap_context, struct vmem_altmap *);
+ enum meminit_context, struct vmem_altmap *);
extern void setup_per_zone_wmarks(void);
extern int __meminit init_per_zone_wmark_min(void);
extern void mem_init(void);
--- a/include/linux/mmzone.h~mm-replace-memmap_context-by-meminit_context
+++ a/include/linux/mmzone.h
@@ -824,10 +824,15 @@ bool zone_watermark_ok(struct zone *z, u
unsigned int alloc_flags);
bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
unsigned long mark, int highest_zoneidx);
-enum memmap_context {
- MEMMAP_EARLY,
- MEMMAP_HOTPLUG,
+/*
+ * Memory initialization context, use to differentiate memory added by
+ * the platform statically or via memory hotplug interface.
+ */
+enum meminit_context {
+ MEMINIT_EARLY,
+ MEMINIT_HOTPLUG,
};
+
extern void init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
unsigned long size);
--- a/mm/memory_hotplug.c~mm-replace-memmap_context-by-meminit_context
+++ a/mm/memory_hotplug.c
@@ -729,7 +729,7 @@ void __ref move_pfn_range_to_zone(struct
* are reserved so nobody should be touching them so we should be safe
*/
memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn,
- MEMMAP_HOTPLUG, altmap);
+ MEMINIT_HOTPLUG, altmap);
set_zone_contiguous(zone);
}
--- a/mm/page_alloc.c~mm-replace-memmap_context-by-meminit_context
+++ a/mm/page_alloc.c
@@ -5975,7 +5975,7 @@ overlap_memmap_init(unsigned long zone,
* done. Non-atomic initialization, single-pass.
*/
void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
- unsigned long start_pfn, enum memmap_context context,
+ unsigned long start_pfn, enum meminit_context context,
struct vmem_altmap *altmap)
{
unsigned long pfn, end_pfn = start_pfn + size;
@@ -6007,7 +6007,7 @@ void __meminit memmap_init_zone(unsigned
* There can be holes in boot-time mem_map[]s handed to this
* function. They do not exist on hotplugged memory.
*/
- if (context == MEMMAP_EARLY) {
+ if (context == MEMINIT_EARLY) {
if (overlap_memmap_init(zone, &pfn))
continue;
if (defer_init(nid, pfn, end_pfn))
@@ -6016,7 +6016,7 @@ void __meminit memmap_init_zone(unsigned
page = pfn_to_page(pfn);
__init_single_page(page, pfn, zone, nid);
- if (context == MEMMAP_HOTPLUG)
+ if (context == MEMINIT_HOTPLUG)
__SetPageReserved(page);
/*
@@ -6099,7 +6099,7 @@ void __ref memmap_init_zone_device(struc
* check here not to call set_pageblock_migratetype() against
* pfn out of zone.
*
- * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
+ * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
* because this is done early in section_activate()
*/
if (!(pfn & (pageblock_nr_pages - 1))) {
@@ -6137,7 +6137,7 @@ void __meminit __weak memmap_init(unsign
if (end_pfn > start_pfn) {
size = end_pfn - start_pfn;
memmap_init_zone(size, nid, zone, start_pfn,
- MEMMAP_EARLY, NULL);
+ MEMINIT_EARLY, NULL);
}
}
}
_
Patches currently in -mm which might be from ldufour(a)linux.ibm.com are
mm-dont-panic-when-links-cant-be-created-in-sysfs.patch
The patch titled
Subject: arch/x86/lib/usercopy_64.c: fix __copy_user_flushcache() cache writeback
has been removed from the -mm tree. Its filename was
arch-x86-lib-usercopy_64c-fix-__copy_user_flushcache-cache-writeback.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mikulas Patocka <mpatocka(a)redhat.com>
Subject: arch/x86/lib/usercopy_64.c: fix __copy_user_flushcache() cache writeback
If we copy less than 8 bytes and if the destination crosses a cache line,
__copy_user_flushcache would invalidate only the first cache line. This
patch makes it invalidate the second cache line as well.
Link: https://lkml.kernel.org/r/alpine.LRH.2.02.2009161451140.21915@file01.intran…
Fixes: 0aed55af88345b ("x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations")
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Reviewed-by: Dan Williams <dan.j.wiilliams(a)intel.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Jeff Moyer <jmoyer(a)redhat.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Toshi Kani <toshi.kani(a)hpe.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Matthew Wilcox <mawilcox(a)microsoft.com>
Cc: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: Ingo Molnar <mingo(a)elte.hu>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/lib/usercopy_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/lib/usercopy_64.c~arch-x86-lib-usercopy_64c-fix-__copy_user_flushcache-cache-writeback
+++ a/arch/x86/lib/usercopy_64.c
@@ -120,7 +120,7 @@ long __copy_user_flushcache(void *dst, c
*/
if (size < 8) {
if (!IS_ALIGNED(dest, 4) || size != 4)
- clean_cache_range(dst, 1);
+ clean_cache_range(dst, size);
} else {
if (!IS_ALIGNED(dest, 8)) {
dest = ALIGN(dest, boot_cpu_data.x86_clflush_size);
_
Patches currently in -mm which might be from mpatocka(a)redhat.com are
The patch titled
Subject: lib/string.c: implement stpcpy
has been removed from the -mm tree. Its filename was
lib-stringc-implement-stpcpy.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Nick Desaulniers <ndesaulniers(a)google.com>
Subject: lib/string.c: implement stpcpy
LLVM implemented a recent "libcall optimization" that lowers calls to
`sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest,
str) - dest`. This generally avoids the machinery involved in parsing
format strings. `stpcpy` is just like `strcpy` except it returns the
pointer to the new tail of `dest`. This optimization was introduced into
clang-12.
Implement this so that we don't observe linkage failures due to missing
symbol definitions for `stpcpy`.
Similar to last year's fire drill with:
commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
The kernel is somewhere between a "freestanding" environment (no full
libc) and "hosted" environment (many symbols from libc exist with the same
type, function signature, and semantics).
As H. Peter Anvin notes, there's not really a great way to inform the
compiler that you're targeting a freestanding environment but would like
to opt-in to some libcall optimizations (see pr/47280 below), rather than
opt-out.
Arvind notes, -fno-builtin-* behaves slightly differently between GCC
and Clang, and Clang is missing many __builtin_* definitions, which I
consider a bug in Clang and am working on fixing.
Masahiro summarizes the subtle distinction between compilers justly:
To prevent transformation from foo() into bar(), there are two ways in
Clang to do that; -fno-builtin-foo, and -fno-builtin-bar. There is
only one in GCC; -fno-buitin-foo.
(Any difference in that behavior in Clang is likely a bug from a missing
__builtin_* definition.)
Masahiro also notes:
We want to disable optimization from foo() to bar(),
but we may still benefit from the optimization from
foo() into something else. If GCC implements the same transform, we
would run into a problem because it is not -fno-builtin-bar, but
-fno-builtin-foo that disables that optimization.
In this regard, -fno-builtin-foo would be more future-proof than
-fno-built-bar, but -fno-builtin-foo is still potentially overkill. We
may want to prevent calls from foo() being optimized into calls to
bar(), but we still may want other optimization on calls to foo().
It seems that compilers today don't quite provide the fine grain control
over which libcall optimizations pseudo-freestanding environments would
prefer.
Finally, Kees notes that this interface is unsafe, so we should not
encourage its use. As such, I've removed the declaration from any
header, but it still needs to be exported to avoid linkage errors in
modules.
Link: https://lkml.kernel.org/r/20200914161643.938408-1-ndesaulniers@google.com
Link: https://bugs.llvm.org/show_bug.cgi?id=47162
Link: https://bugs.llvm.org/show_bug.cgi?id=47280
Link: https://github.com/ClangBuiltLinux/linux/issues/1126
Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
Link: https://reviews.llvm.org/D85963
Reported-by: Sami Tolvanen <samitolvanen(a)google.com>
Suggested-by: Andy Lavr <andy.lavr(a)gmail.com>
Suggested-by: Arvind Sankar <nivedita(a)alum.mit.edu>
Suggested-by: Joe Perches <joe(a)perches.com>
Suggested-by: Kees Cook <keescook(a)chromium.org>
Suggested-by: Masahiro Yamada <masahiroy(a)kernel.org>
Suggested-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/string.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
--- a/lib/string.c~lib-stringc-implement-stpcpy
+++ a/lib/string.c
@@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const ch
}
EXPORT_SYMBOL(strscpy_pad);
+/**
+ * stpcpy - copy a string from src to dest returning a pointer to the new end
+ * of dest, including src's %NUL-terminator. May overrun dest.
+ * @dest: pointer to end of string being copied into. Must be large enough
+ * to receive copy.
+ * @src: pointer to the beginning of string being copied from. Must not overlap
+ * dest.
+ *
+ * stpcpy differs from strcpy in a key way: the return value is a pointer
+ * to the new %NUL-terminating character in @dest. (For strcpy, the return
+ * value is a pointer to the start of @dest). This interface is considered
+ * unsafe as it doesn't perform bounds checking of the inputs. As such it's
+ * not recommended for usage. Instead, its definition is provided in case
+ * the compiler lowers other libcalls to stpcpy.
+ */
+char *stpcpy(char *__restrict__ dest, const char *__restrict__ src);
+char *stpcpy(char *__restrict__ dest, const char *__restrict__ src)
+{
+ while ((*dest++ = *src++) != '\0')
+ /* nothing */;
+ return --dest;
+}
+EXPORT_SYMBOL(stpcpy);
+
#ifndef __HAVE_ARCH_STRCAT
/**
* strcat - Append one %NUL-terminated string to another
_
Patches currently in -mm which might be from ndesaulniers(a)google.com are
compiler-clang-add-build-check-for-clang-1001.patch
revert-kbuild-disable-clangs-default-use-of-fmerge-all-constants.patch
revert-arm64-bti-require-clang-=-1001-for-in-kernel-bti-support.patch
revert-arm64-vdso-fix-compilation-with-clang-older-than-8.patch
partially-revert-arm-8905-1-emit-__gnu_mcount_nc-when-using-clang-1000-or-newer.patch
compiler-gcc-improve-version-error.patch
The patch titled
Subject: mm/gup: fix gup_fast with dynamic page table folding
has been removed from the -mm tree. Its filename was
mm-gup-fix-gup_fast-with-dynamic-page-table-folding.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Vasily Gorbik <gor(a)linux.ibm.com>
Subject: mm/gup: fix gup_fast with dynamic page table folding
Currently to make sure that every page table entry is read just once
gup_fast walks perform READ_ONCE and pass pXd value down to the next
gup_pXd_range function by value e.g.:
static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
...
pudp = pud_offset(&p4d, addr);
This function passes a reference on that local value copy to pXd_offset,
and might get the very same pointer in return. This happens when the
level is folded (on most arches), and that pointer should not be iterated.
On s390 due to the fact that each task might have different 5,4 or 3-level
address translation and hence different levels folded the logic is more
complex and non-iteratable pointer to a local copy leads to severe
problems.
Here is an example of what happens with gup_fast on s390, for a task with
3-levels paging, crossing a 2 GB pud boundary:
// addr = 0x1007ffff000, end = 0x10080001000
static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
pud_t *pudp;
// pud_offset returns &p4d itself (a pointer to a value on stack)
pudp = pud_offset(&p4d, addr);
do {
// on second iteratation reading "random" stack value
pud_t pud = READ_ONCE(*pudp);
// next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390
next = pud_addr_end(addr, end);
...
} while (pudp++, addr = next, addr != end); // pudp++ iterating over stack
return 1;
}
This happens since s390 moved to common gup code with commit d1874a0c2805
("s390/mm: make the pxd_offset functions more robust") and commit
1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code").
s390 tried to mimic static level folding by changing pXd_offset
primitives to always calculate top level page table offset in pgd_offset
and just return the value passed when pXd_offset has to act as folded.
What is crucial for gup_fast and what has been overlooked is that
PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly.
And the latter is not possible with dynamic folding.
To fix the issue in addition to pXd values pass original pXdp pointers
down to gup_pXd_range functions. And introduce pXd_offset_lockless
helpers, which take an additional pXd entry value parameter. This has
already been discussed in
https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1
Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856…
Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code")
Signed-off-by: Vasily Gorbik <gor(a)linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Paul Mackerras <paulus(a)samba.org>
Cc: Jeff Dike <jdike(a)addtoit.com>
Cc: Richard Weinberger <richard(a)nod.at>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org> [5.2+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/s390/include/asm/pgtable.h | 42 +++++++++++++++++++++---------
include/linux/pgtable.h | 10 +++++++
mm/gup.c | 18 ++++++------
3 files changed, 49 insertions(+), 21 deletions(-)
--- a/arch/s390/include/asm/pgtable.h~mm-gup-fix-gup_fast-with-dynamic-page-table-folding
+++ a/arch/s390/include/asm/pgtable.h
@@ -1260,26 +1260,44 @@ static inline pgd_t *pgd_offset_raw(pgd_
#define pgd_offset(mm, address) pgd_offset_raw(READ_ONCE((mm)->pgd), address)
-static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
+static inline p4d_t *p4d_offset_lockless(pgd_t *pgdp, pgd_t pgd, unsigned long address)
{
- if ((pgd_val(*pgd) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R1)
- return (p4d_t *) pgd_deref(*pgd) + p4d_index(address);
- return (p4d_t *) pgd;
+ if ((pgd_val(pgd) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R1)
+ return (p4d_t *) pgd_deref(pgd) + p4d_index(address);
+ return (p4d_t *) pgdp;
}
+#define p4d_offset_lockless p4d_offset_lockless
-static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
+static inline p4d_t *p4d_offset(pgd_t *pgdp, unsigned long address)
{
- if ((p4d_val(*p4d) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R2)
- return (pud_t *) p4d_deref(*p4d) + pud_index(address);
- return (pud_t *) p4d;
+ return p4d_offset_lockless(pgdp, *pgdp, address);
+}
+
+static inline pud_t *pud_offset_lockless(p4d_t *p4dp, p4d_t p4d, unsigned long address)
+{
+ if ((p4d_val(p4d) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R2)
+ return (pud_t *) p4d_deref(p4d) + pud_index(address);
+ return (pud_t *) p4dp;
+}
+#define pud_offset_lockless pud_offset_lockless
+
+static inline pud_t *pud_offset(p4d_t *p4dp, unsigned long address)
+{
+ return pud_offset_lockless(p4dp, *p4dp, address);
}
#define pud_offset pud_offset
-static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
+static inline pmd_t *pmd_offset_lockless(pud_t *pudp, pud_t pud, unsigned long address)
+{
+ if ((pud_val(pud) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R3)
+ return (pmd_t *) pud_deref(pud) + pmd_index(address);
+ return (pmd_t *) pudp;
+}
+#define pmd_offset_lockless pmd_offset_lockless
+
+static inline pmd_t *pmd_offset(pud_t *pudp, unsigned long address)
{
- if ((pud_val(*pud) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R3)
- return (pmd_t *) pud_deref(*pud) + pmd_index(address);
- return (pmd_t *) pud;
+ return pmd_offset_lockless(pudp, *pudp, address);
}
#define pmd_offset pmd_offset
--- a/include/linux/pgtable.h~mm-gup-fix-gup_fast-with-dynamic-page-table-folding
+++ a/include/linux/pgtable.h
@@ -1427,6 +1427,16 @@ typedef unsigned int pgtbl_mod_mask;
#define mm_pmd_folded(mm) __is_defined(__PAGETABLE_PMD_FOLDED)
#endif
+#ifndef p4d_offset_lockless
+#define p4d_offset_lockless(pgdp, pgd, address) p4d_offset(&(pgd), address)
+#endif
+#ifndef pud_offset_lockless
+#define pud_offset_lockless(p4dp, p4d, address) pud_offset(&(p4d), address)
+#endif
+#ifndef pmd_offset_lockless
+#define pmd_offset_lockless(pudp, pud, address) pmd_offset(&(pud), address)
+#endif
+
/*
* p?d_leaf() - true if this entry is a final mapping to a physical address.
* This differs from p?d_huge() by the fact that they are always available (if
--- a/mm/gup.c~mm-gup-fix-gup_fast-with-dynamic-page-table-folding
+++ a/mm/gup.c
@@ -2485,13 +2485,13 @@ static int gup_huge_pgd(pgd_t orig, pgd_
return 1;
}
-static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
+static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
pmd_t *pmdp;
- pmdp = pmd_offset(&pud, addr);
+ pmdp = pmd_offset_lockless(pudp, pud, addr);
do {
pmd_t pmd = READ_ONCE(*pmdp);
@@ -2528,13 +2528,13 @@ static int gup_pmd_range(pud_t pud, unsi
return 1;
}
-static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
+static int gup_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
pud_t *pudp;
- pudp = pud_offset(&p4d, addr);
+ pudp = pud_offset_lockless(p4dp, p4d, addr);
do {
pud_t pud = READ_ONCE(*pudp);
@@ -2549,20 +2549,20 @@ static int gup_pud_range(p4d_t p4d, unsi
if (!gup_huge_pd(__hugepd(pud_val(pud)), addr,
PUD_SHIFT, next, flags, pages, nr))
return 0;
- } else if (!gup_pmd_range(pud, addr, next, flags, pages, nr))
+ } else if (!gup_pmd_range(pudp, pud, addr, next, flags, pages, nr))
return 0;
} while (pudp++, addr = next, addr != end);
return 1;
}
-static int gup_p4d_range(pgd_t pgd, unsigned long addr, unsigned long end,
+static int gup_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
p4d_t *p4dp;
- p4dp = p4d_offset(&pgd, addr);
+ p4dp = p4d_offset_lockless(pgdp, pgd, addr);
do {
p4d_t p4d = READ_ONCE(*p4dp);
@@ -2574,7 +2574,7 @@ static int gup_p4d_range(pgd_t pgd, unsi
if (!gup_huge_pd(__hugepd(p4d_val(p4d)), addr,
P4D_SHIFT, next, flags, pages, nr))
return 0;
- } else if (!gup_pud_range(p4d, addr, next, flags, pages, nr))
+ } else if (!gup_pud_range(p4dp, p4d, addr, next, flags, pages, nr))
return 0;
} while (p4dp++, addr = next, addr != end);
@@ -2602,7 +2602,7 @@ static void gup_pgd_range(unsigned long
if (!gup_huge_pd(__hugepd(pgd_val(pgd)), addr,
PGDIR_SHIFT, next, flags, pages, nr))
return;
- } else if (!gup_p4d_range(pgd, addr, next, flags, pages, nr))
+ } else if (!gup_p4d_range(pgdp, pgd, addr, next, flags, pages, nr))
return;
} while (pgdp++, addr = next, addr != end);
}
_
Patches currently in -mm which might be from gor(a)linux.ibm.com are
The patch titled
Subject: mm, THP, swap: fix allocating cluster for swapfile by mistake
has been removed from the -mm tree. Its filename was
mm-thp-swap-fix-allocating-cluster-for-swapfile-by-mistake.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Gao Xiang <hsiangkao(a)redhat.com>
Subject: mm, THP, swap: fix allocating cluster for swapfile by mistake
SWP_FS is used to make swap_{read,write}page() go through the filesystem,
and it's only used for swap files over NFS. So, !SWP_FS means non NFS for
now, it could be either file backed or device backed. Something similar
goes with legacy SWP_FILE.
So in order to achieve the goal of the original patch, SWP_BLKDEV should
be used instead.
FS corruption can be observed with SSD device + XFS + fragmented swapfile
due to CONFIG_THP_SWAP=y.
I reproduced the issue with the following details:
Environment:
QEMU + upstream kernel + buildroot + NVMe (2 GB)
Kernel config:
CONFIG_BLK_DEV_NVME=y
CONFIG_THP_SWAP=y
Some reproducable steps:
mkfs.xfs -f /dev/nvme0n1
mkdir /tmp/mnt
mount /dev/nvme0n1 /tmp/mnt
bs="32k"
sz="1024m" # doesn't matter too much, I also tried 16m
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw
mkswap /tmp/mnt/sw
swapon /tmp/mnt/sw
stress --vm 2 --vm-bytes 600M # doesn't matter too much as well
Symptoms:
- FS corruption (e.g. checksum failure)
- memory corruption at: 0xd2808010
- segfault
Link: https://lkml.kernel.org/r/20200820045323.7809-1-hsiangkao@redhat.com
Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out")
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Acked-by: Rafael Aquini <aquini(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Carlos Maiolino <cmaiolino(a)redhat.com>
Cc: Eric Sandeen <esandeen(a)redhat.com>
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swapfile.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/swapfile.c~mm-thp-swap-fix-allocating-cluster-for-swapfile-by-mistake
+++ a/mm/swapfile.c
@@ -1078,7 +1078,7 @@ start_over:
goto nextsi;
}
if (size == SWAPFILE_CLUSTER) {
- if (!(si->flags & SWP_FS))
+ if (si->flags & SWP_BLKDEV)
n_ret = swap_alloc_cluster(si, swp_entries);
} else
n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
_
Patches currently in -mm which might be from hsiangkao(a)redhat.com are
swap-rename-swp_fs-to-swap_fs_ops-to-avoid-ambiguity.patch
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 596ff5a5ba49 - mm: validate pmd after splitting
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 4:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ee1dfad5325ff1cfb2239e564cd411b3bfe8667a Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer(a)redhat.com>
Date: Mon, 14 Sep 2020 13:04:19 -0400
Subject: [PATCH] dm: fix bio splitting and its bio completion order for
regular IO
dm_queue_split() is removed because __split_and_process_bio() _must_
handle splitting bios to ensure proper bio submission and completion
ordering as a bio is split.
Otherwise, multiple recursive calls to ->submit_bio will cause multiple
split bios to be allocated from the same ->bio_split mempool at the same
time. This would result in deadlock in low memory conditions because no
progress could be made (only one bio is available in ->bio_split
mempool).
This fix has been verified to still fix the loss of performance, due
to excess splitting, that commit 120c9257f5f1 provided.
Fixes: 120c9257f5f1 ("Revert "dm: always call blk_queue_split() in dm_process_bio()"")
Cc: stable(a)vger.kernel.org # 5.0+, requires custom backport due to 5.9 changes
Reported-by: Ming Lei <ming.lei(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 4a40df8af7d3..d948cd522431 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1724,23 +1724,6 @@ static blk_qc_t __process_bio(struct mapped_device *md, struct dm_table *map,
return ret;
}
-static void dm_queue_split(struct mapped_device *md, struct dm_target *ti, struct bio **bio)
-{
- unsigned len, sector_count;
-
- sector_count = bio_sectors(*bio);
- len = min_t(sector_t, max_io_len((*bio)->bi_iter.bi_sector, ti), sector_count);
-
- if (sector_count > len) {
- struct bio *split = bio_split(*bio, len, GFP_NOIO, &md->queue->bio_split);
-
- bio_chain(split, *bio);
- trace_block_split(md->queue, split, (*bio)->bi_iter.bi_sector);
- submit_bio_noacct(*bio);
- *bio = split;
- }
-}
-
static blk_qc_t dm_process_bio(struct mapped_device *md,
struct dm_table *map, struct bio *bio)
{
@@ -1768,14 +1751,12 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
if (current->bio_list) {
if (is_abnormal_io(bio))
blk_queue_split(&bio);
- else
- dm_queue_split(md, ti, &bio);
+ /* regular IO is split by __split_and_process_bio */
}
if (dm_get_md_type(md) == DM_TYPE_NVME_BIO_BASED)
return __process_bio(md, map, bio, ti);
- else
- return __split_and_process_bio(md, map, bio);
+ return __split_and_process_bio(md, map, bio);
}
static blk_qc_t dm_submit_bio(struct bio *bio)
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: d18c028cbd8f - mm: validate pmd after splitting
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
ppc64le:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 3:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
x86_64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
When tpm_get_random() was introduced, it defined the following API for the
return value:
1. A positive value tells how many bytes of random data was generated.
2. A negative value on error.
However, in the call sites the API was used incorrectly, i.e. as it would
only return negative values and otherwise zero. Returning he positive read
counts to the user space does not make any possible sense.
Fix this by returning -EIO when tpm_get_random() returns a positive value.
Fixes: 41ab999c80f1 ("tpm: Move tpm_get_random api into the TPM device driver")
Cc: Kent Yoder <key(a)linux.vnet.ibm.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley(a)HansenPartnership.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
v2: Fixed ret check in two locations.
security/keys/trusted-keys/trusted_tpm1.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/security/keys/trusted-keys/trusted_tpm1.c b/security/keys/trusted-keys/trusted_tpm1.c
index b9fe02e5f84f..c7b1701cdac5 100644
--- a/security/keys/trusted-keys/trusted_tpm1.c
+++ b/security/keys/trusted-keys/trusted_tpm1.c
@@ -403,9 +403,12 @@ static int osap(struct tpm_buf *tb, struct osapsess *s,
int ret;
ret = tpm_get_random(chip, ononce, TPM_NONCE_SIZE);
- if (ret != TPM_NONCE_SIZE)
+ if (ret < 0)
return ret;
+ if (ret != TPM_NONCE_SIZE)
+ return -EIO;
+
tpm_buf_reset(tb, TPM_TAG_RQU_COMMAND, TPM_ORD_OSAP);
tpm_buf_append_u16(tb, type);
tpm_buf_append_u32(tb, handle);
@@ -496,8 +499,12 @@ static int tpm_seal(struct tpm_buf *tb, uint16_t keytype,
goto out;
ret = tpm_get_random(chip, td->nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0)
+ return ret;
+
if (ret != TPM_NONCE_SIZE)
- goto out;
+ return -EIO;
+
ordinal = htonl(TPM_ORD_SEAL);
datsize = htonl(datalen);
pcrsize = htonl(pcrinfosize);
@@ -601,9 +608,12 @@ static int tpm_unseal(struct tpm_buf *tb,
ordinal = htonl(TPM_ORD_UNSEAL);
ret = tpm_get_random(chip, nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0)
+ return ret;
+
if (ret != TPM_NONCE_SIZE) {
pr_info("trusted_key: tpm_get_random failed (%d)\n", ret);
- return ret;
+ return -EIO;
}
ret = TSS_authhmac(authdata1, keyauth, TPM_NONCE_SIZE,
enonce1, nonceodd, cont, sizeof(uint32_t),
@@ -1013,8 +1023,12 @@ static int trusted_instantiate(struct key *key,
case Opt_new:
key_len = payload->key_len;
ret = tpm_get_random(chip, payload->key, key_len);
+ if (ret < 0)
+ goto out;
+
if (ret != key_len) {
pr_info("trusted_key: key_create failed (%d)\n", ret);
+ ret = -EIO;
goto out;
}
if (tpm2)
--
2.25.1
When tpm_get_random() was introduced, it defined the following API for the
return value:
1. A positive value tells how many bytes of random data was generated.
2. A negative value on error.
However, in the call sites the API was used incorrectly, i.e. as it would
only return negative values and otherwise zero. Returning he positive read
counts to the user space does not make any possible sense.
Fix this by returning -EIO when tpm_get_random() returns a positive value.
Fixes: 41ab999c80f1 ("tpm: Move tpm_get_random api into the TPM device driver")
Cc: Kent Yoder <key(a)linux.vnet.ibm.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley(a)HansenPartnership.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
security/keys/trusted-keys/trusted_tpm1.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/security/keys/trusted-keys/trusted_tpm1.c b/security/keys/trusted-keys/trusted_tpm1.c
index b9fe02e5f84f..0f2e893c6b5f 100644
--- a/security/keys/trusted-keys/trusted_tpm1.c
+++ b/security/keys/trusted-keys/trusted_tpm1.c
@@ -403,9 +403,12 @@ static int osap(struct tpm_buf *tb, struct osapsess *s,
int ret;
ret = tpm_get_random(chip, ononce, TPM_NONCE_SIZE);
- if (ret != TPM_NONCE_SIZE)
+ if (ret < 0)
return ret;
+ if (ret != TPM_NONCE_SIZE)
+ return -EIO;
+
tpm_buf_reset(tb, TPM_TAG_RQU_COMMAND, TPM_ORD_OSAP);
tpm_buf_append_u16(tb, type);
tpm_buf_append_u32(tb, handle);
@@ -496,8 +499,12 @@ static int tpm_seal(struct tpm_buf *tb, uint16_t keytype,
goto out;
ret = tpm_get_random(chip, td->nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0)
+ return ret;
+
if (ret != TPM_NONCE_SIZE)
- goto out;
+ return -EIO;
+
ordinal = htonl(TPM_ORD_SEAL);
datsize = htonl(datalen);
pcrsize = htonl(pcrinfosize);
@@ -601,6 +608,9 @@ static int tpm_unseal(struct tpm_buf *tb,
ordinal = htonl(TPM_ORD_UNSEAL);
ret = tpm_get_random(chip, nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0)
+ return -EIO;
+
if (ret != TPM_NONCE_SIZE) {
pr_info("trusted_key: tpm_get_random failed (%d)\n", ret);
return ret;
@@ -1013,6 +1023,11 @@ static int trusted_instantiate(struct key *key,
case Opt_new:
key_len = payload->key_len;
ret = tpm_get_random(chip, payload->key, key_len);
+ if (ret < 0) {
+ ret = -EIO;
+ goto out;
+ }
+
if (ret != key_len) {
pr_info("trusted_key: key_create failed (%d)\n", ret);
goto out;
--
2.25.1
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f3cd4850504ff612d0ea77a0aaf29b66c98fcefe Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Thu, 24 Sep 2020 14:55:54 -0600
Subject: [PATCH] io_uring: ensure open/openat2 name is cleaned on cancelation
If we cancel these requests, we'll leak the memory associated with the
filename. Add them to the table of ops that need cleaning, if
REQ_F_NEED_CLEANUP is set.
Cc: stable(a)vger.kernel.org
Fixes: e62753e4e292 ("io_uring: call statx directly")
Reviewed-by: Stefano Garzarella <sgarzare(a)redhat.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index e6004b92e553..0ab16df31288 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5671,6 +5671,11 @@ static void __io_clean_op(struct io_kiocb *req)
io_put_file(req, req->splice.file_in,
(req->splice.flags & SPLICE_F_FD_IN_FIXED));
break;
+ case IORING_OP_OPENAT:
+ case IORING_OP_OPENAT2:
+ if (req->open.filename)
+ putname(req->open.filename);
+ break;
}
req->flags &= ~REQ_F_NEED_CLEANUP;
}
We only allow persistent requests to remain on the GPU past the closure
of their containing context (and process) so long as they are continuously
checked for hangs or allow other requests to preempt them, as we need to
ensure forward progress of the system. If we allow persistent contexts
to remain on the system after the the hangcheck mechanism is disabled,
the system may grind to a halt. On disabling the mechanism, we sent a
pulse along the engine to remove all executing contexts from the engine
which would check for hung contexts -- but we did not prevent those
contexts from being resubmitted if they survived the final hangcheck.
Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-stop
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
---
drivers/gpu/drm/i915/gt/intel_engine.h | 9 +++++++++
drivers/gpu/drm/i915/i915_request.c | 5 +++++
2 files changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 08e2c000dcc3..7c3a1012e702 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -337,4 +337,13 @@ intel_engine_has_preempt_reset(const struct intel_engine_cs *engine)
return intel_engine_has_preemption(engine);
}
+static inline bool
+intel_engine_has_heartbeat(const struct intel_engine_cs *engine)
+{
+ if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
+ return false;
+
+ return READ_ONCE(engine->props.heartbeat_interval_ms);
+}
+
#endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 436ce368ddaa..0e813819b041 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -542,8 +542,13 @@ bool __i915_request_submit(struct i915_request *request)
if (i915_request_completed(request))
goto xfer;
+ if (unlikely(intel_context_is_closed(request->context) &&
+ !intel_engine_has_heartbeat(engine)))
+ intel_context_set_banned(request->context);
+
if (unlikely(intel_context_is_banned(request->context)))
i915_request_set_error_once(request, -EIO);
+
if (unlikely(fatal_error(request->fence.error)))
__i915_request_skip(request);
--
2.20.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f85086f95fa36194eb0db5cd5c12e56801b98523 Mon Sep 17 00:00:00 2001
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Date: Fri, 25 Sep 2020 21:19:31 -0700
Subject: [PATCH] mm: don't rely on system state to detect hot-plug operations
In register_mem_sect_under_node() the system_state's value is checked to
detect whether the call is made during boot time or during an hot-plug
operation. Unfortunately, that check against SYSTEM_BOOTING is wrong
because regular memory is registered at SYSTEM_SCHEDULING state. In
addition, memory hot-plug operation can be triggered at this system
state by the ACPI [1]. So checking against the system state is not
enough.
The consequence is that on system with interleaved node's ranges like this:
Early memory node ranges
node 1: [mem 0x0000000000000000-0x000000011fffffff]
node 2: [mem 0x0000000120000000-0x000000014fffffff]
node 1: [mem 0x0000000150000000-0x00000001ffffffff]
node 0: [mem 0x0000000200000000-0x000000048fffffff]
node 2: [mem 0x0000000490000000-0x00000007ffffffff]
This can be seen on PowerPC LPAR after multiple memory hot-plug and
hot-unplug operations are done. At the next reboot the node's memory
ranges can be interleaved and since the call to link_mem_sections() is
made in topology_init() while the system is in the SYSTEM_SCHEDULING
state, the node's id is not checked, and the sections registered to
multiple nodes:
$ ls -l /sys/devices/system/memory/memory21/node*
total 0
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
In that case, the system is able to boot but if later one of theses
memory blocks is hot-unplugged and then hot-plugged, the sysfs
inconsistency is detected and this is triggering a BUG_ON():
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
Call Trace:
add_memory_resource+0x23c/0x340 (unreliable)
__add_memory+0x5c/0xf0
dlpar_add_lmb+0x1b4/0x500
dlpar_memory+0x1f8/0xb80
handle_dlpar_errorlog+0xc0/0x190
dlpar_store+0x198/0x4a0
kobj_attr_store+0x30/0x50
sysfs_kf_write+0x64/0x90
kernfs_fop_write+0x1b0/0x290
vfs_write+0xe8/0x290
ksys_write+0xdc/0x130
system_call_exception+0x160/0x270
system_call_common+0xf0/0x27c
This patch addresses the root cause by not relying on the system_state
value to detect whether the call is due to a hot-plug operation. An
extra parameter is added to link_mem_sections() detailing whether the
operation is due to a hot-plug operation.
[1] According to Oscar Salvador, using this qemu command line, ACPI
memory hotplug operations are raised at SYSTEM_SCHEDULING state:
$QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \
-m size=$MEM,slots=255,maxmem=4294967296k \
-numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \
-object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \
-object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \
-object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \
-object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \
-object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \
-object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \
-object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \
Fixes: 4fbce633910e ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()")
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 508b80f6329b..50af16e68d98 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -761,14 +761,36 @@ static int __ref get_nid_for_pfn(unsigned long pfn)
return pfn_to_nid(pfn);
}
+static int do_register_memory_block_under_node(int nid,
+ struct memory_block *mem_blk)
+{
+ int ret;
+
+ /*
+ * If this memory block spans multiple nodes, we only indicate
+ * the last processed node.
+ */
+ mem_blk->nid = nid;
+
+ ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
+ &mem_blk->dev.kobj,
+ kobject_name(&mem_blk->dev.kobj));
+ if (ret)
+ return ret;
+
+ return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
+ &node_devices[nid]->dev.kobj,
+ kobject_name(&node_devices[nid]->dev.kobj));
+}
+
/* register memory section under specified node if it spans that node */
-static int register_mem_sect_under_node(struct memory_block *mem_blk,
- void *arg)
+static int register_mem_block_under_node_early(struct memory_block *mem_blk,
+ void *arg)
{
unsigned long memory_block_pfns = memory_block_size_bytes() / PAGE_SIZE;
unsigned long start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
unsigned long end_pfn = start_pfn + memory_block_pfns - 1;
- int ret, nid = *(int *)arg;
+ int nid = *(int *)arg;
unsigned long pfn;
for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
@@ -785,38 +807,33 @@ static int register_mem_sect_under_node(struct memory_block *mem_blk,
}
/*
- * We need to check if page belongs to nid only for the boot
- * case, during hotplug we know that all pages in the memory
- * block belong to the same node.
- */
- if (system_state == SYSTEM_BOOTING) {
- page_nid = get_nid_for_pfn(pfn);
- if (page_nid < 0)
- continue;
- if (page_nid != nid)
- continue;
- }
-
- /*
- * If this memory block spans multiple nodes, we only indicate
- * the last processed node.
+ * We need to check if page belongs to nid only at the boot
+ * case because node's ranges can be interleaved.
*/
- mem_blk->nid = nid;
-
- ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
- &mem_blk->dev.kobj,
- kobject_name(&mem_blk->dev.kobj));
- if (ret)
- return ret;
+ page_nid = get_nid_for_pfn(pfn);
+ if (page_nid < 0)
+ continue;
+ if (page_nid != nid)
+ continue;
- return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
- &node_devices[nid]->dev.kobj,
- kobject_name(&node_devices[nid]->dev.kobj));
+ return do_register_memory_block_under_node(nid, mem_blk);
}
/* mem section does not span the specified node */
return 0;
}
+/*
+ * During hotplug we know that all pages in the memory block belong to the same
+ * node.
+ */
+static int register_mem_block_under_node_hotplug(struct memory_block *mem_blk,
+ void *arg)
+{
+ int nid = *(int *)arg;
+
+ return do_register_memory_block_under_node(nid, mem_blk);
+}
+
/*
* Unregister a memory block device under the node it spans. Memory blocks
* with multiple nodes cannot be offlined and therefore also never be removed.
@@ -832,11 +849,19 @@ void unregister_memory_block_under_nodes(struct memory_block *mem_blk)
kobject_name(&node_devices[mem_blk->nid]->dev.kobj));
}
-int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn)
+int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn,
+ enum meminit_context context)
{
+ walk_memory_blocks_func_t func;
+
+ if (context == MEMINIT_HOTPLUG)
+ func = register_mem_block_under_node_hotplug;
+ else
+ func = register_mem_block_under_node_early;
+
return walk_memory_blocks(PFN_PHYS(start_pfn),
PFN_PHYS(end_pfn - start_pfn), (void *)&nid,
- register_mem_sect_under_node);
+ func);
}
#ifdef CONFIG_HUGETLBFS
diff --git a/include/linux/node.h b/include/linux/node.h
index 4866f32a02d8..014ba3ab2efd 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -99,11 +99,13 @@ extern struct node *node_devices[];
typedef void (*node_registration_func_t)(struct node *);
#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA)
-extern int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long end_pfn);
+int link_mem_sections(int nid, unsigned long start_pfn,
+ unsigned long end_pfn,
+ enum meminit_context context);
#else
static inline int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long end_pfn)
+ unsigned long end_pfn,
+ enum meminit_context context)
{
return 0;
}
@@ -128,7 +130,8 @@ static inline int register_one_node(int nid)
if (error)
return error;
/* link memory sections under this node */
- error = link_mem_sections(nid, start_pfn, end_pfn);
+ error = link_mem_sections(nid, start_pfn, end_pfn,
+ MEMINIT_EARLY);
}
return error;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 345308a8bd15..ce3e73e3a5c1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1080,7 +1080,8 @@ int __ref add_memory_resource(int nid, struct resource *res)
}
/* link memory sections under this node.*/
- ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1));
+ ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1),
+ MEMINIT_HOTPLUG);
BUG_ON(ret);
/* create new memmap entry */
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f85086f95fa36194eb0db5cd5c12e56801b98523 Mon Sep 17 00:00:00 2001
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Date: Fri, 25 Sep 2020 21:19:31 -0700
Subject: [PATCH] mm: don't rely on system state to detect hot-plug operations
In register_mem_sect_under_node() the system_state's value is checked to
detect whether the call is made during boot time or during an hot-plug
operation. Unfortunately, that check against SYSTEM_BOOTING is wrong
because regular memory is registered at SYSTEM_SCHEDULING state. In
addition, memory hot-plug operation can be triggered at this system
state by the ACPI [1]. So checking against the system state is not
enough.
The consequence is that on system with interleaved node's ranges like this:
Early memory node ranges
node 1: [mem 0x0000000000000000-0x000000011fffffff]
node 2: [mem 0x0000000120000000-0x000000014fffffff]
node 1: [mem 0x0000000150000000-0x00000001ffffffff]
node 0: [mem 0x0000000200000000-0x000000048fffffff]
node 2: [mem 0x0000000490000000-0x00000007ffffffff]
This can be seen on PowerPC LPAR after multiple memory hot-plug and
hot-unplug operations are done. At the next reboot the node's memory
ranges can be interleaved and since the call to link_mem_sections() is
made in topology_init() while the system is in the SYSTEM_SCHEDULING
state, the node's id is not checked, and the sections registered to
multiple nodes:
$ ls -l /sys/devices/system/memory/memory21/node*
total 0
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
In that case, the system is able to boot but if later one of theses
memory blocks is hot-unplugged and then hot-plugged, the sysfs
inconsistency is detected and this is triggering a BUG_ON():
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
Call Trace:
add_memory_resource+0x23c/0x340 (unreliable)
__add_memory+0x5c/0xf0
dlpar_add_lmb+0x1b4/0x500
dlpar_memory+0x1f8/0xb80
handle_dlpar_errorlog+0xc0/0x190
dlpar_store+0x198/0x4a0
kobj_attr_store+0x30/0x50
sysfs_kf_write+0x64/0x90
kernfs_fop_write+0x1b0/0x290
vfs_write+0xe8/0x290
ksys_write+0xdc/0x130
system_call_exception+0x160/0x270
system_call_common+0xf0/0x27c
This patch addresses the root cause by not relying on the system_state
value to detect whether the call is due to a hot-plug operation. An
extra parameter is added to link_mem_sections() detailing whether the
operation is due to a hot-plug operation.
[1] According to Oscar Salvador, using this qemu command line, ACPI
memory hotplug operations are raised at SYSTEM_SCHEDULING state:
$QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \
-m size=$MEM,slots=255,maxmem=4294967296k \
-numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \
-object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \
-object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \
-object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \
-object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \
-object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \
-object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \
-object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \
Fixes: 4fbce633910e ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()")
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 508b80f6329b..50af16e68d98 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -761,14 +761,36 @@ static int __ref get_nid_for_pfn(unsigned long pfn)
return pfn_to_nid(pfn);
}
+static int do_register_memory_block_under_node(int nid,
+ struct memory_block *mem_blk)
+{
+ int ret;
+
+ /*
+ * If this memory block spans multiple nodes, we only indicate
+ * the last processed node.
+ */
+ mem_blk->nid = nid;
+
+ ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
+ &mem_blk->dev.kobj,
+ kobject_name(&mem_blk->dev.kobj));
+ if (ret)
+ return ret;
+
+ return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
+ &node_devices[nid]->dev.kobj,
+ kobject_name(&node_devices[nid]->dev.kobj));
+}
+
/* register memory section under specified node if it spans that node */
-static int register_mem_sect_under_node(struct memory_block *mem_blk,
- void *arg)
+static int register_mem_block_under_node_early(struct memory_block *mem_blk,
+ void *arg)
{
unsigned long memory_block_pfns = memory_block_size_bytes() / PAGE_SIZE;
unsigned long start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
unsigned long end_pfn = start_pfn + memory_block_pfns - 1;
- int ret, nid = *(int *)arg;
+ int nid = *(int *)arg;
unsigned long pfn;
for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
@@ -785,38 +807,33 @@ static int register_mem_sect_under_node(struct memory_block *mem_blk,
}
/*
- * We need to check if page belongs to nid only for the boot
- * case, during hotplug we know that all pages in the memory
- * block belong to the same node.
- */
- if (system_state == SYSTEM_BOOTING) {
- page_nid = get_nid_for_pfn(pfn);
- if (page_nid < 0)
- continue;
- if (page_nid != nid)
- continue;
- }
-
- /*
- * If this memory block spans multiple nodes, we only indicate
- * the last processed node.
+ * We need to check if page belongs to nid only at the boot
+ * case because node's ranges can be interleaved.
*/
- mem_blk->nid = nid;
-
- ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
- &mem_blk->dev.kobj,
- kobject_name(&mem_blk->dev.kobj));
- if (ret)
- return ret;
+ page_nid = get_nid_for_pfn(pfn);
+ if (page_nid < 0)
+ continue;
+ if (page_nid != nid)
+ continue;
- return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
- &node_devices[nid]->dev.kobj,
- kobject_name(&node_devices[nid]->dev.kobj));
+ return do_register_memory_block_under_node(nid, mem_blk);
}
/* mem section does not span the specified node */
return 0;
}
+/*
+ * During hotplug we know that all pages in the memory block belong to the same
+ * node.
+ */
+static int register_mem_block_under_node_hotplug(struct memory_block *mem_blk,
+ void *arg)
+{
+ int nid = *(int *)arg;
+
+ return do_register_memory_block_under_node(nid, mem_blk);
+}
+
/*
* Unregister a memory block device under the node it spans. Memory blocks
* with multiple nodes cannot be offlined and therefore also never be removed.
@@ -832,11 +849,19 @@ void unregister_memory_block_under_nodes(struct memory_block *mem_blk)
kobject_name(&node_devices[mem_blk->nid]->dev.kobj));
}
-int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn)
+int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn,
+ enum meminit_context context)
{
+ walk_memory_blocks_func_t func;
+
+ if (context == MEMINIT_HOTPLUG)
+ func = register_mem_block_under_node_hotplug;
+ else
+ func = register_mem_block_under_node_early;
+
return walk_memory_blocks(PFN_PHYS(start_pfn),
PFN_PHYS(end_pfn - start_pfn), (void *)&nid,
- register_mem_sect_under_node);
+ func);
}
#ifdef CONFIG_HUGETLBFS
diff --git a/include/linux/node.h b/include/linux/node.h
index 4866f32a02d8..014ba3ab2efd 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -99,11 +99,13 @@ extern struct node *node_devices[];
typedef void (*node_registration_func_t)(struct node *);
#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA)
-extern int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long end_pfn);
+int link_mem_sections(int nid, unsigned long start_pfn,
+ unsigned long end_pfn,
+ enum meminit_context context);
#else
static inline int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long end_pfn)
+ unsigned long end_pfn,
+ enum meminit_context context)
{
return 0;
}
@@ -128,7 +130,8 @@ static inline int register_one_node(int nid)
if (error)
return error;
/* link memory sections under this node */
- error = link_mem_sections(nid, start_pfn, end_pfn);
+ error = link_mem_sections(nid, start_pfn, end_pfn,
+ MEMINIT_EARLY);
}
return error;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 345308a8bd15..ce3e73e3a5c1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1080,7 +1080,8 @@ int __ref add_memory_resource(int nid, struct resource *res)
}
/* link memory sections under this node.*/
- ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1));
+ ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1),
+ MEMINIT_HOTPLUG);
BUG_ON(ret);
/* create new memmap entry */
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c1d0da83358a2316d9be7f229f26126dbaa07468 Mon Sep 17 00:00:00 2001
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Date: Fri, 25 Sep 2020 21:19:28 -0700
Subject: [PATCH] mm: replace memmap_context by meminit_context
Patch series "mm: fix memory to node bad links in sysfs", v3.
Sometimes, firmware may expose interleaved memory layout like this:
Early memory node ranges
node 1: [mem 0x0000000000000000-0x000000011fffffff]
node 2: [mem 0x0000000120000000-0x000000014fffffff]
node 1: [mem 0x0000000150000000-0x00000001ffffffff]
node 0: [mem 0x0000000200000000-0x000000048fffffff]
node 2: [mem 0x0000000490000000-0x00000007ffffffff]
In that case, we can see memory blocks assigned to multiple nodes in
sysfs:
$ ls -l /sys/devices/system/memory/memory21
total 0
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
-rw-r--r-- 1 root root 65536 Aug 24 05:27 online
-r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device
-r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index
drwxr-xr-x 2 root root 0 Aug 24 05:27 power
-r--r--r-- 1 root root 65536 Aug 24 05:27 removable
-rw-r--r-- 1 root root 65536 Aug 24 05:27 state
lrwxrwxrwx 1 root root 0 Aug 24 05:25 subsystem -> ../../../../bus/memory
-rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent
-r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones
The same applies in the node's directory with a memory21 link in both
the node1 and node2's directory.
This is wrong but doesn't prevent the system to run. However when
later, one of these memory blocks is hot-unplugged and then hot-plugged,
the system is detecting an inconsistency in the sysfs layout and a
BUG_ON() is raised:
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
Call Trace:
add_memory_resource+0x23c/0x340 (unreliable)
__add_memory+0x5c/0xf0
dlpar_add_lmb+0x1b4/0x500
dlpar_memory+0x1f8/0xb80
handle_dlpar_errorlog+0xc0/0x190
dlpar_store+0x198/0x4a0
kobj_attr_store+0x30/0x50
sysfs_kf_write+0x64/0x90
kernfs_fop_write+0x1b0/0x290
vfs_write+0xe8/0x290
ksys_write+0xdc/0x130
system_call_exception+0x160/0x270
system_call_common+0xf0/0x27c
This has been seen on PowerPC LPAR.
The root cause of this issue is that when node's memory is registered,
the range used can overlap another node's range, thus the memory block
is registered to multiple nodes in sysfs.
There are two issues here:
(a) The sysfs memory and node's layouts are broken due to these
multiple links
(b) The link errors in link_mem_sections() should not lead to a system
panic.
To address (a) register_mem_sect_under_node should not rely on the
system state to detect whether the link operation is triggered by a hot
plug operation or not. This is addressed by the patches 1 and 2 of this
series.
Issue (b) will be addressed separately.
This patch (of 2):
The memmap_context enum is used to detect whether a memory operation is
due to a hot-add operation or happening at boot time.
Make it general to the hotplug operation and rename it as
meminit_context.
There is no functional change introduced by this patch
Suggested-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J . Wysocki" <rafael(a)kernel.org>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com
Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 0b3fb4c7af29..8e7b8c6c576e 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -538,7 +538,7 @@ virtual_memmap_init(u64 start, u64 end, void *arg)
if (map_start < map_end)
memmap_init_zone((unsigned long)(map_end - map_start),
args->nid, args->zone, page_to_pfn(map_start),
- MEMMAP_EARLY, NULL);
+ MEMINIT_EARLY, NULL);
return 0;
}
@@ -547,8 +547,8 @@ memmap_init (unsigned long size, int nid, unsigned long zone,
unsigned long start_pfn)
{
if (!vmem_map) {
- memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY,
- NULL);
+ memmap_init_zone(size, nid, zone, start_pfn,
+ MEMINIT_EARLY, NULL);
} else {
struct page *start;
struct memmap_init_callback_data args;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b2f370f0b420..48614513eb66 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2416,7 +2416,7 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
extern void set_dma_reserve(unsigned long new_dma_reserve);
extern void memmap_init_zone(unsigned long, int, unsigned long, unsigned long,
- enum memmap_context, struct vmem_altmap *);
+ enum meminit_context, struct vmem_altmap *);
extern void setup_per_zone_wmarks(void);
extern int __meminit init_per_zone_wmark_min(void);
extern void mem_init(void);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 8379432f4f2f..0f7a4ff4b059 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -824,10 +824,15 @@ bool zone_watermark_ok(struct zone *z, unsigned int order,
unsigned int alloc_flags);
bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
unsigned long mark, int highest_zoneidx);
-enum memmap_context {
- MEMMAP_EARLY,
- MEMMAP_HOTPLUG,
+/*
+ * Memory initialization context, use to differentiate memory added by
+ * the platform statically or via memory hotplug interface.
+ */
+enum meminit_context {
+ MEMINIT_EARLY,
+ MEMINIT_HOTPLUG,
};
+
extern void init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
unsigned long size);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b11a269e2356..345308a8bd15 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -729,7 +729,7 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
* are reserved so nobody should be touching them so we should be safe
*/
memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn,
- MEMMAP_HOTPLUG, altmap);
+ MEMINIT_HOTPLUG, altmap);
set_zone_contiguous(zone);
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fab5e97dc9ca..5661fa164f13 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5975,7 +5975,7 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
* done. Non-atomic initialization, single-pass.
*/
void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
- unsigned long start_pfn, enum memmap_context context,
+ unsigned long start_pfn, enum meminit_context context,
struct vmem_altmap *altmap)
{
unsigned long pfn, end_pfn = start_pfn + size;
@@ -6007,7 +6007,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
* There can be holes in boot-time mem_map[]s handed to this
* function. They do not exist on hotplugged memory.
*/
- if (context == MEMMAP_EARLY) {
+ if (context == MEMINIT_EARLY) {
if (overlap_memmap_init(zone, &pfn))
continue;
if (defer_init(nid, pfn, end_pfn))
@@ -6016,7 +6016,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
page = pfn_to_page(pfn);
__init_single_page(page, pfn, zone, nid);
- if (context == MEMMAP_HOTPLUG)
+ if (context == MEMINIT_HOTPLUG)
__SetPageReserved(page);
/*
@@ -6099,7 +6099,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
* check here not to call set_pageblock_migratetype() against
* pfn out of zone.
*
- * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
+ * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
* because this is done early in section_activate()
*/
if (!(pfn & (pageblock_nr_pages - 1))) {
@@ -6137,7 +6137,7 @@ void __meminit __weak memmap_init(unsigned long size, int nid,
if (end_pfn > start_pfn) {
size = end_pfn - start_pfn;
memmap_init_zone(size, nid, zone, start_pfn,
- MEMMAP_EARLY, NULL);
+ MEMINIT_EARLY, NULL);
}
}
}
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ee1dfad5325ff1cfb2239e564cd411b3bfe8667a Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer(a)redhat.com>
Date: Mon, 14 Sep 2020 13:04:19 -0400
Subject: [PATCH] dm: fix bio splitting and its bio completion order for
regular IO
dm_queue_split() is removed because __split_and_process_bio() _must_
handle splitting bios to ensure proper bio submission and completion
ordering as a bio is split.
Otherwise, multiple recursive calls to ->submit_bio will cause multiple
split bios to be allocated from the same ->bio_split mempool at the same
time. This would result in deadlock in low memory conditions because no
progress could be made (only one bio is available in ->bio_split
mempool).
This fix has been verified to still fix the loss of performance, due
to excess splitting, that commit 120c9257f5f1 provided.
Fixes: 120c9257f5f1 ("Revert "dm: always call blk_queue_split() in dm_process_bio()"")
Cc: stable(a)vger.kernel.org # 5.0+, requires custom backport due to 5.9 changes
Reported-by: Ming Lei <ming.lei(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 4a40df8af7d3..d948cd522431 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1724,23 +1724,6 @@ static blk_qc_t __process_bio(struct mapped_device *md, struct dm_table *map,
return ret;
}
-static void dm_queue_split(struct mapped_device *md, struct dm_target *ti, struct bio **bio)
-{
- unsigned len, sector_count;
-
- sector_count = bio_sectors(*bio);
- len = min_t(sector_t, max_io_len((*bio)->bi_iter.bi_sector, ti), sector_count);
-
- if (sector_count > len) {
- struct bio *split = bio_split(*bio, len, GFP_NOIO, &md->queue->bio_split);
-
- bio_chain(split, *bio);
- trace_block_split(md->queue, split, (*bio)->bi_iter.bi_sector);
- submit_bio_noacct(*bio);
- *bio = split;
- }
-}
-
static blk_qc_t dm_process_bio(struct mapped_device *md,
struct dm_table *map, struct bio *bio)
{
@@ -1768,14 +1751,12 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
if (current->bio_list) {
if (is_abnormal_io(bio))
blk_queue_split(&bio);
- else
- dm_queue_split(md, ti, &bio);
+ /* regular IO is split by __split_and_process_bio */
}
if (dm_get_md_type(md) == DM_TYPE_NVME_BIO_BASED)
return __process_bio(md, map, bio, ti);
- else
- return __split_and_process_bio(md, map, bio);
+ return __split_and_process_bio(md, map, bio);
}
static blk_qc_t dm_submit_bio(struct bio *bio)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c4ad98e4b72cb5be30ea282fce935248f2300e62 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Tue, 15 Sep 2020 11:42:17 +0100
Subject: [PATCH] KVM: arm64: Assume write fault on S1PTW permission fault on
instruction fetch
KVM currently assumes that an instruction abort can never be a write.
This is in general true, except when the abort is triggered by
a S1PTW on instruction fetch that tries to update the S1 page tables
(to set AF, for example).
This can happen if the page tables have been paged out and brought
back in without seeing a direct write to them (they are thus marked
read only), and the fault handling code will make the PT executable(!)
instead of writable. The guest gets stuck forever.
In these conditions, the permission fault must be considered as
a write so that the Stage-1 update can take place. This is essentially
the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM:
Take S1 walks into account when determining S2 write faults").
Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce
kvm_vcpu_trap_is_exec_fault() that only return true when no faulting
on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to
kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't
specific to data abort.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Reviewed-by: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200915104218.1284701-2-maz@kernel.org
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 49a55be2b9a2..4f618af660ba 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -298,7 +298,7 @@ static __always_inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu)
return (kvm_vcpu_get_esr(vcpu) & ESR_ELx_SRT_MASK) >> ESR_ELx_SRT_SHIFT;
}
-static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_abt_iss1tw(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_S1PTW);
}
@@ -306,7 +306,7 @@ static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
static __always_inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_WNR) ||
- kvm_vcpu_dabt_iss1tw(vcpu); /* AF/DBM update */
+ kvm_vcpu_abt_iss1tw(vcpu); /* AF/DBM update */
}
static inline bool kvm_vcpu_dabt_is_cm(const struct kvm_vcpu *vcpu)
@@ -335,6 +335,11 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
}
+static inline bool kvm_vcpu_trap_is_exec_fault(const struct kvm_vcpu *vcpu)
+{
+ return kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu);
+}
+
static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
{
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC;
@@ -372,6 +377,9 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
{
+ if (kvm_vcpu_abt_iss1tw(vcpu))
+ return true;
+
if (kvm_vcpu_trap_is_iabt(vcpu))
return false;
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 426ef65601dd..d64c5d56c860 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -445,7 +445,7 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
kvm_vcpu_dabt_isvalid(vcpu) &&
!kvm_vcpu_abt_issea(vcpu) &&
- !kvm_vcpu_dabt_iss1tw(vcpu);
+ !kvm_vcpu_abt_iss1tw(vcpu);
if (valid) {
int ret = __vgic_v2_perform_cpuif_access(vcpu);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f58d657a898d..9aec1ce491d2 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1843,7 +1843,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
struct kvm_s2_mmu *mmu = vcpu->arch.hw_mmu;
write_fault = kvm_is_write_fault(vcpu);
- exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
+ exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
VM_BUG_ON(write_fault && exec_fault);
if (fault_status == FSC_PERM && !write_fault && !exec_fault) {
@@ -2125,7 +2125,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
goto out;
}
- if (kvm_vcpu_dabt_iss1tw(vcpu)) {
+ if (kvm_vcpu_abt_iss1tw(vcpu)) {
kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
ret = 1;
goto out_unlock;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c4ad98e4b72cb5be30ea282fce935248f2300e62 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Tue, 15 Sep 2020 11:42:17 +0100
Subject: [PATCH] KVM: arm64: Assume write fault on S1PTW permission fault on
instruction fetch
KVM currently assumes that an instruction abort can never be a write.
This is in general true, except when the abort is triggered by
a S1PTW on instruction fetch that tries to update the S1 page tables
(to set AF, for example).
This can happen if the page tables have been paged out and brought
back in without seeing a direct write to them (they are thus marked
read only), and the fault handling code will make the PT executable(!)
instead of writable. The guest gets stuck forever.
In these conditions, the permission fault must be considered as
a write so that the Stage-1 update can take place. This is essentially
the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM:
Take S1 walks into account when determining S2 write faults").
Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce
kvm_vcpu_trap_is_exec_fault() that only return true when no faulting
on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to
kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't
specific to data abort.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Reviewed-by: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200915104218.1284701-2-maz@kernel.org
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 49a55be2b9a2..4f618af660ba 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -298,7 +298,7 @@ static __always_inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu)
return (kvm_vcpu_get_esr(vcpu) & ESR_ELx_SRT_MASK) >> ESR_ELx_SRT_SHIFT;
}
-static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_abt_iss1tw(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_S1PTW);
}
@@ -306,7 +306,7 @@ static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
static __always_inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_WNR) ||
- kvm_vcpu_dabt_iss1tw(vcpu); /* AF/DBM update */
+ kvm_vcpu_abt_iss1tw(vcpu); /* AF/DBM update */
}
static inline bool kvm_vcpu_dabt_is_cm(const struct kvm_vcpu *vcpu)
@@ -335,6 +335,11 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
}
+static inline bool kvm_vcpu_trap_is_exec_fault(const struct kvm_vcpu *vcpu)
+{
+ return kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu);
+}
+
static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
{
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC;
@@ -372,6 +377,9 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
{
+ if (kvm_vcpu_abt_iss1tw(vcpu))
+ return true;
+
if (kvm_vcpu_trap_is_iabt(vcpu))
return false;
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 426ef65601dd..d64c5d56c860 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -445,7 +445,7 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
kvm_vcpu_dabt_isvalid(vcpu) &&
!kvm_vcpu_abt_issea(vcpu) &&
- !kvm_vcpu_dabt_iss1tw(vcpu);
+ !kvm_vcpu_abt_iss1tw(vcpu);
if (valid) {
int ret = __vgic_v2_perform_cpuif_access(vcpu);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f58d657a898d..9aec1ce491d2 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1843,7 +1843,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
struct kvm_s2_mmu *mmu = vcpu->arch.hw_mmu;
write_fault = kvm_is_write_fault(vcpu);
- exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
+ exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
VM_BUG_ON(write_fault && exec_fault);
if (fault_status == FSC_PERM && !write_fault && !exec_fault) {
@@ -2125,7 +2125,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
goto out;
}
- if (kvm_vcpu_dabt_iss1tw(vcpu)) {
+ if (kvm_vcpu_abt_iss1tw(vcpu)) {
kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
ret = 1;
goto out_unlock;
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c4ad98e4b72cb5be30ea282fce935248f2300e62 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Tue, 15 Sep 2020 11:42:17 +0100
Subject: [PATCH] KVM: arm64: Assume write fault on S1PTW permission fault on
instruction fetch
KVM currently assumes that an instruction abort can never be a write.
This is in general true, except when the abort is triggered by
a S1PTW on instruction fetch that tries to update the S1 page tables
(to set AF, for example).
This can happen if the page tables have been paged out and brought
back in without seeing a direct write to them (they are thus marked
read only), and the fault handling code will make the PT executable(!)
instead of writable. The guest gets stuck forever.
In these conditions, the permission fault must be considered as
a write so that the Stage-1 update can take place. This is essentially
the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM:
Take S1 walks into account when determining S2 write faults").
Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce
kvm_vcpu_trap_is_exec_fault() that only return true when no faulting
on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to
kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't
specific to data abort.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Reviewed-by: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200915104218.1284701-2-maz@kernel.org
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 49a55be2b9a2..4f618af660ba 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -298,7 +298,7 @@ static __always_inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu)
return (kvm_vcpu_get_esr(vcpu) & ESR_ELx_SRT_MASK) >> ESR_ELx_SRT_SHIFT;
}
-static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_abt_iss1tw(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_S1PTW);
}
@@ -306,7 +306,7 @@ static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
static __always_inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_WNR) ||
- kvm_vcpu_dabt_iss1tw(vcpu); /* AF/DBM update */
+ kvm_vcpu_abt_iss1tw(vcpu); /* AF/DBM update */
}
static inline bool kvm_vcpu_dabt_is_cm(const struct kvm_vcpu *vcpu)
@@ -335,6 +335,11 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
}
+static inline bool kvm_vcpu_trap_is_exec_fault(const struct kvm_vcpu *vcpu)
+{
+ return kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu);
+}
+
static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
{
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC;
@@ -372,6 +377,9 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
{
+ if (kvm_vcpu_abt_iss1tw(vcpu))
+ return true;
+
if (kvm_vcpu_trap_is_iabt(vcpu))
return false;
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 426ef65601dd..d64c5d56c860 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -445,7 +445,7 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
kvm_vcpu_dabt_isvalid(vcpu) &&
!kvm_vcpu_abt_issea(vcpu) &&
- !kvm_vcpu_dabt_iss1tw(vcpu);
+ !kvm_vcpu_abt_iss1tw(vcpu);
if (valid) {
int ret = __vgic_v2_perform_cpuif_access(vcpu);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f58d657a898d..9aec1ce491d2 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1843,7 +1843,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
struct kvm_s2_mmu *mmu = vcpu->arch.hw_mmu;
write_fault = kvm_is_write_fault(vcpu);
- exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
+ exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
VM_BUG_ON(write_fault && exec_fault);
if (fault_status == FSC_PERM && !write_fault && !exec_fault) {
@@ -2125,7 +2125,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
goto out;
}
- if (kvm_vcpu_dabt_iss1tw(vcpu)) {
+ if (kvm_vcpu_abt_iss1tw(vcpu)) {
kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
ret = 1;
goto out_unlock;
As Eric pointed out, this patch should not be applied. There were 2
patches addressing the very same leak and it seems one of them already
landed and this one is being applied on top of it, so in practice, this
patch is producing a double free.
Iago
On Sun, 2020-09-27 at 13:52 -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> drm/v3d: don't leak bin job if v3d_job_init fails.
>
> to the 5.4-stable tree which can be found at:
>
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> drm-v3d-don-t-leak-bin-job-if-v3d_job_init-fails.patch
> and it can be found in the queue-5.4 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable
> tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit cd0324f318e4447471e0bb01d4bea4f5de4aab1e
> Author: Iago Toral Quiroga <itoral(a)igalia.com>
> Date: Mon Sep 16 09:11:25 2019 +0200
>
> drm/v3d: don't leak bin job if v3d_job_init fails.
>
> [ Upstream commit 0d352a3a8a1f26168d09f7073e61bb4b328e3bb9 ]
>
> If the initialization of the job fails we need to kfree() it
> before returning.
>
> Signed-off-by: Iago Toral Quiroga <itoral(a)igalia.com>
> Signed-off-by: Eric Anholt <eric(a)anholt.net>
> Link:
> https://patchwork.freedesktop.org/patch/msgid/20190916071125.5255-1-itoral@…
> Fixes: a783a09ee76d ("drm/v3d: Refactor job management.")
> Reviewed-by: Eric Anholt <eric(a)anholt.net>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c
> b/drivers/gpu/drm/v3d/v3d_gem.c
> index 19c092d75266b..6316bf3646af5 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -565,6 +565,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void
> *data,
> ret = v3d_job_init(v3d, file_priv, &bin->base,
> v3d_job_free, args->in_sync_bcl);
> if (ret) {
> + kfree(bin);
> v3d_job_put(&render->base);
> kfree(bin);
> return ret;
>
This is a note to let you know that I've just added the patch titled
vt_ioctl: make VT_RESIZEX behave like VT_RESIZE
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 988d0763361bb65690d60e2bc53a6b72777040c3 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp>
Date: Sun, 27 Sep 2020 20:46:30 +0900
Subject: vt_ioctl: make VT_RESIZEX behave like VT_RESIZE
syzbot is reporting UAF/OOB read at bit_putcs()/soft_cursor() [1][2], for
vt_resizex() from ioctl(VT_RESIZEX) allows setting font height larger than
actual font height calculated by con_font_set() from ioctl(PIO_FONT).
Since fbcon_set_font() from con_font_set() allocates minimal amount of
memory based on actual font height calculated by con_font_set(),
use of vt_resizex() can cause UAF/OOB read for font data.
VT_RESIZEX was introduced in Linux 1.3.3, but it is unclear that what
comes to the "+ more" part, and I couldn't find a user of VT_RESIZEX.
#define VT_RESIZE 0x5609 /* set kernel's idea of screensize */
#define VT_RESIZEX 0x560A /* set kernel's idea of screensize + more */
So far we are not aware of syzbot reports caused by setting non-zero value
to v_vlin parameter. But given that it is possible that nobody is using
VT_RESIZEX, we can try removing support for v_clin and v_vlin parameters.
Therefore, this patch effectively makes VT_RESIZEX behave like VT_RESIZE,
with emitting a message if somebody is still using v_clin and/or v_vlin
parameters.
[1] https://syzkaller.appspot.com/bug?id=32577e96d88447ded2d3b76d71254fb8552458…
[2] https://syzkaller.appspot.com/bug?id=6b8355d27b2b94fb5cedf4655e3a59162d9e48…
Reported-by: syzbot <syzbot+b308f5fd049fbbc6e74f(a)syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+16469b5e8e5a72e9131e(a)syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/4933b81b-9b1a-355b-df0e-9b31e8280ab9@i-love.sakur…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/vt/vt_ioctl.c | 57 +++++++--------------------------------
1 file changed, 10 insertions(+), 47 deletions(-)
diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
index 2ea76a09e07f..0a33b8ababe3 100644
--- a/drivers/tty/vt/vt_ioctl.c
+++ b/drivers/tty/vt/vt_ioctl.c
@@ -772,58 +772,21 @@ static int vt_resizex(struct vc_data *vc, struct vt_consize __user *cs)
if (copy_from_user(&v, cs, sizeof(struct vt_consize)))
return -EFAULT;
- /* FIXME: Should check the copies properly */
- if (!v.v_vlin)
- v.v_vlin = vc->vc_scan_lines;
-
- if (v.v_clin) {
- int rows = v.v_vlin / v.v_clin;
- if (v.v_rows != rows) {
- if (v.v_rows) /* Parameters don't add up */
- return -EINVAL;
- v.v_rows = rows;
- }
- }
-
- if (v.v_vcol && v.v_ccol) {
- int cols = v.v_vcol / v.v_ccol;
- if (v.v_cols != cols) {
- if (v.v_cols)
- return -EINVAL;
- v.v_cols = cols;
- }
- }
-
- if (v.v_clin > 32)
- return -EINVAL;
+ if (v.v_vlin)
+ pr_info_once("\"struct vt_consize\"->v_vlin is ignored. Please report if you need this.\n");
+ if (v.v_clin)
+ pr_info_once("\"struct vt_consize\"->v_clin is ignored. Please report if you need this.\n");
+ console_lock();
for (i = 0; i < MAX_NR_CONSOLES; i++) {
- struct vc_data *vcp;
+ vc = vc_cons[i].d;
- if (!vc_cons[i].d)
- continue;
- console_lock();
- vcp = vc_cons[i].d;
- if (vcp) {
- int ret;
- int save_scan_lines = vcp->vc_scan_lines;
- int save_font_height = vcp->vc_font.height;
-
- if (v.v_vlin)
- vcp->vc_scan_lines = v.v_vlin;
- if (v.v_clin)
- vcp->vc_font.height = v.v_clin;
- vcp->vc_resize_user = 1;
- ret = vc_resize(vcp, v.v_cols, v.v_rows);
- if (ret) {
- vcp->vc_scan_lines = save_scan_lines;
- vcp->vc_font.height = save_font_height;
- console_unlock();
- return ret;
- }
+ if (vc) {
+ vc->vc_resize_user = 1;
+ vc_resize(vc, v.v_cols, v.v_rows);
}
- console_unlock();
}
+ console_unlock();
return 0;
}
--
2.28.0
musb_queue_resume_work() would call the provided callback if the runtime
PM status was 'active'. Otherwise, it would enqueue the request if the
hardware was still suspended (musb->is_runtime_suspended is true).
This causes a race with the runtime PM handlers, as it is possible to be
in the case where the runtime PM status is not yet 'active', but the
hardware has been awaken (PM resume function has been called).
When hitting the race, the resume work was not enqueued, which probably
triggered other bugs further down the stack. For instance, a telnet
connection on Ingenic SoCs would result in a 50/50 chance of a
segmentation fault somewhere in the musb code.
Rework the code so that either we call the callback directly if
(musb->is_runtime_suspended == 0), or enqueue the query otherwise.
Fixes: ea2f35c01d5e ("usb: musb: Fix sleeping function called from invalid context for hdrc glue")
Cc: stable(a)vger.kernel.org # v4.9
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
---
drivers/usb/musb/musb_core.c | 31 +++++++++++++++++--------------
1 file changed, 17 insertions(+), 14 deletions(-)
diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c
index 384a8039a7fd..462c10d7455a 100644
--- a/drivers/usb/musb/musb_core.c
+++ b/drivers/usb/musb/musb_core.c
@@ -2241,32 +2241,35 @@ int musb_queue_resume_work(struct musb *musb,
{
struct musb_pending_work *w;
unsigned long flags;
+ bool is_suspended;
int error;
if (WARN_ON(!callback))
return -EINVAL;
- if (pm_runtime_active(musb->controller))
- return callback(musb, data);
+ spin_lock_irqsave(&musb->list_lock, flags);
+ is_suspended = musb->is_runtime_suspended;
+
+ if (is_suspended) {
+ w = devm_kzalloc(musb->controller, sizeof(*w), GFP_ATOMIC);
+ if (!w) {
+ error = -ENOMEM;
+ goto out_unlock;
+ }
- w = devm_kzalloc(musb->controller, sizeof(*w), GFP_ATOMIC);
- if (!w)
- return -ENOMEM;
+ w->callback = callback;
+ w->data = data;
- w->callback = callback;
- w->data = data;
- spin_lock_irqsave(&musb->list_lock, flags);
- if (musb->is_runtime_suspended) {
list_add_tail(&w->node, &musb->pending_list);
error = 0;
- } else {
- dev_err(musb->controller, "could not add resume work %p\n",
- callback);
- devm_kfree(musb->controller, w);
- error = -EINPROGRESS;
}
+
+out_unlock:
spin_unlock_irqrestore(&musb->list_lock, flags);
+ if (!is_suspended)
+ error = callback(musb, data);
+
return error;
}
EXPORT_SYMBOL_GPL(musb_queue_resume_work);
--
2.28.0
Hello Dear Friendhow are you doing? , I do hope that you are swimming in the ocean of health.My name is AYAZ IBRAHIM I own a Mining company and have been into Gold mining with 25 years Experience.I have 99.99% 23 carat Gold been deposited in safekeeping company, the total kilogram is 300 Kilogram, I am looking for a potential buy that will buy the Gold at good price .Gold global price for 23 carat 1 kilogram is approximately $43,000.00 USD, I have decided to sale the 300 kilograms of Gold to a potential buyer at the selling rate of USD$29,000.00 for one KG, this is to enable me raise the quick finance that I need to put into my plant project in other to facilitate the project .
I Hope to hear from you ASAP so we can proceed further.With RegardsMR. AYAZ IBRAHIM
The use of PHY_REFCLK_USE_PAD introduced a regression for apq8064
devices. It was tested that while apq doesn't require the padding, ipq
SoC must use it or the kernel hangs on boot.
Fixes: de3c4bf6489 ("PCI: qcom: Add support for tx term offset for rev 2.1.0")
Reported-by: Ilia Mirkin <imirkin(a)alum.mit.edu>
Signed-off-by: Ilia Mirkin <imirkin(a)alum.mit.edu>
Signed-off-by: Ansuel Smith <ansuelsmth(a)gmail.com>
Cc: stable(a)vger.kernel.org # v4.19+
---
drivers/pci/controller/dwc/pcie-qcom.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 3aac77a295ba..dad6e9ce66ba 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -387,7 +387,9 @@ static int qcom_pcie_init_2_1_0(struct qcom_pcie *pcie)
/* enable external reference clock */
val = readl(pcie->parf + PCIE20_PARF_PHY_REFCLK);
- val &= ~PHY_REFCLK_USE_PAD;
+ /* USE_PAD is required only for ipq806x */
+ if (!of_device_is_compatible(node, "qcom,pcie-apq8064"))
+ val &= ~PHY_REFCLK_USE_PAD;
val |= PHY_REFCLK_SSP_EN;
writel(val, pcie->parf + PCIE20_PARF_PHY_REFCLK);
--
2.27.0
If interrupt comes late, during probe error path or device remove (could
be triggered with CONFIG_DEBUG_SHIRQ), the interrupt handler
i2c_imx_isr() will access registers with the clock being disabled. This
leads to external abort on non-linefetch on Toradex Colibri VF50 module
(with Vybrid VF5xx):
Unhandled fault: external abort on non-linefetch (0x1008) at 0x8882d003
Internal error: : 1008 [#1] ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 5.7.0 #607
Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
(i2c_imx_isr) from [<8017009c>] (free_irq+0x25c/0x3b0)
(free_irq) from [<805844ec>] (release_nodes+0x178/0x284)
(release_nodes) from [<80580030>] (really_probe+0x10c/0x348)
(really_probe) from [<80580380>] (driver_probe_device+0x60/0x170)
(driver_probe_device) from [<80580630>] (device_driver_attach+0x58/0x60)
(device_driver_attach) from [<805806bc>] (__driver_attach+0x84/0xc0)
(__driver_attach) from [<8057e228>] (bus_for_each_dev+0x68/0xb4)
(bus_for_each_dev) from [<8057f3ec>] (bus_add_driver+0x144/0x1ec)
(bus_add_driver) from [<80581320>] (driver_register+0x78/0x110)
(driver_register) from [<8010213c>] (do_one_initcall+0xa8/0x2f4)
(do_one_initcall) from [<80c0100c>] (kernel_init_freeable+0x178/0x1dc)
(kernel_init_freeable) from [<80807048>] (kernel_init+0x8/0x110)
(kernel_init) from [<80100114>] (ret_from_fork+0x14/0x20)
Additionally, the i2c_imx_isr() could wake up the wait queue
(imx_i2c_struct->queue) before its initialization happens.
The resource-managed framework should not be used for interrupt handling,
because the resource will be released too late - after disabling clocks.
The interrupt handler is not prepared for such case.
Fixes: 1c4b6c3bcf30 ("i2c: imx: implement bus recovery")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzk(a)kernel.org>
---
Changes since v2:
1. Rebase
Changes since v1:
1. Remove the devm- and use regular methods.
---
drivers/i2c/busses/i2c-imx.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index 63f4367c312b..c98529c76348 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -1169,14 +1169,6 @@ static int i2c_imx_probe(struct platform_device *pdev)
return ret;
}
- /* Request IRQ */
- ret = devm_request_irq(&pdev->dev, irq, i2c_imx_isr, IRQF_SHARED,
- pdev->name, i2c_imx);
- if (ret) {
- dev_err(&pdev->dev, "can't claim irq %d\n", irq);
- goto clk_disable;
- }
-
/* Init queue */
init_waitqueue_head(&i2c_imx->queue);
@@ -1195,6 +1187,14 @@ static int i2c_imx_probe(struct platform_device *pdev)
if (ret < 0)
goto rpm_disable;
+ /* Request IRQ */
+ ret = request_threaded_irq(irq, i2c_imx_isr, NULL, IRQF_SHARED,
+ pdev->name, i2c_imx);
+ if (ret) {
+ dev_err(&pdev->dev, "can't claim irq %d\n", irq);
+ goto rpm_disable;
+ }
+
/* Set up clock divider */
i2c_imx->bitrate = I2C_MAX_STANDARD_MODE_FREQ;
ret = of_property_read_u32(pdev->dev.of_node,
@@ -1237,13 +1237,12 @@ static int i2c_imx_probe(struct platform_device *pdev)
clk_notifier_unregister:
clk_notifier_unregister(i2c_imx->clk, &i2c_imx->clk_change_nb);
+ free_irq(irq, i2c_imx);
rpm_disable:
pm_runtime_put_noidle(&pdev->dev);
pm_runtime_disable(&pdev->dev);
pm_runtime_set_suspended(&pdev->dev);
pm_runtime_dont_use_autosuspend(&pdev->dev);
-
-clk_disable:
clk_disable_unprepare(i2c_imx->clk);
return ret;
}
@@ -1251,7 +1250,7 @@ static int i2c_imx_probe(struct platform_device *pdev)
static int i2c_imx_remove(struct platform_device *pdev)
{
struct imx_i2c_struct *i2c_imx = platform_get_drvdata(pdev);
- int ret;
+ int irq, ret;
ret = pm_runtime_get_sync(&pdev->dev);
if (ret < 0)
@@ -1271,6 +1270,9 @@ static int i2c_imx_remove(struct platform_device *pdev)
imx_i2c_write_reg(0, i2c_imx, IMX_I2C_I2SR);
clk_notifier_unregister(i2c_imx->clk, &i2c_imx->clk_change_nb);
+ irq = platform_get_irq(pdev, 0);
+ if (irq >= 0)
+ free_irq(irq, i2c_imx);
clk_disable_unprepare(i2c_imx->clk);
pm_runtime_put_noidle(&pdev->dev);
--
2.17.1
The original problem was from nvme-over-tcp code, who mistakenly uses
kernel_sendpage() to send pages allocated by __get_free_pages() without
__GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
tail pages, sending them by kernel_sendpage() may trigger a kernel panic
from a corrupted kernel heap, because these pages are incorrectly freed
in network stack as page_count 0 pages.
This patch introduces a helper sendpage_ok(), it returns true if the
checking page,
- is not slab page: PageSlab(page) is false.
- has page refcount: page_count(page) is not zero
All drivers who want to send page to remote end by kernel_sendpage()
may use this helper to check whether the page is OK. If the helper does
not return true, the driver should try other non sendpage method (e.g.
sock_no_sendpage()) to handle the page.
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Jan Kara <jack(a)suse.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy(a)solarflare.com>
Cc: Philipp Reisner <philipp.reisner(a)linbit.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Vlastimil Babka <vbabka(a)suse.com>
Cc: stable(a)vger.kernel.org
---
include/linux/net.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/linux/net.h b/include/linux/net.h
index d48ff1180879..05db8690f67e 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -21,6 +21,7 @@
#include <linux/rcupdate.h>
#include <linux/once.h>
#include <linux/fs.h>
+#include <linux/mm.h>
#include <linux/sockptr.h>
#include <uapi/linux/net.h>
@@ -286,6 +287,21 @@ do { \
#define net_get_random_once_wait(buf, nbytes) \
get_random_once_wait((buf), (nbytes))
+/*
+ * E.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+static inline bool sendpage_ok(struct page *page)
+{
+ return !PageSlab(page) && page_count(page) >= 1;
+}
+
int kernel_sendmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec,
size_t num, size_t len);
int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
--
2.26.2
This is a note to let you know that I've just added the patch titled
vt_ioctl: make VT_RESIZEX behave like VT_RESIZE
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 988d0763361bb65690d60e2bc53a6b72777040c3 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp>
Date: Sun, 27 Sep 2020 20:46:30 +0900
Subject: vt_ioctl: make VT_RESIZEX behave like VT_RESIZE
syzbot is reporting UAF/OOB read at bit_putcs()/soft_cursor() [1][2], for
vt_resizex() from ioctl(VT_RESIZEX) allows setting font height larger than
actual font height calculated by con_font_set() from ioctl(PIO_FONT).
Since fbcon_set_font() from con_font_set() allocates minimal amount of
memory based on actual font height calculated by con_font_set(),
use of vt_resizex() can cause UAF/OOB read for font data.
VT_RESIZEX was introduced in Linux 1.3.3, but it is unclear that what
comes to the "+ more" part, and I couldn't find a user of VT_RESIZEX.
#define VT_RESIZE 0x5609 /* set kernel's idea of screensize */
#define VT_RESIZEX 0x560A /* set kernel's idea of screensize + more */
So far we are not aware of syzbot reports caused by setting non-zero value
to v_vlin parameter. But given that it is possible that nobody is using
VT_RESIZEX, we can try removing support for v_clin and v_vlin parameters.
Therefore, this patch effectively makes VT_RESIZEX behave like VT_RESIZE,
with emitting a message if somebody is still using v_clin and/or v_vlin
parameters.
[1] https://syzkaller.appspot.com/bug?id=32577e96d88447ded2d3b76d71254fb8552458…
[2] https://syzkaller.appspot.com/bug?id=6b8355d27b2b94fb5cedf4655e3a59162d9e48…
Reported-by: syzbot <syzbot+b308f5fd049fbbc6e74f(a)syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+16469b5e8e5a72e9131e(a)syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/4933b81b-9b1a-355b-df0e-9b31e8280ab9@i-love.sakur…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/vt/vt_ioctl.c | 57 +++++++--------------------------------
1 file changed, 10 insertions(+), 47 deletions(-)
diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
index 2ea76a09e07f..0a33b8ababe3 100644
--- a/drivers/tty/vt/vt_ioctl.c
+++ b/drivers/tty/vt/vt_ioctl.c
@@ -772,58 +772,21 @@ static int vt_resizex(struct vc_data *vc, struct vt_consize __user *cs)
if (copy_from_user(&v, cs, sizeof(struct vt_consize)))
return -EFAULT;
- /* FIXME: Should check the copies properly */
- if (!v.v_vlin)
- v.v_vlin = vc->vc_scan_lines;
-
- if (v.v_clin) {
- int rows = v.v_vlin / v.v_clin;
- if (v.v_rows != rows) {
- if (v.v_rows) /* Parameters don't add up */
- return -EINVAL;
- v.v_rows = rows;
- }
- }
-
- if (v.v_vcol && v.v_ccol) {
- int cols = v.v_vcol / v.v_ccol;
- if (v.v_cols != cols) {
- if (v.v_cols)
- return -EINVAL;
- v.v_cols = cols;
- }
- }
-
- if (v.v_clin > 32)
- return -EINVAL;
+ if (v.v_vlin)
+ pr_info_once("\"struct vt_consize\"->v_vlin is ignored. Please report if you need this.\n");
+ if (v.v_clin)
+ pr_info_once("\"struct vt_consize\"->v_clin is ignored. Please report if you need this.\n");
+ console_lock();
for (i = 0; i < MAX_NR_CONSOLES; i++) {
- struct vc_data *vcp;
+ vc = vc_cons[i].d;
- if (!vc_cons[i].d)
- continue;
- console_lock();
- vcp = vc_cons[i].d;
- if (vcp) {
- int ret;
- int save_scan_lines = vcp->vc_scan_lines;
- int save_font_height = vcp->vc_font.height;
-
- if (v.v_vlin)
- vcp->vc_scan_lines = v.v_vlin;
- if (v.v_clin)
- vcp->vc_font.height = v.v_clin;
- vcp->vc_resize_user = 1;
- ret = vc_resize(vcp, v.v_cols, v.v_rows);
- if (ret) {
- vcp->vc_scan_lines = save_scan_lines;
- vcp->vc_font.height = save_font_height;
- console_unlock();
- return ret;
- }
+ if (vc) {
+ vc->vc_resize_user = 1;
+ vc_resize(vc, v.v_cols, v.v_rows);
}
- console_unlock();
}
+ console_unlock();
return 0;
}
--
2.28.0
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 7ed24f1e8cf1 - net/mlx5e: Fix endianness when calculating pedit mask first bit
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ❌ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 0c52d2a4d9e9 - Linux 5.8.12
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
This is the start of the stable review cycle for the 5.8.12 release.
There are 56 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 27 Sep 2020 12:47:02 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.12-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.8.12-rc1
Maor Dickman <maord(a)nvidia.com>
net/mlx5e: Fix endianness when calculating pedit mask first bit
Maxim Mikityanskiy <maximmi(a)mellanox.com>
net/mlx5e: Use synchronize_rcu to sync with NAPI
Maxim Mikityanskiy <maximmi(a)mellanox.com>
net/mlx5e: Use RCU to protect rq->xdp_prog
Taehee Yoo <ap420073(a)gmail.com>
Revert "netns: don't disable BHs when locking "nsid_lock""
Parshuram Thombare <pthombar(a)cadence.com>
net: macb: fix for pause frame receive enable bit
Matthias Schiffer <matthias.schiffer(a)ew.tq-group.com>
net: dsa: microchip: ksz8795: really set the correct number of ports
Vladimir Oltean <olteanv(a)gmail.com>
net: dsa: link interfaces with the DSA master to get rid of lockdep warnings
Dexuan Cui <decui(a)microsoft.com>
hv_netvsc: Fix hibernation for mlx5 VF driver
Luo bin <luobin9(a)huawei.com>
hinic: fix rewaking txq after netif_tx_disable
Jianbo Liu <jianbol(a)mellanox.com>
net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
Vadym Kochan <vadym.kochan(a)plvision.eu>
net: ipa: fix u32_replace_bits by u32p_xxx version
Jason A. Donenfeld <Jason(a)zx2c4.com>
wireguard: peerlookup: take lock before checking hash in replace operation
Jason A. Donenfeld <Jason(a)zx2c4.com>
wireguard: noise: take lock when removing handshake entry from table
Grygorii Strashko <grygorii.strashko(a)ti.com>
net: ethernet: ti: cpsw_new: fix suspend/resume
Eric Dumazet <edumazet(a)google.com>
net: add __must_check to skb_put_padto()
Eric Dumazet <edumazet(a)google.com>
net: qrtr: check skb_put_padto() return value
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: Do not warn in phy_stop() on PHY_DOWN
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: Avoid NPD upon phy_detach() when driver is unbound
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Disable IRQs only if NAPI gets scheduled
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Use napi_complete_done()
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: use netif_tx_napi_add() for TX NAPI
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Wake TX queue again
Michael Chan <michael.chan(a)broadcom.com>
bnxt_en: Protect bnxt_set_eee() and bnxt_set_pauseparam() with mutex.
Edwin Peer <edwin.peer(a)broadcom.com>
bnxt_en: return proper error codes in bnxt_show_temp
Vasundhara Volam <vasundhara-v.volam(a)broadcom.com>
bnxt_en: Use memcpy to copy VPD field info.
Tariq Toukan <tariqt(a)mellanox.com>
net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
Maor Dickman <maord(a)mellanox.com>
net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
Xin Long <lucien.xin(a)gmail.com>
tipc: use skb_unshare() instead in tipc_buf_append()
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
tipc: fix shutdown() of connection oriented socket
Peilin Ye <yepeilin.cs(a)gmail.com>
tipc: Fix memory leak in tipc_group_create_member()
Vinicius Costa Gomes <vinicius.gomes(a)intel.com>
taprio: Fix allowing too small intervals
Jakub Kicinski <kuba(a)kernel.org>
nfp: use correct define to return NONE fec
Henry Ptasinski <hptasinski(a)google.com>
net: sctp: Fix IPv6 ancestor_size calc in sctp_copy_descendant
Yunsheng Lin <linyunsheng(a)huawei.com>
net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
Xin Long <lucien.xin(a)gmail.com>
net: sched: initialize with 0 before setting erspan md->u
Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
net: phy: call phy_disable_interrupts() in phy_attach_direct() instead
Maor Gottlieb <maorg(a)nvidia.com>
net/mlx5: Fix FTE cleanup
Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC
Ido Schimmel <idosch(a)nvidia.com>
net: Fix bridge enslavement failure
Linus Walleij <linus.walleij(a)linaro.org>
net: dsa: rtl8366: Properly clear member config
Petr Machata <petrm(a)nvidia.com>
net: DCB: Validate DCB_ATTR_DCB_BUFFER argument
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
Eric Dumazet <edumazet(a)google.com>
ipv6: avoid lockdep issue in fib6_del()
David Ahern <dsahern(a)kernel.org>
ipv4: Update exception handling for multipath routes via same device
David Ahern <dsahern(a)gmail.com>
ipv4: Initialize flowi4_multipath_hash in data path
Wei Wang <weiwan(a)google.com>
ip: fix tos reflection in ack and reset packets
Luo bin <luobin9(a)huawei.com>
hinic: bump up the timeout of SET_FUNC_STATE cmd
Dan Carpenter <dan.carpenter(a)oracle.com>
hdlc_ppp: add range checks in ppp_cp_parse_cr()
Mark Gray <mark.d.gray(a)redhat.com>
geneve: add transport ports in route lookup for geneve
Ganji Aravind <ganji.aravind(a)chelsio.com>
cxgb4: Fix offset when clearing filter byte counters
Raju Rangoju <rajur(a)chelsio.com>
cxgb4: fix memory leak during module unload
Vasundhara Volam <vasundhara-v.volam(a)broadcom.com>
bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task()
Vasundhara Volam <vasundhara-v.volam(a)broadcom.com>
bnxt_en: Avoid sending firmware messages when AER error is detected.
Cong Wang <xiyou.wangcong(a)gmail.com>
act_ife: load meta modules before tcf_idr_check_alloc()
Jakub Kicinski <kuba(a)kernel.org>
ibmvnic: add missing parenthesis in do_reset()
Mingming Cao <mmc(a)linux.vnet.ibm.com>
ibmvnic fix NULL tx_pools and rx_tools issue at do_reset
-------------
Diffstat:
Makefile | 4 +-
drivers/net/dsa/microchip/ksz8795.c | 2 +-
drivers/net/dsa/rtl8366.c | 20 ++++---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 40 ++++++++-----
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 ++
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 31 +++++++----
drivers/net/ethernet/cadence/macb_main.c | 3 +-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 9 ++-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_mps.c | 2 +-
drivers/net/ethernet/huawei/hinic/hinic_hw_mgmt.c | 16 ++++--
drivers/net/ethernet/huawei/hinic/hinic_main.c | 24 ++++++++
drivers/net/ethernet/huawei/hinic/hinic_tx.c | 18 +-----
drivers/net/ethernet/ibm/ibmvnic.c | 21 ++++++-
drivers/net/ethernet/lantiq_xrx200.c | 21 ++++---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 14 +----
.../net/ethernet/mellanox/mlx5/core/en/xsk/setup.c | 3 +-
.../mellanox/mlx5/core/en_accel/tls_stats.c | 12 ++--
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 65 ++++++++++------------
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 12 +---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 39 +++++++------
drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 17 ++++--
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 52 +++++++++--------
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 8 +--
.../net/ethernet/netronome/nfp/nfp_net_ethtool.c | 4 +-
drivers/net/ethernet/ti/cpsw_new.c | 53 ++++++++++++++++++
drivers/net/geneve.c | 37 ++++++++----
drivers/net/hyperv/netvsc_drv.c | 16 ++++--
drivers/net/ipa/ipa_table.c | 4 +-
drivers/net/phy/phy.c | 2 +-
drivers/net/phy/phy_device.c | 11 ++--
drivers/net/wan/hdlc_ppp.c | 16 ++++--
drivers/net/wireguard/noise.c | 5 +-
drivers/net/wireguard/peerlookup.c | 11 +++-
include/linux/skbuff.h | 7 ++-
include/net/flow.h | 1 +
include/net/sctp/structs.h | 8 ++-
net/bridge/br_vlan.c | 27 +++++----
net/core/dev.c | 2 +-
net/core/filter.c | 1 +
net/core/net_namespace.c | 22 ++++----
net/dcb/dcbnl.c | 8 +++
net/dsa/slave.c | 18 +++++-
net/ipv4/fib_frontend.c | 1 +
net/ipv4/ip_output.c | 3 +-
net/ipv4/route.c | 14 +++--
net/ipv6/Kconfig | 1 +
net/ipv6/ip6_fib.c | 13 +++--
net/qrtr/qrtr.c | 21 +++----
net/sched/act_ife.c | 44 +++++++++++----
net/sched/cls_flower.c | 1 +
net/sched/sch_generic.c | 48 +++++++++++-----
net/sched/sch_taprio.c | 28 ++++++----
net/sctp/socket.c | 9 +--
net/tipc/group.c | 14 +++--
net/tipc/msg.c | 3 +-
net/tipc/socket.c | 5 +-
58 files changed, 573 insertions(+), 326 deletions(-)
This is the start of the stable review cycle for the 5.4.68 release.
There are 43 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 27 Sep 2020 12:47:02 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.68-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.68-rc1
Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
Xunlei Pang <xlpang(a)linux.alibaba.com>
mm: memcg: fix memcg reclaim soft lockup
Eric Dumazet <edumazet(a)google.com>
net: add __must_check to skb_put_padto()
Eric Dumazet <edumazet(a)google.com>
net: qrtr: check skb_put_padto() return value
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: Do not warn in phy_stop() on PHY_DOWN
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: Avoid NPD upon phy_detach() when driver is unbound
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Disable IRQs only if NAPI gets scheduled
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Use napi_complete_done()
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: use netif_tx_napi_add() for TX NAPI
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Wake TX queue again
Michael Chan <michael.chan(a)broadcom.com>
bnxt_en: Protect bnxt_set_eee() and bnxt_set_pauseparam() with mutex.
Edwin Peer <edwin.peer(a)broadcom.com>
bnxt_en: return proper error codes in bnxt_show_temp
Tariq Toukan <tariqt(a)mellanox.com>
net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
Maor Dickman <maord(a)mellanox.com>
net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
Xin Long <lucien.xin(a)gmail.com>
tipc: use skb_unshare() instead in tipc_buf_append()
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
tipc: fix shutdown() of connection oriented socket
Peilin Ye <yepeilin.cs(a)gmail.com>
tipc: Fix memory leak in tipc_group_create_member()
Vinicius Costa Gomes <vinicius.gomes(a)intel.com>
taprio: Fix allowing too small intervals
Jakub Kicinski <kuba(a)kernel.org>
nfp: use correct define to return NONE fec
Henry Ptasinski <hptasinski(a)google.com>
net: sctp: Fix IPv6 ancestor_size calc in sctp_copy_descendant
Yunsheng Lin <linyunsheng(a)huawei.com>
net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
Maor Gottlieb <maorg(a)nvidia.com>
net/mlx5: Fix FTE cleanup
Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC
Ido Schimmel <idosch(a)nvidia.com>
net: Fix bridge enslavement failure
Linus Walleij <linus.walleij(a)linaro.org>
net: dsa: rtl8366: Properly clear member config
Petr Machata <petrm(a)nvidia.com>
net: DCB: Validate DCB_ATTR_DCB_BUFFER argument
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
Eric Dumazet <edumazet(a)google.com>
ipv6: avoid lockdep issue in fib6_del()
David Ahern <dsahern(a)kernel.org>
ipv4: Update exception handling for multipath routes via same device
David Ahern <dsahern(a)gmail.com>
ipv4: Initialize flowi4_multipath_hash in data path
Wei Wang <weiwan(a)google.com>
ip: fix tos reflection in ack and reset packets
Dan Carpenter <dan.carpenter(a)oracle.com>
hdlc_ppp: add range checks in ppp_cp_parse_cr()
Mark Gray <mark.d.gray(a)redhat.com>
geneve: add transport ports in route lookup for geneve
Ganji Aravind <ganji.aravind(a)chelsio.com>
cxgb4: Fix offset when clearing filter byte counters
Raju Rangoju <rajur(a)chelsio.com>
cxgb4: fix memory leak during module unload
Vasundhara Volam <vasundhara-v.volam(a)broadcom.com>
bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task()
Vasundhara Volam <vasundhara-v.volam(a)broadcom.com>
bnxt_en: Avoid sending firmware messages when AER error is detected.
Cong Wang <xiyou.wangcong(a)gmail.com>
act_ife: load meta modules before tcf_idr_check_alloc()
Ralph Campbell <rcampbell(a)nvidia.com>
mm/thp: fix __split_huge_pmd_locked() for migration PMD
Muchun Song <songmuchun(a)bytedance.com>
kprobes: fix kill kprobe which has been marked as gone
Jakub Kicinski <kuba(a)kernel.org>
ibmvnic: add missing parenthesis in do_reset()
Mingming Cao <mmc(a)linux.vnet.ibm.com>
ibmvnic fix NULL tx_pools and rx_tools issue at do_reset
Mark Salyzyn <salyzyn(a)android.com>
af_key: pfkey_dump needs parameter validation
-------------
Diffstat:
Makefile | 4 +-
drivers/iommu/Kconfig | 2 +-
drivers/iommu/amd_iommu.c | 17 +++++--
drivers/iommu/amd_iommu_init.c | 21 ++++++++-
drivers/net/dsa/rtl8366.c | 20 ++++++---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 32 ++++++++-----
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 ++
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 31 ++++++++-----
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 9 ++--
drivers/net/ethernet/chelsio/cxgb4/cxgb4_mps.c | 2 +-
drivers/net/ethernet/ibm/ibmvnic.c | 21 +++++++--
drivers/net/ethernet/lantiq_xrx200.c | 21 +++++----
.../mellanox/mlx5/core/en_accel/tls_stats.c | 12 +++--
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 52 ++++++++++++----------
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 8 ++--
.../net/ethernet/netronome/nfp/nfp_net_ethtool.c | 4 +-
drivers/net/geneve.c | 37 ++++++++++-----
drivers/net/phy/phy.c | 2 +-
drivers/net/phy/phy_device.c | 3 +-
drivers/net/wan/hdlc_ppp.c | 16 ++++---
include/linux/skbuff.h | 7 +--
include/net/flow.h | 1 +
include/net/sctp/structs.h | 8 ++--
kernel/kprobes.c | 9 +++-
mm/huge_memory.c | 40 ++++++++++-------
mm/vmscan.c | 8 ++++
net/bridge/br_vlan.c | 27 ++++++-----
net/core/dev.c | 2 +-
net/core/filter.c | 1 +
net/dcb/dcbnl.c | 8 ++++
net/ipv4/fib_frontend.c | 1 +
net/ipv4/ip_output.c | 3 +-
net/ipv4/route.c | 14 +++---
net/ipv6/Kconfig | 1 +
net/ipv6/ip6_fib.c | 13 ++++--
net/key/af_key.c | 7 +++
net/qrtr/qrtr.c | 20 +++++----
net/sched/act_ife.c | 44 +++++++++++++-----
net/sched/sch_generic.c | 49 +++++++++++++-------
net/sched/sch_taprio.c | 28 +++++++-----
net/sctp/socket.c | 9 ++--
net/tipc/group.c | 14 ++++--
net/tipc/msg.c | 3 +-
net/tipc/socket.c | 5 +--
44 files changed, 429 insertions(+), 211 deletions(-)
This commit resolves a minor bug in the selection/discovery of more
specific USB device drivers for devices that are currently bound to
generic USB device drivers.
The bug is related to the way a candidate USB device driver is
compared against the generic USB device driver. The code in
is_dev_usb_generic_driver() assumes that the device driver in question
is a USB device driver by calling to_usb_device_driver(dev->driver)
to downcast; however I have observed that this assumption is not always
true, through code instrumentation.
This commit avoids the incorrect downcast altogether by comparing
the USB device's driver (i.e., dev->driver) to the generic USB
device driver directly. This method was suggested by Alan Stern.
This bug was found while investigating Andrey Konovalov's report
indicating usbip device driver misbehaviour with the recently merged
generic USB device driver selection feature. The report is linked
below.
Fixes: d5643d2249 ("USB: Fix device driver race")
Link: https://lore.kernel.org/linux-usb/CAAeHK+zOrHnxjRFs=OE8T=O9208B9HP_oo8RZpyV…
Cc: <stable(a)vger.kernel.org> # 5.8
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Alan Stern <stern(a)rowland.harvard.edu>
Cc: Bastien Nocera <hadess(a)hadess.net>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Valentina Manea <valentina.manea.m(a)gmail.com>
Cc: <syzkaller(a)googlegroups.com>
Signed-off-by: M. Vefa Bicakci <m.v.b(a)runbox.com>
---
v3: Simplify the device driver pointer comparison to avoid incorrect
downcast, as suggested by Alan Stern. Minor edits to the commit
message.
v2: Following Bastien Nocera's code review comment, use is_usb_device()
to verify that a device is bound to a USB device driver (as opposed
to, e.g., a USB interface driver) to avoid incorrect downcast.
---
drivers/usb/core/driver.c | 11 ++---------
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index 950044a6e77f..7d90cbe063ec 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -905,21 +905,14 @@ static int usb_uevent(struct device *dev, struct kobj_uevent_env *env)
return 0;
}
-static bool is_dev_usb_generic_driver(struct device *dev)
-{
- struct usb_device_driver *udd = dev->driver ?
- to_usb_device_driver(dev->driver) : NULL;
-
- return udd == &usb_generic_driver;
-}
-
static int __usb_bus_reprobe_drivers(struct device *dev, void *data)
{
struct usb_device_driver *new_udriver = data;
struct usb_device *udev;
int ret;
- if (!is_dev_usb_generic_driver(dev))
+ /* Don't reprobe if current driver isn't usb_generic_driver */
+ if (dev->driver != &usb_generic_driver.drvwrap.driver)
return 0;
udev = to_usb_device(dev);
--
2.26.2