1. With the patch "x86/vector/msi: Switch to global reservation mode"
(4900be8360), the recent v4.15 and newer kernels always hang for 1-vCPU
Hyper-V VM with SR-IOV. This is because when we reach hv_compose_msi_msg()
by request_irq() -> request_threaded_irq() -> __setup_irq()->irq_startup()
-> __irq_startup() -> irq_domain_activate_irq() -> ... ->
msi_domain_activate() -> ... -> hv_compose_msi_msg(), local irq is
disabled in __setup_irq().
Fix this by polling the channel.
2. If the host is ejecting the VF device before we reach
hv_compose_msi_msg(), in a UP VM, we can hang in hv_compose_msi_msg()
forever, because at this time the host doesn't respond to the
CREATE_INTERRUPT request. This issue also happens to old kernels like
v4.14, v4.13, etc.
Fix this by polling the channel for the PCI_EJECT message and
hpdev->state, and by checking the PCI vendor ID.
Note: actually the above issues also happen to a SMP VM, if
"hbus->hdev->channel->target_cpu == smp_processor_id()" is true.
Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
Tested-by: Adrian Suhov <v-adsuho(a)microsoft.com>
Tested-by: Chris Valean <v-chvale(a)microsoft.com>
Cc: stable(a)vger.kernel.org
Cc: Stephen Hemminger <sthemmin(a)microsoft.com>
Cc: K. Y. Srinivasan <kys(a)microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Cc: Jack Morgenstein <jackm(a)mellanox.com>
---
drivers/pci/host/pci-hyperv.c | 58 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 57 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 265ba11e53e2..50cdefe3f6d3 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -521,6 +521,8 @@ struct hv_pci_compl {
s32 completion_status;
};
+static void hv_pci_onchannelcallback(void *context);
+
/**
* hv_pci_generic_compl() - Invoked for a completion packet
* @context: Set up by the sender of the packet.
@@ -665,6 +667,31 @@ static void _hv_pcifront_read_config(struct hv_pci_dev *hpdev, int where,
}
}
+static u16 hv_pcifront_get_vendor_id(struct hv_pci_dev *hpdev)
+{
+ u16 ret;
+ unsigned long flags;
+ void __iomem *addr = hpdev->hbus->cfg_addr + CFG_PAGE_OFFSET +
+ PCI_VENDOR_ID;
+
+ spin_lock_irqsave(&hpdev->hbus->config_lock, flags);
+
+ /* Choose the function to be read. (See comment above) */
+ writel(hpdev->desc.win_slot.slot, hpdev->hbus->cfg_addr);
+ /* Make sure the function was chosen before we start reading. */
+ mb();
+ /* Read from that function's config space. */
+ ret = readw(addr);
+ /*
+ * mb() is not required here, because the spin_unlock_irqrestore()
+ * is a barrier.
+ */
+
+ spin_unlock_irqrestore(&hpdev->hbus->config_lock, flags);
+
+ return ret;
+}
+
/**
* _hv_pcifront_write_config() - Internal PCI config write
* @hpdev: The PCI driver's representation of the device
@@ -1107,8 +1134,37 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
* Since this function is called with IRQ locks held, can't
* do normal wait for completion; instead poll.
*/
- while (!try_wait_for_completion(&comp.comp_pkt.host_event))
+ while (!try_wait_for_completion(&comp.comp_pkt.host_event)) {
+ /* 0xFFFF means an invalid PCI VENDOR ID. */
+ if (hv_pcifront_get_vendor_id(hpdev) == 0xFFFF) {
+ dev_err_once(&hbus->hdev->device,
+ "the device has gone\n");
+ goto free_int_desc;
+ }
+
+ /*
+ * When the higher level interrupt code calls us with
+ * interrupt disabled, we must poll the channel by calling
+ * the channel callback directly when channel->target_cpu is
+ * the current CPU. When the higher level interrupt code
+ * calls us with interrupt enabled, let's add the
+ * local_bh_disable()/enable() to avoid race.
+ */
+ local_bh_disable();
+
+ if (hbus->hdev->channel->target_cpu == smp_processor_id())
+ hv_pci_onchannelcallback(hbus);
+
+ local_bh_enable();
+
+ if (hpdev->state == hv_pcichild_ejecting) {
+ dev_err_once(&hbus->hdev->device,
+ "the device is being ejected\n");
+ goto free_int_desc;
+ }
+
udelay(100);
+ }
if (comp.comp_pkt.completion_status < 0) {
dev_err(&hbus->hdev->device,
--
2.7.4
This is a note to let you know that I've just added the patch titled
vt: change SGR 21 to follow the standards
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 65d9982d7e523a1a8e7c9af012da0d166f72fc56 Mon Sep 17 00:00:00 2001
From: Mike Frysinger <vapier(a)chromium.org>
Date: Mon, 29 Jan 2018 17:08:21 -0500
Subject: vt: change SGR 21 to follow the standards
ECMA-48 [1] (aka ISO 6429) has defined SGR 21 as "doubly underlined"
since at least March 1984. The Linux kernel has treated it as SGR 22
"normal intensity" since it was added in Linux-0.96b in June 1992.
Before that, it was simply ignored. Other terminal emulators have
either ignored it, or treat it as double underline now. xterm for
example added support in its 304 release (May 2014) [2] where it was
previously ignoring it.
Changing this behavior shouldn't be an issue:
- It isn't a named capability in ncurses's terminfo database, so no
script is using libtinfo/libcurses to look this up, or using tput
to query & output the right sequence.
- Any script assuming SGR 21 will reset intensity in all terminals
already do not work correctly on non-Linux VTs (including running
under screen/tmux/etc...).
- If someone has written a script that only runs in the Linux VT, and
they're using SGR 21 (instead of SGR 22), the output should still
be readable.
imo it's important to change this as the Linux VT's non-conformance
is sometimes used as an argument for other terminal emulators to not
implement SGR 21 at all, or do so incorrectly.
[1]: https://www.ecma-international.org/publications/standards/Ecma-048.htm
[2]: https://github.com/ThomasDickey/xterm-snapshots/commit/2fd29cb98d214cb536bc…
Signed-off-by: Mike Frysinger <vapier(a)chromium.org>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/vt/vt.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 88b902c525d7..16c0fcd6904a 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1354,6 +1354,11 @@ static void csi_m(struct vc_data *vc)
case 3:
vc->vc_italic = 1;
break;
+ case 21:
+ /*
+ * No console drivers support double underline, so
+ * convert it to a single underline.
+ */
case 4:
vc->vc_underline = 1;
break;
@@ -1389,7 +1394,6 @@ static void csi_m(struct vc_data *vc)
vc->vc_disp_ctrl = 1;
vc->vc_toggle_meta = 1;
break;
- case 21:
case 22:
vc->vc_intensity = 1;
break;
--
2.16.2
This is a note to let you know that I've just added the patch titled
parport_pc: Add support for WCH CH382L PCI-E single parallel port
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 823f7923833c6cc2b16e601546d607dcfb368004 Mon Sep 17 00:00:00 2001
From: Alexander Gerasiov <gq(a)redlab-i.ru>
Date: Sun, 4 Feb 2018 02:50:22 +0300
Subject: parport_pc: Add support for WCH CH382L PCI-E single parallel port
card.
WCH CH382L is a PCI-E adapter with 1 parallel port. It is similair to CH382
but serial ports are not soldered on board. Detected as
Serial controller: Device 1c00:3050 (rev 10) (prog-if 05 [16850])
Signed-off-by: Alexander Gerasiov <gq(a)redlab-i.ru>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/parport/parport_pc.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/parport/parport_pc.c b/drivers/parport/parport_pc.c
index 489492b608cf..380916bff9e0 100644
--- a/drivers/parport/parport_pc.c
+++ b/drivers/parport/parport_pc.c
@@ -2646,6 +2646,7 @@ enum parport_pc_pci_cards {
netmos_9901,
netmos_9865,
quatech_sppxp100,
+ wch_ch382l,
};
@@ -2708,6 +2709,7 @@ static struct parport_pc_pci {
/* netmos_9901 */ { 1, { { 0, -1 }, } },
/* netmos_9865 */ { 1, { { 0, -1 }, } },
/* quatech_sppxp100 */ { 1, { { 0, 1 }, } },
+ /* wch_ch382l */ { 1, { { 2, -1 }, } },
};
static const struct pci_device_id parport_pc_pci_tbl[] = {
@@ -2797,6 +2799,8 @@ static const struct pci_device_id parport_pc_pci_tbl[] = {
/* Quatech SPPXP-100 Parallel port PCI ExpressCard */
{ PCI_VENDOR_ID_QUATECH, PCI_DEVICE_ID_QUATECH_SPPXP_100,
PCI_ANY_ID, PCI_ANY_ID, 0, 0, quatech_sppxp100 },
+ /* WCH CH382L PCI-E single parallel port card */
+ { 0x1c00, 0x3050, 0x1c00, 0x3050, 0, 0, wch_ch382l },
{ 0, } /* terminate list */
};
MODULE_DEVICE_TABLE(pci, parport_pc_pci_tbl);
--
2.16.2
From: Andri Yngvason <andri.yngvason(a)marel.com>
While waiting for the TX object to send an RTR, an external message with a
matching id can overwrite the TX data. In this case we must call the rx
routine and then try transmitting the message that was overwritten again.
The queue was being stalled because the RX event did not generate an
interrupt to wake up the queue again and the TX event did not happen
because the TXRQST flag is reset by the chip when new data is received.
According to the CC770 datasheet the id of a message object should not be
changed while the MSGVAL bit is set. This has been fixed by resetting the
MSGVAL bit before modifying the object in the transmit function and setting
it after. It is not enough to set & reset CPUUPD.
It is important to keep the MSGVAL bit reset while the message object is
being modified. Otherwise, during RTR transmission, a frame with matching
id could trigger an rx-interrupt, which would cause a race condition
between the interrupt routine and the transmit function.
Signed-off-by: Andri Yngvason <andri.yngvason(a)marel.com>
Tested-by: Richard Weinberger <richard(a)nod.at>
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/cc770/cc770.c | 94 ++++++++++++++++++++++++++++++-------------
drivers/net/can/cc770/cc770.h | 2 +
2 files changed, 68 insertions(+), 28 deletions(-)
diff --git a/drivers/net/can/cc770/cc770.c b/drivers/net/can/cc770/cc770.c
index 9fed163262e0..2743d82d4424 100644
--- a/drivers/net/can/cc770/cc770.c
+++ b/drivers/net/can/cc770/cc770.c
@@ -390,37 +390,23 @@ static int cc770_get_berr_counter(const struct net_device *dev,
return 0;
}
-static netdev_tx_t cc770_start_xmit(struct sk_buff *skb, struct net_device *dev)
+static void cc770_tx(struct net_device *dev, int mo)
{
struct cc770_priv *priv = netdev_priv(dev);
- struct net_device_stats *stats = &dev->stats;
- struct can_frame *cf = (struct can_frame *)skb->data;
- unsigned int mo = obj2msgobj(CC770_OBJ_TX);
+ struct can_frame *cf = (struct can_frame *)priv->tx_skb->data;
u8 dlc, rtr;
u32 id;
int i;
- if (can_dropped_invalid_skb(dev, skb))
- return NETDEV_TX_OK;
-
- if ((cc770_read_reg(priv,
- msgobj[mo].ctrl1) & TXRQST_UNC) == TXRQST_SET) {
- netdev_err(dev, "TX register is still occupied!\n");
- return NETDEV_TX_BUSY;
- }
-
- netif_stop_queue(dev);
-
dlc = cf->can_dlc;
id = cf->can_id;
- if (cf->can_id & CAN_RTR_FLAG)
- rtr = 0;
- else
- rtr = MSGCFG_DIR;
+ rtr = cf->can_id & CAN_RTR_FLAG ? 0 : MSGCFG_DIR;
+
+ cc770_write_reg(priv, msgobj[mo].ctrl0,
+ MSGVAL_RES | TXIE_RES | RXIE_RES | INTPND_RES);
cc770_write_reg(priv, msgobj[mo].ctrl1,
RMTPND_RES | TXRQST_RES | CPUUPD_SET | NEWDAT_RES);
- cc770_write_reg(priv, msgobj[mo].ctrl0,
- MSGVAL_SET | TXIE_SET | RXIE_RES | INTPND_RES);
+
if (id & CAN_EFF_FLAG) {
id &= CAN_EFF_MASK;
cc770_write_reg(priv, msgobj[mo].config,
@@ -439,13 +425,30 @@ static netdev_tx_t cc770_start_xmit(struct sk_buff *skb, struct net_device *dev)
for (i = 0; i < dlc; i++)
cc770_write_reg(priv, msgobj[mo].data[i], cf->data[i]);
- /* Store echo skb before starting the transfer */
- can_put_echo_skb(skb, dev, 0);
-
cc770_write_reg(priv, msgobj[mo].ctrl1,
- RMTPND_RES | TXRQST_SET | CPUUPD_RES | NEWDAT_UNC);
+ RMTPND_UNC | TXRQST_SET | CPUUPD_RES | NEWDAT_UNC);
+ cc770_write_reg(priv, msgobj[mo].ctrl0,
+ MSGVAL_SET | TXIE_SET | RXIE_SET | INTPND_UNC);
+}
+
+static netdev_tx_t cc770_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+ struct cc770_priv *priv = netdev_priv(dev);
+ unsigned int mo = obj2msgobj(CC770_OBJ_TX);
+
+ if (can_dropped_invalid_skb(dev, skb))
+ return NETDEV_TX_OK;
- stats->tx_bytes += dlc;
+ netif_stop_queue(dev);
+
+ if ((cc770_read_reg(priv,
+ msgobj[mo].ctrl1) & TXRQST_UNC) == TXRQST_SET) {
+ netdev_err(dev, "TX register is still occupied!\n");
+ return NETDEV_TX_BUSY;
+ }
+
+ priv->tx_skb = skb;
+ cc770_tx(dev, mo);
return NETDEV_TX_OK;
}
@@ -671,13 +674,47 @@ static void cc770_tx_interrupt(struct net_device *dev, unsigned int o)
struct cc770_priv *priv = netdev_priv(dev);
struct net_device_stats *stats = &dev->stats;
unsigned int mo = obj2msgobj(o);
+ struct can_frame *cf;
+ u8 ctrl1;
+
+ ctrl1 = cc770_read_reg(priv, msgobj[mo].ctrl1);
- /* Nothing more to send, switch off interrupts */
cc770_write_reg(priv, msgobj[mo].ctrl0,
MSGVAL_RES | TXIE_RES | RXIE_RES | INTPND_RES);
+ cc770_write_reg(priv, msgobj[mo].ctrl1,
+ RMTPND_RES | TXRQST_RES | MSGLST_RES | NEWDAT_RES);
- stats->tx_packets++;
+ if (unlikely(!priv->tx_skb)) {
+ netdev_err(dev, "missing tx skb in tx interrupt\n");
+ return;
+ }
+
+ if (unlikely(ctrl1 & MSGLST_SET)) {
+ stats->rx_over_errors++;
+ stats->rx_errors++;
+ }
+
+ /* When the CC770 is sending an RTR message and it receives a regular
+ * message that matches the id of the RTR message, it will overwrite the
+ * outgoing message in the TX register. When this happens we must
+ * process the received message and try to transmit the outgoing skb
+ * again.
+ */
+ if (unlikely(ctrl1 & NEWDAT_SET)) {
+ cc770_rx(dev, mo, ctrl1);
+ cc770_tx(dev, mo);
+ return;
+ }
+
+ can_put_echo_skb(priv->tx_skb, dev, 0);
can_get_echo_skb(dev, 0);
+
+ cf = (struct can_frame *)priv->tx_skb->data;
+ stats->tx_bytes += cf->can_dlc;
+ stats->tx_packets++;
+
+ priv->tx_skb = NULL;
+
netif_wake_queue(dev);
}
@@ -789,6 +826,7 @@ struct net_device *alloc_cc770dev(int sizeof_priv)
priv->can.do_set_bittiming = cc770_set_bittiming;
priv->can.do_set_mode = cc770_set_mode;
priv->can.ctrlmode_supported = CAN_CTRLMODE_3_SAMPLES;
+ priv->tx_skb = NULL;
memcpy(priv->obj_flags, cc770_obj_flags, sizeof(cc770_obj_flags));
diff --git a/drivers/net/can/cc770/cc770.h b/drivers/net/can/cc770/cc770.h
index a1739db98d91..95752e1d1283 100644
--- a/drivers/net/can/cc770/cc770.h
+++ b/drivers/net/can/cc770/cc770.h
@@ -193,6 +193,8 @@ struct cc770_priv {
u8 cpu_interface; /* CPU interface register */
u8 clkout; /* Clock out register */
u8 bus_config; /* Bus conffiguration register */
+
+ struct sk_buff *tx_skb;
};
struct net_device *alloc_cc770dev(int sizeof_priv);
--
2.16.1
Sehr geehrte Damen und Herren,
nach unserem Besuch Ihrer Homepage möchten wir Ihnen ein Angebot von Produkten vorstellen, das Ihnen ermöglichen wird, den Verkauf Ihrer Produkte sowie Dienstleistungen deutlich zu erhöhen.
Die Datenbanken der Firmen sind in für Sie interessante und relevante Zielgruppen unterglieder.
Ich biete Ihnen den ganz neuen Adressenkatalog der Schweizer Unternehmen an, in dem sich direkte Kontaktdaten der Firmeninhaber und Manager befinden.
Der neue Katalog enthält 187.764 schweizerische Firmen und stellt solche Daten zur Verfügung wie: Namen der Firma, Firmenanschrift, Kontaktdaten des Firmeninhabers oder des Managers, E-Mail-Adresse, Telefon, Faxnummer, Branche usw.
http://www.db-marketing-schweiz.net/?page=catalog
Die Verwendungsmöglichkeiten der Datenbanken sind praktisch unbegrenzt und Sie können durch Verwendung
der von uns entwickelten Programme des personalisierten Versendens von Angeboten u.ä. mittels
E-mailing bzw. Fax effektive und sichere Werbekampagnen damit durchführen.
Bitte informieren Sie sich über die weiteren Details einmal unverbindlich auf unseren Webseite:
http://www.db-marketing-schweiz.net/?page=catalog
MfG
GL-Marketing.
This is a note to let you know that I've just added the patch titled
scripts: recordmcount: break hardlinks
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
scripts-recordmcount-break-hardlinks.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From dd39a26538e37f6c6131e829a4a510787e43c783 Mon Sep 17 00:00:00 2001
From: Russell King <rmk+kernel(a)arm.linux.org.uk>
Date: Fri, 11 Dec 2015 12:09:03 +0000
Subject: scripts: recordmcount: break hardlinks
From: Russell King <rmk+kernel(a)arm.linux.org.uk>
commit dd39a26538e37f6c6131e829a4a510787e43c783 upstream.
recordmcount edits the file in-place, which can cause problems when
using ccache in hardlink mode. Arrange for recordmcount to break a
hardlinked object.
Link: http://lkml.kernel.org/r/E1a7MVT-0000et-62@rmk-PC.arm.linux.org.uk
Cc: stable(a)vger.kernel.org # 2.6.37+
Signed-off-by: Russell King <rmk+kernel(a)arm.linux.org.uk>
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
scripts/recordmcount.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -201,6 +201,20 @@ static void *mmap_file(char const *fname
addr = umalloc(sb.st_size);
uread(fd_map, addr, sb.st_size);
}
+ if (sb.st_nlink != 1) {
+ /* file is hard-linked, break the hard link */
+ close(fd_map);
+ if (unlink(fname) < 0) {
+ perror(fname);
+ fail_file();
+ }
+ fd_map = open(fname, O_RDWR | O_CREAT, sb.st_mode);
+ if (fd_map < 0) {
+ perror(fname);
+ fail_file();
+ }
+ uwrite(fd_map, addr, sb.st_size);
+ }
return addr;
}
Patches currently in stable-queue which might be from rmk+kernel(a)arm.linux.org.uk are
queue-3.18/scripts-recordmcount-break-hardlinks.patch
On Mon, Mar 12, 2018 at 11:21 PM, Olof Johansson <olof(a)lixom.net> wrote:
> On Sun, Mar 11, 2018 at 8:59 AM, Olof's autobuilder <build(a)lixom.net> wrote:
>> Here are the build results from automated periodic testing.
>>
>> The tree being built was stable-rc, found at:
>>
>> URL: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
>>
>> Branch: linux-3.18.y
>>
>> Topmost commits:
>> 89dad4e Linux 3.18.99
>> 91e5f48 dm io: fix duplicate bio completion due to missing ref count
>> 1152361 fib_semantics: Don't match route with mismatching tclassid
>>
>> Build logs (stderr only) can be found at the following link (experimental):
>>
>> http://arm-soc.lixom.net/buildlogs/stable-rc/v3.18.99/
>>
>>
>> Runtime: 30m 20s
>>
>> Passed: 131
>> Failed: 0
>>
>> Warnings: 17658
>>
>> Section mismatches: 0
>
> I was unable to reproduce this with a regular build outside of the
> batch builder. Main difference is that I didn't use ccache, so I
> suspect the warnings are related to ccache (and presumably not a real
> problem with the kernel).
>
> Annoying and noisy though.
Found it!
Russell fixed it in
dd39a26538e3 ("scripts: recordmcount: break hardlinks")
Greg, can you add this to 3.18.y?
For some reason, it made it into 3.2 and 3.16 stable kernels, but
not 3.16 and 4.1. 4.4 contains the original upstream fix.
Arnd
From: Daniel Vacek <neelx(a)redhat.com>
Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment
b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where
possible") introduced a bug where move_freepages() triggers a VM_BUG_ON()
on uninitialized page structure due to pageblock alignment. To fix this,
simply align the skipped pfns in memmap_init_zone() the same way as in
move_freepages_block().
Seen in one of the RHEL reports:
crash> log | grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1
kernel BUG at mm/page_alloc.c:1389!
invalid opcode: 0000 [#1] SMP
--
RIP: 0010:[<ffffffff8118833e>] [<ffffffff8118833e>] move_freepages+0x15e/0x160
RSP: 0018:ffff88054d727688 EFLAGS: 00010087
--
Call Trace:
[<ffffffff811883b3>] move_freepages_block+0x73/0x80
[<ffffffff81189e63>] __rmqueue+0x263/0x460
[<ffffffff8118c781>] get_page_from_freelist+0x7e1/0x9e0
[<ffffffff8118caf6>] __alloc_pages_nodemask+0x176/0x420
--
RIP [<ffffffff8118833e>] move_freepages+0x15e/0x160
RSP <ffff88054d727688>
crash> page_init_bug -v | grep RAM
<struct resource 0xffff88067fffd2f8> 1000 - 9bfff System RAM (620.00 KiB)
<struct resource 0xffff88067fffd3a0> 100000 - 430bffff System RAM ( 1.05 GiB = 1071.75 MiB = 1097472.00 KiB)
<struct resource 0xffff88067fffd410> 4b0c8000 - 4bf9cfff System RAM ( 14.83 MiB = 15188.00 KiB)
<struct resource 0xffff88067fffd480> 4bfac000 - 646b1fff System RAM (391.02 MiB = 400408.00 KiB)
<struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB)
<struct resource 0xffff88067fffd640> 100000000 - 67fffffff System RAM ( 22.00 GiB)
crash> page_init_bug | head -6
<struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB)
<struct page 0xffffea0001ede200> 1fffff00000000 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575
<struct page 0xffffea0001ede200> 505736 505344 <struct page 0xffffea0001ed8000> 505855 <struct page 0xffffea0001edffc0>
<struct page 0xffffea0001ed8000> 0 0 <struct pglist_data 0xffff88047ffd9000> 0 <struct zone 0xffff88047ffd9000> DMA 1 4095
<struct page 0xffffea0001edffc0> 1fffff00000400 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575
BUG, zones differ!
Note that this range follows two not populated sections 68000000-77ffffff
in this zone. 7b788000-7b7fffff is the first one after a gap. This makes
memmap_init_zone() skip all the pfns up to the beginning of this range.
But this range is not pageblock (2M) aligned. In fact no range has to be.
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea0001e00000 78000000 0 0 0 0
ffffea0001ed7fc0 7b5ff000 0 0 0 0
ffffea0001ed8000 7b600000 0 0 0 0 <<<<
ffffea0001ede1c0 7b787000 0 0 0 0
ffffea0001ede200 7b788000 0 0 1 1fffff00000000
Top part of page flags should contain nodeid and zonenr, which is not
the case for page ffffea0001ed8000 here (<<<<).
crash> log | grep -o fffea0001ed[^\ ]* | sort -u
fffea0001ed8000
fffea0001eded20
fffea0001edffc0
crash> bt -r | grep -o fffea0001ed[^\ ]* | sort -u
fffea0001ed8000
fffea0001eded00
fffea0001eded20
fffea0001edffc0
Initialization of the whole beginning of the section is skipped up to the
start of the range due to the commit b92df1de5d28. Now any code calling
move_freepages_block() (like reusing the page from a freelist as in this
example) with a page from the beginning of the range will get the page
rounded down to start_page ffffea0001ed8000 and passed to move_freepages()
which crashes on assertion getting wrong zonenr.
> VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
Note, page_zone() derives the zone from page flags here.
>From similar machine before commit b92df1de5d28:
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffff73941e00000 78000000 0 0 1 1fffff00000000
fffff73941ed7fc0 7b5ff000 0 0 1 1fffff00000000
fffff73941ed8000 7b600000 0 0 1 1fffff00000000
fffff73941edff80 7b7fe000 0 0 1 1fffff00000000
fffff73941edffc0 7b7ff000 ffff8e67e04d3ae0 ad84 1 1fffff00020068 uptodate,lru,active,mappedtodisk
All the pages since the beginning of the section are initialized.
move_freepages()' not gonna blow up.
The same machine with this fix applied:
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea0001e00000 78000000 0 0 0 0
ffffea0001e00000 7b5ff000 0 0 0 0
ffffea0001ed8000 7b600000 0 0 1 1fffff00000000
ffffea0001edff80 7b7fe000 0 0 1 1fffff00000000
ffffea0001edffc0 7b7ff000 ffff88017fb13720 8 2 1fffff00020068 uptodate,lru,active,mappedtodisk
At least the bare minimum of pages is initialized preventing the crash
as well.
Customers started to report this as soon as 7.4 (where b92df1de5d28 was
merged in RHEL) was released. I remember reports from
September/October-ish times. It's not easily reproduced and happens on
a handful of machines only. I guess that's why. But that does not
make it less serious, I think.
Though there actually is a report here:
https://bugzilla.kernel.org/show_bug.cgi?id=196443
And there are reports for Fedora from July:
https://bugzilla.redhat.com/show_bug.cgi?id=1473242 and CentOS:
https://bugs.centos.org/view.php?id=13964 and we internally track
several dozens reports for RHEL bug
https://bugzilla.redhat.com/show_bug.cgi?id=1525121
Link: http://lkml.kernel.org/r/0485727b2e82da7efbce5f6ba42524b429d0391a.152001194…
Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx(a)redhat.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Paul Burton <paul.burton(a)imgtec.com>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff -puN mm/page_alloc.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment mm/page_alloc.c
--- a/mm/page_alloc.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment
+++ a/mm/page_alloc.c
@@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned
/*
* Skip to the pfn preceding the next valid one (or
* end_pfn), such that we hit a valid pfn (or end_pfn)
- * on our next iteration of the loop.
+ * on our next iteration of the loop. Note that it needs
+ * to be pageblock aligned even when the region itself
+ * is not. move_freepages_block() can shift ahead of
+ * the valid region but still depends on correct page
+ * metadata.
*/
- pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+ pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
+ ~(pageblock_nr_pages-1)) - 1;
#endif
continue;
}
_
Fix a logic error that caused the test to exit with 0 even if test
cases failed.
Cc: stable(a)vger.kernel.org
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
---
tools/testing/selftests/x86/entry_from_vm86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/x86/entry_from_vm86.c b/tools/testing/selftests/x86/entry_from_vm86.c
index 361466a2eaef..6e85f0d0498d 100644
--- a/tools/testing/selftests/x86/entry_from_vm86.c
+++ b/tools/testing/selftests/x86/entry_from_vm86.c
@@ -318,7 +318,7 @@ int main(void)
clearhandler(SIGSEGV);
/* Make sure nothing explodes if we fork. */
- if (fork() > 0)
+ if (fork() == 0)
return 0;
return (nerrs == 0 ? 0 : 1);
--
2.14.3
In move_freepages() a BUG_ON() can be triggered on uninitialized page structures
due to pageblock alignment. Aligning the skipped pfns in memmap_init_zone() the
same way as in move_freepages_block() simply fixes those crashes.
Fixes: b92df1de5d28 ("[mm] page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx(a)redhat.com>
Cc: stable(a)vger.kernel.org
---
mm/page_alloc.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cb416723538f..9edee36e6a74 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
/*
* Skip to the pfn preceding the next valid one (or
* end_pfn), such that we hit a valid pfn (or end_pfn)
- * on our next iteration of the loop.
+ * on our next iteration of the loop. Note that it needs
+ * to be pageblock aligned even when the region itself
+ * is not as move_freepages_block() can shift ahead of
+ * the valid region but still depends on correct page
+ * metadata.
*/
- pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+ pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
+ ~(pageblock_nr_pages-1)) - 1;
#endif
continue;
}
--
2.16.2
The patch titled
Subject: drivers/infiniband/core/verbs.c: fix build with gcc-4.4.4
has been added to the -mm tree. Its filename is
drivers-infiniband-core-verbsc-fix-build-with-gcc-444.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/drivers-infiniband-core-verbsc-fix…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/drivers-infiniband-core-verbsc-fix…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Andrew Morton <akpm(a)linux-foundation.org>
Subject: drivers/infiniband/core/verbs.c: fix build with gcc-4.4.4
gcc-4.4.4 has issues with initialization of anonymous unions.
drivers/infiniband/core/verbs.c: In function '__ib_drain_sq':
drivers/infiniband/core/verbs.c:2204: error: unknown field 'wr_cqe' specified in initializer
drivers/infiniband/core/verbs.c:2204: warning: initialization makes integer from pointer without a cast
Work around this.
Fixes: a1ae7d0345edd5 ("RDMA/core: Avoid that ib_drain_qp() triggers an out-of-bounds stack access")
Cc: Bart Van Assche <bart.vanassche(a)wdc.com>
Cc: Steve Wise <swise(a)opengridcomputing.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/infiniband/core/verbs.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff -puN drivers/infiniband/core/verbs.c~drivers-infiniband-core-verbsc-fix-build-with-gcc-444 drivers/infiniband/core/verbs.c
--- a/drivers/infiniband/core/verbs.c~drivers-infiniband-core-verbsc-fix-build-with-gcc-444
+++ a/drivers/infiniband/core/verbs.c
@@ -2200,8 +2200,9 @@ static void __ib_drain_sq(struct ib_qp *
struct ib_send_wr *bad_swr;
struct ib_rdma_wr swr = {
.wr = {
+ .next = NULL,
+ { .wr_cqe = &sdrain.cqe, },
.opcode = IB_WR_RDMA_WRITE,
- .wr_cqe = &sdrain.cqe,
},
};
int ret;
_
Patches currently in -mm which might be from akpm(a)linux-foundation.org are
i-need-old-gcc.patch
mm-mempolicy-avoid-use-uninitialized-preferred_node-fix.patch
hugetlbfs-check-for-pgoff-value-overflow-v3-fix.patch
mm-remove-unused-arg-from-memblock_next_valid_pfn.patch
arm-arch-arm-include-asm-pageh-needs-personalityh.patch
ocfs2-without-quota-support-try-to-avoid-calling-quota-recovery-checkpatch-fixes.patch
mm.patch
mm-initialize-pages-on-demand-during-boot-fix-4-fix.patch
mm-initialize-pages-on-demand-during-boot-v5-fix.patch
mm-initialize-pages-on-demand-during-boot-v6-checkpatch-fixes.patch
mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma-fix.patch
direct-io-minor-cleanups-in-do_blockdev_direct_io-fix.patch
mm-fix-races-between-swapoff-and-flush-dcache-fix.patch
list_lru-prefetch-neighboring-list-entries-before-acquiring-lock-fix.patch
mm-oom-cgroup-aware-oom-killer-fix.patch
mm-oom-docs-describe-the-cgroup-aware-oom-killer-fix-2-fix.patch
proc-add-seq_put_decimal_ull_width-to-speed-up-proc-pid-smaps-fix.patch
ipc-clamp-semmni-to-the-real-ipcmni-limit-fix.patch
linux-next-rejects.patch
linux-next-fixup.patch
drivers-infiniband-core-verbsc-fix-build-with-gcc-444.patch
fs-fsnotify-account-fsnotify-metadata-to-kmemcg-fix.patch
headers-untangle-kmemleakh-from-mmh-fix.patch
kernel-forkc-export-kernel_thread-to-modules.patch
slab-leaks3-default-y.patch
The patch titled
Subject: drivers/infiniband/ulp/srpt/ib_srpt.c: fix build with gcc-4.4.4
has been added to the -mm tree. Its filename is
drivers-infiniband-ulp-srpt-ib_srptc-fix-build-with-gcc-444.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/drivers-infiniband-ulp-srpt-ib_srp…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/drivers-infiniband-ulp-srpt-ib_srp…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Andrew Morton <akpm(a)linux-foundation.org>
Subject: drivers/infiniband/ulp/srpt/ib_srpt.c: fix build with gcc-4.4.4
gcc-4.4.4 has issues with initialization of anonymous unions:
drivers/infiniband/ulp/srpt/ib_srpt.c: In function 'srpt_zerolength_write':
drivers/infiniband/ulp/srpt/ib_srpt.c:854: error: unknown field 'wr_cqe' specified in initializer
drivers/infiniband/ulp/srpt/ib_srpt.c:854: warning: initialization makes integer from pointer without a cast
Work aound this.
Fixes: 2a78cb4db487 ("IB/srpt: Fix an out-of-bounds stack access in srpt_zerolength_write()")
Cc: Bart Van Assche <bart.vanassche(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/infiniband/ulp/srpt/ib_srpt.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff -puN drivers/infiniband/ulp/srpt/ib_srpt.c~drivers-infiniband-ulp-srpt-ib_srptc-fix-build-with-gcc-444 drivers/infiniband/ulp/srpt/ib_srpt.c
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c~drivers-infiniband-ulp-srpt-ib_srptc-fix-build-with-gcc-444
+++ a/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -850,8 +850,9 @@ static int srpt_zerolength_write(struct
struct ib_send_wr *bad_wr;
struct ib_rdma_wr wr = {
.wr = {
+ .next = NULL,
+ { .wr_cqe = &ch->zw_cqe, },
.opcode = IB_WR_RDMA_WRITE,
- .wr_cqe = &ch->zw_cqe,
.send_flags = IB_SEND_SIGNALED,
}
};
_
Patches currently in -mm which might be from akpm(a)linux-foundation.org are
i-need-old-gcc.patch
mm-mempolicy-avoid-use-uninitialized-preferred_node-fix.patch
hugetlbfs-check-for-pgoff-value-overflow-v3-fix.patch
mm-remove-unused-arg-from-memblock_next_valid_pfn.patch
arm-arch-arm-include-asm-pageh-needs-personalityh.patch
ocfs2-without-quota-support-try-to-avoid-calling-quota-recovery-checkpatch-fixes.patch
mm.patch
mm-initialize-pages-on-demand-during-boot-fix-4-fix.patch
mm-initialize-pages-on-demand-during-boot-v5-fix.patch
mm-initialize-pages-on-demand-during-boot-v6-checkpatch-fixes.patch
mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma-fix.patch
direct-io-minor-cleanups-in-do_blockdev_direct_io-fix.patch
mm-fix-races-between-swapoff-and-flush-dcache-fix.patch
list_lru-prefetch-neighboring-list-entries-before-acquiring-lock-fix.patch
mm-oom-cgroup-aware-oom-killer-fix.patch
mm-oom-docs-describe-the-cgroup-aware-oom-killer-fix-2-fix.patch
proc-add-seq_put_decimal_ull_width-to-speed-up-proc-pid-smaps-fix.patch
ipc-clamp-semmni-to-the-real-ipcmni-limit-fix.patch
linux-next-rejects.patch
linux-next-fixup.patch
drivers-infiniband-core-verbsc-fix-build-with-gcc-444.patch
fs-fsnotify-account-fsnotify-metadata-to-kmemcg-fix.patch
headers-untangle-kmemleakh-from-mmh-fix.patch
kernel-forkc-export-kernel_thread-to-modules.patch
slab-leaks3-default-y.patch
drivers-infiniband-ulp-srpt-ib_srptc-fix-build-with-gcc-444.patch
The patch titled
Subject: h8300: remove extranous __BIG_ENDIAN definition
has been added to the -mm tree. Its filename is
h8300-remove-extranous-__big_endian-definition.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/h8300-remove-extranous-__big_endia…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/h8300-remove-extranous-__big_endia…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Arnd Bergmann <arnd(a)arndb.de>
Subject: h8300: remove extranous __BIG_ENDIAN definition
A bugfix I did earlier caused a build regression on h8300, which defines
the __BIG_ENDIAN macro in a slightly different way than the generic code:
arch/h8300/include/asm/byteorder.h:5:0: warning: "__BIG_ENDIAN" redefined
We don't need to define it here, as the same macro is already provided by
the linux/byteorder/big_endian.h, and that version does not conflict.
While this is a v4.16 regression, my earlier patch also got backported to
the 4.14 and 4.15 stable kernels, so we need the fixup there as well.
Link: http://lkml.kernel.org/r/20180313120752.2645129-1-arnd@arndb.de
Fixes: 101110f6271c ("Kbuild: always define endianess in kconfig.h")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Cc: Yoshinori Sato <ysato(a)users.sourceforge.jp>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/h8300/include/asm/byteorder.h | 1 -
1 file changed, 1 deletion(-)
diff -puN arch/h8300/include/asm/byteorder.h~h8300-remove-extranous-__big_endian-definition arch/h8300/include/asm/byteorder.h
--- a/arch/h8300/include/asm/byteorder.h~h8300-remove-extranous-__big_endian-definition
+++ a/arch/h8300/include/asm/byteorder.h
@@ -2,7 +2,6 @@
#ifndef __H8300_BYTEORDER_H__
#define __H8300_BYTEORDER_H__
-#define __BIG_ENDIAN __ORDER_BIG_ENDIAN__
#include <linux/byteorder/big_endian.h>
#endif
_
Patches currently in -mm which might be from arnd(a)arndb.de are
h8300-remove-extranous-__big_endian-definition.patch