This is the start of the stable review cycle for the 4.14.221 release. There are 30 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.221-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.14.221-rc1
DENG Qingfang dqfext@gmail.com net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
Nadav Amit namit@vmware.com iommu/vt-d: Do not use flush-queue when caching-mode is on
Benjamin Valentin benpicco@googlemail.com Input: xpad - sync supported devices with fork on GitHub
Dave Hansen dave.hansen@linux.intel.com x86/apic: Add extra serialization for non-serializing MSRs
Josh Poimboeuf jpoimboe@redhat.com x86/build: Disable CET instrumentation in the kernel
Hugh Dickins hughd@google.com mm: thp: fix MADV_REMOVE deadlock on shmem THP
Muchun Song songmuchun@bytedance.com mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
Muchun Song songmuchun@bytedance.com mm: hugetlb: fix a race between isolating and freeing page
Muchun Song songmuchun@bytedance.com mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
Russell King rmk+kernel@armlinux.org.uk ARM: footbridge: fix dc21285 PCI configuration accessors
Thorsten Leemhuis linux@leemhuis.info nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
Fengnan Chang fengnanchang@gmail.com mmc: core: Limit retries when analyse of SDIO tuples fails
Gustavo A. R. Silva gustavoars@kernel.org smb3: Fix out-of-bounds bug in SMB2_negotiate()
Aurelien Aptel aaptel@suse.com cifs: report error instead of invalid when revalidating a dentry fails
Mathias Nyman mathias.nyman@linux.intel.com xhci: fix bounce buffer usage for non-sg list case
Wang ShaoBo bobo.shaobowang@huawei.com kretprobe: Avoid re-registration of the same kretprobe earlier
Felix Fietkau nbd@nbd.name mac80211: fix station rate table updates on assoc
Liangyan liangyan.peng@linux.alibaba.com ovl: fix dentry leak in ovl_get_redirect
Heiko Stuebner heiko.stuebner@theobroma-systems.com usb: dwc2: Fix endpoint direction check in ep_from_windex
Jeremy Figgins kernel@jeremyfiggins.com USB: usblp: don't call usb_set_interface if there's a single alt
Dan Carpenter dan.carpenter@oracle.com USB: gadget: legacy: fix an error code in eth_bind()
Wei Wang weiwan@google.com ipv4: fix race condition between route lookup and invalidation
Arnd Bergmann arnd@arndb.de elfcore: fix building with clang
Josh Poimboeuf jpoimboe@redhat.com objtool: Support Clang non-section symbols in ORC generation
Xie He xie.he.0141@gmail.com net: lapb: Copy the skb before sending a packet
Zyta Szpak zr@semihalf.com arm64: dts: ls1046a: fix dcfg address range
Alexey Dobriyan adobriyan@gmail.com Input: i8042 - unbreak Pegatron C15B
Christoph Schemmel christoph.schemmel@gmail.com USB: serial: option: Adding support for Cinterion MV31
Chenxin Jin bg4akv@hotmail.com USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
Pho Tran Pho.Tran@silabs.com USB: serial: cp210x: add pid/vid for WSDA-200-USB
-------------
Diffstat:
Makefile | 10 ++----- arch/arm/mach-footbridge/dc21285.c | 12 ++++---- arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2 +- arch/x86/Makefile | 3 ++ arch/x86/include/asm/apic.h | 10 ------- arch/x86/include/asm/barrier.h | 18 ++++++++++++ arch/x86/kernel/apic/apic.c | 4 +++ arch/x86/kernel/apic/x2apic_cluster.c | 6 ++-- arch/x86/kernel/apic/x2apic_phys.c | 6 ++-- drivers/input/joystick/xpad.c | 17 +++++++++++- drivers/input/serio/i8042-x86ia64io.h | 2 ++ drivers/iommu/intel-iommu.c | 6 ++++ drivers/mmc/core/sdio_cis.c | 6 ++++ drivers/net/dsa/mv88e6xxx/chip.c | 6 +++- drivers/nvme/host/pci.c | 2 ++ drivers/usb/class/usblp.c | 19 +++++++------ drivers/usb/dwc2/gadget.c | 8 +----- drivers/usb/gadget/legacy/ether.c | 4 ++- drivers/usb/host/xhci-ring.c | 31 +++++++++++++-------- drivers/usb/serial/cp210x.c | 2 ++ drivers/usb/serial/option.c | 6 ++++ fs/cifs/dir.c | 22 +++++++++++++-- fs/cifs/smb2pdu.h | 2 +- fs/hugetlbfs/inode.c | 3 +- fs/overlayfs/dir.c | 2 +- include/linux/elfcore.h | 22 +++++++++++++++ include/linux/hugetlb.h | 3 ++ kernel/Makefile | 1 - kernel/elfcore.c | 26 ------------------ kernel/kprobes.c | 4 +++ mm/huge_memory.c | 37 +++++++++++++++---------- mm/hugetlb.c | 9 +++--- net/ipv4/route.c | 38 +++++++++++++------------- net/lapb/lapb_out.c | 3 +- net/mac80211/driver-ops.c | 5 +++- net/mac80211/rate.c | 3 +- tools/objtool/orc_gen.c | 33 +++++++++++++++++----- 37 files changed, 255 insertions(+), 138 deletions(-)
From: Pho Tran Pho.Tran@silabs.com
commit 3c4f6ecd93442f4376a58b38bb40ee0b8c46e0e6 upstream.
Information pid/vid of WSDA-200-USB, Lord corporation company: vid: 199b pid: ba30
Signed-off-by: Pho Tran pho.tran@silabs.com [ johan: amend comment with product name ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/cp210x.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -204,6 +204,7 @@ static const struct usb_device_id id_tab { USB_DEVICE(0x1901, 0x0194) }, /* GE Healthcare Remote Alarm Box */ { USB_DEVICE(0x1901, 0x0195) }, /* GE B850/B650/B450 CP2104 DP UART interface */ { USB_DEVICE(0x1901, 0x0196) }, /* GE B850 CP2105 DP UART interface */ + { USB_DEVICE(0x199B, 0xBA30) }, /* LORD WSDA-200-USB */ { USB_DEVICE(0x19CF, 0x3000) }, /* Parrot NMEA GPS Flight Recorder */ { USB_DEVICE(0x1ADB, 0x0001) }, /* Schweitzer Engineering C662 Cable */ { USB_DEVICE(0x1B1C, 0x1C00) }, /* Corsair USB Dongle */
From: Chenxin Jin bg4akv@hotmail.com
commit 43377df70480f82919032eb09832e9646a8a5efb upstream.
Teraoka AD2000 uses the CP210x driver, but the chip VID/PID is customized with 0988/0578. We need the driver to support the new VID/PID.
Signed-off-by: Chenxin Jin bg4akv@hotmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/cp210x.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -64,6 +64,7 @@ static const struct usb_device_id id_tab { USB_DEVICE(0x08e6, 0x5501) }, /* Gemalto Prox-PU/CU contactless smartcard reader */ { USB_DEVICE(0x08FD, 0x000A) }, /* Digianswer A/S , ZigBee/802.15.4 MAC Device */ { USB_DEVICE(0x0908, 0x01FF) }, /* Siemens RUGGEDCOM USB Serial Console */ + { USB_DEVICE(0x0988, 0x0578) }, /* Teraoka AD2000 */ { USB_DEVICE(0x0B00, 0x3070) }, /* Ingenico 3070 */ { USB_DEVICE(0x0BED, 0x1100) }, /* MEI (TM) Cashflow-SC Bill/Voucher Acceptor */ { USB_DEVICE(0x0BED, 0x1101) }, /* MEI series 2000 Combo Acceptor */
From: Christoph Schemmel christoph.schemmel@gmail.com
commit e478d6029dca9d8462f426aee0d32896ef64f10f upstream.
Adding support for Cinterion device MV31 for enumeration with PID 0x00B3 and 0x00B7.
usb-devices output for 0x00B3 T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 6 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1 P: Vendor=1e2d ProdID=00b3 Rev=04.14 S: Manufacturer=Cinterion S: Product=Cinterion PID 0x00B3 USB Mobile Broadband S: SerialNumber=b3246eed C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#=0x0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim I: If#=0x1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=cdc_wdm I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
usb-devices output for 0x00B7 T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 5 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1 P: Vendor=1e2d ProdID=00b7 Rev=04.14 S: Manufacturer=Cinterion S: Product=Cinterion PID 0x00B3 USB Mobile Broadband S: SerialNumber=b3246eed C: #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
Signed-off-by: Christoph Schemmel christoph.schemmel@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -428,6 +428,8 @@ static void option_instat_callback(struc #define CINTERION_PRODUCT_AHXX_2RMNET 0x0084 #define CINTERION_PRODUCT_AHXX_AUDIO 0x0085 #define CINTERION_PRODUCT_CLS8 0x00b0 +#define CINTERION_PRODUCT_MV31_MBIM 0x00b3 +#define CINTERION_PRODUCT_MV31_RMNET 0x00b7
/* Olivetti products */ #define OLIVETTI_VENDOR_ID 0x0b3c @@ -1917,6 +1919,10 @@ static const struct usb_device_id option { USB_DEVICE(SIEMENS_VENDOR_ID, CINTERION_PRODUCT_HC25_MDMNET) }, { USB_DEVICE(SIEMENS_VENDOR_ID, CINTERION_PRODUCT_HC28_MDM) }, /* HC28 enumerates with Siemens or Cinterion VID depending on FW revision */ { USB_DEVICE(SIEMENS_VENDOR_ID, CINTERION_PRODUCT_HC28_MDMNET) }, + { USB_DEVICE_INTERFACE_CLASS(CINTERION_VENDOR_ID, CINTERION_PRODUCT_MV31_MBIM, 0xff), + .driver_info = RSVD(3)}, + { USB_DEVICE_INTERFACE_CLASS(CINTERION_VENDOR_ID, CINTERION_PRODUCT_MV31_RMNET, 0xff), + .driver_info = RSVD(0)}, { USB_DEVICE(OLIVETTI_VENDOR_ID, OLIVETTI_PRODUCT_OLICARD100), .driver_info = RSVD(4) }, { USB_DEVICE(OLIVETTI_VENDOR_ID, OLIVETTI_PRODUCT_OLICARD120),
From: Alexey Dobriyan adobriyan@gmail.com
[ Upstream commit a3a9060ecad030e2c7903b2b258383d2c716b56c ]
g++ reports
drivers/input/serio/i8042-x86ia64io.h:225:3: error: ‘.matches’ designator used multiple times in the same initializer list
C99 semantics is that last duplicated initialiser wins, so DMI entry gets overwritten.
Fixes: a48491c65b51 ("Input: i8042 - add ByteSpeed touchpad to noloop table") Signed-off-by: Alexey Dobriyan adobriyan@gmail.com Acked-by: Po-Hsu Lin po-hsu.lin@canonical.com Link: https://lore.kernel.org/r/20201228072335.GA27766@localhost.localdomain Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/serio/i8042-x86ia64io.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/input/serio/i8042-x86ia64io.h b/drivers/input/serio/i8042-x86ia64io.h index b256e3006a6fb..844875df8cad7 100644 --- a/drivers/input/serio/i8042-x86ia64io.h +++ b/drivers/input/serio/i8042-x86ia64io.h @@ -223,6 +223,8 @@ static const struct dmi_system_id __initconst i8042_dmi_noloop_table[] = { DMI_MATCH(DMI_SYS_VENDOR, "PEGATRON CORPORATION"), DMI_MATCH(DMI_PRODUCT_NAME, "C15B"), }, + }, + { .matches = { DMI_MATCH(DMI_SYS_VENDOR, "ByteSpeed LLC"), DMI_MATCH(DMI_PRODUCT_NAME, "ByteSpeed Laptop C15B"),
From: Zyta Szpak zr@semihalf.com
[ Upstream commit aa880c6f3ee6dbd0d5ab02026a514ff8ea0a3328 ]
Dcfg was overlapping with clockgen address space which resulted in failure in memory allocation for dcfg. According regs description dcfg size should not be bigger than 4KB.
Signed-off-by: Zyta Szpak zr@semihalf.com Fixes: 8126d88162a5 ("arm64: dts: add QorIQ LS1046A SoC support") Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi index c8ff0baddf1d0..cb49d21e317c0 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi @@ -304,7 +304,7 @@
dcfg: dcfg@1ee0000 { compatible = "fsl,ls1046a-dcfg", "syscon"; - reg = <0x0 0x1ee0000 0x0 0x10000>; + reg = <0x0 0x1ee0000 0x0 0x1000>; big-endian; };
From: Xie He xie.he.0141@gmail.com
[ Upstream commit 88c7a9fd9bdd3e453f04018920964c6f848a591a ]
When sending a packet, we will prepend it with an LAPB header. This modifies the shared parts of a cloned skb, so we should copy the skb rather than just clone it, before we prepend the header.
In "Documentation/networking/driver.rst" (the 2nd point), it states that drivers shouldn't modify the shared parts of a cloned skb when transmitting.
The "dev_queue_xmit_nit" function in "net/core/dev.c", which is called when an skb is being sent, clones the skb and sents the clone to AF_PACKET sockets. Because the LAPB drivers first remove a 1-byte pseudo-header before handing over the skb to us, if we don't copy the skb before prepending the LAPB header, the first byte of the packets received on AF_PACKET sockets can be corrupted.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Xie He xie.he.0141@gmail.com Acked-by: Martin Schiller ms@dev.tdt.de Link: https://lore.kernel.org/r/20210201055706.415842-1-xie.he.0141@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/lapb/lapb_out.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/lapb/lapb_out.c b/net/lapb/lapb_out.c index eda726e22f645..621c66f001177 100644 --- a/net/lapb/lapb_out.c +++ b/net/lapb/lapb_out.c @@ -87,7 +87,8 @@ void lapb_kick(struct lapb_cb *lapb) skb = skb_dequeue(&lapb->write_queue);
do { - if ((skbn = skb_clone(skb, GFP_ATOMIC)) == NULL) { + skbn = skb_copy(skb, GFP_ATOMIC); + if (!skbn) { skb_queue_head(&lapb->write_queue, skb); break; }
From: Josh Poimboeuf jpoimboe@redhat.com
commit e81e0724432542af8d8c702c31e9d82f57b1ff31 upstream.
When compiling the kernel with AS=clang, objtool produces a lot of warnings:
warning: objtool: missing symbol for section .text warning: objtool: missing symbol for section .init.text warning: objtool: missing symbol for section .ref.text
It then fails to generate the ORC table.
The problem is that objtool assumes text section symbols always exist. But the Clang assembler is aggressive about removing them.
When generating relocations for the ORC table, objtool always tries to reference instructions by their section symbol offset. If the section symbol doesn't exist, it bails.
Do a fallback: when a section symbol isn't available, reference a function symbol instead.
Reported-by: Dmitry Golovin dima@golovin.in Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Nathan Chancellor natechancellor@gmail.com Reviewed-by: Miroslav Benes mbenes@suse.cz Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://github.com/ClangBuiltLinux/linux/issues/669 Link: https://lkml.kernel.org/r/9a9cae7fcf628843aabe5a086b1a3c5bf50f42e8.158576102... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/objtool/orc_gen.c | 33 ++++++++++++++++++++++++++------- 1 file changed, 26 insertions(+), 7 deletions(-)
--- a/tools/objtool/orc_gen.c +++ b/tools/objtool/orc_gen.c @@ -98,11 +98,6 @@ static int create_orc_entry(struct secti struct orc_entry *orc; struct rela *rela;
- if (!insn_sec->sym) { - WARN("missing symbol for section %s", insn_sec->name); - return -1; - } - /* populate ORC data */ orc = (struct orc_entry *)u_sec->data->d_buf + idx; memcpy(orc, o, sizeof(*orc)); @@ -115,8 +110,32 @@ static int create_orc_entry(struct secti } memset(rela, 0, sizeof(*rela));
- rela->sym = insn_sec->sym; - rela->addend = insn_off; + if (insn_sec->sym) { + rela->sym = insn_sec->sym; + rela->addend = insn_off; + } else { + /* + * The Clang assembler doesn't produce section symbols, so we + * have to reference the function symbol instead: + */ + rela->sym = find_symbol_containing(insn_sec, insn_off); + if (!rela->sym) { + /* + * Hack alert. This happens when we need to reference + * the NOP pad insn immediately after the function. + */ + rela->sym = find_symbol_containing(insn_sec, + insn_off - 1); + } + if (!rela->sym) { + WARN("missing symbol for insn at offset 0x%lx\n", + insn_off); + return -1; + } + + rela->addend = insn_off - rela->sym->offset; + } + rela->type = R_X86_64_PC32; rela->offset = idx * sizeof(int);
From: Arnd Bergmann arnd@arndb.de
commit 6e7b64b9dd6d96537d816ea07ec26b7dedd397b9 upstream.
kernel/elfcore.c only contains weak symbols, which triggers a bug with clang in combination with recordmcount:
Cannot find symbol for section 2: .text. kernel/elfcore.o: failed
Move the empty stubs into linux/elfcore.h as inline functions. As only two architectures use these, just use the architecture specific Kconfig symbols to key off the declaration.
Link: https://lkml.kernel.org/r/20201204165742.3815221-2-arnd@kernel.org Signed-off-by: Arnd Bergmann arnd@arndb.de Cc: Nathan Chancellor natechancellor@gmail.com Cc: Nick Desaulniers ndesaulniers@google.com Cc: Barret Rhoden brho@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/elfcore.h | 22 ++++++++++++++++++++++ kernel/Makefile | 1 - kernel/elfcore.c | 26 -------------------------- 3 files changed, 22 insertions(+), 27 deletions(-) delete mode 100644 kernel/elfcore.c
--- a/include/linux/elfcore.h +++ b/include/linux/elfcore.h @@ -58,6 +58,7 @@ static inline int elf_core_copy_task_xfp } #endif
+#if defined(CONFIG_UM) || defined(CONFIG_IA64) /* * These functions parameterize elf_core_dump in fs/binfmt_elf.c to write out * extra segments containing the gate DSO contents. Dumping its @@ -72,5 +73,26 @@ elf_core_write_extra_phdrs(struct coredu extern int elf_core_write_extra_data(struct coredump_params *cprm); extern size_t elf_core_extra_data_size(void); +#else +static inline Elf_Half elf_core_extra_phdrs(void) +{ + return 0; +} + +static inline int elf_core_write_extra_phdrs(struct coredump_params *cprm, loff_t offset) +{ + return 1; +} + +static inline int elf_core_write_extra_data(struct coredump_params *cprm) +{ + return 1; +} + +static inline size_t elf_core_extra_data_size(void) +{ + return 0; +} +#endif
#endif /* _LINUX_ELFCORE_H */ --- a/kernel/Makefile +++ b/kernel/Makefile @@ -90,7 +90,6 @@ obj-$(CONFIG_TASK_DELAY_ACCT) += delayac obj-$(CONFIG_TASKSTATS) += taskstats.o tsacct.o obj-$(CONFIG_TRACEPOINTS) += tracepoint.o obj-$(CONFIG_LATENCYTOP) += latencytop.o -obj-$(CONFIG_ELFCORE) += elfcore.o obj-$(CONFIG_FUNCTION_TRACER) += trace/ obj-$(CONFIG_TRACING) += trace/ obj-$(CONFIG_TRACE_CLOCK) += trace/ --- a/kernel/elfcore.c +++ /dev/null @@ -1,26 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#include <linux/elf.h> -#include <linux/fs.h> -#include <linux/mm.h> -#include <linux/binfmts.h> -#include <linux/elfcore.h> - -Elf_Half __weak elf_core_extra_phdrs(void) -{ - return 0; -} - -int __weak elf_core_write_extra_phdrs(struct coredump_params *cprm, loff_t offset) -{ - return 1; -} - -int __weak elf_core_write_extra_data(struct coredump_params *cprm) -{ - return 1; -} - -size_t __weak elf_core_extra_data_size(void) -{ - return 0; -}
From: Wei Wang weiwan@google.com
[ upstream commit 5018c59607a511cdee743b629c76206d9c9e6d7b ]
Jesse and Ido reported the following race condition: <CPU A, t0> - Received packet A is forwarded and cached dst entry is taken from the nexthop ('nhc->nhc_rth_input'). Calls skb_dst_set()
<t1> - Given Jesse has busy routers ("ingesting full BGP routing tables from multiple ISPs"), route is added / deleted and rt_cache_flush() is called
<CPU B, t2> - Received packet B tries to use the same cached dst entry from t0, but rt_cache_valid() is no longer true and it is replaced in rt_cache_route() by the newer one. This calls dst_dev_put() on the original dst entry which assigns the blackhole netdev to 'dst->dev'
<CPU A, t3> - dst_input(skb) is called on packet A and it is dropped due to 'dst->dev' being the blackhole netdev
There are 2 issues in the v4 routing code: 1. A per-netns counter is used to do the validation of the route. That means whenever a route is changed in the netns, users of all routes in the netns needs to redo lookup. v6 has an implementation of only updating fn_sernum for routes that are affected. 2. When rt_cache_valid() returns false, rt_cache_route() is called to throw away the current cache, and create a new one. This seems unnecessary because as long as this route does not change, the route cache does not need to be recreated.
To fully solve the above 2 issues, it probably needs quite some code changes and requires careful testing, and does not suite for net branch.
So this patch only tries to add the deleted cached rt into the uncached list, so user could still be able to use it to receive packets until it's done.
Fixes: 95c47f9cf5e0 ("ipv4: call dst_dev_put() properly") Signed-off-by: Wei Wang weiwan@google.com Reported-by: Ido Schimmel idosch@idosch.org Reported-by: Jesse Hathaway jesse@mbuki-mvuki.org Tested-by: Jesse Hathaway jesse@mbuki-mvuki.org Acked-by: Martin KaFai Lau kafai@fb.com Cc: David Ahern dsahern@gmail.com Reviewed-by: Ido Schimmel idosch@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Carsten Schmid carsten_schmid@mentor.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/route.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-)
--- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1437,6 +1437,24 @@ static bool rt_bind_exception(struct rta return ret; }
+struct uncached_list { + spinlock_t lock; + struct list_head head; +}; + +static DEFINE_PER_CPU_ALIGNED(struct uncached_list, rt_uncached_list); + +static void rt_add_uncached_list(struct rtable *rt) +{ + struct uncached_list *ul = raw_cpu_ptr(&rt_uncached_list); + + rt->rt_uncached_list = ul; + + spin_lock_bh(&ul->lock); + list_add_tail(&rt->rt_uncached, &ul->head); + spin_unlock_bh(&ul->lock); +} + static bool rt_cache_route(struct fib_nh *nh, struct rtable *rt) { struct rtable *orig, *prev, **p; @@ -1456,7 +1474,7 @@ static bool rt_cache_route(struct fib_nh prev = cmpxchg(p, orig, rt); if (prev == orig) { if (orig) { - dst_dev_put(&orig->dst); + rt_add_uncached_list(orig); dst_release(&orig->dst); } } else { @@ -1467,24 +1485,6 @@ static bool rt_cache_route(struct fib_nh return ret; }
-struct uncached_list { - spinlock_t lock; - struct list_head head; -}; - -static DEFINE_PER_CPU_ALIGNED(struct uncached_list, rt_uncached_list); - -static void rt_add_uncached_list(struct rtable *rt) -{ - struct uncached_list *ul = raw_cpu_ptr(&rt_uncached_list); - - rt->rt_uncached_list = ul; - - spin_lock_bh(&ul->lock); - list_add_tail(&rt->rt_uncached, &ul->head); - spin_unlock_bh(&ul->lock); -} - static void ipv4_dst_destroy(struct dst_entry *dst) { struct dst_metrics *p = (struct dst_metrics *)DST_METRICS_PTR(dst);
From: Dan Carpenter dan.carpenter@oracle.com
commit 3e1f4a2e1184ae6ad7f4caf682ced9554141a0f4 upstream.
This code should return -ENOMEM if the allocation fails but it currently returns success.
Fixes: 9b95236eebdb ("usb: gadget: ether: allocate and init otg descriptor by otg capabilities") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Link: https://lore.kernel.org/r/YBKE9rqVuJEOUWpW@mwanda Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/legacy/ether.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/gadget/legacy/ether.c +++ b/drivers/usb/gadget/legacy/ether.c @@ -407,8 +407,10 @@ static int eth_bind(struct usb_composite struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(gadget); - if (!usb_desc) + if (!usb_desc) { + status = -ENOMEM; goto fail1; + } usb_otg_descriptor_init(gadget, usb_desc); otg_desc[0] = usb_desc; otg_desc[1] = NULL;
From: Jeremy Figgins kernel@jeremyfiggins.com
commit d8c6edfa3f4ee0d45d7ce5ef18d1245b78774b9d upstream.
Some devices, such as the Winbond Electronics Corp. Virtual Com Port (Vendor=0416, ProdId=5011), lockup when usb_set_interface() or usb_clear_halt() are called. This device has only a single altsetting, so it should not be necessary to call usb_set_interface().
Acked-by: Pete Zaitcev zaitcev@redhat.com Signed-off-by: Jeremy Figgins kernel@jeremyfiggins.com Link: https://lore.kernel.org/r/YAy9kJhM/rG8EQXC@watson Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/usblp.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-)
--- a/drivers/usb/class/usblp.c +++ b/drivers/usb/class/usblp.c @@ -1340,14 +1340,17 @@ static int usblp_set_protocol(struct usb if (protocol < USBLP_FIRST_PROTOCOL || protocol > USBLP_LAST_PROTOCOL) return -EINVAL;
- alts = usblp->protocol[protocol].alt_setting; - if (alts < 0) - return -EINVAL; - r = usb_set_interface(usblp->dev, usblp->ifnum, alts); - if (r < 0) { - printk(KERN_ERR "usblp: can't set desired altsetting %d on interface %d\n", - alts, usblp->ifnum); - return r; + /* Don't unnecessarily set the interface if there's a single alt. */ + if (usblp->intf->num_altsetting > 1) { + alts = usblp->protocol[protocol].alt_setting; + if (alts < 0) + return -EINVAL; + r = usb_set_interface(usblp->dev, usblp->ifnum, alts); + if (r < 0) { + printk(KERN_ERR "usblp: can't set desired altsetting %d on interface %d\n", + alts, usblp->ifnum); + return r; + } }
usblp->bidir = (usblp->protocol[protocol].epread != NULL);
From: Heiko Stuebner heiko.stuebner@theobroma-systems.com
commit f670e9f9c8cac716c3506c6bac9e997b27ad441a upstream.
dwc2_hsotg_process_req_status uses ep_from_windex() to retrieve the endpoint for the index provided in the wIndex request param.
In a test-case with a rndis gadget running and sending a malformed packet to it like: dev.ctrl_transfer( 0x82, # bmRequestType 0x00, # bRequest 0x0000, # wValue 0x0001, # wIndex 0x00 # wLength ) it is possible to cause a crash:
[ 217.533022] dwc2 ff300000.usb: dwc2_hsotg_process_req_status: USB_REQ_GET_STATUS [ 217.559003] Unable to handle kernel read from unreadable memory at virtual address 0000000000000088 ... [ 218.313189] Call trace: [ 218.330217] ep_from_windex+0x3c/0x54 [ 218.348565] usb_gadget_giveback_request+0x10/0x20 [ 218.368056] dwc2_hsotg_complete_request+0x144/0x184
This happens because ep_from_windex wants to compare the endpoint direction even if index_to_ep() didn't return an endpoint due to the direction not matching.
The fix is easy insofar that the actual direction check is already happening when calling index_to_ep() which will return NULL if there is no endpoint for the targeted direction, so the offending check can go away completely.
Fixes: c6f5c050e2a7 ("usb: dwc2: gadget: add bi-directional endpoint support") Cc: stable@vger.kernel.org Reported-by: Gerhard Klostermeier gerhard.klostermeier@syss.de Signed-off-by: Heiko Stuebner heiko.stuebner@theobroma-systems.com Link: https://lore.kernel.org/r/20210127103919.58215-1-heiko@sntech.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/gadget.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)
--- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -1470,7 +1470,6 @@ static void dwc2_hsotg_complete_oursetup static struct dwc2_hsotg_ep *ep_from_windex(struct dwc2_hsotg *hsotg, u32 windex) { - struct dwc2_hsotg_ep *ep; int dir = (windex & USB_DIR_IN) ? 1 : 0; int idx = windex & 0x7F;
@@ -1480,12 +1479,7 @@ static struct dwc2_hsotg_ep *ep_from_win if (idx > hsotg->num_of_eps) return NULL;
- ep = index_to_ep(hsotg, idx, dir); - - if (idx && ep->dir_in != dir) - return NULL; - - return ep; + return index_to_ep(hsotg, idx, dir); }
/**
From: Liangyan liangyan.peng@linux.alibaba.com
commit e04527fefba6e4e66492f122cf8cc6314f3cf3bf upstream.
We need to lock d_parent->d_lock before dget_dlock, or this may have d_lockref updated parallelly like calltrace below which will cause dentry->d_lockref leak and risk a crash.
CPU 0 CPU 1 ovl_set_redirect lookup_fast ovl_get_redirect __d_lookup dget_dlock //no lock protection here spin_lock(&dentry->d_lock) dentry->d_lockref.count++ dentry->d_lockref.count++
[ 49.799059] PGD 800000061fed7067 P4D 800000061fed7067 PUD 61fec5067 PMD 0 [ 49.799689] Oops: 0002 [#1] SMP PTI [ 49.800019] CPU: 2 PID: 2332 Comm: node Not tainted 4.19.24-7.20.al7.x86_64 #1 [ 49.800678] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8a46cfe 04/01/2014 [ 49.801380] RIP: 0010:_raw_spin_lock+0xc/0x20 [ 49.803470] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246 [ 49.803949] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000 [ 49.804600] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088 [ 49.805252] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040 [ 49.805898] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000 [ 49.806548] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0 [ 49.807200] FS: 00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000 [ 49.807935] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 49.808461] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0 [ 49.809113] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 49.809758] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 49.810410] Call Trace: [ 49.810653] d_delete+0x2c/0xb0 [ 49.810951] vfs_rmdir+0xfd/0x120 [ 49.811264] do_rmdir+0x14f/0x1a0 [ 49.811573] do_syscall_64+0x5b/0x190 [ 49.811917] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 49.812385] RIP: 0033:0x7ffbf505ffd7 [ 49.814404] RSP: 002b:00007ffbedffada8 EFLAGS: 00000297 ORIG_RAX: 0000000000000054 [ 49.815098] RAX: ffffffffffffffda RBX: 00007ffbedffb640 RCX: 00007ffbf505ffd7 [ 49.815744] RDX: 0000000004449700 RSI: 0000000000000000 RDI: 0000000006c8cd50 [ 49.816394] RBP: 00007ffbedffaea0 R08: 0000000000000000 R09: 0000000000017d0b [ 49.817038] R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000012 [ 49.817687] R13: 00000000072823d8 R14: 00007ffbedffb700 R15: 00000000072823d8 [ 49.818338] Modules linked in: pvpanic cirrusfb button qemu_fw_cfg atkbd libps2 i8042 [ 49.819052] CR2: 0000000000000088 [ 49.819368] ---[ end trace 4e652b8aa299aa2d ]--- [ 49.819796] RIP: 0010:_raw_spin_lock+0xc/0x20 [ 49.821880] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246 [ 49.822363] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000 [ 49.823008] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088 [ 49.823658] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040 [ 49.825404] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000 [ 49.827147] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0 [ 49.828890] FS: 00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000 [ 49.830725] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 49.832359] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0 [ 49.834085] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 49.835792] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Cc: stable@vger.kernel.org Fixes: a6c606551141 ("ovl: redirect on rename-dir") Signed-off-by: Liangyan liangyan.peng@linux.alibaba.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Suggested-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/overlayfs/dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -850,8 +850,8 @@ static char *ovl_get_redirect(struct den
buflen -= thislen; memcpy(&buf[buflen], name, thislen); - tmp = dget_dlock(d->d_parent); spin_unlock(&d->d_lock); + tmp = dget_parent(d);
dput(d); d = tmp;
From: Felix Fietkau nbd@nbd.name
commit 18fe0fae61252b5ae6e26553e2676b5fac555951 upstream.
If the driver uses .sta_add, station entries are only uploaded after the sta is in assoc state. Fix early station rate table updates by deferring them until the sta has been uploaded.
Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau nbd@nbd.name Link: https://lore.kernel.org/r/20210201083324.3134-1-nbd@nbd.name [use rcu_access_pointer() instead since we won't dereference here] Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/driver-ops.c | 5 ++++- net/mac80211/rate.c | 3 ++- 2 files changed, 6 insertions(+), 2 deletions(-)
--- a/net/mac80211/driver-ops.c +++ b/net/mac80211/driver-ops.c @@ -128,8 +128,11 @@ int drv_sta_state(struct ieee80211_local } else if (old_state == IEEE80211_STA_AUTH && new_state == IEEE80211_STA_ASSOC) { ret = drv_sta_add(local, sdata, &sta->sta); - if (ret == 0) + if (ret == 0) { sta->uploaded = true; + if (rcu_access_pointer(sta->sta.rates)) + drv_sta_rate_tbl_update(local, sdata, &sta->sta); + } } else if (old_state == IEEE80211_STA_ASSOC && new_state == IEEE80211_STA_AUTH) { drv_sta_remove(local, sdata, &sta->sta); --- a/net/mac80211/rate.c +++ b/net/mac80211/rate.c @@ -941,7 +941,8 @@ int rate_control_set_rates(struct ieee80 if (old) kfree_rcu(old, rcu_head);
- drv_sta_rate_tbl_update(hw_to_local(hw), sta->sdata, pubsta); + if (sta->uploaded) + drv_sta_rate_tbl_update(hw_to_local(hw), sta->sdata, pubsta);
ieee80211_sta_set_expected_throughput(pubsta, sta_get_expected_throughput(sta));
From: Wang ShaoBo bobo.shaobowang@huawei.com
commit 0188b87899ffc4a1d36a0badbe77d56c92fd91dc upstream.
Our system encountered a re-init error when re-registering same kretprobe, where the kretprobe_instance in rp->free_instances is illegally accessed after re-init.
Implementation to avoid re-registration has been introduced for kprobe before, but lags for register_kretprobe(). We must check if kprobe has been re-registered before re-initializing kretprobe, otherwise it will destroy the data struct of kretprobe registered, which can lead to memory leak, system crash, also some unexpected behaviors.
We use check_kprobe_rereg() to check if kprobe has been re-registered before running register_kretprobe()'s body, for giving a warning message and terminate registration process.
Link: https://lkml.kernel.org/r/20210128124427.2031088-1-bobo.shaobowang@huawei.co...
Cc: stable@vger.kernel.org Fixes: 1f0ab40976460 ("kprobes: Prevent re-registration of the same kprobe") [ The above commit should have been done for kretprobes too ] Acked-by: Naveen N. Rao naveen.n.rao@linux.vnet.ibm.com Acked-by: Ananth N Mavinakayanahalli ananth@linux.ibm.com Acked-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Wang ShaoBo bobo.shaobowang@huawei.com Signed-off-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/kprobes.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1989,6 +1989,10 @@ int register_kretprobe(struct kretprobe if (!kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset)) return -EINVAL;
+ /* If only rp->kp.addr is specified, check reregistering kprobes */ + if (rp->kp.addr && check_kprobe_rereg(&rp->kp)) + return -EINVAL; + if (kretprobe_blacklist_size) { addr = kprobe_addr(&rp->kp); if (IS_ERR(addr))
From: Mathias Nyman mathias.nyman@linux.intel.com
commit d4a610635400ccc382792f6be69427078541c678 upstream.
xhci driver may in some special cases need to copy small amounts of payload data to a bounce buffer in order to meet the boundary and alignment restrictions set by the xHCI specification.
In the majority of these cases the data is in a sg list, and driver incorrectly assumed data is always in urb->sg when using the bounce buffer.
If data instead is contiguous, and in urb->transfer_buffer, we may still need to bounce buffer a small part if data starts very close (less than packet size) to a 64k boundary.
Check if sg list is used before copying data to/from it.
Fixes: f9c589e142d0 ("xhci: TD-fragment, align the unsplittable case with a bounce buffer") Cc: stable@vger.kernel.org Reported-by: Andreas Hartmann andihartmann@01019freenet.de Tested-by: Andreas Hartmann andihartmann@01019freenet.de Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20210203113702.436762-2-mathias.nyman@linux.intel.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-ring.c | 31 ++++++++++++++++++++----------- 1 file changed, 20 insertions(+), 11 deletions(-)
--- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -681,11 +681,16 @@ static void xhci_unmap_td_bounce_buffer( dma_unmap_single(dev, seg->bounce_dma, ring->bounce_buf_len, DMA_FROM_DEVICE); /* for in tranfers we need to copy the data from bounce to sg */ - len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs, seg->bounce_buf, - seg->bounce_len, seg->bounce_offs); - if (len != seg->bounce_len) - xhci_warn(xhci, "WARN Wrong bounce buffer read length: %zu != %d\n", - len, seg->bounce_len); + if (urb->num_sgs) { + len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs, seg->bounce_buf, + seg->bounce_len, seg->bounce_offs); + if (len != seg->bounce_len) + xhci_warn(xhci, "WARN Wrong bounce buffer read length: %zu != %d\n", + len, seg->bounce_len); + } else { + memcpy(urb->transfer_buffer + seg->bounce_offs, seg->bounce_buf, + seg->bounce_len); + } seg->bounce_len = 0; seg->bounce_offs = 0; } @@ -3252,12 +3257,16 @@ static int xhci_align_td(struct xhci_hcd
/* create a max max_pkt sized bounce buffer pointed to by last trb */ if (usb_urb_dir_out(urb)) { - len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs, - seg->bounce_buf, new_buff_len, enqd_len); - if (len != new_buff_len) - xhci_warn(xhci, - "WARN Wrong bounce buffer write length: %zu != %d\n", - len, new_buff_len); + if (urb->num_sgs) { + len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs, + seg->bounce_buf, new_buff_len, enqd_len); + if (len != new_buff_len) + xhci_warn(xhci, "WARN Wrong bounce buffer write length: %zu != %d\n", + len, new_buff_len); + } else { + memcpy(seg->bounce_buf, urb->transfer_buffer + enqd_len, new_buff_len); + } + seg->bounce_dma = dma_map_single(dev, seg->bounce_buf, max_pkt, DMA_TO_DEVICE); } else {
From: Aurelien Aptel aaptel@suse.com
commit 21b200d091826a83aafc95d847139b2b0582f6d1 upstream.
Assuming - //HOST/a is mounted on /mnt - //HOST/b is mounted on /mnt/b
On a slow connection, running 'df' and killing it while it's processing /mnt/b can make cifs_get_inode_info() returns -ERESTARTSYS.
This triggers the following chain of events: => the dentry revalidation fail => dentry is put and released => superblock associated with the dentry is put => /mnt/b is unmounted
This patch makes cifs_d_revalidate() return the error instead of 0 (invalid) when cifs_revalidate_dentry() fails, except for ENOENT (file deleted) and ESTALE (file recreated).
Signed-off-by: Aurelien Aptel aaptel@suse.com Suggested-by: Shyam Prasad N nspmangalore@gmail.com Reviewed-by: Shyam Prasad N nspmangalore@gmail.com CC: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/dir.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-)
--- a/fs/cifs/dir.c +++ b/fs/cifs/dir.c @@ -841,6 +841,7 @@ static int cifs_d_revalidate(struct dentry *direntry, unsigned int flags) { struct inode *inode; + int rc;
if (flags & LOOKUP_RCU) return -ECHILD; @@ -850,8 +851,25 @@ cifs_d_revalidate(struct dentry *direntr if ((flags & LOOKUP_REVAL) && !CIFS_CACHE_READ(CIFS_I(inode))) CIFS_I(inode)->time = 0; /* force reval */
- if (cifs_revalidate_dentry(direntry)) - return 0; + rc = cifs_revalidate_dentry(direntry); + if (rc) { + cifs_dbg(FYI, "cifs_revalidate_dentry failed with rc=%d", rc); + switch (rc) { + case -ENOENT: + case -ESTALE: + /* + * Those errors mean the dentry is invalid + * (file was deleted or recreated) + */ + return 0; + default: + /* + * Otherwise some unexpected error happened + * report it as-is to VFS layer + */ + return rc; + } + } else { /* * If the inode wasn't known to be a dfs entry when
From: Gustavo A. R. Silva gustavoars@kernel.org
commit 8d8d1dbefc423d42d626cf5b81aac214870ebaab upstream.
While addressing some warnings generated by -Warray-bounds, I found this bug that was introduced back in 2017:
CC [M] fs/cifs/smb2pdu.o fs/cifs/smb2pdu.c: In function ‘SMB2_negotiate’: fs/cifs/smb2pdu.c:822:16: warning: array subscript 1 is above array bounds of ‘__le16[1]’ {aka ‘short unsigned int[1]’} [-Warray-bounds] 822 | req->Dialects[1] = cpu_to_le16(SMB30_PROT_ID); | ~~~~~~~~~~~~~^~~ fs/cifs/smb2pdu.c:823:16: warning: array subscript 2 is above array bounds of ‘__le16[1]’ {aka ‘short unsigned int[1]’} [-Warray-bounds] 823 | req->Dialects[2] = cpu_to_le16(SMB302_PROT_ID); | ~~~~~~~~~~~~~^~~ fs/cifs/smb2pdu.c:824:16: warning: array subscript 3 is above array bounds of ‘__le16[1]’ {aka ‘short unsigned int[1]’} [-Warray-bounds] 824 | req->Dialects[3] = cpu_to_le16(SMB311_PROT_ID); | ~~~~~~~~~~~~~^~~ fs/cifs/smb2pdu.c:816:16: warning: array subscript 1 is above array bounds of ‘__le16[1]’ {aka ‘short unsigned int[1]’} [-Warray-bounds] 816 | req->Dialects[1] = cpu_to_le16(SMB302_PROT_ID); | ~~~~~~~~~~~~~^~~
At the time, the size of array _Dialects_ was changed from 1 to 3 in struct validate_negotiate_info_req, and then in 2019 it was changed from 3 to 4, but those changes were never made in struct smb2_negotiate_req, which has led to a 3 and a half years old out-of-bounds bug in function SMB2_negotiate() (fs/cifs/smb2pdu.c).
Fix this by increasing the size of array _Dialects_ in struct smb2_negotiate_req to 4.
Fixes: 9764c02fcbad ("SMB3: Add support for multidialect negotiate (SMB2.1 and later)") Fixes: d5c7076b772a ("smb3: add smb3.1.1 to default dialect list") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/smb2pdu.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/cifs/smb2pdu.h +++ b/fs/cifs/smb2pdu.h @@ -206,7 +206,7 @@ struct smb2_negotiate_req { __le32 NegotiateContextOffset; /* SMB3.1.1 only. MBZ earlier */ __le16 NegotiateContextCount; /* SMB3.1.1 only. MBZ earlier */ __le16 Reserved2; - __le16 Dialects[1]; /* One dialect (vers=) at a time for now */ + __le16 Dialects[4]; /* BB expand this if autonegotiate > 4 dialects */ } __packed;
/* Dialects */
From: Fengnan Chang fengnanchang@gmail.com
commit f92e04f764b86e55e522988e6f4b6082d19a2721 upstream.
When analysing tuples fails we may loop indefinitely to retry. Let's avoid this by using a 10s timeout and bail if not completed earlier.
Signed-off-by: Fengnan Chang fengnanchang@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210123033230.36442-1-fengnanchang@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/sdio_cis.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/mmc/core/sdio_cis.c +++ b/drivers/mmc/core/sdio_cis.c @@ -24,6 +24,8 @@ #include "sdio_cis.h" #include "sdio_ops.h"
+#define SDIO_READ_CIS_TIMEOUT_MS (10 * 1000) /* 10s */ + static int cistpl_vers_1(struct mmc_card *card, struct sdio_func *func, const unsigned char *buf, unsigned size) { @@ -270,6 +272,8 @@ static int sdio_read_cis(struct mmc_card
do { unsigned char tpl_code, tpl_link; + unsigned long timeout = jiffies + + msecs_to_jiffies(SDIO_READ_CIS_TIMEOUT_MS);
ret = mmc_io_rw_direct(card, 0, 0, ptr++, 0, &tpl_code); if (ret) @@ -322,6 +326,8 @@ static int sdio_read_cis(struct mmc_card prev = &this->next;
if (ret == -ENOENT) { + if (time_after(jiffies, timeout)) + break; /* warn about unknown tuples */ pr_warn_ratelimited("%s: queuing unknown" " CIS tuple 0x%02x (%u bytes)\n",
From: Thorsten Leemhuis linux@leemhuis.info
commit 538e4a8c571efdf131834431e0c14808bcfb1004 upstream.
Some Kingston A2000 NVMe SSDs sooner or later get confused and stop working when they use the deepest APST sleep while running Linux. The system then crashes and one has to cold boot it to get the SSD working again.
Kingston seems to known about this since at least mid-September 2020: https://bbs.archlinux.org/viewtopic.php?pid=1926994#p1926994
Someone working for a German company representing Kingston to the German press confirmed to me Kingston engineering is aware of the issue and investigating; the person stated that to their current knowledge only the deepest APST sleep state causes trouble. Therefore, make Linux avoid it for now by applying the NVME_QUIRK_NO_DEEPEST_PS to this SSD.
I have two such SSDs, but it seems the problem doesn't occur with them. I hence couldn't verify if this patch really fixes the problem, but all the data in front of me suggests it should.
This patch can easily be reverted or improved upon if a better solution surfaces.
FWIW, there are many reports about the issue scattered around the web; most of the users disabled APST completely to make things work, some just made Linux avoid the deepest sleep state:
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c65 https://bugzilla.kernel.org/show_bug.cgi?id=195039#c73 https://bugzilla.kernel.org/show_bug.cgi?id=195039#c74 https://bugzilla.kernel.org/show_bug.cgi?id=195039#c78 https://bugzilla.kernel.org/show_bug.cgi?id=195039#c79 https://bugzilla.kernel.org/show_bug.cgi?id=195039#c80 https://askubuntu.com/questions/1222049/nvmekingston-a2000-sometimes-stops-g... https://community.acer.com/en/discussion/604326/m-2-nvme-ssd-aspire-517-51g-...
For the record, some data from 'nvme id-ctrl /dev/nvme0'
NVME Identify Controller: vid : 0x2646 ssvid : 0x2646 mn : KINGSTON SA2000M81000G fr : S5Z42105 [...] ps 0 : mp:9.00W operational enlat:0 exlat:0 rrt:0 rrl:0 rwt:0 rwl:0 idle_power:- active_power:- ps 1 : mp:4.60W operational enlat:0 exlat:0 rrt:1 rrl:1 rwt:1 rwl:1 idle_power:- active_power:- ps 2 : mp:3.80W operational enlat:0 exlat:0 rrt:2 rrl:2 rwt:2 rwl:2 idle_power:- active_power:- ps 3 : mp:0.0450W non-operational enlat:2000 exlat:2000 rrt:3 rrl:3 rwt:3 rwl:3 idle_power:- active_power:- ps 4 : mp:0.0040W non-operational enlat:15000 exlat:15000 rrt:4 rrl:4 rwt:4 rwl:4 idle_power:- active_power:-
Cc: stable@vger.kernel.org # 4.14+ Signed-off-by: Thorsten Leemhuis linux@leemhuis.info Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nvme/host/pci.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2588,6 +2588,8 @@ static const struct pci_device_id nvme_i { PCI_DEVICE(0x1d1d, 0x2601), /* CNEX Granby */ .driver_data = NVME_QUIRK_LIGHTNVM, }, { PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) }, + { PCI_DEVICE(0x2646, 0x2263), /* KINGSTON A2000 NVMe SSD */ + .driver_data = NVME_QUIRK_NO_DEEPEST_PS, }, { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001) }, { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) }, { 0, }
From: Russell King rmk+kernel@armlinux.org.uk
commit 39d3454c3513840eb123b3913fda6903e45ce671 upstream.
Building with gcc 4.9.2 reveals a latent bug in the PCI accessors for Footbridge platforms, which causes a fatal alignment fault while accessing IO memory. Fix this by making the assembly volatile.
Cc: stable@vger.kernel.org Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mach-footbridge/dc21285.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/arch/arm/mach-footbridge/dc21285.c +++ b/arch/arm/mach-footbridge/dc21285.c @@ -69,15 +69,15 @@ dc21285_read_config(struct pci_bus *bus, if (addr) switch (size) { case 1: - asm("ldrb %0, [%1, %2]" + asm volatile("ldrb %0, [%1, %2]" : "=r" (v) : "r" (addr), "r" (where) : "cc"); break; case 2: - asm("ldrh %0, [%1, %2]" + asm volatile("ldrh %0, [%1, %2]" : "=r" (v) : "r" (addr), "r" (where) : "cc"); break; case 4: - asm("ldr %0, [%1, %2]" + asm volatile("ldr %0, [%1, %2]" : "=r" (v) : "r" (addr), "r" (where) : "cc"); break; } @@ -103,17 +103,17 @@ dc21285_write_config(struct pci_bus *bus if (addr) switch (size) { case 1: - asm("strb %0, [%1, %2]" + asm volatile("strb %0, [%1, %2]" : : "r" (value), "r" (addr), "r" (where) : "cc"); break; case 2: - asm("strh %0, [%1, %2]" + asm volatile("strh %0, [%1, %2]" : : "r" (value), "r" (addr), "r" (where) : "cc"); break; case 4: - asm("str %0, [%1, %2]" + asm volatile("str %0, [%1, %2]" : : "r" (value), "r" (addr), "r" (where) : "cc"); break;
From: Muchun Song songmuchun@bytedance.com
commit 585fc0d2871c9318c949fbf45b1f081edd489e96 upstream.
If a new hugetlb page is allocated during fallocate it will not be marked as active (set_page_huge_active) which will result in a later isolate_huge_page failure when the page migration code would like to move that page. Such a failure would be unexpected and wrong.
Only export set_page_huge_active, just leave clear_page_huge_active as static. Because there are no external users.
Link: https://lkml.kernel.org/r/20210115124942.46403-3-songmuchun@bytedance.com Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate()) Signed-off-by: Muchun Song songmuchun@bytedance.com Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Reviewed-by: Oscar Salvador osalvador@suse.de Cc: David Hildenbrand david@redhat.com Cc: Yang Shi shy828301@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/hugetlbfs/inode.c | 3 ++- include/linux/hugetlb.h | 3 +++ mm/hugetlb.c | 2 +- 3 files changed, 6 insertions(+), 2 deletions(-)
--- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -649,8 +649,9 @@ static long hugetlbfs_fallocate(struct f
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ set_page_huge_active(page); /* - * page_put due to reference from alloc_huge_page() + * put_page() due to reference from alloc_huge_page() * unlock_page because locked by add_to_page_cache() */ put_page(page); --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -531,6 +531,9 @@ static inline void set_huge_swap_pte_at( set_huge_pte_at(mm, addr, ptep, pte); } #endif + +void set_page_huge_active(struct page *page); + #else /* CONFIG_HUGETLB_PAGE */ struct hstate {}; #define alloc_huge_page(v, a, r) NULL --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1238,7 +1238,7 @@ bool page_huge_active(struct page *page) }
/* never called for tail page */ -static void set_page_huge_active(struct page *page) +void set_page_huge_active(struct page *page) { VM_BUG_ON_PAGE(!PageHeadHuge(page), page); SetPagePrivate(&page[1]);
From: Muchun Song songmuchun@bytedance.com
commit 0eb2df2b5629794020f75e94655e1994af63f0d4 upstream.
There is a race between isolate_huge_page() and __free_huge_page().
CPU0: CPU1:
if (PageHuge(page)) put_page(page) __free_huge_page(page) spin_lock(&hugetlb_lock) update_and_free_page(page) set_compound_page_dtor(page, NULL_COMPOUND_DTOR) spin_unlock(&hugetlb_lock) isolate_huge_page(page) // trigger BUG_ON VM_BUG_ON_PAGE(!PageHead(page), page) spin_lock(&hugetlb_lock) page_huge_active(page) // trigger BUG_ON VM_BUG_ON_PAGE(!PageHuge(page), page) spin_unlock(&hugetlb_lock)
When we isolate a HugeTLB page on CPU0. Meanwhile, we free it to the buddy allocator on CPU1. Then, we can trigger a BUG_ON on CPU0, because it is already freed to the buddy allocator.
Link: https://lkml.kernel.org/r/20210115124942.46403-5-songmuchun@bytedance.com Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Signed-off-by: Muchun Song songmuchun@bytedance.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Oscar Salvador osalvador@suse.de Cc: David Hildenbrand david@redhat.com Cc: Yang Shi shy828301@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/hugetlb.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4865,9 +4865,9 @@ bool isolate_huge_page(struct page *page { bool ret = true;
- VM_BUG_ON_PAGE(!PageHead(page), page); spin_lock(&hugetlb_lock); - if (!page_huge_active(page) || !get_page_unless_zero(page)) { + if (!PageHeadHuge(page) || !page_huge_active(page) || + !get_page_unless_zero(page)) { ret = false; goto unlock; }
From: Muchun Song songmuchun@bytedance.com
commit ecbf4724e6061b4b01be20f6d797d64d462b2bc8 upstream.
The page_huge_active() can be called from scan_movable_pages() which do not hold a reference count to the HugeTLB page. So when we call page_huge_active() from scan_movable_pages(), the HugeTLB page can be freed parallel. Then we will trigger a BUG_ON which is in the page_huge_active() when CONFIG_DEBUG_VM is enabled. Just remove the VM_BUG_ON_PAGE.
Link: https://lkml.kernel.org/r/20210115124942.46403-6-songmuchun@bytedance.com Fixes: 7e1f049efb86 ("mm: hugetlb: cleanup using paeg_huge_active()") Signed-off-by: Muchun Song songmuchun@bytedance.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Oscar Salvador osalvador@suse.de Cc: David Hildenbrand david@redhat.com Cc: Yang Shi shy828301@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/hugetlb.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1233,8 +1233,7 @@ struct hstate *size_to_hstate(unsigned l */ bool page_huge_active(struct page *page) { - VM_BUG_ON_PAGE(!PageHuge(page), page); - return PageHead(page) && PagePrivate(&page[1]); + return PageHeadHuge(page) && PagePrivate(&page[1]); }
/* never called for tail page */
From: Hugh Dickins hughd@google.com
commit 1c2f67308af4c102b4e1e6cd6f69819ae59408e0 upstream.
Sergey reported deadlock between kswapd correctly doing its usual lock_page(page) followed by down_read(page->mapping->i_mmap_rwsem), and madvise(MADV_REMOVE) on an madvise(MADV_HUGEPAGE) area doing down_write(page->mapping->i_mmap_rwsem) followed by lock_page(page).
This happened when shmem_fallocate(punch hole)'s unmap_mapping_range() reaches zap_pmd_range()'s call to __split_huge_pmd(). The same deadlock could occur when partially truncating a mapped huge tmpfs file, or using fallocate(FALLOC_FL_PUNCH_HOLE) on it.
__split_huge_pmd()'s page lock was added in 5.8, to make sure that any concurrent use of reuse_swap_page() (holding page lock) could not catch the anon THP's mapcounts and swapcounts while they were being split.
Fortunately, reuse_swap_page() is never applied to a shmem or file THP (not even by khugepaged, which checks PageSwapCache before calling), and anonymous THPs are never created in shmem or file areas: so that __split_huge_pmd()'s page lock can only be necessary for anonymous THPs, on which there is no risk of deadlock with i_mmap_rwsem.
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2101161409470.2022@eggly.anvils Fixes: c444eb564fb1 ("mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()") Signed-off-by: Hugh Dickins hughd@google.com Reported-by: Sergey Senozhatsky sergey.senozhatsky.work@gmail.com Reviewed-by: Andrea Arcangeli aarcange@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/huge_memory.c | 37 +++++++++++++++++++++++-------------- 1 file changed, 23 insertions(+), 14 deletions(-)
--- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2198,7 +2198,7 @@ void __split_huge_pmd(struct vm_area_str spinlock_t *ptl; struct mm_struct *mm = vma->vm_mm; unsigned long haddr = address & HPAGE_PMD_MASK; - bool was_locked = false; + bool do_unlock_page = false; pmd_t _pmd;
mmu_notifier_invalidate_range_start(mm, haddr, haddr + HPAGE_PMD_SIZE); @@ -2211,7 +2211,6 @@ void __split_huge_pmd(struct vm_area_str VM_BUG_ON(freeze && !page); if (page) { VM_WARN_ON_ONCE(!PageLocked(page)); - was_locked = true; if (page != pmd_page(*pmd)) goto out; } @@ -2220,19 +2219,29 @@ repeat: if (pmd_trans_huge(*pmd)) { if (!page) { page = pmd_page(*pmd); - if (unlikely(!trylock_page(page))) { - get_page(page); - _pmd = *pmd; - spin_unlock(ptl); - lock_page(page); - spin_lock(ptl); - if (unlikely(!pmd_same(*pmd, _pmd))) { - unlock_page(page); + /* + * An anonymous page must be locked, to ensure that a + * concurrent reuse_swap_page() sees stable mapcount; + * but reuse_swap_page() is not used on shmem or file, + * and page lock must not be taken when zap_pmd_range() + * calls __split_huge_pmd() while i_mmap_lock is held. + */ + if (PageAnon(page)) { + if (unlikely(!trylock_page(page))) { + get_page(page); + _pmd = *pmd; + spin_unlock(ptl); + lock_page(page); + spin_lock(ptl); + if (unlikely(!pmd_same(*pmd, _pmd))) { + unlock_page(page); + put_page(page); + page = NULL; + goto repeat; + } put_page(page); - page = NULL; - goto repeat; } - put_page(page); + do_unlock_page = true; } } if (PageMlocked(page)) @@ -2242,7 +2251,7 @@ repeat: __split_huge_pmd_locked(vma, pmd, haddr, freeze); out: spin_unlock(ptl); - if (!was_locked && page) + if (do_unlock_page) unlock_page(page); mmu_notifier_invalidate_range_end(mm, haddr, haddr + HPAGE_PMD_SIZE); }
From: Josh Poimboeuf jpoimboe@redhat.com
commit 20bf2b378729c4a0366a53e2018a0b70ace94bcd upstream.
With retpolines disabled, some configurations of GCC, and specifically the GCC versions 9 and 10 in Ubuntu will add Intel CET instrumentation to the kernel by default. That breaks certain tracing scenarios by adding a superfluous ENDBR64 instruction before the fentry call, for functions which can be called indirectly.
CET instrumentation isn't currently necessary in the kernel, as CET is only supported in user space. Disable it unconditionally and move it into the x86's Makefile as CET/CFI... enablement should be a per-arch decision anyway.
[ bp: Massage and extend commit message. ]
Fixes: 29be86d7f9cb ("kbuild: add -fcf-protection=none when using retpoline flags") Reported-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Nikolay Borisov nborisov@suse.com Tested-by: Nikolay Borisov nborisov@suse.com Cc: stable@vger.kernel.org Cc: Seth Forshee seth.forshee@canonical.com Cc: Masahiro Yamada yamada.masahiro@socionext.com Link: https://lkml.kernel.org/r/20210128215219.6kct3h2eiustncws@treble Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Makefile | 6 ------ arch/x86/Makefile | 3 +++ 2 files changed, 3 insertions(+), 6 deletions(-)
--- a/Makefile +++ b/Makefile @@ -844,12 +844,6 @@ KBUILD_CFLAGS += $(call cc-option,-Wer # change __FILE__ to the relative path from the srctree KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)
-# ensure -fcf-protection is disabled when using retpoline as it is -# incompatible with -mindirect-branch=thunk-extern -ifdef CONFIG_RETPOLINE -KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none) -endif - # use the deterministic mode of AR if available KBUILD_ARFLAGS := $(call ar-option,D)
--- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -138,6 +138,9 @@ else KBUILD_CFLAGS += -mno-red-zone KBUILD_CFLAGS += -mcmodel=kernel
+ # Intel CET isn't enabled in the kernel + KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none) + # -funit-at-a-time shrinks the kernel .text considerably # unfortunately it makes reading oopses harder. KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
From: Dave Hansen dave.hansen@linux.intel.com
commit 25a068b8e9a4eb193d755d58efcb3c98928636e0 upstream.
Jan Kiszka reported that the x2apic_wrmsr_fence() function uses a plain MFENCE while the Intel SDM (10.12.3 MSR Access in x2APIC Mode) calls for MFENCE; LFENCE.
Short summary: we have special MSRs that have weaker ordering than all the rest. Add fencing consistent with current SDM recommendations.
This is not known to cause any issues in practice, only in theory.
Longer story below:
The reason the kernel uses a different semantic is that the SDM changed (roughly in late 2017). The SDM changed because folks at Intel were auditing all of the recommended fences in the SDM and realized that the x2apic fences were insufficient.
Why was the pain MFENCE judged insufficient?
WRMSR itself is normally a serializing instruction. No fences are needed because the instruction itself serializes everything.
But, there are explicit exceptions for this serializing behavior written into the WRMSR instruction documentation for two classes of MSRs: IA32_TSC_DEADLINE and the X2APIC MSRs.
Back to x2apic: WRMSR is *not* serializing in this specific case. But why is MFENCE insufficient? MFENCE makes writes visible, but only affects load/store instructions. WRMSR is unfortunately not a load/store instruction and is unaffected by MFENCE. This means that a non-serializing WRMSR could be reordered by the CPU to execute before the writes made visible by the MFENCE have even occurred in the first place.
This means that an x2apic IPI could theoretically be triggered before there is any (visible) data to process.
Does this affect anything in practice? I honestly don't know. It seems quite possible that by the time an interrupt gets to consume the (not yet) MFENCE'd data, it has become visible, mostly by accident.
To be safe, add the SDM-recommended fences for all x2apic WRMSRs.
This also leaves open the question of the _other_ weakly-ordered WRMSR: MSR_IA32_TSC_DEADLINE. While it has the same ordering architecture as the x2APIC MSRs, it seems substantially less likely to be a problem in practice. While writes to the in-memory Local Vector Table (LVT) might theoretically be reordered with respect to a weakly-ordered WRMSR like TSC_DEADLINE, the SDM has this to say:
In x2APIC mode, the WRMSR instruction is used to write to the LVT entry. The processor ensures the ordering of this write and any subsequent WRMSR to the deadline; no fencing is required.
But, that might still leave xAPIC exposed. The safest thing to do for now is to add the extra, recommended LFENCE.
[ bp: Massage commit message, fix typos, drop accidentally added newline to tools/arch/x86/include/asm/barrier.h. ]
Reported-by: Jan Kiszka jan.kiszka@siemens.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Signed-off-by: Borislav Petkov bp@suse.de Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200305174708.F77040DD@viggo.jf.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/apic.h | 10 ---------- arch/x86/include/asm/barrier.h | 18 ++++++++++++++++++ arch/x86/kernel/apic/apic.c | 4 ++++ arch/x86/kernel/apic/x2apic_cluster.c | 6 ++++-- arch/x86/kernel/apic/x2apic_phys.c | 6 ++++-- 5 files changed, 30 insertions(+), 14 deletions(-)
--- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -174,16 +174,6 @@ static inline void lapic_update_tsc_freq #endif /* !CONFIG_X86_LOCAL_APIC */
#ifdef CONFIG_X86_X2APIC -/* - * Make previous memory operations globally visible before - * sending the IPI through x2apic wrmsr. We need a serializing instruction or - * mfence for this. - */ -static inline void x2apic_wrmsr_fence(void) -{ - asm volatile("mfence" : : : "memory"); -} - static inline void native_apic_msr_write(u32 reg, u32 v) { if (reg == APIC_DFR || reg == APIC_ID || reg == APIC_LDR || --- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -111,4 +111,22 @@ do { \
#include <asm-generic/barrier.h>
+/* + * Make previous memory operations globally visible before + * a WRMSR. + * + * MFENCE makes writes visible, but only affects load/store + * instructions. WRMSR is unfortunately not a load/store + * instruction and is unaffected by MFENCE. The LFENCE ensures + * that the WRMSR is not reordered. + * + * Most WRMSRs are full serializing instructions themselves and + * do not require this barrier. This is only required for the + * IA32_TSC_DEADLINE and X2APIC MSRs. + */ +static inline void weak_wrmsr_fence(void) +{ + asm volatile("mfence; lfence" : : : "memory"); +} + #endif /* _ASM_X86_BARRIER_H */ --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -42,6 +42,7 @@ #include <asm/x86_init.h> #include <asm/pgalloc.h> #include <linux/atomic.h> +#include <asm/barrier.h> #include <asm/mpspec.h> #include <asm/i8259.h> #include <asm/proto.h> @@ -473,6 +474,9 @@ static int lapic_next_deadline(unsigned { u64 tsc;
+ /* This MSR is special and need a special fence: */ + weak_wrmsr_fence(); + tsc = rdtsc(); wrmsrl(MSR_IA32_TSC_DEADLINE, tsc + (((u64) delta) * TSC_DIVISOR)); return 0; --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -29,7 +29,8 @@ static void x2apic_send_IPI(int cpu, int { u32 dest = per_cpu(x86_cpu_to_logical_apicid, cpu);
- x2apic_wrmsr_fence(); + /* x2apic MSRs are special and need a special fence: */ + weak_wrmsr_fence(); __x2apic_send_IPI_dest(dest, vector, APIC_DEST_LOGICAL); }
@@ -42,7 +43,8 @@ __x2apic_send_IPI_mask(const struct cpum unsigned long flags; u32 dest;
- x2apic_wrmsr_fence(); + /* x2apic MSRs are special and need a special fence: */ + weak_wrmsr_fence();
local_irq_save(flags);
--- a/arch/x86/kernel/apic/x2apic_phys.c +++ b/arch/x86/kernel/apic/x2apic_phys.c @@ -41,7 +41,8 @@ static void x2apic_send_IPI(int cpu, int { u32 dest = per_cpu(x86_cpu_to_apicid, cpu);
- x2apic_wrmsr_fence(); + /* x2apic MSRs are special and need a special fence: */ + weak_wrmsr_fence(); __x2apic_send_IPI_dest(dest, vector, APIC_DEST_PHYSICAL); }
@@ -52,7 +53,8 @@ __x2apic_send_IPI_mask(const struct cpum unsigned long this_cpu; unsigned long flags;
- x2apic_wrmsr_fence(); + /* x2apic MSRs are special and need a special fence: */ + weak_wrmsr_fence();
local_irq_save(flags);
From: Benjamin Valentin benpicco@googlemail.com
commit 9bbd77d5bbc9aff8cb74d805c31751f5f0691ba8 upstream.
There is a fork of this driver on GitHub [0] that has been updated with new device IDs.
Merge those into the mainline driver, so the out-of-tree fork is not needed for users of those devices anymore.
[0] https://github.com/paroj/xpad
Signed-off-by: Benjamin Valentin benpicco@googlemail.com Link: https://lore.kernel.org/r/20210121142523.1b6b050f@rechenknecht2k11 Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/joystick/xpad.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-)
--- a/drivers/input/joystick/xpad.c +++ b/drivers/input/joystick/xpad.c @@ -232,9 +232,17 @@ static const struct xpad_device { { 0x0e6f, 0x0213, "Afterglow Gamepad for Xbox 360", 0, XTYPE_XBOX360 }, { 0x0e6f, 0x021f, "Rock Candy Gamepad for Xbox 360", 0, XTYPE_XBOX360 }, { 0x0e6f, 0x0246, "Rock Candy Gamepad for Xbox One 2015", 0, XTYPE_XBOXONE }, - { 0x0e6f, 0x02ab, "PDP Controller for Xbox One", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02a0, "PDP Xbox One Controller", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02a1, "PDP Xbox One Controller", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02a2, "PDP Wired Controller for Xbox One - Crimson Red", 0, XTYPE_XBOXONE }, { 0x0e6f, 0x02a4, "PDP Wired Controller for Xbox One - Stealth Series", 0, XTYPE_XBOXONE }, { 0x0e6f, 0x02a6, "PDP Wired Controller for Xbox One - Camo Series", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02a7, "PDP Xbox One Controller", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02a8, "PDP Xbox One Controller", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02ab, "PDP Controller for Xbox One", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02ad, "PDP Wired Controller for Xbox One - Stealth Series", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02b3, "Afterglow Prismatic Wired Controller", 0, XTYPE_XBOXONE }, + { 0x0e6f, 0x02b8, "Afterglow Prismatic Wired Controller", 0, XTYPE_XBOXONE }, { 0x0e6f, 0x0301, "Logic3 Controller", 0, XTYPE_XBOX360 }, { 0x0e6f, 0x0346, "Rock Candy Gamepad for Xbox One 2016", 0, XTYPE_XBOXONE }, { 0x0e6f, 0x0401, "Logic3 Controller", 0, XTYPE_XBOX360 }, @@ -313,6 +321,9 @@ static const struct xpad_device { { 0x1bad, 0xfa01, "MadCatz GamePad", 0, XTYPE_XBOX360 }, { 0x1bad, 0xfd00, "Razer Onza TE", 0, XTYPE_XBOX360 }, { 0x1bad, 0xfd01, "Razer Onza", 0, XTYPE_XBOX360 }, + { 0x20d6, 0x2001, "BDA Xbox Series X Wired Controller", 0, XTYPE_XBOXONE }, + { 0x20d6, 0x281f, "PowerA Wired Controller For Xbox 360", 0, XTYPE_XBOX360 }, + { 0x2e24, 0x0652, "Hyperkin Duke X-Box One pad", 0, XTYPE_XBOXONE }, { 0x24c6, 0x5000, "Razer Atrox Arcade Stick", MAP_TRIGGERS_TO_BUTTONS, XTYPE_XBOX360 }, { 0x24c6, 0x5300, "PowerA MINI PROEX Controller", 0, XTYPE_XBOX360 }, { 0x24c6, 0x5303, "Xbox Airflo wired controller", 0, XTYPE_XBOX360 }, @@ -446,8 +457,12 @@ static const struct usb_device_id xpad_t XPAD_XBOX360_VENDOR(0x162e), /* Joytech X-Box 360 controllers */ XPAD_XBOX360_VENDOR(0x1689), /* Razer Onza */ XPAD_XBOX360_VENDOR(0x1bad), /* Harminix Rock Band Guitar and Drums */ + XPAD_XBOX360_VENDOR(0x20d6), /* PowerA Controllers */ + XPAD_XBOXONE_VENDOR(0x20d6), /* PowerA Controllers */ XPAD_XBOX360_VENDOR(0x24c6), /* PowerA Controllers */ XPAD_XBOXONE_VENDOR(0x24c6), /* PowerA Controllers */ + XPAD_XBOXONE_VENDOR(0x2e24), /* Hyperkin Duke X-Box One pad */ + XPAD_XBOX360_VENDOR(0x2f24), /* GameSir Controllers */ { } };
From: Nadav Amit namit@vmware.com
commit 29b32839725f8c89a41cb6ee054c85f3116ea8b5 upstream.
When an Intel IOMMU is virtualized, and a physical device is passed-through to the VM, changes of the virtual IOMMU need to be propagated to the physical IOMMU. The hypervisor therefore needs to monitor PTE mappings in the IOMMU page-tables. Intel specifications provide "caching-mode" capability that a virtual IOMMU uses to report that the IOMMU is virtualized and a TLB flush is needed after mapping to allow the hypervisor to propagate virtual IOMMU mappings to the physical IOMMU. To the best of my knowledge no real physical IOMMU reports "caching-mode" as turned on.
Synchronizing the virtual and the physical IOMMU tables is expensive if the hypervisor is unaware which PTEs have changed, as the hypervisor is required to walk all the virtualized tables and look for changes. Consequently, domain flushes are much more expensive than page-specific flushes on virtualized IOMMUs with passthrough devices. The kernel therefore exploited the "caching-mode" indication to avoid domain flushing and use page-specific flushing in virtualized environments. See commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching mode.")
This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing"). Now, when batched TLB flushing is used (the default), full TLB domain flushes are performed frequently, requiring the hypervisor to perform expensive synchronization between the virtual TLB and the physical one.
Getting batched TLB flushes to use page-specific invalidations again in such circumstances is not easy, since the TLB invalidation scheme assumes that "full" domain TLB flushes are performed for scalability.
Disable batched TLB flushes when caching-mode is on, as the performance benefit from using batched TLB invalidations is likely to be much smaller than the overhead of the virtual-to-physical IOMMU page-tables synchronization.
Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing") Signed-off-by: Nadav Amit namit@vmware.com Cc: David Woodhouse dwmw2@infradead.org Cc: Lu Baolu baolu.lu@linux.intel.com Cc: Joerg Roedel joro@8bytes.org Cc: Will Deacon will@kernel.org Cc: stable@vger.kernel.org Acked-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Nadav Amit namit@vmware.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/intel-iommu.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -3338,6 +3338,12 @@ static int __init init_dmars(void)
if (!ecap_pass_through(iommu->ecap)) hw_pass_through = 0; + + if (!intel_iommu_strict && cap_caching_mode(iommu->cap)) { + pr_info("Disable batched IOTLB flush due to virtualization"); + intel_iommu_strict = 1; + } + #ifdef CONFIG_INTEL_IOMMU_SVM if (pasid_enabled(iommu)) intel_svm_alloc_pasid_tables(iommu);
From: DENG Qingfang dqfext@gmail.com
commit f72f2fb8fb6be095b98af5d740ac50cffd0b0cae upstream.
Having multiple destination ports for a unicast address does not make sense. Make port_db_load_purge override existent unicast portvec instead of adding a new port bit.
Fixes: 884729399260 ("net: dsa: mv88e6xxx: handle multiple ports in ATU") Signed-off-by: DENG Qingfang dqfext@gmail.com Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20210130134334.10243-1-dqfext@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/mv88e6xxx/chip.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -1364,7 +1364,11 @@ static int mv88e6xxx_port_db_load_purge( if (!entry.portvec) entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UNUSED; } else { - entry.portvec |= BIT(port); + if (state == MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC) + entry.portvec = BIT(port); + else + entry.portvec |= BIT(port); + entry.state = state; }
On Mon, 8 Feb 2021 at 20:38, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.14.221 release. There are 30 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.221-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
Summary ------------------------------------------------------------------------
kernel: 4.14.221-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.14.y git commit: c7c1196add2085d4956a80729d2832ef83d963c8 git describe: v4.14.220-31-gc7c1196add20 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-4.14.y/build/v4.14....
No regressions (compared to build v4.14.220)
No fixes (compared to build v4.14.220)
Ran 41474 total tests in the following environments and test suites.
Environments -------------- - arm - arm64 - dragonboard-410c - arm64 - hi6220-hikey - arm64 - i386 - juno-64k_page_size - juno-r2 - arm64 - juno-r2-compat - juno-r2-kasan - mips - qemu-arm64-kasan - qemu-x86_64-kasan - qemu_arm - qemu_arm64 - qemu_arm64-compat - qemu_i386 - qemu_x86_64 - qemu_x86_64-compat - sparc - x15 - arm - x86_64 - x86-kasan - x86_64
Test Suites ----------- * build * linux-log-parser * fwts * install-android-platform-tools-r2600 * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fs-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * network-basic-tests * v4l2-compliance * kselftest-android * kselftest-bpf * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-lkdtm * kselftest-membarrier * kselftest-net * kselftest-netfilter * kselftest-nsfs * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-open-posix-tests * perf * kvm-unit-tests * rcutorture
On Mon, Feb 08, 2021 at 04:00:46PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.221 release. There are 30 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 10 Feb 2021 14:57:55 +0000. Anything received after that time might be too late.
Build results: total: 168 pass: 168 fail: 0 Qemu test results: total: 404 pass: 404 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
linux-stable-mirror@lists.linaro.org