Changes since v9 [1] and v10 [2]
* Resend the full series with the reworked "mm: introduce
MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS" (Christoph)
* Move generic_dax_pagefree() into the pmem driver (Christoph)
* Cleanup __bdev_dax_supported() (Christoph)
* Cleanup some stale SRCU bits leftover from other iterations (Jan)
* Cleanup xfs_break_layouts() (Jan)
[1]: https://lists.01.org/pipermail/linux-nvdimm/2018-April/015457.html
[2]: https://lists.01.org/pipermail/linux-nvdimm/2018-May/015885.html
---
Background:
get_user_pages() in the filesystem pins file backed memory pages for
access by devices performing dma. However, it only pins the memory pages
not the page-to-file offset association. If a file is truncated the
pages are mapped out of the file and dma may continue indefinitely into
a page that is owned by a device driver. This breaks coherency of the
file vs dma, but the assumption is that if userspace wants the
file-space truncated it does not matter what data is inbound from the
device, it is not relevant anymore. The only expectation is that dma can
safely continue while the filesystem reallocates the block(s).
Problem:
This expectation that dma can safely continue while the filesystem
changes the block map is broken by dax. With dax the target dma page
*is* the filesystem block. The model of leaving the page pinned for dma,
but truncating the file block out of the file, means that the filesytem
is free to reallocate a block under active dma to another file and now
the expected data-incoherency situation has turned into active
data-corruption.
Solution:
Defer all filesystem operations (fallocate(), truncate()) on a dax mode
file while any page/block in the file is under active dma. This solution
assumes that dma is transient. Cases where dma operations are known to
not be transient, like RDMA, have been explicitly disabled via
commits like 5f1d43de5416 "IB/core: disable memory registration of
filesystem-dax vmas".
The dax_layout_busy_page() routine is called by filesystems with a lock
held against mm faults (i_mmap_lock) to find pinned / busy dax pages.
The process of looking up a busy page invalidates all mappings
to trigger any subsequent get_user_pages() to block on i_mmap_lock.
The filesystem continues to call dax_layout_busy_page() until it finally
returns no more active pages. This approach assumes that the page
pinning is transient, if that assumption is violated the system would
have likely hung from the uncompleted I/O.
---
Dan Williams (7):
memremap: split devm_memremap_pages() and memremap() infrastructure
mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS
mm: fix __gup_device_huge vs unmap
mm, fs, dax: handle layout changes to pinned dax mappings
xfs: prepare xfs_break_layouts() to be called with XFS_MMAPLOCK_EXCL
xfs: prepare xfs_break_layouts() for another layout type
xfs, dax: introduce xfs_break_dax_layouts()
drivers/dax/super.c | 14 ++-
drivers/nvdimm/pfn_devs.c | 2
drivers/nvdimm/pmem.c | 25 +++++
fs/Kconfig | 1
fs/dax.c | 97 +++++++++++++++++++++
fs/xfs/xfs_file.c | 72 ++++++++++++++--
fs/xfs/xfs_inode.h | 16 +++
fs/xfs/xfs_ioctl.c | 8 --
fs/xfs/xfs_iops.c | 16 ++-
fs/xfs/xfs_pnfs.c | 15 ++-
fs/xfs/xfs_pnfs.h | 5 +
include/linux/dax.h | 7 ++
include/linux/memremap.h | 36 ++------
include/linux/mm.h | 71 +++++++++++----
kernel/Makefile | 3 -
kernel/iomem.c | 167 ++++++++++++++++++++++++++++++++++++
kernel/memremap.c | 209 ++++++---------------------------------------
mm/Kconfig | 5 +
mm/gup.c | 36 ++++++--
mm/hmm.c | 13 ---
mm/swap.c | 3 -
21 files changed, 542 insertions(+), 279 deletions(-)
create mode 100644 kernel/iomem.c
The patch titled
Subject: mm: fix devmem_is_allowed() for sub-page System RAM intersections
has been added to the -mm tree. Its filename is
mm-fix-devmem_is_allowed-for-sub-page-system-ram-intersections.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-fix-devmem_is_allowed-for-sub-p…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-devmem_is_allowed-for-sub-p…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Dan Williams <dan.j.williams(a)intel.com>
Subject: mm: fix devmem_is_allowed() for sub-page System RAM intersections
Hussam reports:
I was poking around and for no real reason, I did cat /dev/mem and
strings /dev/mem. Then I saw the following warning in dmesg. I saved it
and rebooted immediately.
memremap attempted on mixed range 0x000000000009c000 size: 0x1000
------------[ cut here ]------------
WARNING: CPU: 0 PID: 11810 at kernel/memremap.c:98 memremap+0x104/0x170
[..]
Call Trace:
xlate_dev_mem_ptr+0x25/0x40
read_mem+0x89/0x1a0
__vfs_read+0x36/0x170
The memremap() implementation checks for attempts to remap System RAM with
MEMREMAP_WB and instead redirects those mapping attempts to the linear
map. However, that only works if the physical address range being
remapped is page aligned. In low memory we have situations like the
following:
00000000-00000fff : Reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : Reserved
...where System RAM intersects Reserved ranges on a sub-page page
granularity.
Given that devmem_is_allowed() special cases any attempt to map System RAM
in the first 1MB of memory, replace page_is_ram() with the more precise
region_intersects() to trap attempts to map disallowed ranges.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199999
Link: http://lkml.kernel.org/r/152856436164.18127.2847888121707136898.stgit@dwill…
Fixes: 92281dee825f ("arch: introduce memremap()")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Reported-by: Hussam Al-Tayeb <me(a)hussam.eu.org>
Tested-by: Hussam Al-Tayeb <me(a)hussam.eu.org>
Cc: <stable(a)vger.kernel.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/mm/init.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff -puN arch/x86/mm/init.c~mm-fix-devmem_is_allowed-for-sub-page-system-ram-intersections arch/x86/mm/init.c
--- a/arch/x86/mm/init.c~mm-fix-devmem_is_allowed-for-sub-page-system-ram-intersections
+++ a/arch/x86/mm/init.c
@@ -706,7 +706,9 @@ void __init init_mem_mapping(void)
*/
int devmem_is_allowed(unsigned long pagenr)
{
- if (page_is_ram(pagenr)) {
+ if (region_intersects(PFN_PHYS(pagenr), PAGE_SIZE,
+ IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE)
+ != REGION_DISJOINT) {
/*
* For disallowed memory regions in the low 1MB range,
* request that the page be shown as all zeros.
_
Patches currently in -mm which might be from dan.j.williams(a)intel.com are
mm-fix-devmem_is_allowed-for-sub-page-system-ram-intersections.patch
mm-devm_memremap_pages-mark-devm_memremap_pages-export_symbol_gpl.patch
mm-devm_memremap_pages-handle-errors-allocating-final-devres-action.patch
mm-hmm-use-devm-semantics-for-hmm_devmem_add-remove.patch
mm-hmm-replace-hmm_devmem_pages_create-with-devm_memremap_pages.patch
mm-hmm-mark-hmm_devmem_add-add_resource-export_symbol_gpl.patch
In ubifs_jnl_update() we sync parent and child inodes to the flash,
in case of xattrs, the parent inode (AKA host inode) has a non-zero
data_len. Therefore we need to adjust synced_i_size too.
This issue was reported by ubifs self tests unter a xattr related work
load.
UBIFS error (ubi0:0 pid 1896): dbg_check_synced_i_size: ui_size is 4, synced_i_size is 0, but inode is clean
UBIFS error (ubi0:0 pid 1896): dbg_check_synced_i_size: i_ino 65, i_mode 0x81a4, i_size 4
Cc: <stable(a)vger.kernel.org>
Fixes: 1e51764a3c2a ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard(a)nod.at>
---
fs/ubifs/journal.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index 04c4ec6483e5..1fb123279bb5 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -665,6 +665,11 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,
spin_lock(&ui->ui_lock);
ui->synced_i_size = ui->ui_size;
spin_unlock(&ui->ui_lock);
+ if (xent) {
+ spin_lock(&host_ui->ui_lock);
+ host_ui->synced_i_size = host_ui->ui_size;
+ spin_unlock(&host_ui->ui_lock);
+ }
mark_inode_clean(c, ui);
mark_inode_clean(c, host_ui);
return 0;
--
2.13.6
Hi folks,
A few Fedora users have reported[0] a regression starting in v4.16.8
where the boot will hang ~1/3 of the time with the following RCU stall
warning:
INFO: rcu_sched detected stalls on CPUs/tasks:
o1-...!: (0 ticks this GP) idle=688/0/0 softirq=171/171 fqs=0
o(detected by 0, t=60002 jiffies, g=-142, c=-143, q=9)
Sending NMI from CPU 0 to CPU 1:
NMI backtrace for cpu 1 skipped: idling at
acpi_processor_ffh_cstate_enter+0x65/0xb0
rcu_sched kthread starved for 60002 jiffies! g18446744073709551474
c1844674407370955143 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 -> cpu=1
RCU grace-period kthread stack dump:
rcu_sched I 0 9 2 0x80000000
Call Trace:
? __schedule+0x234/0x850
schedule+0x28/0x80
schedule_timeout+0x166/0x380
? __next_timer_interrupt+0xc0/0xc0
rcu_gp_kthread+0x368/0x830
? rcu_process_callbacks+0x4f0/0x4f0
kthread+0x112/0x130
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x35/0x40
A user has bisected the problem to the v4.16 commit 1ab4ca7c59d4
("x86/tsc: Fix mark_tsc_unstable()"). According to the reporter,
explicitly setting "tsc=" on the kernel command line causes the boot to
always succeed. All the users have Thinkpad T500s or T400s (Core 2 Duos)
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1579925
Thanks,
Jeremy
This is the start of the stable review cycle for the 4.17.1 release.
There are 15 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Mon Jun 11 14:59:48 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.17.1-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.17.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.17.1-rc1
Dexuan Cui <decui(a)microsoft.com>
PCI: hv: Do not wait forever on a device that has disappeared
Sabrina Dubroca <sd(a)queasysnail.net>
ipmr: fix error path when ipmr_new_table fails
Arun Parameswaran <arun.parameswaran(a)broadcom.com>
net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
Stephen Suryaputra <ssuryaextr(a)gmail.com>
vrf: check the original netdevice for generating redirect
Dan Carpenter <dan.carpenter(a)oracle.com>
team: use netdev_features_t instead of u32
Xin Long <lucien.xin(a)gmail.com>
sctp: not allow transport timeout value less than HZ/5 for hb_timer
Eric Dumazet <edumazet(a)google.com>
rtnetlink: validate attributes in do_setlink()
Eric Dumazet <edumazet(a)google.com>
net/packet: refine check for priv area size
Eric Dumazet <edumazet(a)google.com>
net: metrics: add proper netlink validation
Cong Wang <xiyou.wangcong(a)gmail.com>
netdev-FAQ: clarify DaveM's position for stable backports
Guillaume Nault <g.nault(a)alphalink.fr>
l2tp: fix refcount leakage on PPPoL2TP sockets
Michal Kubecek <mkubecek(a)suse.cz>
ipv6: omit traffic class when calculating flow hash
Sabrina Dubroca <sd(a)queasysnail.net>
ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
Julia Lawall <Julia.Lawall(a)lip6.fr>
bnx2x: use the right constant
Jason A. Donenfeld <Jason(a)zx2c4.com>
netfilter: nf_flow_table: attach dst to skbs
-------------
Diffstat:
Documentation/networking/netdev-FAQ.txt | 9 +++++
Makefile | 4 +--
drivers/net/dsa/b53/b53_common.c | 15 +++++++-
drivers/net/dsa/b53/b53_priv.h | 2 ++
drivers/net/dsa/b53/b53_srab.c | 4 +--
drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c | 2 +-
drivers/net/team/team.c | 3 +-
drivers/pci/host/pci-hyperv.c | 46 +++++++++++++++++-------
include/linux/mroute_base.h | 10 ------
include/net/ipv6.h | 5 +++
net/core/flow_dissector.c | 2 +-
net/core/rtnetlink.c | 8 ++---
net/ipv4/fib_semantics.c | 4 +++
net/ipv4/ipmr_base.c | 8 +++--
net/ipv4/netfilter/nf_flow_table_ipv4.c | 5 +--
net/ipv6/ip6_output.c | 3 +-
net/ipv6/ip6mr.c | 21 +++++++----
net/ipv6/ndisc.c | 6 ++++
net/ipv6/netfilter/nf_flow_table_ipv6.c | 1 +
net/ipv6/route.c | 4 +--
net/l2tp/l2tp_ppp.c | 35 +++++++++---------
net/packet/af_packet.c | 2 +-
net/sctp/transport.c | 2 +-
23 files changed, 132 insertions(+), 69 deletions(-)
This is the start of the stable review cycle for the 4.14.49 release.
There are 41 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Mon Jun 11 15:29:07 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.49-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.49-rc1
Dexuan Cui <decui(a)microsoft.com>
PCI: hv: Do not wait forever on a device that has disappeared
Paul Blakey <paulb(a)mellanox.com>
cls_flower: Fix incorrect idr release when failing to modify rule
Eric Dumazet <edumazet(a)google.com>
rtnetlink: validate attributes in do_setlink()
Jason Wang <jasowang(a)redhat.com>
virtio-net: fix leaking page for gso packet during mergeable XDP
Eran Ben Elisha <eranbe(a)mellanox.com>
net/mlx5e: When RXFCS is set, add FCS data into checksum calculation
Jason Wang <jasowang(a)redhat.com>
virtio-net: correctly check num_buf during err path
Toshiaki Makita <makita.toshiaki(a)lab.ntt.co.jp>
tun: Fix NULL pointer dereference in XDP redirect
Jack Morgenstein <jackm(a)dev.mellanox.co.il>
net/mlx4: Fix irq-unsafe spinlock usage
Jason Wang <jasowang(a)redhat.com>
virtio-net: correctly transmit XDP buff after linearizing
Alexander Duyck <alexander.h.duyck(a)intel.com>
net-sysfs: Fix memory leak in XPS configuration
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: broadcom: Fix auxiliary control register reads
Mathieu Xhonneux <m.xhonneux(a)gmail.com>
ipv6: sr: fix memory OOB access in seg6_do_srh_encap/inline
Stephen Suryaputra <ssuryaextr(a)gmail.com>
vrf: check the original netdevice for generating redirect
Jason Wang <jasowang(a)redhat.com>
vhost: synchronize IOTLB message with dev cleanup
Dan Carpenter <dan.carpenter(a)oracle.com>
team: use netdev_features_t instead of u32
Xin Long <lucien.xin(a)gmail.com>
sctp: not allow transport timeout value less than HZ/5 for hb_timer
Shahed Shaikh <shahed.shaikh(a)cavium.com>
qed: Fix mask for physical address in ILT entry
Willem de Bruijn <willemb(a)google.com>
packet: fix reserve calculation
Daniele Palmas <dnlplm(a)gmail.com>
net: usb: cdc_mbim: add flag FLAG_SEND_ZLP
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: broadcom: Fix bcm_write_exp()
Eric Dumazet <edumazet(a)google.com>
net/packet: refine check for priv area size
Eric Dumazet <edumazet(a)google.com>
net: metrics: add proper netlink validation
Roopa Prabhu <roopa(a)cumulusnetworks.com>
net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy
Cong Wang <xiyou.wangcong(a)gmail.com>
netdev-FAQ: clarify DaveM's position for stable backports
Kirill Tkhai <ktkhai(a)virtuozzo.com>
kcm: Fix use-after-free caused by clonned sockets
Wenwen Wang <wang6495(a)umn.edu>
isdn: eicon: fix a missing-check bug
Michal Kubecek <mkubecek(a)suse.cz>
ipv6: omit traffic class when calculating flow hash
Willem de Bruijn <willemb(a)google.com>
ipv4: remove warning in ip_recv_error
Eric Dumazet <edumazet(a)google.com>
ipmr: properly check rhltable_init() return value
Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
ip6_tunnel: remove magic mtu value 0xFFF8
Sabrina Dubroca <sd(a)queasysnail.net>
ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
Govindarajulu Varadarajan <gvaradar(a)cisco.com>
enic: set DMA mask to 47 bit
Alexey Kodanev <alexey.kodanev(a)oracle.com>
dccp: don't free ccid2_hc_tx_sock struct in dccp_disconnect()
Julia Lawall <Julia.Lawall(a)lip6.fr>
bnx2x: use the right constant
Suresh Reddy <suresh.reddy(a)broadcom.com>
be2net: Fix error detection logic for BE3
Nathan Chancellor <natechancellor(a)gmail.com>
kconfig: Avoid format overflow warning from GCC 8.1
Anand Jain <Anand.Jain(a)oracle.com>
btrfs: define SUPER_FLAG_METADUMP_V2
Linus Torvalds <torvalds(a)linux-foundation.org>
mmap: relax file size limit for regular files
Linus Torvalds <torvalds(a)linux-foundation.org>
mmap: introduce sane default mmap limits
Bart Van Assche <bart.vanassche(a)wdc.com>
scsi: sd_zbc: Avoid that resetting a zone fails sporadically
Damien Le Moal <damien.lemoal(a)wdc.com>
scsi: sd_zbc: Fix potential memory leak
-------------
Diffstat:
Documentation/networking/netdev-FAQ.txt | 9 ++
Makefile | 4 +-
drivers/isdn/hardware/eicon/diva.c | 22 ++--
drivers/isdn/hardware/eicon/diva.h | 5 +-
drivers/isdn/hardware/eicon/divasmain.c | 18 ++--
drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c | 2 +-
drivers/net/ethernet/cisco/enic/enic_main.c | 8 +-
drivers/net/ethernet/emulex/benet/be_main.c | 4 +-
drivers/net/ethernet/mellanox/mlx4/qp.c | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 42 ++++++++
drivers/net/ethernet/qlogic/qed/qed_cxt.c | 2 +-
drivers/net/phy/bcm-cygnus.c | 6 +-
drivers/net/phy/bcm-phy-lib.c | 2 +-
drivers/net/phy/bcm-phy-lib.h | 7 ++
drivers/net/phy/bcm7xxx.c | 4 +-
drivers/net/team/team.c | 3 +-
drivers/net/tun.c | 15 +--
drivers/net/usb/cdc_mbim.c | 2 +-
drivers/net/virtio_net.c | 19 ++--
drivers/pci/host/pci-hyperv.c | 46 +++++---
drivers/scsi/sd_zbc.c | 128 ++++++++++++++---------
drivers/vhost/vhost.c | 3 +
fs/btrfs/disk-io.c | 3 +-
include/net/ipv6.h | 5 +
include/uapi/linux/btrfs_tree.h | 1 +
mm/mmap.c | 32 ++++++
net/core/flow_dissector.c | 2 +-
net/core/net-sysfs.c | 6 +-
net/core/rtnetlink.c | 8 +-
net/dccp/proto.c | 2 -
net/ipv4/fib_frontend.c | 1 +
net/ipv4/fib_semantics.c | 4 +
net/ipv4/ip_sockglue.c | 2 -
net/ipv4/ipmr.c | 7 +-
net/ipv6/ip6_output.c | 3 +-
net/ipv6/ip6_tunnel.c | 11 +-
net/ipv6/ip6mr.c | 3 +-
net/ipv6/ndisc.c | 6 ++
net/ipv6/route.c | 2 +-
net/ipv6/seg6_iptunnel.c | 4 +-
net/ipv6/sit.c | 5 +-
net/kcm/kcmsock.c | 2 +-
net/packet/af_packet.c | 4 +-
net/sched/cls_flower.c | 2 +-
net/sctp/transport.c | 2 +-
scripts/kconfig/confdata.c | 2 +-
46 files changed, 329 insertions(+), 145 deletions(-)
This is the start of the stable review cycle for the 4.16.15 release.
There are 48 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Mon Jun 11 14:59:28 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.16.15-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.16.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.16.15-rc1
Dexuan Cui <decui(a)microsoft.com>
PCI: hv: Do not wait forever on a device that has disappeared
Jason Wang <jasowang(a)redhat.com>
vhost_net: flush batched heads before trying to busy polling
Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
net: netsec: reduce DMA mask to 40 bits
Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
ip_tunnel: restore binding to ifaces with a large mtu
Jason Wang <jasowang(a)redhat.com>
virtio-net: correctly redirect linearized packet
Or Gerlitz <ogerlitz(a)mellanox.com>
net : sched: cls_api: deal with egdev path only if needed
Arun Parameswaran <arun.parameswaran(a)broadcom.com>
net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
Jason Wang <jasowang(a)redhat.com>
virtio-net: correctly check num_buf during err path
Toshiaki Makita <makita.toshiaki(a)lab.ntt.co.jp>
tun: Fix NULL pointer dereference in XDP redirect
Eran Ben Elisha <eranbe(a)mellanox.com>
net/mlx5e: When RXFCS is set, add FCS data into checksum calculation
Jack Morgenstein <jackm(a)dev.mellanox.co.il>
net/mlx4: Fix irq-unsafe spinlock usage
Jason Wang <jasowang(a)redhat.com>
virtio-net: fix leaking page for gso packet during mergeable XDP
Jason Wang <jasowang(a)redhat.com>
virtio-net: correctly transmit XDP buff after linearizing
Alexander Duyck <alexander.h.duyck(a)intel.com>
net-sysfs: Fix memory leak in XPS configuration
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: broadcom: Fix auxiliary control register reads
Mathieu Xhonneux <m.xhonneux(a)gmail.com>
ipv6: sr: fix memory OOB access in seg6_do_srh_encap/inline
Stephen Suryaputra <ssuryaextr(a)gmail.com>
vrf: check the original netdevice for generating redirect
Jason Wang <jasowang(a)redhat.com>
vhost: synchronize IOTLB message with dev cleanup
Dan Carpenter <dan.carpenter(a)oracle.com>
team: use netdev_features_t instead of u32
Xin Long <lucien.xin(a)gmail.com>
sctp: not allow transport timeout value less than HZ/5 for hb_timer
Eric Dumazet <edumazet(a)google.com>
rtnetlink: validate attributes in do_setlink()
Shahed Shaikh <shahed.shaikh(a)cavium.com>
qed: Fix mask for physical address in ILT entry
Willem de Bruijn <willemb(a)google.com>
packet: fix reserve calculation
Daniele Palmas <dnlplm(a)gmail.com>
net: usb: cdc_mbim: add flag FLAG_SEND_ZLP
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: broadcom: Fix bcm_write_exp()
Eric Dumazet <edumazet(a)google.com>
net/packet: refine check for priv area size
Eric Dumazet <edumazet(a)google.com>
net: metrics: add proper netlink validation
Roopa Prabhu <roopa(a)cumulusnetworks.com>
net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy
Dan Carpenter <dan.carpenter(a)oracle.com>
net: ethernet: davinci_emac: fix error handling in probe()
Cong Wang <xiyou.wangcong(a)gmail.com>
netdev-FAQ: clarify DaveM's position for stable backports
Petr Machata <petrm(a)mellanox.com>
mlxsw: spectrum: Forbid creation of VLAN 1 over port/LAG
Guillaume Nault <g.nault(a)alphalink.fr>
l2tp: fix refcount leakage on PPPoL2TP sockets
Kirill Tkhai <ktkhai(a)virtuozzo.com>
kcm: Fix use-after-free caused by clonned sockets
Wenwen Wang <wang6495(a)umn.edu>
isdn: eicon: fix a missing-check bug
Michal Kubecek <mkubecek(a)suse.cz>
ipv6: omit traffic class when calculating flow hash
Willem de Bruijn <willemb(a)google.com>
ipv4: remove warning in ip_recv_error
Eric Dumazet <edumazet(a)google.com>
ipmr: properly check rhltable_init() return value
Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
ip6_tunnel: remove magic mtu value 0xFFF8
Sabrina Dubroca <sd(a)queasysnail.net>
ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
Govindarajulu Varadarajan <gvaradar(a)cisco.com>
enic: set DMA mask to 47 bit
Alexey Kodanev <alexey.kodanev(a)oracle.com>
dccp: don't free ccid2_hc_tx_sock struct in dccp_disconnect()
Paul Blakey <paulb(a)mellanox.com>
cls_flower: Fix incorrect idr release when failing to modify rule
Julia Lawall <Julia.Lawall(a)lip6.fr>
bnx2x: use the right constant
Suresh Reddy <suresh.reddy(a)broadcom.com>
be2net: Fix error detection logic for BE3
Nathan Chancellor <natechancellor(a)gmail.com>
kconfig: Avoid format overflow warning from GCC 8.1
Jason A. Donenfeld <Jason(a)zx2c4.com>
netfilter: nf_flow_table: attach dst to skbs
Linus Torvalds <torvalds(a)linux-foundation.org>
mmap: relax file size limit for regular files
Linus Torvalds <torvalds(a)linux-foundation.org>
mmap: introduce sane default mmap limits
-------------
Diffstat:
Documentation/networking/netdev-FAQ.txt | 9 +++++
Makefile | 4 +--
drivers/isdn/hardware/eicon/diva.c | 22 ++++++++----
drivers/isdn/hardware/eicon/diva.h | 5 +--
drivers/isdn/hardware/eicon/divasmain.c | 18 ++++++----
drivers/net/dsa/b53/b53_common.c | 15 +++++++-
drivers/net/dsa/b53/b53_priv.h | 2 ++
drivers/net/dsa/b53/b53_srab.c | 4 +--
drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c | 2 +-
drivers/net/ethernet/cisco/enic/enic_main.c | 8 ++---
drivers/net/ethernet/emulex/benet/be_main.c | 4 ++-
drivers/net/ethernet/mellanox/mlx4/qp.c | 4 +--
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 42 ++++++++++++++++++++++
drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 5 +++
drivers/net/ethernet/qlogic/qed/qed_cxt.c | 2 +-
drivers/net/ethernet/socionext/netsec.c | 4 +--
drivers/net/ethernet/ti/davinci_emac.c | 22 ++++++------
drivers/net/phy/bcm-cygnus.c | 6 ++--
drivers/net/phy/bcm-phy-lib.c | 2 +-
drivers/net/phy/bcm-phy-lib.h | 7 ++++
drivers/net/phy/bcm7xxx.c | 4 +--
drivers/net/team/team.c | 3 +-
drivers/net/tun.c | 15 ++++----
drivers/net/usb/cdc_mbim.c | 2 +-
drivers/net/virtio_net.c | 21 ++++++-----
drivers/pci/host/pci-hyperv.c | 46 +++++++++++++++++-------
drivers/vhost/net.c | 37 ++++++++++++-------
drivers/vhost/vhost.c | 3 ++
include/net/ipv6.h | 5 +++
mm/mmap.c | 32 +++++++++++++++++
net/core/flow_dissector.c | 2 +-
net/core/net-sysfs.c | 6 ++--
net/core/rtnetlink.c | 8 ++---
net/dccp/proto.c | 2 --
net/ipv4/fib_frontend.c | 1 +
net/ipv4/fib_semantics.c | 4 +++
net/ipv4/ip_sockglue.c | 2 --
net/ipv4/ip_tunnel.c | 8 ++---
net/ipv4/ipmr.c | 7 +++-
net/ipv4/netfilter/nf_flow_table_ipv4.c | 5 +--
net/ipv6/ip6_output.c | 3 +-
net/ipv6/ip6_tunnel.c | 11 ++++--
net/ipv6/ip6mr.c | 3 +-
net/ipv6/ndisc.c | 6 ++++
net/ipv6/netfilter/nf_flow_table_ipv6.c | 1 +
net/ipv6/route.c | 2 +-
net/ipv6/seg6_iptunnel.c | 4 +--
net/ipv6/sit.c | 5 +--
net/kcm/kcmsock.c | 2 +-
net/l2tp/l2tp_ppp.c | 35 +++++++++---------
net/packet/af_packet.c | 4 +--
net/sched/cls_api.c | 2 +-
net/sched/cls_flower.c | 2 +-
net/sctp/transport.c | 2 +-
scripts/kconfig/confdata.c | 2 +-
55 files changed, 338 insertions(+), 146 deletions(-)
From: Alexander Duyck <alexander.h.duyck(a)intel.com>
This patch addresses two issues. First it adds the correct bit definitions
for the SECTXSTAT and SECRXSTAT registers. Then it makes use of those
definitions to test for if IPsec has been disabled on the part and if so we
do not enable it.
CC: stable <stable(a)vger.kernel.org>
Signed-off-by: Alexander Duyck <alexander.h.duyck(a)intel.com>
Reported-by: Andre Tomt <andre(a)tomt.net>
Acked-by: Shannon Nelson <shannon.nelson(a)oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers(a)intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher(a)intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 14 +++++++++++++-
drivers/net/ethernet/intel/ixgbe/ixgbe_type.h | 6 ++++--
2 files changed, 17 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 7b23fb0c2d07..c116f459945d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -975,10 +975,22 @@ void ixgbe_ipsec_rx(struct ixgbe_ring *rx_ring,
**/
void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
{
+ struct ixgbe_hw *hw = &adapter->hw;
struct ixgbe_ipsec *ipsec;
+ u32 t_dis, r_dis;
size_t size;
- if (adapter->hw.mac.type == ixgbe_mac_82598EB)
+ if (hw->mac.type == ixgbe_mac_82598EB)
+ return;
+
+ /* If there is no support for either Tx or Rx offload
+ * we should not be advertising support for IPsec.
+ */
+ t_dis = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT) &
+ IXGBE_SECTXSTAT_SECTX_OFF_DIS;
+ r_dis = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
+ IXGBE_SECRXSTAT_SECRX_OFF_DIS;
+ if (t_dis || r_dis)
return;
ipsec = kzalloc(sizeof(*ipsec), GFP_KERNEL);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index e8ed37749ab1..44cfb2021145 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -599,13 +599,15 @@ struct ixgbe_nvm_version {
#define IXGBE_SECTXCTRL_STORE_FORWARD 0x00000004
#define IXGBE_SECTXSTAT_SECTX_RDY 0x00000001
-#define IXGBE_SECTXSTAT_ECC_TXERR 0x00000002
+#define IXGBE_SECTXSTAT_SECTX_OFF_DIS 0x00000002
+#define IXGBE_SECTXSTAT_ECC_TXERR 0x00000004
#define IXGBE_SECRXCTRL_SECRX_DIS 0x00000001
#define IXGBE_SECRXCTRL_RX_DIS 0x00000002
#define IXGBE_SECRXSTAT_SECRX_RDY 0x00000001
-#define IXGBE_SECRXSTAT_ECC_RXERR 0x00000002
+#define IXGBE_SECRXSTAT_SECRX_OFF_DIS 0x00000002
+#define IXGBE_SECRXSTAT_ECC_RXERR 0x00000004
/* LinkSec (MacSec) Registers */
#define IXGBE_LSECTXCAP 0x08A00
--
2.17.1
From: Alexander Duyck <alexander.h.duyck(a)intel.com>
This patch fixes two issues. First we add an early test for the Tx and Rx
security block ready bits. By doing this we can avoid the need for waits or
loopback in the event that the security block is already flushed out.
Secondly we fix the boolean logic that was testing for the Tx OR Rx ready
bits being set and change it so that we only exit if the Tx AND Rx ready
bits are both set.
CC: stable <stable(a)vger.kernel.org>
Signed-off-by: Alexander Duyck <alexander.h.duyck(a)intel.com>
Acked-by: Shannon Nelson <shannon.nelson(a)oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers(a)intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher(a)intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 38d8cf75e9ad..7b23fb0c2d07 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -158,7 +158,16 @@ static void ixgbe_ipsec_stop_data(struct ixgbe_adapter *adapter)
reg |= IXGBE_SECRXCTRL_RX_DIS;
IXGBE_WRITE_REG(hw, IXGBE_SECRXCTRL, reg);
- IXGBE_WRITE_FLUSH(hw);
+ /* If both Tx and Rx are ready there are no packets
+ * that we need to flush so the loopback configuration
+ * below is not necessary.
+ */
+ t_rdy = IXGBE_READ_REG(hw, IXGBE_SECTXSTAT) &
+ IXGBE_SECTXSTAT_SECTX_RDY;
+ r_rdy = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
+ IXGBE_SECRXSTAT_SECRX_RDY;
+ if (t_rdy && r_rdy)
+ return;
/* If the tx fifo doesn't have link, but still has data,
* we can't clear the tx sec block. Set the MAC loopback
@@ -185,7 +194,7 @@ static void ixgbe_ipsec_stop_data(struct ixgbe_adapter *adapter)
IXGBE_SECTXSTAT_SECTX_RDY;
r_rdy = IXGBE_READ_REG(hw, IXGBE_SECRXSTAT) &
IXGBE_SECRXSTAT_SECRX_RDY;
- } while (!t_rdy && !r_rdy && limit--);
+ } while (!(t_rdy && r_rdy) && limit--);
/* undo loopback if we played with it earlier */
if (!link) {
--
2.17.1