The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From edc0b0bccc9c80d9a44d3002dcca94984b25e7cf Mon Sep 17 00:00:00 2001
From: Mark Bloch <mbloch(a)nvidia.com>
Date: Mon, 7 Jun 2021 11:03:12 +0300
Subject: [PATCH] RDMA/mlx5: Block FDB rules when not in switchdev mode
Allow creating FDB steering rules only when in switchdev mode.
The only software model where a userspace application can manipulate
FDB entries is when it manages the eswitch. This is only possible in
switchdev mode where we expose a single RDMA device with representors
for all the vports that are connected to the eswitch.
Fixes: 52438be44112 ("RDMA/mlx5: Allow inserting a steering rule to the FDB")
Link: https://lore.kernel.org/r/e928ae7c58d07f104716a2a8d730963d1bd01204.16230529…
Reviewed-by: Maor Gottlieb <maorg(a)nvidia.com>
Signed-off-by: Mark Bloch <mbloch(a)nvidia.com>
Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
diff --git a/drivers/infiniband/hw/mlx5/fs.c b/drivers/infiniband/hw/mlx5/fs.c
index 2fc6a60c4e77..f84441ff0c81 100644
--- a/drivers/infiniband/hw/mlx5/fs.c
+++ b/drivers/infiniband/hw/mlx5/fs.c
@@ -2134,6 +2134,12 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_FLOW_MATCHER_CREATE)(
if (err)
goto end;
+ if (obj->ns_type == MLX5_FLOW_NAMESPACE_FDB &&
+ mlx5_eswitch_mode(dev->mdev) != MLX5_ESWITCH_OFFLOADS) {
+ err = -EINVAL;
+ goto end;
+ }
+
uobj->object = obj;
obj->mdev = dev->mdev;
atomic_set(&obj->usecnt, 0);
Hello,
I request the following patch from v4.10-rc1 to get cherry-picked into
"stable/linux-4.9.y":
> commit f114dca2533ca770aebebffb5ed56e5e7d1fb3fb
> Author: Alexander Duyck <alexander.h.duyck(a)intel.com>
> Date: Tue Oct 25 16:08:46 2016 -0700
>
> i40e: Be much more verbose about what we can and cannot offload
>
> This change makes it so that we are much more robust about defining what we
> can and cannot offload. Previously we were just checking for the L4 tunnel
> header length, however there are other fields we should be verifying as
> there are multiple scenarios in which we cannot perform hardware offloads.
>
> In addition the device only supports GSO as long as the MSS is 64 or
> greater. We were not checking this so an MSS less than that was resulting
> in Tx hangs.
>
> Change-ID: I5e2fd5f3075c73601b4b36327b771c64fcb6c31b
> Signed-off-by: Alexander Duyck <alexander.h.duyck(a)intel.com>
> Tested-by: Andrew Bowers <andrewx.bowers(a)intel.com>
Debian had this old Bug
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892105> reported
against 4.9.82, which still exists in Debians old-stable 9 "Stretch"
current kernel 4.9.258, but also with latest stable 4.9.273.
Our environment
===============
- KVM server
- dual port i40e
- classic bridge with enp96s0f0
- VM attached to bridge via veth
- no VLANs
- no MacVLan
> # ethtool -i enp96s0f0
> driver: i40e
> version: 1.6.16-k
> firmware-version: 3.33 0x80000e48 1.1876.0
> expansion-rom-version:
> bus-info: 0000:60:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: ye
> # lspci -s 0000:60:00.0
> 60:00.0 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GBASE-T (rev 09)
Analysis
========
As soon as we start one of our "Ubuntu" images the bridge stops
receiving unicast packages for *all* VMs on that bridge.
- we still see outgoing traffic leaving the host, e.g. ARP requests
- "tcpdump -i enp96s0f0" shows no incoming unicast traffic, e.g. no ARP
response
- broadcast traffic passes the bridge
- VMs on the same bridge can communicate with each other
Most often I see the following error message after doing `dmesg -n 8`:
> [ +9,376367] i40e 0000:60:00.0: cleared PE_CRITERR
> [ +0,000252] i40e 0000:60:00.0: TX driver issue detected, PF reset issued
> [ +0,859912] i40e 0000:60:00.0: Error I40E_AQ_RC_EINVAL adding RX filters on PF, promiscuous mode forced on
In one case I've seen this also (don't know if it is relevant):
> [ 218.921466] i40e 0000:60:00.0 enp96s0f0: VSI_seid 390, Hung TX queue 43, tx_pending_hw: 1, NTC:0xa6, HWB: 0xa6, NTU: 0xa7, TAIL: 0xa7
> [ 218.921470] i40e 0000:60:00.0 enp96s0f0: VSI_seid 390, Issuing force_wb for TX queue 43, Interrupt Reg: 0x0
After that error the only way to reset this broken state it to reboot
the host. I've been unable to tear down the bridge and/or remove the
`i40e` driver, which most often crashes the Linux kernel (some other bug
on `ip link set enp96s0f0 nomaster`).
If you need more data I have a PCAP file, but I still don't know which
packet exactly triggers the bug.
The bugs seems to be fixed with 4.10.0 and I bisected it down to
> git bisect start '--' 'drivers/net/ethernet/intel/i40e'
> # new: [c470abd4fde40ea6a0846a2beab642a578c0b8cd] Linux 4.10
> git bisect new c470abd4fde40ea6a0846a2beab642a578c0b8cd
> # old: [69973b830859bc6529a7a0468ba0d80ee5117826] Linux 4.9
> git bisect old 69973b830859bc6529a7a0468ba0d80ee5117826
> # old: [13fd3f9cc3def8b276c7913ae4acbfa2653cb198] i40e: clear mac filter count on reset
> git bisect old 13fd3f9cc3def8b276c7913ae4acbfa2653cb198
> # new: [7ec9ba11b046b4b7fd768c366870ada60d409295] i40e: Driver prints log message on link speed change
> git bisect new 7ec9ba11b046b4b7fd768c366870ada60d409295
> # new: [0b7c8b5d5436317a5f4509e2a150c6cec017f348] i40e: fix trivial typo in naming of i40e_sync_filters_subtask
> git bisect new 0b7c8b5d5436317a5f4509e2a150c6cec017f348
> # new: [f114dca2533ca770aebebffb5ed56e5e7d1fb3fb] i40e: Be much more verbose about what we can and cannot offload
> git bisect new f114dca2533ca770aebebffb5ed56e5e7d1fb3fb
> # old: [81fa7c97bebd6e1a52c4e059eeffe18df5b3f11f] i40e: Implementation of ERROR state for NVM update state machine
> git bisect old 81fa7c97bebd6e1a52c4e059eeffe18df5b3f11f
> # old: [3aa7b74dbeedfb32406fec70cfd76d797209e8c9] i40e: removed unreachable code
> git bisect old 3aa7b74dbeedfb32406fec70cfd76d797209e8c9
> # first new commit: [f114dca2533ca770aebebffb5ed56e5e7d1fb3fb] i40e: Be much more verbose about what we can and cannot offload
I used v4.10 as the basis and only bisected everything in
drivers/net/ethernet/intel/i40e/ as vanilla v4.9 and several other
versions between that and v4.10 crashed my host, so basically
git checkout v4.10
git checkout $hash -- drivers/net/ethernet/intel/i40e/
make all modules_install install
git checkout v4-10 -- drivers/net/ethernet/intel/i40e/
git bisect (old|new) $hash
I verified that cherry-picking f114dca2533ca770aebebffb5ed56e5e7d1fb3fb
on top of v4.9.273 fixes the problem and reverting it again shows the
problem again.
Philipp
--
Philipp Hahn
Open Source Software Engineer
Univention GmbH
be open.
Mary-Somerville-Str. 1
D-28359 Bremen
📞 +49-421-22232-57
🖶 +49-421-22232-99
✉️ hahn(a)univention.de
🌐 https://www.univention.de/
Geschäftsführer: Peter H. Ganten
HRB 20755 Amtsgericht Bremen
Steuer-Nr.: 71-597-02876
Mainline commit ce9f24cccdc0 ("ext4: check journal inode extents more carefully")
enabled validity checks for journal inode's data blocks. This change got
ported to stable branches, but the backport for 4.19 has a bug where it will
flag an error even when system block entry's inode number matches journal
inode.
The way error is reported is also problematic because it updates the superblock
without following journaling rules. This may result in superblock checksum
errors if the superblock is in the process of being committed but has a
previously calculated checksum that doesn't include the bogus error update.
This patch eliminates the bogus error by trying to match how other backports
were implemented, which is to flag an error only when inode numbers mismatch.
Fixes: commit a75a5d163857 ("ext4: check journal inode extents more carefully")
Signed-off-by: Tahsin Erdogan <trdgn(a)amazon.com>
Cc: stable(a)vger.kernel.org
Cc: Jan Kara <jack(a)suse.cz>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/ext4/block_validity.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c
index 1ea8fc9ff048..1bc65ecd4bd6 100644
--- a/fs/ext4/block_validity.c
+++ b/fs/ext4/block_validity.c
@@ -171,8 +171,10 @@ static int ext4_data_block_valid_rcu(struct ext4_sb_info *sbi,
else if (start_blk >= (entry->start_blk + entry->count))
n = n->rb_right;
else {
+ if (entry->ino == ino)
+ return 1;
sbi->s_es->s_last_error_block = cpu_to_le64(start_blk);
- return entry->ino == ino;
+ return 0;
}
}
return 1;
--
2.17.1