On Tue, Nov 21, 2017 at 06:34:54PM +0200, Leon Romanovsky wrote:
> On Tue, Nov 21, 2017 at 09:04:42AM -0700, Jason Gunthorpe wrote:
> > On Tue, Nov 21, 2017 at 09:37:27AM -0600, Daniel Jurgens wrote:
> >
> > > The only warning that would make sense is if the mixed ports aren't
> > > all IB or RoCE. As you note, CX-3 can mix those two, we don't want to
> > > see warnings about that.
> >
> > I would really like to see cx3 be changed to not do that, then we
> > could finalize this issue upstream: All device ports must be the same
> > protocol.
>
> I don't see the point of such artificial limitation, the users who
> brought CX-3 have option to work in mixed mode and IMHO it is not right
> to deprecate such ability just because it is hard for us to code for it.
I don't really think it is really too user visible.. Only the device
and port number change, but only if running in mixed mode.
It is not just 'hard for us' it is impossible to reconcile the
differences between ports when enforcing device level things.
This keeps coming up again and again..
Jason
On 21/11/2017 15:22, Leon Romanovsky wrote:
> On Tue, Nov 21, 2017 at 12:44:10PM +0200, Mark Bloch wrote:
>> Hi,
>>
>> On 21/11/2017 12:26, Leon Romanovsky wrote:
>>> From: Daniel Jurgens <danielj(a)mellanox.com>
>>>
>>> For now the only LSM security enforcement mechanism available is
>>> specific to InfiniBand. Bypass enforcement for non-IB link types.
>>> This fixes a regression where modify_qp fails for iWARP because
>>> querying the PKEY returns -EINVAL.
>>>
>>> Cc: Paul Moore <paul(a)paul-moore.com>
>>> Cc: Don Dutile <ddutile(a)redhat.com>
>>> Cc: stable(a)vger.kernel.org
>>> Reported-by: Potnuri Bharat Teja <bharat(a)chelsio.com>
>>> Fixes: d291f1a65232("IB/core: Enforce PKey security on QPs")
>>> Fixes: 47a2b338fe63("IB/core: Enforce security on management datagrams")
>>> Signed-off-by: Daniel Jurgens <danielj(a)mellanox.com>
>>> Reviewed-by: Parav Pandit <parav(a)mellanox.com>
>>> Tested-by: Potnuri Bharat Teja <bharat(a)chelsio.com>
>>> Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
>>> ---
>>> drivers/infiniband/core/security.c | 9 +++++++++
>>> 1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c
>>> index 23278ed5be45..314bf1137c7b 100644
>>> --- a/drivers/infiniband/core/security.c
>>> +++ b/drivers/infiniband/core/security.c
>>> @@ -417,8 +417,17 @@ void ib_close_shared_qp_security(struct ib_qp_security *sec)
>>>
>>> int ib_create_qp_security(struct ib_qp *qp, struct ib_device *dev)
>>> {
>>> + u8 i = rdma_start_port(dev);
>>> + bool is_ib = false;
>>> int ret;
>>>
>>> + while (i <= rdma_end_port(dev) && !is_ib)
>>> + is_ib = rdma_protocol_ib(dev, i++);
>>> +
>>
>> What happens if we have mixed port types?
>
> We will have is_ib set and qp_sec will be allocated on device layer and
> not on port level, but because pkeys are IB specific term (at least, I
> didn't find any mentioning in RoCE spec), the modify_qp won't query
> for PKEYS.
>
What is port level for qp_sec?
I just gave mlx4 as an example, but I was talking more about the ability of the RDMA
subsystem to support mixed port types, so in this case, if in the future a vendor
will come with an ib_device with 2 ports, one is IB and one is iWARP bad things will happen.
>> I believe mlx4 can expose two ports where each port uses a different ll protocol.
>> Was that changed?
>
> It is still true.
>
>>
>>> + /* If this isn't an IB device don't create the security context */
>>> + if (!is_ib)
>>> + return 0;
>>> +
>>> qp->qp_sec = kzalloc(sizeof(*qp->qp_sec), GFP_KERNEL);
>>> if (!qp->qp_sec)
>>> return -ENOMEM;
>>>
>>
>> Mark.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo(a)vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
Mark
From: Yazen Ghannam <yazen.ghannam(a)amd.com>
[Upstream commit d65dfc81bb3894fdb68cbc74bbf5fb48d2354071]
The AMD severity grading function was introduced in kernel 4.1. The
current logic can possibly give MCE_AR_SEVERITY for uncorrectable
errors in kernel context. The system may then get stuck in a loop as
memory_failure() will try to handle the bad kernel memory and find it
busy.
Return MCE_PANIC_SEVERITY for all UC errors IN_KERNEL context on AMD
systems.
After:
b2f9d678e28c ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")
was accepted in v4.6, this issue was masked because of the tail-end attempt
at kernel mode recovery in the #MC handler.
However, uncorrectable errors IN_KERNEL context should always be considered
unrecoverable and cause a panic.
Signed-off-by: Yazen Ghannam <yazen.ghannam(a)amd.com>
Cc: <stable(a)vger.kernel.org> # 4.1.x, 4.4.x
Fixes: bf80bbd7dcf5 (x86/mce: Add an AMD severities-grading function)
---
Same as
d65dfc81bb38 ("x86/MCE/AMD: Always give panic severity for UC errors in kernel context")
but fixed up to apply to v4.1 and v4.4 stable branches.
arch/x86/kernel/cpu/mcheck/mce-severity.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 9c682c222071..a9287f0f06f2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -200,6 +200,9 @@ static int mce_severity_amd(struct mce *m, int tolerant, char **msg, bool is_exc
if (m->status & MCI_STATUS_UC) {
+ if (ctx == IN_KERNEL)
+ return MCE_PANIC_SEVERITY;
+
/*
* On older systems where overflow_recov flag is not present, we
* should simply panic if an error overflow occurs. If
@@ -207,10 +210,6 @@ static int mce_severity_amd(struct mce *m, int tolerant, char **msg, bool is_exc
* to at least kill process to prolong system operation.
*/
if (mce_flags.overflow_recov) {
- /* software can try to contain */
- if (!(m->mcgstatus & MCG_STATUS_RIPV) && (ctx == IN_KERNEL))
- return MCE_PANIC_SEVERITY;
-
/* kill current process */
return MCE_AR_SEVERITY;
} else {
--
2.14.1
This is a note to let you know that I've just added the patch titled
vlan: fix a use-after-free in vlan_device_event()
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vlan-fix-a-use-after-free-in-vlan_device_event.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Tue Nov 21 15:37:44 CET 2017
From: Cong Wang <xiyou.wangcong(a)gmail.com>
Date: Thu, 9 Nov 2017 16:43:13 -0800
Subject: vlan: fix a use-after-free in vlan_device_event()
From: Cong Wang <xiyou.wangcong(a)gmail.com>
[ Upstream commit 052d41c01b3a2e3371d66de569717353af489d63 ]
After refcnt reaches zero, vlan_vid_del() could free
dev->vlan_info via RCU:
RCU_INIT_POINTER(dev->vlan_info, NULL);
call_rcu(&vlan_info->rcu, vlan_info_rcu_free);
However, the pointer 'grp' still points to that memory
since it is set before vlan_vid_del():
vlan_info = rtnl_dereference(dev->vlan_info);
if (!vlan_info)
goto out;
grp = &vlan_info->grp;
Depends on when that RCU callback is scheduled, we could
trigger a use-after-free in vlan_group_for_each_dev()
right following this vlan_vid_del().
Fix it by moving vlan_vid_del() before setting grp. This
is also symmetric to the vlan_vid_add() we call in
vlan_device_event().
Reported-by: Fengguang Wu <fengguang.wu(a)intel.com>
Fixes: efc73f4bbc23 ("net: Fix memory leak - vlan_info struct")
Cc: Alexander Duyck <alexander.duyck(a)gmail.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Girish Moodalbail <girish.moodalbail(a)oracle.com>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Reviewed-by: Girish Moodalbail <girish.moodalbail(a)oracle.com>
Tested-by: Fengguang Wu <fengguang.wu(a)intel.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/8021q/vlan.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -376,6 +376,9 @@ static int vlan_device_event(struct noti
dev->name);
vlan_vid_add(dev, htons(ETH_P_8021Q), 0);
}
+ if (event == NETDEV_DOWN &&
+ (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER))
+ vlan_vid_del(dev, htons(ETH_P_8021Q), 0);
vlan_info = rtnl_dereference(dev->vlan_info);
if (!vlan_info)
@@ -420,9 +423,6 @@ static int vlan_device_event(struct noti
break;
case NETDEV_DOWN:
- if (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER)
- vlan_vid_del(dev, htons(ETH_P_8021Q), 0);
-
/* Put all VLANs for this dev in the down state too. */
vlan_group_for_each_dev(grp, i, vlandev) {
flgs = vlandev->flags;
Patches currently in stable-queue which might be from xiyou.wangcong(a)gmail.com are
queue-3.18/ipv6-dccp-do-not-inherit-ipv6_mc_list-from-parent.patch
queue-3.18/vlan-fix-a-use-after-free-in-vlan_device_event.patch