This is a note to let you know that I've just added the patch titled
race of lockd inetaddr notifiers vs nlmsvc_rqst change
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
race-of-lockd-inetaddr-notifiers-vs-nlmsvc_rqst-change.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Vasily Averin <vvs(a)virtuozzo.com>
Date: Fri, 10 Nov 2017 10:19:26 +0300
Subject: race of lockd inetaddr notifiers vs nlmsvc_rqst change
From: Vasily Averin <vvs(a)virtuozzo.com>
[ Upstream commit 6b18dd1c03e07262ea0866084856b2a3c5ba8d09 ]
lockd_inet[6]addr_event use nlmsvc_rqst without taken nlmsvc_mutex,
nlmsvc_rqst can be changed during execution of notifiers and crash the host.
Patch enables access to nlmsvc_rqst only when it was correctly initialized
and delays its cleanup until notifiers are no longer in use.
Note that nlmsvc_rqst can be temporally set to ERR_PTR, so the "if
(nlmsvc_rqst)" check in notifiers is insufficient on its own.
Signed-off-by: Vasily Averin <vvs(a)virtuozzo.com>
Tested-by: Scott Mayhew <smayhew(a)redhat.com>
Signed-off-by: J. Bruce Fields <bfields(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/lockd/svc.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -57,6 +57,9 @@ static struct task_struct *nlmsvc_task;
static struct svc_rqst *nlmsvc_rqst;
unsigned long nlmsvc_timeout;
+atomic_t nlm_ntf_refcnt = ATOMIC_INIT(0);
+DECLARE_WAIT_QUEUE_HEAD(nlm_ntf_wq);
+
unsigned int lockd_net_id;
/*
@@ -292,7 +295,8 @@ static int lockd_inetaddr_event(struct n
struct in_ifaddr *ifa = (struct in_ifaddr *)ptr;
struct sockaddr_in sin;
- if (event != NETDEV_DOWN)
+ if ((event != NETDEV_DOWN) ||
+ !atomic_inc_not_zero(&nlm_ntf_refcnt))
goto out;
if (nlmsvc_rqst) {
@@ -303,6 +307,8 @@ static int lockd_inetaddr_event(struct n
svc_age_temp_xprts_now(nlmsvc_rqst->rq_server,
(struct sockaddr *)&sin);
}
+ atomic_dec(&nlm_ntf_refcnt);
+ wake_up(&nlm_ntf_wq);
out:
return NOTIFY_DONE;
@@ -319,7 +325,8 @@ static int lockd_inet6addr_event(struct
struct inet6_ifaddr *ifa = (struct inet6_ifaddr *)ptr;
struct sockaddr_in6 sin6;
- if (event != NETDEV_DOWN)
+ if ((event != NETDEV_DOWN) ||
+ !atomic_inc_not_zero(&nlm_ntf_refcnt))
goto out;
if (nlmsvc_rqst) {
@@ -331,6 +338,8 @@ static int lockd_inet6addr_event(struct
svc_age_temp_xprts_now(nlmsvc_rqst->rq_server,
(struct sockaddr *)&sin6);
}
+ atomic_dec(&nlm_ntf_refcnt);
+ wake_up(&nlm_ntf_wq);
out:
return NOTIFY_DONE;
@@ -347,10 +356,12 @@ static void lockd_unregister_notifiers(v
#if IS_ENABLED(CONFIG_IPV6)
unregister_inet6addr_notifier(&lockd_inet6addr_notifier);
#endif
+ wait_event(nlm_ntf_wq, atomic_read(&nlm_ntf_refcnt) == 0);
}
static void lockd_svc_exit_thread(void)
{
+ atomic_dec(&nlm_ntf_refcnt);
lockd_unregister_notifiers();
svc_exit_thread(nlmsvc_rqst);
}
@@ -375,6 +386,7 @@ static int lockd_start_svc(struct svc_se
goto out_rqst;
}
+ atomic_inc(&nlm_ntf_refcnt);
svc_sock_update_bufs(serv);
serv->sv_maxconn = nlm_max_connections;
Patches currently in stable-queue which might be from vvs(a)virtuozzo.com are
queue-4.14/lockd-fix-list_add-double-add-caused-by-legacy-signal-interface.patch
queue-4.14/grace-replace-bug_on-by-warn_once-in-exit_net-hook.patch
queue-4.14/perf-core-fix-memory-leak-triggered-by-perf-namespace.patch
queue-4.14/race-of-lockd-inetaddr-notifiers-vs-nlmsvc_rqst-change.patch
This is a note to let you know that I've just added the patch titled
quota: propagate error from __dquot_initialize
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
quota-propagate-error-from-__dquot_initialize.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Chao Yu <yuchao0(a)huawei.com>
Date: Tue, 28 Nov 2017 23:01:44 +0800
Subject: quota: propagate error from __dquot_initialize
From: Chao Yu <yuchao0(a)huawei.com>
[ Upstream commit 1a6152d36dee08da2be2a3030dceb45ef680460a ]
In commit 6184fc0b8dd7 ("quota: Propagate error from ->acquire_dquot()"),
we have propagated error from __dquot_initialize to caller, but we forgot
to handle such error in add_dquot_ref(), so, currently, during quota
accounting information initialization flow, if we failed for some of
inodes, we just ignore such error, and do account for others, which is
not a good implementation.
In this patch, we choose to let user be aware of such error, so after
turning on quota successfully, we can make sure all inodes disk usage
can be accounted, which will be more reasonable.
Suggested-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Chao Yu <yuchao0(a)huawei.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/quota/dquot.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -934,12 +934,13 @@ static int dqinit_needed(struct inode *i
}
/* This routine is guarded by s_umount semaphore */
-static void add_dquot_ref(struct super_block *sb, int type)
+static int add_dquot_ref(struct super_block *sb, int type)
{
struct inode *inode, *old_inode = NULL;
#ifdef CONFIG_QUOTA_DEBUG
int reserved = 0;
#endif
+ int err = 0;
spin_lock(&sb->s_inode_list_lock);
list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
@@ -959,7 +960,11 @@ static void add_dquot_ref(struct super_b
reserved = 1;
#endif
iput(old_inode);
- __dquot_initialize(inode, type);
+ err = __dquot_initialize(inode, type);
+ if (err) {
+ iput(inode);
+ goto out;
+ }
/*
* We hold a reference to 'inode' so it couldn't have been
@@ -974,7 +979,7 @@ static void add_dquot_ref(struct super_b
}
spin_unlock(&sb->s_inode_list_lock);
iput(old_inode);
-
+out:
#ifdef CONFIG_QUOTA_DEBUG
if (reserved) {
quota_error(sb, "Writes happened before quota was turned on "
@@ -982,6 +987,7 @@ static void add_dquot_ref(struct super_b
"Please run quotacheck(8)");
}
#endif
+ return err;
}
/*
@@ -2372,10 +2378,11 @@ static int vfs_load_quota_inode(struct i
dqopt->flags |= dquot_state_flag(flags, type);
spin_unlock(&dq_state_lock);
- add_dquot_ref(sb, type);
-
- return 0;
+ error = add_dquot_ref(sb, type);
+ if (error)
+ dquot_disable(sb, type, flags);
+ return error;
out_file_init:
dqopt->files[type] = NULL;
iput(inode);
Patches currently in stable-queue which might be from yuchao0(a)huawei.com are
queue-4.14/quota-propagate-error-from-__dquot_initialize.patch
This is a note to let you know that I've just added the patch titled
quota: Check for register_shrinker() failure.
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
quota-check-for-register_shrinker-failure.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Date: Wed, 29 Nov 2017 22:34:50 +0900
Subject: quota: Check for register_shrinker() failure.
From: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
[ Upstream commit 88bc0ede8d35edc969350852894dc864a2dc1859 ]
register_shrinker() might return -ENOMEM error since Linux 3.12.
Call panic() as with other failure checks in this function if
register_shrinker() failed.
Fixes: 1d3d4437eae1 ("vmscan: per-node deferred work")
Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: Jan Kara <jack(a)suse.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Michal Hocko <mhocko(a)suse.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/quota/dquot.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -2985,7 +2985,8 @@ static int __init dquot_init(void)
pr_info("VFS: Dquot-cache hash table entries: %ld (order %ld,"
" %ld bytes)\n", nr_hash, order, (PAGE_SIZE << order));
- register_shrinker(&dqcache_shrinker);
+ if (register_shrinker(&dqcache_shrinker))
+ panic("Cannot register dquot shrinker");
return 0;
}
Patches currently in stable-queue which might be from penguin-kernel(a)I-love.SAKURA.ne.jp are
queue-4.14/quota-check-for-register_shrinker-failure.patch
queue-4.14/xfs-fortify-xfs_alloc_buftarg-error-handling.patch
This is a note to let you know that I've just added the patch titled
perf/core: Fix memory leak triggered by perf --namespace
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-core-fix-memory-leak-triggered-by-perf-namespace.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Vasily Averin <vvs(a)virtuozzo.com>
Date: Wed, 15 Nov 2017 08:47:02 +0300
Subject: perf/core: Fix memory leak triggered by perf --namespace
From: Vasily Averin <vvs(a)virtuozzo.com>
[ Upstream commit 0e18dd12064e07519f7cbff4149ca7fff620cbed ]
perf with --namespace key leaks various memory objects including namespaces
4.14.0+
pid_namespace 1 12 2568 12 8
user_namespace 1 39 824 39 8
net_namespace 1 5 6272 5 8
This happen because perf_fill_ns_link_info() struct patch ns_path:
during initialization ns_path incremented counters on related mnt and dentry,
but without lost path_put nobody decremented them back.
Leaked dentry is name of related namespace,
and its leak does not allow to free unused namespace.
Signed-off-by: Vasily Averin <vvs(a)virtuozzo.com>
Acked-by: Peter Zijlstra <peterz(a)infradead.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme(a)kernel.org>
Cc: Hari Bathini <hbathini(a)linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Fixes: commit e422267322cd ("perf: Add PERF_RECORD_NAMESPACES to include namespaces related info")
Link: http://lkml.kernel.org/r/c510711b-3904-e5e1-d296-61273d21118d@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/events/core.c | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6756,6 +6756,7 @@ static void perf_fill_ns_link_info(struc
ns_inode = ns_path.dentry->d_inode;
ns_link_info->dev = new_encode_dev(ns_inode->i_sb->s_dev);
ns_link_info->ino = ns_inode->i_ino;
+ path_put(&ns_path);
}
}
Patches currently in stable-queue which might be from vvs(a)virtuozzo.com are
queue-4.14/lockd-fix-list_add-double-add-caused-by-legacy-signal-interface.patch
queue-4.14/grace-replace-bug_on-by-warn_once-in-exit_net-hook.patch
queue-4.14/perf-core-fix-memory-leak-triggered-by-perf-namespace.patch
queue-4.14/race-of-lockd-inetaddr-notifiers-vs-nlmsvc_rqst-change.patch
This is a note to let you know that I've just added the patch titled
openvswitch: fix the incorrect flow action alloc size
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
openvswitch-fix-the-incorrect-flow-action-alloc-size.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: zhangliping <zhangliping02(a)baidu.com>
Date: Sat, 25 Nov 2017 22:02:12 +0800
Subject: openvswitch: fix the incorrect flow action alloc size
From: zhangliping <zhangliping02(a)baidu.com>
[ Upstream commit 67c8d22a73128ff910e2287567132530abcf5b71 ]
If we want to add a datapath flow, which has more than 500 vxlan outputs'
action, we will get the following error reports:
openvswitch: netlink: Flow action size 32832 bytes exceeds max
openvswitch: netlink: Flow action size 32832 bytes exceeds max
openvswitch: netlink: Actions may not be safe on all matching packets
... ...
It seems that we can simply enlarge the MAX_ACTIONS_BUFSIZE to fix it, but
this is not the root cause. For example, for a vxlan output action, we need
about 60 bytes for the nlattr, but after it is converted to the flow
action, it only occupies 24 bytes. This means that we can still support
more than 1000 vxlan output actions for a single datapath flow under the
the current 32k max limitation.
So even if the nla_len(attr) is larger than MAX_ACTIONS_BUFSIZE, we
shouldn't report EINVAL and keep it move on, as the judgement can be
done by the reserve_sfa_size.
Signed-off-by: zhangliping <zhangliping02(a)baidu.com>
Acked-by: Pravin B Shelar <pshelar(a)ovn.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/openvswitch/flow_netlink.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -1903,14 +1903,11 @@ int ovs_nla_put_mask(const struct sw_flo
#define MAX_ACTIONS_BUFSIZE (32 * 1024)
-static struct sw_flow_actions *nla_alloc_flow_actions(int size, bool log)
+static struct sw_flow_actions *nla_alloc_flow_actions(int size)
{
struct sw_flow_actions *sfa;
- if (size > MAX_ACTIONS_BUFSIZE) {
- OVS_NLERR(log, "Flow action size %u bytes exceeds max", size);
- return ERR_PTR(-EINVAL);
- }
+ WARN_ON_ONCE(size > MAX_ACTIONS_BUFSIZE);
sfa = kmalloc(sizeof(*sfa) + size, GFP_KERNEL);
if (!sfa)
@@ -1983,12 +1980,15 @@ static struct nlattr *reserve_sfa_size(s
new_acts_size = ksize(*sfa) * 2;
if (new_acts_size > MAX_ACTIONS_BUFSIZE) {
- if ((MAX_ACTIONS_BUFSIZE - next_offset) < req_size)
+ if ((MAX_ACTIONS_BUFSIZE - next_offset) < req_size) {
+ OVS_NLERR(log, "Flow action size exceeds max %u",
+ MAX_ACTIONS_BUFSIZE);
return ERR_PTR(-EMSGSIZE);
+ }
new_acts_size = MAX_ACTIONS_BUFSIZE;
}
- acts = nla_alloc_flow_actions(new_acts_size, log);
+ acts = nla_alloc_flow_actions(new_acts_size);
if (IS_ERR(acts))
return (void *)acts;
@@ -2660,7 +2660,7 @@ int ovs_nla_copy_actions(struct net *net
{
int err;
- *sfa = nla_alloc_flow_actions(nla_len(attr), log);
+ *sfa = nla_alloc_flow_actions(min(nla_len(attr), MAX_ACTIONS_BUFSIZE));
if (IS_ERR(*sfa))
return PTR_ERR(*sfa);
Patches currently in stable-queue which might be from zhangliping02(a)baidu.com are
queue-4.14/openvswitch-fix-the-incorrect-flow-action-alloc-size.patch
This is a note to let you know that I've just added the patch titled
nvmet-fc: correct ref counting error when deferred rcv used
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nvmet-fc-correct-ref-counting-error-when-deferred-rcv-used.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: James Smart <jsmart2021(a)gmail.com>
Date: Fri, 10 Nov 2017 15:38:45 -0800
Subject: nvmet-fc: correct ref counting error when deferred rcv used
From: James Smart <jsmart2021(a)gmail.com>
[ Upstream commit 619c62dcc62b957d17cccde2081cad527b020883 ]
Whenever a cmd is received a reference is taken while looking up the
queue. The reference is removed after the cmd is done as the iod is
returned for reuse. The fod may be reused for a deferred (recevied but
no job context) cmd. Existing code removes the reference only if the
fod is not reused for another command. Given the fod may be used for
one or more ios, although a reference was taken per io, it won't be
matched on the frees.
Remove the reference on every fod free. This pairs the references to
each io.
Signed-off-by: James Smart <james.smart(a)broadcom.com>
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvme/target/fc.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -532,15 +532,15 @@ nvmet_fc_free_fcp_iod(struct nvmet_fc_tg
tgtport->ops->fcp_req_release(&tgtport->fc_target_port, fcpreq);
+ /* release the queue lookup reference on the completed IO */
+ nvmet_fc_tgt_q_put(queue);
+
spin_lock_irqsave(&queue->qlock, flags);
deferfcp = list_first_entry_or_null(&queue->pending_cmd_list,
struct nvmet_fc_defer_fcp_req, req_list);
if (!deferfcp) {
list_add_tail(&fod->fcp_list, &fod->queue->fod_list);
spin_unlock_irqrestore(&queue->qlock, flags);
-
- /* Release reference taken at queue lookup and fod allocation */
- nvmet_fc_tgt_q_put(queue);
return;
}
@@ -759,6 +759,9 @@ nvmet_fc_delete_target_queue(struct nvme
tgtport->ops->fcp_req_release(&tgtport->fc_target_port,
deferfcp->fcp_req);
+ /* release the queue lookup reference */
+ nvmet_fc_tgt_q_put(queue);
+
kfree(deferfcp);
spin_lock_irqsave(&queue->qlock, flags);
Patches currently in stable-queue which might be from jsmart2021(a)gmail.com are
queue-4.14/nvmet-fc-correct-ref-counting-error-when-deferred-rcv-used.patch
This is a note to let you know that I've just added the patch titled
nvme-rdma: don't complete requests before a send work request has completed
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nvme-rdma-don-t-complete-requests-before-a-send-work-request-has-completed.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Sagi Grimberg <sagi(a)grimberg.me>
Date: Thu, 23 Nov 2017 17:35:22 +0200
Subject: nvme-rdma: don't complete requests before a send work request has completed
From: Sagi Grimberg <sagi(a)grimberg.me>
[ Upstream commit 4af7f7ff92a42b6c713293c99e7982bcfcf51a70 ]
In order to guarantee that the HCA will never get an access violation
(either from invalidated rkey or from iommu) when retrying a send
operation we must complete a request only when both send completion and
the nvme cqe has arrived. We need to set the send/recv completions flags
atomically because we might have more than a single context accessing the
request concurrently (one is cq irq-poll context and the other is
user-polling used in IOCB_HIPRI).
Only then we are safe to invalidate the rkey (if needed), unmap the host
buffers, and complete the IO.
Signed-off-by: Sagi Grimberg <sagi(a)grimberg.me>
Reviewed-by: Max Gurtovoy <maxg(a)mellanox.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvme/host/rdma.c | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -67,6 +67,9 @@ struct nvme_rdma_request {
struct nvme_request req;
struct ib_mr *mr;
struct nvme_rdma_qe sqe;
+ union nvme_result result;
+ __le16 status;
+ refcount_t ref;
struct ib_sge sge[1 + NVME_RDMA_MAX_INLINE_SEGMENTS];
u32 num_sge;
int nents;
@@ -1177,6 +1180,7 @@ static int nvme_rdma_map_data(struct nvm
req->num_sge = 1;
req->inline_data = false;
req->mr->need_inval = false;
+ refcount_set(&req->ref, 2); /* send and recv completions */
c->common.flags |= NVME_CMD_SGL_METABUF;
@@ -1213,8 +1217,19 @@ static int nvme_rdma_map_data(struct nvm
static void nvme_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
{
- if (unlikely(wc->status != IB_WC_SUCCESS))
+ struct nvme_rdma_qe *qe =
+ container_of(wc->wr_cqe, struct nvme_rdma_qe, cqe);
+ struct nvme_rdma_request *req =
+ container_of(qe, struct nvme_rdma_request, sqe);
+ struct request *rq = blk_mq_rq_from_pdu(req);
+
+ if (unlikely(wc->status != IB_WC_SUCCESS)) {
nvme_rdma_wr_error(cq, wc, "SEND");
+ return;
+ }
+
+ if (refcount_dec_and_test(&req->ref))
+ nvme_end_request(rq, req->status, req->result);
}
/*
@@ -1359,14 +1374,19 @@ static int nvme_rdma_process_nvme_rsp(st
}
req = blk_mq_rq_to_pdu(rq);
- if (rq->tag == tag)
- ret = 1;
+ req->status = cqe->status;
+ req->result = cqe->result;
if ((wc->wc_flags & IB_WC_WITH_INVALIDATE) &&
wc->ex.invalidate_rkey == req->mr->rkey)
req->mr->need_inval = false;
- nvme_end_request(rq, cqe->status, cqe->result);
+ if (refcount_dec_and_test(&req->ref)) {
+ if (rq->tag == tag)
+ ret = 1;
+ nvme_end_request(rq, req->status, req->result);
+ }
+
return ret;
}
Patches currently in stable-queue which might be from sagi(a)grimberg.me are
queue-4.14/nvmet-fc-correct-ref-counting-error-when-deferred-rcv-used.patch
queue-4.14/nvme-fabrics-introduce-init-command-check-for-a-queue-that-is-not-alive.patch
queue-4.14/nvme-fc-check-if-queue-is-ready-in-queue_rq.patch
queue-4.14/nvme-loop-check-if-queue-is-ready-in-queue_rq.patch
queue-4.14/nvme-rdma-don-t-complete-requests-before-a-send-work-request-has-completed.patch
This is a note to let you know that I've just added the patch titled
nvme-pci: fix NULL pointer dereference in nvme_free_host_mem()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nvme-pci-fix-null-pointer-dereference-in-nvme_free_host_mem.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Minwoo Im <minwoo.im.dev(a)gmail.com>
Date: Sat, 25 Nov 2017 03:03:00 +0900
Subject: nvme-pci: fix NULL pointer dereference in nvme_free_host_mem()
From: Minwoo Im <minwoo.im.dev(a)gmail.com>
[ Upstream commit 7e5dd57ef3081ff6c03908d786ed5087f6fbb7ae ]
Following condition which will cause NULL pointer dereference will
occur in nvme_free_host_mem() when it tries to remove pci device via
nvme_remove() especially after a failure of host memory allocation for HMB.
"(host_mem_descs == NULL) && (nr_host_mem_descs != 0)"
It's because __nr_host_mem_descs__ is not cleared to 0 unlike
__host_mem_descs__ is so.
Signed-off-by: Minwoo Im <minwoo.im.dev(a)gmail.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvme/host/pci.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1617,6 +1617,7 @@ static void nvme_free_host_mem(struct nv
dev->nr_host_mem_descs * sizeof(*dev->host_mem_descs),
dev->host_mem_descs, dev->host_mem_descs_dma);
dev->host_mem_descs = NULL;
+ dev->nr_host_mem_descs = 0;
}
static int __nvme_alloc_host_mem(struct nvme_dev *dev, u64 preferred,
Patches currently in stable-queue which might be from minwoo.im.dev(a)gmail.com are
queue-4.14/nvme-pci-avoid-hmb-desc-array-idx-out-of-bound-when-hmmaxd-set.patch
queue-4.14/nvme-pci-fix-null-pointer-dereference-in-nvme_free_host_mem.patch
This is a note to let you know that I've just added the patch titled
nvme-pci: disable APST on Samsung SSD 960 EVO + ASUS PRIME B350M-A
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nvme-pci-disable-apst-on-samsung-ssd-960-evo-asus-prime-b350m-a.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Date: Thu, 9 Nov 2017 01:12:03 -0500
Subject: nvme-pci: disable APST on Samsung SSD 960 EVO + ASUS PRIME B350M-A
From: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
[ Upstream commit 8427bbc224863e14d905c87920d4005cb3e88ac3 ]
The NVMe device in question drops off the PCIe bus after system suspend.
I've tried several approaches to workaround this issue, but none of them
works:
- NVME_QUIRK_DELAY_BEFORE_CHK_RDY
- NVME_QUIRK_NO_DEEPEST_PS
- Disable APST before controller shutdown
- Delay between controller shutdown and system suspend
- Explicitly set power state to 0 before controller shutdown
Fortunately it's a desktop, so disable APST won't hurt the battery.
Also, change the quirk function name to reflect it's for vendor
combination quirks.
BugLink: https://bugs.launchpad.net/bugs/1705748
Signed-off-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvme/host/pci.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2282,7 +2282,7 @@ static int nvme_dev_map(struct nvme_dev
return -ENODEV;
}
-static unsigned long check_dell_samsung_bug(struct pci_dev *pdev)
+static unsigned long check_vendor_combination_bug(struct pci_dev *pdev)
{
if (pdev->vendor == 0x144d && pdev->device == 0xa802) {
/*
@@ -2297,6 +2297,14 @@ static unsigned long check_dell_samsung_
(dmi_match(DMI_PRODUCT_NAME, "XPS 15 9550") ||
dmi_match(DMI_PRODUCT_NAME, "Precision 5510")))
return NVME_QUIRK_NO_DEEPEST_PS;
+ } else if (pdev->vendor == 0x144d && pdev->device == 0xa804) {
+ /*
+ * Samsung SSD 960 EVO drops off the PCIe bus after system
+ * suspend on a Ryzen board, ASUS PRIME B350M-A.
+ */
+ if (dmi_match(DMI_BOARD_VENDOR, "ASUSTeK COMPUTER INC.") &&
+ dmi_match(DMI_BOARD_NAME, "PRIME B350M-A"))
+ return NVME_QUIRK_NO_APST;
}
return 0;
@@ -2336,7 +2344,7 @@ static int nvme_probe(struct pci_dev *pd
if (result)
goto unmap;
- quirks |= check_dell_samsung_bug(pdev);
+ quirks |= check_vendor_combination_bug(pdev);
result = nvme_init_ctrl(&dev->ctrl, &pdev->dev, &nvme_pci_ctrl_ops,
quirks);
Patches currently in stable-queue which might be from kai.heng.feng(a)canonical.com are
queue-4.14/nvme-pci-disable-apst-on-samsung-ssd-960-evo-asus-prime-b350m-a.patch
This is a note to let you know that I've just added the patch titled
nvme-pci: avoid hmb desc array idx out-of-bound when hmmaxd set.
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nvme-pci-avoid-hmb-desc-array-idx-out-of-bound-when-hmmaxd-set.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Minwoo Im <minwoo.im.dev(a)gmail.com>
Date: Fri, 17 Nov 2017 01:34:24 +0900
Subject: nvme-pci: avoid hmb desc array idx out-of-bound when hmmaxd set.
From: Minwoo Im <minwoo.im.dev(a)gmail.com>
[ Upstream commit 244a8fe40a09c218622eb9927b9090b0a9b73a1a ]
hmb descriptor idx out-of-bound occurs in case of below conditions.
preferred = 128MiB
chunk_size = 4MiB
hmmaxd = 1
Current code will not allow rmmod which will free hmb descriptors
to be done successfully in above case.
"descs[i]" will be set in for-loop without seeing any conditions
related to "max_entries" after a single "descs" was allocated by
(max_entries = 1) in this case.
Added a condition into for-loop to check index of descriptors.
Fixes: 044a9df1("nvme-pci: implement the HMB entry number and size limitations")
Signed-off-by: Minwoo Im <minwoo.im.dev(a)gmail.com>
Reviewed-by: Keith Busch <keith.busch(a)intel.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/nvme/host/pci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1645,7 +1645,7 @@ static int __nvme_alloc_host_mem(struct
if (!bufs)
goto out_free_descs;
- for (size = 0; size < preferred; size += len) {
+ for (size = 0; size < preferred && i < max_entries; size += len) {
dma_addr_t dma_addr;
len = min_t(u64, chunk_size, preferred - size);
Patches currently in stable-queue which might be from minwoo.im.dev(a)gmail.com are
queue-4.14/nvme-pci-avoid-hmb-desc-array-idx-out-of-bound-when-hmmaxd-set.patch
queue-4.14/nvme-pci-fix-null-pointer-dereference-in-nvme_free_host_mem.patch