This is the start of the stable review cycle for the 5.8.9 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 13 Sep 2020 12:24:42 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
    https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.9-rc1.g...
or in the git tree and branch at:
    git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 5.8.9-rc1

Florian Westphal <fw@strlen.de>
    mptcp: free acked data before waiting for more memory

Jakub Kicinski <kuba@kernel.org>
    net: disable netpoll on fresh napis

Tuong Lien <tuong.t.lien@dektech.com.au>
    tipc: fix using smp_processor_id() in preemptible

Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    tipc: fix shutdown() of connectionless socket

Vinicius Costa Gomes <vinicius.gomes@intel.com>
    taprio: Fix using wrong queues in gate mask

Xin Long <lucien.xin@gmail.com>
    sctp: not disable bh in the whole sctp_get_port_local()

Kamil Lorenc <kamil@re-ws.pl>
    net: usb: dm9601: Add USB ID of Keenetic Plus DSL

Paul Moore <paul@paul-moore.com>
    netlabel: fix problems with mapping removal

Ido Schimmel <idosch@nvidia.com>
    ipv6: Fix sysctl max for fib_multipath_hash_policy

Ido Schimmel <idosch@nvidia.com>
    ipv4: Silence suspicious RCU usage warning

Jason Gunthorpe <jgg@nvidia.com>
    RDMA/cma: Execute rdma_cm destruction from a handler properly

Jason Gunthorpe <jgg@nvidia.com>
    RDMA/cma: Remove unneeded locking for req paths

Jason Gunthorpe <jgg@nvidia.com>
    RDMA/cma: Using the standard locking pattern when delivering the removal event

Jason Gunthorpe <jgg@nvidia.com>
    RDMA/cma: Simplify DEVICE_REMOVAL for internal_id

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: fix linked deferred ->files cancellation

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: fix cancel of deferred reqs with ->files
-------------
Diffstat:
 Makefile                           |   4 +-
 drivers/infiniband/core/cma.c      | 257 ++++++++++++++++++-------------------
 drivers/net/usb/dm9601.c           |   4 +
 fs/io_uring.c                      |  47 +++++++
 net/core/dev.c                     |   3 +-
 net/core/netpoll.c                 |   2 +-
 net/ipv4/fib_trie.c                |   3 +-
 net/ipv6/sysctl_net_ipv6.c         |   3 +-
 net/mptcp/protocol.c               |   3 +-
 net/netlabel/netlabel_domainhash.c |  59 ++++-----
 net/sched/sch_taprio.c             |  30 ++++-
 net/sctp/socket.c                  |  16 +--
 net/tipc/crypto.c                  |  12 +-
 net/tipc/socket.c                  |   9 +-
 14 files changed, 259 insertions(+), 193 deletions(-)
From: Pavel Begunkov <asml.silence@gmail.com>
[ Upstream commit b7ddce3cbf010edbfac6c6d8cc708560a7bcd7a4 ]
While trying to cancel requests with ->files, the code should also look for requests in ->defer_list; otherwise it might end up hanging a thread.

Cancel all requests in ->defer_list up to the last request there with matching ->files; that is needed to follow the drain ordering semantics.
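As an aside, the "cancel everything queued before, and including, the last matching request" rule is easy to check in isolation. Below is a purely illustrative userspace sketch of that selection logic; the array-based queue and the integer 'files' field are hypothetical stand-ins for the kernel structures, not code from the patch:

/* illustrative only; models ->defer_list as an array, oldest first */
#include <stdio.h>

struct fake_req { int id; int files; };

int main(void)
{
        struct fake_req defer_list[] = {
                {1, 0}, {2, 1}, {3, 0}, {4, 1}, {5, 0},
        };
        const int n = sizeof(defer_list) / sizeof(defer_list[0]);
        int target_files = 1, cut = -1, i;

        /* walk in reverse, like list_for_each_entry_reverse() */
        for (i = n - 1; i >= 0; i--) {
                if (defer_list[i].files == target_files) {
                        cut = i;        /* last request matching ->files */
                        break;
                }
        }

        /* everything up to and including 'cut' gets cancelled */
        for (i = 0; i <= cut; i++)
                printf("cancel req %d\n", defer_list[i].id);
        return 0;
}

Running it cancels requests 1 through 4 and leaves request 5 queued, which is the drain-ordering behavior described above.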
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/io_uring.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 38f3ec15ba3b1..5f627194d0920 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7675,12 +7675,38 @@ static void io_attempt_cancel(struct io_ring_ctx *ctx, struct io_kiocb *req)
         io_timeout_remove_link(ctx, req);
 }

+static void io_cancel_defer_files(struct io_ring_ctx *ctx,
+                                  struct files_struct *files)
+{
+        struct io_kiocb *req = NULL;
+        LIST_HEAD(list);
+
+        spin_lock_irq(&ctx->completion_lock);
+        list_for_each_entry_reverse(req, &ctx->defer_list, list) {
+                if ((req->flags & REQ_F_WORK_INITIALIZED)
+                        && req->work.files == files) {
+                        list_cut_position(&list, &ctx->defer_list, &req->list);
+                        break;
+                }
+        }
+        spin_unlock_irq(&ctx->completion_lock);
+
+        while (!list_empty(&list)) {
+                req = list_first_entry(&list, struct io_kiocb, list);
+                list_del_init(&req->list);
+                req_set_fail_links(req);
+                io_cqring_add_event(req, -ECANCELED);
+                io_double_put_req(req);
+        }
+}
+
 static void io_uring_cancel_files(struct io_ring_ctx *ctx,
                                   struct files_struct *files)
 {
         if (list_empty_careful(&ctx->inflight_list))
                 return;

+        io_cancel_defer_files(ctx, files);
         /* cancel all at once, should be faster than doing it one by one*/
         io_wq_cancel_cb(ctx->io_wq, io_wq_files_match, files, true);
From: Pavel Begunkov <asml.silence@gmail.com>
[ Upstream commit c127a2a1b7baa5eb40a7e2de4b7f0c51ccbbb2ef ]
While looking for ->files in ->defer_list, consider that requests there may actually be links.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/io_uring.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5f627194d0920..d05023ca74bdc 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7601,6 +7601,28 @@ static bool io_match_link(struct io_kiocb *preq, struct io_kiocb *req)
         return false;
 }

+static inline bool io_match_files(struct io_kiocb *req,
+                                  struct files_struct *files)
+{
+        return (req->flags & REQ_F_WORK_INITIALIZED) && req->work.files == files;
+}
+
+static bool io_match_link_files(struct io_kiocb *req,
+                                struct files_struct *files)
+{
+        struct io_kiocb *link;
+
+        if (io_match_files(req, files))
+                return true;
+        if (req->flags & REQ_F_LINK_HEAD) {
+                list_for_each_entry(link, &req->link_list, link_list) {
+                        if (io_match_files(link, files))
+                                return true;
+                }
+        }
+        return false;
+}
+
 /*
  * We're looking to cancel 'req' because it's holding on to our files, but
  * 'req' could be a link to another request. See if it is, and cancel that
@@ -7683,8 +7705,7 @@ static void io_cancel_defer_files(struct io_ring_ctx *ctx,

         spin_lock_irq(&ctx->completion_lock);
         list_for_each_entry_reverse(req, &ctx->defer_list, list) {
-                if ((req->flags & REQ_F_WORK_INITIALIZED)
-                        && req->work.files == files) {
+                if (io_match_link_files(req, files)) {
                         list_cut_position(&list, &ctx->defer_list, &req->list);
                         break;
                 }
         }
From: Jason Gunthorpe <jgg@nvidia.com>
[ Upstream commit d54f23c09ec62670901f1a2a4712a5218522ca2b ]
cma_process_remove() triggers an unconditional rdma_destroy_id() for internal_ids and skips the event delivery and the transition through RDMA_CM_DEVICE_REMOVAL.

This is confusing and unnecessary. An internal_id always has cma_listen_handler() as its handler, so have it catch the RDMA_CM_DEVICE_REMOVAL event, directly consume it, and signal removal.

This way the FSM sequence never skips the DEVICE_REMOVAL case and the logic in this hard-to-test area is simplified.
Link: https://lore.kernel.org/r/20200723070707.1771101-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/infiniband/core/cma.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index c30cf5307ce3e..537eeebde5f4d 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2482,6 +2482,10 @@ static int cma_listen_handler(struct rdma_cm_id *id,
 {
         struct rdma_id_private *id_priv = id->context;

+        /* Listening IDs are always destroyed on removal */
+        if (event->event == RDMA_CM_EVENT_DEVICE_REMOVAL)
+                return -1;
+
         id->context = id_priv->id.context;
         id->event_handler = id_priv->id.event_handler;
         trace_cm_event_handler(id_priv, event);
@@ -4829,7 +4833,7 @@ static void cma_process_remove(struct cma_device *cma_dev)
                 cma_id_get(id_priv);
                 mutex_unlock(&lock);

-                ret = id_priv->internal_id ? 1 : cma_remove_id_dev(id_priv);
+                ret = cma_remove_id_dev(id_priv);
                 cma_id_put(id_priv);
                 if (ret)
                         rdma_destroy_id(&id_priv->id);
From: Jason Gunthorpe <jgg@nvidia.com>
[ Upstream commit 3647a28de1ada8708efc78d956619b9df5004478 ]
Whenever an event is delivered to the handler it should be done under the handler_mutex, and any non-zero return from the handler should trigger destruction of the cm_id.

cma_process_remove() skips some steps here; that is not necessarily wrong, since the state change should prevent any races, but it is confusing and unnecessary.
Follow the standard pattern here, with the slight twist that the transition to RDMA_CM_DEVICE_REMOVAL includes a cma_cancel_operation().
Link: https://lore.kernel.org/r/20200723070707.1771101-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/infiniband/core/cma.c | 62 ++++++++++++++++++++---------------
 1 file changed, 36 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 537eeebde5f4d..04151c301e851 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1925,6 +1925,8 @@ static int cma_cm_event_handler(struct rdma_id_private *id_priv,
 {
         int ret;

+        lockdep_assert_held(&id_priv->handler_mutex);
+
         trace_cm_event_handler(id_priv, event);
         ret = id_priv->id.event_handler(&id_priv->id, event);
         trace_cm_event_done(id_priv, event, ret);
@@ -4793,50 +4795,58 @@ free_cma_dev:
         return ret;
 }

-static int cma_remove_id_dev(struct rdma_id_private *id_priv)
+static void cma_send_device_removal_put(struct rdma_id_private *id_priv)
 {
-        struct rdma_cm_event event = {};
+        struct rdma_cm_event event = { .event = RDMA_CM_EVENT_DEVICE_REMOVAL };
         enum rdma_cm_state state;
-        int ret = 0;
-
-        /* Record that we want to remove the device */
-        state = cma_exch(id_priv, RDMA_CM_DEVICE_REMOVAL);
-        if (state == RDMA_CM_DESTROYING)
-                return 0;
+        unsigned long flags;

-        cma_cancel_operation(id_priv, state);
         mutex_lock(&id_priv->handler_mutex);
+        /* Record that we want to remove the device */
+        spin_lock_irqsave(&id_priv->lock, flags);
+        state = id_priv->state;
+        if (state == RDMA_CM_DESTROYING || state == RDMA_CM_DEVICE_REMOVAL) {
+                spin_unlock_irqrestore(&id_priv->lock, flags);
+                mutex_unlock(&id_priv->handler_mutex);
+                cma_id_put(id_priv);
+                return;
+        }
+        id_priv->state = RDMA_CM_DEVICE_REMOVAL;
+        spin_unlock_irqrestore(&id_priv->lock, flags);

-        /* Check for destruction from another callback. */
-        if (!cma_comp(id_priv, RDMA_CM_DEVICE_REMOVAL))
-                goto out;
-
-        event.event = RDMA_CM_EVENT_DEVICE_REMOVAL;
-        ret = cma_cm_event_handler(id_priv, &event);
-out:
+        if (cma_cm_event_handler(id_priv, &event)) {
+                /*
+                 * At this point the ULP promises it won't call
+                 * rdma_destroy_id() concurrently
+                 */
+                cma_id_put(id_priv);
+                mutex_unlock(&id_priv->handler_mutex);
+                rdma_destroy_id(&id_priv->id);
+                return;
+        }
         mutex_unlock(&id_priv->handler_mutex);
-        return ret;
+
+        /*
+         * If this races with destroy then the thread that first assigns state
+         * to a destroying does the cancel.
+         */
+        cma_cancel_operation(id_priv, state);
+        cma_id_put(id_priv);
 }

 static void cma_process_remove(struct cma_device *cma_dev)
 {
-        struct rdma_id_private *id_priv;
-        int ret;
-
         mutex_lock(&lock);
         while (!list_empty(&cma_dev->id_list)) {
-                id_priv = list_entry(cma_dev->id_list.next,
-                                     struct rdma_id_private, list);
+                struct rdma_id_private *id_priv = list_first_entry(
+                        &cma_dev->id_list, struct rdma_id_private, list);

                 list_del(&id_priv->listen_list);
                 list_del_init(&id_priv->list);
                 cma_id_get(id_priv);
                 mutex_unlock(&lock);

-                ret = cma_remove_id_dev(id_priv);
-                cma_id_put(id_priv);
-                if (ret)
-                        rdma_destroy_id(&id_priv->id);
+                cma_send_device_removal_put(id_priv);

                 mutex_lock(&lock);
         }
From: Jason Gunthorpe <jgg@nvidia.com>
[ Upstream commit cc9c037343898eb7a775e6b81d092ee21eeff218 ]
The REQ flows are written as though, once the handler is called on the new cm_id, the ULP could trigger rdma_destroy_id() concurrently at any time.

However, this is not true: while the ULP can call rdma_destroy_id(), that call immediately blocks on the handler_mutex, which prevents anything harmful from running concurrently.

Remove the confusing extra locking and refcounts, and make it clearer that the handler_mutex is what protects the state during destroy.
Link: https://lore.kernel.org/r/20200723070707.1771101-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/infiniband/core/cma.c | 31 ++++++-------------------------
 1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 04151c301e851..11f43204fee77 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1831,21 +1831,21 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)

 void rdma_destroy_id(struct rdma_cm_id *id)
 {
-        struct rdma_id_private *id_priv;
+        struct rdma_id_private *id_priv =
+                container_of(id, struct rdma_id_private, id);
         enum rdma_cm_state state;

-        id_priv = container_of(id, struct rdma_id_private, id);
-        trace_cm_id_destroy(id_priv);
-        state = cma_exch(id_priv, RDMA_CM_DESTROYING);
-        cma_cancel_operation(id_priv, state);
-
         /*
          * Wait for any active callback to finish. New callbacks will find
          * the id_priv state set to destroying and abort.
          */
         mutex_lock(&id_priv->handler_mutex);
+        trace_cm_id_destroy(id_priv);
+        state = cma_exch(id_priv, RDMA_CM_DESTROYING);
         mutex_unlock(&id_priv->handler_mutex);

+        cma_cancel_operation(id_priv, state);
+
         rdma_restrack_del(&id_priv->res);
         if (id_priv->cma_dev) {
                 if (rdma_cap_ib_cm(id_priv->id.device, 1)) {
@@ -2205,19 +2205,9 @@ static int cma_ib_req_handler(struct ib_cm_id *cm_id,
         cm_id->context = conn_id;
         cm_id->cm_handler = cma_ib_handler;

-        /*
-         * Protect against the user destroying conn_id from another thread
-         * until we're done accessing it.
-         */
-        cma_id_get(conn_id);
         ret = cma_cm_event_handler(conn_id, &event);
         if (ret)
                 goto err3;
-        /*
-         * Acquire mutex to prevent user executing rdma_destroy_id()
-         * while we're accessing the cm_id.
-         */
-        mutex_lock(&lock);
         if (cma_comp(conn_id, RDMA_CM_CONNECT) &&
             (conn_id->id.qp_type != IB_QPT_UD)) {
                 trace_cm_send_mra(cm_id->context);
@@ -2226,13 +2216,11 @@ static int cma_ib_req_handler(struct ib_cm_id *cm_id,
         mutex_unlock(&lock);
         mutex_unlock(&conn_id->handler_mutex);
         mutex_unlock(&listen_id->handler_mutex);
-        cma_id_put(conn_id);
         if (net_dev)
                 dev_put(net_dev);
         return 0;

 err3:
-        cma_id_put(conn_id);
         /* Destroy the CM ID by returning a non-zero value. */
         conn_id->cm_id.ib = NULL;
 err2:
@@ -2409,11 +2397,6 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
         memcpy(cma_src_addr(conn_id), laddr, rdma_addr_size(laddr));
         memcpy(cma_dst_addr(conn_id), raddr, rdma_addr_size(raddr));

-        /*
-         * Protect against the user destroying conn_id from another thread
-         * until we're done accessing it.
-         */
-        cma_id_get(conn_id);
         ret = cma_cm_event_handler(conn_id, &event);
         if (ret) {
                 /* User wants to destroy the CM ID */
@@ -2421,13 +2404,11 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id,
                 cma_exch(conn_id, RDMA_CM_DESTROYING);
                 mutex_unlock(&conn_id->handler_mutex);
                 mutex_unlock(&listen_id->handler_mutex);
-                cma_id_put(conn_id);
                 rdma_destroy_id(&conn_id->id);
                 return ret;
         }

         mutex_unlock(&conn_id->handler_mutex);
-        cma_id_put(conn_id);

 out:
         mutex_unlock(&listen_id->handler_mutex);
From: Jason Gunthorpe <jgg@nvidia.com>
[ Upstream commit f6a9d47ae6854980fc4b1676f1fe9f9fa45ea4e2 ]
When a rdma_cm_id needs to be destroyed after a handler callback fails, part of the destruction pattern is open coded into each call site.
Unfortunately the blind assignment to state discards important information needed to do cma_cancel_operation(). This results in active operations being left running after rdma_destroy_id() completes, and in use-after-free bugs reported by KASAN.

Consolidate this entire pattern into destroy_id_handler_unlock() and manage the locking correctly. The state should be set to RDMA_CM_DESTROYING under the handler_mutex to atomically ensure no further handlers are called.
Link: https://lore.kernel.org/r/20200723070707.1771101-5-leon@kernel.org
Reported-by: syzbot+08092148130652a6faae@syzkaller.appspotmail.com
Reported-by: syzbot+a929647172775e335941@syzkaller.appspotmail.com
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/infiniband/core/cma.c | 174 ++++++++++++++++++++----------------------
 1 file changed, 84 insertions(+), 90 deletions(-)

--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -428,19 +428,6 @@ static int cma_comp_exch(struct rdma_id_
         return ret;
 }

-static enum rdma_cm_state cma_exch(struct rdma_id_private *id_priv,
-                                   enum rdma_cm_state exch)
-{
-        unsigned long flags;
-        enum rdma_cm_state old;
-
-        spin_lock_irqsave(&id_priv->lock, flags);
-        old = id_priv->state;
-        id_priv->state = exch;
-        spin_unlock_irqrestore(&id_priv->lock, flags);
-        return old;
-}
-
 static inline u8 cma_get_ip_ver(const struct cma_hdr *hdr)
 {
         return hdr->ip_version >> 4;
@@ -1829,21 +1816,9 @@ static void cma_leave_mc_groups(struct r
         }
 }

-void rdma_destroy_id(struct rdma_cm_id *id)
+static void _destroy_id(struct rdma_id_private *id_priv,
+                        enum rdma_cm_state state)
 {
-        struct rdma_id_private *id_priv =
-                container_of(id, struct rdma_id_private, id);
-        enum rdma_cm_state state;
-
-        /*
-         * Wait for any active callback to finish. New callbacks will find
-         * the id_priv state set to destroying and abort.
-         */
-        mutex_lock(&id_priv->handler_mutex);
-        trace_cm_id_destroy(id_priv);
-        state = cma_exch(id_priv, RDMA_CM_DESTROYING);
-        mutex_unlock(&id_priv->handler_mutex);
-
         cma_cancel_operation(id_priv, state);

         rdma_restrack_del(&id_priv->res);
@@ -1874,6 +1849,42 @@ void rdma_destroy_id(struct rdma_cm_id *
         put_net(id_priv->id.route.addr.dev_addr.net);
         kfree(id_priv);
 }
+
+/*
+ * destroy an ID from within the handler_mutex. This ensures that no other
+ * handlers can start running concurrently.
+ */
+static void destroy_id_handler_unlock(struct rdma_id_private *id_priv)
+        __releases(&idprv->handler_mutex)
+{
+        enum rdma_cm_state state;
+        unsigned long flags;
+
+        trace_cm_id_destroy(id_priv);
+
+        /*
+         * Setting the state to destroyed under the handler mutex provides a
+         * fence against calling handler callbacks. If this is invoked due to
+         * the failure of a handler callback then it guarentees that no future
+         * handlers will be called.
+         */
+        lockdep_assert_held(&id_priv->handler_mutex);
+        spin_lock_irqsave(&id_priv->lock, flags);
+        state = id_priv->state;
+        id_priv->state = RDMA_CM_DESTROYING;
+        spin_unlock_irqrestore(&id_priv->lock, flags);
+        mutex_unlock(&id_priv->handler_mutex);
+        _destroy_id(id_priv, state);
+}
+
+void rdma_destroy_id(struct rdma_cm_id *id)
+{
+        struct rdma_id_private *id_priv =
+                container_of(id, struct rdma_id_private, id);
+
+        mutex_lock(&id_priv->handler_mutex);
+        destroy_id_handler_unlock(id_priv);
+}
 EXPORT_SYMBOL(rdma_destroy_id);

 static int cma_rep_recv(struct rdma_id_private *id_priv)
@@ -1938,7 +1949,7 @@ static int cma_ib_handler(struct ib_cm_i
 {
         struct rdma_id_private *id_priv = cm_id->context;
         struct rdma_cm_event event = {};
-        int ret = 0;
+        int ret;

         mutex_lock(&id_priv->handler_mutex);
         if ((ib_event->event != IB_CM_TIMEWAIT_EXIT &&
@@ -2007,14 +2018,12 @@ static int cma_ib_handler(struct ib_cm_i
         if (ret) {
                 /* Destroy the CM ID by returning a non-zero value. */
                 id_priv->cm_id.ib = NULL;
-                cma_exch(id_priv, RDMA_CM_DESTROYING);
-                mutex_unlock(&id_priv->handler_mutex);
-                rdma_destroy_id(&id_priv->id);
+                destroy_id_handler_unlock(id_priv);
                 return ret;
         }
 out:
         mutex_unlock(&id_priv->handler_mutex);
-        return ret;
+        return 0;
 }

 static struct rdma_id_private *
@@ -2176,7 +2185,7 @@ static int cma_ib_req_handler(struct ib_
         mutex_lock(&listen_id->handler_mutex);
         if (listen_id->state != RDMA_CM_LISTEN) {
                 ret = -ECONNABORTED;
-                goto err1;
+                goto err_unlock;
         }

         offset = cma_user_data_offset(listen_id);
@@ -2193,43 +2202,38 @@ static int cma_ib_req_handler(struct ib_
         }
         if (!conn_id) {
                 ret = -ENOMEM;
-                goto err1;
+                goto err_unlock;
         }

         mutex_lock_nested(&conn_id->handler_mutex, SINGLE_DEPTH_NESTING);
         ret = cma_ib_acquire_dev(conn_id, listen_id, &req);
-        if (ret)
-                goto err2;
+        if (ret) {
+                destroy_id_handler_unlock(conn_id);
+                goto err_unlock;
+        }

         conn_id->cm_id.ib = cm_id;
         cm_id->context = conn_id;
         cm_id->cm_handler = cma_ib_handler;

         ret = cma_cm_event_handler(conn_id, &event);
-        if (ret)
-                goto err3;
+        if (ret) {
+                /* Destroy the CM ID by returning a non-zero value. */
+                conn_id->cm_id.ib = NULL;
+                mutex_unlock(&listen_id->handler_mutex);
+                destroy_id_handler_unlock(conn_id);
+                goto net_dev_put;
+        }
+
         if (cma_comp(conn_id, RDMA_CM_CONNECT) &&
             (conn_id->id.qp_type != IB_QPT_UD)) {
                 trace_cm_send_mra(cm_id->context);
                 ib_send_cm_mra(cm_id, CMA_CM_MRA_SETTING, NULL, 0);
         }
-        mutex_unlock(&lock);
         mutex_unlock(&conn_id->handler_mutex);
-        mutex_unlock(&listen_id->handler_mutex);
-        if (net_dev)
-                dev_put(net_dev);
-        return 0;

-err3:
-        /* Destroy the CM ID by returning a non-zero value. */
-        conn_id->cm_id.ib = NULL;
-err2:
-        cma_exch(conn_id, RDMA_CM_DESTROYING);
-        mutex_unlock(&conn_id->handler_mutex);
-err1:
+err_unlock:
         mutex_unlock(&listen_id->handler_mutex);
-        if (conn_id)
-                rdma_destroy_id(&conn_id->id);

 net_dev_put:
         if (net_dev)
@@ -2329,9 +2333,7 @@ static int cma_iw_handler(struct iw_cm_i
         if (ret) {
                 /* Destroy the CM ID by returning a non-zero value. */
                 id_priv->cm_id.iw = NULL;
-                cma_exch(id_priv, RDMA_CM_DESTROYING);
-                mutex_unlock(&id_priv->handler_mutex);
-                rdma_destroy_id(&id_priv->id);
+                destroy_id_handler_unlock(id_priv);
                 return ret;
         }
@@ -2378,16 +2380,16 @@ static int iw_conn_req_handler(struct iw
         ret = rdma_translate_ip(laddr, &conn_id->id.route.addr.dev_addr);
         if (ret) {
-                mutex_unlock(&conn_id->handler_mutex);
-                rdma_destroy_id(new_cm_id);
-                goto out;
+                mutex_unlock(&listen_id->handler_mutex);
+                destroy_id_handler_unlock(conn_id);
+                return ret;
         }

         ret = cma_iw_acquire_dev(conn_id, listen_id);
         if (ret) {
-                mutex_unlock(&conn_id->handler_mutex);
-                rdma_destroy_id(new_cm_id);
-                goto out;
+                mutex_unlock(&listen_id->handler_mutex);
+                destroy_id_handler_unlock(conn_id);
+                return ret;
         }

         conn_id->cm_id.iw = cm_id;
@@ -2401,10 +2403,8 @@ static int iw_conn_req_handler(struct iw
         if (ret) {
                 /* User wants to destroy the CM ID */
                 conn_id->cm_id.iw = NULL;
-                cma_exch(conn_id, RDMA_CM_DESTROYING);
-                mutex_unlock(&conn_id->handler_mutex);
                 mutex_unlock(&listen_id->handler_mutex);
-                rdma_destroy_id(&conn_id->id);
+                destroy_id_handler_unlock(conn_id);
                 return ret;
         }

@@ -2644,21 +2644,21 @@ static void cma_work_handler(struct work
 {
         struct cma_work *work = container_of(_work, struct cma_work, work);
         struct rdma_id_private *id_priv = work->id;
-        int destroy = 0;

         mutex_lock(&id_priv->handler_mutex);
         if (!cma_comp_exch(id_priv, work->old_state, work->new_state))
-                goto out;
+                goto out_unlock;

         if (cma_cm_event_handler(id_priv, &work->event)) {
-                cma_exch(id_priv, RDMA_CM_DESTROYING);
-                destroy = 1;
+                cma_id_put(id_priv);
+                destroy_id_handler_unlock(id_priv);
+                goto out_free;
         }
-out:
+
+out_unlock:
         mutex_unlock(&id_priv->handler_mutex);
         cma_id_put(id_priv);
-        if (destroy)
-                rdma_destroy_id(&id_priv->id);
+out_free:
         kfree(work);
 }

@@ -2666,23 +2666,22 @@ static void cma_ndev_work_handler(struct
 {
         struct cma_ndev_work *work =
                 container_of(_work, struct cma_ndev_work, work);
         struct rdma_id_private *id_priv = work->id;
-        int destroy = 0;

         mutex_lock(&id_priv->handler_mutex);
         if (id_priv->state == RDMA_CM_DESTROYING ||
             id_priv->state == RDMA_CM_DEVICE_REMOVAL)
-                goto out;
+                goto out_unlock;

         if (cma_cm_event_handler(id_priv, &work->event)) {
-                cma_exch(id_priv, RDMA_CM_DESTROYING);
-                destroy = 1;
+                cma_id_put(id_priv);
+                destroy_id_handler_unlock(id_priv);
+                goto out_free;
         }

-out:
+out_unlock:
         mutex_unlock(&id_priv->handler_mutex);
         cma_id_put(id_priv);
-        if (destroy)
-                rdma_destroy_id(&id_priv->id);
+out_free:
         kfree(work);
 }

@@ -3158,9 +3157,7 @@ static void addr_handler(int status, str
                 event.event = RDMA_CM_EVENT_ADDR_RESOLVED;

         if (cma_cm_event_handler(id_priv, &event)) {
-                cma_exch(id_priv, RDMA_CM_DESTROYING);
-                mutex_unlock(&id_priv->handler_mutex);
-                rdma_destroy_id(&id_priv->id);
+                destroy_id_handler_unlock(id_priv);
                 return;
         }
 out:
@@ -3777,7 +3774,7 @@ static int cma_sidr_rep_handler(struct i
         struct rdma_cm_event event = {};
         const struct ib_cm_sidr_rep_event_param *rep =
                 &ib_event->param.sidr_rep_rcvd;
-        int ret = 0;
+        int ret;

         mutex_lock(&id_priv->handler_mutex);
         if (id_priv->state != RDMA_CM_CONNECT)
@@ -3827,14 +3824,12 @@ static int cma_sidr_rep_handler(struct i
         if (ret) {
                 /* Destroy the CM ID by returning a non-zero value. */
                 id_priv->cm_id.ib = NULL;
-                cma_exch(id_priv, RDMA_CM_DESTROYING);
-                mutex_unlock(&id_priv->handler_mutex);
-                rdma_destroy_id(&id_priv->id);
+                destroy_id_handler_unlock(id_priv);
                 return ret;
         }
 out:
         mutex_unlock(&id_priv->handler_mutex);
-        return ret;
+        return 0;
 }

 static int cma_resolve_ib_udp(struct rdma_id_private *id_priv,
@@ -4359,9 +4354,7 @@ static int cma_ib_mc_handler(int status,

         rdma_destroy_ah_attr(&event.param.ud.ah_attr);
         if (ret) {
-                cma_exch(id_priv, RDMA_CM_DESTROYING);
-                mutex_unlock(&id_priv->handler_mutex);
-                rdma_destroy_id(&id_priv->id);
+                destroy_id_handler_unlock(id_priv);
                 return 0;
         }

@@ -4802,7 +4795,8 @@ static void cma_send_device_removal_put(
                  */
                 cma_id_put(id_priv);
                 mutex_unlock(&id_priv->handler_mutex);
-                rdma_destroy_id(&id_priv->id);
+                trace_cm_id_destroy(id_priv);
+                _destroy_id(id_priv, state);
                 return;
         }
         mutex_unlock(&id_priv->handler_mutex);
From: Ido Schimmel <idosch@nvidia.com>
[ Upstream commit 7f6f32bb7d3355cd78ebf1dece9a6ea7a0ca8158 ]
fib_info_notify_update() is always called with RTNL held, but not from an RCU read-side critical section. This leads to the following warning [1] when the FIB table list is traversed with hlist_for_each_entry_rcu(), but without a proper lockdep expression.
Since modification of the list is protected by RTNL, silence the warning by adding a lockdep expression which verifies RTNL is held.
[1]

=============================
WARNING: suspicious RCU usage
5.9.0-rc1-custom-14233-g2f26e122d62f #129 Not tainted
-----------------------------
net/ipv4/fib_trie.c:2124 RCU-list traversed in non-reader section!!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ip/834:
 #0: ffffffff85a3b6b0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x49a/0xbd0

stack backtrace:
CPU: 0 PID: 834 Comm: ip Not tainted 5.9.0-rc1-custom-14233-g2f26e122d62f #129
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
 dump_stack+0x100/0x184
 lockdep_rcu_suspicious+0x143/0x14d
 fib_info_notify_update+0x8d1/0xa60
 __nexthop_replace_notify+0xd2/0x290
 rtm_new_nexthop+0x35e2/0x5946
 rtnetlink_rcv_msg+0x4f7/0xbd0
 netlink_rcv_skb+0x17a/0x480
 rtnetlink_rcv+0x22/0x30
 netlink_unicast+0x5ae/0x890
 netlink_sendmsg+0x98a/0xf40
 ____sys_sendmsg+0x879/0xa00
 ___sys_sendmsg+0x122/0x190
 __sys_sendmsg+0x103/0x1d0
 __x64_sys_sendmsg+0x7d/0xb0
 do_syscall_64+0x32/0x50
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fde28c3be57
Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
RSP: 002b:00007ffc09330028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fde28c3be57
RDX: 0000000000000000 RSI: 00007ffc09330090 RDI: 0000000000000003
RBP: 000000005f45f911 R08: 0000000000000001 R09: 00007ffc0933012c
R10: 0000000000000076 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffc09330290 R14: 00007ffc09330eee R15: 00005610e48ed020
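For readers unfamiliar with the annotation the fix below relies on: hlist_for_each_entry_rcu() accepts an optional lockdep expression describing the non-RCU protection that makes the traversal safe. A generic sketch of that shape, with a hypothetical walker function (not the fib_trie code itself):

#include <linux/rculist.h>
#include <linux/rtnetlink.h>
#include <net/ip_fib.h>

/* hypothetical walker: the list is only modified under RTNL, so holding
 * RTNL (rather than rcu_read_lock()) is a legitimate way to traverse it,
 * and the lockdep expression tells the RCU checker exactly that.
 */
static void walk_tables(struct hlist_head *head)
{
        struct fib_table *tb;

        ASSERT_RTNL();
        hlist_for_each_entry_rcu(tb, head, tb_hlist, lockdep_rtnl_is_held()) {
                /* ... operate on tb ... */
        }
}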
Fixes: 1bff1a0c9bbd ("ipv4: Add function to send route updates")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/fib_trie.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -2121,7 +2121,8 @@ void fib_info_notify_update(struct net *
                 struct hlist_head *head = &net->ipv4.fib_table_hash[h];
                 struct fib_table *tb;

-                hlist_for_each_entry_rcu(tb, head, tb_hlist)
+                hlist_for_each_entry_rcu(tb, head, tb_hlist,
+                                         lockdep_rtnl_is_held())
                         __fib_info_notify_update(net, tb, info);
         }
 }
From: Ido Schimmel <idosch@nvidia.com>
[ Upstream commit 05d4487197b2b71d5363623c28924fd58c71c0b6 ]
The cited commit added '2' as a possible value for this sysctl, but it could not actually be set because the maximum stayed at '1'. Fix it by adjusting the maximum value to '2'. This is consistent with the corresponding IPv4 sysctl.
Before:
# sysctl -w net.ipv6.fib_multipath_hash_policy=2
sysctl: setting key "net.ipv6.fib_multipath_hash_policy": Invalid argument
net.ipv6.fib_multipath_hash_policy = 2
# sysctl net.ipv6.fib_multipath_hash_policy
net.ipv6.fib_multipath_hash_policy = 0
After:
# sysctl -w net.ipv6.fib_multipath_hash_policy=2
net.ipv6.fib_multipath_hash_policy = 2
# sysctl net.ipv6.fib_multipath_hash_policy
net.ipv6.fib_multipath_hash_policy = 2

Fixes: d8f74f0975d8 ("ipv6: Support multipath hashing on inner IP pkts")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/sysctl_net_ipv6.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/ipv6/sysctl_net_ipv6.c
+++ b/net/ipv6/sysctl_net_ipv6.c
@@ -21,6 +21,7 @@
 #include <net/calipso.h>
 #endif

+static int two = 2;
 static int flowlabel_reflect_max = 0x7;
 static int auto_flowlabels_min;
 static int auto_flowlabels_max = IP6_AUTO_FLOW_LABEL_MAX;
@@ -150,7 +151,7 @@ static struct ctl_table ipv6_table_templ
                 .mode           = 0644,
                 .proc_handler   = proc_rt6_multipath_hash_policy,
                 .extra1         = SYSCTL_ZERO,
-                .extra2         = SYSCTL_ONE,
+                .extra2         = &two,
         },
         {
                 .procname       = "seg6_flowlabel",
From: Paul Moore <paul@paul-moore.com>
[ Upstream commit d3b990b7f327e2afa98006e7666fb8ada8ed8683 ]
This patch fixes two main problems seen when removing NetLabel mappings: memory leaks and potentially extra audit noise.
The memory leaks are caused by not properly freeing the mapping's address selector struct when freeing the entire entry, and by not properly cleaning up a temporary mapping entry when adding new address selectors to an existing entry. This patch fixes both problems such that kmemleak reports no NetLabel-associated leaks after running the SELinux test suite.

The potentially extra audit noise was caused by the auditing code in netlbl_domhsh_remove_entry() being called regardless of the entry's validity. If another thread had already marked the entry as invalid, but not yet removed/freed it from the list of mappings, it was possible that an additional mapping-removal audit record would be generated. This patch fixes this by returning early from the removal function when the entry was previously marked invalid. This change also has the side benefit of decreasing the indentation level of a large chunk of code by one (accounting for most of the diffstat).
Fixes: 63c416887437 ("netlabel: Add network address selectors to the NetLabel/LSM domain mapping")
Reported-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/netlabel/netlabel_domainhash.c | 59 ++++++++++++++++++-------------------
 1 file changed, 30 insertions(+), 29 deletions(-)

--- a/net/netlabel/netlabel_domainhash.c
+++ b/net/netlabel/netlabel_domainhash.c
@@ -85,6 +85,7 @@ static void netlbl_domhsh_free_entry(str
                         kfree(netlbl_domhsh_addr6_entry(iter6));
                 }
 #endif /* IPv6 */
+                kfree(ptr->def.addrsel);
         }
         kfree(ptr->domain);
         kfree(ptr);
@@ -537,6 +538,8 @@ int netlbl_domhsh_add(struct netlbl_dom_
                         goto add_return;
                 }
 #endif /* IPv6 */
+                /* cleanup the new entry since we've moved everything over */
+                netlbl_domhsh_free_entry(&entry->rcu);
         } else
                 ret_val = -EINVAL;

@@ -580,6 +583,12 @@ int netlbl_domhsh_remove_entry(struct ne
 {
         int ret_val = 0;
         struct audit_buffer *audit_buf;
+        struct netlbl_af4list *iter4;
+        struct netlbl_domaddr4_map *map4;
+#if IS_ENABLED(CONFIG_IPV6)
+        struct netlbl_af6list *iter6;
+        struct netlbl_domaddr6_map *map6;
+#endif /* IPv6 */

         if (entry == NULL)
                 return -ENOENT;
@@ -597,6 +606,9 @@ int netlbl_domhsh_remove_entry(struct ne
                 ret_val = -ENOENT;
         spin_unlock(&netlbl_domhsh_lock);

+        if (ret_val)
+                return ret_val;
+
         audit_buf = netlbl_audit_start_common(AUDIT_MAC_MAP_DEL, audit_info);
         if (audit_buf != NULL) {
                 audit_log_format(audit_buf,
@@ -606,40 +618,29 @@ int netlbl_domhsh_remove_entry(struct ne
                 audit_log_end(audit_buf);
         }

-        if (ret_val == 0) {
-                struct netlbl_af4list *iter4;
-                struct netlbl_domaddr4_map *map4;
-#if IS_ENABLED(CONFIG_IPV6)
-                struct netlbl_af6list *iter6;
-                struct netlbl_domaddr6_map *map6;
-#endif /* IPv6 */
-
-                switch (entry->def.type) {
-                case NETLBL_NLTYPE_ADDRSELECT:
-                        netlbl_af4list_foreach_rcu(iter4,
-                                                   &entry->def.addrsel->list4) {
-                                map4 = netlbl_domhsh_addr4_entry(iter4);
-                                cipso_v4_doi_putdef(map4->def.cipso);
-                        }
+        switch (entry->def.type) {
+        case NETLBL_NLTYPE_ADDRSELECT:
+                netlbl_af4list_foreach_rcu(iter4, &entry->def.addrsel->list4) {
+                        map4 = netlbl_domhsh_addr4_entry(iter4);
+                        cipso_v4_doi_putdef(map4->def.cipso);
+                }
 #if IS_ENABLED(CONFIG_IPV6)
-                        netlbl_af6list_foreach_rcu(iter6,
-                                                   &entry->def.addrsel->list6) {
-                                map6 = netlbl_domhsh_addr6_entry(iter6);
-                                calipso_doi_putdef(map6->def.calipso);
-                        }
+                netlbl_af6list_foreach_rcu(iter6, &entry->def.addrsel->list6) {
+                        map6 = netlbl_domhsh_addr6_entry(iter6);
+                        calipso_doi_putdef(map6->def.calipso);
+                }
 #endif /* IPv6 */
-                        break;
-                case NETLBL_NLTYPE_CIPSOV4:
-                        cipso_v4_doi_putdef(entry->def.cipso);
-                        break;
+                break;
+        case NETLBL_NLTYPE_CIPSOV4:
+                cipso_v4_doi_putdef(entry->def.cipso);
+                break;
 #if IS_ENABLED(CONFIG_IPV6)
-                case NETLBL_NLTYPE_CALIPSO:
-                        calipso_doi_putdef(entry->def.calipso);
-                        break;
+        case NETLBL_NLTYPE_CALIPSO:
+                calipso_doi_putdef(entry->def.calipso);
+                break;
 #endif /* IPv6 */
-                }
-                call_rcu(&entry->rcu, netlbl_domhsh_free_entry);
         }
+        call_rcu(&entry->rcu, netlbl_domhsh_free_entry);

         return ret_val;
 }
From: Kamil Lorenc <kamil@re-ws.pl>
[ Upstream commit a609d0259183a841621f252e067f40f8cc25d6f6 ]
Keenetic Plus DSL is an xDSL modem that uses the dm9620 as its USB interface.
Signed-off-by: Kamil Lorenc <kamil@re-ws.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/usb/dm9601.c | 4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/net/usb/dm9601.c
+++ b/drivers/net/usb/dm9601.c
@@ -625,6 +625,10 @@ static const struct usb_device_id produc
         USB_DEVICE(0x0a46, 0x1269),     /* DM9621A USB to Fast Ethernet Adapter */
         .driver_info = (unsigned long)&dm9601_info,
         },
+        {
+        USB_DEVICE(0x0586, 0x3427),     /* ZyXEL Keenetic Plus DSL xDSL modem */
+        .driver_info = (unsigned long)&dm9601_info,
+        },
         {},                     // END
 };
From: Xin Long <lucien.xin@gmail.com>
[ Upstream commit 3106ecb43a05dc3e009779764b9da245a5d082de ]
With bh disabled across the whole of sctp_get_port_local(), when snum == 0 and too many ports are already in use, the do-while loop can hold the CPU for a long time and trigger a soft lockup:
[ ] watchdog: BUG: soft lockup - CPU#11 stuck for 22s!
[ ] RIP: 0010:native_queued_spin_lock_slowpath+0x4de/0x940
[ ] Call Trace:
[ ]  _raw_spin_lock+0xc1/0xd0
[ ]  sctp_get_port_local+0x527/0x650 [sctp]
[ ]  sctp_do_bind+0x208/0x5e0 [sctp]
[ ]  sctp_autobind+0x165/0x1e0 [sctp]
[ ]  sctp_connect_new_asoc+0x355/0x480 [sctp]
[ ]  __sctp_connect+0x360/0xb10 [sctp]
There's no need to disable bh across the whole of sctp_get_port_local(), so fix the lockup by removing the local_bh_disable() call at the beginning and using spin_lock_bh() instead.

The same thing was done for inet_csk_get_port() in commit ea8add2b1903 ("tcp/dccp: better use of ephemeral ports in bind()").
Thanks to Marcelo for pointing the buggy code out.
v1->v2:
  - use cond_resched() to yield the cpu to other tasks if needed, as Eric noticed.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sctp/socket.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -8297,8 +8297,6 @@ static int sctp_get_port_local(struct so

         pr_debug("%s: begins, snum:%d\n", __func__, snum);

-        local_bh_disable();
-
         if (snum == 0) {
                 /* Search for an available port. */
                 int low, high, remaining, index;
@@ -8316,20 +8314,21 @@ static int sctp_get_port_local(struct so
                                 continue;
                         index = sctp_phashfn(net, rover);
                         head = &sctp_port_hashtable[index];
-                        spin_lock(&head->lock);
+                        spin_lock_bh(&head->lock);
                         sctp_for_each_hentry(pp, &head->chain)
                                 if ((pp->port == rover) &&
                                     net_eq(net, pp->net))
                                         goto next;
                         break;
                 next:
-                        spin_unlock(&head->lock);
+                        spin_unlock_bh(&head->lock);
+                        cond_resched();
                 } while (--remaining > 0);

                 /* Exhausted local port range during search? */
                 ret = 1;
                 if (remaining <= 0)
-                        goto fail;
+                        return ret;

                 /* OK, here is the one we will use.  HEAD (the port
                  * hash table list entry) is non-NULL and we hold it's
@@ -8344,7 +8343,7 @@ static int sctp_get_port_local(struct so
                  * port iterator, pp being NULL.
                  */
                 head = &sctp_port_hashtable[sctp_phashfn(net, snum)];
-                spin_lock(&head->lock);
+                spin_lock_bh(&head->lock);
                 sctp_for_each_hentry(pp, &head->chain) {
                         if ((pp->port == snum) && net_eq(pp->net, net))
                                 goto pp_found;
@@ -8444,10 +8443,7 @@ success:
         ret = 0;

 fail_unlock:
-        spin_unlock(&head->lock);
-
-fail:
-        local_bh_enable();
+        spin_unlock_bh(&head->lock);
         return ret;
 }
From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
[ Upstream commit 09e31cf0c528dac3358a081dc4e773d1b3de1bc9 ]
Since commit 9c66d1564676 ("taprio: Add support for hardware offloading") there's a bit of inconsistency when offloading schedules to the hardware:
In software mode, the gate masks are specified in terms of traffic classes: "sched-entry S 03 20000" means that traffic classes 0 and 1 are open for 20us. When taprio is offloaded to hardware, however, the gate masks are specified in terms of hardware queues.

The idea here is to fix hardware offloading so that schedules in hardware and software mode have the same behavior. What needs to be done is to map traffic classes to queues when applying the offload to the driver.
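To make the mapping concrete, here is a purely illustrative userspace program that mirrors the arithmetic of the tc_map_to_queue_mask() helper added below; the two-traffic-class queue layout is a made-up example, not taken from the patch:

#include <stdio.h>
#include <stdint.h>

#define BIT(n)          (1u << (n))
#define GENMASK(h, l)   (((~0u) << (l)) & (~0u >> (31 - (h))))

struct txq_map { uint32_t offset, count; };

int main(void)
{
        /* hypothetical layout: TC0 -> queues 0-1, TC1 -> queues 2-3 */
        struct txq_map tc_to_txq[] = { {0, 2}, {2, 2} };
        uint32_t tc_mask = 0x3; /* "sched-entry S 03 ...": TC0 and TC1 open */
        uint32_t queue_mask = 0;
        int i;

        for (i = 0; i < 2; i++) {
                if (!(tc_mask & BIT(i)))
                        continue;
                queue_mask |= GENMASK(tc_to_txq[i].offset + tc_to_txq[i].count - 1,
                                      tc_to_txq[i].offset);
        }
        printf("queue_mask = 0x%x\n", queue_mask); /* prints 0xf */
        return 0;
}

With TC0 on queues 0-1 and TC1 on queues 2-3, the traffic-class gate mask 0x3 becomes the hardware queue mask 0xf, so the offloaded schedule opens the same traffic as software mode does.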
Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/sch_taprio.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -1177,9 +1177,27 @@ static void taprio_offload_config_change
         spin_unlock(&q->current_entry_lock);
 }

-static void taprio_sched_to_offload(struct taprio_sched *q,
+static u32 tc_map_to_queue_mask(struct net_device *dev, u32 tc_mask)
+{
+        u32 i, queue_mask = 0;
+
+        for (i = 0; i < dev->num_tc; i++) {
+                u32 offset, count;
+
+                if (!(tc_mask & BIT(i)))
+                        continue;
+
+                offset = dev->tc_to_txq[i].offset;
+                count = dev->tc_to_txq[i].count;
+
+                queue_mask |= GENMASK(offset + count - 1, offset);
+        }
+
+        return queue_mask;
+}
+
+static void taprio_sched_to_offload(struct net_device *dev,
                                     struct sched_gate_list *sched,
-                                    const struct tc_mqprio_qopt *mqprio,
                                     struct tc_taprio_qopt_offload *offload)
 {
         struct sched_entry *entry;
@@ -1194,7 +1212,8 @@ static void taprio_sched_to_offload(stru

                 e->command = entry->command;
                 e->interval = entry->interval;
-                e->gate_mask = entry->gate_mask;
+                e->gate_mask = tc_map_to_queue_mask(dev, entry->gate_mask);
+
                 i++;
         }

@@ -1202,7 +1221,6 @@ static void taprio_sched_to_offload(stru
 }

 static int taprio_enable_offload(struct net_device *dev,
-                                 struct tc_mqprio_qopt *mqprio,
                                  struct taprio_sched *q,
                                  struct sched_gate_list *sched,
                                  struct netlink_ext_ack *extack)
@@ -1224,7 +1242,7 @@ static int taprio_enable_offload(struct
                 return -ENOMEM;
         }
         offload->enable = 1;
-        taprio_sched_to_offload(q, sched, mqprio, offload);
+        taprio_sched_to_offload(dev, sched, offload);

         err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
         if (err < 0) {
@@ -1486,7 +1504,7 @@ static int taprio_change(struct Qdisc *s
         }

         if (FULL_OFFLOAD_IS_ENABLED(q->flags))
-                err = taprio_enable_offload(dev, mqprio, q, new_admin, extack);
+                err = taprio_enable_offload(dev, q, new_admin, extack);
         else
                 err = taprio_disable_offload(dev, q, extack);
         if (err)
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
[ Upstream commit 2a63866c8b51a3f72cea388dfac259d0e14c4ba6 ]
syzbot is reporting a hung task at nbd_ioctl() [1], caused by two problems in the shutdown() operation of TIPC's connectionless sockets.
----------
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <linux/nbd.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        const int fd = open("/dev/nbd0", 3);

        alarm(5);
        ioctl(fd, NBD_SET_SOCK, socket(PF_TIPC, SOCK_DGRAM, 0));
        ioctl(fd, NBD_DO_IT, 0); /* To be interrupted by SIGALRM. */
        return 0;
}
----------
One problem is that the wait_for_completion() (from flush_workqueue() from nbd_start_device_ioctl() from nbd_ioctl()) cannot complete when nbd_start_device_ioctl() receives a signal at wait_event_interruptible(): the tipc_shutdown() call (from kernel_sock_shutdown(SHUT_RDWR) from nbd_mark_nsock_dead() from sock_shutdown() from nbd_start_device_ioctl()) fails to wake up a WQ thread sleeping at wait_woken() (from tipc_wait_for_rcvmsg() from sock_recvmsg() from sock_xmit() from nbd_read_stat() from recv_work(), which was scheduled by nbd_start_device() from nbd_start_device_ioctl()). Fix this problem by always invoking sk->sk_state_change() (like inet_shutdown() does) when tipc_shutdown() is called.
The other problem is that tipc_wait_for_rcvmsg() cannot return when tipc_shutdown() is called, because tipc_shutdown() sets sk->sk_shutdown to SEND_SHUTDOWN (even though "how" is SHUT_RDWR), while tipc_wait_for_rcvmsg() needs sk->sk_shutdown set to RCV_SHUTDOWN or SHUTDOWN_MASK. Fix this problem by setting sk->sk_shutdown to SHUTDOWN_MASK (like inet_shutdown() does) when the socket is connectionless.
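For reference, the shutdown bits involved are defined in include/net/sock.h; the explanatory comment here is mine, not from the patch:

#define RCV_SHUTDOWN    1
#define SEND_SHUTDOWN   2
#define SHUTDOWN_MASK   3       /* RCV_SHUTDOWN | SEND_SHUTDOWN */

/*
 * tipc_wait_for_rcvmsg() keeps sleeping until RCV_SHUTDOWN is set, so
 * writing only SEND_SHUTDOWN on a SHUT_RDWR request strands receivers;
 * SHUTDOWN_MASK marks both directions as shut down at once.
 */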
[1] https://syzkaller.appspot.com/bug?id=3fe51d307c1f0a845485cf1798aa059d12bf18b...
Reported-by: syzbot <syzbot+e36f41d207137b5d12f7@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/tipc/socket.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -2773,18 +2773,21 @@ static int tipc_shutdown(struct socket *

         trace_tipc_sk_shutdown(sk, NULL, TIPC_DUMP_ALL, " ");
         __tipc_shutdown(sock, TIPC_CONN_SHUTDOWN);
-        sk->sk_shutdown = SEND_SHUTDOWN;
+        if (tipc_sk_type_connectionless(sk))
+                sk->sk_shutdown = SHUTDOWN_MASK;
+        else
+                sk->sk_shutdown = SEND_SHUTDOWN;

         if (sk->sk_state == TIPC_DISCONNECTING) {
                 /* Discard any unreceived messages */
                 __skb_queue_purge(&sk->sk_receive_queue);

-                /* Wake up anyone sleeping in poll */
-                sk->sk_state_change(sk);
                 res = 0;
         } else {
                 res = -ENOTCONN;
         }
+        /* Wake up anyone sleeping in poll. */
+        sk->sk_state_change(sk);

         release_sock(sk);
         return res;
From: Tuong Lien <tuong.t.lien@dektech.com.au>
[ Upstream commit bb8872a1e6bc911869a729240781076ed950764b ]
this_cpu_ptr() is used to obtain the AEAD key's TFM on the current CPU for encryption; however, the code runs in preemptible (user-space) context, so a 'using smp_processor_id() in preemptible' warning has been observed.

We fix the issue by using the get/put_cpu_ptr() API instead, which brackets the access with a preempt_disable()/preempt_enable() pair.
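The difference between the two accessors, shown as a generic sketch with hypothetical names (not the tipc code itself):

#include <linux/percpu.h>

static DEFINE_PER_CPU(int, hypothetical_counter);

static void update(void)
{
        int *p;

        /* A bare this_cpu_ptr(&hypothetical_counter) here, called from
         * preemptible process context with CONFIG_DEBUG_PREEMPT, is what
         * produces the "using smp_processor_id() in preemptible" splat.
         */
        p = get_cpu_ptr(&hypothetical_counter); /* preempt_disable() + this_cpu_ptr() */
        (*p)++;                                 /* cannot migrate CPUs here */
        put_cpu_ptr(&hypothetical_counter);     /* preempt_enable() */
}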
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/tipc/crypto.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

--- a/net/tipc/crypto.c
+++ b/net/tipc/crypto.c
@@ -326,7 +326,8 @@ static void tipc_aead_free(struct rcu_he
         if (aead->cloned) {
                 tipc_aead_put(aead->cloned);
         } else {
-                head = *this_cpu_ptr(aead->tfm_entry);
+                head = *get_cpu_ptr(aead->tfm_entry);
+                put_cpu_ptr(aead->tfm_entry);
                 list_for_each_entry_safe(tfm_entry, tmp, &head->list, list) {
                         crypto_free_aead(tfm_entry->tfm);
                         list_del(&tfm_entry->list);
@@ -399,10 +400,15 @@ static void tipc_aead_users_set(struct t
  */
 static struct crypto_aead *tipc_aead_tfm_next(struct tipc_aead *aead)
 {
-        struct tipc_tfm **tfm_entry = this_cpu_ptr(aead->tfm_entry);
+        struct tipc_tfm **tfm_entry;
+        struct crypto_aead *tfm;

+        tfm_entry = get_cpu_ptr(aead->tfm_entry);
         *tfm_entry = list_next_entry(*tfm_entry, list);
-        return (*tfm_entry)->tfm;
+        tfm = (*tfm_entry)->tfm;
+        put_cpu_ptr(tfm_entry);
+
+        return tfm;
 }
/**
From: Jakub Kicinski <kuba@kernel.org>
[ Upstream commit 96e97bc07e90f175a8980a22827faf702ca4cb30 ]
napi_disable() makes sure to set the NAPI_STATE_NPSVC bit to prevent netpoll from accessing rings before init is complete. However, the same is not done for fresh napi instances in netif_napi_add(), even though we expect NAPI instances to be added as disabled.
This causes crashes during driver reconfiguration (enabling XDP, changing the channel count) if there is any printk() after netif_napi_add() but before napi_enable(): with netconsole active, the printk() is routed through netpoll, which then polls the freshly added but not yet initialized NAPI instance.
To ensure memory ordering is correct we need to use RCU accessors.
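The general shape of the fix, as a sketch with made-up names rather than the netpoll code itself: initialize the object fully, publish it with list_add_rcu(), and traverse with list_for_each_entry_rcu() so lockless readers are ordered against the publication:

#include <linux/rculist.h>
#include <linux/rcupdate.h>

struct item {
        int ready;
        struct list_head node;
};

static LIST_HEAD(items);

static void publish(struct item *it)
{
        it->ready = 1;                          /* initialize first ... */
        list_add_rcu(&it->node, &items);        /* ... then publish (release) */
}

static void scan(void)
{
        struct item *it;

        rcu_read_lock();
        list_for_each_entry_rcu(it, &items, node)
                WARN_ON(!it->ready);    /* readers never see a half-built item */
        rcu_read_unlock();
}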
Reported-by: Rob Sherwood <rsher@fb.com>
Fixes: 2d8bff12699a ("netpoll: Close race condition between poll_one_napi and napi_disable")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/dev.c     | 3 ++-
 net/core/netpoll.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6609,12 +6609,13 @@ void netif_napi_add(struct net_device *d
                 netdev_err_once(dev, "%s() called with weight %d\n", __func__,
                                 weight);
         napi->weight = weight;
-        list_add(&napi->dev_list, &dev->napi_list);
         napi->dev = dev;
 #ifdef CONFIG_NETPOLL
         napi->poll_owner = -1;
 #endif
         set_bit(NAPI_STATE_SCHED, &napi->state);
+        set_bit(NAPI_STATE_NPSVC, &napi->state);
+        list_add_rcu(&napi->dev_list, &dev->napi_list);
         napi_hash_add(napi);
 }
 EXPORT_SYMBOL(netif_napi_add);
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -162,7 +162,7 @@ static void poll_napi(struct net_device
         struct napi_struct *napi;
         int cpu = smp_processor_id();

-        list_for_each_entry(napi, &dev->napi_list, dev_list) {
+        list_for_each_entry_rcu(napi, &dev->napi_list, dev_list) {
                 if (cmpxchg(&napi->poll_owner, -1, cpu) == -1) {
                         poll_one_napi(napi);
                         smp_store_release(&napi->poll_owner, -1);
From: Florian Westphal <fw@strlen.de>
[ Upstream commit 1cec170d458b1d18f6f1654ca84c0804a701c5ef ]
After the subflow lock is dropped, more wmem might have become available.

This fixes a deadlock in mptcp_connect.sh 'mmap' mode: wmem is exhausted, but because the mptcp socket holds on to already-acked data (for retransmit), no wakeup will occur.

Using 'goto restart' calls mptcp_clean_una(sk), which will free pages that have been fully acked in the meantime.
Fixes: fb529e62d3f3 ("mptcp: break and restart in case mptcp sndbuf is full")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/mptcp/protocol.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -772,7 +772,6 @@ fallback:
 restart:
         mptcp_clean_una(sk);

-wait_for_sndbuf:
         __mptcp_flush_join_list(msk);
         ssk = mptcp_subflow_get_send(msk);
         while (!sk_stream_memory_free(sk) ||
@@ -873,7 +872,7 @@ wait_for_sndbuf:
                  */
                 mptcp_set_timeout(sk, ssk);
                 release_sock(ssk);
-                goto wait_for_sndbuf;
+                goto restart;
                 }
         }
 }
On Fri, 11 Sep 2020 14:47:17 +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.8.9 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 13 Sep 2020 12:24:42 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>     https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.9-rc1.g...
> or in the git tree and branch at:
>     git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
All tests passing for Tegra ...
Test results for stable-v5.8:
    14 builds:  14 pass, 0 fail
    26 boots:   26 pass, 0 fail
    60 tests:   60 pass, 0 fail

Linux version: 5.8.9-rc1-gdcdaabfe3cea
Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000,
    tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000,
    tegra30-cardhu-a04

Tested-by: Jon Hunter <jonathanh@nvidia.com>
Jon
On Fri, Sep 11, 2020 at 05:10:32PM +0000, Jon Hunter wrote:
> On Fri, 11 Sep 2020 14:47:17 +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.8.9 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> >
> > Responses should be made by Sun, 13 Sep 2020 12:24:42 +0000. Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> >     https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.9-rc1.g...
> > or in the git tree and branch at:
> >     git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> All tests passing for Tegra ...
>
> Test results for stable-v5.8:
>     14 builds:  14 pass, 0 fail
>     26 boots:   26 pass, 0 fail
>     60 tests:   60 pass, 0 fail
>
> Linux version: 5.8.9-rc1-gdcdaabfe3cea
> Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000,
>     tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000,
>     tegra30-cardhu-a04
>
> Tested-by: Jon Hunter <jonathanh@nvidia.com>
Thanks for testing and letting me know.
greg k-h
On 9/11/20 6:47 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.8.9 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 13 Sep 2020 12:24:42 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>     https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.9-rc1.g...
> or in the git tree and branch at:
>     git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah
On Fri, Sep 11, 2020 at 04:19:48PM -0600, Shuah Khan wrote:
> On 9/11/20 6:47 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.8.9 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> >
> > Responses should be made by Sun, 13 Sep 2020 12:24:42 +0000. Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> >     https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.9-rc1.g...
> > or in the git tree and branch at:
> >     git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> Compiled and booted on my test system. No dmesg regressions.
>
> Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Thanks for testing all of these and letting me know.
greg k-h
On Fri, Sep 11, 2020 at 02:47:17PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.8.9 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 13 Sep 2020 12:24:42 +0000. Anything received after that time might be too late.

Build results:
    total: 154 pass: 154 fail: 0
Qemu test results:
    total: 430 pass: 430 fail: 0

Tested-by: Guenter Roeck <linux@roeck-us.net>
Guenter
On Fri, Sep 11, 2020 at 07:19:12PM -0700, Guenter Roeck wrote:
> On Fri, Sep 11, 2020 at 02:47:17PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.8.9 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> >
> > Responses should be made by Sun, 13 Sep 2020 12:24:42 +0000. Anything received after that time might be too late.
>
> Build results:
>     total: 154 pass: 154 fail: 0
> Qemu test results:
>     total: 430 pass: 430 fail: 0
>
> Tested-by: Guenter Roeck <linux@roeck-us.net>
Thanks for testing all of these and letting me know.
greg k-h
On Fri, 11 Sep 2020 at 18:30, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> This is the start of the stable review cycle for the 5.8.9 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 13 Sep 2020 12:24:42 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>     https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.9-rc1.g...
> or in the git tree and branch at:
>     git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Summary
------------------------------------------------------------------------

kernel: 5.8.9-rc1
git repo: ['https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git',
           'https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc']
git branch: linux-5.8.y
git commit: dcdaabfe3cea0fafa25a9a2ce8d83fd4edad41d6
git describe: v5.8.8-17-gdcdaabfe3cea
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.8.y/build/v5.8.8-...
No regressions (compared to build v5.8.8)
No fixes (compared to build v5.8.8)
Ran 20836 total tests in the following environments and test suites.
Environments
--------------
- dragonboard-410c
- hi6220-hikey
- i386
- juno-r2
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15
- x86
Test Suites
-----------
* build
* install-android-platform-tools-r2600
* kselftest
* kselftest/drivers
* kselftest/filesystems
* kselftest/net
* linux-log-parser
* perf
* network-basic-tests
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-controllers-tests
* ltp-cpuhotplug-tests
* ltp-crypto-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-tracing-tests
* v4l2-compliance
* ltp-open-posix-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-native/drivers
* kselftest-vsyscall-mode-native/filesystems
* kselftest-vsyscall-mode-native/net
* kselftest-vsyscall-mode-none
* kselftest-vsyscall-mode-none/drivers
* kselftest-vsyscall-mode-none/filesystems
* kselftest-vsyscall-mode-none/net
* ssuite
On Sat, Sep 12, 2020 at 12:57:38PM +0530, Naresh Kamboju wrote:
> On Fri, 11 Sep 2020 at 18:30, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > This is the start of the stable review cycle for the 5.8.9 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> >
> > Responses should be made by Sun, 13 Sep 2020 12:24:42 +0000. Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> >     https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.9-rc1.g...
> > or in the git tree and branch at:
> >     git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
>
> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Wonderful, thanks for testing them all and letting me know.
greg k-h