From: Ross Lagerwall ross.lagerwall@citrix.com
[ Upstream commit 7881ef3f33bb80f459ea6020d1e021fc524a6348 ]
Under certain conditions, lru_count may drop below zero resulting in a large amount of log spam like this:
vmscan: shrink_slab: gfs2_dump_glock+0x3b0/0x630 [gfs2] \ negative objects to delete nr=-1
This happens as follows: 1) A glock is moved from lru_list to the dispose list and lru_count is decremented. 2) The dispose function calls cond_resched() and drops the lru lock. 3) Another thread takes the lru lock and tries to add the same glock to lru_list, checking if the glock is on an lru list. 4) It is on a list (actually the dispose list) and so it avoids incrementing lru_count. 5) The glock is moved to lru_list. 5) The original thread doesn't dispose it because it has been re-added to the lru list but the lru_count has still decreased by one.
Fix by checking if the LRU flag is set on the glock rather than checking if the glock is on some list and rearrange the code so that the LRU flag is added/removed precisely when the glock is added/removed from lru_list.
Signed-off-by: Ross Lagerwall ross.lagerwall@citrix.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/glock.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index d32964cd11176..e4f6d39500bcc 100644 --- a/fs/gfs2/glock.c +++ b/fs/gfs2/glock.c @@ -183,15 +183,19 @@ static int demote_ok(const struct gfs2_glock *gl)
void gfs2_glock_add_to_lru(struct gfs2_glock *gl) { + if (!(gl->gl_ops->go_flags & GLOF_LRU)) + return; + spin_lock(&lru_lock);
- if (!list_empty(&gl->gl_lru)) - list_del_init(&gl->gl_lru); - else + list_del(&gl->gl_lru); + list_add_tail(&gl->gl_lru, &lru_list); + + if (!test_bit(GLF_LRU, &gl->gl_flags)) { + set_bit(GLF_LRU, &gl->gl_flags); atomic_inc(&lru_count); + }
- list_add_tail(&gl->gl_lru, &lru_list); - set_bit(GLF_LRU, &gl->gl_flags); spin_unlock(&lru_lock); }
@@ -201,7 +205,7 @@ static void gfs2_glock_remove_from_lru(struct gfs2_glock *gl) return;
spin_lock(&lru_lock); - if (!list_empty(&gl->gl_lru)) { + if (test_bit(GLF_LRU, &gl->gl_flags)) { list_del_init(&gl->gl_lru); atomic_dec(&lru_count); clear_bit(GLF_LRU, &gl->gl_flags); @@ -1159,8 +1163,7 @@ void gfs2_glock_dq(struct gfs2_holder *gh) !test_bit(GLF_DEMOTE, &gl->gl_flags)) fast_path = 1; } - if (!test_bit(GLF_LFLUSH, &gl->gl_flags) && demote_ok(gl) && - (glops->go_flags & GLOF_LRU)) + if (!test_bit(GLF_LFLUSH, &gl->gl_flags) && demote_ok(gl)) gfs2_glock_add_to_lru(gl);
trace_gfs2_glock_queue(gh, 0); @@ -1456,6 +1459,7 @@ __acquires(&lru_lock) if (!spin_trylock(&gl->gl_lockref.lock)) { add_back_to_lru: list_add(&gl->gl_lru, &lru_list); + set_bit(GLF_LRU, &gl->gl_flags); atomic_inc(&lru_count); continue; } @@ -1463,7 +1467,6 @@ __acquires(&lru_lock) spin_unlock(&gl->gl_lockref.lock); goto add_back_to_lru; } - clear_bit(GLF_LRU, &gl->gl_flags); gl->gl_lockref.count++; if (demote_ok(gl)) handle_callback(gl, LM_ST_UNLOCKED, 0, false); @@ -1498,6 +1501,7 @@ static long gfs2_scan_glock_lru(int nr) if (!test_bit(GLF_LOCK, &gl->gl_flags)) { list_move(&gl->gl_lru, &dispose); atomic_dec(&lru_count); + clear_bit(GLF_LRU, &gl->gl_flags); freed++; continue; }
From: YueHaibing yuehaibing@huawei.com
[ Upstream commit a3147770bea76c8dbad73eca3a24c2118da5e719 ]
BUG: unable to handle kernel paging request at ffffffffa016a270 PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bbd067 PTE 0 Oops: 0000 [#1 CPU: 0 PID: 6134 Comm: modprobe Not tainted 5.1.0+ #33 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:atomic_notifier_chain_register+0x24/0x60 Code: 1f 80 00 00 00 00 55 48 89 e5 41 54 49 89 f4 53 48 89 fb e8 ae b4 38 01 48 8b 53 38 48 8d 4b 38 48 85 d2 74 20 45 8b 44 24 10 <44> 3b 42 10 7e 08 eb 13 44 39 42 10 7c 0d 48 8d 4a 08 48 8b 52 08 RSP: 0018:ffffc90000e2bc60 EFLAGS: 00010086 RAX: 0000000000000292 RBX: ffffffff83467240 RCX: ffffffff83467278 RDX: ffffffffa016a260 RSI: ffffffff83752140 RDI: ffffffff83467240 RBP: ffffc90000e2bc70 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 00000000014fa61f R12: ffffffffa01c8260 R13: ffff888231091e00 R14: 0000000000000000 R15: ffffc90000e2be78 FS: 00007fbd8d7cd540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa016a270 CR3: 000000022c7e3000 CR4: 00000000000006f0 Call Trace: register_inet6addr_notifier+0x13/0x20 cxgb4_init_module+0x6c/0x1000 [cxgb4 ? 0xffffffffa01d7000 do_one_initcall+0x6c/0x3cc ? do_init_module+0x22/0x1f1 ? rcu_read_lock_sched_held+0x97/0xb0 ? kmem_cache_alloc_trace+0x325/0x3b0 do_init_module+0x5b/0x1f1 load_module+0x1db1/0x2690 ? m_show+0x1d0/0x1d0 __do_sys_finit_module+0xc5/0xd0 __x64_sys_finit_module+0x15/0x20 do_syscall_64+0x6b/0x1d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe
If pci_register_driver fails, register inet6addr_notifier is pointless. This patch fix the error path in cxgb4_init_module.
Fixes: b5a02f503caa ("cxgb4 : Update ipv6 address handling api") Signed-off-by: YueHaibing yuehaibing@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 89179e3166878..4bc0c357cb8ea 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -6161,15 +6161,24 @@ static int __init cxgb4_init_module(void)
ret = pci_register_driver(&cxgb4_driver); if (ret < 0) - debugfs_remove(cxgb4_debugfs_root); + goto err_pci;
#if IS_ENABLED(CONFIG_IPV6) if (!inet6addr_registered) { - register_inet6addr_notifier(&cxgb4_inet6addr_notifier); - inet6addr_registered = true; + ret = register_inet6addr_notifier(&cxgb4_inet6addr_notifier); + if (ret) + pci_unregister_driver(&cxgb4_driver); + else + inet6addr_registered = true; } #endif
+ if (ret == 0) + return ret; + +err_pci: + debugfs_remove(cxgb4_debugfs_root); + return ret; }
From: David Howells dhowells@redhat.com
[ Upstream commit a2f611a3dc317d8ea1c98ad6c54b911cf7f93193 ]
The AFS3 FID is three 32-bit unsigned numbers and is represented as three up-to-8-hex-digit numbers separated by colons to the afs.fid xattr. However, with the advent of support for YFS, the FID is now a 64-bit volume number, a 96-bit vnode/inode number and a 32-bit uniquifier (as before). Whilst the sprintf in afs_xattr_get_fid() has been partially updated (it currently ignores the upper 32 bits of the 96-bit vnode number), the size of the stack-based buffer has not been increased to match, thereby allowing stack corruption to occur.
Fix this by increasing the buffer size appropriately and conditionally including the upper part of the vnode number if it is non-zero. The latter requires the lower part to be zero-padded if the upper part is non-zero.
Fixes: 3b6492df4153 ("afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS") Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/xattr.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/fs/afs/xattr.c b/fs/afs/xattr.c index a2cdf25573e24..706801c6c4c4c 100644 --- a/fs/afs/xattr.c +++ b/fs/afs/xattr.c @@ -69,11 +69,20 @@ static int afs_xattr_get_fid(const struct xattr_handler *handler, void *buffer, size_t size) { struct afs_vnode *vnode = AFS_FS_I(inode); - char text[8 + 1 + 8 + 1 + 8 + 1]; + char text[16 + 1 + 24 + 1 + 8 + 1]; size_t len;
- len = sprintf(text, "%llx:%llx:%x", - vnode->fid.vid, vnode->fid.vnode, vnode->fid.unique); + /* The volume ID is 64-bit, the vnode ID is 96-bit and the + * uniquifier is 32-bit. + */ + len = sprintf(text, "%llx:", vnode->fid.vid); + if (vnode->fid.vnode_hi) + len += sprintf(text + len, "%x%016llx", + vnode->fid.vnode_hi, vnode->fid.vnode); + else + len += sprintf(text + len, "%llx", vnode->fid.vnode); + len += sprintf(text + len, ":%x", vnode->fid.unique); + if (size == 0) return len; if (len > size)
From: Roberto Bergantinos Corpas rbergant@redhat.com
[ Upstream commit 950a578c6128c2886e295b9c7ecb0b6b22fcc92b ]
Actually we don't do anything with return value from nfs_wait_client_init_complete in nfs_match_client, as a consequence if we get a fatal signal and client is not fully initialised, we'll loop to "again" label
This has been proven to cause soft lockups on some scenarios (no-carrier but configured network interfaces)
Signed-off-by: Roberto Bergantinos Corpas rbergant@redhat.com Reviewed-by: Benjamin Coddington bcodding@redhat.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/client.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 90d71fda65cec..350cfa561e0e8 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -284,6 +284,7 @@ static struct nfs_client *nfs_match_client(const struct nfs_client_initdata *dat struct nfs_client *clp; const struct sockaddr *sap = data->addr; struct nfs_net *nn = net_generic(data->net, nfs_net_id); + int error;
again: list_for_each_entry(clp, &nn->nfs_client_list, cl_share_link) { @@ -296,8 +297,10 @@ static struct nfs_client *nfs_match_client(const struct nfs_client_initdata *dat if (clp->cl_cons_state > NFS_CS_READY) { refcount_inc(&clp->cl_count); spin_unlock(&nn->nfs_client_lock); - nfs_wait_client_init_complete(clp); + error = nfs_wait_client_init_complete(clp); nfs_put_client(clp); + if (error < 0) + return ERR_PTR(error); spin_lock(&nn->nfs_client_lock); goto again; } @@ -407,6 +410,8 @@ struct nfs_client *nfs_get_client(const struct nfs_client_initdata *cl_init) clp = nfs_match_client(cl_init); if (clp) { spin_unlock(&nn->nfs_client_lock); + if (IS_ERR(clp)) + return clp; if (new) new->rpc_ops->free_client(new); return nfs_found_client(cl_init, clp);
Hi Sasha, if you take this one, you'll need the fix for it:
c260121a97a3 ("NFS: Fix a double unlock from nfs_match,get_client")
I didn't see this fix go through my inbox for your stable tree, so apologies if maybe I missed it.
Looks like you are also applying this one to 4.19 and 4.14, -- I'll just reply once here.
Ben
On 22 May 2019, at 15:15, Sasha Levin wrote:
From: Roberto Bergantinos Corpas rbergant@redhat.com
[ Upstream commit 950a578c6128c2886e295b9c7ecb0b6b22fcc92b ]
Actually we don't do anything with return value from nfs_wait_client_init_complete in nfs_match_client, as a consequence if we get a fatal signal and client is not fully initialised, we'll loop to "again" label This has been proven to cause soft lockups on some scenarios (no-carrier but configured network interfaces)
Signed-off-by: Roberto Bergantinos Corpas rbergant@redhat.com Reviewed-by: Benjamin Coddington bcodding@redhat.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org
fs/nfs/client.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 90d71fda65cec..350cfa561e0e8 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -284,6 +284,7 @@ static struct nfs_client *nfs_match_client(const struct nfs_client_initdata *dat struct nfs_client *clp; const struct sockaddr *sap = data->addr; struct nfs_net *nn = net_generic(data->net, nfs_net_id);
- int error;
again: list_for_each_entry(clp, &nn->nfs_client_list, cl_share_link) { @@ -296,8 +297,10 @@ static struct nfs_client *nfs_match_client(const struct nfs_client_initdata *dat if (clp->cl_cons_state > NFS_CS_READY) { refcount_inc(&clp->cl_count); spin_unlock(&nn->nfs_client_lock);
nfs_wait_client_init_complete(clp);
error = nfs_wait_client_init_complete(clp); nfs_put_client(clp);
if (error < 0)
}return ERR_PTR(error); spin_lock(&nn->nfs_client_lock); goto again;
@@ -407,6 +410,8 @@ struct nfs_client *nfs_get_client(const struct nfs_client_initdata *cl_init) clp = nfs_match_client(cl_init); if (clp) { spin_unlock(&nn->nfs_client_lock);
if (IS_ERR(clp))
return clp; if (new) new->rpc_ops->free_client(new); return nfs_found_client(cl_init, clp);
-- 2.20.1
On Thu, May 23, 2019 at 11:02:39AM -0400, Benjamin Coddington wrote:
Hi Sasha, if you take this one, you'll need the fix for it:
c260121a97a3 ("NFS: Fix a double unlock from nfs_match,get_client")
I didn't see this fix go through my inbox for your stable tree, so apologies if maybe I missed it.
Looks like you are also applying this one to 4.19 and 4.14, -- I'll just reply once here.
Queued c260121a97a3 up for 5.1-4.14, thank you!
-- Thanks, Sasha
From: Abhi Das adas@redhat.com
[ Upstream commit 8f91821990fd6f170a5dca79697a441181a41b16 ]
As part of the freeze operation, gfs2_freeze_func() is left blocking on a request to hold the sd_freeze_gl in SH. This glock is held in EX by the gfs2_freeze() code.
A subsequent call to gfs2_unfreeze() releases the EXclusively held sd_freeze_gl, which allows gfs2_freeze_func() to acquire it in SH and resume its operation.
gfs2_unfreeze(), however, doesn't wait for gfs2_freeze_func() to complete. If a umount is issued right after unfreeze, it could result in an inconsistent filesystem because some journal data (statfs update) isn't written out.
Refer to commit 24972557b12c for a more detailed explanation of how freeze/unfreeze work.
This patch causes gfs2_unfreeze() to wait for gfs2_freeze_func() to complete before returning to the user.
Signed-off-by: Abhi Das adas@redhat.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/incore.h | 1 + fs/gfs2/super.c | 8 +++++--- 2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h index cdf07b408f54c..539e8dc5a3f6c 100644 --- a/fs/gfs2/incore.h +++ b/fs/gfs2/incore.h @@ -621,6 +621,7 @@ enum { SDF_SKIP_DLM_UNLOCK = 8, SDF_FORCE_AIL_FLUSH = 9, SDF_AIL1_IO_ERROR = 10, + SDF_FS_FROZEN = 11, };
enum gfs2_freeze_state { diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index ca71163ff7cfd..360206704a14c 100644 --- a/fs/gfs2/super.c +++ b/fs/gfs2/super.c @@ -973,8 +973,7 @@ void gfs2_freeze_func(struct work_struct *work) if (error) { printk(KERN_INFO "GFS2: couldn't get freeze lock : %d\n", error); gfs2_assert_withdraw(sdp, 0); - } - else { + } else { atomic_set(&sdp->sd_freeze_state, SFS_UNFROZEN); error = thaw_super(sb); if (error) { @@ -987,6 +986,8 @@ void gfs2_freeze_func(struct work_struct *work) gfs2_glock_dq_uninit(&freeze_gh); } deactivate_super(sb); + clear_bit_unlock(SDF_FS_FROZEN, &sdp->sd_flags); + wake_up_bit(&sdp->sd_flags, SDF_FS_FROZEN); return; }
@@ -1029,6 +1030,7 @@ static int gfs2_freeze(struct super_block *sb) msleep(1000); } error = 0; + set_bit(SDF_FS_FROZEN, &sdp->sd_flags); out: mutex_unlock(&sdp->sd_freeze_mutex); return error; @@ -1053,7 +1055,7 @@ static int gfs2_unfreeze(struct super_block *sb)
gfs2_glock_dq_uninit(&sdp->sd_freeze_gh); mutex_unlock(&sdp->sd_freeze_mutex); - return 0; + return wait_on_bit(&sdp->sd_flags, SDF_FS_FROZEN, TASK_INTERRUPTIBLE); }
/**
From: Shenghui Wang shhuiw@foxmail.com
[ Upstream commit 7889f44dd9cee15aff1c3f7daf81ca4dfed48fc7 ]
This issue is found by running liburing/test/io_uring_setup test.
When test run, the testcase "attempt to bind to invalid cpu" would not pass with messages like: io_uring_setup(1, 0xbfc2f7c8), \ flags: IORING_SETUP_SQPOLL|IORING_SETUP_SQ_AFF, \ resv: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000, \ sq_thread_cpu: 2 expected -1, got 3 FAIL
On my system, there is: CPU(s) possible : 0-3 CPU(s) online : 0-1 CPU(s) offline : 2-3 CPU(s) present : 0-1
The sq_thread_cpu 2 is offline on my system, so the bind should fail. But cpu_possible() will pass the check. We shouldn't be able to bind to an offline cpu. Use cpu_online() to do the check.
After the change, the testcase run as expected: EINVAL will be returned for cpu offlined.
Reviewed-by: Jeff Moyer jmoyer@redhat.com Signed-off-by: Shenghui Wang shhuiw@foxmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 84efb8956734f..30a5687a17b65 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2334,7 +2334,7 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx, nr_cpu_ids);
ret = -EINVAL; - if (!cpu_possible(cpu)) + if (!cpu_online(cpu)) goto err;
ctx->sqo_thread = kthread_create_on_cpu(io_sq_thread,
From: Mike Marciniszyn mike.marciniszyn@intel.com
[ Upstream commit 4c4b1996b5db688e2dcb8242b0a3bf7b1e845e42 ]
The work_item cancels that occur when a QP is destroyed can elicit the following trace:
workqueue: WQ_MEM_RECLAIM ipoib_wq:ipoib_cm_tx_reap [ib_ipoib] is flushing !WQ_MEM_RECLAIM hfi0_0:_hfi1_do_send [hfi1] WARNING: CPU: 7 PID: 1403 at kernel/workqueue.c:2486 check_flush_dependency+0xb1/0x100 Call Trace: __flush_work.isra.29+0x8c/0x1a0 ? __switch_to_asm+0x40/0x70 __cancel_work_timer+0x103/0x190 ? schedule+0x32/0x80 iowait_cancel_work+0x15/0x30 [hfi1] rvt_reset_qp+0x1f8/0x3e0 [rdmavt] rvt_destroy_qp+0x65/0x1f0 [rdmavt] ? _cond_resched+0x15/0x30 ib_destroy_qp+0xe9/0x230 [ib_core] ipoib_cm_tx_reap+0x21c/0x560 [ib_ipoib] process_one_work+0x171/0x370 worker_thread+0x49/0x3f0 kthread+0xf8/0x130 ? max_active_store+0x80/0x80 ? kthread_bind+0x10/0x10 ret_from_fork+0x35/0x40
Since QP destruction frees memory, hfi1_wq should have the WQ_MEM_RECLAIM.
The hfi1_wq does not allocate memory with GFP_KERNEL or otherwise become entangled with memory reclaim, so this flag is appropriate.
Fixes: 0a226edd203f ("staging/rdma/hfi1: Use parallel workqueue for SDMA engines") Reviewed-by: Michael J. Ruhl michael.j.ruhl@intel.com Signed-off-by: Mike Marciniszyn mike.marciniszyn@intel.com Signed-off-by: Dennis Dalessandro dennis.dalessandro@intel.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hfi1/init.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c index faaaac8fbc553..3af5eb10a5ffb 100644 --- a/drivers/infiniband/hw/hfi1/init.c +++ b/drivers/infiniband/hw/hfi1/init.c @@ -805,7 +805,8 @@ static int create_workqueues(struct hfi1_devdata *dd) ppd->hfi1_wq = alloc_workqueue( "hfi%d_%d", - WQ_SYSFS | WQ_HIGHPRI | WQ_CPU_INTENSIVE, + WQ_SYSFS | WQ_HIGHPRI | WQ_CPU_INTENSIVE | + WQ_MEM_RECLAIM, HFI1_MAX_ACTIVE_WORKQUEUE_ENTRIES, dd->unit, pidx); if (!ppd->hfi1_wq)
From: Andreas Gruenbacher agruenba@redhat.com
[ Upstream commit 9287c6452d2b1f24ea8e84bd3cf6f3c6f267f712 ]
This patch has to do with the life cycle of glocks and buffers. When gfs2 metadata or journaled data is queued to be written, a gfs2_bufdata object is assigned to track the buffer, and that is queued to various lists, including the glock's gl_ail_list to indicate it's on the active items list. Once the page associated with the buffer has been written, it is removed from the ail list, but its life isn't over until a revoke has been successfully written.
So after the block is written, its bufdata object is moved from the glock's gl_ail_list to a file-system-wide list of pending revokes, sd_log_le_revoke. At that point the glock still needs to track how many revokes it contributed to that list (in gl_revokes) so that things like glock go_sync can ensure all the metadata has been not only written, but also revoked before the glock is granted to a different node. This is to guarantee journal replay doesn't replay the block once the glock has been granted to another node.
Ross Lagerwall recently discovered a race in which an inode could be evicted, and its glock freed after its ail list had been synced, but while it still had unwritten revokes on the sd_log_le_revoke list. The evict decremented the glock reference count to zero, which allowed the glock to be freed. After the revoke was written, function revoke_lo_after_commit tried to adjust the glock's gl_revokes counter and clear its GLF_LFLUSH flag, at which time it referenced the freed glock.
This patch fixes the problem by incrementing the glock reference count in gfs2_add_revoke when the glock's first bufdata object is moved from the glock to the global revokes list. Later, when the glock's last such bufdata object is freed, the reference count is decremented. This guarantees that whichever process finishes last (the revoke writing or the evict) will properly free the glock, and neither will reference the glock after it has been freed.
Reported-by: Ross Lagerwall ross.lagerwall@citrix.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Bob Peterson rpeterso@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/glock.c | 1 + fs/gfs2/log.c | 3 ++- fs/gfs2/lops.c | 6 ++++-- 3 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index e4f6d39500bcc..71c28ff98b564 100644 --- a/fs/gfs2/glock.c +++ b/fs/gfs2/glock.c @@ -140,6 +140,7 @@ void gfs2_glock_free(struct gfs2_glock *gl) { struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
+ BUG_ON(atomic_read(&gl->gl_revokes)); rhashtable_remove_fast(&gl_hash_table, &gl->gl_node, ht_parms); smp_mb(); wake_up_glock(gl); diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index b8830fda51e8f..0e04f87a7dddb 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -606,7 +606,8 @@ void gfs2_add_revoke(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd) gfs2_remove_from_ail(bd); /* drops ref on bh */ bd->bd_bh = NULL; sdp->sd_log_num_revoke++; - atomic_inc(&gl->gl_revokes); + if (atomic_inc_return(&gl->gl_revokes) == 1) + gfs2_glock_hold(gl); set_bit(GLF_LFLUSH, &gl->gl_flags); list_add(&bd->bd_list, &sdp->sd_log_le_revoke); } diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c index 8722c60b11feb..4b280611246df 100644 --- a/fs/gfs2/lops.c +++ b/fs/gfs2/lops.c @@ -669,8 +669,10 @@ static void revoke_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr) bd = list_entry(head->next, struct gfs2_bufdata, bd_list); list_del_init(&bd->bd_list); gl = bd->bd_gl; - atomic_dec(&gl->gl_revokes); - clear_bit(GLF_LFLUSH, &gl->gl_flags); + if (atomic_dec_return(&gl->gl_revokes) == 0) { + clear_bit(GLF_LFLUSH, &gl->gl_flags); + gfs2_glock_queue_put(gl); + } kmem_cache_free(gfs2_bufdata_cachep, bd); } }
From: Raul E Rangel rrangel@chromium.org
[ Upstream commit 9e4be8d03f50d1b25c38e2b59e73b194c130df7d ]
The SD Physical Layer Spec says the following: Since the SD Memory Card shall support at least the two bus modes 1-bit or 4-bit width, then any SD Card shall set at least bits 0 and 2 (SD_BUS_WIDTH="0101").
This change verifies the card has specified a bus width.
AMD SDHC Device 7806 can get into a bad state after a card disconnect where anything transferred via the DATA lines will always result in a zero filled buffer. Currently the driver will continue without error if the HC is in this condition. A block device will be created, but reading from it will result in a zero buffer. This makes it seem like the SD device has been erased, when in actuality the data is never getting copied from the DATA lines to the data buffer.
SCR is the first command in the SD initialization sequence that uses the DATA lines. By checking that the response was invalid, we can abort mounting the card.
Reviewed-by: Avri Altman avri.altman@wdc.com Signed-off-by: Raul E Rangel rrangel@chromium.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/core/sd.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c index 265e1aeeb9d88..d3d32f9a2cb18 100644 --- a/drivers/mmc/core/sd.c +++ b/drivers/mmc/core/sd.c @@ -221,6 +221,14 @@ static int mmc_decode_scr(struct mmc_card *card)
if (scr->sda_spec3) scr->cmds = UNSTUFF_BITS(resp, 32, 2); + + /* SD Spec says: any SD Card shall set at least bits 0 and 2 */ + if (!(scr->bus_widths & SD_SCR_BUS_WIDTH_1) || + !(scr->bus_widths & SD_SCR_BUS_WIDTH_4)) { + pr_err("%s: invalid bus width\n", mmc_hostname(card->host)); + return -EINVAL; + } + return 0; }
From: Linus LÃŧssing linus.luessing@c0d3.blue
[ Upstream commit a3c7cd0cdf1107f891aff847ad481e34df727055 ]
Syzbot has reported some issues with the locking assumptions made for the multicast tt/tvlv worker: It was able to trigger the WARN_ON() in batadv_mcast_mla_tt_retract() and batadv_mcast_mla_tt_add(). While hard/not reproduceable for us so far it seems that the delayed_work_pending() we use might not be quite safe from reordering.
Therefore this patch adds an explicit, new spinlock to protect the update of the mla_list and flags in bat_priv and then removes the WARN_ON(delayed_work_pending()).
Reported-by: syzbot+83f2d54ec6b7e417e13f@syzkaller.appspotmail.com Reported-by: syzbot+050927a651272b145a5d@syzkaller.appspotmail.com Reported-by: syzbot+979ffc89b87309b1b94b@syzkaller.appspotmail.com Reported-by: syzbot+f9f3f388440283da2965@syzkaller.appspotmail.com Fixes: cbebd363b2e9 ("batman-adv: Use own timer for multicast TT and TVLV updates") Signed-off-by: Linus LÃŧssing linus.luessing@c0d3.blue Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/batman-adv/main.c | 1 + net/batman-adv/multicast.c | 11 +++-------- net/batman-adv/types.h | 5 +++++ 3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c index 75750870cf048..f8725786b5961 100644 --- a/net/batman-adv/main.c +++ b/net/batman-adv/main.c @@ -161,6 +161,7 @@ int batadv_mesh_init(struct net_device *soft_iface) spin_lock_init(&bat_priv->tt.commit_lock); spin_lock_init(&bat_priv->gw.list_lock); #ifdef CONFIG_BATMAN_ADV_MCAST + spin_lock_init(&bat_priv->mcast.mla_lock); spin_lock_init(&bat_priv->mcast.want_lists_lock); #endif spin_lock_init(&bat_priv->tvlv.container_list_lock); diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c index f91b1b6265cfe..1b985ab89c087 100644 --- a/net/batman-adv/multicast.c +++ b/net/batman-adv/multicast.c @@ -325,8 +325,6 @@ static void batadv_mcast_mla_list_free(struct hlist_head *mcast_list) * translation table except the ones listed in the given mcast_list. * * If mcast_list is NULL then all are retracted. - * - * Do not call outside of the mcast worker! (or cancel mcast worker first) */ static void batadv_mcast_mla_tt_retract(struct batadv_priv *bat_priv, struct hlist_head *mcast_list) @@ -334,8 +332,6 @@ static void batadv_mcast_mla_tt_retract(struct batadv_priv *bat_priv, struct batadv_hw_addr *mcast_entry; struct hlist_node *tmp;
- WARN_ON(delayed_work_pending(&bat_priv->mcast.work)); - hlist_for_each_entry_safe(mcast_entry, tmp, &bat_priv->mcast.mla_list, list) { if (mcast_list && @@ -359,8 +355,6 @@ static void batadv_mcast_mla_tt_retract(struct batadv_priv *bat_priv, * * Adds multicast listener announcements from the given mcast_list to the * translation table if they have not been added yet. - * - * Do not call outside of the mcast worker! (or cancel mcast worker first) */ static void batadv_mcast_mla_tt_add(struct batadv_priv *bat_priv, struct hlist_head *mcast_list) @@ -368,8 +362,6 @@ static void batadv_mcast_mla_tt_add(struct batadv_priv *bat_priv, struct batadv_hw_addr *mcast_entry; struct hlist_node *tmp;
- WARN_ON(delayed_work_pending(&bat_priv->mcast.work)); - if (!mcast_list) return;
@@ -658,7 +650,10 @@ static void batadv_mcast_mla_update(struct work_struct *work) priv_mcast = container_of(delayed_work, struct batadv_priv_mcast, work); bat_priv = container_of(priv_mcast, struct batadv_priv, mcast);
+ spin_lock(&bat_priv->mcast.mla_lock); __batadv_mcast_mla_update(bat_priv); + spin_unlock(&bat_priv->mcast.mla_lock); + batadv_mcast_start_timer(bat_priv); }
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h index a21b34ed6548f..ed0f6a519de55 100644 --- a/net/batman-adv/types.h +++ b/net/batman-adv/types.h @@ -1223,6 +1223,11 @@ struct batadv_priv_mcast { /** @bridged: whether the soft interface has a bridge on top */ unsigned char bridged:1;
+ /** + * @mla_lock: a lock protecting mla_list and mla_flags + */ + spinlock_t mla_lock; + /** * @num_want_all_unsnoopables: number of nodes wanting unsnoopable IP * traffic
From: Eric Dumazet edumazet@google.com
[ Upstream commit 47d3d7fdb10a21c223036b58bd70ffdc24a472c4 ]
Since ip6frag_expire_frag_queue() now pulls the head skb from frag queue, we should no longer use skb_get(), since this leads to an skb leak.
Stefan Bader initially reported a problem in 4.4.stable [1] caused by the skb_get(), so this patch should also fix this issue.
296583.091021] kernel BUG at /build/linux-6VmqmP/linux-4.4.0/net/core/skbuff.c:1207! [296583.091734] Call Trace: [296583.091749] [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350 [296583.091764] [<ffffffff8183939a>] _decode_session6+0x26a/0x400 [296583.091779] [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50 [296583.091795] [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0 [296583.091809] [<ffffffff81824421>] icmp6_send+0x5e1/0x940 [296583.091823] [<ffffffff81753238>] ? __netif_receive_skb+0x18/0x60 [296583.091838] [<ffffffff817532b2>] ? netif_receive_skb_internal+0x32/0xa0 [296583.091858] [<ffffffffc0199f74>] ? ixgbe_clean_rx_irq+0x594/0xac0 [ixgbe] [296583.091876] [<ffffffffc04eb260>] ? nf_ct_net_exit+0x50/0x50 [nf_defrag_ipv6] [296583.091893] [<ffffffff8183d431>] icmpv6_send+0x21/0x30 [296583.091906] [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120 [296583.091921] [<ffffffffc04eb27f>] nf_ct_frag6_expire+0x1f/0x30 [nf_defrag_ipv6] [296583.091938] [<ffffffff810f3b57>] call_timer_fn+0x37/0x140 [296583.091951] [<ffffffffc04eb260>] ? nf_ct_net_exit+0x50/0x50 [nf_defrag_ipv6] [296583.091968] [<ffffffff810f5464>] run_timer_softirq+0x234/0x330 [296583.091982] [<ffffffff8108a339>] __do_softirq+0x109/0x2b0
Fixes: d4289fcc9b16 ("net: IP6 defrag: use rbtrees for IPv6 defrag") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Stefan Bader stefan.bader@canonical.com Cc: Peter Oskolkov posk@google.com Cc: Florian Westphal fw@strlen.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/ipv6_frag.h | 1 - 1 file changed, 1 deletion(-)
diff --git a/include/net/ipv6_frag.h b/include/net/ipv6_frag.h index 28aa9b30aecea..1f77fb4dc79df 100644 --- a/include/net/ipv6_frag.h +++ b/include/net/ipv6_frag.h @@ -94,7 +94,6 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq) goto out;
head->dev = dev; - skb_get(head); spin_unlock(&fq->q.lock);
icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0);
On 22.05.19 21:15, Sasha Levin wrote:
From: Eric Dumazet edumazet@google.com
[ Upstream commit 47d3d7fdb10a21c223036b58bd70ffdc24a472c4 ]
Since ip6frag_expire_frag_queue() now pulls the head skb from frag queue, we should no longer use skb_get(), since this leads to an skb leak.
Stefan Bader initially reported a problem in 4.4.stable [1] caused by the skb_get(), so this patch should also fix this issue.
Just to let everybody know, while changing this has fixed the BUG_ON problem while sending (in 4.4) it now crashes when releasing just a little later. Still feels like the right direction but not complete, yet.
-Stefan
296583.091021] kernel BUG at /build/linux-6VmqmP/linux-4.4.0/net/core/skbuff.c:1207! [296583.091734] Call Trace: [296583.091749] [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350 [296583.091764] [<ffffffff8183939a>] _decode_session6+0x26a/0x400 [296583.091779] [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50 [296583.091795] [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0 [296583.091809] [<ffffffff81824421>] icmp6_send+0x5e1/0x940 [296583.091823] [<ffffffff81753238>] ? __netif_receive_skb+0x18/0x60 [296583.091838] [<ffffffff817532b2>] ? netif_receive_skb_internal+0x32/0xa0 [296583.091858] [<ffffffffc0199f74>] ? ixgbe_clean_rx_irq+0x594/0xac0 [ixgbe] [296583.091876] [<ffffffffc04eb260>] ? nf_ct_net_exit+0x50/0x50 [nf_defrag_ipv6] [296583.091893] [<ffffffff8183d431>] icmpv6_send+0x21/0x30 [296583.091906] [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120 [296583.091921] [<ffffffffc04eb27f>] nf_ct_frag6_expire+0x1f/0x30 [nf_defrag_ipv6] [296583.091938] [<ffffffff810f3b57>] call_timer_fn+0x37/0x140 [296583.091951] [<ffffffffc04eb260>] ? nf_ct_net_exit+0x50/0x50 [nf_defrag_ipv6] [296583.091968] [<ffffffff810f5464>] run_timer_softirq+0x234/0x330 [296583.091982] [<ffffffff8108a339>] __do_softirq+0x109/0x2b0
Fixes: d4289fcc9b16 ("net: IP6 defrag: use rbtrees for IPv6 defrag") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: Stefan Bader stefan.bader@canonical.com Cc: Peter Oskolkov posk@google.com Cc: Florian Westphal fw@strlen.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org
include/net/ipv6_frag.h | 1 - 1 file changed, 1 deletion(-)
diff --git a/include/net/ipv6_frag.h b/include/net/ipv6_frag.h index 28aa9b30aecea..1f77fb4dc79df 100644 --- a/include/net/ipv6_frag.h +++ b/include/net/ipv6_frag.h @@ -94,7 +94,6 @@ ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq) goto out; head->dev = dev;
- skb_get(head); spin_unlock(&fq->q.lock);
icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0);
On Thu, May 23, 2019 at 09:47:23AM +0200, Stefan Bader wrote:
On 22.05.19 21:15, Sasha Levin wrote:
From: Eric Dumazet edumazet@google.com
[ Upstream commit 47d3d7fdb10a21c223036b58bd70ffdc24a472c4 ]
Since ip6frag_expire_frag_queue() now pulls the head skb from frag queue, we should no longer use skb_get(), since this leads to an skb leak.
Stefan Bader initially reported a problem in 4.4.stable [1] caused by the skb_get(), so this patch should also fix this issue.
Just to let everybody know, while changing this has fixed the BUG_ON problem while sending (in 4.4) it now crashes when releasing just a little later. Still feels like the right direction but not complete, yet.
mhm, this commit is really under David's domain, it squeezed through my filters as it doesn't actually touch net/. I'll drop it for now.
-- Thanks, Sasha
From: Vineet Gupta Vineet.Gupta1@synopsys.com
[ Upstream commit ca31ca8247e2d3807ff5fa1d1760616a2292001c ]
When build perf for ARC recently, there was a build failure due to lack of __NR_bpf.
| Auto-detecting system features: | | ... get_cpuid: [ OFF ] | ... bpf: [ on ] | | # error __NR_bpf not defined. libbpf does not support your arch. ^~~~~ | bpf.c: In function 'sys_bpf': | bpf.c:66:17: error: '__NR_bpf' undeclared (first use in this function) | return syscall(__NR_bpf, cmd, attr, size); | ^~~~~~~~ | sys_bpf
Signed-off-by: Vineet Gupta vgupta@synopsys.com Acked-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/bpf.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 9cd015574e838..d82edadf75893 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -46,6 +46,8 @@ # define __NR_bpf 349 # elif defined(__s390__) # define __NR_bpf 351 +# elif defined(__arc__) +# define __NR_bpf 280 # else # error __NR_bpf not defined. libbpf does not support your arch. # endif
From: Martyna Szapar martyna.szapar@intel.com
[ Upstream commit 24474f2709af6729b9b1da1c5e160ab62e25e3a4 ]
Fixed possible memory leak in i40e_vc_add_cloud_filter function: cfilter is being allocated and in some error conditions the function returns without freeing the memory.
Fix of integer truncation from u16 (type of queue_id value) to u8 when calling i40e_vc_isvalid_queue_id function.
Signed-off-by: Martyna Szapar martyna.szapar@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c index 831d52bc3c9ae..0b5b867c9fbcb 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c @@ -181,7 +181,7 @@ static inline bool i40e_vc_isvalid_vsi_id(struct i40e_vf *vf, u16 vsi_id) * check for the valid queue id **/ static inline bool i40e_vc_isvalid_queue_id(struct i40e_vf *vf, u16 vsi_id, - u8 qid) + u16 qid) { struct i40e_pf *pf = vf->pf; struct i40e_vsi *vsi = i40e_find_vsi_from_id(pf, vsi_id); @@ -3374,7 +3374,7 @@ static int i40e_vc_add_cloud_filter(struct i40e_vf *vf, u8 *msg)
if (!test_bit(I40E_VF_STATE_ACTIVE, &vf->vf_states)) { aq_ret = I40E_ERR_PARAM; - goto err; + goto err_out; }
if (!vf->adq_enabled) { @@ -3382,7 +3382,7 @@ static int i40e_vc_add_cloud_filter(struct i40e_vf *vf, u8 *msg) "VF %d: ADq is not enabled, can't apply cloud filter\n", vf->vf_id); aq_ret = I40E_ERR_PARAM; - goto err; + goto err_out; }
if (i40e_validate_cloud_filter(vf, vcf)) { @@ -3390,7 +3390,7 @@ static int i40e_vc_add_cloud_filter(struct i40e_vf *vf, u8 *msg) "VF %d: Invalid input/s, can't apply cloud filter\n", vf->vf_id); aq_ret = I40E_ERR_PARAM; - goto err; + goto err_out; }
cfilter = kzalloc(sizeof(*cfilter), GFP_KERNEL); @@ -3451,13 +3451,17 @@ static int i40e_vc_add_cloud_filter(struct i40e_vf *vf, u8 *msg) "VF %d: Failed to add cloud filter, err %s aq_err %s\n", vf->vf_id, i40e_stat_str(&pf->hw, ret), i40e_aq_str(&pf->hw, pf->hw.aq.asq_last_status)); - goto err; + goto err_free; }
INIT_HLIST_NODE(&cfilter->cloud_node); hlist_add_head(&cfilter->cloud_node, &vf->cloud_filter_list); + /* release the pointer passing it to the collection */ + cfilter = NULL; vf->num_cloud_filters++; -err: +err_free: + kfree(cfilter); +err_out: return i40e_vc_send_resp_to_vf(vf, VIRTCHNL_OP_ADD_CLOUD_FILTER, aq_ret); }
From: BjÃļrn TÃļpel bjorn.topel@intel.com
[ Upstream commit 0e6741f092979535d159d5a851f12c88bfb7cb9a ]
When unmapping the AF_XDP memory regions used for the rings, an invalid address was passed to the munmap() calls. Instead of passing the beginning of the memory region, the descriptor region was passed to munmap.
When the userspace application tried to tear down an AF_XDP socket, the operation failed and the application would still have a reference to socket it wished to get rid of.
Reported-by: William Tu u9012063@gmail.com Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets") Signed-off-by: BjÃļrn TÃļpel bjorn.topel@intel.com Tested-by: William Tu u9012063@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/xsk.c | 77 +++++++++++++++++++++++---------------------- 1 file changed, 40 insertions(+), 37 deletions(-)
diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c index 8d0078b65486f..af5f310ecca1c 100644 --- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -248,8 +248,7 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size, return 0;
out_mmap: - munmap(umem->fill, - off.fr.desc + umem->config.fill_size * sizeof(__u64)); + munmap(map, off.fr.desc + umem->config.fill_size * sizeof(__u64)); out_socket: close(umem->fd); out_umem_alloc: @@ -523,11 +522,11 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname, struct xsk_ring_cons *rx, struct xsk_ring_prod *tx, const struct xsk_socket_config *usr_config) { + void *rx_map = NULL, *tx_map = NULL; struct sockaddr_xdp sxdp = {}; struct xdp_mmap_offsets off; struct xsk_socket *xsk; socklen_t optlen; - void *map; int err;
if (!umem || !xsk_ptr || !rx || !tx) @@ -593,40 +592,40 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname, }
if (rx) { - map = xsk_mmap(NULL, off.rx.desc + - xsk->config.rx_size * sizeof(struct xdp_desc), - PROT_READ | PROT_WRITE, - MAP_SHARED | MAP_POPULATE, - xsk->fd, XDP_PGOFF_RX_RING); - if (map == MAP_FAILED) { + rx_map = xsk_mmap(NULL, off.rx.desc + + xsk->config.rx_size * sizeof(struct xdp_desc), + PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_POPULATE, + xsk->fd, XDP_PGOFF_RX_RING); + if (rx_map == MAP_FAILED) { err = -errno; goto out_socket; }
rx->mask = xsk->config.rx_size - 1; rx->size = xsk->config.rx_size; - rx->producer = map + off.rx.producer; - rx->consumer = map + off.rx.consumer; - rx->ring = map + off.rx.desc; + rx->producer = rx_map + off.rx.producer; + rx->consumer = rx_map + off.rx.consumer; + rx->ring = rx_map + off.rx.desc; } xsk->rx = rx;
if (tx) { - map = xsk_mmap(NULL, off.tx.desc + - xsk->config.tx_size * sizeof(struct xdp_desc), - PROT_READ | PROT_WRITE, - MAP_SHARED | MAP_POPULATE, - xsk->fd, XDP_PGOFF_TX_RING); - if (map == MAP_FAILED) { + tx_map = xsk_mmap(NULL, off.tx.desc + + xsk->config.tx_size * sizeof(struct xdp_desc), + PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_POPULATE, + xsk->fd, XDP_PGOFF_TX_RING); + if (tx_map == MAP_FAILED) { err = -errno; goto out_mmap_rx; }
tx->mask = xsk->config.tx_size - 1; tx->size = xsk->config.tx_size; - tx->producer = map + off.tx.producer; - tx->consumer = map + off.tx.consumer; - tx->ring = map + off.tx.desc; + tx->producer = tx_map + off.tx.producer; + tx->consumer = tx_map + off.tx.consumer; + tx->ring = tx_map + off.tx.desc; tx->cached_cons = xsk->config.tx_size; } xsk->tx = tx; @@ -653,13 +652,11 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
out_mmap_tx: if (tx) - munmap(xsk->tx, - off.tx.desc + + munmap(tx_map, off.tx.desc + xsk->config.tx_size * sizeof(struct xdp_desc)); out_mmap_rx: if (rx) - munmap(xsk->rx, - off.rx.desc + + munmap(rx_map, off.rx.desc + xsk->config.rx_size * sizeof(struct xdp_desc)); out_socket: if (--umem->refcount) @@ -684,10 +681,12 @@ int xsk_umem__delete(struct xsk_umem *umem) optlen = sizeof(off); err = getsockopt(umem->fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen); if (!err) { - munmap(umem->fill->ring, - off.fr.desc + umem->config.fill_size * sizeof(__u64)); - munmap(umem->comp->ring, - off.cr.desc + umem->config.comp_size * sizeof(__u64)); + (void)munmap(umem->fill->ring - off.fr.desc, + off.fr.desc + + umem->config.fill_size * sizeof(__u64)); + (void)munmap(umem->comp->ring - off.cr.desc, + off.cr.desc + + umem->config.comp_size * sizeof(__u64)); }
close(umem->fd); @@ -698,6 +697,7 @@ int xsk_umem__delete(struct xsk_umem *umem)
void xsk_socket__delete(struct xsk_socket *xsk) { + size_t desc_sz = sizeof(struct xdp_desc); struct xdp_mmap_offsets off; socklen_t optlen; int err; @@ -710,14 +710,17 @@ void xsk_socket__delete(struct xsk_socket *xsk) optlen = sizeof(off); err = getsockopt(xsk->fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen); if (!err) { - if (xsk->rx) - munmap(xsk->rx->ring, - off.rx.desc + - xsk->config.rx_size * sizeof(struct xdp_desc)); - if (xsk->tx) - munmap(xsk->tx->ring, - off.tx.desc + - xsk->config.tx_size * sizeof(struct xdp_desc)); + if (xsk->rx) { + (void)munmap(xsk->rx->ring - off.rx.desc, + off.rx.desc + + xsk->config.rx_size * desc_sz); + } + if (xsk->tx) { + (void)munmap(xsk->tx->ring - off.tx.desc, + off.tx.desc + + xsk->config.tx_size * desc_sz); + } + }
xsk->umem->refcount--;
From: Yonghong Song yhs@fb.com
[ Upstream commit 6cea33701eb024bc6c920ab83940ee22afd29139 ]
Test test_libbpf.sh failed on my development server with failure -bash-4.4$ sudo ./test_libbpf.sh [0] libbpf: Error in bpf_object__probe_name():Operation not permitted(1). Couldn't load basic 'r0 = 0' BPF program. test_libbpf: failed at file test_l4lb.o selftests: test_libbpf [FAILED] -bash-4.4$
The reason is because my machine has 64KB locked memory by default which is not enough for this program to get locked memory. Similar to other bpf selftests, let us increase RLIMIT_MEMLOCK to infinity, which fixed the issue.
Signed-off-by: Yonghong Song yhs@fb.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/bpf/test_libbpf_open.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/bpf/test_libbpf_open.c b/tools/testing/selftests/bpf/test_libbpf_open.c index 65cbd30704b5a..9e9db202d218a 100644 --- a/tools/testing/selftests/bpf/test_libbpf_open.c +++ b/tools/testing/selftests/bpf/test_libbpf_open.c @@ -11,6 +11,8 @@ static const char *__doc__ = #include <bpf/libbpf.h> #include <getopt.h>
+#include "bpf_rlimit.h" + static const struct option long_options[] = { {"help", no_argument, NULL, 'h' }, {"debug", no_argument, NULL, 'D' },
From: Masahiro Yamada yamada.masahiro@socionext.com
[ Upstream commit a7d006714724de4334c5e3548701b33f7b12ca96 ]
tools/bpf/bpftool/.gitignore has the "bpftool" pattern, which is intended to ignore the following build artifact:
tools/bpf/bpftool/bpftool
However, the .gitignore entry is effective not only for the current directory, but also for any sub-directories.
So, from the point of .gitignore grammar, the following check-in file is also considered to be ignored:
tools/bpf/bpftool/bash-completion/bpftool
As the manual gitignore(5) says "Files already tracked by Git are not affected", this is not a problem as far as Git is concerned.
However, Git is not the only program that parses .gitignore because .gitignore is useful to distinguish build artifacts from source files.
For example, tar(1) supports the --exclude-vcs-ignore option. As of writing, this option does not work perfectly, but it intends to create a tarball excluding files specified by .gitignore.
So, I believe it is better to fix this issue.
You can fix it by prefixing the pattern with a slash; the leading slash means the specified pattern is relative to the current directory.
Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Reviewed-by: Quentin Monnet quentin.monnet@netronome.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/bpf/bpftool/.gitignore | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/bpf/bpftool/.gitignore b/tools/bpf/bpftool/.gitignore index 67167e44b7266..8248b8dd89d4b 100644 --- a/tools/bpf/bpftool/.gitignore +++ b/tools/bpf/bpftool/.gitignore @@ -1,5 +1,5 @@ *.d -bpftool +/bpftool bpftool*.8 bpf-helpers.* FEATURE-DUMP.bpftool
From: Tony Nguyen anthony.l.nguyen@intel.com
[ Upstream commit 8f529ff912073f778e3cd74e87fb69a36499fc2f ]
Set features can have multiple features turned on|off in a single call. Grouping these all in an if/else means after one condition is met, other conditions/features will not be evaluated. Break the if/else statements by feature to ensure all features will be handled properly.
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Anirudh Venkataramanan anirudh.venkataramanan@intel.com Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_main.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 47cc3f905b7ff..ac30288720f71 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -2545,6 +2545,9 @@ static int ice_set_features(struct net_device *netdev, struct ice_vsi *vsi = np->vsi; int ret = 0;
+ /* Multiple features can be changed in one call so keep features in + * separate if/else statements to guarantee each feature is checked + */ if (features & NETIF_F_RXHASH && !(netdev->features & NETIF_F_RXHASH)) ret = ice_vsi_manage_rss_lut(vsi, true); else if (!(features & NETIF_F_RXHASH) && @@ -2557,8 +2560,9 @@ static int ice_set_features(struct net_device *netdev, else if (!(features & NETIF_F_HW_VLAN_CTAG_RX) && (netdev->features & NETIF_F_HW_VLAN_CTAG_RX)) ret = ice_vsi_manage_vlan_stripping(vsi, false); - else if ((features & NETIF_F_HW_VLAN_CTAG_TX) && - !(netdev->features & NETIF_F_HW_VLAN_CTAG_TX)) + + if ((features & NETIF_F_HW_VLAN_CTAG_TX) && + !(netdev->features & NETIF_F_HW_VLAN_CTAG_TX)) ret = ice_vsi_manage_vlan_insertion(vsi); else if (!(features & NETIF_F_HW_VLAN_CTAG_TX) && (netdev->features & NETIF_F_HW_VLAN_CTAG_TX))
From: Tony Nguyen anthony.l.nguyen@intel.com
[ Upstream commit e80e76db6c5bbc7a8f8512f3dc630a2170745b0b ]
When Tx insertion is set, we are not accounting for the state of Rx stripping. This causes Rx stripping to be enabled any time Tx insertion is changed, even when it's supposed to be disabled.
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Anirudh Venkataramanan anirudh.venkataramanan@intel.com Tested-by: Andrew Bowers andrewx.bowers@intel.com Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_lib.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index fa61203bee269..b710545cf7d1a 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -1848,6 +1848,10 @@ int ice_vsi_manage_vlan_insertion(struct ice_vsi *vsi) */ ctxt->info.vlan_flags = ICE_AQ_VSI_VLAN_MODE_ALL;
+ /* Preserve existing VLAN strip setting */ + ctxt->info.vlan_flags |= (vsi->info.vlan_flags & + ICE_AQ_VSI_VLAN_EMOD_M); + ctxt->info.valid_sections = cpu_to_le16(ICE_AQ_VSI_PROP_VLAN_VALID);
status = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 7c6c5b7c9186e3fb5b10afb8e5f710ae661144c6 ]
Split blk_mq_alloc_and_init_hctx into two parts, and one is blk_mq_alloc_hctx() for allocating all hctx resources, another is blk_mq_init_hctx() for initializing hctx, which serves as counter-part of blk_mq_exit_hctx().
Cc: Dongli Zhang dongli.zhang@oracle.com Cc: James Smart james.smart@broadcom.com Cc: Bart Van Assche bart.vanassche@wdc.com Cc: linux-scsi@vger.kernel.org Cc: Martin K . Petersen martin.petersen@oracle.com Cc: Christoph Hellwig hch@lst.de Cc: James E . J . Bottomley jejb@linux.vnet.ibm.com Reviewed-by: Hannes Reinecke hare@suse.com Reviewed-by: Christoph Hellwig hch@lst.de Tested-by: James Smart james.smart@broadcom.com Signed-off-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq.c | 139 ++++++++++++++++++++++++++----------------------- 1 file changed, 75 insertions(+), 64 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c index fc60ed7e940ea..24e3ae3bd710e 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2289,15 +2289,65 @@ static void blk_mq_exit_hw_queues(struct request_queue *q, } }
+static int blk_mq_hw_ctx_size(struct blk_mq_tag_set *tag_set) +{ + int hw_ctx_size = sizeof(struct blk_mq_hw_ctx); + + BUILD_BUG_ON(ALIGN(offsetof(struct blk_mq_hw_ctx, srcu), + __alignof__(struct blk_mq_hw_ctx)) != + sizeof(struct blk_mq_hw_ctx)); + + if (tag_set->flags & BLK_MQ_F_BLOCKING) + hw_ctx_size += sizeof(struct srcu_struct); + + return hw_ctx_size; +} + static int blk_mq_init_hctx(struct request_queue *q, struct blk_mq_tag_set *set, struct blk_mq_hw_ctx *hctx, unsigned hctx_idx) { - int node; + hctx->queue_num = hctx_idx; + + cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_DEAD, &hctx->cpuhp_dead); + + hctx->tags = set->tags[hctx_idx]; + + if (set->ops->init_hctx && + set->ops->init_hctx(hctx, set->driver_data, hctx_idx)) + goto unregister_cpu_notifier;
- node = hctx->numa_node; + if (blk_mq_init_request(set, hctx->fq->flush_rq, hctx_idx, + hctx->numa_node)) + goto exit_hctx; + return 0; + + exit_hctx: + if (set->ops->exit_hctx) + set->ops->exit_hctx(hctx, hctx_idx); + unregister_cpu_notifier: + blk_mq_remove_cpuhp(hctx); + return -1; +} + +static struct blk_mq_hw_ctx * +blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set, + int node) +{ + struct blk_mq_hw_ctx *hctx; + gfp_t gfp = GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY; + + hctx = kzalloc_node(blk_mq_hw_ctx_size(set), gfp, node); + if (!hctx) + goto fail_alloc_hctx; + + if (!zalloc_cpumask_var_node(&hctx->cpumask, gfp, node)) + goto free_hctx; + + atomic_set(&hctx->nr_active, 0); if (node == NUMA_NO_NODE) - node = hctx->numa_node = set->numa_node; + node = set->numa_node; + hctx->numa_node = node;
INIT_DELAYED_WORK(&hctx->run_work, blk_mq_run_work_fn); spin_lock_init(&hctx->lock); @@ -2305,58 +2355,45 @@ static int blk_mq_init_hctx(struct request_queue *q, hctx->queue = q; hctx->flags = set->flags & ~BLK_MQ_F_TAG_SHARED;
- cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_DEAD, &hctx->cpuhp_dead); - - hctx->tags = set->tags[hctx_idx]; - /* * Allocate space for all possible cpus to avoid allocation at * runtime */ hctx->ctxs = kmalloc_array_node(nr_cpu_ids, sizeof(void *), - GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY, node); + gfp, node); if (!hctx->ctxs) - goto unregister_cpu_notifier; + goto free_cpumask;
if (sbitmap_init_node(&hctx->ctx_map, nr_cpu_ids, ilog2(8), - GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY, node)) + gfp, node)) goto free_ctxs; - hctx->nr_ctx = 0;
spin_lock_init(&hctx->dispatch_wait_lock); init_waitqueue_func_entry(&hctx->dispatch_wait, blk_mq_dispatch_wake); INIT_LIST_HEAD(&hctx->dispatch_wait.entry);
- if (set->ops->init_hctx && - set->ops->init_hctx(hctx, set->driver_data, hctx_idx)) - goto free_bitmap; - hctx->fq = blk_alloc_flush_queue(q, hctx->numa_node, set->cmd_size, - GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY); + gfp); if (!hctx->fq) - goto exit_hctx; - - if (blk_mq_init_request(set, hctx->fq->flush_rq, hctx_idx, node)) - goto free_fq; + goto free_bitmap;
if (hctx->flags & BLK_MQ_F_BLOCKING) init_srcu_struct(hctx->srcu); + blk_mq_hctx_kobj_init(hctx);
- return 0; + return hctx;
- free_fq: - blk_free_flush_queue(hctx->fq); - exit_hctx: - if (set->ops->exit_hctx) - set->ops->exit_hctx(hctx, hctx_idx); free_bitmap: sbitmap_free(&hctx->ctx_map); free_ctxs: kfree(hctx->ctxs); - unregister_cpu_notifier: - blk_mq_remove_cpuhp(hctx); - return -1; + free_cpumask: + free_cpumask_var(hctx->cpumask); + free_hctx: + kfree(hctx); + fail_alloc_hctx: + return NULL; }
static void blk_mq_init_cpu_queues(struct request_queue *q, @@ -2700,51 +2737,25 @@ struct request_queue *blk_mq_init_sq_queue(struct blk_mq_tag_set *set, } EXPORT_SYMBOL(blk_mq_init_sq_queue);
-static int blk_mq_hw_ctx_size(struct blk_mq_tag_set *tag_set) -{ - int hw_ctx_size = sizeof(struct blk_mq_hw_ctx); - - BUILD_BUG_ON(ALIGN(offsetof(struct blk_mq_hw_ctx, srcu), - __alignof__(struct blk_mq_hw_ctx)) != - sizeof(struct blk_mq_hw_ctx)); - - if (tag_set->flags & BLK_MQ_F_BLOCKING) - hw_ctx_size += sizeof(struct srcu_struct); - - return hw_ctx_size; -} - static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx( struct blk_mq_tag_set *set, struct request_queue *q, int hctx_idx, int node) { struct blk_mq_hw_ctx *hctx;
- hctx = kzalloc_node(blk_mq_hw_ctx_size(set), - GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY, - node); + hctx = blk_mq_alloc_hctx(q, set, node); if (!hctx) - return NULL; - - if (!zalloc_cpumask_var_node(&hctx->cpumask, - GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY, - node)) { - kfree(hctx); - return NULL; - } - - atomic_set(&hctx->nr_active, 0); - hctx->numa_node = node; - hctx->queue_num = hctx_idx; + goto fail;
- if (blk_mq_init_hctx(q, set, hctx, hctx_idx)) { - free_cpumask_var(hctx->cpumask); - kfree(hctx); - return NULL; - } - blk_mq_hctx_kobj_init(hctx); + if (blk_mq_init_hctx(q, set, hctx, hctx_idx)) + goto free_hctx;
return hctx; + + free_hctx: + kobject_put(&hctx->kobj); + fail: + return NULL; }
static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
From: Ming Lei ming.lei@redhat.com
[ Upstream commit e87eb301bee183d82bb3d04bd71b6660889a2588 ]
Just like aio/io_uring, we need to grab 2 refcount for queuing one request, one is for submission, another is for completion.
If the request isn't queued from plug code path, the refcount grabbed in generic_make_request() serves for submission. In theroy, this refcount should have been released after the sumission(async run queue) is done. blk_freeze_queue() works with blk_sync_queue() together for avoiding race between cleanup queue and IO submission, given async run queue activities are canceled because hctx->run_work is scheduled with the refcount held, so it is fine to not hold the refcount when running the run queue work function for dispatch IO.
However, if request is staggered into plug list, and finally queued from plug code path, the refcount in submission side is actually missed. And we may start to run queue after queue is removed because the queue's kobject refcount isn't guaranteed to be grabbed in flushing plug list context, then kernel oops is triggered, see the following race:
blk_mq_flush_plug_list(): blk_mq_sched_insert_requests() insert requests to sw queue or scheduler queue blk_mq_run_hw_queue
Because of concurrent run queue, all requests inserted above may be completed before calling the above blk_mq_run_hw_queue. Then queue can be freed during the above blk_mq_run_hw_queue().
Fixes the issue by grab .q_usage_counter before calling blk_mq_sched_insert_requests() in blk_mq_flush_plug_list(). This way is safe because the queue is absolutely alive before inserting request.
Cc: Dongli Zhang dongli.zhang@oracle.com Cc: James Smart james.smart@broadcom.com Cc: linux-scsi@vger.kernel.org, Cc: Martin K . Petersen martin.petersen@oracle.com, Cc: Christoph Hellwig hch@lst.de, Cc: James E . J . Bottomley jejb@linux.vnet.ibm.com, Reviewed-by: Bart Van Assche bvanassche@acm.org Tested-by: James Smart james.smart@broadcom.com Signed-off-by: Ming Lei ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq-sched.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index aa6bc5c026438..c59babca6857a 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -413,6 +413,14 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, struct list_head *list, bool run_queue_async) { struct elevator_queue *e; + struct request_queue *q = hctx->queue; + + /* + * blk_mq_sched_insert_requests() is called from flush plug + * context only, and hold one usage counter to prevent queue + * from being released. + */ + percpu_ref_get(&q->q_usage_counter);
e = hctx->queue->elevator; if (e && e->type->ops.insert_requests) @@ -426,12 +434,14 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx, if (!hctx->dispatch_busy && !e && !run_queue_async) { blk_mq_try_issue_list_directly(hctx, list); if (list_empty(list)) - return; + goto out; } blk_mq_insert_requests(hctx, ctx, list); }
blk_mq_run_hw_queue(hctx, run_queue_async); + out: + percpu_ref_put(&q->q_usage_counter); }
static void blk_mq_sched_free_tags(struct blk_mq_tag_set *set,
From: Sameer Pujar spujar@nvidia.com
[ Upstream commit f030e419501cb95e961e9ed35c493b5d46a04eca ]
Following kernel panic is seen during DMA driver unload->load sequence ========================================================================== Unable to handle kernel paging request at virtual address ffffff8001198880 Internal error: Oops: 86000007 [#1] PREEMPT SMP CPU: 0 PID: 5907 Comm: HwBinder:4123_1 Tainted: G C 4.9.128-tegra-g065839f Hardware name: galen (DT) task: ffffffc3590d1a80 task.stack: ffffffc3d0678000 PC is at 0xffffff8001198880 LR is at of_dma_request_slave_channel+0xd8/0x1f8 pc : [<ffffff8001198880>] lr : [<ffffff8008746f30>] pstate: 60400045 sp : ffffffc3d067b710 x29: ffffffc3d067b710 x28: 000000000000002f x27: ffffff800949e000 x26: ffffff800949e750 x25: ffffff800949e000 x24: ffffffbefe817d84 x23: ffffff8009f77cb0 x22: 0000000000000028 x21: ffffffc3ffda49c8 x20: 0000000000000029 x19: 0000000000000001 x18: ffffffffffffffff x17: 0000000000000000 x16: ffffff80082b66a0 x15: ffffff8009e78250 x14: 000000000000000a x13: 0000000000000038 x12: 0101010101010101 x11: 0000000000000030 x10: 0101010101010101 x9 : fffffffffffffffc x8 : 7f7f7f7f7f7f7f7f x7 : 62ff726b6b64622c x6 : 0000000000008064 x5 : 6400000000000000 x4 : ffffffbefe817c44 x3 : ffffffc3ffda3e08 x2 : ffffff8001198880 x1 : ffffffc3d48323c0 x0 : ffffffc3d067b788
Process HwBinder:4123_1 (pid: 5907, stack limit = 0xffffffc3d0678028) Call trace: [<ffffff8001198880>] 0xffffff8001198880 [<ffffff80087459f8>] dma_request_chan+0x50/0x1f0 [<ffffff8008745bc0>] dma_request_slave_channel+0x28/0x40 [<ffffff8001552c44>] tegra_alt_pcm_open+0x114/0x170 [<ffffff8008d65fa4>] soc_pcm_open+0x10c/0x878 [<ffffff8008d18618>] snd_pcm_open_substream+0xc0/0x170 [<ffffff8008d1878c>] snd_pcm_open+0xc4/0x240 [<ffffff8008d189e0>] snd_pcm_playback_open+0x58/0x80 [<ffffff8008cfc6d4>] snd_open+0xb4/0x178 [<ffffff8008250628>] chrdev_open+0xb8/0x1d0 [<ffffff8008246fdc>] do_dentry_open+0x214/0x318 [<ffffff80082485d0>] vfs_open+0x58/0x88 [<ffffff800825bce0>] do_last+0x450/0xde0 [<ffffff800825c718>] path_openat+0xa8/0x368 [<ffffff800825dd84>] do_filp_open+0x8c/0x110 [<ffffff8008248a74>] do_sys_open+0x164/0x220 [<ffffff80082b66dc>] compat_SyS_openat+0x3c/0x50 [<ffffff8008083040>] el0_svc_naked+0x34/0x38 ---[ end trace 67e6d544e65b5145 ]--- Kernel panic - not syncing: Fatal exception ==========================================================================
In device probe(), of_dma_controller_register() registers DMA controller. But when driver is removed, this is not freed. During driver reload this results in data abort and kernel panic. Add of_dma_controller_free() in driver remove path to fix the issue.
Fixes: f46b195799b5 ("dmaengine: tegra-adma: Add support for Tegra210 ADMA") Signed-off-by: Sameer Pujar spujar@nvidia.com Reviewed-by: Jon Hunter jonathanh@nvidia.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/tegra210-adma.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/dma/tegra210-adma.c b/drivers/dma/tegra210-adma.c index 5ec0dd97b3971..9aa35a7f13692 100644 --- a/drivers/dma/tegra210-adma.c +++ b/drivers/dma/tegra210-adma.c @@ -787,6 +787,7 @@ static int tegra_adma_remove(struct platform_device *pdev) struct tegra_adma *tdma = platform_get_drvdata(pdev); int i;
+ of_dma_controller_free(pdev->dev.of_node); dma_async_device_unregister(&tdma->dma_dev);
for (i = 0; i < tdma->nr_channels; ++i)
From: Sameeh Jubran sameehj@amazon.com
[ Upstream commit f913308879bc6ae437ce64d878c7b05643ddea44 ]
GCC 8 contains a number of new warnings as well as enhancements to existing checkers. The warning - Wstringop-truncation - warns for calls to bounded string manipulation functions such as strncat, strncpy, and stpncpy that may either truncate the copied string or leave the destination unchanged.
In our case the destination string length (32 bytes) is much shorter than the source string (64 bytes) which causes this warning to show up. In general the destination has to be at least a byte larger than the length of the source string with strncpy for this warning not to showup.
This can be easily fixed by using strlcpy instead which already does the truncation to the string. Documentation for this function can be found here:
https://elixir.bootlin.com/linux/latest/source/lib/string.c#L141
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Sameeh Jubran sameehj@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amazon/ena/ena_netdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index a6eacf2099c30..41c1c9acb3246 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -2292,7 +2292,7 @@ static void ena_config_host_info(struct ena_com_dev *ena_dev, host_info->bdf = (pdev->bus->number << 8) | pdev->devfn; host_info->os_type = ENA_ADMIN_OS_LINUX; host_info->kernel_ver = LINUX_VERSION_CODE; - strncpy(host_info->kernel_ver_str, utsname()->version, + strlcpy(host_info->kernel_ver_str, utsname()->version, sizeof(host_info->kernel_ver_str) - 1); host_info->os_dist = 0; strncpy(host_info->os_dist_str, utsname()->release,
From: Sameeh Jubran sameehj@amazon.com
[ Upstream commit 8ee8ee7fe87bf64738ab4e31be036a7165608b27 ]
In some cases when a queue related allocation fails, successful past allocations are freed but the pointer that pointed to them is not set to NULL. This is a problem for 2 reasons: 1. This is generally a bad practice since this pointer might be accidentally accessed in the future. 2. Future allocations using the same pointer check if the pointer is NULL and fail if it is not.
Fixed this by setting such pointers to NULL in the allocation of queue related objects.
Also refactored the code of ena_setup_tx_resources() to goto-style error handling to avoid code duplication of resource freeing.
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Arthur Kiyanovski akiyano@amazon.com Signed-off-by: Sameeh Jubran sameehj@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amazon/ena/ena_netdev.c | 25 ++++++++++++-------- 1 file changed, 15 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index 41c1c9acb3246..9b03d7e404f83 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -224,28 +224,23 @@ static int ena_setup_tx_resources(struct ena_adapter *adapter, int qid) if (!tx_ring->tx_buffer_info) { tx_ring->tx_buffer_info = vzalloc(size); if (!tx_ring->tx_buffer_info) - return -ENOMEM; + goto err_tx_buffer_info; }
size = sizeof(u16) * tx_ring->ring_size; tx_ring->free_tx_ids = vzalloc_node(size, node); if (!tx_ring->free_tx_ids) { tx_ring->free_tx_ids = vzalloc(size); - if (!tx_ring->free_tx_ids) { - vfree(tx_ring->tx_buffer_info); - return -ENOMEM; - } + if (!tx_ring->free_tx_ids) + goto err_free_tx_ids; }
size = tx_ring->tx_max_header_size; tx_ring->push_buf_intermediate_buf = vzalloc_node(size, node); if (!tx_ring->push_buf_intermediate_buf) { tx_ring->push_buf_intermediate_buf = vzalloc(size); - if (!tx_ring->push_buf_intermediate_buf) { - vfree(tx_ring->tx_buffer_info); - vfree(tx_ring->free_tx_ids); - return -ENOMEM; - } + if (!tx_ring->push_buf_intermediate_buf) + goto err_push_buf_intermediate_buf; }
/* Req id ring for TX out of order completions */ @@ -259,6 +254,15 @@ static int ena_setup_tx_resources(struct ena_adapter *adapter, int qid) tx_ring->next_to_clean = 0; tx_ring->cpu = ena_irq->cpu; return 0; + +err_push_buf_intermediate_buf: + vfree(tx_ring->free_tx_ids); + tx_ring->free_tx_ids = NULL; +err_free_tx_ids: + vfree(tx_ring->tx_buffer_info); + tx_ring->tx_buffer_info = NULL; +err_tx_buffer_info: + return -ENOMEM; }
/* ena_free_tx_resources - Free I/O Tx Resources per Queue @@ -378,6 +382,7 @@ static int ena_setup_rx_resources(struct ena_adapter *adapter, rx_ring->free_rx_ids = vzalloc(size); if (!rx_ring->free_rx_ids) { vfree(rx_ring->rx_buffer_info); + rx_ring->rx_buffer_info = NULL; return -ENOMEM; } }
From: Haiyang Zhang haiyangz@microsoft.com
[ Upstream commit 93aa4792c3908eac87ddd368ee0fe0564148232b ]
When the ring buffer is almost full due to RX completion messages, a TX packet may reach the "low watermark" and cause the queue stopped. If the TX completion arrives earlier than queue stopping, the wakeup may be missed.
This patch moves the check for the last pending packet to cover both EAGAIN and success cases, so the queue will be reliably waked up when necessary.
Reported-and-tested-by: Stephan Klein stephan.klein@wegfinder.at Signed-off-by: Haiyang Zhang haiyangz@microsoft.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/hyperv/netvsc.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index e0dce373cdd9d..3d4a166a49d58 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -875,12 +875,6 @@ static inline int netvsc_send_pkt( } else if (ret == -EAGAIN) { netif_tx_stop_queue(txq); ndev_ctx->eth_stats.stop_queue++; - if (atomic_read(&nvchan->queue_sends) < 1 && - !net_device->tx_disable) { - netif_tx_wake_queue(txq); - ndev_ctx->eth_stats.wake_queue++; - ret = -ENOSPC; - } } else { netdev_err(ndev, "Unable to send packet pages %u len %u, ret %d\n", @@ -888,6 +882,15 @@ static inline int netvsc_send_pkt( ret); }
+ if (netif_tx_queue_stopped(txq) && + atomic_read(&nvchan->queue_sends) < 1 && + !net_device->tx_disable) { + netif_tx_wake_queue(txq); + ndev_ctx->eth_stats.wake_queue++; + if (ret == -EAGAIN) + ret = -ENOSPC; + } + return ret; }
From: Martin Brandenburg martin@omnibond.com
[ Upstream commit 33713cd09ccdc1e01b10d0782ae60200d4989553 ]
Otherwise we race with orangefs_writepage/orangefs_writepages which and does not expect i_size < page_offset.
Fixes xfstests generic/129.
Signed-off-by: Martin Brandenburg martin@omnibond.com Signed-off-by: Mike Marshall hubcap@omnibond.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/orangefs/inode.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c index c3334eca18c7e..3260f757c0803 100644 --- a/fs/orangefs/inode.c +++ b/fs/orangefs/inode.c @@ -172,7 +172,11 @@ static int orangefs_setattr_size(struct inode *inode, struct iattr *iattr) } orig_size = i_size_read(inode);
- truncate_setsize(inode, iattr->ia_size); + /* This is truncate_setsize in a different order. */ + truncate_pagecache(inode, iattr->ia_size); + i_size_write(inode, iattr->ia_size); + if (iattr->ia_size > orig_size) + pagecache_isize_extended(inode, orig_size, iattr->ia_size);
new_op = op_alloc(ORANGEFS_VFS_OP_TRUNCATE); if (!new_op)
I don't think it makes much sense to put this one in stable. Without the rest of the pagecache patches the race doesn't exist.
Martin
On Wed, May 22, 2019 at 03:15:25PM -0400, Sasha Levin wrote:
From: Martin Brandenburg martin@omnibond.com
[ Upstream commit 33713cd09ccdc1e01b10d0782ae60200d4989553 ]
Otherwise we race with orangefs_writepage/orangefs_writepages which and does not expect i_size < page_offset.
Fixes xfstests generic/129.
Signed-off-by: Martin Brandenburg martin@omnibond.com Signed-off-by: Mike Marshall hubcap@omnibond.com Signed-off-by: Sasha Levin sashal@kernel.org
fs/orangefs/inode.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c index c3334eca18c7e..3260f757c0803 100644 --- a/fs/orangefs/inode.c +++ b/fs/orangefs/inode.c @@ -172,7 +172,11 @@ static int orangefs_setattr_size(struct inode *inode, struct iattr *iattr) } orig_size = i_size_read(inode);
- truncate_setsize(inode, iattr->ia_size);
- /* This is truncate_setsize in a different order. */
- truncate_pagecache(inode, iattr->ia_size);
- i_size_write(inode, iattr->ia_size);
- if (iattr->ia_size > orig_size)
pagecache_isize_extended(inode, orig_size, iattr->ia_size);
new_op = op_alloc(ORANGEFS_VFS_OP_TRUNCATE); if (!new_op) -- 2.20.1
On Wed, May 22, 2019 at 05:44:37PM -0400, martin@omnibond.com wrote:
I don't think it makes much sense to put this one in stable. Without the rest of the pagecache patches the race doesn't exist.
Dropped it, thank you!
-- Thanks, Sasha
From: JoÃŖo Paulo Rechi Vita jprvita@gmail.com
[ Upstream commit f80c5dad7b6467b884c445ffea45985793b4b2d0 ]
This commit makes the kernel not send the next queued HCI command until a command complete arrives for the last HCI command sent to the controller. This change avoids a problem with some buggy controllers (seen on two SKUs of QCA9377) that send an extra command complete event for the previous command after the kernel had already sent a new HCI command to the controller.
The problem was reproduced when starting an active scanning procedure, where an extra command complete event arrives for the LE_SET_RANDOM_ADDR command. When this happends the kernel ends up not processing the command complete for the following commmand, LE_SET_SCAN_PARAM, and ultimately behaving as if a passive scanning procedure was being performed, when in fact controller is performing an active scanning procedure. This makes it impossible to discover BLE devices as no device found events are sent to userspace.
This problem is reproducible on 100% of the attempts on the affected controllers. The extra command complete event can be seen at timestamp 27.420131 on the btmon logs bellow.
Bluetooth monitor ver 5.50 = Note: Linux version 5.0.0+ (x86_64) 0.352340 = Note: Bluetooth subsystem version 2.22 0.352343 = New Index: 80:C5:F2:8F:87:84 (Primary,USB,hci0) [hci0] 0.352344 = Open Index: 80:C5:F2:8F:87:84 [hci0] 0.352345 = Index Info: 80:C5:F2:8F:87:84 (Qualcomm) [hci0] 0.352346 @ MGMT Open: bluetoothd (privileged) version 1.14 {0x0001} 0.352347 @ MGMT Open: btmon (privileged) version 1.14 {0x0002} 0.352366 @ MGMT Open: btmgmt (privileged) version 1.14 {0x0003} 27.302164 @ MGMT Command: Start Discovery (0x0023) plen 1 {0x0003} [hci0] 27.302310 Address type: 0x06 LE Public LE Random < HCI Command: LE Set Random Address (0x08|0x0005) plen 6 #1 [hci0] 27.302496 Address: 15:60:F2:91:B2:24 (Non-Resolvable)
HCI Event: Command Complete (0x0e) plen 4 #2 [hci0] 27.419117
LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7 #3 [hci0] 27.419244 Type: Active (0x01) Interval: 11.250 msec (0x0012) Window: 11.250 msec (0x0012) Own address type: Random (0x01) Filter policy: Accept all advertisement (0x00)
HCI Event: Command Complete (0x0e) plen 4 #4 [hci0] 27.420131
LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #5 [hci0] 27.420259 Scanning: Enabled (0x01) Filter duplicates: Enabled (0x01)
HCI Event: Command Complete (0x0e) plen 4 #6 [hci0] 27.420969
LE Set Scan Parameters (0x08|0x000b) ncmd 1 Status: Success (0x00)
HCI Event: Command Complete (0x0e) plen 4 #7 [hci0] 27.421983
LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) @ MGMT Event: Command Complete (0x0001) plen 4 {0x0003} [hci0] 27.422059 Start Discovery (0x0023) plen 1 Status: Success (0x00) Address type: 0x06 LE Public LE Random @ MGMT Event: Discovering (0x0013) plen 2 {0x0003} [hci0] 27.422067 Address type: 0x06 LE Public LE Random Discovery: Enabled (0x01) @ MGMT Event: Discovering (0x0013) plen 2 {0x0002} [hci0] 27.422067 Address type: 0x06 LE Public LE Random Discovery: Enabled (0x01) @ MGMT Event: Discovering (0x0013) plen 2 {0x0001} [hci0] 27.422067 Address type: 0x06 LE Public LE Random Discovery: Enabled (0x01)
Signed-off-by: JoÃŖo Paulo Rechi Vita jprvita@endlessm.com Signed-off-by: Marcel Holtmann marcel@holtmann.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/bluetooth/hci.h | 1 + net/bluetooth/hci_core.c | 5 +++++ net/bluetooth/hci_event.c | 12 ++++++++++++ net/bluetooth/hci_request.c | 5 +++++ net/bluetooth/hci_request.h | 1 + 5 files changed, 24 insertions(+)
diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h index fbba43e9bef5b..9a5330eed7944 100644 --- a/include/net/bluetooth/hci.h +++ b/include/net/bluetooth/hci.h @@ -282,6 +282,7 @@ enum { HCI_FORCE_BREDR_SMP, HCI_FORCE_STATIC_ADDR, HCI_LL_RPA_RESOLUTION, + HCI_CMD_PENDING,
__HCI_NUM_FLAGS, }; diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c index d6b2540ba7f8b..f275c99056507 100644 --- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -4383,6 +4383,9 @@ void hci_req_cmd_complete(struct hci_dev *hdev, u16 opcode, u8 status, return; }
+ /* If we reach this point this event matches the last command sent */ + hci_dev_clear_flag(hdev, HCI_CMD_PENDING); + /* If the command succeeded and there's still more commands in * this request the request is not yet complete. */ @@ -4493,6 +4496,8 @@ static void hci_cmd_work(struct work_struct *work)
hdev->sent_cmd = skb_clone(skb, GFP_KERNEL); if (hdev->sent_cmd) { + if (hci_req_status_pend(hdev)) + hci_dev_set_flag(hdev, HCI_CMD_PENDING); atomic_dec(&hdev->cmd_cnt); hci_send_frame(hdev, skb); if (test_bit(HCI_RESET, &hdev->flags)) diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index 609fd6871c5ad..8b893baf9bbe2 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -3404,6 +3404,12 @@ static void hci_cmd_complete_evt(struct hci_dev *hdev, struct sk_buff *skb, hci_req_cmd_complete(hdev, *opcode, *status, req_complete, req_complete_skb);
+ if (hci_dev_test_flag(hdev, HCI_CMD_PENDING)) { + bt_dev_err(hdev, + "unexpected event for opcode 0x%4.4x", *opcode); + return; + } + if (atomic_read(&hdev->cmd_cnt) && !skb_queue_empty(&hdev->cmd_q)) queue_work(hdev->workqueue, &hdev->cmd_work); } @@ -3511,6 +3517,12 @@ static void hci_cmd_status_evt(struct hci_dev *hdev, struct sk_buff *skb, hci_req_cmd_complete(hdev, *opcode, ev->status, req_complete, req_complete_skb);
+ if (hci_dev_test_flag(hdev, HCI_CMD_PENDING)) { + bt_dev_err(hdev, + "unexpected event for opcode 0x%4.4x", *opcode); + return; + } + if (atomic_read(&hdev->cmd_cnt) && !skb_queue_empty(&hdev->cmd_q)) queue_work(hdev->workqueue, &hdev->cmd_work); } diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c index ca73d36cc1494..e9a95ed654915 100644 --- a/net/bluetooth/hci_request.c +++ b/net/bluetooth/hci_request.c @@ -46,6 +46,11 @@ void hci_req_purge(struct hci_request *req) skb_queue_purge(&req->cmd_q); }
+bool hci_req_status_pend(struct hci_dev *hdev) +{ + return hdev->req_status == HCI_REQ_PEND; +} + static int req_run(struct hci_request *req, hci_req_complete_t complete, hci_req_complete_skb_t complete_skb) { diff --git a/net/bluetooth/hci_request.h b/net/bluetooth/hci_request.h index 692cc8b133682..55b2050cc9ff0 100644 --- a/net/bluetooth/hci_request.h +++ b/net/bluetooth/hci_request.h @@ -37,6 +37,7 @@ struct hci_request {
void hci_req_init(struct hci_request *req, struct hci_dev *hdev); void hci_req_purge(struct hci_request *req); +bool hci_req_status_pend(struct hci_dev *hdev); int hci_req_run(struct hci_request *req, hci_req_complete_t complete); int hci_req_run_skb(struct hci_request *req, hci_req_complete_skb_t complete); void hci_req_add(struct hci_request *req, u16 opcode, u32 plen,
From: Wen Yang wen.yang99@zte.com.cn
[ Upstream commit 02d15f0d80720545f1f4922a1550ea4aaad4e152 ]
The call to of_parse_phandle returns a node pointer with refcount incremented thus it must be explicitly decremented after the last usage.
Detected by coccinelle with the following warnings: ./drivers/pinctrl/zte/pinctrl-zx.c:415:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 407, but without a corresponding object release within this function. ./drivers/pinctrl/zte/pinctrl-zx.c:422:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 407, but without a corresponding object release within this function. ./drivers/pinctrl/zte/pinctrl-zx.c:436:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 407, but without a corresponding object release within this function. ./drivers/pinctrl/zte/pinctrl-zx.c:444:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 407, but without a corresponding object release within this function. ./drivers/pinctrl/zte/pinctrl-zx.c:448:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 407, but without a corresponding object release within this function.
Signed-off-by: Wen Yang wen.yang99@zte.com.cn Cc: Linus Walleij linus.walleij@linaro.org Cc: Jun Nie jun.nie@linaro.org Cc: Linus Walleij linus.walleij@linaro.org Cc: linux-gpio@vger.kernel.org Cc: linux-kernel@vger.kernel.org Acked-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/zte/pinctrl-zx.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/pinctrl/zte/pinctrl-zx.c b/drivers/pinctrl/zte/pinctrl-zx.c index caa44dd2880a8..3cb69309912ba 100644 --- a/drivers/pinctrl/zte/pinctrl-zx.c +++ b/drivers/pinctrl/zte/pinctrl-zx.c @@ -411,6 +411,7 @@ int zx_pinctrl_init(struct platform_device *pdev, }
zpctl->aux_base = of_iomap(np, 0); + of_node_put(np); if (!zpctl->aux_base) return -ENOMEM;
From: Mac Chiang mac.chiang@intel.com
[ Upstream commit 16ec5dfe0327ddcf279957bffe4c8fe527088c63 ]
On kbl_rt5663_max98927, commit 38a5882e4292 ("ASoC: Intel: kbl_rt5663_max98927: Map BTN_0 to KEY_PLAYPAUSE") This key pair mapping to play/pause when playing Youtube
The Android 3.5mm Headset jack specification mentions that BTN_0 should be mapped to KEY_MEDIA, but this is less logical than KEY_PLAYPAUSE, which has much broader userspace support.
For example, the Chrome OS userspace now supports KEY_PLAYPAUSE to toggle play/pause of videos and audio, but does not handle KEY_MEDIA.
Furthermore, Android itself now supports KEY_PLAYPAUSE equivalently, as the new USB headset spec requires KEY_PLAYPAUSE for BTN_0. https://source.android.com/devices/accessories/headset/usb-headset-spec
The same fix is required on Chrome kbl_da7219_max98357a.
Signed-off-by: Mac Chiang mac.chiang@intel.com Reviewed-by: Benson Leung bleung@chromium.org Acked-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/boards/kbl_da7219_max98357a.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/intel/boards/kbl_da7219_max98357a.c b/sound/soc/intel/boards/kbl_da7219_max98357a.c index 38f6ab74709d0..07491a0f8fb8b 100644 --- a/sound/soc/intel/boards/kbl_da7219_max98357a.c +++ b/sound/soc/intel/boards/kbl_da7219_max98357a.c @@ -188,7 +188,7 @@ static int kabylake_da7219_codec_init(struct snd_soc_pcm_runtime *rtd)
jack = &ctx->kabylake_headset;
- snd_jack_set_key(jack->jack, SND_JACK_BTN_0, KEY_MEDIA); + snd_jack_set_key(jack->jack, SND_JACK_BTN_0, KEY_PLAYPAUSE); snd_jack_set_key(jack->jack, SND_JACK_BTN_1, KEY_VOLUMEUP); snd_jack_set_key(jack->jack, SND_JACK_BTN_2, KEY_VOLUMEDOWN); snd_jack_set_key(jack->jack, SND_JACK_BTN_3, KEY_VOICECOMMAND);
From: Minas Harutyunyan minas.harutyunyan@synopsys.com
[ Upstream commit 54f37f56631747075f1f9a2f0edf6ba405e3e66c ]
Some function drivers queueing more than 128 ISOC requests at a time. To avoid "descriptor chain full" cases, increasing descriptors count from MAX_DMA_DESC_NUM_GENERIC to MAX_DMA_DESC_NUM_HS_ISOC for ISOC's only.
Signed-off-by: Minas Harutyunyan hminas@synopsys.com Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/dwc2/gadget.c | 27 +++++++++++++++------------ 1 file changed, 15 insertions(+), 12 deletions(-)
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c index 6812a8a3a98ba..a749de7604c62 100644 --- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -714,13 +714,11 @@ static unsigned int dwc2_gadget_get_chain_limit(struct dwc2_hsotg_ep *hs_ep) unsigned int maxsize;
if (is_isoc) - maxsize = hs_ep->dir_in ? DEV_DMA_ISOC_TX_NBYTES_LIMIT : - DEV_DMA_ISOC_RX_NBYTES_LIMIT; + maxsize = (hs_ep->dir_in ? DEV_DMA_ISOC_TX_NBYTES_LIMIT : + DEV_DMA_ISOC_RX_NBYTES_LIMIT) * + MAX_DMA_DESC_NUM_HS_ISOC; else - maxsize = DEV_DMA_NBYTES_LIMIT; - - /* Above size of one descriptor was chosen, multiple it */ - maxsize *= MAX_DMA_DESC_NUM_GENERIC; + maxsize = DEV_DMA_NBYTES_LIMIT * MAX_DMA_DESC_NUM_GENERIC;
return maxsize; } @@ -932,7 +930,7 @@ static int dwc2_gadget_fill_isoc_desc(struct dwc2_hsotg_ep *hs_ep,
/* Update index of last configured entry in the chain */ hs_ep->next_desc++; - if (hs_ep->next_desc >= MAX_DMA_DESC_NUM_GENERIC) + if (hs_ep->next_desc >= MAX_DMA_DESC_NUM_HS_ISOC) hs_ep->next_desc = 0;
return 0; @@ -964,7 +962,7 @@ static void dwc2_gadget_start_isoc_ddma(struct dwc2_hsotg_ep *hs_ep) }
/* Initialize descriptor chain by Host Busy status */ - for (i = 0; i < MAX_DMA_DESC_NUM_GENERIC; i++) { + for (i = 0; i < MAX_DMA_DESC_NUM_HS_ISOC; i++) { desc = &hs_ep->desc_list[i]; desc->status = 0; desc->status |= (DEV_DMA_BUFF_STS_HBUSY @@ -2162,7 +2160,7 @@ static void dwc2_gadget_complete_isoc_request_ddma(struct dwc2_hsotg_ep *hs_ep) dwc2_hsotg_complete_request(hsotg, hs_ep, hs_req, 0);
hs_ep->compl_desc++; - if (hs_ep->compl_desc > (MAX_DMA_DESC_NUM_GENERIC - 1)) + if (hs_ep->compl_desc > (MAX_DMA_DESC_NUM_HS_ISOC - 1)) hs_ep->compl_desc = 0; desc_sts = hs_ep->desc_list[hs_ep->compl_desc].status; } @@ -3899,6 +3897,7 @@ static int dwc2_hsotg_ep_enable(struct usb_ep *ep, unsigned int i, val, size; int ret = 0; unsigned char ep_type; + int desc_num;
dev_dbg(hsotg->dev, "%s: ep %s: a 0x%02x, attr 0x%02x, mps 0x%04x, intr %d\n", @@ -3945,11 +3944,15 @@ static int dwc2_hsotg_ep_enable(struct usb_ep *ep, dev_dbg(hsotg->dev, "%s: read DxEPCTL=0x%08x from 0x%08x\n", __func__, epctrl, epctrl_reg);
+ if (using_desc_dma(hsotg) && ep_type == USB_ENDPOINT_XFER_ISOC) + desc_num = MAX_DMA_DESC_NUM_HS_ISOC; + else + desc_num = MAX_DMA_DESC_NUM_GENERIC; + /* Allocate DMA descriptor chain for non-ctrl endpoints */ if (using_desc_dma(hsotg) && !hs_ep->desc_list) { hs_ep->desc_list = dmam_alloc_coherent(hsotg->dev, - MAX_DMA_DESC_NUM_GENERIC * - sizeof(struct dwc2_dma_desc), + desc_num * sizeof(struct dwc2_dma_desc), &hs_ep->desc_list_dma, GFP_ATOMIC); if (!hs_ep->desc_list) { ret = -ENOMEM; @@ -4092,7 +4095,7 @@ static int dwc2_hsotg_ep_enable(struct usb_ep *ep,
error2: if (ret && using_desc_dma(hsotg) && hs_ep->desc_list) { - dmam_free_coherent(hsotg->dev, MAX_DMA_DESC_NUM_GENERIC * + dmam_free_coherent(hsotg->dev, desc_num * sizeof(struct dwc2_dma_desc), hs_ep->desc_list, hs_ep->desc_list_dma); hs_ep->desc_list = NULL;
From: Marek Szyprowski m.szyprowski@samsung.com
[ Upstream commit 41a91c606e7d2b74358a944525267cc451c271e8 ]
dwc3_gadget_suspend() is called under dwc->lock spinlock. In such context calling synchronize_irq() is not allowed. Move the problematic call out of the protected block to fix the following kernel BUG during system suspend:
BUG: sleeping function called from invalid context at kernel/irq/manage.c:112 in_atomic(): 1, irqs_disabled(): 128, pid: 1601, name: rtcwake 6 locks held by rtcwake/1601: #0: f70ac2a2 (sb_writers#7){.+.+}, at: vfs_write+0x130/0x16c #1: b5fe1270 (&of->mutex){+.+.}, at: kernfs_fop_write+0xc0/0x1e4 #2: 7e597705 (kn->count#60){.+.+}, at: kernfs_fop_write+0xc8/0x1e4 #3: 8b3527d0 (system_transition_mutex){+.+.}, at: pm_suspend+0xc4/0xc04 #4: fc7f1c42 (&dev->mutex){....}, at: __device_suspend+0xd8/0x74c #5: 4b36507e (&(&dwc->lock)->rlock){....}, at: dwc3_gadget_suspend+0x24/0x3c irq event stamp: 11252 hardirqs last enabled at (11251): [<c09c54a4>] _raw_spin_unlock_irqrestore+0x6c/0x74 hardirqs last disabled at (11252): [<c09c4d44>] _raw_spin_lock_irqsave+0x1c/0x5c softirqs last enabled at (9744): [<c0102564>] __do_softirq+0x3a4/0x66c softirqs last disabled at (9737): [<c0128528>] irq_exit+0x140/0x168 Preemption disabled at: [<00000000>] (null) CPU: 7 PID: 1601 Comm: rtcwake Not tainted 5.0.0-rc3-next-20190122-00039-ga3f4ee4f8a52 #5252 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c01110f0>] (unwind_backtrace) from [<c010d120>] (show_stack+0x10/0x14) [<c010d120>] (show_stack) from [<c09a4d04>] (dump_stack+0x90/0xc8) [<c09a4d04>] (dump_stack) from [<c014c700>] (___might_sleep+0x22c/0x2c8) [<c014c700>] (___might_sleep) from [<c0189d68>] (synchronize_irq+0x28/0x84) [<c0189d68>] (synchronize_irq) from [<c05cbbf8>] (dwc3_gadget_suspend+0x34/0x3c) [<c05cbbf8>] (dwc3_gadget_suspend) from [<c05bd020>] (dwc3_suspend_common+0x154/0x410) [<c05bd020>] (dwc3_suspend_common) from [<c05bd34c>] (dwc3_suspend+0x14/0x2c) [<c05bd34c>] (dwc3_suspend) from [<c051c730>] (platform_pm_suspend+0x2c/0x54) [<c051c730>] (platform_pm_suspend) from [<c05285d4>] (dpm_run_callback+0xa4/0x3dc) [<c05285d4>] (dpm_run_callback) from [<c0528a40>] (__device_suspend+0x134/0x74c) [<c0528a40>] (__device_suspend) from [<c052c508>] (dpm_suspend+0x174/0x588) [<c052c508>] (dpm_suspend) from [<c0182134>] (suspend_devices_and_enter+0xc0/0xe74) [<c0182134>] (suspend_devices_and_enter) from [<c0183658>] (pm_suspend+0x770/0xc04) [<c0183658>] (pm_suspend) from [<c0180ddc>] (state_store+0x6c/0xcc) [<c0180ddc>] (state_store) from [<c09a9a70>] (kobj_attr_store+0x14/0x20) [<c09a9a70>] (kobj_attr_store) from [<c02d6800>] (sysfs_kf_write+0x4c/0x50) [<c02d6800>] (sysfs_kf_write) from [<c02d594c>] (kernfs_fop_write+0xfc/0x1e4) [<c02d594c>] (kernfs_fop_write) from [<c02593d8>] (__vfs_write+0x2c/0x160) [<c02593d8>] (__vfs_write) from [<c0259694>] (vfs_write+0xa4/0x16c) [<c0259694>] (vfs_write) from [<c0259870>] (ksys_write+0x40/0x8c) [<c0259870>] (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28) Exception stack(0xed55ffa8 to 0xed55fff0) ...
Fixes: 01c10880d242 ("usb: dwc3: gadget: synchronize_irq dwc irq in suspend") Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/dwc3/core.c | 2 ++ drivers/usb/dwc3/gadget.c | 2 -- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index f944cea4056bc..72110a8c49d68 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -1600,6 +1600,7 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg) spin_lock_irqsave(&dwc->lock, flags); dwc3_gadget_suspend(dwc); spin_unlock_irqrestore(&dwc->lock, flags); + synchronize_irq(dwc->irq_gadget); dwc3_core_exit(dwc); break; case DWC3_GCTL_PRTCAP_HOST: @@ -1632,6 +1633,7 @@ static int dwc3_suspend_common(struct dwc3 *dwc, pm_message_t msg) spin_lock_irqsave(&dwc->lock, flags); dwc3_gadget_suspend(dwc); spin_unlock_irqrestore(&dwc->lock, flags); + synchronize_irq(dwc->irq_gadget); }
dwc3_otg_exit(dwc); diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index e293400cc6e95..2bb0ff9608d30 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -3384,8 +3384,6 @@ int dwc3_gadget_suspend(struct dwc3 *dwc) dwc3_disconnect_gadget(dwc); __dwc3_gadget_stop(dwc);
- synchronize_irq(dwc->irq_gadget); - return 0; }
From: Fei Yang fei.yang@intel.com
[ Upstream commit 73103c7f958b99561555c3bd1bc1a0809e0b7d61 ]
The following kernel panic happens due to the io_data buffer gets deallocated before the async io is completed. Add a check for the case where io_data buffer should be deallocated by ffs_user_copy_worker.
[ 41.663334] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048 [ 41.672099] #PF error: [normal kernel read fault] [ 41.677356] PGD 20c974067 P4D 20c974067 PUD 20c973067 PMD 0 [ 41.683687] Oops: 0000 [#1] PREEMPT SMP [ 41.687976] CPU: 1 PID: 7 Comm: kworker/u8:0 Tainted: G U 5.0.0-quilt-2e5dc0ac-00790-gd8c79f2-dirty #2 [ 41.705309] Workqueue: adb ffs_user_copy_worker [ 41.705316] RIP: 0010:__vunmap+0x2a/0xc0 [ 41.705318] Code: 0f 1f 44 00 00 48 85 ff 0f 84 87 00 00 00 55 f7 c7 ff 0f 00 00 48 89 e5 41 55 41 89 f5 41 54 53 48 89 fb 75 71 e8 56 d7 ff ff <4c> 8b 60 48 4d 85 e4 74 76 48 89 df e8 25 ff ff ff 45 85 ed 74 46 [ 41.705320] RSP: 0018:ffffbc3a40053df0 EFLAGS: 00010286 [ 41.705322] RAX: 0000000000000000 RBX: ffffbc3a406f1000 RCX: 0000000000000000 [ 41.705323] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 00000000ffffffff [ 41.705324] RBP: ffffbc3a40053e08 R08: 000000000001fb79 R09: 0000000000000037 [ 41.705325] R10: ffffbc3a40053b68 R11: ffffbc3a40053cad R12: fffffffffffffff2 [ 41.705326] R13: 0000000000000001 R14: 0000000000000000 R15: ffffffffffffffff [ 41.705328] FS: 0000000000000000(0000) GS:ffff9e2977a80000(0000) knlGS:0000000000000000 [ 41.705329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 41.705330] CR2: 0000000000000048 CR3: 000000020c994000 CR4: 00000000003406e0 [ 41.705331] Call Trace: [ 41.705338] vfree+0x50/0xb0 [ 41.705341] ffs_user_copy_worker+0xe9/0x1c0 [ 41.705344] process_one_work+0x19f/0x3e0 [ 41.705348] worker_thread+0x3f/0x3b0 [ 41.829766] kthread+0x12b/0x150 [ 41.833371] ? process_one_work+0x3e0/0x3e0 [ 41.838045] ? kthread_create_worker_on_cpu+0x70/0x70 [ 41.843695] ret_from_fork+0x3a/0x50 [ 41.847689] Modules linked in: hci_uart bluetooth ecdh_generic rfkill_gpio dwc3_pci dwc3 snd_usb_audio mei_me tpm_crb snd_usbmidi_lib xhci_pci xhci_hcd mei tpm snd_hwdep cfg80211 snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_hda_core videobuf2_dma_sg crlmodule [ 41.876880] CR2: 0000000000000048 [ 41.880584] ---[ end trace 2bc4addff0f2e673 ]--- [ 41.891346] RIP: 0010:__vunmap+0x2a/0xc0 [ 41.895734] Code: 0f 1f 44 00 00 48 85 ff 0f 84 87 00 00 00 55 f7 c7 ff 0f 00 00 48 89 e5 41 55 41 89 f5 41 54 53 48 89 fb 75 71 e8 56 d7 ff ff <4c> 8b 60 48 4d 85 e4 74 76 48 89 df e8 25 ff ff ff 45 85 ed 74 46 [ 41.916740] RSP: 0018:ffffbc3a40053df0 EFLAGS: 00010286 [ 41.922583] RAX: 0000000000000000 RBX: ffffbc3a406f1000 RCX: 0000000000000000 [ 41.930563] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 00000000ffffffff [ 41.938540] RBP: ffffbc3a40053e08 R08: 000000000001fb79 R09: 0000000000000037 [ 41.946520] R10: ffffbc3a40053b68 R11: ffffbc3a40053cad R12: fffffffffffffff2 [ 41.954502] R13: 0000000000000001 R14: 0000000000000000 R15: ffffffffffffffff [ 41.962482] FS: 0000000000000000(0000) GS:ffff9e2977a80000(0000) knlGS:0000000000000000 [ 41.971536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 41.977960] CR2: 0000000000000048 CR3: 000000020c994000 CR4: 00000000003406e0 [ 41.985930] Kernel panic - not syncing: Fatal exception [ 41.991817] Kernel Offset: 0x16000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 42.009525] Rebooting in 10 seconds.. [ 52.014376] ACPI MEMORY or I/O RESET_REG.
Fixes: 772a7a724f69 ("usb: gadget: f_fs: Allow scatter-gather buffers") Signed-off-by: Fei Yang fei.yang@intel.com Reviewed-by: Manu Gautam mgautam@codeaurora.org Tested-by: John Stultz john.stultz@linaro.org Signed-off-by: Felipe Balbi felipe.balbi@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/function/f_fs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index 20413c276c616..47be961f1bf3f 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -1133,7 +1133,8 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data) error_mutex: mutex_unlock(&epfile->mutex); error: - ffs_free_buffer(io_data); + if (ret != -EIOCBQUEUED) /* don't free if there is iocb queued */ + ffs_free_buffer(io_data); return ret; }
From: Jerome Brunet jbrunet@baylibre.com
[ Upstream commit 30180e8436046344b12813dc954b2e01dfdcd22d ]
If the hdmi codec startup fails, it should clear the current_substream pointer to free the device. This is properly done for the audio_startup() callback but for snd_pcm_hw_constraint_eld().
Make sure the pointer cleared if an error is reported.
Signed-off-by: Jerome Brunet jbrunet@baylibre.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/hdmi-codec.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/sound/soc/codecs/hdmi-codec.c b/sound/soc/codecs/hdmi-codec.c index 35df73e42cbc5..fb2f0ac1f16f3 100644 --- a/sound/soc/codecs/hdmi-codec.c +++ b/sound/soc/codecs/hdmi-codec.c @@ -439,8 +439,12 @@ static int hdmi_codec_startup(struct snd_pcm_substream *substream, if (!ret) { ret = snd_pcm_hw_constraint_eld(substream->runtime, hcp->eld); - if (ret) + if (ret) { + mutex_lock(&hcp->current_stream_lock); + hcp->current_stream = NULL; + mutex_unlock(&hcp->current_stream_lock); return ret; + } } /* Select chmap supported */ hdmi_codec_eld_chmap(hcp);
From: Pavel Machek pavel@ucw.cz
[ Upstream commit 0db37915d912e8dc6588f25da76d3ed36718d92f ]
There are races between "main" thread and workqueue. They manifest themselves on Thinkpad X60:
This should result in LED blinking, but it turns it off instead:
root@amd:/data/pavel# cd /sys/class/leds/tpacpi::power root@amd:/sys/class/leds/tpacpi::power# echo timer > trigger root@amd:/sys/class/leds/tpacpi::power# echo timer > trigger
It should be possible to transition from blinking to solid on by echo 0 > brightness; echo 1 > brightness... but that does not work, either, if done too quickly.
Synchronization of the workqueue fixes both.
Fixes: 1afcadfcd184 ("leds: core: Use set_brightness_work for the blocking op") Signed-off-by: Pavel Machek pavel@ucw.cz Signed-off-by: Jacek Anaszewski jacek.anaszewski@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/leds/led-class.c | 1 + drivers/leds/led-core.c | 5 +++++ 2 files changed, 6 insertions(+)
diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c index 3c7e3487b373b..85848c5da705f 100644 --- a/drivers/leds/led-class.c +++ b/drivers/leds/led-class.c @@ -57,6 +57,7 @@ static ssize_t brightness_store(struct device *dev, if (state == LED_OFF) led_trigger_remove(led_cdev); led_set_brightness(led_cdev, state); + flush_work(&led_cdev->set_brightness_work);
ret = size; unlock: diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c index e3da7c03da1b5..e9ae7f87ab900 100644 --- a/drivers/leds/led-core.c +++ b/drivers/leds/led-core.c @@ -164,6 +164,11 @@ static void led_blink_setup(struct led_classdev *led_cdev, unsigned long *delay_on, unsigned long *delay_off) { + /* + * If "set brightness to 0" is pending in workqueue, we don't + * want that to be reordered after blink_set() + */ + flush_work(&led_cdev->set_brightness_work); if (!test_bit(LED_BLINK_ONESHOT, &led_cdev->work_flags) && led_cdev->blink_set && !led_cdev->blink_set(led_cdev, delay_on, delay_off))
Hi!
Could we hold this patch for now?
From: Pavel Machek pavel@ucw.cz
[ Upstream commit 0db37915d912e8dc6588f25da76d3ed36718d92f ]
There are races between "main" thread and workqueue. They manifest themselves on Thinkpad X60:
This should result in LED blinking, but it turns it off instead:
root@amd:/data/pavel# cd /sys/class/leds/tpacpi\:\:power root@amd:/sys/class/leds/tpacpi::power# echo timer > trigger root@amd:/sys/class/leds/tpacpi::power# echo timer > trigger
It should be possible to transition from blinking to solid on by echo 0 > brightness; echo 1 > brightness... but that does not work, either, if done too quickly.
Synchronization of the workqueue fixes both.
Fixes: 1afcadfcd184 ("leds: core: Use set_brightness_work for the blocking op") Signed-off-by: Pavel Machek pavel@ucw.cz Signed-off-by: Jacek Anaszewski jacek.anaszewski@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org
drivers/leds/led-class.c | 1 + drivers/leds/led-core.c | 5 +++++ 2 files changed, 6 insertions(+)
index e3da7c03da1b5..e9ae7f87ab900 100644 --- a/drivers/leds/led-core.c +++ b/drivers/leds/led-core.c @@ -164,6 +164,11 @@ static void led_blink_setup(struct led_classdev *led_cdev, unsigned long *delay_on, unsigned long *delay_off) {
- /*
* If "set brightness to 0" is pending in workqueue, we don't
* want that to be reordered after blink_set()
*/
- flush_work(&led_cdev->set_brightness_work); if (!test_bit(LED_BLINK_ONESHOT, &led_cdev->work_flags) && led_cdev->blink_set && !led_cdev->blink_set(led_cdev, delay_on, delay_off))
This part is likely buggy. It seems triggers are using this from atomic context... ledtrig-disk for example.
Pavel
Hi!
Could we hold this patch for now?
Sure, dropped. Thanks!
So... fix for this now should be in mainline. But as original problem is not too severe, I don't think we need to apply this or the fixed version to stable.
Best regards, Pavel
From: Anju T Sudhakar anju@linux.vnet.ibm.com
[ Upstream commit a913e5e8b43be1d3897a141ce61c1ec071cad89c ]
Nest hardware counter memory resides in a per-chip reserve-memory. During nest_imc_event_init(), chip-id of the event-cpu is considered to calculate the base memory addresss for that cpu. Return, proper error condition if the chip_id calculated is invalid.
Reported-by: Dan Carpenter dan.carpenter@oracle.com Fixes: 885dcd709ba91 ("powerpc/perf: Add nest IMC PMU support") Reviewed-by: Madhavan Srinivasan maddy@linux.vnet.ibm.com Signed-off-by: Anju T Sudhakar anju@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/perf/imc-pmu.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c index b1c37cc3fa98b..6159e9edddfd0 100644 --- a/arch/powerpc/perf/imc-pmu.c +++ b/arch/powerpc/perf/imc-pmu.c @@ -487,6 +487,11 @@ static int nest_imc_event_init(struct perf_event *event) * Get the base memory addresss for this cpu. */ chip_id = cpu_to_chip_id(event->cpu); + + /* Return, if chip_id is not valid */ + if (chip_id < 0) + return -ENODEV; + pcni = pmu->mem_info; do { if (pcni->id == chip_id) {
From: Bo YU tsu.yubo@gmail.com
[ Upstream commit 5d085ec04a000fefb5182d3b03ee46ca96d8389b ]
This is detected by Coverity scan: CID: 1440481
Signed-off-by: Bo YU tsu.yubo@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/boot/addnote.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/boot/addnote.c b/arch/powerpc/boot/addnote.c index 9d9f6f334d3cc..3da3e2b1b51bc 100644 --- a/arch/powerpc/boot/addnote.c +++ b/arch/powerpc/boot/addnote.c @@ -223,7 +223,11 @@ main(int ac, char **av) PUT_16(E_PHNUM, np + 2);
/* write back */ - lseek(fd, (long) 0, SEEK_SET); + i = lseek(fd, (long) 0, SEEK_SET); + if (i < 0) { + perror("lseek"); + exit(1); + } i = write(fd, buf, n); if (i < 0) { perror("write");
From: Anju T Sudhakar anju@linux.vnet.ibm.com
[ Upstream commit 860b7d2286236170a36f94946d03ca9888d32571 ]
The data structure (i.e struct imc_mem_info) to hold the memory address information for nest imc units is allocated based on the number of nodes in the system.
nest_imc_event_init() traverse this struct array to calculate the memory base address for the event-cpu. If we fail to find a match for the event cpu's chip-id in imc_mem_info struct array, then the do-while loop will iterate until we crash.
Fix this by changing the loop exit condition based on the number of non zero vbase elements in the array, since the allocation is done for nr_chips + 1.
Reported-by: Dan Carpenter dan.carpenter@oracle.com Fixes: 885dcd709ba91 ("powerpc/perf: Add nest IMC PMU support") Signed-off-by: Anju T Sudhakar anju@linux.vnet.ibm.com Reviewed-by: Madhavan Srinivasan maddy@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/perf/imc-pmu.c | 2 +- arch/powerpc/platforms/powernv/opal-imc.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c index 6159e9edddfd0..2d12f0037e3a5 100644 --- a/arch/powerpc/perf/imc-pmu.c +++ b/arch/powerpc/perf/imc-pmu.c @@ -499,7 +499,7 @@ static int nest_imc_event_init(struct perf_event *event) break; } pcni++; - } while (pcni); + } while (pcni->vbase != 0);
if (!flag) return -ENODEV; diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c index 58a07948c76e7..3d27f02695e41 100644 --- a/arch/powerpc/platforms/powernv/opal-imc.c +++ b/arch/powerpc/platforms/powernv/opal-imc.c @@ -127,7 +127,7 @@ static int imc_get_mem_addr_nest(struct device_node *node, nr_chips)) goto error;
- pmu_ptr->mem_info = kcalloc(nr_chips, sizeof(*pmu_ptr->mem_info), + pmu_ptr->mem_info = kcalloc(nr_chips + 1, sizeof(*pmu_ptr->mem_info), GFP_KERNEL); if (!pmu_ptr->mem_info) goto error;
From: Claudiu Beznea claudiu.beznea@microchip.com
[ Upstream commit e5c27498a0403b270620b1a8a0a66e3efc222fb6 ]
atmel_qspi objects are kept in spi_controller objects, so, first get pointer to spi_controller object and then get atmel_qspi object from spi_controller object.
Fixes: 2d30ac5ed633 ("mtd: spi-nor: atmel-quadspi: Use spi-mem interface for atmel-quadspi driver") Signed-off-by: Claudiu Beznea claudiu.beznea@microchip.com Reviewed-by: Tudor Ambarus tudor.ambarus@microchip.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/atmel-quadspi.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/atmel-quadspi.c b/drivers/spi/atmel-quadspi.c index fffc21cd5f793..b3173ebddaded 100644 --- a/drivers/spi/atmel-quadspi.c +++ b/drivers/spi/atmel-quadspi.c @@ -570,7 +570,8 @@ static int atmel_qspi_remove(struct platform_device *pdev)
static int __maybe_unused atmel_qspi_suspend(struct device *dev) { - struct atmel_qspi *aq = dev_get_drvdata(dev); + struct spi_controller *ctrl = dev_get_drvdata(dev); + struct atmel_qspi *aq = spi_controller_get_devdata(ctrl);
clk_disable_unprepare(aq->qspick); clk_disable_unprepare(aq->pclk); @@ -580,7 +581,8 @@ static int __maybe_unused atmel_qspi_suspend(struct device *dev)
static int __maybe_unused atmel_qspi_resume(struct device *dev) { - struct atmel_qspi *aq = dev_get_drvdata(dev); + struct spi_controller *ctrl = dev_get_drvdata(dev); + struct atmel_qspi *aq = spi_controller_get_devdata(ctrl);
clk_prepare_enable(aq->pclk); clk_prepare_enable(aq->qspick);
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit ea751227c813ab833609afecfeedaf0aa26f327e ]
During randconfig builds, I occasionally run into an invalid configuration of the freescale FIQ sound support:
WARNING: unmet direct dependencies detected for SND_SOC_IMX_PCM_FIQ Depends on [m]: SOUND [=y] && !UML && SND [=y] && SND_SOC [=y] && SND_IMX_SOC [=m] Selected by [y]: - SND_SOC_FSL_SPDIF [=y] && SOUND [=y] && !UML && SND [=y] && SND_SOC [=y] && SND_IMX_SOC [=m]!=n && (MXC_TZIC [=n] || MXC_AVIC [=y])
sound/soc/fsl/imx-ssi.o: In function `imx_ssi_remove': imx-ssi.c:(.text+0x28): undefined reference to `imx_pcm_fiq_exit' sound/soc/fsl/imx-ssi.o: In function `imx_ssi_probe': imx-ssi.c:(.text+0xa64): undefined reference to `imx_pcm_fiq_init'
The Kconfig warning is a result of the symbol being defined inside of the "if SND_IMX_SOC" block, and is otherwise harmless. The link error is more tricky and happens with SND_SOC_IMX_SSI=y, which may or may not imply FIQ support. However, if SND_SOC_FSL_SSI is set to =m at the same time, that selects SND_SOC_IMX_PCM_FIQ as a loadable module dependency, which then causes a link failure from imx-ssi.
The solution here is to make SND_SOC_IMX_PCM_FIQ built-in whenever one of its potential users is built-in.
Fixes: ff40260f79dc ("ASoC: fsl: refine DMA/FIQ dependencies") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/fsl/Kconfig | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig index 7b1d9970be8b3..1f65cf555ebe0 100644 --- a/sound/soc/fsl/Kconfig +++ b/sound/soc/fsl/Kconfig @@ -182,16 +182,17 @@ config SND_MPC52xx_SOC_EFIKA
endif # SND_POWERPC_SOC
+config SND_SOC_IMX_PCM_FIQ + tristate + default y if SND_SOC_IMX_SSI=y && (SND_SOC_FSL_SSI=m || SND_SOC_FSL_SPDIF=m) && (MXC_TZIC || MXC_AVIC) + select FIQ + if SND_IMX_SOC
config SND_SOC_IMX_SSI tristate select SND_SOC_FSL_UTILS
-config SND_SOC_IMX_PCM_FIQ - tristate - select FIQ - comment "SoC Audio support for Freescale i.MX boards:"
config SND_MXC_SOC_WM1133_EV1
From: Flavio Suligoi f.suligoi@asem.it
[ Upstream commit 29f2133717c527f492933b0622a4aafe0b3cbe9e ]
Calculate the divisor for the SCR (Serial Clock Rate), avoiding that the SSP transmission rate can be greater than the device rate.
When the division between the SSP clock and the device rate generates a reminder, we have to increment by one the divisor. In this way the resulting SSP clock will never be greater than the device SPI max frequency.
For example, with:
- ssp_clk = 50 MHz - dev freq = 15 MHz
without this patch the SSP clock will be greater than 15 MHz:
- 25 MHz for PXA25x_SSP and CE4100_SSP - 16,56 MHz for the others
Instead, with this patch, we have in both case an SSP clock of 12.5MHz, so the max rate of the SPI device clock is respected.
Signed-off-by: Flavio Suligoi f.suligoi@asem.it Reviewed-by: Jarkko Nikula jarkko.nikula@linux.intel.com Reviewed-by: Jarkko Nikula jarkko.nikula@linux.intel.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-pxa2xx.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/spi-pxa2xx.c b/drivers/spi/spi-pxa2xx.c index b6ddba833d021..d2076f2f468f0 100644 --- a/drivers/spi/spi-pxa2xx.c +++ b/drivers/spi/spi-pxa2xx.c @@ -884,10 +884,14 @@ static unsigned int ssp_get_clk_div(struct driver_data *drv_data, int rate)
rate = min_t(int, ssp_clk, rate);
+ /* + * Calculate the divisor for the SCR (Serial Clock Rate), avoiding + * that the SSP transmission rate can be greater than the device rate + */ if (ssp->type == PXA25x_SSP || ssp->type == CE4100_SSP) - return (ssp_clk / (2 * rate) - 1) & 0xff; + return (DIV_ROUND_UP(ssp_clk, 2 * rate) - 1) & 0xff; else - return (ssp_clk / rate - 1) & 0xfff; + return (DIV_ROUND_UP(ssp_clk, rate) - 1) & 0xfff; }
static unsigned int pxa2xx_ssp_get_clk_div(struct driver_data *drv_data,
From: Bodong Wang bodong@mellanox.com
[ Upstream commit 6f4e02193c9a9ea54dd3151cf97489fa787cd0e6 ]
When the state of rep was introduced, it was also designed to prevent duplicate unloading of the same rep. Considering the following two flows when an eswitch manager is at switchdev mode with n VF reps loaded.
+--------------------------------------+--------------------------------+ | cpu-0 | cpu-1 | | -------- | -------- | | mlx5_ib_remove | mlx5_eswitch_disable_sriov | | mlx5_ib_unregister_vport_reps | esw_offloads_cleanup | | mlx5_eswitch_unregister_vport_reps | esw_offloads_unload_all_reps | | __unload_reps_all_vport | __unload_reps_all_vport | +--------------------------------------+--------------------------------+
These two flows will try to unload the same rep. Per original design, once one flow unloads the rep, the state moves to REGISTERED. The 2nd flow will no longer needs to do the unload and bails out. However, as read and write of the state is not atomic, when 1st flow is doing the unload, the state is still LOADED, 2nd flow is able to do the same unload action. Kernel crash will happen.
To solve this, driver should do atomic test-and-set for the state. So that only one flow can change the rep state from LOADED to REGISTERED, and proceed to do the actual unloading.
Since the state is changing to atomic type, all other read/write should be atomic action as well.
Fixes: f121e0ea9586 (net/mlx5: E-Switch, Add state to eswitch vport representors) Signed-off-by: Bodong Wang bodong@mellanox.com Reviewed-by: Parav Pandit parav@mellanox.com Reviewed-by: Vu Pham vuhuong@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../mellanox/mlx5/core/eswitch_offloads.c | 36 +++++++++---------- include/linux/mlx5/eswitch.h | 2 +- 2 files changed, 18 insertions(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 9b2d78ee22b88..d2d8da133082c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -363,7 +363,7 @@ static int esw_set_global_vlan_pop(struct mlx5_eswitch *esw, u8 val) esw_debug(esw->dev, "%s applying global %s policy\n", __func__, val ? "pop" : "none"); for (vf_vport = 1; vf_vport < esw->enabled_vports; vf_vport++) { rep = &esw->offloads.vport_reps[vf_vport]; - if (rep->rep_if[REP_ETH].state != REP_LOADED) + if (atomic_read(&rep->rep_if[REP_ETH].state) != REP_LOADED) continue;
err = __mlx5_eswitch_set_vport_vlan(esw, rep->vport, 0, 0, val); @@ -1306,7 +1306,8 @@ int esw_offloads_init_reps(struct mlx5_eswitch *esw) ether_addr_copy(rep->hw_id, hw_id);
for (rep_type = 0; rep_type < NUM_REP_TYPES; rep_type++) - rep->rep_if[rep_type].state = REP_UNREGISTERED; + atomic_set(&rep->rep_if[rep_type].state, + REP_UNREGISTERED); }
return 0; @@ -1315,11 +1316,9 @@ int esw_offloads_init_reps(struct mlx5_eswitch *esw) static void __esw_offloads_unload_rep(struct mlx5_eswitch *esw, struct mlx5_eswitch_rep *rep, u8 rep_type) { - if (rep->rep_if[rep_type].state != REP_LOADED) - return; - - rep->rep_if[rep_type].unload(rep); - rep->rep_if[rep_type].state = REP_REGISTERED; + if (atomic_cmpxchg(&rep->rep_if[rep_type].state, + REP_LOADED, REP_REGISTERED) == REP_LOADED) + rep->rep_if[rep_type].unload(rep); }
static void __unload_reps_special_vport(struct mlx5_eswitch *esw, u8 rep_type) @@ -1380,16 +1379,15 @@ static int __esw_offloads_load_rep(struct mlx5_eswitch *esw, { int err = 0;
- if (rep->rep_if[rep_type].state != REP_REGISTERED) - return 0; - - err = rep->rep_if[rep_type].load(esw->dev, rep); - if (err) - return err; - - rep->rep_if[rep_type].state = REP_LOADED; + if (atomic_cmpxchg(&rep->rep_if[rep_type].state, + REP_REGISTERED, REP_LOADED) == REP_REGISTERED) { + err = rep->rep_if[rep_type].load(esw->dev, rep); + if (err) + atomic_set(&rep->rep_if[rep_type].state, + REP_REGISTERED); + }
- return 0; + return err; }
static int __load_reps_special_vport(struct mlx5_eswitch *esw, u8 rep_type) @@ -2076,7 +2074,7 @@ void mlx5_eswitch_register_vport_reps(struct mlx5_eswitch *esw, rep_if->get_proto_dev = __rep_if->get_proto_dev; rep_if->priv = __rep_if->priv;
- rep_if->state = REP_REGISTERED; + atomic_set(&rep_if->state, REP_REGISTERED); } } EXPORT_SYMBOL(mlx5_eswitch_register_vport_reps); @@ -2091,7 +2089,7 @@ void mlx5_eswitch_unregister_vport_reps(struct mlx5_eswitch *esw, u8 rep_type) __unload_reps_all_vport(esw, max_vf, rep_type);
mlx5_esw_for_all_reps(esw, i, rep) - rep->rep_if[rep_type].state = REP_UNREGISTERED; + atomic_set(&rep->rep_if[rep_type].state, REP_UNREGISTERED); } EXPORT_SYMBOL(mlx5_eswitch_unregister_vport_reps);
@@ -2111,7 +2109,7 @@ void *mlx5_eswitch_get_proto_dev(struct mlx5_eswitch *esw,
rep = mlx5_eswitch_get_rep(esw, vport);
- if (rep->rep_if[rep_type].state == REP_LOADED && + if (atomic_read(&rep->rep_if[rep_type].state) == REP_LOADED && rep->rep_if[rep_type].get_proto_dev) return rep->rep_if[rep_type].get_proto_dev(rep); return NULL; diff --git a/include/linux/mlx5/eswitch.h b/include/linux/mlx5/eswitch.h index 96d8435421de8..0ca77dd1429c0 100644 --- a/include/linux/mlx5/eswitch.h +++ b/include/linux/mlx5/eswitch.h @@ -35,7 +35,7 @@ struct mlx5_eswitch_rep_if { void (*unload)(struct mlx5_eswitch_rep *rep); void *(*get_proto_dev)(struct mlx5_eswitch_rep *rep); void *priv; - u8 state; + atomic_t state; };
struct mlx5_eswitch_rep {
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit e025da3d7aa4770bb1d1b3b0aa7cc4da1744852d ]
If "ret_len" is negative then it could lead to a NULL dereference.
The "ret_len" value comes from nl80211_vendor_cmd(), if it's negative then we don't allocate the "dcmd_buf" buffer. Then we pass "ret_len" to brcmf_fil_cmd_data_set() where it is cast to a very high u32 value. Most of the functions in that call tree check whether the buffer we pass is NULL but there are at least a couple places which don't such as brcmf_dbg_hex_dump() and brcmf_msgbuf_query_dcmd(). We memcpy() to and from the buffer so it would result in a NULL dereference.
The fix is to change the types so that "ret_len" can't be negative. (If we memcpy() zero bytes to NULL, that's a no-op and doesn't cause an issue).
Fixes: 1bacb0487d0e ("brcmfmac: replace cfg80211 testmode with vendor command") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/broadcom/brcm80211/brcmfmac/vendor.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/vendor.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/vendor.c index 8eff2753abade..d493021f60318 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/vendor.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/vendor.c @@ -35,9 +35,10 @@ static int brcmf_cfg80211_vndr_cmds_dcmd_handler(struct wiphy *wiphy, struct brcmf_if *ifp; const struct brcmf_vndr_dcmd_hdr *cmdhdr = data; struct sk_buff *reply; - int ret, payload, ret_len; + unsigned int payload, ret_len; void *dcmd_buf = NULL, *wr_pointer; u16 msglen, maxmsglen = PAGE_SIZE - 0x100; + int ret;
if (len < sizeof(*cmdhdr)) { brcmf_err("vendor command too short: %d\n", len); @@ -65,7 +66,7 @@ static int brcmf_cfg80211_vndr_cmds_dcmd_handler(struct wiphy *wiphy, brcmf_err("oversize return buffer %d\n", ret_len); ret_len = BRCMF_DCMD_MAXLEN; } - payload = max(ret_len, len) + 1; + payload = max_t(unsigned int, ret_len, len) + 1; dcmd_buf = vzalloc(payload); if (NULL == dcmd_buf) return -ENOMEM;
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 23583f7795025e3c783b680d906509366b0906ad ]
When the DSDT tables expose devices with subdevices and a set of hierarchical _DSD properties, the data returned by acpi_get_next_subnode() is incorrect, with the results suggesting a bad pointer assignment. The parser works fine with device_nodes or data_nodes, but not with a combination of the two.
The problem is traced to an invalid pointer used when jumping from handling device_nodes to data nodes. The existing code looks for data nodes below the last subdevice found instead of the common root. Fix by forcing the acpi_device pointer to be derived from the same fwnode for the two types of subnodes.
This same problem of handling device and data nodes was already fixed in a similar way by 'commit bf4703fdd166 ("ACPI / property: fix data node parsing in acpi_get_next_subnode()")' but broken later by 'commit 34055190b19 ("ACPI / property: Add fwnode_get_next_child_node()")', so this should probably go to linux-stable all the way to 4.12
Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/property.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c index 77abe0ec40431..bd533f68b1dec 100644 --- a/drivers/acpi/property.c +++ b/drivers/acpi/property.c @@ -1031,6 +1031,14 @@ struct fwnode_handle *acpi_get_next_subnode(const struct fwnode_handle *fwnode, const struct acpi_data_node *data = to_acpi_data_node(fwnode); struct acpi_data_node *dn;
+ /* + * We can have a combination of device and data nodes, e.g. with + * hierarchical _DSD properties. Make sure the adev pointer is + * restored before going through data nodes, otherwise we will + * be looking for data_nodes below the last device found instead + * of the common fwnode shared by device_nodes and data_nodes. + */ + adev = to_acpi_device_node(fwnode); if (adev) head = &adev->data.subnodes; else if (data)
From: Jon Derrick jonathan.derrick@intel.com
[ Upstream commit f10b83de1fd49216a4c657816f48001437e4bdd5 ]
If the BAR is zero size, it indicates it was never successfully mapped. Ensure that the BAR is valid during initialization before attempting to use it.
Signed-off-by: Jon Derrick jonathan.derrick@intel.com Signed-off-by: Ben Skeggs bskeggs@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c index 157b076a12723..38c9c086754b6 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/nv50.c @@ -109,7 +109,7 @@ nv50_bar_oneinit(struct nvkm_bar *base) struct nvkm_device *device = bar->base.subdev.device; static struct lock_class_key bar1_lock; static struct lock_class_key bar2_lock; - u64 start, limit; + u64 start, limit, size; int ret;
ret = nvkm_gpuobj_new(device, 0x20000, 0, false, NULL, &bar->mem); @@ -127,7 +127,10 @@ nv50_bar_oneinit(struct nvkm_bar *base)
/* BAR2 */ start = 0x0100000000ULL; - limit = start + device->func->resource_size(device, 3); + size = device->func->resource_size(device, 3); + if (!size) + return -ENOMEM; + limit = start + size;
ret = nvkm_vmm_new(device, start, limit-- - start, NULL, 0, &bar2_lock, "bar2", &bar->bar2_vmm); @@ -164,7 +167,10 @@ nv50_bar_oneinit(struct nvkm_bar *base)
/* BAR1 */ start = 0x0000000000ULL; - limit = start + device->func->resource_size(device, 1); + size = device->func->resource_size(device, 1); + if (!size) + return -ENOMEM; + limit = start + size;
ret = nvkm_vmm_new(device, start, limit-- - start, NULL, 0, &bar1_lock, "bar1", &bar->bar1_vmm);
From: Fabien Dessenne fabien.dessenne@st.com
[ Upstream commit b5b5a27bee5884860798ffd0f08e611a3942064b ]
During probe, return the provided errors value instead of -ENODEV. This allows the driver to be deferred probed if needed.
Signed-off-by: Fabien Dessenne fabien.dessenne@st.com Acked-by: Hugues Fruchet hugues.fruchet@st.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/stm32/stm32-dcmi.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/media/platform/stm32/stm32-dcmi.c b/drivers/media/platform/stm32/stm32-dcmi.c index 5fe5b38fa901d..a1f0801081ba9 100644 --- a/drivers/media/platform/stm32/stm32-dcmi.c +++ b/drivers/media/platform/stm32/stm32-dcmi.c @@ -1645,7 +1645,7 @@ static int dcmi_probe(struct platform_device *pdev) dcmi->rstc = devm_reset_control_get_exclusive(&pdev->dev, NULL); if (IS_ERR(dcmi->rstc)) { dev_err(&pdev->dev, "Could not get reset control\n"); - return -ENODEV; + return PTR_ERR(dcmi->rstc); }
/* Get bus characteristics from devicetree */ @@ -1660,7 +1660,7 @@ static int dcmi_probe(struct platform_device *pdev) of_node_put(np); if (ret) { dev_err(&pdev->dev, "Could not parse the endpoint\n"); - return -ENODEV; + return ret; }
if (ep.bus_type == V4L2_MBUS_CSI2_DPHY) { @@ -1673,8 +1673,9 @@ static int dcmi_probe(struct platform_device *pdev)
irq = platform_get_irq(pdev, 0); if (irq <= 0) { - dev_err(&pdev->dev, "Could not get irq\n"); - return -ENODEV; + if (irq != -EPROBE_DEFER) + dev_err(&pdev->dev, "Could not get irq\n"); + return irq; }
dcmi->res = platform_get_resource(pdev, IORESOURCE_MEM, 0); @@ -1694,12 +1695,13 @@ static int dcmi_probe(struct platform_device *pdev) dev_name(&pdev->dev), dcmi); if (ret) { dev_err(&pdev->dev, "Unable to request irq %d\n", irq); - return -ENODEV; + return ret; }
mclk = devm_clk_get(&pdev->dev, "mclk"); if (IS_ERR(mclk)) { - dev_err(&pdev->dev, "Unable to get mclk\n"); + if (PTR_ERR(mclk) != -EPROBE_DEFER) + dev_err(&pdev->dev, "Unable to get mclk\n"); return PTR_ERR(mclk); }
From: Marc Zyngier marc.zyngier@arm.com
[ Upstream commit 1f5b62f09f6b314c8d70b9de5182dae4de1f94da ]
The VDSO code uses the kernel helper that was originally designed to abstract the access between 32 and 64bit systems. It worked so far because this function is declared as 'inline'.
As we're about to revamp that part of the code, the VDSO would break. Let's fix it by doing what should have been done from the start, a proper system register access.
Reviewed-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Marc Zyngier marc.zyngier@arm.com Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/include/asm/cp15.h | 2 ++ arch/arm/vdso/vgettimeofday.c | 5 +++-- 2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/arm/include/asm/cp15.h b/arch/arm/include/asm/cp15.h index 07e27f212dc75..d2453e2d3f1f3 100644 --- a/arch/arm/include/asm/cp15.h +++ b/arch/arm/include/asm/cp15.h @@ -68,6 +68,8 @@ #define BPIALL __ACCESS_CP15(c7, 0, c5, 6) #define ICIALLU __ACCESS_CP15(c7, 0, c5, 0)
+#define CNTVCT __ACCESS_CP15_64(1, c14) + extern unsigned long cr_alignment; /* defined in entry-armv.S */
static inline unsigned long get_cr(void) diff --git a/arch/arm/vdso/vgettimeofday.c b/arch/arm/vdso/vgettimeofday.c index a9dd619c6c290..7bdbf5d5c47d3 100644 --- a/arch/arm/vdso/vgettimeofday.c +++ b/arch/arm/vdso/vgettimeofday.c @@ -18,9 +18,9 @@ #include <linux/compiler.h> #include <linux/hrtimer.h> #include <linux/time.h> -#include <asm/arch_timer.h> #include <asm/barrier.h> #include <asm/bug.h> +#include <asm/cp15.h> #include <asm/page.h> #include <asm/unistd.h> #include <asm/vdso_datapage.h> @@ -123,7 +123,8 @@ static notrace u64 get_ns(struct vdso_data *vdata) u64 cycle_now; u64 nsec;
- cycle_now = arch_counter_get_cntvct(); + isb(); + cycle_now = read_sysreg(CNTVCT);
cycle_delta = (cycle_now - vdata->cs_cycle_last) & vdata->cs_mask;
From: Qian Cai cai@lca.pw
[ Upstream commit 74dd022f9e6260c3b5b8d15901d27ebcc5f21eda ]
When building with -Wunused-but-set-variable, the compiler shouts about a number of pte_unmap() users, since this expands to an empty macro on arm64:
| mm/gup.c: In function 'gup_pte_range': | mm/gup.c:1727:16: warning: variable 'ptem' set but not used | [-Wunused-but-set-variable] | mm/gup.c: At top level: | mm/memory.c: In function 'copy_pte_range': | mm/memory.c:821:24: warning: variable 'orig_dst_pte' set but not used | [-Wunused-but-set-variable] | mm/memory.c:821:9: warning: variable 'orig_src_pte' set but not used | [-Wunused-but-set-variable] | mm/swap_state.c: In function 'swap_ra_info': | mm/swap_state.c:641:15: warning: variable 'orig_pte' set but not used | [-Wunused-but-set-variable] | mm/madvise.c: In function 'madvise_free_pte_range': | mm/madvise.c:318:9: warning: variable 'orig_pte' set but not used | [-Wunused-but-set-variable]
Rewrite pte_unmap() as a static inline function, which silences the warnings.
Signed-off-by: Qian Cai cai@lca.pw Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/pgtable.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index de70c1eabf336..74ebe96937141 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -478,6 +478,8 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd) return __pmd_to_phys(pmd); }
+static inline void pte_unmap(pte_t *pte) { } + /* Find an entry in the third-level page table. */ #define pte_index(addr) (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
@@ -486,7 +488,6 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
#define pte_offset_map(dir,addr) pte_offset_kernel((dir), (addr)) #define pte_offset_map_nested(dir,addr) pte_offset_kernel((dir), (addr)) -#define pte_unmap(pte) do { } while (0) #define pte_unmap_nested(pte) do { } while (0)
#define pte_set_fixmap(addr) ((pte_t *)set_fixmap_offset(FIX_PTE, addr))
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit 89a37842b0c13c9e568bf12f4fcbe6507147e41d ]
Remove mt76_queue dependency from tx_queue_skb function pointer and rely on mt76_tx_qid instead. This is a preliminary patch to introduce mt76_sw_queue support
Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/dma.c | 3 ++- drivers/net/wireless/mediatek/mt76/mt76.h | 4 ++-- drivers/net/wireless/mediatek/mt76/mt7603/beacon.c | 6 +++--- drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c | 4 ++-- drivers/net/wireless/mediatek/mt76/tx.c | 10 +++++----- drivers/net/wireless/mediatek/mt76/usb.c | 3 ++- 6 files changed, 16 insertions(+), 14 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/dma.c b/drivers/net/wireless/mediatek/mt76/dma.c index 76629b98c78d7..8c7ee8302fb87 100644 --- a/drivers/net/wireless/mediatek/mt76/dma.c +++ b/drivers/net/wireless/mediatek/mt76/dma.c @@ -271,10 +271,11 @@ mt76_dma_tx_queue_skb_raw(struct mt76_dev *dev, enum mt76_txq_id qid, return 0; }
-int mt76_dma_tx_queue_skb(struct mt76_dev *dev, struct mt76_queue *q, +int mt76_dma_tx_queue_skb(struct mt76_dev *dev, enum mt76_txq_id qid, struct sk_buff *skb, struct mt76_wcid *wcid, struct ieee80211_sta *sta) { + struct mt76_queue *q = &dev->q_tx[qid]; struct mt76_queue_entry e; struct mt76_txwi_cache *t; struct mt76_queue_buf buf[32]; diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h index bcbfd3c4a44b6..eb882b2cbc0ec 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76.h +++ b/drivers/net/wireless/mediatek/mt76/mt76.h @@ -156,7 +156,7 @@ struct mt76_queue_ops { struct mt76_queue_buf *buf, int nbufs, u32 info, struct sk_buff *skb, void *txwi);
- int (*tx_queue_skb)(struct mt76_dev *dev, struct mt76_queue *q, + int (*tx_queue_skb)(struct mt76_dev *dev, enum mt76_txq_id qid, struct sk_buff *skb, struct mt76_wcid *wcid, struct ieee80211_sta *sta);
@@ -645,7 +645,7 @@ static inline struct mt76_tx_cb *mt76_tx_skb_cb(struct sk_buff *skb) return ((void *) IEEE80211_SKB_CB(skb)->status.status_driver_data); }
-int mt76_dma_tx_queue_skb(struct mt76_dev *dev, struct mt76_queue *q, +int mt76_dma_tx_queue_skb(struct mt76_dev *dev, enum mt76_txq_id qid, struct sk_buff *skb, struct mt76_wcid *wcid, struct ieee80211_sta *sta);
diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/beacon.c b/drivers/net/wireless/mediatek/mt76/mt7603/beacon.c index 4dcb465095d19..99c0a3ba37cb7 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7603/beacon.c +++ b/drivers/net/wireless/mediatek/mt76/mt7603/beacon.c @@ -23,7 +23,7 @@ mt7603_update_beacon_iter(void *priv, u8 *mac, struct ieee80211_vif *vif) if (!skb) return;
- mt76_dma_tx_queue_skb(&dev->mt76, &dev->mt76.q_tx[MT_TXQ_BEACON], skb, + mt76_dma_tx_queue_skb(&dev->mt76, MT_TXQ_BEACON, skb, &mvif->sta.wcid, NULL);
spin_lock_bh(&dev->ps_lock); @@ -118,8 +118,8 @@ void mt7603_pre_tbtt_tasklet(unsigned long arg) struct ieee80211_vif *vif = info->control.vif; struct mt7603_vif *mvif = (struct mt7603_vif *)vif->drv_priv;
- mt76_dma_tx_queue_skb(&dev->mt76, q, skb, &mvif->sta.wcid, - NULL); + mt76_dma_tx_queue_skb(&dev->mt76, MT_TXQ_CAB, skb, + &mvif->sta.wcid, NULL); } mt76_queue_kick(dev, q); spin_unlock_bh(&q->lock); diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c b/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c index daaed1220147e..952fe19cba9b6 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c +++ b/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c @@ -146,8 +146,8 @@ static void mt76x02_pre_tbtt_tasklet(unsigned long arg) struct ieee80211_vif *vif = info->control.vif; struct mt76x02_vif *mvif = (struct mt76x02_vif *)vif->drv_priv;
- mt76_dma_tx_queue_skb(&dev->mt76, q, skb, &mvif->group_wcid, - NULL); + mt76_dma_tx_queue_skb(&dev->mt76, MT_TXQ_PSD, skb, + &mvif->group_wcid, NULL); } spin_unlock_bh(&q->lock); } diff --git a/drivers/net/wireless/mediatek/mt76/tx.c b/drivers/net/wireless/mediatek/mt76/tx.c index 2585df5123350..0c1036da9a92a 100644 --- a/drivers/net/wireless/mediatek/mt76/tx.c +++ b/drivers/net/wireless/mediatek/mt76/tx.c @@ -286,7 +286,7 @@ mt76_tx(struct mt76_dev *dev, struct ieee80211_sta *sta, q = &dev->q_tx[qid];
spin_lock_bh(&q->lock); - dev->queue_ops->tx_queue_skb(dev, q, skb, wcid, sta); + dev->queue_ops->tx_queue_skb(dev, qid, skb, wcid, sta); dev->queue_ops->kick(dev, q);
if (q->queued > q->ndesc - 8 && !q->stopped) { @@ -327,7 +327,6 @@ mt76_queue_ps_skb(struct mt76_dev *dev, struct ieee80211_sta *sta, { struct mt76_wcid *wcid = (struct mt76_wcid *) sta->drv_priv; struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); - struct mt76_queue *hwq = &dev->q_tx[MT_TXQ_PSD];
info->control.flags |= IEEE80211_TX_CTRL_PS_RESPONSE; if (last) @@ -335,7 +334,7 @@ mt76_queue_ps_skb(struct mt76_dev *dev, struct ieee80211_sta *sta, IEEE80211_TX_CTL_REQ_TX_STATUS;
mt76_skb_set_moredata(skb, !last); - dev->queue_ops->tx_queue_skb(dev, hwq, skb, wcid, sta); + dev->queue_ops->tx_queue_skb(dev, MT_TXQ_PSD, skb, wcid, sta); }
void @@ -390,6 +389,7 @@ mt76_txq_send_burst(struct mt76_dev *dev, struct mt76_queue *hwq, struct mt76_txq *mtxq, bool *empty) { struct ieee80211_txq *txq = mtxq_to_txq(mtxq); + enum mt76_txq_id qid = mt76_txq_get_qid(txq); struct ieee80211_tx_info *info; struct mt76_wcid *wcid = mtxq->wcid; struct sk_buff *skb; @@ -423,7 +423,7 @@ mt76_txq_send_burst(struct mt76_dev *dev, struct mt76_queue *hwq, if (ampdu) mt76_check_agg_ssn(mtxq, skb);
- idx = dev->queue_ops->tx_queue_skb(dev, hwq, skb, wcid, txq->sta); + idx = dev->queue_ops->tx_queue_skb(dev, qid, skb, wcid, txq->sta);
if (idx < 0) return idx; @@ -458,7 +458,7 @@ mt76_txq_send_burst(struct mt76_dev *dev, struct mt76_queue *hwq, if (cur_ampdu) mt76_check_agg_ssn(mtxq, skb);
- idx = dev->queue_ops->tx_queue_skb(dev, hwq, skb, wcid, + idx = dev->queue_ops->tx_queue_skb(dev, qid, skb, wcid, txq->sta); if (idx < 0) return idx; diff --git a/drivers/net/wireless/mediatek/mt76/usb.c b/drivers/net/wireless/mediatek/mt76/usb.c index 4c1abd4924054..b1551419338f0 100644 --- a/drivers/net/wireless/mediatek/mt76/usb.c +++ b/drivers/net/wireless/mediatek/mt76/usb.c @@ -726,10 +726,11 @@ mt76u_tx_build_sg(struct mt76_dev *dev, struct sk_buff *skb, }
static int -mt76u_tx_queue_skb(struct mt76_dev *dev, struct mt76_queue *q, +mt76u_tx_queue_skb(struct mt76_dev *dev, enum mt76_txq_id qid, struct sk_buff *skb, struct mt76_wcid *wcid, struct ieee80211_sta *sta) { + struct mt76_queue *q = &dev->q_tx[qid]; struct mt76u_buf *buf; u16 idx = q->tail; int err;
From: Nadav Amit namit@vmware.com
[ Upstream commit 3c0dab44e22782359a0a706cbce72de99a22aa75 ]
Since alloc_module() will not set the pages as executable soon, set ftrace trampoline pages as executable after they are allocated.
For the time being, do not change ftrace to use the text_poke() interface. As a result, ftrace still breaks W^X.
Signed-off-by: Nadav Amit namit@vmware.com Signed-off-by: Rick Edgecombe rick.p.edgecombe@intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Steven Rostedt (VMware) rostedt@goodmis.org Cc: akpm@linux-foundation.org Cc: ard.biesheuvel@linaro.org Cc: deneen.t.dock@intel.com Cc: kernel-hardening@lists.openwall.com Cc: kristen@linux.intel.com Cc: linux_dti@icloud.com Cc: will.deacon@arm.com Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: H. Peter Anvin hpa@zytor.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Rik van Riel riel@surriel.com Cc: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20190426001143.4983-10-namit@vmware.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/ftrace.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index ef49517f6bb24..53ba1aa3a01f8 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -730,6 +730,7 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) unsigned long end_offset; unsigned long op_offset; unsigned long offset; + unsigned long npages; unsigned long size; unsigned long retq; unsigned long *ptr; @@ -762,6 +763,7 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) return 0;
*tramp_size = size + RET_SIZE + sizeof(void *); + npages = DIV_ROUND_UP(*tramp_size, PAGE_SIZE);
/* Copy ftrace_caller onto the trampoline memory */ ret = probe_kernel_read(trampoline, (void *)start_offset, size); @@ -806,6 +808,12 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size) /* ALLOC_TRAMP flags lets us know we created it */ ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
+ /* + * Module allocation needs to be completed by making the page + * executable. The page is still writable, which is a security hazard, + * but anyhow ftrace breaks W^X completely. + */ + set_memory_x((unsigned long)trampoline, npages); return (unsigned long)trampoline; fail: tramp_free(trampoline, *tramp_size);
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit 7ae3f6e130e8dc6188b59e3b4ebc2f16e9c8d053 ]
Using a jiffies timer creates a dependency on the tick_do_timer_cpu incrementing jiffies. If that CPU has locked up and jiffies is not incrementing, the watchdog heartbeat timer for all CPUs stops and creates false positives and confusing warnings on local CPUs, and also causes the SMP detector to stop, so the root cause is never detected.
Fix this by using hrtimer based timers for the watchdog heartbeat, like the generic kernel hardlockup detector.
Cc: Gautham R. Shenoy ego@linux.vnet.ibm.com Reported-by: Ravikumar Bangoria ravi.bangoria@in.ibm.com Signed-off-by: Nicholas Piggin npiggin@gmail.com Tested-by: Ravi Bangoria ravi.bangoria@linux.ibm.com Reported-by: Ravi Bangoria ravi.bangoria@linux.ibm.com Reviewed-by: Gautham R. Shenoy ego@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/watchdog.c | 81 +++++++++++++++++----------------- 1 file changed, 40 insertions(+), 41 deletions(-)
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index 3c6ab22a0c4e3..af3c15a1d41eb 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -77,7 +77,7 @@ static u64 wd_smp_panic_timeout_tb __read_mostly; /* panic other CPUs */
static u64 wd_timer_period_ms __read_mostly; /* interval between heartbeat */
-static DEFINE_PER_CPU(struct timer_list, wd_timer); +static DEFINE_PER_CPU(struct hrtimer, wd_hrtimer); static DEFINE_PER_CPU(u64, wd_timer_tb);
/* SMP checker bits */ @@ -293,21 +293,21 @@ void soft_nmi_interrupt(struct pt_regs *regs) nmi_exit(); }
-static void wd_timer_reset(unsigned int cpu, struct timer_list *t) -{ - t->expires = jiffies + msecs_to_jiffies(wd_timer_period_ms); - if (wd_timer_period_ms > 1000) - t->expires = __round_jiffies_up(t->expires, cpu); - add_timer_on(t, cpu); -} - -static void wd_timer_fn(struct timer_list *t) +static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer) { int cpu = smp_processor_id();
+ if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED)) + return HRTIMER_NORESTART; + + if (!cpumask_test_cpu(cpu, &watchdog_cpumask)) + return HRTIMER_NORESTART; + watchdog_timer_interrupt(cpu);
- wd_timer_reset(cpu, t); + hrtimer_forward_now(hrtimer, ms_to_ktime(wd_timer_period_ms)); + + return HRTIMER_RESTART; }
void arch_touch_nmi_watchdog(void) @@ -323,37 +323,22 @@ void arch_touch_nmi_watchdog(void) } EXPORT_SYMBOL(arch_touch_nmi_watchdog);
-static void start_watchdog_timer_on(unsigned int cpu) -{ - struct timer_list *t = per_cpu_ptr(&wd_timer, cpu); - - per_cpu(wd_timer_tb, cpu) = get_tb(); - - timer_setup(t, wd_timer_fn, TIMER_PINNED); - wd_timer_reset(cpu, t); -} - -static void stop_watchdog_timer_on(unsigned int cpu) -{ - struct timer_list *t = per_cpu_ptr(&wd_timer, cpu); - - del_timer_sync(t); -} - -static int start_wd_on_cpu(unsigned int cpu) +static void start_watchdog(void *arg) { + struct hrtimer *hrtimer = this_cpu_ptr(&wd_hrtimer); + int cpu = smp_processor_id(); unsigned long flags;
if (cpumask_test_cpu(cpu, &wd_cpus_enabled)) { WARN_ON(1); - return 0; + return; }
if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED)) - return 0; + return;
if (!cpumask_test_cpu(cpu, &watchdog_cpumask)) - return 0; + return;
wd_smp_lock(&flags); cpumask_set_cpu(cpu, &wd_cpus_enabled); @@ -363,27 +348,40 @@ static int start_wd_on_cpu(unsigned int cpu) } wd_smp_unlock(&flags);
- start_watchdog_timer_on(cpu); + *this_cpu_ptr(&wd_timer_tb) = get_tb();
- return 0; + hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + hrtimer->function = watchdog_timer_fn; + hrtimer_start(hrtimer, ms_to_ktime(wd_timer_period_ms), + HRTIMER_MODE_REL_PINNED); }
-static int stop_wd_on_cpu(unsigned int cpu) +static int start_watchdog_on_cpu(unsigned int cpu) { + return smp_call_function_single(cpu, start_watchdog, NULL, true); +} + +static void stop_watchdog(void *arg) +{ + struct hrtimer *hrtimer = this_cpu_ptr(&wd_hrtimer); + int cpu = smp_processor_id(); unsigned long flags;
if (!cpumask_test_cpu(cpu, &wd_cpus_enabled)) - return 0; /* Can happen in CPU unplug case */ + return; /* Can happen in CPU unplug case */
- stop_watchdog_timer_on(cpu); + hrtimer_cancel(hrtimer);
wd_smp_lock(&flags); cpumask_clear_cpu(cpu, &wd_cpus_enabled); wd_smp_unlock(&flags);
wd_smp_clear_cpu_pending(cpu, get_tb()); +}
- return 0; +static int stop_watchdog_on_cpu(unsigned int cpu) +{ + return smp_call_function_single(cpu, stop_watchdog, NULL, true); }
static void watchdog_calc_timeouts(void) @@ -402,7 +400,7 @@ void watchdog_nmi_stop(void) int cpu;
for_each_cpu(cpu, &wd_cpus_enabled) - stop_wd_on_cpu(cpu); + stop_watchdog_on_cpu(cpu); }
void watchdog_nmi_start(void) @@ -411,7 +409,7 @@ void watchdog_nmi_start(void)
watchdog_calc_timeouts(); for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) - start_wd_on_cpu(cpu); + start_watchdog_on_cpu(cpu); }
/* @@ -423,7 +421,8 @@ int __init watchdog_nmi_probe(void)
err = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "powerpc/watchdog:online", - start_wd_on_cpu, stop_wd_on_cpu); + start_watchdog_on_cpu, + stop_watchdog_on_cpu); if (err < 0) { pr_warn("could not be initialized"); return err;
From: Viresh Kumar viresh.kumar@linaro.org
[ Upstream commit 4ebe36c94aed95de71a8ce6a6762226d31c938ee ]
Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject.
Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add().
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Reviewed-by: Tobin C. Harding tobin@kernel.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/cpufreq.c | 1 + drivers/cpufreq/cpufreq_governor.c | 2 ++ 2 files changed, 3 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index e10922709d139..bbf79544d0ad8 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1098,6 +1098,7 @@ static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu) cpufreq_global_kobject, "policy%u", cpu); if (ret) { pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret); + kobject_put(&policy->kobj); goto err_free_real_cpus; }
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c index ffa9adeaba31b..9d1d9bf02710b 100644 --- a/drivers/cpufreq/cpufreq_governor.c +++ b/drivers/cpufreq/cpufreq_governor.c @@ -459,6 +459,8 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy) /* Failure, so roll back. */ pr_err("initialization failed (dbs_data kobject init error %d)\n", ret);
+ kobject_put(&dbs_data->attr_set.kobj); + policy->governor_data = NULL;
if (!have_governor_per_policy())
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit 24afabdbd0b3553963a2bbf465895492b14d1107 ]
Make sure that the allocated interrupts are freed if allocating memory for the msix_entries array fails.
Cc: Himanshu Madhani hmadhani@marvell.com Cc: Giridhar Malavali gmalavali@marvell.com Signed-off-by: Bart Van Assche bvanassche@acm.org Acked-by: Himanshu Madhani hmadhani@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_isr.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index 69bbea9239cc8..add17843148dd 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -3475,7 +3475,7 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp) ql_log(ql_log_fatal, vha, 0x00c8, "Failed to allocate memory for ha->msix_entries.\n"); ret = -ENOMEM; - goto msix_out; + goto free_irqs; } ha->flags.msix_enabled = 1;
@@ -3558,6 +3558,10 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
msix_out: return ret; + +free_irqs: + pci_free_irq_vectors(ha->pdev); + goto msix_out; }
int
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit e209783d66bca04b5fce4429e59338517ffc1a0b ]
Implementations of the .write_pending() callback functions must guarantee that an appropriate LIO core callback function will be called immediately or at a later time. Make sure that this guarantee is met for aborted SCSI commands.
[mkp: typo]
Cc: Himanshu Madhani hmadhani@marvell.com Cc: Giridhar Malavali gmalavali@marvell.com Fixes: 694833ee00c4 ("scsi: tcm_qla2xxx: Do not allow aborted cmd to advance.") # v4.13. Fixes: a07100e00ac4 ("qla2xxx: Fix TMR ABORT interaction issue between qla2xxx and TCM") # v4.5. Signed-off-by: Bart Van Assche bvanassche@acm.org Acked-by: Himanshu Madhani hmadhani@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/tcm_qla2xxx.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c index 8a3075d17c63c..bddb573c88dd2 100644 --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c @@ -399,6 +399,8 @@ static int tcm_qla2xxx_write_pending(struct se_cmd *se_cmd) cmd->se_cmd.transport_state, cmd->se_cmd.t_state, cmd->se_cmd.se_cmd_flags); + transport_generic_request_failure(&cmd->se_cmd, + TCM_CHECK_CONDITION_ABORT_CMD); return 0; } cmd->trc_flags |= TRC_XFR_RDY;
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit d4023db71108375e4194e92730ba0d32d7f07813 ]
This patch avoids that lockdep reports the following warning:
===================================================== WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected 5.1.0-rc1-dbg+ #11 Tainted: G W ----------------------------------------------------- rmdir/1478 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: 00000000e7ac4607 (&(&k->k_lock)->rlock){+.+.}, at: klist_next+0x43/0x1d0
and this task is already holding: 00000000cf0baf5e (&(&ha->tgt.sess_lock)->rlock){-...}, at: tcm_qla2xxx_close_session+0x57/0xb0 [tcm_qla2xxx] which would create a new lock dependency: (&(&ha->tgt.sess_lock)->rlock){-...} -> (&(&k->k_lock)->rlock){+.+.}
but this new dependency connects a HARDIRQ-irq-safe lock: (&(&ha->tgt.sess_lock)->rlock){-...}
... which became HARDIRQ-irq-safe at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla2x00_fcport_event_handler+0x1f3d/0x22b0 [qla2xxx] qla2x00_async_login_sp_done+0x1dc/0x1f0 [qla2xxx] qla24xx_process_response_queue+0xa37/0x10e0 [qla2xxx] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x24d/0x2d0 secondary_startup_64+0xa4/0xb0
to a HARDIRQ-irq-unsafe lock: (&(&k->k_lock)->rlock){+.+.}
... which became HARDIRQ-irq-unsafe at: ... lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&(&k->k_lock)->rlock); local_irq_disable(); lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&k->k_lock)->rlock); <Interrupt> lock(&(&ha->tgt.sess_lock)->rlock);
*** DEADLOCK ***
4 locks held by rmdir/1478: #0: 000000002c7f1ba4 (sb_writers#10){.+.+}, at: mnt_want_write+0x32/0x70 #1: 00000000c85eb147 (&default_group_class[depth - 1]#2/1){+.+.}, at: do_rmdir+0x217/0x2d0 #2: 000000002b164d6f (&sb->s_type->i_mutex_key#13){++++}, at: vfs_rmdir+0x7e/0x1d0 #3: 00000000cf0baf5e (&(&ha->tgt.sess_lock)->rlock){-...}, at: tcm_qla2xxx_close_session+0x57/0xb0 [tcm_qla2xxx]
the dependencies between HARDIRQ-irq-safe lock and the holding lock: -> (&(&ha->tgt.sess_lock)->rlock){-...} ops: 127 { IN-HARDIRQ-W at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla2x00_fcport_event_handler+0x1f3d/0x22b0 [qla2xxx] qla2x00_async_login_sp_done+0x1dc/0x1f0 [qla2xxx] qla24xx_process_response_queue+0xa37/0x10e0 [qla2xxx] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x24d/0x2d0 secondary_startup_64+0xa4/0xb0 INITIAL USE at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla2x00_loop_resync+0xb3d/0x2690 [qla2xxx] qla2x00_do_dpc+0xcee/0xf30 [qla2xxx] kthread+0x1d2/0x1f0 ret_from_fork+0x3a/0x50 } ... key at: [<ffffffffa125f700>] __key.62804+0x0/0xfffffffffff7e900 [qla2xxx] ... acquired at: __lock_acquire+0x11ed/0x1b60 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0x4d3/0x500 [qla2xxx] qlt_unreg_sess+0x104/0x2c0 [qla2xxx] tcm_qla2xxx_close_session+0xa2/0xb0 [tcm_qla2xxx] target_shutdown_sessions+0x17b/0x190 [target_core_mod] core_tpg_del_initiator_node_acl+0xf3/0x1f0 [target_core_mod] target_fabric_nacl_base_release+0x25/0x30 [target_core_mod] config_item_release+0x9f/0x120 [configfs] config_item_put+0x29/0x2b [configfs] configfs_rmdir+0x3d2/0x520 [configfs] vfs_rmdir+0xb3/0x1d0 do_rmdir+0x25c/0x2d0 __x64_sys_rmdir+0x24/0x30 do_syscall_64+0x77/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe
the dependencies between the lock to be acquired and HARDIRQ-irq-unsafe lock: -> (&(&k->k_lock)->rlock){+.+.} ops: 14568 { HARDIRQ-ON-W at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 SOFTIRQ-ON-W at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 INITIAL USE at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 } ... key at: [<ffffffff83f3d900>] __key.15805+0x0/0x40 ... acquired at: __lock_acquire+0x11ed/0x1b60 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0x4d3/0x500 [qla2xxx] qlt_unreg_sess+0x104/0x2c0 [qla2xxx] tcm_qla2xxx_close_session+0xa2/0xb0 [tcm_qla2xxx] target_shutdown_sessions+0x17b/0x190 [target_core_mod] core_tpg_del_initiator_node_acl+0xf3/0x1f0 [target_core_mod] target_fabric_nacl_base_release+0x25/0x30 [target_core_mod] config_item_release+0x9f/0x120 [configfs] config_item_put+0x29/0x2b [configfs] configfs_rmdir+0x3d2/0x520 [configfs] vfs_rmdir+0xb3/0x1d0 do_rmdir+0x25c/0x2d0 __x64_sys_rmdir+0x24/0x30 do_syscall_64+0x77/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe
stack backtrace: CPU: 7 PID: 1478 Comm: rmdir Tainted: G W 5.1.0-rc1-dbg+ #11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x86/0xca check_usage.cold.59+0x473/0x563 check_prev_add.constprop.43+0x1f1/0x1170 __lock_acquire+0x11ed/0x1b60 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0x4d3/0x500 [qla2xxx] qlt_unreg_sess+0x104/0x2c0 [qla2xxx] tcm_qla2xxx_close_session+0xa2/0xb0 [tcm_qla2xxx] target_shutdown_sessions+0x17b/0x190 [target_core_mod] core_tpg_del_initiator_node_acl+0xf3/0x1f0 [target_core_mod] target_fabric_nacl_base_release+0x25/0x30 [target_core_mod] config_item_release+0x9f/0x120 [configfs] config_item_put+0x29/0x2b [configfs] configfs_rmdir+0x3d2/0x520 [configfs] vfs_rmdir+0xb3/0x1d0 do_rmdir+0x25c/0x2d0 __x64_sys_rmdir+0x24/0x30 do_syscall_64+0x77/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Cc: Himanshu Madhani hmadhani@marvell.com Cc: Giridhar Malavali gmalavali@marvell.com Signed-off-by: Bart Van Assche bvanassche@acm.org Acked-by: Himanshu Madhani hmadhani@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/tcm_qla2xxx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c index bddb573c88dd2..d6104f23f697f 100644 --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c @@ -374,8 +374,9 @@ static void tcm_qla2xxx_close_session(struct se_session *se_sess)
spin_lock_irqsave(&vha->hw->tgt.sess_lock, flags); target_sess_cmd_list_set_waiting(se_sess); - tcm_qla2xxx_put_sess(sess); spin_unlock_irqrestore(&vha->hw->tgt.sess_lock, flags); + + tcm_qla2xxx_put_sess(sess); }
static u32 tcm_qla2xxx_sess_get_index(struct se_session *se_sess)
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit 300ec7415c1fed5c73660f50c8e14a67e236dc0a ]
Since fc_remote_port_delete() must be called with interrupts enabled, do not disable interrupts when calling that function. Remove the lockin calls from around the put_sess() call. This is safe because the function that is called when the final reference is dropped, qlt_unreg_sess(), grabs the proper locks. This patch avoids that lockdep reports the following:
WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected kworker/2:1/62 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: 0000000009e679b3 (&(&k->k_lock)->rlock){+.+.}, at: klist_next+0x43/0x1d0
and this task is already holding: 00000000a033b71c (&(&ha->tgt.sess_lock)->rlock){-...}, at: qla24xx_delete_sess_fn+0x55/0xf0 [qla2xxx_scst] which would create a new lock dependency: (&(&ha->tgt.sess_lock)->rlock){-...} -> (&(&k->k_lock)->rlock){+.+.}
but this new dependency connects a HARDIRQ-irq-safe lock: (&(&ha->tgt.sess_lock)->rlock){-...}
... which became HARDIRQ-irq-safe at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla24xx_report_id_acquisition+0xa69/0xe30 [qla2xxx_scst] qla24xx_process_response_queue+0x69e/0x1270 [qla2xxx_scst] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx_scst] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x2a8/0x320 secondary_startup_64+0xa4/0xb0
to a HARDIRQ-irq-unsafe lock: (&(&k->k_lock)->rlock){+.+.}
... which became HARDIRQ-irq-unsafe at: ... lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7e1/0xb50 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&(&k->k_lock)->rlock); local_irq_disable(); lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&k->k_lock)->rlock); <Interrupt> lock(&(&ha->tgt.sess_lock)->rlock);
*** DEADLOCK ***
3 locks held by kworker/2:1/62: #0: 00000000a4319c16 ((wq_completion)"qla2xxx_wq"){+.+.}, at: process_one_work+0x437/0xa80 #1: 00000000ffa34c42 ((work_completion)(&sess->del_work)){+.+.}, at: process_one_work+0x437/0xa80 #2: 00000000a033b71c (&(&ha->tgt.sess_lock)->rlock){-...}, at: qla24xx_delete_sess_fn+0x55/0xf0 [qla2xxx_scst]
the dependencies between HARDIRQ-irq-safe lock and the holding lock: -> (&(&ha->tgt.sess_lock)->rlock){-...} ops: 8 { IN-HARDIRQ-W at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla24xx_report_id_acquisition+0xa69/0xe30 [qla2xxx_scst] qla24xx_process_response_queue+0x69e/0x1270 [qla2xxx_scst] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx_scst] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x2a8/0x320 secondary_startup_64+0xa4/0xb0 INITIAL USE at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla24xx_report_id_acquisition+0xa69/0xe30 [qla2xxx_scst] qla24xx_process_response_queue+0x69e/0x1270 [qla2xxx_scst] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx_scst] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x2a8/0x320 secondary_startup_64+0xa4/0xb0 } ... key at: [<ffffffffa0c0d080>] __key.85462+0x0/0xfffffffffff7df80 [qla2xxx_scst] ... acquired at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0xa0b/0xa30 [qla2xxx_scst] qlt_unreg_sess+0x1c6/0x380 [qla2xxx_scst] qla24xx_delete_sess_fn+0xe6/0xf0 [qla2xxx_scst] process_one_work+0x511/0xa80 worker_thread+0x67/0x5b0 kthread+0x1d2/0x1f0 ret_from_fork+0x3a/0x50
the dependencies between the lock to be acquired and HARDIRQ-irq-unsafe lock: -> (&(&k->k_lock)->rlock){+.+.} ops: 13831 { HARDIRQ-ON-W at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7e1/0xb50 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 SOFTIRQ-ON-W at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7e1/0xb50 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 INITIAL USE at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7e1/0xb50 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 } ... key at: [<ffffffff83ed8780>] __key.15491+0x0/0x40 ... acquired at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0xa0b/0xa30 [qla2xxx_scst] qlt_unreg_sess+0x1c6/0x380 [qla2xxx_scst] qla24xx_delete_sess_fn+0xe6/0xf0 [qla2xxx_scst] process_one_work+0x511/0xa80 worker_thread+0x67/0x5b0 kthread+0x1d2/0x1f0 ret_from_fork+0x3a/0x50
stack backtrace: CPU: 2 PID: 62 Comm: kworker/2:1 Tainted: G O 5.0.7-dbg+ #8 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: qla2xxx_wq qla24xx_delete_sess_fn [qla2xxx_scst] Call Trace: dump_stack+0x86/0xca check_usage.cold.52+0x473/0x563 __lock_acquire+0x11c0/0x23e0 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0xa0b/0xa30 [qla2xxx_scst] qlt_unreg_sess+0x1c6/0x380 [qla2xxx_scst] qla24xx_delete_sess_fn+0xe6/0xf0 [qla2xxx_scst] process_one_work+0x511/0xa80 worker_thread+0x67/0x5b0 kthread+0x1d2/0x1f0 ret_from_fork+0x3a/0x50
Cc: Himanshu Madhani hmadhani@marvell.com Cc: Giridhar Malavali gmalavali@marvell.com Signed-off-by: Bart Van Assche bvanassche@acm.org Acked-by: Himanshu Madhani hmadhani@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_target.c | 25 ++++++++----------------- drivers/scsi/qla2xxx/tcm_qla2xxx.c | 2 -- 2 files changed, 8 insertions(+), 19 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c index 697eee1d88474..b210a8296c275 100644 --- a/drivers/scsi/qla2xxx/qla_target.c +++ b/drivers/scsi/qla2xxx/qla_target.c @@ -680,7 +680,6 @@ int qla24xx_async_notify_ack(scsi_qla_host_t *vha, fc_port_t *fcport, void qla24xx_do_nack_work(struct scsi_qla_host *vha, struct qla_work_evt *e) { fc_port_t *t; - unsigned long flags;
switch (e->u.nack.type) { case SRB_NACK_PRLI: @@ -693,10 +692,8 @@ void qla24xx_do_nack_work(struct scsi_qla_host *vha, struct qla_work_evt *e) if (t) { ql_log(ql_log_info, vha, 0xd034, "%s create sess success %p", __func__, t); - spin_lock_irqsave(&vha->hw->tgt.sess_lock, flags); /* create sess has an extra kref */ vha->hw->tgt.tgt_ops->put_sess(e->u.nack.fcport); - spin_unlock_irqrestore(&vha->hw->tgt.sess_lock, flags); } break; } @@ -708,9 +705,6 @@ void qla24xx_delete_sess_fn(struct work_struct *work) { fc_port_t *fcport = container_of(work, struct fc_port, del_work); struct qla_hw_data *ha = fcport->vha->hw; - unsigned long flags; - - spin_lock_irqsave(&ha->tgt.sess_lock, flags);
if (fcport->se_sess) { ha->tgt.tgt_ops->shutdown_sess(fcport); @@ -718,7 +712,6 @@ void qla24xx_delete_sess_fn(struct work_struct *work) } else { qlt_unreg_sess(fcport); } - spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); }
/* @@ -787,8 +780,9 @@ void qlt_fc_port_added(struct scsi_qla_host *vha, fc_port_t *fcport) fcport->port_name, sess->loop_id); sess->local = 0; } - ha->tgt.tgt_ops->put_sess(sess); spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); + + ha->tgt.tgt_ops->put_sess(sess); }
/* @@ -4242,9 +4236,7 @@ static void __qlt_do_work(struct qla_tgt_cmd *cmd) /* * Drop extra session reference from qla_tgt_handle_cmd_for_atio*( */ - spin_lock_irqsave(&ha->tgt.sess_lock, flags); ha->tgt.tgt_ops->put_sess(sess); - spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); return;
out_term: @@ -4261,9 +4253,7 @@ static void __qlt_do_work(struct qla_tgt_cmd *cmd) target_free_tag(sess->se_sess, &cmd->se_cmd); spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
- spin_lock_irqsave(&ha->tgt.sess_lock, flags); ha->tgt.tgt_ops->put_sess(sess); - spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); }
static void qlt_do_work(struct work_struct *work) @@ -4472,9 +4462,7 @@ static int qlt_handle_cmd_for_atio(struct scsi_qla_host *vha, if (!cmd) { ql_dbg(ql_dbg_io, vha, 0x3062, "qla_target(%d): Allocation of cmd failed\n", vha->vp_idx); - spin_lock_irqsave(&ha->tgt.sess_lock, flags); ha->tgt.tgt_ops->put_sess(sess); - spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); return -EBUSY; }
@@ -6318,17 +6306,19 @@ static void qlt_abort_work(struct qla_tgt *tgt, }
rc = __qlt_24xx_handle_abts(vha, &prm->abts, sess); - ha->tgt.tgt_ops->put_sess(sess); spin_unlock_irqrestore(&ha->tgt.sess_lock, flags2);
+ ha->tgt.tgt_ops->put_sess(sess); + if (rc != 0) goto out_term; return;
out_term2: + spin_unlock_irqrestore(&ha->tgt.sess_lock, flags2); + if (sess) ha->tgt.tgt_ops->put_sess(sess); - spin_unlock_irqrestore(&ha->tgt.sess_lock, flags2);
out_term: spin_lock_irqsave(&ha->hardware_lock, flags); @@ -6386,9 +6376,10 @@ static void qlt_tmr_work(struct qla_tgt *tgt, scsilun_to_int((struct scsi_lun *)&a->u.isp24.fcp_cmnd.lun);
rc = qlt_issue_task_mgmt(sess, unpacked_lun, fn, iocb, 0); - ha->tgt.tgt_ops->put_sess(sess); spin_unlock_irqrestore(&ha->tgt.sess_lock, flags);
+ ha->tgt.tgt_ops->put_sess(sess); + if (rc != 0) goto out_term; return; diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c index d6104f23f697f..e58becb790fa3 100644 --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c @@ -359,7 +359,6 @@ static void tcm_qla2xxx_put_sess(struct fc_port *sess) if (!sess) return;
- assert_spin_locked(&sess->vha->hw->tgt.sess_lock); kref_put(&sess->sess_kref, tcm_qla2xxx_release_session); }
@@ -832,7 +831,6 @@ static void tcm_qla2xxx_clear_nacl_from_fcport_map(struct fc_port *sess)
static void tcm_qla2xxx_shutdown_sess(struct fc_port *sess) { - assert_spin_locked(&sess->vha->hw->tgt.sess_lock); target_sess_cmd_list_set_waiting(sess->se_sess); }
From: Nadav Amit namit@vmware.com
[ Upstream commit f2c65fb3221adc6b73b0549fc7ba892022db9797 ]
When modules and BPF filters are loaded, there is a time window in which some memory is both writable and executable. An attacker that has already found another vulnerability (e.g., a dangling pointer) might be able to exploit this behavior to overwrite kernel code. Prevent having writable executable PTEs in this stage.
In addition, avoiding having W+X mappings can also slightly simplify the patching of modules code on initialization (e.g., by alternatives and static-key), as would be done in the next patch. This was actually the main motivation for this patch.
To avoid having W+X mappings, set them initially as RW (NX) and after they are set as RO set them as X as well. Setting them as executable is done as a separate step to avoid one core in which the old PTE is cached (hence writable), and another which sees the updated PTE (executable), which would break the W^X protection.
Suggested-by: Thomas Gleixner tglx@linutronix.de Suggested-by: Andy Lutomirski luto@amacapital.net Signed-off-by: Nadav Amit namit@vmware.com Signed-off-by: Rick Edgecombe rick.p.edgecombe@intel.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: akpm@linux-foundation.org Cc: ard.biesheuvel@linaro.org Cc: deneen.t.dock@intel.com Cc: kernel-hardening@lists.openwall.com Cc: kristen@linux.intel.com Cc: linux_dti@icloud.com Cc: will.deacon@arm.com Cc: Andy Lutomirski luto@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@intel.com Cc: H. Peter Anvin hpa@zytor.com Cc: Jessica Yu jeyu@kernel.org Cc: Kees Cook keescook@chromium.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Rik van Riel riel@surriel.com Link: https://lkml.kernel.org/r/20190426001143.4983-12-namit@vmware.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/alternative.c | 28 +++++++++++++++++++++------- arch/x86/kernel/module.c | 2 +- include/linux/filter.h | 1 + kernel/module.c | 5 +++++ 4 files changed, 28 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 9a79c7808f9cc..d7df79fc448cd 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -667,15 +667,29 @@ void __init alternative_instructions(void) * handlers seeing an inconsistent instruction while you patch. */ void *__init_or_module text_poke_early(void *addr, const void *opcode, - size_t len) + size_t len) { unsigned long flags; - local_irq_save(flags); - memcpy(addr, opcode, len); - local_irq_restore(flags); - sync_core(); - /* Could also do a CLFLUSH here to speed up CPU recovery; but - that causes hangs on some VIA CPUs. */ + + if (boot_cpu_has(X86_FEATURE_NX) && + is_module_text_address((unsigned long)addr)) { + /* + * Modules text is marked initially as non-executable, so the + * code cannot be running and speculative code-fetches are + * prevented. Just change the code. + */ + memcpy(addr, opcode, len); + } else { + local_irq_save(flags); + memcpy(addr, opcode, len); + local_irq_restore(flags); + sync_core(); + + /* + * Could also do a CLFLUSH here to speed up CPU recovery; but + * that causes hangs on some VIA CPUs. + */ + } return addr; }
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c index b052e883dd8cc..cfa3106faee42 100644 --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -87,7 +87,7 @@ void *module_alloc(unsigned long size) p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR + get_module_load_offset(), MODULES_END, GFP_KERNEL, - PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE, + PAGE_KERNEL, 0, NUMA_NO_NODE, __builtin_return_address(0)); if (p && (kasan_module_alloc(p, size) < 0)) { vfree(p); diff --git a/include/linux/filter.h b/include/linux/filter.h index 6074aa064b540..14ec3bdad9a90 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -746,6 +746,7 @@ static inline void bpf_prog_unlock_ro(struct bpf_prog *fp) static inline void bpf_jit_binary_lock_ro(struct bpf_binary_header *hdr) { set_memory_ro((unsigned long)hdr, hdr->pages); + set_memory_x((unsigned long)hdr, hdr->pages); }
static inline void bpf_jit_binary_unlock_ro(struct bpf_binary_header *hdr) diff --git a/kernel/module.c b/kernel/module.c index 0b9aa8ab89f08..2b2845ae983ed 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -1950,8 +1950,13 @@ void module_enable_ro(const struct module *mod, bool after_init) return;
frob_text(&mod->core_layout, set_memory_ro); + frob_text(&mod->core_layout, set_memory_x); + frob_rodata(&mod->core_layout, set_memory_ro); + frob_text(&mod->init_layout, set_memory_ro); + frob_text(&mod->init_layout, set_memory_x); + frob_rodata(&mod->init_layout, set_memory_ro);
if (after_init)
From: Robbie Ko robbieko@synology.com
[ Upstream commit 39ad317315887c2cb9a4347a93a8859326ddf136 ]
When doing fallocate, we first add the range to the reserve_list and then reserve the quota. If quota reservation fails, we'll release all reserved parts of reserve_list.
However, cur_offset is not updated to indicate that this range is already been inserted into the list. Therefore, the same range is freed twice. Once at list_for_each_entry loop, and once at the end of the function. This will result in WARN_ON on bytes_may_use when we free the remaining space.
At the end, under the 'out' label we have a call to:
btrfs_free_reserved_data_space(inode, data_reserved, alloc_start, alloc_end - cur_offset);
The start offset, third argument, should be cur_offset.
Everything from alloc_start to cur_offset was freed by the list_for_each_entry_safe_loop.
Fixes: 18513091af94 ("btrfs: update btrfs_space_info's bytes_may_use timely") Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Robbie Ko robbieko@synology.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/file.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 34fe8a58b0e9c..0832449722c12 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -3132,6 +3132,7 @@ static long btrfs_fallocate(struct file *file, int mode, ret = btrfs_qgroup_reserve_data(inode, &data_reserved, cur_offset, last_byte - cur_offset); if (ret < 0) { + cur_offset = last_byte; free_extent_map(em); break; } @@ -3181,7 +3182,7 @@ static long btrfs_fallocate(struct file *file, int mode, /* Let go of our reservation. */ if (ret != 0 && !(mode & FALLOC_FL_ZERO_RANGE)) btrfs_free_reserved_data_space(inode, data_reserved, - alloc_start, alloc_end - cur_offset); + cur_offset, alloc_end - cur_offset); extent_changeset_free(data_reserved); return ret; }
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit ff612ba7849964b1898fd3ccd1f56941129c6aab ]
We've been seeing the following sporadically throughout our fleet
panic: kernel BUG at fs/btrfs/relocation.c:4584! netversion: 5.0-0 Backtrace: #0 [ffffc90003adb880] machine_kexec at ffffffff81041da8 #1 [ffffc90003adb8c8] __crash_kexec at ffffffff8110396c #2 [ffffc90003adb988] crash_kexec at ffffffff811048ad #3 [ffffc90003adb9a0] oops_end at ffffffff8101c19a #4 [ffffc90003adb9c0] do_trap at ffffffff81019114 #5 [ffffc90003adba00] do_error_trap at ffffffff810195d0 #6 [ffffc90003adbab0] invalid_op at ffffffff81a00a9b [exception RIP: btrfs_reloc_cow_block+692] RIP: ffffffff8143b614 RSP: ffffc90003adbb68 RFLAGS: 00010246 RAX: fffffffffffffff7 RBX: ffff8806b9c32000 RCX: ffff8806aad00690 RDX: ffff880850b295e0 RSI: ffff8806b9c32000 RDI: ffff88084f205bd0 RBP: ffff880849415000 R8: ffffc90003adbbe0 R9: ffff88085ac90000 R10: ffff8805f7369140 R11: 0000000000000000 R12: ffff880850b295e0 R13: ffff88084f205bd0 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffffc90003adbbb0] __btrfs_cow_block at ffffffff813bf1cd #8 [ffffc90003adbc28] btrfs_cow_block at ffffffff813bf4b3 #9 [ffffc90003adbc78] btrfs_search_slot at ffffffff813c2e6c
The way relocation moves data extents is by creating a reloc inode and preallocating extents in this inode and then copying the data into these preallocated extents. Once we've done this for all of our extents, we'll write out these dirty pages, which marks the extent written, and goes into btrfs_reloc_cow_block(). From here we get our current reloc_control, which _should_ match the reloc_control for the current block group we're relocating.
However if we get an ENOSPC in this path at some point we'll bail out, never initiating writeback on this inode. Not a huge deal, unless we happen to be doing relocation on a different block group, and this block group is now rc->stage == UPDATE_DATA_PTRS. This trips the BUG_ON() in btrfs_reloc_cow_block(), because we expect to be done modifying the data inode. We are in fact done modifying the metadata for the data inode we're currently using, but not the one from the failed block group, and thus we BUG_ON().
(This happens when writeback finishes for extents from the previous group, when we are at btrfs_finish_ordered_io() which updates the data reloc tree (inode item, drops/adds extent items, etc).)
Fix this by writing out the reloc data inode always, and then breaking out of the loop after that point to keep from tripping this BUG_ON() later.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: Filipe Manana fdmanana@suse.com [ add note from Filipe ] Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/relocation.c | 31 ++++++++++++++++++++----------- 1 file changed, 20 insertions(+), 11 deletions(-)
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index ddf0285099312..00c3dd92f088f 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -4330,27 +4330,36 @@ int btrfs_relocate_block_group(struct btrfs_fs_info *fs_info, u64 group_start) mutex_lock(&fs_info->cleaner_mutex); ret = relocate_block_group(rc); mutex_unlock(&fs_info->cleaner_mutex); - if (ret < 0) { + if (ret < 0) err = ret; - goto out; - } - - if (rc->extents_found == 0) - break; - - btrfs_info(fs_info, "found %llu extents", rc->extents_found);
+ /* + * We may have gotten ENOSPC after we already dirtied some + * extents. If writeout happens while we're relocating a + * different block group we could end up hitting the + * BUG_ON(rc->stage == UPDATE_DATA_PTRS) in + * btrfs_reloc_cow_block. Make sure we write everything out + * properly so we don't trip over this problem, and then break + * out of the loop if we hit an error. + */ if (rc->stage == MOVE_DATA_EXTENTS && rc->found_file_extent) { ret = btrfs_wait_ordered_range(rc->data_inode, 0, (u64)-1); - if (ret) { + if (ret) err = ret; - goto out; - } invalidate_mapping_pages(rc->data_inode->i_mapping, 0, -1); rc->stage = UPDATE_DATA_PTRS; } + + if (err < 0) + goto out; + + if (rc->extents_found == 0) + break; + + btrfs_info(fs_info, "found %llu extents", rc->extents_found); + }
WARN_ON(rc->block_group->pinned > 0);
From: Qu Wenruo wqu@suse.com
[ Upstream commit 10995c0491204c861948c9850939a7f4e90760a4 ]
Commit d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots()") expands the life span of root->reloc_root.
This breaks certain checs of fs_info->reloc_ctl. Before that commit, if we have a root with valid reloc_root, then it's ensured to have fs_info->reloc_ctl.
But now since reloc_root doesn't always mean a valid fs_info->reloc_ctl, such check is unreliable and can cause the following NULL pointer dereference:
BUG: unable to handle kernel NULL pointer dereference at 00000000000005c1 IP: btrfs_reloc_pre_snapshot+0x20/0x50 [btrfs] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 10379 Comm: snapperd Not tainted Call Trace: create_pending_snapshot+0xd7/0xfc0 [btrfs] create_pending_snapshots+0x8e/0xb0 [btrfs] btrfs_commit_transaction+0x2ac/0x8f0 [btrfs] btrfs_mksubvol+0x561/0x570 [btrfs] btrfs_ioctl_snap_create_transid+0x189/0x190 [btrfs] btrfs_ioctl_snap_create_v2+0x102/0x150 [btrfs] btrfs_ioctl+0x5c9/0x1e60 [btrfs] do_vfs_ioctl+0x90/0x5f0 SyS_ioctl+0x74/0x80 do_syscall_64+0x7b/0x150 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7fd7cdab8467
Fix it by explicitly checking fs_info->reloc_ctl other than using the implied root->reloc_root.
Fixes: d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots") Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/relocation.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 00c3dd92f088f..1d82ee4883eb3 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -4676,14 +4676,12 @@ int btrfs_reloc_cow_block(struct btrfs_trans_handle *trans, void btrfs_reloc_pre_snapshot(struct btrfs_pending_snapshot *pending, u64 *bytes_to_reserve) { - struct btrfs_root *root; - struct reloc_control *rc; + struct btrfs_root *root = pending->root; + struct reloc_control *rc = root->fs_info->reloc_ctl;
- root = pending->root; - if (!root->reloc_root) + if (!root->reloc_root || !rc) return;
- rc = root->fs_info->reloc_ctl; if (!rc->merge_reloc_tree) return;
@@ -4712,10 +4710,10 @@ int btrfs_reloc_post_snapshot(struct btrfs_trans_handle *trans, struct btrfs_root *root = pending->root; struct btrfs_root *reloc_root; struct btrfs_root *new_root; - struct reloc_control *rc; + struct reloc_control *rc = root->fs_info->reloc_ctl; int ret;
- if (!root->reloc_root) + if (!root->reloc_root || !rc) return 0;
rc = root->fs_info->reloc_ctl;
From: Qu Wenruo wqu@suse.com
[ Upstream commit 7ac1e464c4d473b517bb784f30d40da1f842482e ]
When we failed to find a root key in btrfs_update_root(), we just panic.
That's definitely not cool, fix it by outputting an unique error message, aborting current transaction and return -EUCLEAN. This should not normally happen as the root has been used by the callers in some way.
Reviewed-by: Filipe Manana fdmanana@suse.com Reviewed-by: Johannes Thumshirn jthumshirn@suse.de Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/root-tree.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c index 893d12fbfda07..1b9a5d0de1392 100644 --- a/fs/btrfs/root-tree.c +++ b/fs/btrfs/root-tree.c @@ -137,11 +137,14 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, struct btrfs_root goto out; }
- if (ret != 0) { - btrfs_print_leaf(path->nodes[0]); - btrfs_crit(fs_info, "unable to update root key %llu %u %llu", - key->objectid, key->type, key->offset); - BUG_ON(1); + if (ret > 0) { + btrfs_crit(fs_info, + "unable to find root key (%llu %u %llu) in tree %llu", + key->objectid, key->type, key->offset, + root->root_key.objectid); + ret = -EUCLEAN; + btrfs_abort_transaction(trans, ret); + goto out; }
l = path->nodes[0];
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 30f24eabab8cd801064c5c37589d803cb4341929 ]
If for some reason the device gives us an RX interrupt before we're ready for it, perhaps during device power-on with misconfigured IRQ causes mapping or so, we can crash trying to access the queues.
Prevent that by checking that we actually have RXQs and that they were properly allocated.
Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/pcie/rx.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c index 8d4f0628622bb..12f02aaf923ed 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c @@ -1434,10 +1434,15 @@ static struct iwl_rx_mem_buffer *iwl_pcie_get_rxb(struct iwl_trans *trans, static void iwl_pcie_rx_handle(struct iwl_trans *trans, int queue) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); - struct iwl_rxq *rxq = &trans_pcie->rxq[queue]; + struct iwl_rxq *rxq; u32 r, i, count = 0; bool emergency = false;
+ if (WARN_ON_ONCE(!trans_pcie->rxq || !trans_pcie->rxq[queue].bd)) + return; + + rxq = &trans_pcie->rxq[queue]; + restart: spin_lock(&rxq->lock); /* uCode's read index (stored in shared DRAM) indicates the last Rx
From: Sven Van Asbroeck thesven73@gmail.com
[ Upstream commit f22b1ba15ee5785aa028384ebf77dd39e8e47b70 ]
The device's remove() attempts to shut down the delayed_work scheduled on the kernel-global workqueue by calling flush_scheduled_work().
Unfortunately, flush_scheduled_work() does not prevent the delayed_work from re-scheduling itself. The delayed_work might run after the device has been removed, and touch the already de-allocated info structure. This is a potential use-after-free.
Fix by calling cancel_delayed_work_sync() during remove(): this ensures that the delayed work is properly cancelled, is no longer running, and is not able to re-schedule itself.
This issue was detected with the help of Coccinelle.
Signed-off-by: Sven Van Asbroeck TheSven73@gmail.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rtc/rtc-88pm860x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/rtc/rtc-88pm860x.c b/drivers/rtc/rtc-88pm860x.c index d25282b4a7dd1..73697e4b18a9d 100644 --- a/drivers/rtc/rtc-88pm860x.c +++ b/drivers/rtc/rtc-88pm860x.c @@ -421,7 +421,7 @@ static int pm860x_rtc_remove(struct platform_device *pdev) struct pm860x_rtc_info *info = platform_get_drvdata(pdev);
#ifdef VRTC_CALIBRATION - flush_scheduled_work(); + cancel_delayed_work_sync(&info->calib_work); /* disable measurement */ pm860x_set_bits(info->i2c, PM8607_MEAS_EN2, MEAS2_VRTC, 0); #endif /* VRTC_CALIBRATION */
From: Fabien Dessenne fabien.dessenne@st.com
[ Upstream commit cf612c5949aca2bd81a1e28688957c8149ea2693 ]
Manage the -EPROBE_DEFER error case for the wake IRQ.
Signed-off-by: Fabien Dessenne fabien.dessenne@st.com Acked-by: Amelie Delaunay amelie.delaunay@st.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rtc/rtc-stm32.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/rtc/rtc-stm32.c b/drivers/rtc/rtc-stm32.c index c5908cfea2340..8e6c9b3bcc29a 100644 --- a/drivers/rtc/rtc-stm32.c +++ b/drivers/rtc/rtc-stm32.c @@ -788,11 +788,14 @@ static int stm32_rtc_probe(struct platform_device *pdev) ret = device_init_wakeup(&pdev->dev, true); if (rtc->data->has_wakeirq) { rtc->wakeirq_alarm = platform_get_irq(pdev, 1); - if (rtc->wakeirq_alarm <= 0) - ret = rtc->wakeirq_alarm; - else + if (rtc->wakeirq_alarm > 0) { ret = dev_pm_set_dedicated_wake_irq(&pdev->dev, rtc->wakeirq_alarm); + } else { + ret = rtc->wakeirq_alarm; + if (rtc->wakeirq_alarm == -EPROBE_DEFER) + goto err; + } } if (ret) dev_warn(&pdev->dev, "alarm can't wake up the system: %d", ret);
From: Manish Rangankar mrangankar@marvell.com
[ Upstream commit f848bfd8e167210a29374e8a678892bed591684f ]
Sometimes during connection recovery when there is a failure to resolve ARP, and offload connection was not issued, driver tries to flush pending offload connection work which was not queued up.
kernel: WARNING: CPU: 19 PID: 10110 at kernel/workqueue.c:3030 __flush_work.isra.34+0x19c/0x1b0 kernel: CPU: 19 PID: 10110 Comm: iscsid Tainted: G W 5.1.0-rc4 #11 kernel: Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 2.9.1 12/04/2018 kernel: RIP: 0010:__flush_work.isra.34+0x19c/0x1b0 kernel: Code: 8b fb 66 0f 1f 44 00 00 31 c0 eb ab 48 89 ef c6 07 00 0f 1f 40 00 fb 66 0f 1f 44 00 00 31 c0 eb 96 e8 08 16 fe ff 0f 0b eb 8d <0f> 0b 31 c0 eb 87 0f 1f 40 00 66 2e 0f 1 f 84 00 00 00 00 00 0f 1f kernel: RSP: 0018:ffffa6b4054dba68 EFLAGS: 00010246 kernel: RAX: 0000000000000000 RBX: ffff91df21c36fc0 RCX: 0000000000000000 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff91df21c36fc0 kernel: RBP: ffff91df21c36ef0 R08: 0000000000000000 R09: 0000000000000000 kernel: R10: 0000000000000038 R11: ffffa6b4054dbd60 R12: ffffffffc05e72c0 kernel: R13: ffff91db10280820 R14: 0000000000000048 R15: 0000000000000000 kernel: FS: 00007f5d83cc1740(0000) GS:ffff91df2f840000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 0000000001cc5000 CR3: 0000000465450002 CR4: 00000000001606e0 kernel: Call Trace: kernel: ? try_to_del_timer_sync+0x4d/0x80 kernel: qedi_ep_disconnect+0x3b/0x410 [qedi] kernel: ? 0xffffffffc083c000 kernel: ? klist_iter_exit+0x14/0x20 kernel: ? class_find_device+0x93/0xf0 kernel: iscsi_if_ep_disconnect.isra.18+0x58/0x70 [scsi_transport_iscsi] kernel: iscsi_if_recv_msg+0x10e2/0x1510 [scsi_transport_iscsi] kernel: ? copyout+0x22/0x30 kernel: ? _copy_to_iter+0xa0/0x430 kernel: ? _cond_resched+0x15/0x30 kernel: ? __kmalloc_node_track_caller+0x1f9/0x270 kernel: iscsi_if_rx+0xa5/0x1e0 [scsi_transport_iscsi] kernel: netlink_unicast+0x17f/0x230 kernel: netlink_sendmsg+0x2d2/0x3d0 kernel: sock_sendmsg+0x36/0x50 kernel: ___sys_sendmsg+0x280/0x2a0 kernel: ? timerqueue_add+0x54/0x80 kernel: ? enqueue_hrtimer+0x38/0x90 kernel: ? hrtimer_start_range_ns+0x19f/0x2c0 kernel: __sys_sendmsg+0x58/0xa0 kernel: do_syscall_64+0x5b/0x180 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Manish Rangankar mrangankar@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qedi/qedi_iscsi.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c index 6d6d6013e35b8..bf371e7b957d0 100644 --- a/drivers/scsi/qedi/qedi_iscsi.c +++ b/drivers/scsi/qedi/qedi_iscsi.c @@ -1000,6 +1000,9 @@ static void qedi_ep_disconnect(struct iscsi_endpoint *ep) qedi_ep = ep->dd_data; qedi = qedi_ep->qedi;
+ if (qedi_ep->state == EP_STATE_OFLDCONN_START) + goto ep_exit_recover; + flush_work(&qedi_ep->offload_work);
if (qedi_ep->conn) {
From: Philipp Rudo prudo@linux.ibm.com
[ Upstream commit 729829d775c9a5217abc784b2f16087d79c4eec8 ]
To register data for the next kernel (command line, oldmem_base, etc.) the current kernel needs to find the ELF segment that contains head.S. This is currently done by checking ifor 'phdr->p_paddr == 0'. This works fine for the current kernel build but in theory the first few pages could be skipped. Make the detection more robust by checking if the entry point lies within the segment.
Signed-off-by: Philipp Rudo prudo@linux.ibm.com Signed-off-by: Martin Schwidefsky schwidefsky@de.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/kexec_elf.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/kexec_elf.c b/arch/s390/kernel/kexec_elf.c index 5a286b012043b..602e7cc26d118 100644 --- a/arch/s390/kernel/kexec_elf.c +++ b/arch/s390/kernel/kexec_elf.c @@ -19,10 +19,15 @@ static int kexec_file_add_elf_kernel(struct kimage *image, struct kexec_buf buf; const Elf_Ehdr *ehdr; const Elf_Phdr *phdr; + Elf_Addr entry; int i, ret;
ehdr = (Elf_Ehdr *)kernel; buf.image = image; + if (image->type == KEXEC_TYPE_CRASH) + entry = STARTUP_KDUMP_OFFSET; + else + entry = ehdr->e_entry;
phdr = (void *)ehdr + ehdr->e_phoff; for (i = 0; i < ehdr->e_phnum; i++, phdr++) { @@ -35,7 +40,7 @@ static int kexec_file_add_elf_kernel(struct kimage *image, buf.mem = ALIGN(phdr->p_paddr, phdr->p_align); buf.memsz = phdr->p_memsz;
- if (phdr->p_paddr == 0) { + if (entry - phdr->p_paddr < phdr->p_memsz) { data->kernel_buf = buf.buffer; data->memsz += STARTUP_NORMAL_OFFSET;
From: Bard liao yung-chuan.liao@linux.intel.com
[ Upstream commit 4d95c51776b2edb4d4ebcea00b6e5a1fe538ce66 ]
snd_hda_codec_device_new() is used by both legacy HDA and ASoC driver. However, we will call snd_hdac_device_unregister() in snd_hdac_ext_bus_device_remove() for ASoC device. This patch uses the type flag in hdac_device struct to determine is it a ASoC device or legacy HDA device and call snd_hdac_device_unregister() in snd_hda_codec_dev_free() only if it is a legacy HDA device.
Signed-off-by: Bard liao yung-chuan.liao@linux.intel.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/hda_codec.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 701a69d856f5f..b20eb7fc83eb2 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -832,7 +832,13 @@ static int snd_hda_codec_dev_free(struct snd_device *device) struct hda_codec *codec = device->device_data;
codec->in_freeing = 1; - snd_hdac_device_unregister(&codec->core); + /* + * snd_hda_codec_device_new() is used by legacy HDA and ASoC driver. + * We can't unregister ASoC device since it will be unregistered in + * snd_hdac_ext_bus_device_remove(). + */ + if (codec->core.type == HDA_DEV_LEGACY) + snd_hdac_device_unregister(&codec->core); codec_display_power(codec, false); put_device(hda_codec_dev(codec)); return 0;
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit 9b019acb72e4b5741d88e8936d6f200ed44b66b2 ]
The NOHZ idle balancer runs on the lowest idle CPU. This can interfere with isolated CPUs, so confine it to HK_FLAG_MISC housekeeping CPUs.
HK_FLAG_SCHED is not used for this because it is not set anywhere at the moment. This could be folded into HK_FLAG_SCHED once that option is fixed.
The problem was observed with increased jitter on an application running on CPU0, caused by NOHZ idle load balancing being run on CPU1 (an SMT sibling).
Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Frederic Weisbecker fweisbec@gmail.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20190412042613.28930-1-npiggin@gmail.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 35f3ea3750844..232491e3ed0db 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -9551,22 +9551,26 @@ static inline int on_null_domain(struct rq *rq) * - When one of the busy CPUs notice that there may be an idle rebalancing * needed, they will kick the idle load balancer, which then does idle * load balancing for all the idle CPUs. + * - HK_FLAG_MISC CPUs are used for this task, because HK_FLAG_SCHED not set + * anywhere yet. */
static inline int find_new_ilb(void) { - int ilb = cpumask_first(nohz.idle_cpus_mask); + int ilb;
- if (ilb < nr_cpu_ids && idle_cpu(ilb)) - return ilb; + for_each_cpu_and(ilb, nohz.idle_cpus_mask, + housekeeping_cpumask(HK_FLAG_MISC)) { + if (idle_cpu(ilb)) + return ilb; + }
return nr_cpu_ids; }
/* - * Kick a CPU to do the nohz balancing, if it is time for it. We pick the - * nohz_load_balancer CPU (if there is one) otherwise fallback to any idle - * CPU (if there is one). + * Kick a CPU to do the nohz balancing, if it is time for it. We pick any + * idle CPU in the HK_FLAG_MISC housekeeping set (if there is one). */ static void kick_ilb(unsigned int flags) {
From: Grygorii Strashko grygorii.strashko@ti.com
[ Upstream commit 06095f34f8a0a2c4c83a19514c272699edd5f80b ]
Now CPSW ALE will set/clean Host port bit in Unregistered Multicast Flood Mask (UNREG_MCAST_FLOOD_MASK) for every VLAN without checking if this port belongs to VLAN or not when ALLMULTI mode flag is set for nedev. This is working in non dual_mac mode, but in dual_mac - it causes enabling/disabling ALLMULTI flag for both ports.
Hence fix it by adding additional parameter to cpsw_ale_set_allmulti() to specify ALE port number for which ALLMULTI has to be enabled and check if port belongs to VLAN before modifying UNREG_MCAST_FLOOD_MASK.
Signed-off-by: Grygorii Strashko grygorii.strashko@ti.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ti/cpsw.c | 12 +++++++++--- drivers/net/ethernet/ti/cpsw_ale.c | 19 ++++++++++--------- drivers/net/ethernet/ti/cpsw_ale.h | 3 +-- 3 files changed, 20 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c index a591583d120e1..dd12b73a88530 100644 --- a/drivers/net/ethernet/ti/cpsw.c +++ b/drivers/net/ethernet/ti/cpsw.c @@ -800,12 +800,17 @@ static int cpsw_purge_all_mc(struct net_device *ndev, const u8 *addr, int num)
static void cpsw_ndo_set_rx_mode(struct net_device *ndev) { - struct cpsw_common *cpsw = ndev_to_cpsw(ndev); + struct cpsw_priv *priv = netdev_priv(ndev); + struct cpsw_common *cpsw = priv->cpsw; + int slave_port = -1; + + if (cpsw->data.dual_emac) + slave_port = priv->emac_port + 1;
if (ndev->flags & IFF_PROMISC) { /* Enable promiscuous mode */ cpsw_set_promiscious(ndev, true); - cpsw_ale_set_allmulti(cpsw->ale, IFF_ALLMULTI); + cpsw_ale_set_allmulti(cpsw->ale, IFF_ALLMULTI, slave_port); return; } else { /* Disable promiscuous mode */ @@ -813,7 +818,8 @@ static void cpsw_ndo_set_rx_mode(struct net_device *ndev) }
/* Restore allmulti on vlans if necessary */ - cpsw_ale_set_allmulti(cpsw->ale, ndev->flags & IFF_ALLMULTI); + cpsw_ale_set_allmulti(cpsw->ale, + ndev->flags & IFF_ALLMULTI, slave_port);
/* add/remove mcast address either for real netdev or for vlan */ __hw_addr_ref_sync_dev(&ndev->mc, ndev, cpsw_add_mc_addr, diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c index 798c989d5d934..b3d9591b4824a 100644 --- a/drivers/net/ethernet/ti/cpsw_ale.c +++ b/drivers/net/ethernet/ti/cpsw_ale.c @@ -482,24 +482,25 @@ int cpsw_ale_del_vlan(struct cpsw_ale *ale, u16 vid, int port_mask) } EXPORT_SYMBOL_GPL(cpsw_ale_del_vlan);
-void cpsw_ale_set_allmulti(struct cpsw_ale *ale, int allmulti) +void cpsw_ale_set_allmulti(struct cpsw_ale *ale, int allmulti, int port) { u32 ale_entry[ALE_ENTRY_WORDS]; - int type, idx; int unreg_mcast = 0; - - /* Only bother doing the work if the setting is actually changing */ - if (ale->allmulti == allmulti) - return; - - /* Remember the new setting to check against next time */ - ale->allmulti = allmulti; + int type, idx;
for (idx = 0; idx < ale->params.ale_entries; idx++) { + int vlan_members; + cpsw_ale_read(ale, idx, ale_entry); type = cpsw_ale_get_entry_type(ale_entry); if (type != ALE_TYPE_VLAN) continue; + vlan_members = + cpsw_ale_get_vlan_member_list(ale_entry, + ale->vlan_field_bits); + + if (port != -1 && !(vlan_members & BIT(port))) + continue;
unreg_mcast = cpsw_ale_get_vlan_unreg_mcast(ale_entry, diff --git a/drivers/net/ethernet/ti/cpsw_ale.h b/drivers/net/ethernet/ti/cpsw_ale.h index cd07a3e96d576..1fe196d8a5e42 100644 --- a/drivers/net/ethernet/ti/cpsw_ale.h +++ b/drivers/net/ethernet/ti/cpsw_ale.h @@ -37,7 +37,6 @@ struct cpsw_ale { struct cpsw_ale_params params; struct timer_list timer; unsigned long ageout; - int allmulti; u32 version; /* These bits are different on NetCP NU Switch ALE */ u32 port_mask_bits; @@ -116,7 +115,7 @@ int cpsw_ale_del_mcast(struct cpsw_ale *ale, const u8 *addr, int port_mask, int cpsw_ale_add_vlan(struct cpsw_ale *ale, u16 vid, int port, int untag, int reg_mcast, int unreg_mcast); int cpsw_ale_del_vlan(struct cpsw_ale *ale, u16 vid, int port); -void cpsw_ale_set_allmulti(struct cpsw_ale *ale, int allmulti); +void cpsw_ale_set_allmulti(struct cpsw_ale *ale, int allmulti, int port);
int cpsw_ale_control_get(struct cpsw_ale *ale, int port, int control); int cpsw_ale_control_set(struct cpsw_ale *ale, int port,
From: Mariusz Bialonczyk manio@skyboo.net
[ Upstream commit 62909da8aca048ecf9fbd7e484e5100608f40a63 ]
From the DS2408 datasheet [1]:
"Resume Command function checks the status of the RC flag and, if it is set, directly transfers control to the control functions, similar to a Skip ROM command. The only way to set the RC flag is through successfully executing the Match ROM, Search ROM, Conditional Search ROM, or Overdrive-Match ROM command"
The function currently works perfectly fine in a multidrop bus, but when we have only a single slave connected, then only a Skip ROM is used and Match ROM is not called at all. This is leading to problems e.g. with single one DS2408 connected, as the Resume Command is not working properly and the device is responding with failing results after the Resume Command.
This commit is fixing this by using a Skip ROM instead in those cases. The bandwidth / performance advantage is exactly the same.
Refs: [1] https://datasheets.maximintegrated.com/en/ds/DS2408.pdf
Signed-off-by: Mariusz Bialonczyk manio@skyboo.net Reviewed-by: Jean-Francois Dagenais jeff.dagenais@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/w1/w1_io.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/w1/w1_io.c b/drivers/w1/w1_io.c index 0364d3329c526..3516ce6718d94 100644 --- a/drivers/w1/w1_io.c +++ b/drivers/w1/w1_io.c @@ -432,8 +432,7 @@ int w1_reset_resume_command(struct w1_master *dev) if (w1_reset_bus(dev)) return -1;
- /* This will make only the last matched slave perform a skip ROM. */ - w1_write_8(dev, W1_RESUME_CMD); + w1_write_8(dev, dev->slave_count > 1 ? W1_RESUME_CMD : W1_SKIP_ROM); return 0; } EXPORT_SYMBOL_GPL(w1_reset_resume_command);
From: Huazhong Tan tanhuazhong@huawei.com
[ Upstream commit fba2efdae8b4f998f66a2ff4c9f0575e1c4bbc40 ]
When configure pause, current implementation returns directly after setup PFC without setup BP, which is not sufficient.
So this patch fixes it, only return while setting PFC failed.
Fixes: 44e59e375bf7 ("net: hns3: do not return GE PFC setting err when initializing") Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c index aafc69f4bfdd6..a7bbb6d3091a6 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c @@ -1331,8 +1331,11 @@ int hclge_pause_setup_hw(struct hclge_dev *hdev, bool init) ret = hclge_pfc_setup_hw(hdev); if (init && ret == -EOPNOTSUPP) dev_warn(&hdev->pdev->dev, "GE MAC does not support pfc\n"); - else + else if (ret) { + dev_err(&hdev->pdev->dev, "config pfc failed! ret = %d\n", + ret); return ret; + }
return hclge_tm_bp_setup(hdev); }
From: Yunsheng Lin linyunsheng@huawei.com
[ Upstream commit 63380a1ae4ced8aef67659ff9547c69ef8b9613a ]
hns3_desc_unused() returns how many BD have been cleaned, but new buffer has not been attached to them. The register of HNS3_RING_RX_RING_FBDNUM_REG returns how many BD need allocating new buffer to or need to cleaned. So the remaining BD need to be clean is HNS3_RING_RX_RING_FBDNUM_REG - hns3_desc_unused().
Also, new buffer can not attach to the pending BD when the last BD is not handled, because memcpy has not been done on the first pending BD.
This patch fixes by subtracting the pending BD num from unused_count after 'HNS3_RING_RX_RING_FBDNUM_REG - unused_count' is used to calculate the BD bum need to be clean.
Fixes: e55970950556 ("net: hns3: Add handling of GRO Pkts not fully RX'ed in NAPI poll") Signed-off-by: Yunsheng Lin linyunsheng@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index 162cb9afa0e70..0208efe282775 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -2705,7 +2705,7 @@ int hns3_clean_rx_ring( #define RCB_NOF_ALLOC_RX_BUFF_ONCE 16 struct net_device *netdev = ring->tqp->handle->kinfo.netdev; int recv_pkts, recv_bds, clean_count, err; - int unused_count = hns3_desc_unused(ring) - ring->pending_buf; + int unused_count = hns3_desc_unused(ring); struct sk_buff *skb = ring->skb; int num;
@@ -2714,6 +2714,7 @@ int hns3_clean_rx_ring(
recv_pkts = 0, recv_bds = 0, clean_count = 0; num -= unused_count; + unused_count -= ring->pending_buf;
while (recv_pkts < budget && recv_bds < num) { /* Reuse or realloc buffers */
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 8c90b795e90f7753d23c18e8b95dd71b4a18c5d9 ]
PHY's behave differently when being reset. Some reset registers to defaults, some don't. Some trigger an autoneg restart, some don't.
So let's also set the autoneg restart bit when resetting. Then PHY behavior should be more consistent. Clearing BMCR_ISOLATE serves the same purpose and is borrowed from genphy_restart_aneg.
BMCR holds the speed / duplex settings in fixed mode. Therefore we may have an issue if a soft reset resets BMCR to its default. So better call genphy_setup_forced() afterwards in fixed mode. We've seen no related complaint in the last >10 yrs, so let's treat it as an improvement.
Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/phy_device.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index cd5966b0db571..f6a6cc5bf118d 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -1829,13 +1829,25 @@ EXPORT_SYMBOL(genphy_read_status); */ int genphy_soft_reset(struct phy_device *phydev) { + u16 res = BMCR_RESET; int ret;
- ret = phy_set_bits(phydev, MII_BMCR, BMCR_RESET); + if (phydev->autoneg == AUTONEG_ENABLE) + res |= BMCR_ANRESTART; + + ret = phy_modify(phydev, MII_BMCR, BMCR_ISOLATE, res); if (ret < 0) return ret;
- return phy_poll_reset(phydev); + ret = phy_poll_reset(phydev); + if (ret) + return ret; + + /* BMCR may be reset to defaults */ + if (phydev->autoneg == AUTONEG_DISABLE) + ret = genphy_setup_forced(phydev); + + return ret; } EXPORT_SYMBOL(genphy_soft_reset);
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 46b83629dede262315aa82179d105581f11763b6 ]
clang produces a harmless warning for each use for the qeth_adp_supported macro:
drivers/s390/net/qeth_l2_main.c:559:31: warning: implicit conversion from enumeration type 'enum qeth_ipa_setadp_cmd' to different enumeration type 'enum qeth_ipa_funcs' [-Wenum-conversion] if (qeth_adp_supported(card, IPA_SETADP_SET_PROMISC_MODE)) ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/s390/net/qeth_core.h:179:41: note: expanded from macro 'qeth_adp_supported' qeth_is_ipa_supported(&c->options.adp, f) ~~~~~~~~~~~~~~~~~~~~~ ^
Add a version of this macro that uses the correct types, and remove the unused qeth_adp_enabled() macro that has the same problem.
Reviewed-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Julian Wiedmann jwi@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/net/qeth_core.h | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h index c851cf6e01c43..d603dfea97ab2 100644 --- a/drivers/s390/net/qeth_core.h +++ b/drivers/s390/net/qeth_core.h @@ -163,6 +163,12 @@ struct qeth_vnicc_info { bool rx_bcast_enabled; };
+static inline int qeth_is_adp_supported(struct qeth_ipa_info *ipa, + enum qeth_ipa_setadp_cmd func) +{ + return (ipa->supported_funcs & func); +} + static inline int qeth_is_ipa_supported(struct qeth_ipa_info *ipa, enum qeth_ipa_funcs func) { @@ -176,9 +182,7 @@ static inline int qeth_is_ipa_enabled(struct qeth_ipa_info *ipa, }
#define qeth_adp_supported(c, f) \ - qeth_is_ipa_supported(&c->options.adp, f) -#define qeth_adp_enabled(c, f) \ - qeth_is_ipa_enabled(&c->options.adp, f) + qeth_is_adp_supported(&c->options.adp, f) #define qeth_is_supported(c, f) \ qeth_is_ipa_supported(&c->options.ipa4, f) #define qeth_is_enabled(c, f) \
From: Will Deacon will.deacon@arm.com
[ Upstream commit 84ff7a09c371bc7417eabfda19bf7f113ec917b6 ]
Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't explicitly set the return value on the non-faulting path and instead leaves it holding the result of the underlying atomic operation. This means that any FUTEX_WAKE_OP atomic operation which computes a non-zero value will be reported as having failed. Regrettably, I wrote the buggy code back in 2011 and it was upstreamed as part of the initial arm64 support in 2012.
The reasons we appear to get away with this are:
1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get exercised by futex() test applications
2. If the result of the atomic operation is zero, the system call behaves correctly
3. Prior to version 2.25, the only operation used by GLIBC set the futex to zero, and therefore worked as expected. From 2.25 onwards, FUTEX_WAKE_OP is not used by GLIBC at all.
Fix the implementation by ensuring that the return value is either 0 to indicate that the atomic operation completed successfully, or -EFAULT if we encountered a fault when accessing the user mapping.
Cc: stable@kernel.org Fixes: 6170a97460db ("arm64: Atomic operations") Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/futex.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h index 6fb2214333a24..2d78ea6932b7b 100644 --- a/arch/arm64/include/asm/futex.h +++ b/arch/arm64/include/asm/futex.h @@ -58,7 +58,7 @@ do { \ static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *_uaddr) { - int oldval = 0, ret, tmp; + int oldval, ret, tmp; u32 __user *uaddr = __uaccess_mask_ptr(_uaddr);
pagefault_disable();
From: Huazhong Tan tanhuazhong@huawei.com
[ Upstream commit 30780a8b1677e7409b32ae52a9a84f7d41ae6b43 ]
Since irq handler and mailbox task will both update arq's count, so arq's count should use atomic_t instead of u32, otherwise its value may go wrong finally.
Fixes: 07a0556a3a73 ("net: hns3: Changes to support ARQ(Asynchronous Receive Queue)") Signed-off-by: Huazhong Tan tanhuazhong@huawei.com Signed-off-by: Peng Li lipeng321@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h | 2 +- drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c | 2 +- drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c | 7 ++++--- 3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h b/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h index 299b277bc7ae9..589b7ee32bff8 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h +++ b/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h @@ -107,7 +107,7 @@ struct hclgevf_mbx_arq_ring { struct hclgevf_dev *hdev; u32 head; u32 tail; - u32 count; + atomic_t count; u16 msg_q[HCLGE_MBX_MAX_ARQ_MSG_NUM][HCLGE_MBX_MAX_ARQ_MSG_SIZE]; };
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c index 9441b453d38df..9a0a501908aec 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_cmd.c @@ -327,7 +327,7 @@ int hclgevf_cmd_init(struct hclgevf_dev *hdev) hdev->arq.hdev = hdev; hdev->arq.head = 0; hdev->arq.tail = 0; - hdev->arq.count = 0; + atomic_set(&hdev->arq.count, 0); hdev->hw.cmq.csq.next_to_clean = 0; hdev->hw.cmq.csq.next_to_use = 0; hdev->hw.cmq.crq.next_to_clean = 0; diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c index 7dc3c9f79169f..4f2c77283cb43 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c @@ -208,7 +208,8 @@ void hclgevf_mbx_handler(struct hclgevf_dev *hdev) /* we will drop the async msg if we find ARQ as full * and continue with next message */ - if (hdev->arq.count >= HCLGE_MBX_MAX_ARQ_MSG_NUM) { + if (atomic_read(&hdev->arq.count) >= + HCLGE_MBX_MAX_ARQ_MSG_NUM) { dev_warn(&hdev->pdev->dev, "Async Q full, dropping msg(%d)\n", req->msg[1]); @@ -220,7 +221,7 @@ void hclgevf_mbx_handler(struct hclgevf_dev *hdev) memcpy(&msg_q[0], req->msg, HCLGE_MBX_MAX_ARQ_MSG_SIZE * sizeof(u16)); hclge_mbx_tail_ptr_move_arq(hdev->arq); - hdev->arq.count++; + atomic_inc(&hdev->arq.count);
hclgevf_mbx_task_schedule(hdev);
@@ -308,7 +309,7 @@ void hclgevf_mbx_async_handler(struct hclgevf_dev *hdev) }
hclge_mbx_head_ptr_move_arq(hdev->arq); - hdev->arq.count--; + atomic_dec(&hdev->arq.count); msg_q = hdev->arq.msg_q[hdev->arq.head]; } }
From: Sugar Zhang sugar.zhang@rock-chips.com
[ Upstream commit 2da254cc7908105a60a6bb219d18e8dced03dcb9 ]
This patch kill instructs the DMAC to immediately terminate execution of a thread. and then clear the interrupt status, at last, stop generating interrupts for DMA_SEV. to guarantee the next dma start is clean. otherwise, one interrupt maybe leave to next start and make some mistake.
we can reporduce the problem as follows:
DMASEV: modify the event-interrupt resource, and if the INTEN sets function as interrupt, the DMAC will set irq<event_num> HIGH to generate interrupt. write INTCLR to clear interrupt.
DMA EXECUTING INSTRUCTS DMA TERMINATE | | | | ... _stop | | | spin_lock_irqsave DMASEV | | | | mask INTEN | | | DMAKILL | | | spin_unlock_irqrestore
in above case, a interrupt was left, and if we unmask INTEN, the DMAC will set irq<event_num> HIGH to generate interrupt.
to fix this, do as follows:
DMA EXECUTING INSTRUCTS DMA TERMINATE | | | | ... _stop | | | spin_lock_irqsave DMASEV | | | | DMAKILL | | | clear INTCLR | mask INTEN | | | spin_unlock_irqrestore
Signed-off-by: Sugar Zhang sugar.zhang@rock-chips.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/pl330.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c index eec79fdf27a5b..56695ffb5d377 100644 --- a/drivers/dma/pl330.c +++ b/drivers/dma/pl330.c @@ -966,6 +966,7 @@ static void _stop(struct pl330_thread *thrd) { void __iomem *regs = thrd->dmac->base; u8 insn[6] = {0, 0, 0, 0, 0, 0}; + u32 inten = readl(regs + INTEN);
if (_state(thrd) == PL330_STATE_FAULT_COMPLETING) UNTIL(thrd, PL330_STATE_FAULTING | PL330_STATE_KILLING); @@ -978,10 +979,13 @@ static void _stop(struct pl330_thread *thrd)
_emit_KILL(0, insn);
- /* Stop generating interrupts for SEV */ - writel(readl(regs + INTEN) & ~(1 << thrd->ev), regs + INTEN); - _execute_DBGINSN(thrd, insn, is_manager(thrd)); + + /* clear the event */ + if (inten & (1 << thrd->ev)) + writel(1 << thrd->ev, regs + INTCLR); + /* Stop generating interrupts for SEV */ + writel(inten & ~(1 << thrd->ev), regs + INTEN); }
/* Start doing req 'idx' of thread 'thrd' */
From: Sergey Matyukevich sergey.matyukevich.os@quantenna.com
[ Upstream commit 5dc8cdce1d722c733f8c7af14c5fb595cfedbfa8 ]
FullMAC STAs have no way to update bss channel after CSA channel switch completion. As a result, user-space tools may provide inconsistent channel info. For instance, consider the following two commands: $ sudo iw dev wlan0 link $ sudo iw dev wlan0 info The latter command gets channel info from the hardware, so most probably its output will be correct. However the former command gets channel info from scan cache, so its output will contain outdated channel info. In fact, current bss channel info will not be updated until the next [re-]connect.
Note that mac80211 STAs have a workaround for this, but it requires access to internal cfg80211 data, see ieee80211_chswitch_work:
/* XXX: shouldn't really modify cfg80211-owned data! */ ifmgd->associated->channel = sdata->csa_chandef.chan;
This patch suggests to convert mac80211 workaround into cfg80211 behavior and to update current bss channel in cfg80211_ch_switch_notify.
Signed-off-by: Sergey Matyukevich sergey.matyukevich.os@quantenna.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/mlme.c | 3 --- net/wireless/nl80211.c | 5 +++++ 2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 2dbcf5d5512ef..b7a9fe3d5fcb7 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -1188,9 +1188,6 @@ static void ieee80211_chswitch_work(struct work_struct *work) goto out; }
- /* XXX: shouldn't really modify cfg80211-owned data! */ - ifmgd->associated->channel = sdata->csa_chandef.chan; - ifmgd->csa_waiting_bcn = true;
ieee80211_sta_reset_beacon_monitor(sdata); diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index 47e30a58566c2..d2a7459a5da43 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -15727,6 +15727,11 @@ void cfg80211_ch_switch_notify(struct net_device *dev,
wdev->chandef = *chandef; wdev->preset_chandef = *chandef; + + if (wdev->iftype == NL80211_IFTYPE_STATION && + !WARN_ON(!wdev->current_bss)) + wdev->current_bss->pub.channel = chandef->chan; + nl80211_ch_switch_notify(rdev, dev, chandef, GFP_KERNEL, NL80211_CMD_CH_SWITCH_NOTIFY, 0); }
From: Johan Hovold johan@kernel.org
[ Upstream commit 579bebe5dd522580019e7b10b07daaf500f9fb1e ]
The USB-serial driver init_termios callback is used to override the default initial terminal settings provided by USB-serial core.
After a bug was fixed in the original implementation introduced by commit fe1ae7fdd2ee ("tty: USB serial termios bits"), the init_termios callback was no longer called just once on first use as intended but rather on every (first) open.
This specifically meant that the terminal settings saved on (final) close were ignored when reopening a port for drivers overriding the initial settings.
Also update the outdated function header referring to the creation of termios objects.
Fixes: 7e29bb4b779f ("usb-serial: fix termios initialization logic") Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/serial/usb-serial.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/serial/usb-serial.c b/drivers/usb/serial/usb-serial.c index 7e89efbf2c284..676c296103a2f 100644 --- a/drivers/usb/serial/usb-serial.c +++ b/drivers/usb/serial/usb-serial.c @@ -164,9 +164,9 @@ void usb_serial_put(struct usb_serial *serial) * @driver: the driver (USB in our case) * @tty: the tty being created * - * Create the termios objects for this tty. We use the default + * Initialise the termios structure for this tty. We use the default * USB serial settings but permit them to be overridden by - * serial->type->init_termios. + * serial->type->init_termios on first open. * * This is the first place a new tty gets used. Hence this is where we * acquire references to the usb_serial structure and the driver module, @@ -178,6 +178,7 @@ static int serial_install(struct tty_driver *driver, struct tty_struct *tty) int idx = tty->index; struct usb_serial *serial; struct usb_serial_port *port; + bool init_termios; int retval = -ENODEV;
port = usb_serial_port_get_by_minor(idx); @@ -192,14 +193,16 @@ static int serial_install(struct tty_driver *driver, struct tty_struct *tty) if (retval) goto error_get_interface;
+ init_termios = (driver->termios[idx] == NULL); + retval = tty_standard_install(driver, tty); if (retval) goto error_init_termios;
mutex_unlock(&serial->disc_mutex);
- /* allow the driver to update the settings */ - if (serial->type->init_termios) + /* allow the driver to update the initial settings */ + if (init_termios && serial->type->init_termios) serial->type->init_termios(tty);
tty->driver_data = port;
Hi Sasha,
On Wed, May 22, 2019 at 03:16:17PM -0400, Sasha Levin wrote:
From: Johan Hovold johan@kernel.org
[ Upstream commit 579bebe5dd522580019e7b10b07daaf500f9fb1e ]
The USB-serial driver init_termios callback is used to override the default initial terminal settings provided by USB-serial core.
After a bug was fixed in the original implementation introduced by commit fe1ae7fdd2ee ("tty: USB serial termios bits"), the init_termios callback was no longer called just once on first use as intended but rather on every (first) open.
This specifically meant that the terminal settings saved on (final) close were ignored when reopening a port for drivers overriding the initial settings.
Also update the outdated function header referring to the creation of termios objects.
Fixes: 7e29bb4b779f ("usb-serial: fix termios initialization logic") Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org
The stable tag was left out on purpose as this is essentially a new feature, and definitely a behavioural change which should not be backported.
Please drop from your autosel queues.
Also, may I ask you again not to include usb-serial (and drivers/gnss) in your autosel processing.
Thanks, Johan
On Thu, May 23, 2019 at 07:26:00AM +0200, Johan Hovold wrote:
Hi Sasha,
On Wed, May 22, 2019 at 03:16:17PM -0400, Sasha Levin wrote:
From: Johan Hovold johan@kernel.org
[ Upstream commit 579bebe5dd522580019e7b10b07daaf500f9fb1e ]
The USB-serial driver init_termios callback is used to override the default initial terminal settings provided by USB-serial core.
After a bug was fixed in the original implementation introduced by commit fe1ae7fdd2ee ("tty: USB serial termios bits"), the init_termios callback was no longer called just once on first use as intended but rather on every (first) open.
This specifically meant that the terminal settings saved on (final) close were ignored when reopening a port for drivers overriding the initial settings.
Also update the outdated function header referring to the creation of termios objects.
Fixes: 7e29bb4b779f ("usb-serial: fix termios initialization logic") Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org
The stable tag was left out on purpose as this is essentially a new feature, and definitely a behavioural change which should not be backported.
Please drop from your autosel queues.
Dropped!
Also, may I ask you again not to include usb-serial (and drivers/gnss) in your autosel processing.
Of course, I've added it to my list.
-- Thanks, Sasha
linux-stable-mirror@lists.linaro.org