September 2018 - Linux-stable-mirror

[PATCH RESEND] kthread, tracing: Don't expose half-written comm when creating kthreads

by Snild Dolkow

I didn't receive any replies on this, and it doesn't seem to have made it into the latest 3.18 or 4.4 releases. Previously sent on Aug 23rd: https://lists.linaro.org/pipermail/linux-stable-mirror/2018-August/056347.h… I assume I sent it wrong; maybe missing keywords or recipients? If anyone can tell me what I missed, I'd appreciate it. Trivial backport of commit 3e536e222f293053; newer kernels have simply moved the vararg macros. Testing: 3.18 and 4.4 booted OK in qemu. >8------------------------------------------------------8< [backport of commit 3e536e222f293053 from mainline] There is a window for racing when printing directly to task->comm, allowing other threads to see a non-terminated string. The vsnprintf function fills the buffer, counts the truncated chars, then finally writes the \0 at the end. creator other vsnprintf: fill (not terminated) count the rest trace_sched_waking(p): ... memcpy(comm, p->comm, TASK_COMM_LEN) write \0 The consequences depend on how 'other' uses the string. In our case, it was copied into the tracing system's saved cmdlines, a buffer of adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be): crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk' 0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12" ...and a strcpy out of there would cause stack corruption: [224761.522292] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffff9bf9783c78 crash-arm64> kbt | grep 'comm\|trace_print_context' #6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396) comm (char [16]) = "irq/497-pwr_even" crash-arm64> rd 0xffffffd4d0e17d14 8 ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_ ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16: ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`.. ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`.......... The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was likely needed because of this same bug. Solved by vsnprintf:ing to a local buffer, then using set_task_comm(). This way, there won't be a window where comm is not terminated. Cc: stable(a)vger.kernel.org Fixes: bc0c38d139ec7 ("ftrace: latency tracer infrastructure") Reviewed-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> [backported to 3.18 / 4.4 by Snild] Signed-off-by: Snild Dolkow <snild(a)sony.com> --- kernel/kthread.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/kernel/kthread.c b/kernel/kthread.c index 850b255..ac6849e 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -313,10 +313,16 @@ struct task_struct *kthread_create_on_node(int (*threadfn)(void *data), task = create->result; if (!IS_ERR(task)) { static const struct sched_param param = { .sched_priority = 0 }; + char name[TASK_COMM_LEN]; va_list args; va_start(args, namefmt); - vsnprintf(task->comm, sizeof(task->comm), namefmt, args); + /* + * task is already visible to other tasks, so updating + * COMM must be protected. + */ + vsnprintf(name, sizeof(name), namefmt, args); + set_task_comm(task, name); va_end(args); /* * root may have changed our (kthreadd's) priority or CPU mask. -- 2.7.4

7 years, 3 months

2
3
0 0

[PATCH v3 02/10] usb: roles: Handle driver reference counting

by Heikki Krogerus

This fixes potential "BUG: unable to handle kernel paging request at ..." from happening. Fixes: fde0aa6c175a ("usb: common: Small class for USB role switches") Cc: <stable(a)vger.kernel.org> Signed-off-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com> --- drivers/usb/common/roles.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/usb/common/roles.c b/drivers/usb/common/roles.c index 15cc76e22123..3d8a776e55ee 100644 --- a/drivers/usb/common/roles.c +++ b/drivers/usb/common/roles.c @@ -109,8 +109,15 @@ static void *usb_role_switch_match(struct device_connection *con, int ep, */ struct usb_role_switch *usb_role_switch_get(struct device *dev) { - return device_connection_find_match(dev, "usb-role-switch", NULL, - usb_role_switch_match); + struct usb_role_switch *sw; + + sw = device_connection_find_match(dev, "usb-role-switch", NULL, + usb_role_switch_match); + + if (!IS_ERR(sw)) + WARN_ON(!try_module_get(sw->dev.parent->driver->owner)); + + return sw; } EXPORT_SYMBOL_GPL(usb_role_switch_get); @@ -122,8 +129,10 @@ EXPORT_SYMBOL_GPL(usb_role_switch_get); */ void usb_role_switch_put(struct usb_role_switch *sw) { - if (!IS_ERR_OR_NULL(sw)) + if (!IS_ERR_OR_NULL(sw)) { put_device(&sw->dev); + module_put(sw->dev.parent->driver->owner); + } } EXPORT_SYMBOL_GPL(usb_role_switch_put); -- 2.18.0

7 years, 3 months

2
2
0 0

[PATCH] kthread, tracing: Don't expose half-written comm when creating kthreads

by Snild Dolkow

Trivial backport of commit 3e536e222f293053; newer kernels have simply moved the vararg macros. Testing: 3.18 and 4.4 booted OK in qemu. >8------------------------------------------------------8< [backport of commit 3e536e222f293053 from mainline] There is a window for racing when printing directly to task->comm, allowing other threads to see a non-terminated string. The vsnprintf function fills the buffer, counts the truncated chars, then finally writes the \0 at the end. creator other vsnprintf: fill (not terminated) count the rest trace_sched_waking(p): ... memcpy(comm, p->comm, TASK_COMM_LEN) write \0 The consequences depend on how 'other' uses the string. In our case, it was copied into the tracing system's saved cmdlines, a buffer of adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be): crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk' 0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12" ...and a strcpy out of there would cause stack corruption: [224761.522292] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffff9bf9783c78 crash-arm64> kbt | grep 'comm\|trace_print_context' #6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396) comm (char [16]) = "irq/497-pwr_even" crash-arm64> rd 0xffffffd4d0e17d14 8 ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_ ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16: ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`.. ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`.......... The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was likely needed because of this same bug. Solved by vsnprintf:ing to a local buffer, then using set_task_comm(). This way, there won't be a window where comm is not terminated. Cc: stable(a)vger.kernel.org Fixes: bc0c38d139ec7 ("ftrace: latency tracer infrastructure") Reviewed-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> [backported to 3.18 / 4.4 by Snild] Signed-off-by: Snild Dolkow <snild(a)sony.com> --- kernel/kthread.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/kernel/kthread.c b/kernel/kthread.c index 850b255..ac6849e 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -313,10 +313,16 @@ struct task_struct *kthread_create_on_node(int (*threadfn)(void *data), task = create->result; if (!IS_ERR(task)) { static const struct sched_param param = { .sched_priority = 0 }; + char name[TASK_COMM_LEN]; va_list args; va_start(args, namefmt); - vsnprintf(task->comm, sizeof(task->comm), namefmt, args); + /* + * task is already visible to other tasks, so updating + * COMM must be protected. + */ + vsnprintf(name, sizeof(name), namefmt, args); + set_task_comm(task, name); va_end(args); /* * root may have changed our (kthreadd's) priority or CPU mask. -- 2.7.4

7 years, 3 months

2
1
0 0

FAILED: patch "[PATCH] extcon: Release locking when sending the notification of" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 8a9dbb779fe882325b9a0238494a7afaff2eb444 Mon Sep 17 00:00:00 2001 From: Chanwoo Choi <cw00.choi(a)samsung.com> Date: Thu, 14 Jun 2018 11:16:29 +0900 Subject: [PATCH] extcon: Release locking when sending the notification of connector state Previously, extcon used the spinlock before calling the notifier_call_chain to prevent the scheduled out of task and to prevent the notification delay. When spinlock is locked for sending the notification, deadlock issue occured on the side of extcon consumer device. To fix this issue, extcon consumer device should always use the work. it is always not reasonable to use work. To fix this issue on extcon consumer device, release locking when sending the notification of connector state. Fixes: ab11af049f88 ("extcon: Add the synchronization extcon APIs to support the notification") Cc: stable(a)vger.kernel.org Cc: Roger Quadros <rogerq(a)ti.com> Cc: Kishon Vijay Abraham I <kishon(a)ti.com> Signed-off-by: Chanwoo Choi <cw00.choi(a)samsung.com> diff --git a/drivers/extcon/extcon.c b/drivers/extcon/extcon.c index af83ad58819c..b9d27c8fe57e 100644 --- a/drivers/extcon/extcon.c +++ b/drivers/extcon/extcon.c @@ -433,8 +433,8 @@ int extcon_sync(struct extcon_dev *edev, unsigned int id) return index; spin_lock_irqsave(&edev->lock, flags); - state = !!(edev->state & BIT(index)); + spin_unlock_irqrestore(&edev->lock, flags); /* * Call functions in a raw notifier chain for the specific one @@ -448,6 +448,7 @@ int extcon_sync(struct extcon_dev *edev, unsigned int id) */ raw_notifier_call_chain(&edev->nh_all, state, edev); + spin_lock_irqsave(&edev->lock, flags); /* This could be in interrupt handler */ prop_buf = (char *)get_zeroed_page(GFP_ATOMIC); if (!prop_buf) {

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] dm cache metadata: set dirty on all cache blocks after a" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 5b1fe7bec8a8d0cc547a22e7ddc2bd59acd67de4 Mon Sep 17 00:00:00 2001 From: Ilya Dryomov <idryomov(a)gmail.com> Date: Thu, 9 Aug 2018 12:38:28 +0200 Subject: [PATCH] dm cache metadata: set dirty on all cache blocks after a crash Quoting Documentation/device-mapper/cache.txt: The 'dirty' state for a cache block changes far too frequently for us to keep updating it on the fly. So we treat it as a hint. In normal operation it will be written when the dm device is suspended. If the system crashes all cache blocks will be assumed dirty when restarted. This got broken in commit f177940a8091 ("dm cache metadata: switch to using the new cursor api for loading metadata") in 4.9, which removed the code that consulted cmd->clean_when_opened (CLEAN_SHUTDOWN on-disk flag) when loading cache blocks. This results in data corruption on an unclean shutdown with dirty cache blocks on the fast device. After the crash those blocks are considered clean and may get evicted from the cache at any time. This can be demonstrated by doing a lot of reads to trigger individual evictions, but uncache is more predictable: ### Disable auto-activation in lvm.conf to be able to do uncache in ### time (i.e. see uncache doing flushing) when the fix is applied. # xfs_io -d -c 'pwrite -b 4M -S 0xaa 0 1G' /dev/vdb # vgcreate vg_cache /dev/vdb /dev/vdc # lvcreate -L 1G -n lv_slowdev vg_cache /dev/vdb # lvcreate -L 512M -n lv_cachedev vg_cache /dev/vdc # lvcreate -L 256M -n lv_metadev vg_cache /dev/vdc # lvconvert --type cache-pool --cachemode writeback vg_cache/lv_cachedev --poolmetadata vg_cache/lv_metadev # lvconvert --type cache vg_cache/lv_slowdev --cachepool vg_cache/lv_cachedev # xfs_io -d -c 'pwrite -b 4M -S 0xbb 0 512M' /dev/mapper/vg_cache-lv_slowdev # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ # dmsetup status vg_cache-lv_slowdev 0 2097152 cache 8 27/65536 128 8192/8192 1 100 0 0 0 8192 7065 2 metadata2 writeback 2 migration_threshold 2048 smq 0 rw - ^^^^ 7065 * 64k = 441M yet to be written to the slow device # echo b >/proc/sysrq-trigger # vgchange -ay vg_cache # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ # lvconvert --uncache vg_cache/lv_slowdev Flushing 0 blocks for cache vg_cache/lv_slowdev. Logical volume "lv_cachedev" successfully removed Logical volume vg_cache/lv_slowdev is not cached. # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2 0fe00000: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ 0fe00010: aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa ................ This is the case with both v1 and v2 cache pool metatata formats. After applying this patch: # vgchange -ay vg_cache # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ # lvconvert --uncache vg_cache/lv_slowdev Flushing 3724 blocks for cache vg_cache/lv_slowdev. ... Flushing 71 blocks for cache vg_cache/lv_slowdev. Logical volume "lv_cachedev" successfully removed Logical volume vg_cache/lv_slowdev is not cached. # xfs_io -d -c 'pread -v 254M 512' /dev/mapper/vg_cache-lv_slowdev | head -n 2 0fe00000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ 0fe00010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ Cc: stable(a)vger.kernel.org Fixes: f177940a8091 ("dm cache metadata: switch to using the new cursor api for loading metadata") Signed-off-by: Ilya Dryomov <idryomov(a)gmail.com> Signed-off-by: Mike Snitzer <snitzer(a)redhat.com> diff --git a/drivers/md/dm-cache-metadata.c b/drivers/md/dm-cache-metadata.c index 1a449105b007..69dddeab124c 100644 --- a/drivers/md/dm-cache-metadata.c +++ b/drivers/md/dm-cache-metadata.c @@ -1323,6 +1323,7 @@ static int __load_mapping_v1(struct dm_cache_metadata *cmd, dm_oblock_t oblock; unsigned flags; + bool dirty = true; dm_array_cursor_get_value(mapping_cursor, (void **) &mapping_value_le); memcpy(&mapping, mapping_value_le, sizeof(mapping)); @@ -1333,8 +1334,10 @@ static int __load_mapping_v1(struct dm_cache_metadata *cmd, dm_array_cursor_get_value(hint_cursor, (void **) &hint_value_le); memcpy(&hint, hint_value_le, sizeof(hint)); } + if (cmd->clean_when_opened) + dirty = flags & M_DIRTY; - r = fn(context, oblock, to_cblock(cb), flags & M_DIRTY, + r = fn(context, oblock, to_cblock(cb), dirty, le32_to_cpu(hint), hints_valid); if (r) { DMERR("policy couldn't load cache block %llu", @@ -1362,7 +1365,7 @@ static int __load_mapping_v2(struct dm_cache_metadata *cmd, dm_oblock_t oblock; unsigned flags; - bool dirty; + bool dirty = true; dm_array_cursor_get_value(mapping_cursor, (void **) &mapping_value_le); memcpy(&mapping, mapping_value_le, sizeof(mapping)); @@ -1373,8 +1376,9 @@ static int __load_mapping_v2(struct dm_cache_metadata *cmd, dm_array_cursor_get_value(hint_cursor, (void **) &hint_value_le); memcpy(&hint, hint_value_le, sizeof(hint)); } + if (cmd->clean_when_opened) + dirty = dm_bitset_cursor_get_value(dirty_cursor); - dirty = dm_bitset_cursor_get_value(dirty_cursor); r = fn(context, oblock, to_cblock(cb), dirty, le32_to_cpu(hint), hints_valid); if (r) {

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] dm thin: stop no_space_timeout worker when switching to" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 75294442d896f2767be34f75aca7cc2b0d01301f Mon Sep 17 00:00:00 2001 From: Hou Tao <houtao1(a)huawei.com> Date: Thu, 2 Aug 2018 16:18:24 +0800 Subject: [PATCH] dm thin: stop no_space_timeout worker when switching to write-mode Now both check_for_space() and do_no_space_timeout() will read & write pool->pf.error_if_no_space. If these functions run concurrently, as shown in the following case, the default setting of "queue_if_no_space" can get lost. precondition: * error_if_no_space = false (aka "queue_if_no_space") * pool is in Out-of-Data-Space (OODS) mode * no_space_timeout worker has been queued CPU 0: CPU 1: // delete a thin device process_delete_mesg() // check_for_space() invoked by commit() set_pool_mode(pool, PM_WRITE) pool->pf.error_if_no_space = \ pt->requested_pf.error_if_no_space // timeout, pool is still in OODS mode do_no_space_timeout // "queue_if_no_space" config is lost pool->pf.error_if_no_space = true pool->pf.mode = new_mode Fix it by stopping no_space_timeout worker when switching to write mode. Fixes: bcc696fac11f ("dm thin: stay in out-of-data-space mode once no_space_timeout expires") Cc: stable(a)vger.kernel.org Signed-off-by: Hou Tao <houtao1(a)huawei.com> Signed-off-by: Mike Snitzer <snitzer(a)redhat.com> diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index 5997d6808b57..7bd60a150f8f 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -2503,6 +2503,8 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode) case PM_WRITE: if (old_mode != new_mode) notify_of_pool_mode_change(pool, "write"); + if (old_mode == PM_OUT_OF_DATA_SPACE) + cancel_delayed_work_sync(&pool->no_space_timeout); pool->out_of_data_space = false; pool->pf.error_if_no_space = pt->requested_pf.error_if_no_space; dm_pool_metadata_read_write(pool->pmd);

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] RDMA/mlx5: Check that supplied blue flame index doesn't" failed to apply to 4.18-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.18-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 05f58ceba123bdb420cf44c6ea04b6db467edd1c Mon Sep 17 00:00:00 2001 From: Leon Romanovsky <leonro(a)mellanox.com> Date: Sun, 8 Jul 2018 13:50:21 +0300 Subject: [PATCH] RDMA/mlx5: Check that supplied blue flame index doesn't overflow User's supplied index is checked again total number of system pages, but this number already includes num_static_sys_pages, so addition of that value to supplied index causes to below error while trying to access sys_pages[]. BUG: KASAN: slab-out-of-bounds in bfregn_to_uar_index+0x34f/0x400 Read of size 4 at addr ffff880065561904 by task syz-executor446/314 CPU: 0 PID: 314 Comm: syz-executor446 Not tainted 4.18.0-rc1+ #256 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xef/0x17e print_address_description+0x83/0x3b0 kasan_report+0x18d/0x4d0 bfregn_to_uar_index+0x34f/0x400 create_user_qp+0x272/0x227d create_qp_common+0x32eb/0x43e0 mlx5_ib_create_qp+0x379/0x1ca0 create_qp.isra.5+0xc94/0x22d0 ib_uverbs_create_qp+0x21b/0x2a0 ib_uverbs_write+0xc2c/0x1010 vfs_write+0x1b0/0x550 ksys_write+0xc6/0x1a0 do_syscall_64+0xa7/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x433679 Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 91 fd ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fff2b3d8e48 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000004002f8 RCX: 0000000000433679 RDX: 0000000000000040 RSI: 0000000020000240 RDI: 0000000000000003 RBP: 00000000006d4018 R08: 00000000004002f8 R09: 00000000004002f8 R10: 00000000004002f8 R11: 0000000000000217 R12: 0000000000000000 R13: 000000000040cb00 R14: 000000000040cb90 R15: 0000000000000006 Allocated by task 314: kasan_kmalloc+0xa0/0xd0 __kmalloc+0x1a9/0x510 mlx5_ib_alloc_ucontext+0x966/0x2620 ib_uverbs_get_context+0x23f/0xa60 ib_uverbs_write+0xc2c/0x1010 __vfs_write+0x10d/0x720 vfs_write+0x1b0/0x550 ksys_write+0xc6/0x1a0 do_syscall_64+0xa7/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 1: __kasan_slab_free+0x12e/0x180 kfree+0x159/0x630 kvfree+0x37/0x50 single_release+0x8e/0xf0 __fput+0x2d8/0x900 task_work_run+0x102/0x1f0 exit_to_usermode_loop+0x159/0x1c0 do_syscall_64+0x408/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff880065561100 which belongs to the cache kmalloc-4096 of size 4096 The buggy address is located 2052 bytes inside of 4096-byte region [ffff880065561100, ffff880065562100) The buggy address belongs to the page: page:ffffea0001955800 count:1 mapcount:0 mapping:ffff88006c402480 index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 ffffea0001a7c000 0000000200000002 ffff88006c402480 raw: 0000000000000000 0000000080070007 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880065561800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880065561880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff880065561900: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff880065561980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff880065561a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Cc: <stable(a)vger.kernel.org> # 4.15 Fixes: 1ee47ab3e8d8 ("IB/mlx5: Enable QP creation with a given blue flame index") Reported-by: Noa Osherovich <noaos(a)mellanox.com> Signed-off-by: Leon Romanovsky <leonro(a)mellanox.com> Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com> diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 93087409f4b8..04a5d82c9cf3 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -1330,6 +1330,6 @@ unsigned long mlx5_ib_get_xlt_emergency_page(void); void mlx5_ib_put_xlt_emergency_page(void); int bfregn_to_uar_index(struct mlx5_ib_dev *dev, - struct mlx5_bfreg_info *bfregi, int bfregn, + struct mlx5_bfreg_info *bfregi, u32 bfregn, bool dyn_bfreg); #endif /* MLX5_IB_H */ diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index 51e68ca20215..d4414015b64f 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -631,22 +631,23 @@ static void mlx5_ib_unlock_cqs(struct mlx5_ib_cq *send_cq, struct mlx5_ib_cq *recv_cq); int bfregn_to_uar_index(struct mlx5_ib_dev *dev, - struct mlx5_bfreg_info *bfregi, int bfregn, + struct mlx5_bfreg_info *bfregi, u32 bfregn, bool dyn_bfreg) { - int bfregs_per_sys_page; - int index_of_sys_page; - int offset; + unsigned int bfregs_per_sys_page; + u32 index_of_sys_page; + u32 offset; bfregs_per_sys_page = get_uars_per_sys_page(dev, bfregi->lib_uar_4k) * MLX5_NON_FP_BFREGS_PER_UAR; index_of_sys_page = bfregn / bfregs_per_sys_page; - if (index_of_sys_page >= bfregi->num_sys_pages) - return -EINVAL; - if (dyn_bfreg) { index_of_sys_page += bfregi->num_static_sys_pages; + + if (index_of_sys_page >= bfregi->num_sys_pages) + return -EINVAL; + if (bfregn > bfregi->num_dyn_bfregs || bfregi->sys_pages[index_of_sys_page] == MLX5_IB_INVALID_UAR_INDEX) { mlx5_ib_dbg(dev, "Invalid dynamic uar index\n");

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] ib_srpt: Fix a use-after-free in __srpt_close_all_ch()" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 14d15c2b278011056482eb015dff89f9cbf2b841 Mon Sep 17 00:00:00 2001 From: Bart Van Assche <bart.vanassche(a)wdc.com> Date: Mon, 2 Jul 2018 14:08:45 -0700 Subject: [PATCH] ib_srpt: Fix a use-after-free in __srpt_close_all_ch() BUG: KASAN: use-after-free in srpt_set_enabled+0x1a9/0x1e0 [ib_srpt] Read of size 4 at addr ffff8801269d23f8 by task check/29726 CPU: 4 PID: 29726 Comm: check Not tainted 4.18.0-rc2-dbg+ #4 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xa4/0xf5 print_address_description+0x6f/0x270 kasan_report+0x241/0x360 __asan_load4+0x78/0x80 srpt_set_enabled+0x1a9/0x1e0 [ib_srpt] srpt_tpg_enable_store+0xb8/0x120 [ib_srpt] configfs_write_file+0x14e/0x1d0 [configfs] __vfs_write+0xd2/0x3b0 vfs_write+0x101/0x270 ksys_write+0xab/0x120 __x64_sys_write+0x43/0x50 do_syscall_64+0x77/0x230 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f235cfe6154 Fixes: aaf45bd83eba ("IB/srpt: Detect session shutdown reliably") Signed-off-by: Bart Van Assche <bart.vanassche(a)wdc.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Jason Gunthorpe <jgg(a)mellanox.com> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c index 754da8d30952..e42eec20c631 100644 --- a/drivers/infiniband/ulp/srpt/ib_srpt.c +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c @@ -1940,8 +1940,8 @@ static void __srpt_close_all_ch(struct srpt_port *sport) list_for_each_entry(nexus, &sport->nexus_list, entry) { list_for_each_entry(ch, &nexus->ch_list, list) { if (srpt_disconnect_ch(ch) >= 0) - pr_info("Closing channel %s-%d because target %s_%d has been disabled\n", - ch->sess_name, ch->qp->qp_num, + pr_info("Closing channel %s because target %s_%d has been disabled\n", + ch->sess_name, sport->sdev->device->name, sport->port); srpt_close_ch(ch); }

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] powerpc/pseries: Defer the logging of rtas error to irq work" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 94675cceacaec27a30eefb142c4c59a9d3131742 Mon Sep 17 00:00:00 2001 From: Mahesh Salgaonkar <mahesh(a)linux.vnet.ibm.com> Date: Wed, 4 Jul 2018 23:27:21 +0530 Subject: [PATCH] powerpc/pseries: Defer the logging of rtas error to irq work queue. rtas_log_buf is a buffer to hold RTAS event data that are communicated to kernel by hypervisor. This buffer is then used to pass RTAS event data to user through proc fs. This buffer is allocated from vmalloc (non-linear mapping) area. On Machine check interrupt, register r3 points to RTAS extended event log passed by hypervisor that contains the MCE event. The pseries machine check handler then logs this error into rtas_log_buf. The rtas_log_buf is a vmalloc-ed (non-linear) buffer we end up taking up a page fault (vector 0x300) while accessing it. Since machine check interrupt handler runs in NMI context we can not afford to take any page fault. Page faults are not honored in NMI context and causes kernel panic. Apart from that, as Nick pointed out, pSeries_log_error() also takes a spin_lock while logging error which is not safe in NMI context. It may endup in deadlock if we get another MCE before releasing the lock. Fix this by deferring the logging of rtas error to irq work queue. Current implementation uses two different buffers to hold rtas error log depending on whether extended log is provided or not. This makes bit difficult to identify which buffer has valid data that needs to logged later in irq work. Simplify this using single buffer, one per paca, and copy rtas log to it irrespective of whether extended log is provided or not. Allocate this buffer below RMA region so that it can be accessed in real mode mce handler. Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt") Cc: stable(a)vger.kernel.org # v4.14+ Reviewed-by: Nicholas Piggin <npiggin(a)gmail.com> Signed-off-by: Mahesh Salgaonkar <mahesh(a)linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h index 4e9cede5a7e7..ad4f16164619 100644 --- a/arch/powerpc/include/asm/paca.h +++ b/arch/powerpc/include/asm/paca.h @@ -247,6 +247,9 @@ struct paca_struct { void *rfi_flush_fallback_area; u64 l1d_flush_size; #endif +#ifdef CONFIG_PPC_PSERIES + u8 *mce_data_buf; /* buffer to hold per cpu rtas errlog */ +#endif /* CONFIG_PPC_PSERIES */ } ____cacheline_aligned; extern void copy_mm_to_paca(struct mm_struct *mm); diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index ef104144d4bc..14a46b07ab2f 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -22,6 +22,7 @@ #include <linux/of.h> #include <linux/fs.h> #include <linux/reboot.h> +#include <linux/irq_work.h> #include <asm/machdep.h> #include <asm/rtas.h> @@ -32,11 +33,13 @@ static unsigned char ras_log_buf[RTAS_ERROR_LOG_MAX]; static DEFINE_SPINLOCK(ras_log_buf_lock); -static char global_mce_data_buf[RTAS_ERROR_LOG_MAX]; -static DEFINE_PER_CPU(__u64, mce_data_buf); - static int ras_check_exception_token; +static void mce_process_errlog_event(struct irq_work *work); +static struct irq_work mce_errlog_process_work = { + .func = mce_process_errlog_event, +}; + #define EPOW_SENSOR_TOKEN 9 #define EPOW_SENSOR_INDEX 0 @@ -330,16 +333,20 @@ static irqreturn_t ras_error_interrupt(int irq, void *dev_id) ((((A) >= 0x7000) && ((A) < 0x7ff0)) || \ (((A) >= rtas.base) && ((A) < (rtas.base + rtas.size - 16)))) +static inline struct rtas_error_log *fwnmi_get_errlog(void) +{ + return (struct rtas_error_log *)local_paca->mce_data_buf; +} + /* * Get the error information for errors coming through the * FWNMI vectors. The pt_regs' r3 will be updated to reflect * the actual r3 if possible, and a ptr to the error log entry * will be returned if found. * - * If the RTAS error is not of the extended type, then we put it in a per - * cpu 64bit buffer. If it is the extended type we use global_mce_data_buf. + * Use one buffer mce_data_buf per cpu to store RTAS error. * - * The global_mce_data_buf does not have any locks or protection around it, + * The mce_data_buf does not have any locks or protection around it, * if a second machine check comes in, or a system reset is done * before we have logged the error, then we will get corruption in the * error log. This is preferable over holding off on calling @@ -349,7 +356,7 @@ static irqreturn_t ras_error_interrupt(int irq, void *dev_id) static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs) { unsigned long *savep; - struct rtas_error_log *h, *errhdr = NULL; + struct rtas_error_log *h; /* Mask top two bits */ regs->gpr[3] &= ~(0x3UL << 62); @@ -362,22 +369,20 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs) savep = __va(regs->gpr[3]); regs->gpr[3] = savep[0]; /* restore original r3 */ - /* If it isn't an extended log we can use the per cpu 64bit buffer */ h = (struct rtas_error_log *)&savep[1]; + /* Use the per cpu buffer from paca to store rtas error log */ + memset(local_paca->mce_data_buf, 0, RTAS_ERROR_LOG_MAX); if (!rtas_error_extended(h)) { - memcpy(this_cpu_ptr(&mce_data_buf), h, sizeof(__u64)); - errhdr = (struct rtas_error_log *)this_cpu_ptr(&mce_data_buf); + memcpy(local_paca->mce_data_buf, h, sizeof(__u64)); } else { int len, error_log_length; error_log_length = 8 + rtas_error_extended_log_length(h); len = min_t(int, error_log_length, RTAS_ERROR_LOG_MAX); - memset(global_mce_data_buf, 0, RTAS_ERROR_LOG_MAX); - memcpy(global_mce_data_buf, h, len); - errhdr = (struct rtas_error_log *)global_mce_data_buf; + memcpy(local_paca->mce_data_buf, h, len); } - return errhdr; + return (struct rtas_error_log *)local_paca->mce_data_buf; } /* Call this when done with the data returned by FWNMI_get_errinfo. @@ -422,6 +427,17 @@ int pSeries_system_reset_exception(struct pt_regs *regs) return 0; /* need to perform reset */ } +/* + * Process MCE rtas errlog event. + */ +static void mce_process_errlog_event(struct irq_work *work) +{ + struct rtas_error_log *err; + + err = fwnmi_get_errlog(); + log_error((char *)err, ERR_TYPE_RTAS_LOG, 0); +} + /* * See if we can recover from a machine check exception. * This is only called on power4 (or above) and only via @@ -466,7 +482,8 @@ static int recover_mce(struct pt_regs *regs, struct rtas_error_log *err) recovered = 1; } - log_error((char *)err, ERR_TYPE_RTAS_LOG, 0); + /* Queue irq work to log this rtas event later. */ + irq_work_queue(&mce_errlog_process_work); return recovered; } diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 9948ad16f788..b411a74b861d 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -41,6 +41,7 @@ #include <linux/root_dev.h> #include <linux/of.h> #include <linux/of_pci.h> +#include <linux/memblock.h> #include <asm/mmu.h> #include <asm/processor.h> @@ -102,6 +103,9 @@ static void pSeries_show_cpuinfo(struct seq_file *m) static void __init fwnmi_init(void) { unsigned long system_reset_addr, machine_check_addr; + u8 *mce_data_buf; + unsigned int i; + int nr_cpus = num_possible_cpus(); int ibm_nmi_register = rtas_token("ibm,nmi-register"); if (ibm_nmi_register == RTAS_UNKNOWN_SERVICE) @@ -115,6 +119,18 @@ static void __init fwnmi_init(void) if (0 == rtas_call(ibm_nmi_register, 2, 1, NULL, system_reset_addr, machine_check_addr)) fwnmi_active = 1; + + /* + * Allocate a chunk for per cpu buffer to hold rtas errorlog. + * It will be used in real mode mce handler, hence it needs to be + * below RMA. + */ + mce_data_buf = __va(memblock_alloc_base(RTAS_ERROR_LOG_MAX * nr_cpus, + RTAS_ERROR_LOG_MAX, ppc64_rma_size)); + for_each_possible_cpu(i) { + paca_ptrs[i]->mce_data_buf = mce_data_buf + + (RTAS_ERROR_LOG_MAX * i); + } } static void pseries_8259_cascade(struct irq_desc *desc)

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] powerpc/pseries: Defer the logging of rtas error to irq work" failed to apply to 4.18-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.18-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 94675cceacaec27a30eefb142c4c59a9d3131742 Mon Sep 17 00:00:00 2001 From: Mahesh Salgaonkar <mahesh(a)linux.vnet.ibm.com> Date: Wed, 4 Jul 2018 23:27:21 +0530 Subject: [PATCH] powerpc/pseries: Defer the logging of rtas error to irq work queue. rtas_log_buf is a buffer to hold RTAS event data that are communicated to kernel by hypervisor. This buffer is then used to pass RTAS event data to user through proc fs. This buffer is allocated from vmalloc (non-linear mapping) area. On Machine check interrupt, register r3 points to RTAS extended event log passed by hypervisor that contains the MCE event. The pseries machine check handler then logs this error into rtas_log_buf. The rtas_log_buf is a vmalloc-ed (non-linear) buffer we end up taking up a page fault (vector 0x300) while accessing it. Since machine check interrupt handler runs in NMI context we can not afford to take any page fault. Page faults are not honored in NMI context and causes kernel panic. Apart from that, as Nick pointed out, pSeries_log_error() also takes a spin_lock while logging error which is not safe in NMI context. It may endup in deadlock if we get another MCE before releasing the lock. Fix this by deferring the logging of rtas error to irq work queue. Current implementation uses two different buffers to hold rtas error log depending on whether extended log is provided or not. This makes bit difficult to identify which buffer has valid data that needs to logged later in irq work. Simplify this using single buffer, one per paca, and copy rtas log to it irrespective of whether extended log is provided or not. Allocate this buffer below RMA region so that it can be accessed in real mode mce handler. Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt") Cc: stable(a)vger.kernel.org # v4.14+ Reviewed-by: Nicholas Piggin <npiggin(a)gmail.com> Signed-off-by: Mahesh Salgaonkar <mahesh(a)linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h index 4e9cede5a7e7..ad4f16164619 100644 --- a/arch/powerpc/include/asm/paca.h +++ b/arch/powerpc/include/asm/paca.h @@ -247,6 +247,9 @@ struct paca_struct { void *rfi_flush_fallback_area; u64 l1d_flush_size; #endif +#ifdef CONFIG_PPC_PSERIES + u8 *mce_data_buf; /* buffer to hold per cpu rtas errlog */ +#endif /* CONFIG_PPC_PSERIES */ } ____cacheline_aligned; extern void copy_mm_to_paca(struct mm_struct *mm); diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index ef104144d4bc..14a46b07ab2f 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -22,6 +22,7 @@ #include <linux/of.h> #include <linux/fs.h> #include <linux/reboot.h> +#include <linux/irq_work.h> #include <asm/machdep.h> #include <asm/rtas.h> @@ -32,11 +33,13 @@ static unsigned char ras_log_buf[RTAS_ERROR_LOG_MAX]; static DEFINE_SPINLOCK(ras_log_buf_lock); -static char global_mce_data_buf[RTAS_ERROR_LOG_MAX]; -static DEFINE_PER_CPU(__u64, mce_data_buf); - static int ras_check_exception_token; +static void mce_process_errlog_event(struct irq_work *work); +static struct irq_work mce_errlog_process_work = { + .func = mce_process_errlog_event, +}; + #define EPOW_SENSOR_TOKEN 9 #define EPOW_SENSOR_INDEX 0 @@ -330,16 +333,20 @@ static irqreturn_t ras_error_interrupt(int irq, void *dev_id) ((((A) >= 0x7000) && ((A) < 0x7ff0)) || \ (((A) >= rtas.base) && ((A) < (rtas.base + rtas.size - 16)))) +static inline struct rtas_error_log *fwnmi_get_errlog(void) +{ + return (struct rtas_error_log *)local_paca->mce_data_buf; +} + /* * Get the error information for errors coming through the * FWNMI vectors. The pt_regs' r3 will be updated to reflect * the actual r3 if possible, and a ptr to the error log entry * will be returned if found. * - * If the RTAS error is not of the extended type, then we put it in a per - * cpu 64bit buffer. If it is the extended type we use global_mce_data_buf. + * Use one buffer mce_data_buf per cpu to store RTAS error. * - * The global_mce_data_buf does not have any locks or protection around it, + * The mce_data_buf does not have any locks or protection around it, * if a second machine check comes in, or a system reset is done * before we have logged the error, then we will get corruption in the * error log. This is preferable over holding off on calling @@ -349,7 +356,7 @@ static irqreturn_t ras_error_interrupt(int irq, void *dev_id) static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs) { unsigned long *savep; - struct rtas_error_log *h, *errhdr = NULL; + struct rtas_error_log *h; /* Mask top two bits */ regs->gpr[3] &= ~(0x3UL << 62); @@ -362,22 +369,20 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs) savep = __va(regs->gpr[3]); regs->gpr[3] = savep[0]; /* restore original r3 */ - /* If it isn't an extended log we can use the per cpu 64bit buffer */ h = (struct rtas_error_log *)&savep[1]; + /* Use the per cpu buffer from paca to store rtas error log */ + memset(local_paca->mce_data_buf, 0, RTAS_ERROR_LOG_MAX); if (!rtas_error_extended(h)) { - memcpy(this_cpu_ptr(&mce_data_buf), h, sizeof(__u64)); - errhdr = (struct rtas_error_log *)this_cpu_ptr(&mce_data_buf); + memcpy(local_paca->mce_data_buf, h, sizeof(__u64)); } else { int len, error_log_length; error_log_length = 8 + rtas_error_extended_log_length(h); len = min_t(int, error_log_length, RTAS_ERROR_LOG_MAX); - memset(global_mce_data_buf, 0, RTAS_ERROR_LOG_MAX); - memcpy(global_mce_data_buf, h, len); - errhdr = (struct rtas_error_log *)global_mce_data_buf; + memcpy(local_paca->mce_data_buf, h, len); } - return errhdr; + return (struct rtas_error_log *)local_paca->mce_data_buf; } /* Call this when done with the data returned by FWNMI_get_errinfo. @@ -422,6 +427,17 @@ int pSeries_system_reset_exception(struct pt_regs *regs) return 0; /* need to perform reset */ } +/* + * Process MCE rtas errlog event. + */ +static void mce_process_errlog_event(struct irq_work *work) +{ + struct rtas_error_log *err; + + err = fwnmi_get_errlog(); + log_error((char *)err, ERR_TYPE_RTAS_LOG, 0); +} + /* * See if we can recover from a machine check exception. * This is only called on power4 (or above) and only via @@ -466,7 +482,8 @@ static int recover_mce(struct pt_regs *regs, struct rtas_error_log *err) recovered = 1; } - log_error((char *)err, ERR_TYPE_RTAS_LOG, 0); + /* Queue irq work to log this rtas event later. */ + irq_work_queue(&mce_errlog_process_work); return recovered; } diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 9948ad16f788..b411a74b861d 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -41,6 +41,7 @@ #include <linux/root_dev.h> #include <linux/of.h> #include <linux/of_pci.h> +#include <linux/memblock.h> #include <asm/mmu.h> #include <asm/processor.h> @@ -102,6 +103,9 @@ static void pSeries_show_cpuinfo(struct seq_file *m) static void __init fwnmi_init(void) { unsigned long system_reset_addr, machine_check_addr; + u8 *mce_data_buf; + unsigned int i; + int nr_cpus = num_possible_cpus(); int ibm_nmi_register = rtas_token("ibm,nmi-register"); if (ibm_nmi_register == RTAS_UNKNOWN_SERVICE) @@ -115,6 +119,18 @@ static void __init fwnmi_init(void) if (0 == rtas_call(ibm_nmi_register, 2, 1, NULL, system_reset_addr, machine_check_addr)) fwnmi_active = 1; + + /* + * Allocate a chunk for per cpu buffer to hold rtas errorlog. + * It will be used in real mode mce handler, hence it needs to be + * below RMA. + */ + mce_data_buf = __va(memblock_alloc_base(RTAS_ERROR_LOG_MAX * nr_cpus, + RTAS_ERROR_LOG_MAX, ppc64_rma_size)); + for_each_possible_cpu(i) { + paca_ptrs[i]->mce_data_buf = mce_data_buf + + (RTAS_ERROR_LOG_MAX * i); + } } static void pseries_8259_cascade(struct irq_desc *desc)

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] powerpc/64s: Fix page table fragment refcount race vs" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 4231aba000f5a4583dd9f67057aadb68c3eca99d Mon Sep 17 00:00:00 2001 From: Nicholas Piggin <npiggin(a)gmail.com> Date: Fri, 27 Jul 2018 21:48:17 +1000 Subject: [PATCH] powerpc/64s: Fix page table fragment refcount race vs speculative references The page table fragment allocator uses the main page refcount racily with respect to speculative references. A customer observed a BUG due to page table page refcount underflow in the fragment allocator. This can be caused by the fragment allocator set_page_count stomping on a speculative reference, and then the speculative failure handler decrements the new reference, and the underflow eventually pops when the page tables are freed. Fix this by using a dedicated field in the struct page for the page table fragment allocator. Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage") Cc: stable(a)vger.kernel.org # v3.10+ Reviewed-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com> Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au> diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c index 8b24168ea8c4..4a892d894a0f 100644 --- a/arch/powerpc/mm/mmu_context_book3s64.c +++ b/arch/powerpc/mm/mmu_context_book3s64.c @@ -200,9 +200,9 @@ static void pte_frag_destroy(void *pte_frag) /* drop all the pending references */ count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT; /* We allow PTE_FRAG_NR fragments from a PTE page */ - if (page_ref_sub_and_test(page, PTE_FRAG_NR - count)) { + if (atomic_sub_and_test(PTE_FRAG_NR - count, &page->pt_frag_refcount)) { pgtable_page_dtor(page); - free_unref_page(page); + __free_page(page); } } @@ -215,9 +215,9 @@ static void pmd_frag_destroy(void *pmd_frag) /* drop all the pending references */ count = ((unsigned long)pmd_frag & ~PAGE_MASK) >> PMD_FRAG_SIZE_SHIFT; /* We allow PTE_FRAG_NR fragments from a PTE page */ - if (page_ref_sub_and_test(page, PMD_FRAG_NR - count)) { + if (atomic_sub_and_test(PMD_FRAG_NR - count, &page->pt_frag_refcount)) { pgtable_pmd_page_dtor(page); - free_unref_page(page); + __free_page(page); } } diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c index 4afbfbb64bfd..78d0b3d5ebad 100644 --- a/arch/powerpc/mm/pgtable-book3s64.c +++ b/arch/powerpc/mm/pgtable-book3s64.c @@ -270,6 +270,8 @@ static pmd_t *__alloc_for_pmdcache(struct mm_struct *mm) return NULL; } + atomic_set(&page->pt_frag_refcount, 1); + ret = page_address(page); /* * if we support only one fragment just return the @@ -285,7 +287,7 @@ static pmd_t *__alloc_for_pmdcache(struct mm_struct *mm) * count. */ if (likely(!mm->context.pmd_frag)) { - set_page_count(page, PMD_FRAG_NR); + atomic_set(&page->pt_frag_refcount, PMD_FRAG_NR); mm->context.pmd_frag = ret + PMD_FRAG_SIZE; } spin_unlock(&mm->page_table_lock); @@ -308,9 +310,10 @@ void pmd_fragment_free(unsigned long *pmd) { struct page *page = virt_to_page(pmd); - if (put_page_testzero(page)) { + BUG_ON(atomic_read(&page->pt_frag_refcount) <= 0); + if (atomic_dec_and_test(&page->pt_frag_refcount)) { pgtable_pmd_page_dtor(page); - free_unref_page(page); + __free_page(page); } } @@ -352,6 +355,7 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel) return NULL; } + atomic_set(&page->pt_frag_refcount, 1); ret = page_address(page); /* @@ -367,7 +371,7 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel) * count. */ if (likely(!mm->context.pte_frag)) { - set_page_count(page, PTE_FRAG_NR); + atomic_set(&page->pt_frag_refcount, PTE_FRAG_NR); mm->context.pte_frag = ret + PTE_FRAG_SIZE; } spin_unlock(&mm->page_table_lock); @@ -390,10 +394,11 @@ void pte_fragment_free(unsigned long *table, int kernel) { struct page *page = virt_to_page(table); - if (put_page_testzero(page)) { + BUG_ON(atomic_read(&page->pt_frag_refcount) <= 0); + if (atomic_dec_and_test(&page->pt_frag_refcount)) { if (!kernel) pgtable_page_dtor(page); - free_unref_page(page); + __free_page(page); } } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 99ce070e7dcb..22651e124071 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -139,7 +139,10 @@ struct page { unsigned long _pt_pad_1; /* compound_head */ pgtable_t pmd_huge_pte; /* protected by page->ptl */ unsigned long _pt_pad_2; /* mapping */ - struct mm_struct *pt_mm; /* x86 pgds only */ + union { + struct mm_struct *pt_mm; /* x86 pgds only */ + atomic_t pt_frag_refcount; /* powerpc */ + }; #if ALLOC_SPLIT_PTLOCKS spinlock_t *ptl; #else

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] powerpc/64s: Fix page table fragment refcount race vs" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 4231aba000f5a4583dd9f67057aadb68c3eca99d Mon Sep 17 00:00:00 2001 From: Nicholas Piggin <npiggin(a)gmail.com> Date: Fri, 27 Jul 2018 21:48:17 +1000 Subject: [PATCH] powerpc/64s: Fix page table fragment refcount race vs speculative references The page table fragment allocator uses the main page refcount racily with respect to speculative references. A customer observed a BUG due to page table page refcount underflow in the fragment allocator. This can be caused by the fragment allocator set_page_count stomping on a speculative reference, and then the speculative failure handler decrements the new reference, and the underflow eventually pops when the page tables are freed. Fix this by using a dedicated field in the struct page for the page table fragment allocator. Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage") Cc: stable(a)vger.kernel.org # v3.10+ Reviewed-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com> Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au> diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c index 8b24168ea8c4..4a892d894a0f 100644 --- a/arch/powerpc/mm/mmu_context_book3s64.c +++ b/arch/powerpc/mm/mmu_context_book3s64.c @@ -200,9 +200,9 @@ static void pte_frag_destroy(void *pte_frag) /* drop all the pending references */ count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT; /* We allow PTE_FRAG_NR fragments from a PTE page */ - if (page_ref_sub_and_test(page, PTE_FRAG_NR - count)) { + if (atomic_sub_and_test(PTE_FRAG_NR - count, &page->pt_frag_refcount)) { pgtable_page_dtor(page); - free_unref_page(page); + __free_page(page); } } @@ -215,9 +215,9 @@ static void pmd_frag_destroy(void *pmd_frag) /* drop all the pending references */ count = ((unsigned long)pmd_frag & ~PAGE_MASK) >> PMD_FRAG_SIZE_SHIFT; /* We allow PTE_FRAG_NR fragments from a PTE page */ - if (page_ref_sub_and_test(page, PMD_FRAG_NR - count)) { + if (atomic_sub_and_test(PMD_FRAG_NR - count, &page->pt_frag_refcount)) { pgtable_pmd_page_dtor(page); - free_unref_page(page); + __free_page(page); } } diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c index 4afbfbb64bfd..78d0b3d5ebad 100644 --- a/arch/powerpc/mm/pgtable-book3s64.c +++ b/arch/powerpc/mm/pgtable-book3s64.c @@ -270,6 +270,8 @@ static pmd_t *__alloc_for_pmdcache(struct mm_struct *mm) return NULL; } + atomic_set(&page->pt_frag_refcount, 1); + ret = page_address(page); /* * if we support only one fragment just return the @@ -285,7 +287,7 @@ static pmd_t *__alloc_for_pmdcache(struct mm_struct *mm) * count. */ if (likely(!mm->context.pmd_frag)) { - set_page_count(page, PMD_FRAG_NR); + atomic_set(&page->pt_frag_refcount, PMD_FRAG_NR); mm->context.pmd_frag = ret + PMD_FRAG_SIZE; } spin_unlock(&mm->page_table_lock); @@ -308,9 +310,10 @@ void pmd_fragment_free(unsigned long *pmd) { struct page *page = virt_to_page(pmd); - if (put_page_testzero(page)) { + BUG_ON(atomic_read(&page->pt_frag_refcount) <= 0); + if (atomic_dec_and_test(&page->pt_frag_refcount)) { pgtable_pmd_page_dtor(page); - free_unref_page(page); + __free_page(page); } } @@ -352,6 +355,7 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel) return NULL; } + atomic_set(&page->pt_frag_refcount, 1); ret = page_address(page); /* @@ -367,7 +371,7 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel) * count. */ if (likely(!mm->context.pte_frag)) { - set_page_count(page, PTE_FRAG_NR); + atomic_set(&page->pt_frag_refcount, PTE_FRAG_NR); mm->context.pte_frag = ret + PTE_FRAG_SIZE; } spin_unlock(&mm->page_table_lock); @@ -390,10 +394,11 @@ void pte_fragment_free(unsigned long *table, int kernel) { struct page *page = virt_to_page(table); - if (put_page_testzero(page)) { + BUG_ON(atomic_read(&page->pt_frag_refcount) <= 0); + if (atomic_dec_and_test(&page->pt_frag_refcount)) { if (!kernel) pgtable_page_dtor(page); - free_unref_page(page); + __free_page(page); } } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 99ce070e7dcb..22651e124071 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -139,7 +139,10 @@ struct page { unsigned long _pt_pad_1; /* compound_head */ pgtable_t pmd_huge_pte; /* protected by page->ptl */ unsigned long _pt_pad_2; /* mapping */ - struct mm_struct *pt_mm; /* x86 pgds only */ + union { + struct mm_struct *pt_mm; /* x86 pgds only */ + atomic_t pt_frag_refcount; /* powerpc */ + }; #if ALLOC_SPLIT_PTLOCKS spinlock_t *ptl; #else

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] powerpc/64s: Fix page table fragment refcount race vs" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 4231aba000f5a4583dd9f67057aadb68c3eca99d Mon Sep 17 00:00:00 2001 From: Nicholas Piggin <npiggin(a)gmail.com> Date: Fri, 27 Jul 2018 21:48:17 +1000 Subject: [PATCH] powerpc/64s: Fix page table fragment refcount race vs speculative references The page table fragment allocator uses the main page refcount racily with respect to speculative references. A customer observed a BUG due to page table page refcount underflow in the fragment allocator. This can be caused by the fragment allocator set_page_count stomping on a speculative reference, and then the speculative failure handler decrements the new reference, and the underflow eventually pops when the page tables are freed. Fix this by using a dedicated field in the struct page for the page table fragment allocator. Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage") Cc: stable(a)vger.kernel.org # v3.10+ Reviewed-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com> Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au> diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c index 8b24168ea8c4..4a892d894a0f 100644 --- a/arch/powerpc/mm/mmu_context_book3s64.c +++ b/arch/powerpc/mm/mmu_context_book3s64.c @@ -200,9 +200,9 @@ static void pte_frag_destroy(void *pte_frag) /* drop all the pending references */ count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT; /* We allow PTE_FRAG_NR fragments from a PTE page */ - if (page_ref_sub_and_test(page, PTE_FRAG_NR - count)) { + if (atomic_sub_and_test(PTE_FRAG_NR - count, &page->pt_frag_refcount)) { pgtable_page_dtor(page); - free_unref_page(page); + __free_page(page); } } @@ -215,9 +215,9 @@ static void pmd_frag_destroy(void *pmd_frag) /* drop all the pending references */ count = ((unsigned long)pmd_frag & ~PAGE_MASK) >> PMD_FRAG_SIZE_SHIFT; /* We allow PTE_FRAG_NR fragments from a PTE page */ - if (page_ref_sub_and_test(page, PMD_FRAG_NR - count)) { + if (atomic_sub_and_test(PMD_FRAG_NR - count, &page->pt_frag_refcount)) { pgtable_pmd_page_dtor(page); - free_unref_page(page); + __free_page(page); } } diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c index 4afbfbb64bfd..78d0b3d5ebad 100644 --- a/arch/powerpc/mm/pgtable-book3s64.c +++ b/arch/powerpc/mm/pgtable-book3s64.c @@ -270,6 +270,8 @@ static pmd_t *__alloc_for_pmdcache(struct mm_struct *mm) return NULL; } + atomic_set(&page->pt_frag_refcount, 1); + ret = page_address(page); /* * if we support only one fragment just return the @@ -285,7 +287,7 @@ static pmd_t *__alloc_for_pmdcache(struct mm_struct *mm) * count. */ if (likely(!mm->context.pmd_frag)) { - set_page_count(page, PMD_FRAG_NR); + atomic_set(&page->pt_frag_refcount, PMD_FRAG_NR); mm->context.pmd_frag = ret + PMD_FRAG_SIZE; } spin_unlock(&mm->page_table_lock); @@ -308,9 +310,10 @@ void pmd_fragment_free(unsigned long *pmd) { struct page *page = virt_to_page(pmd); - if (put_page_testzero(page)) { + BUG_ON(atomic_read(&page->pt_frag_refcount) <= 0); + if (atomic_dec_and_test(&page->pt_frag_refcount)) { pgtable_pmd_page_dtor(page); - free_unref_page(page); + __free_page(page); } } @@ -352,6 +355,7 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel) return NULL; } + atomic_set(&page->pt_frag_refcount, 1); ret = page_address(page); /* @@ -367,7 +371,7 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel) * count. */ if (likely(!mm->context.pte_frag)) { - set_page_count(page, PTE_FRAG_NR); + atomic_set(&page->pt_frag_refcount, PTE_FRAG_NR); mm->context.pte_frag = ret + PTE_FRAG_SIZE; } spin_unlock(&mm->page_table_lock); @@ -390,10 +394,11 @@ void pte_fragment_free(unsigned long *table, int kernel) { struct page *page = virt_to_page(table); - if (put_page_testzero(page)) { + BUG_ON(atomic_read(&page->pt_frag_refcount) <= 0); + if (atomic_dec_and_test(&page->pt_frag_refcount)) { if (!kernel) pgtable_page_dtor(page); - free_unref_page(page); + __free_page(page); } } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 99ce070e7dcb..22651e124071 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -139,7 +139,10 @@ struct page { unsigned long _pt_pad_1; /* compound_head */ pgtable_t pmd_huge_pte; /* protected by page->ptl */ unsigned long _pt_pad_2; /* mapping */ - struct mm_struct *pt_mm; /* x86 pgds only */ + union { + struct mm_struct *pt_mm; /* x86 pgds only */ + atomic_t pt_frag_refcount; /* powerpc */ + }; #if ALLOC_SPLIT_PTLOCKS spinlock_t *ptl; #else

7 years, 3 months

1
0
0 0

[PATCH for 4.4 1/1] x86/mm/pat: Fix L1TF stable backport for CPA, 2nd call

by Jiri Slaby

From: Andi Kleen <ak(a)linux.intel.com> Mostly recycling the commit log from adaba23ccd7d which fixed populate_pmd, but did not fix populate_pud. The same problem exists there. Stable trees reverted the following patch: Revert "x86/mm/pat: Ensure cpa->pfn only contains page frame numbers" This reverts commit 87e2bd898d3a79a8c609f183180adac47879a2a4 which is commit edc3b9129cecd0f0857112136f5b8b1bc1d45918 upstream. but the L1TF patch 02ff2769edbc backported here x86/mm/pat: Make set_memory_np() L1TF safe commit 958f79b9ee55dfaf00c8106ed1c22a2919e0028b upstream set_memory_np() is used to mark kernel mappings not present, but it has it's own open coded mechanism which does not have the L1TF protection of inverting the address bits. assumed that cpa->pfn contains a PFN. With the above patch reverted it does not, which causes the PUD to be set to an incorrect address shifted by 12 bits, which can cause various failures. Convert the address to a PFN before passing it to pud_pfn(). This is a 4.4 stable only patch to fix the L1TF patches backport there. Cc: stable(a)vger.kernel.org # 4.4-only Cc: Andi Kleen <ak(a)linux.intel.com> Signed-off-by: Jiri Slaby <jslaby(a)suse.cz> --- arch/x86/mm/pageattr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c index 1007fa80f5a6..0e1dd7d47f05 100644 --- a/arch/x86/mm/pageattr.c +++ b/arch/x86/mm/pageattr.c @@ -1079,7 +1079,7 @@ static int populate_pud(struct cpa_data *cpa, unsigned long start, pgd_t *pgd, * Map everything starting from the Gb boundary, possibly with 1G pages */ while (end - start >= PUD_SIZE) { - set_pud(pud, pud_mkhuge(pfn_pud(cpa->pfn, + set_pud(pud, pud_mkhuge(pfn_pud(cpa->pfn >> PAGE_SHIFT, canon_pgprot(pud_pgprot)))); start += PUD_SIZE; -- 2.18.0

7 years, 3 months

1
1
0 0

FAILED: patch "[PATCH] libertas: fix suspend and resume for SDIO connected cards" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 7444a8092906ed44c09459780c56ba57043e39b1 Mon Sep 17 00:00:00 2001 From: Daniel Mack <daniel(a)zonque.org> Date: Wed, 27 Jun 2018 20:58:45 +0200 Subject: [PATCH] libertas: fix suspend and resume for SDIO connected cards Prior to commit 573185cc7e64 ("mmc: core: Invoke sdio func driver's PM callbacks from the sdio bus"), the MMC core used to call into the power management functions of SDIO clients itself and removed the card if the return code was non-zero. IOW, the mmc handled errors gracefully and didn't upchain them to the pm core. Since this change, the mmc core relies on generic power management functions which treat all errors as a reason to cancel the suspend immediately. This causes suspend attempts to fail when the libertas driver is loaded. To fix this, power down the card explicitly in if_sdio_suspend() when we know we're about to lose power and return success. Also set a flag in these cases, and power up the card again in if_sdio_resume(). Fixes: 573185cc7e64 ("mmc: core: Invoke sdio func driver's PM callbacks from the sdio bus") Cc: <stable(a)vger.kernel.org> Signed-off-by: Daniel Mack <daniel(a)zonque.org> Reviewed-by: Chris Ball <chris(a)printf.net> Reviewed-by: Ulf Hansson <ulf.hansson(a)linaro.org> Signed-off-by: Kalle Valo <kvalo(a)codeaurora.org> diff --git a/drivers/net/wireless/marvell/libertas/dev.h b/drivers/net/wireless/marvell/libertas/dev.h index dd1ee1f0af48..469134930026 100644 --- a/drivers/net/wireless/marvell/libertas/dev.h +++ b/drivers/net/wireless/marvell/libertas/dev.h @@ -104,6 +104,7 @@ struct lbs_private { u8 fw_ready; u8 surpriseremoved; u8 setup_fw_on_resume; + u8 power_up_on_resume; int (*hw_host_to_card) (struct lbs_private *priv, u8 type, u8 *payload, u16 nb); void (*reset_card) (struct lbs_private *priv); int (*power_save) (struct lbs_private *priv); diff --git a/drivers/net/wireless/marvell/libertas/if_sdio.c b/drivers/net/wireless/marvell/libertas/if_sdio.c index 2300e796c6ab..43743c26c071 100644 --- a/drivers/net/wireless/marvell/libertas/if_sdio.c +++ b/drivers/net/wireless/marvell/libertas/if_sdio.c @@ -1290,15 +1290,23 @@ static void if_sdio_remove(struct sdio_func *func) static int if_sdio_suspend(struct device *dev) { struct sdio_func *func = dev_to_sdio_func(dev); - int ret; struct if_sdio_card *card = sdio_get_drvdata(func); + struct lbs_private *priv = card->priv; + int ret; mmc_pm_flag_t flags = sdio_get_host_pm_caps(func); + priv->power_up_on_resume = false; /* If we're powered off anyway, just let the mmc layer remove the * card. */ - if (!lbs_iface_active(card->priv)) - return -ENOSYS; + if (!lbs_iface_active(priv)) { + if (priv->fw_ready) { + priv->power_up_on_resume = true; + if_sdio_power_off(card); + } + + return 0; + } dev_info(dev, "%s: suspend: PM flags = 0x%x\n", sdio_func_id(func), flags); @@ -1306,9 +1314,14 @@ static int if_sdio_suspend(struct device *dev) /* If we aren't being asked to wake on anything, we should bail out * and let the SD stack power down the card. */ - if (card->priv->wol_criteria == EHS_REMOVE_WAKEUP) { + if (priv->wol_criteria == EHS_REMOVE_WAKEUP) { dev_info(dev, "Suspend without wake params -- powering down card\n"); - return -ENOSYS; + if (priv->fw_ready) { + priv->power_up_on_resume = true; + if_sdio_power_off(card); + } + + return 0; } if (!(flags & MMC_PM_KEEP_POWER)) { @@ -1321,7 +1334,7 @@ static int if_sdio_suspend(struct device *dev) if (ret) return ret; - ret = lbs_suspend(card->priv); + ret = lbs_suspend(priv); if (ret) return ret; @@ -1336,6 +1349,11 @@ static int if_sdio_resume(struct device *dev) dev_info(dev, "%s: resume: we're back\n", sdio_func_id(func)); + if (card->priv->power_up_on_resume) { + if_sdio_power_on(card); + wait_event(card->pwron_waitq, card->priv->fw_ready); + } + ret = lbs_resume(card->priv); return ret;

7 years, 3 months

1
0
0 0

[PATCH] drm/i915/userptr: reject zero user_size

by Loic

Hello, Tested without any problem so please picked up this. From: Matthew Auld [ Upstream commit c11c7bfd213495784b22ef82a69b6489f8d0092f ] Operating on a zero sized GEM userptr object will lead to explosions. Fixes: 5cc9ed4b9a7a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl") Testcase: igt/gem_userptr_blits/input-checking Signed-off-by: Matthew Auld <matthew.auld(a)intel.com> Cc: Chris Wilson <chris(a)chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris(a)chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180502195021.30900-1-matthe… --- drivers/gpu/drm/i915/i915_gem_userptr.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index d596a8302ca3c..854bd51b9478a 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -778,6 +778,9 @@ i915_gem_userptr_ioctl(struct drm_device *dev, I915_USERPTR_UNSYNCHRONIZED)) return -EINVAL; + if (!args->user_size) + return -EINVAL; + if (offset_in_page(args->user_ptr | args->user_size)) return -EINVAL; -- 2.17.1

7 years, 3 months

2
3
0 0

[PATCH v2] usb: xhci: fix interrupt transfer error happened on MTK platforms

by Chunfeng Yun

The MTK xHCI controller use some reserved bytes in endpoint context for bandwidth scheduling, so need keep them in xhci_endpoint_copy(); The issue is introduced by: commit f5249461b504 ("xhci: Clear the host side toggle manually when endpoint is soft reset") It resets endpoints and will drop bandwidth scheduling parameters used by interrupt or isochronous endpoints on MTK xHCI controller. Fixes: f5249461b504 ("xhci: Clear the host side toggle manually when endpoint is soft reset") Cc: stable(a)vger.kernel.org Signed-off-by: Chunfeng Yun <chunfeng.yun(a)mediatek.com> Tested-by: Sean Wang <sean.wang(a)mediatek.com> --- v2: add fix tag, Cc and Tested-by --- drivers/usb/host/xhci-mem.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c index ef350c3..b1f27aa 100644 --- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -1613,6 +1613,10 @@ void xhci_endpoint_copy(struct xhci_hcd *xhci, in_ep_ctx->ep_info2 = out_ep_ctx->ep_info2; in_ep_ctx->deq = out_ep_ctx->deq; in_ep_ctx->tx_info = out_ep_ctx->tx_info; + if (xhci->quirks & XHCI_MTK_HOST) { + in_ep_ctx->reserved[0] = out_ep_ctx->reserved[0]; + in_ep_ctx->reserved[1] = out_ep_ctx->reserved[1]; + } } /* Copy output xhci_slot_ctx to the input xhci_slot_ctx. -- 1.9.1

7 years, 3 months

2
3
0 0

FAILED: patch "[PATCH] bcache: set max writeback rate when I/O request is idle" failed to apply to 4.18-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.18-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From ea8c5356d39048bc94bae068228f51ddbecc6b89 Mon Sep 17 00:00:00 2001 From: Coly Li <colyli(a)suse.de> Date: Thu, 9 Aug 2018 15:48:49 +0800 Subject: [PATCH] bcache: set max writeback rate when I/O request is idle Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle") allows the writeback rate to be faster if there is no I/O request on a bcache device. It works well if there is only one bcache device attached to the cache set. If there are many bcache devices attached to a cache set, it may introduce performance regression because multiple faster writeback threads of the idle bcache devices will compete the btree level locks with the bcache device who have I/O requests coming. This patch fixes the above issue by only permitting fast writebac when all bcache devices attached on the cache set are idle. And if one of the bcache devices has new I/O request coming, minimized all writeback throughput immediately and let PI controller __update_writeback_rate() to decide the upcoming writeback rate for each bcache device. Also when all bcache devices are idle, limited wrieback rate to a small number is wast of thoughput, especially when backing devices are slower non-rotation devices (e.g. SATA SSD). This patch sets a max writeback rate for each backing device if the whole cache set is idle. A faster writeback rate in idle time means new I/Os may have more available space for dirty data, and people may observe a better write performance then. Please note bcache may change its cache mode in run time, and this patch still works if the cache mode is switched from writeback mode and there is still dirty data on cache. Fixes: Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle") Cc: stable(a)vger.kernel.org #4.16+ Signed-off-by: Coly Li <colyli(a)suse.de> Tested-by: Kai Krakow <kai(a)kaishome.de> Tested-by: Stefan Priebe <s.priebe(a)profihost.ag> Cc: Michael Lyle <mlyle(a)lyle.org> Signed-off-by: Jens Axboe <axboe(a)kernel.dk> diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h index b393b3fd06b6..05f82ff6f016 100644 --- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -328,13 +328,6 @@ struct cached_dev { */ atomic_t has_dirty; - /* - * Set to zero by things that touch the backing volume-- except - * writeback. Incremented by writeback. Used to determine when to - * accelerate idle writeback. - */ - atomic_t backing_idle; - struct bch_ratelimit writeback_rate; struct delayed_work writeback_rate_update; @@ -515,6 +508,8 @@ struct cache_set { struct cache_accounting accounting; unsigned long flags; + atomic_t idle_counter; + atomic_t at_max_writeback_rate; struct cache_sb sb; @@ -524,6 +519,7 @@ struct cache_set { struct bcache_device **devices; unsigned devices_max_used; + atomic_t attached_dev_nr; struct list_head cached_devs; uint64_t cached_dev_sectors; atomic_long_t flash_dev_dirty_sectors; diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c index 914d501ad1e0..7dbe8b6316a0 100644 --- a/drivers/md/bcache/request.c +++ b/drivers/md/bcache/request.c @@ -1103,6 +1103,44 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio) generic_make_request(bio); } +static void quit_max_writeback_rate(struct cache_set *c, + struct cached_dev *this_dc) +{ + int i; + struct bcache_device *d; + struct cached_dev *dc; + + /* + * mutex bch_register_lock may compete with other parallel requesters, + * or attach/detach operations on other backing device. Waiting to + * the mutex lock may increase I/O request latency for seconds or more. + * To avoid such situation, if mutext_trylock() failed, only writeback + * rate of current cached device is set to 1, and __update_write_back() + * will decide writeback rate of other cached devices (remember now + * c->idle_counter is 0 already). + */ + if (mutex_trylock(&bch_register_lock)) { + for (i = 0; i < c->devices_max_used; i++) { + if (!c->devices[i]) + continue; + + if (UUID_FLASH_ONLY(&c->uuids[i])) + continue; + + d = c->devices[i]; + dc = container_of(d, struct cached_dev, disk); + /* + * set writeback rate to default minimum value, + * then let update_writeback_rate() to decide the + * upcoming rate. + */ + atomic_long_set(&dc->writeback_rate.rate, 1); + } + mutex_unlock(&bch_register_lock); + } else + atomic_long_set(&this_dc->writeback_rate.rate, 1); +} + /* Cached devices - read & write stuff */ static blk_qc_t cached_dev_make_request(struct request_queue *q, @@ -1120,8 +1158,25 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q, return BLK_QC_T_NONE; } - atomic_set(&dc->backing_idle, 0); - generic_start_io_acct(q, bio_op(bio), bio_sectors(bio), &d->disk->part0); + if (likely(d->c)) { + if (atomic_read(&d->c->idle_counter)) + atomic_set(&d->c->idle_counter, 0); + /* + * If at_max_writeback_rate of cache set is true and new I/O + * comes, quit max writeback rate of all cached devices + * attached to this cache set, and set at_max_writeback_rate + * to false. + */ + if (unlikely(atomic_read(&d->c->at_max_writeback_rate) == 1)) { + atomic_set(&d->c->at_max_writeback_rate, 0); + quit_max_writeback_rate(d->c, dc); + } + } + + generic_start_io_acct(q, + bio_op(bio), + bio_sectors(bio), + &d->disk->part0); bio_set_dev(bio, dc->bdev); bio->bi_iter.bi_sector += dc->sb.data_offset; diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 1e85cbb4c159..55a37641aa95 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -696,6 +696,8 @@ static void bcache_device_detach(struct bcache_device *d) { lockdep_assert_held(&bch_register_lock); + atomic_dec(&d->c->attached_dev_nr); + if (test_bit(BCACHE_DEV_DETACHING, &d->flags)) { struct uuid_entry *u = d->c->uuids + d->id; @@ -1144,6 +1146,7 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c, bch_cached_dev_run(dc); bcache_device_link(&dc->disk, c, "bdev"); + atomic_inc(&c->attached_dev_nr); /* Allow the writeback thread to proceed */ up_write(&dc->writeback_lock); @@ -1696,6 +1699,7 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb) c->block_bits = ilog2(sb->block_size); c->nr_uuids = bucket_bytes(c) / sizeof(struct uuid_entry); c->devices_max_used = 0; + atomic_set(&c->attached_dev_nr, 0); c->btree_pages = bucket_pages(c); if (c->btree_pages > BTREE_MAX_PAGES) c->btree_pages = max_t(int, c->btree_pages / 4, diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index 3e9d3459a224..6e88142514fb 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -171,7 +171,8 @@ SHOW(__bch_cached_dev) var_printf(writeback_running, "%i"); var_print(writeback_delay); var_print(writeback_percent); - sysfs_hprint(writeback_rate, wb ? dc->writeback_rate.rate << 9 : 0); + sysfs_hprint(writeback_rate, + wb ? atomic_long_read(&dc->writeback_rate.rate) << 9 : 0); sysfs_hprint(io_errors, atomic_read(&dc->io_errors)); sysfs_printf(io_error_limit, "%i", dc->error_limit); sysfs_printf(io_disable, "%i", dc->io_disable); @@ -193,7 +194,9 @@ SHOW(__bch_cached_dev) * Except for dirty and target, other values should * be 0 if writeback is not running. */ - bch_hprint(rate, wb ? dc->writeback_rate.rate << 9 : 0); + bch_hprint(rate, + wb ? atomic_long_read(&dc->writeback_rate.rate) << 9 + : 0); bch_hprint(dirty, bcache_dev_sectors_dirty(&dc->disk) << 9); bch_hprint(target, dc->writeback_rate_target << 9); bch_hprint(proportional, @@ -261,8 +264,12 @@ STORE(__cached_dev) sysfs_strtoul_clamp(writeback_percent, dc->writeback_percent, 0, 40); - sysfs_strtoul_clamp(writeback_rate, - dc->writeback_rate.rate, 1, INT_MAX); + if (attr == &sysfs_writeback_rate) { + int v; + + sysfs_strtoul_clamp(writeback_rate, v, 1, INT_MAX); + atomic_long_set(&dc->writeback_rate.rate, v); + } sysfs_strtoul_clamp(writeback_rate_update_seconds, dc->writeback_rate_update_seconds, diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c index fc479b026d6d..b15256bcf0e7 100644 --- a/drivers/md/bcache/util.c +++ b/drivers/md/bcache/util.c @@ -200,7 +200,7 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done) { uint64_t now = local_clock(); - d->next += div_u64(done * NSEC_PER_SEC, d->rate); + d->next += div_u64(done * NSEC_PER_SEC, atomic_long_read(&d->rate)); /* Bound the time. Don't let us fall further than 2 seconds behind * (this prevents unnecessary backlog that would make it impossible diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h index cced87f8eb27..f7b0133c9d2f 100644 --- a/drivers/md/bcache/util.h +++ b/drivers/md/bcache/util.h @@ -442,7 +442,7 @@ struct bch_ratelimit { * Rate at which we want to do work, in units per second * The units here correspond to the units passed to bch_next_delay() */ - uint32_t rate; + atomic_long_t rate; }; static inline void bch_ratelimit_reset(struct bch_ratelimit *d) diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c index 912e969fedba..481d4cf38ac0 100644 --- a/drivers/md/bcache/writeback.c +++ b/drivers/md/bcache/writeback.c @@ -104,11 +104,56 @@ static void __update_writeback_rate(struct cached_dev *dc) dc->writeback_rate_proportional = proportional_scaled; dc->writeback_rate_integral_scaled = integral_scaled; - dc->writeback_rate_change = new_rate - dc->writeback_rate.rate; - dc->writeback_rate.rate = new_rate; + dc->writeback_rate_change = new_rate - + atomic_long_read(&dc->writeback_rate.rate); + atomic_long_set(&dc->writeback_rate.rate, new_rate); dc->writeback_rate_target = target; } +static bool set_at_max_writeback_rate(struct cache_set *c, + struct cached_dev *dc) +{ + /* + * Idle_counter is increased everytime when update_writeback_rate() is + * called. If all backing devices attached to the same cache set have + * identical dc->writeback_rate_update_seconds values, it is about 6 + * rounds of update_writeback_rate() on each backing device before + * c->at_max_writeback_rate is set to 1, and then max wrteback rate set + * to each dc->writeback_rate.rate. + * In order to avoid extra locking cost for counting exact dirty cached + * devices number, c->attached_dev_nr is used to calculate the idle + * throushold. It might be bigger if not all cached device are in write- + * back mode, but it still works well with limited extra rounds of + * update_writeback_rate(). + */ + if (atomic_inc_return(&c->idle_counter) < + atomic_read(&c->attached_dev_nr) * 6) + return false; + + if (atomic_read(&c->at_max_writeback_rate) != 1) + atomic_set(&c->at_max_writeback_rate, 1); + + atomic_long_set(&dc->writeback_rate.rate, INT_MAX); + + /* keep writeback_rate_target as existing value */ + dc->writeback_rate_proportional = 0; + dc->writeback_rate_integral_scaled = 0; + dc->writeback_rate_change = 0; + + /* + * Check c->idle_counter and c->at_max_writeback_rate agagain in case + * new I/O arrives during before set_at_max_writeback_rate() returns. + * Then the writeback rate is set to 1, and its new value should be + * decided via __update_writeback_rate(). + */ + if ((atomic_read(&c->idle_counter) < + atomic_read(&c->attached_dev_nr) * 6) || + !atomic_read(&c->at_max_writeback_rate)) + return false; + + return true; +} + static void update_writeback_rate(struct work_struct *work) { struct cached_dev *dc = container_of(to_delayed_work(work), @@ -136,13 +181,20 @@ static void update_writeback_rate(struct work_struct *work) return; } - down_read(&dc->writeback_lock); - - if (atomic_read(&dc->has_dirty) && - dc->writeback_percent) - __update_writeback_rate(dc); + if (atomic_read(&dc->has_dirty) && dc->writeback_percent) { + /* + * If the whole cache set is idle, set_at_max_writeback_rate() + * will set writeback rate to a max number. Then it is + * unncessary to update writeback rate for an idle cache set + * in maximum writeback rate number(s). + */ + if (!set_at_max_writeback_rate(c, dc)) { + down_read(&dc->writeback_lock); + __update_writeback_rate(dc); + up_read(&dc->writeback_lock); + } + } - up_read(&dc->writeback_lock); /* * CACHE_SET_IO_DISABLE might be set via sysfs interface, @@ -422,27 +474,6 @@ static void read_dirty(struct cached_dev *dc) delay = writeback_delay(dc, size); - /* If the control system would wait for at least half a - * second, and there's been no reqs hitting the backing disk - * for awhile: use an alternate mode where we have at most - * one contiguous set of writebacks in flight at a time. If - * someone wants to do IO it will be quick, as it will only - * have to contend with one operation in flight, and we'll - * be round-tripping data to the backing disk as quickly as - * it can accept it. - */ - if (delay >= HZ / 2) { - /* 3 means at least 1.5 seconds, up to 7.5 if we - * have slowed way down. - */ - if (atomic_inc_return(&dc->backing_idle) >= 3) { - /* Wait for current I/Os to finish */ - closure_sync(&cl); - /* And immediately launch a new set. */ - delay = 0; - } - } - while (!kthread_should_stop() && !test_bit(CACHE_SET_IO_DISABLE, &dc->disk.c->flags) && delay) { @@ -741,7 +772,7 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc) dc->writeback_running = true; dc->writeback_percent = 10; dc->writeback_delay = 30; - dc->writeback_rate.rate = 1024; + atomic_long_set(&dc->writeback_rate.rate, 1024); dc->writeback_rate_minimum = 8; dc->writeback_rate_update_seconds = WRITEBACK_RATE_UPDATE_SECS_DEFAULT;

7 years, 3 months

1
0
0 0

WTF: patch "[PATCH] bcache: do not check return value of debugfs_create_dir()" was seriously submitted to be applied to the 4.18-stable tree?

by gregkh＠linuxfoundation.org

The patch below was submitted to be applied to the 4.18-stable tree. I fail to see how this patch meets the stable kernel rules as found at Documentation/process/stable-kernel-rules.rst. I could be totally wrong, and if so, please respond to <stable(a)vger.kernel.org> and let me know why this patch should be applied. Otherwise, it is now dropped from my patch queues, never to be seen again. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 78ac2107176baa0daf65b0fb8e561d2ed14c83ca Mon Sep 17 00:00:00 2001 From: Coly Li <colyli(a)suse.de> Date: Thu, 9 Aug 2018 15:48:42 +0800 Subject: [PATCH] bcache: do not check return value of debugfs_create_dir() Greg KH suggests that normal code should not care about debugfs. Therefore no matter successful or failed of debugfs_create_dir() execution, it is unncessary to check its return value. There are two functions called debugfs_create_dir() and check the return value, which are bch_debug_init() and closure_debug_init(). This patch changes these two functions from int to void type, and ignore return values of debugfs_create_dir(). This patch does not fix exact bug, just makes things work as they should. Signed-off-by: Coly Li <colyli(a)suse.de> Suggested-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Cc: stable(a)vger.kernel.org Cc: Kai Krakow <kai(a)kaishome.de> Cc: Kent Overstreet <kent.overstreet(a)gmail.com> Signed-off-by: Jens Axboe <axboe(a)kernel.dk> diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h index 872ef4d67711..0a3e82b0876d 100644 --- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -1001,7 +1001,7 @@ void bch_open_buckets_free(struct cache_set *); int bch_cache_allocator_start(struct cache *ca); void bch_debug_exit(void); -int bch_debug_init(struct kobject *); +void bch_debug_init(struct kobject *kobj); void bch_request_exit(void); int bch_request_init(void); diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c index 0e14969182c6..618253683d40 100644 --- a/drivers/md/bcache/closure.c +++ b/drivers/md/bcache/closure.c @@ -199,11 +199,16 @@ static const struct file_operations debug_ops = { .release = single_release }; -int __init closure_debug_init(void) +void __init closure_debug_init(void) { - closure_debug = debugfs_create_file("closures", - 0400, bcache_debug, NULL, &debug_ops); - return IS_ERR_OR_NULL(closure_debug); + if (!IS_ERR_OR_NULL(bcache_debug)) + /* + * it is unnecessary to check return value of + * debugfs_create_file(), we should not care + * about this. + */ + closure_debug = debugfs_create_file( + "closures", 0400, bcache_debug, NULL, &debug_ops); } #endif diff --git a/drivers/md/bcache/closure.h b/drivers/md/bcache/closure.h index 71427eb5fdae..7c2c5bc7c88b 100644 --- a/drivers/md/bcache/closure.h +++ b/drivers/md/bcache/closure.h @@ -186,13 +186,13 @@ static inline void closure_sync(struct closure *cl) #ifdef CONFIG_BCACHE_CLOSURES_DEBUG -int closure_debug_init(void); +void closure_debug_init(void); void closure_debug_create(struct closure *cl); void closure_debug_destroy(struct closure *cl); #else -static inline int closure_debug_init(void) { return 0; } +static inline void closure_debug_init(void) {} static inline void closure_debug_create(struct closure *cl) {} static inline void closure_debug_destroy(struct closure *cl) {} diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c index 04d146711950..12034c07257b 100644 --- a/drivers/md/bcache/debug.c +++ b/drivers/md/bcache/debug.c @@ -252,11 +252,12 @@ void bch_debug_exit(void) debugfs_remove_recursive(bcache_debug); } -int __init bch_debug_init(struct kobject *kobj) +void __init bch_debug_init(struct kobject *kobj) { - if (!IS_ENABLED(CONFIG_DEBUG_FS)) - return 0; - + /* + * it is unnecessary to check return value of + * debugfs_create_file(), we should not care + * about this. + */ bcache_debug = debugfs_create_dir("bcache", NULL); - return IS_ERR_OR_NULL(bcache_debug); } diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index e0a92104ca23..c7ffa6ef3f82 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -2345,10 +2345,12 @@ static int __init bcache_init(void) goto err; if (bch_request_init() || - bch_debug_init(bcache_kobj) || closure_debug_init() || sysfs_create_files(bcache_kobj, files)) goto err; + bch_debug_init(bcache_kobj); + closure_debug_init(); + return 0; err: bcache_exit();

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] block: blk_init_allocated_queue() set q->fq as NULL in the" failed to apply to 3.18-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 3.18-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 54648cf1ec2d7f4b6a71767799c45676a138ca24 Mon Sep 17 00:00:00 2001 From: xiao jin <jin.xiao(a)intel.com> Date: Mon, 30 Jul 2018 14:11:12 +0800 Subject: [PATCH] block: blk_init_allocated_queue() set q->fq as NULL in the fail case We find the memory use-after-free issue in __blk_drain_queue() on the kernel 4.14. After read the latest kernel 4.18-rc6 we think it has the same problem. Memory is allocated for q->fq in the blk_init_allocated_queue(). If the elevator init function called with error return, it will run into the fail case to free the q->fq. Then the __blk_drain_queue() uses the same memory after the free of the q->fq, it will lead to the unpredictable event. The patch is to set q->fq as NULL in the fail case of blk_init_allocated_queue(). Fixes: commit 7c94e1c157a2 ("block: introduce blk_flush_queue to drive flush machinery") Cc: <stable(a)vger.kernel.org> Reviewed-by: Ming Lei <ming.lei(a)redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche(a)wdc.com> Signed-off-by: xiao jin <jin.xiao(a)intel.com> Signed-off-by: Jens Axboe <axboe(a)kernel.dk> diff --git a/block/blk-core.c b/block/blk-core.c index 03a4ea93a5f3..23cd1b7770e7 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1184,6 +1184,7 @@ int blk_init_allocated_queue(struct request_queue *q) q->exit_rq_fn(q, q->fq->flush_rq); out_free_flush_queue: blk_free_flush_queue(q->fq); + q->fq = NULL; return -ENOMEM; } EXPORT_SYMBOL(blk_init_allocated_queue);

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] block: blk_init_allocated_queue() set q->fq as NULL in the" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 54648cf1ec2d7f4b6a71767799c45676a138ca24 Mon Sep 17 00:00:00 2001 From: xiao jin <jin.xiao(a)intel.com> Date: Mon, 30 Jul 2018 14:11:12 +0800 Subject: [PATCH] block: blk_init_allocated_queue() set q->fq as NULL in the fail case We find the memory use-after-free issue in __blk_drain_queue() on the kernel 4.14. After read the latest kernel 4.18-rc6 we think it has the same problem. Memory is allocated for q->fq in the blk_init_allocated_queue(). If the elevator init function called with error return, it will run into the fail case to free the q->fq. Then the __blk_drain_queue() uses the same memory after the free of the q->fq, it will lead to the unpredictable event. The patch is to set q->fq as NULL in the fail case of blk_init_allocated_queue(). Fixes: commit 7c94e1c157a2 ("block: introduce blk_flush_queue to drive flush machinery") Cc: <stable(a)vger.kernel.org> Reviewed-by: Ming Lei <ming.lei(a)redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche(a)wdc.com> Signed-off-by: xiao jin <jin.xiao(a)intel.com> Signed-off-by: Jens Axboe <axboe(a)kernel.dk> diff --git a/block/blk-core.c b/block/blk-core.c index 03a4ea93a5f3..23cd1b7770e7 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1184,6 +1184,7 @@ int blk_init_allocated_queue(struct request_queue *q) q->exit_rq_fn(q, q->fq->flush_rq); out_free_flush_queue: blk_free_flush_queue(q->fq); + q->fq = NULL; return -ENOMEM; } EXPORT_SYMBOL(blk_init_allocated_queue);

7 years, 3 months

1
0
0 0

FAILED: patch "[PATCH] block: blk_init_allocated_queue() set q->fq as NULL in the" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 54648cf1ec2d7f4b6a71767799c45676a138ca24 Mon Sep 17 00:00:00 2001 From: xiao jin <jin.xiao(a)intel.com> Date: Mon, 30 Jul 2018 14:11:12 +0800 Subject: [PATCH] block: blk_init_allocated_queue() set q->fq as NULL in the fail case We find the memory use-after-free issue in __blk_drain_queue() on the kernel 4.14. After read the latest kernel 4.18-rc6 we think it has the same problem. Memory is allocated for q->fq in the blk_init_allocated_queue(). If the elevator init function called with error return, it will run into the fail case to free the q->fq. Then the __blk_drain_queue() uses the same memory after the free of the q->fq, it will lead to the unpredictable event. The patch is to set q->fq as NULL in the fail case of blk_init_allocated_queue(). Fixes: commit 7c94e1c157a2 ("block: introduce blk_flush_queue to drive flush machinery") Cc: <stable(a)vger.kernel.org> Reviewed-by: Ming Lei <ming.lei(a)redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche(a)wdc.com> Signed-off-by: xiao jin <jin.xiao(a)intel.com> Signed-off-by: Jens Axboe <axboe(a)kernel.dk> diff --git a/block/blk-core.c b/block/blk-core.c index 03a4ea93a5f3..23cd1b7770e7 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1184,6 +1184,7 @@ int blk_init_allocated_queue(struct request_queue *q) q->exit_rq_fn(q, q->fq->flush_rq); out_free_flush_queue: blk_free_flush_queue(q->fq); + q->fq = NULL; return -ENOMEM; } EXPORT_SYMBOL(blk_init_allocated_queue);

7 years, 3 months

1
0
0 0

[PATCH] clk: sunxi-ng: sun4i: Set VCO and PLL bias current to lowest setting

by Chen-Yu Tsai

The default mid-level PLL bias current setting interferes with sigma delta modulation. This manifests as decreased audio quality at lower sampling rates, which sounds like radio broadcast quality, and distortion noises at sampling rates at 48 kHz or above. Changing the bias current settings to the lowest gets rid of the noise. Fixes: de3448519194 ("clk: sunxi-ng: sun4i: Use sigma-delta modulation for audio PLL") Cc: <stable(a)vger.kernel.org> # 4.15.x Signed-off-by: Chen-Yu Tsai <wens(a)csie.org> --- drivers/clk/sunxi-ng/ccu-sun4i-a10.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/clk/sunxi-ng/ccu-sun4i-a10.c b/drivers/clk/sunxi-ng/ccu-sun4i-a10.c index ffa5dac221e4..129ebd2588fd 100644 --- a/drivers/clk/sunxi-ng/ccu-sun4i-a10.c +++ b/drivers/clk/sunxi-ng/ccu-sun4i-a10.c @@ -1434,8 +1434,16 @@ static void __init sun4i_ccu_init(struct device_node *node, return; } - /* Force the PLL-Audio-1x divider to 1 */ val = readl(reg + SUN4I_PLL_AUDIO_REG); + + /* + * Force VCO and PLL bias current to lowest setting. Higher + * settings interfere with sigma-delta modulation and result + * in audible noise and distortions when using SPDIF or I2S. + */ + val &= ~GENMASK(25, 16); + + /* Force the PLL-Audio-1x divider to 1 */ val &= ~GENMASK(29, 26); writel(val | (1 << 26), reg + SUN4I_PLL_AUDIO_REG); -- 2.19.0.rc1

7 years, 3 months

2
1
0 0

rcu: Make expedited GPs handle CPU 0 being offline

by Paul E. McKenney

Hello! Please apply the following commit to the v4.18 -stable tree: rcu: Make expedited GPs handle CPU 0 being offline fcc63543650150629c8a873cbef3578770acecd9 This patch fixes a v4.18 regression that is causing the RISC-V people (CCed) some trouble. It is a small patch and has been tested in the failing situation running v4.18 by Atish Patra and Andreas Schwab (both CCed). Please let me know if more information is required. Thanx, Paul

7 years, 3 months

2
1
0 0

[PATCH AUTOSEL 4.4 01/30] usb: usbtest: use irqsave() in USB's complete callback

by Sasha Levin

From: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de> [ Upstream commit 6f3fde684d0232e66ada3410f016a58e09a87689 ] The USB completion callback does not disable interrupts while acquiring the lock. We want to remove the local_irq_disable() invocation from __usb_hcd_giveback_urb() and therefore it is required for the callback handler to disable the interrupts while acquiring the lock. The callback may be invoked either in IRQ or BH context depending on the USB host controller. Use the _irqsave() variant of the locking primitives. Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Acked-by: Alan Stern <stern(a)rowland.harvard.edu> Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com> --- drivers/usb/misc/usbtest.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/usb/misc/usbtest.c b/drivers/usb/misc/usbtest.c index bc92a498ec03..4632a29d5403 100644 --- a/drivers/usb/misc/usbtest.c +++ b/drivers/usb/misc/usbtest.c @@ -1053,11 +1053,12 @@ static void ctrl_complete(struct urb *urb) struct usb_ctrlrequest *reqp; struct subcase *subcase; int status = urb->status; + unsigned long flags; reqp = (struct usb_ctrlrequest *)urb->setup_packet; subcase = container_of(reqp, struct subcase, setup); - spin_lock(&ctx->lock); + spin_lock_irqsave(&ctx->lock, flags); ctx->count--; ctx->pending--; @@ -1156,7 +1157,7 @@ static void ctrl_complete(struct urb *urb) /* signal completion when nothing's queued */ if (ctx->pending == 0) complete(&ctx->complete); - spin_unlock(&ctx->lock); + spin_unlock_irqrestore(&ctx->lock, flags); } static int @@ -1832,8 +1833,9 @@ struct transfer_context { static void complicated_callback(struct urb *urb) { struct transfer_context *ctx = urb->context; + unsigned long flags; - spin_lock(&ctx->lock); + spin_lock_irqsave(&ctx->lock, flags); ctx->count--; ctx->packet_count += urb->number_of_packets; @@ -1873,7 +1875,7 @@ static void complicated_callback(struct urb *urb) complete(&ctx->done); } done: - spin_unlock(&ctx->lock); + spin_unlock_irqrestore(&ctx->lock, flags); } static struct urb *iso_alloc_urb( -- 2.17.1

7 years, 3 months

2
30
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror September 2018