Linux-stable-mirror September 2023

linux-stable-mirror@lists.linaro.org

442 participants
1075 discussions

hwmon: (nzxt-smart2) backport device ids to v6.1

by Aleksandr Mezin

Please pick the following commits: - e247510e1baad04e9b7b8ed7190dbb00989387b9 hwmon: (nzxt-smart2) Add device id - 4a148e9b1ee04e608263fa9536a96214d5561220 hwmon: (nzxt-smart2) add another USB ID into v6.1 stable kernel. They add device ids for nzxt-smart2 hwmon driver, and they don't require any other code changes. This will synchronize the driver code with v6.3.

2 years, 2 months

[PATCH] block: fix use-after-free of q->q_usage_counter

by Saranya Muruganandam

From: Ming Lei <ming.lei(a)redhat.com> commit d36a9ea5e7766961e753ee38d4c331bbe6ef659b upstream. For blk-mq, queue release handler is usually called after blk_mq_freeze_queue_wait() returns. However, the q_usage_counter->release() handler may not be run yet at that time, so this can cause a use-after-free. Fix the issue by moving percpu_ref_exit() into blk_free_queue_rcu(). Since ->release() is called with rcu read lock held, it is agreed that the race should be covered in caller per discussion from the two links. Backport-notes: Not a clean cherry-pick since a lot has changed, however essentially the same fix. Reported-by: Zhang Wensheng <zhangwensheng(a)huaweicloud.com> Reported-by: Zhong Jinghua <zhongjinghua(a)huawei.com> Link: https://lore.kernel.org/linux-block/Y5prfOjyyjQKUrtH@T590/T/#u Link: https://lore.kernel.org/lkml/Y4%2FmzMd4evRg9yDi@fedora/ Cc: Hillf Danton <hdanton(a)sina.com> Cc: Yu Kuai <yukuai3(a)huawei.com> Cc: Dennis Zhou <dennis(a)kernel.org> Fixes: 2b0d3d3e4fcf ("percpu_ref: reduce memory footprint of percpu_ref in fast path") Signed-off-by: Ming Lei <ming.lei(a)redhat.com> Link: https://lore.kernel.org/r/20221215021629.74870-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe(a)kernel.dk> Signed-off-by: Saranya Muruganandam <saranyamohan(a)google.com> --- block/blk-core.c | 2 -- block/blk-sysfs.c | 2 ++ 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index d0d0dd8151f7..e5eeec801f56 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -414,8 +414,6 @@ void blk_cleanup_queue(struct request_queue *q) blk_mq_sched_free_requests(q); mutex_unlock(&q->sysfs_lock); - percpu_ref_exit(&q->q_usage_counter); - /* @q is and will stay empty, shutdown and put */ blk_put_queue(q); } diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 8c5816364dd1..9174137a913c 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -726,6 +726,8 @@ static void blk_free_queue_rcu(struct rcu_head *rcu_head) { struct request_queue *q = container_of(rcu_head, struct request_queue, rcu_head); + + percpu_ref_exit(&q->q_usage_counter); kmem_cache_free(blk_requestq_cachep, q); } -- 2.42.0.515.g380fc7ccd1-goog

2 years, 2 months

FAILED: patch "[PATCH] ext4: fix rec_len verify error" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y git checkout FETCH_HEAD git cherry-pick -x 7fda67e8c3ab6069f75888f67958a6d30454a9f6 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023092057-company-unworried-210b@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^.. Possible dependencies: 7fda67e8c3ab ("ext4: fix rec_len verify error") 46c116b920eb ("ext4: verify dir block before splitting it") f036adb39976 ("ext4: rename "dirent_csum" functions to use "dirblock"") b886ee3e778e ("ext4: Support case-insensitive file name lookups") ee73f9a52a34 ("ext4: convert to new i_version API") ae5e165d855d ("fs: new API for handling inode->i_version") 5cea7647e646 ("Merge branch 'for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 7fda67e8c3ab6069f75888f67958a6d30454a9f6 Mon Sep 17 00:00:00 2001 From: Shida Zhang <zhangshida(a)kylinos.cn> Date: Thu, 3 Aug 2023 14:09:38 +0800 Subject: [PATCH] ext4: fix rec_len verify error With the configuration PAGE_SIZE 64k and filesystem blocksize 64k, a problem occurred when more than 13 million files were directly created under a directory: EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D. EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D. EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum When enough files are created, the fake_dirent->reclen will be 0xffff. it doesn't equal to the blocksize 65536, i.e. 0x10000. But it is not the same condition when blocksize equals to 4k. when enough files are created, the fake_dirent->reclen will be 0x1000. it equals to the blocksize 4k, i.e. 0x1000. The problem seems to be related to the limitation of the 16-bit field when the blocksize is set to 64k. To address this, helpers like ext4_rec_len_{from,to}_disk has already been introduced to complete the conversion between the encoded and the plain form of rec_len. So fix this one by using the helper, and all the other in this file too. Cc: stable(a)kernel.org Fixes: dbe89444042a ("ext4: Calculate and verify checksums for htree nodes") Suggested-by: Andreas Dilger <adilger(a)dilger.ca> Suggested-by: Darrick J. Wong <djwong(a)kernel.org> Signed-off-by: Shida Zhang <zhangshida(a)kylinos.cn> Reviewed-by: Andreas Dilger <adilger(a)dilger.ca> Reviewed-by: Darrick J. Wong <djwong(a)kernel.org> Link: https://lore.kernel.org/r/20230803060938.1929759-1-zhangshida@kylinos.cn Signed-off-by: Theodore Ts'o <tytso(a)mit.edu> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index c0f0b4e2413b..c1ceccab05f5 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -343,17 +343,17 @@ static struct ext4_dir_entry_tail *get_dirent_tail(struct inode *inode, struct buffer_head *bh) { struct ext4_dir_entry_tail *t; + int blocksize = EXT4_BLOCK_SIZE(inode->i_sb); #ifdef PARANOID struct ext4_dir_entry *d, *top; d = (struct ext4_dir_entry *)bh->b_data; top = (struct ext4_dir_entry *)(bh->b_data + - (EXT4_BLOCK_SIZE(inode->i_sb) - - sizeof(struct ext4_dir_entry_tail))); - while (d < top && d->rec_len) + (blocksize - sizeof(struct ext4_dir_entry_tail))); + while (d < top && ext4_rec_len_from_disk(d->rec_len, blocksize)) d = (struct ext4_dir_entry *)(((void *)d) + - le16_to_cpu(d->rec_len)); + ext4_rec_len_from_disk(d->rec_len, blocksize)); if (d != top) return NULL; @@ -364,7 +364,8 @@ static struct ext4_dir_entry_tail *get_dirent_tail(struct inode *inode, #endif if (t->det_reserved_zero1 || - le16_to_cpu(t->det_rec_len) != sizeof(struct ext4_dir_entry_tail) || + (ext4_rec_len_from_disk(t->det_rec_len, blocksize) != + sizeof(struct ext4_dir_entry_tail)) || t->det_reserved_zero2 || t->det_reserved_ft != EXT4_FT_DIR_CSUM) return NULL; @@ -445,13 +446,14 @@ static struct dx_countlimit *get_dx_countlimit(struct inode *inode, struct ext4_dir_entry *dp; struct dx_root_info *root; int count_offset; + int blocksize = EXT4_BLOCK_SIZE(inode->i_sb); + unsigned int rlen = ext4_rec_len_from_disk(dirent->rec_len, blocksize); - if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb)) + if (rlen == blocksize) count_offset = 8; - else if (le16_to_cpu(dirent->rec_len) == 12) { + else if (rlen == 12) { dp = (struct ext4_dir_entry *)(((void *)dirent) + 12); - if (le16_to_cpu(dp->rec_len) != - EXT4_BLOCK_SIZE(inode->i_sb) - 12) + if (ext4_rec_len_from_disk(dp->rec_len, blocksize) != blocksize - 12) return NULL; root = (struct dx_root_info *)(((void *)dp + 12)); if (root->reserved_zero || @@ -1315,6 +1317,7 @@ static int dx_make_map(struct inode *dir, struct buffer_head *bh, unsigned int buflen = bh->b_size; char *base = bh->b_data; struct dx_hash_info h = *hinfo; + int blocksize = EXT4_BLOCK_SIZE(dir->i_sb); if (ext4_has_metadata_csum(dir->i_sb)) buflen -= sizeof(struct ext4_dir_entry_tail); @@ -1335,11 +1338,12 @@ static int dx_make_map(struct inode *dir, struct buffer_head *bh, map_tail--; map_tail->hash = h.hash; map_tail->offs = ((char *) de - base)>>2; - map_tail->size = le16_to_cpu(de->rec_len); + map_tail->size = ext4_rec_len_from_disk(de->rec_len, + blocksize); count++; cond_resched(); } - de = ext4_next_entry(de, dir->i_sb->s_blocksize); + de = ext4_next_entry(de, blocksize); } return count; }

2 years, 2 months

[PATCH 5.10] drm/mediatek: Fix backport issue in mtk_drm_gem_prime_vmap()

by Nathan Chancellor

When building with clang: drivers/gpu/drm/mediatek/mtk_drm_gem.c:255:10: error: incompatible integer to pointer conversion returning 'int' from a function with result type 'void *' [-Wint-conversion] 255 | return -ENOMEM; | ^~~~~~~ 1 error generated. GCC reports the same issue as a warning, rather than an error. Prior to commit 7e542ff8b463 ("drm/mediatek: Use struct dma_buf_map in GEM vmap ops"), this function returned a pointer rather than an integer. This function is indirectly called in drm_gem_vmap(), which treats NULL as -ENOMEM through an error pointer. Return NULL in this block to resolve the warning but keep the same end result. Fixes: 43f561e809aa ("drm/mediatek: Fix potential memory leak if vmap() fail") Signed-off-by: Nathan Chancellor <nathan(a)kernel.org> --- This is a fix for a 5.10 backport, so it has no upstream relevance but I've still cc'd the relevant maintainers in case they have any comments or want to double check my work. --- drivers/gpu/drm/mediatek/mtk_drm_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c b/drivers/gpu/drm/mediatek/mtk_drm_gem.c index fe64bf2176f3..b20ea58907c2 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c @@ -252,7 +252,7 @@ void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj) if (!mtk_gem->kvaddr) { kfree(sgt); kfree(mtk_gem->pages); - return -ENOMEM; + return NULL; } out: kfree(sgt); --- base-commit: ff0bfa8f23eb4c5a65ee6b0d0b7dc2e3439f1063 change-id: 20230922-5-10-fix-drm-mediatek-backport-0ee69329fef0 Best regards, -- Nathan Chancellor <nathan(a)kernel.org>

2 years, 2 months

[PATCH 6.1] drm/amd/display: Adjust the MST resume flow

by Mario Limonciello

From: Wayne Lin <wayne.lin(a)amd.com> [Why] In drm_dp_mst_topology_mgr_resume() today, it will resume the mst branch to be ready handling mst mode and also consecutively do the mst topology probing. Which will cause the dirver have chance to fire hotplug event before restoring the old state. Then Userspace will react to the hotplug event based on a wrong state. [How] Adjust the mst resume flow as: 1. set dpcd to resume mst branch status 2. restore source old state 3. Do mst resume topology probing For drm_dp_mst_topology_mgr_resume(), it's better to adjust it to pull out topology probing work into a 2nd part procedure of the mst resume. Will have a follow up patch in drm. Reviewed-by: Chao-kai Wang <stylon.wang(a)amd.com> Cc: Mario Limonciello <mario.limonciello(a)amd.com> Cc: Alex Deucher <alexander.deucher(a)amd.com> Cc: stable(a)vger.kernel.org Acked-by: Stylon Wang <stylon.wang(a)amd.com> Signed-off-by: Wayne Lin <wayne.lin(a)amd.com> Tested-by: Daniel Wheeler <daniel.wheeler(a)amd.com> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> (cherry picked from commit ec5fa9fcdeca69edf7dab5ca3b2e0ceb1c08fe9a) Adjust for missing variable rename in f0127cb11299 ("drm/amdgpu/display/mst: adjust the naming of mst_port and port of aconnector") Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com> --- This is a follow up for https://lore.kernel.org/stable/2023092029-banter-truth-cf72@gregkh/ .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 93 ++++++++++++++++--- 1 file changed, 80 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index c8e562dcd99d..b98ee4b41863 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -2339,14 +2339,62 @@ static int dm_late_init(void *handle) return detect_mst_link_for_all_connectors(adev_to_drm(adev)); } +static void resume_mst_branch_status(struct drm_dp_mst_topology_mgr *mgr) +{ + int ret; + u8 guid[16]; + u64 tmp64; + + mutex_lock(&mgr->lock); + if (!mgr->mst_primary) + goto out_fail; + + if (drm_dp_read_dpcd_caps(mgr->aux, mgr->dpcd) < 0) { + drm_dbg_kms(mgr->dev, "dpcd read failed - undocked during suspend?\n"); + goto out_fail; + } + + ret = drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL, + DP_MST_EN | + DP_UP_REQ_EN | + DP_UPSTREAM_IS_SRC); + if (ret < 0) { + drm_dbg_kms(mgr->dev, "mst write failed - undocked during suspend?\n"); + goto out_fail; + } + + /* Some hubs forget their guids after they resume */ + ret = drm_dp_dpcd_read(mgr->aux, DP_GUID, guid, 16); + if (ret != 16) { + drm_dbg_kms(mgr->dev, "dpcd read failed - undocked during suspend?\n"); + goto out_fail; + } + + if (memchr_inv(guid, 0, 16) == NULL) { + tmp64 = get_jiffies_64(); + memcpy(&guid[0], &tmp64, sizeof(u64)); + memcpy(&guid[8], &tmp64, sizeof(u64)); + + ret = drm_dp_dpcd_write(mgr->aux, DP_GUID, guid, 16); + + if (ret != 16) { + drm_dbg_kms(mgr->dev, "check mstb guid failed - undocked during suspend?\n"); + goto out_fail; + } + } + + memcpy(mgr->mst_primary->guid, guid, 16); + +out_fail: + mutex_unlock(&mgr->lock); +} + static void s3_handle_mst(struct drm_device *dev, bool suspend) { struct amdgpu_dm_connector *aconnector; struct drm_connector *connector; struct drm_connector_list_iter iter; struct drm_dp_mst_topology_mgr *mgr; - int ret; - bool need_hotplug = false; drm_connector_list_iter_begin(dev, &iter); drm_for_each_connector_iter(connector, &iter) { @@ -2368,18 +2416,15 @@ static void s3_handle_mst(struct drm_device *dev, bool suspend) if (!dp_is_lttpr_present(aconnector->dc_link)) dc_link_aux_try_to_configure_timeout(aconnector->dc_link->ddc, LINK_AUX_DEFAULT_TIMEOUT_PERIOD); - ret = drm_dp_mst_topology_mgr_resume(mgr, true); - if (ret < 0) { - dm_helpers_dp_mst_stop_top_mgr(aconnector->dc_link->ctx, - aconnector->dc_link); - need_hotplug = true; - } + /* TODO: move resume_mst_branch_status() into drm mst resume again + * once topology probing work is pulled out from mst resume into mst + * resume 2nd step. mst resume 2nd step should be called after old + * state getting restored (i.e. drm_atomic_helper_resume()). + */ + resume_mst_branch_status(mgr); } } drm_connector_list_iter_end(&iter); - - if (need_hotplug) - drm_kms_helper_hotplug_event(dev); } static int amdgpu_dm_smu_write_watermarks_table(struct amdgpu_device *adev) @@ -2768,7 +2813,8 @@ static int dm_resume(void *handle) struct dm_atomic_state *dm_state = to_dm_atomic_state(dm->atomic_obj.state); enum dc_connection_type new_connection_type = dc_connection_none; struct dc_state *dc_state; - int i, r, j; + int i, r, j, ret; + bool need_hotplug = false; if (amdgpu_in_reset(adev)) { dc_state = dm->cached_dc_state; @@ -2866,7 +2912,7 @@ static int dm_resume(void *handle) continue; /* - * this is the case when traversing through already created + * this is the case when traversing through already created end sink * MST connectors, should be skipped */ if (aconnector && aconnector->mst_port) @@ -2926,6 +2972,27 @@ static int dm_resume(void *handle) dm->cached_state = NULL; + /* Do mst topology probing after resuming cached state*/ + drm_connector_list_iter_begin(ddev, &iter); + drm_for_each_connector_iter(connector, &iter) { + aconnector = to_amdgpu_dm_connector(connector); + if (aconnector->dc_link->type != dc_connection_mst_branch || + aconnector->mst_port) + continue; + + ret = drm_dp_mst_topology_mgr_resume(&aconnector->mst_mgr, true); + + if (ret < 0) { + dm_helpers_dp_mst_stop_top_mgr(aconnector->dc_link->ctx, + aconnector->dc_link); + need_hotplug = true; + } + } + drm_connector_list_iter_end(&iter); + + if (need_hotplug) + drm_kms_helper_hotplug_event(ddev); + amdgpu_dm_irq_resume_late(adev); amdgpu_dm_smu_write_watermarks_table(adev); -- 2.34.1

2 years, 2 months

[PATCH stable 4.14.y] vc_screen: reload load of struct vc_data pointer in vcs_write() to avoid UAF

by Suraj Jitindar Singh

From: George Kennedy <george.kennedy(a)oracle.com> commit 8fb9ea65c9d1338b0d2bb0a9122dc942cdd32357 upstream. After a call to console_unlock() in vcs_write() the vc_data struct can be freed by vc_port_destruct(). Because of that, the struct vc_data pointer must be reloaded in the while loop in vcs_write() after console_lock() to avoid a UAF when vcs_size() is called. Syzkaller reported a UAF in vcs_size(). BUG: KASAN: slab-use-after-free in vcs_size (drivers/tty/vt/vc_screen.c:215) Read of size 4 at addr ffff8880beab89a8 by task repro_vcs_size/4119 Call Trace: <TASK> __asan_report_load4_noabort (mm/kasan/report_generic.c:380) vcs_size (drivers/tty/vt/vc_screen.c:215) vcs_write (drivers/tty/vt/vc_screen.c:664) vfs_write (fs/read_write.c:582 fs/read_write.c:564) ... <TASK> Allocated by task 1213: kmalloc_trace (mm/slab_common.c:1064) vc_allocate (./include/linux/slab.h:559 ./include/linux/slab.h:680 drivers/tty/vt/vt.c:1078 drivers/tty/vt/vt.c:1058) con_install (drivers/tty/vt/vt.c:3334) tty_init_dev (drivers/tty/tty_io.c:1303 drivers/tty/tty_io.c:1415 drivers/tty/tty_io.c:1392) tty_open (drivers/tty/tty_io.c:2082 drivers/tty/tty_io.c:2128) chrdev_open (fs/char_dev.c:415) do_dentry_open (fs/open.c:921) vfs_open (fs/open.c:1052) ... Freed by task 4116: kfree (mm/slab_common.c:1016) vc_port_destruct (drivers/tty/vt/vt.c:1044) tty_port_destructor (drivers/tty/tty_port.c:296) tty_port_put (drivers/tty/tty_port.c:312) vt_disallocate_all (drivers/tty/vt/vt_ioctl.c:662 (discriminator 2)) vt_ioctl (drivers/tty/vt/vt_ioctl.c:903) tty_ioctl (drivers/tty/tty_io.c:2778) ... The buggy address belongs to the object at ffff8880beab8800 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 424 bytes inside of freed 1024-byte region [ffff8880beab8800, ffff8880beab8c00) The buggy address belongs to the physical page: page:00000000afc77580 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbeab8 head:00000000afc77580 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff) page_type: 0xffffffff() raw: 000fffffc0010200 ffff888100042dc0 ffffea000426de00 dead000000000002 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880beab8880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880beab8900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8880beab8980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8880beab8a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880beab8a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Disabling lock debugging due to kernel taint Fixes: ac751efa6a0d ("console: rename acquire/release_console_sem() to console_lock/unlock()") Cc: stable <stable(a)kernel.org> Reported-by: syzkaller <syzkaller(a)googlegroups.com> Signed-off-by: George Kennedy <george.kennedy(a)oracle.com> Reviewed-by: Thomas Weißschuh <linux(a)weissschuh.net> Link: https://lore.kernel.org/r/1683889728-10411-1-git-send-email-george.kennedy@… Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> [ Adjust context due to missing commit 71d4abfab322 ("vc_screen: rewrite vcs_size to accept vc, not inode") in 4.14.y stable ] Signed-off-by: Suraj Jitindar Singh <surajjs(a)amazon.com> --- drivers/tty/vt/vc_screen.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/tty/vt/vc_screen.c b/drivers/tty/vt/vc_screen.c index 85b6634f518a..42c9ef64108f 100644 --- a/drivers/tty/vt/vc_screen.c +++ b/drivers/tty/vt/vc_screen.c @@ -437,10 +437,17 @@ vcs_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos) } } - /* The vcs_size might have changed while we slept to grab - * the user buffer, so recheck. + /* The vc might have been freed or vcs_size might have changed + * while we slept to grab the user buffer, so recheck. * Return data written up to now on failure. */ + vc = vcs_vc(inode, &viewed); + if (!vc) { + if (written) + break; + ret = -ENXIO; + goto unlock_out; + } size = vcs_size(inode); if (size < 0) { if (written) -- 2.34.1

2 years, 2 months

[PATCH 6.1 0/3] net: add sysctl accept_ra_min_lft

by Patrick Rohr

This series adds a new sysctl accept_ra_min_lft which enforces a minimum lifetime value for individual RA sections; in particular, router lifetime, PIO preferred lifetime, and RIO lifetime. If any of those lifetimes are lower than the configured value, the specific RA section is ignored. This fixes a potential denial of service attack vector where rogue WiFi routers (or devices) can send RAs with low lifetimes to actively drain a mobile device's battery (by preventing sleep). In addition to this change, Android uses hardware offloads to drop RAs for a fraction of the minimum of all lifetimes present in the RA (some networks have very frequent RAs (5s) with high lifetimes (2h)). Despite this, we have encountered networks that set the router lifetime to 30s which results in very frequent CPU wakeups. Instead of disabling IPv6 (and dropping IPv6 ethertype in the WiFi firmware) entirely on such networks, misconfigured routers must be ignored while still processing RAs from other IPv6 routers on the same network (i.e. to support IoT applications). Patches: - 1671bcfd76fd ("net: add sysctl accept_ra_min_rtr_lft") - 5027d54a9c30 ("net: change accept_ra_min_rtr_lft to affect all RA lifetimes") - 5cb249686e67 ("net: release reference to inet6_dev pointer") Patrick Rohr (3): net: add sysctl accept_ra_min_rtr_lft net: change accept_ra_min_rtr_lft to affect all RA lifetimes net: release reference to inet6_dev pointer Documentation/networking/ip-sysctl.rst | 8 ++++++++ include/linux/ipv6.h | 1 + include/uapi/linux/ipv6.h | 1 + net/ipv6/addrconf.c | 13 +++++++++++++ net/ipv6/ndisc.c | 13 +++++++++++-- 5 files changed, 34 insertions(+), 2 deletions(-) -- 2.42.0.515.g380fc7ccd1-goog

2 years, 2 months

[PATCH v3 2/3] mmap: Fix error paths with dup_anon_vma()

by Liam R. Howlett

When the calling function fails after the dup_anon_vma(), the duplication of the anon_vma is not being undone. Add the necessary unlink_anon_vma() call to the error paths that are missing them. This issue showed up during inspection of the error path in vma_merge() for an unrelated vma iterator issue. Users may experience increased memory usage, which may be problematic as the failure would likely be caused by a low memory situation. Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree") Cc: stable(a)vger.kernel.org Cc: Jann Horn <jannh(a)google.com> Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com> --- mm/mmap.c | 30 ++++++++++++++++++++++-------- 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index acb7dea49e23..f9f0a5fe4db4 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -583,11 +583,12 @@ static inline void vma_complete(struct vma_prepare *vp, * dup_anon_vma() - Helper function to duplicate anon_vma * @dst: The destination VMA * @src: The source VMA + * @dup: Pointer to the destination VMA when successful. * * Returns: 0 on success. */ static inline int dup_anon_vma(struct vm_area_struct *dst, - struct vm_area_struct *src) + struct vm_area_struct *src, struct vm_area_struct **dup) { /* * Easily overlooked: when mprotect shifts the boundary, make sure the @@ -595,9 +596,15 @@ static inline int dup_anon_vma(struct vm_area_struct *dst, * anon pages imported. */ if (src->anon_vma && !dst->anon_vma) { + int ret; + vma_assert_write_locked(dst); dst->anon_vma = src->anon_vma; - return anon_vma_clone(dst, src); + ret = anon_vma_clone(dst, src); + if (ret) + return ret; + + *dup = dst; } return 0; @@ -624,6 +631,7 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma, unsigned long start, unsigned long end, pgoff_t pgoff, struct vm_area_struct *next) { + struct vm_area_struct *anon_dup = NULL; bool remove_next = false; struct vma_prepare vp; @@ -633,7 +641,7 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma, remove_next = true; vma_start_write(next); - ret = dup_anon_vma(vma, next); + ret = dup_anon_vma(vma, next, &anon_dup); if (ret) return ret; } @@ -661,6 +669,8 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma, return 0; nomem: + if (anon_dup) + unlink_anon_vmas(anon_dup); return -ENOMEM; } @@ -860,6 +870,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm, { struct vm_area_struct *curr, *next, *res; struct vm_area_struct *vma, *adjust, *remove, *remove2; + struct vm_area_struct *anon_dup = NULL; struct vma_prepare vp; pgoff_t vma_pgoff; int err = 0; @@ -927,18 +938,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm, vma_start_write(next); remove = next; /* case 1 */ vma_end = next->vm_end; - err = dup_anon_vma(prev, next); + err = dup_anon_vma(prev, next, &anon_dup); if (curr) { /* case 6 */ vma_start_write(curr); remove = curr; remove2 = next; if (!next->anon_vma) - err = dup_anon_vma(prev, curr); + err = dup_anon_vma(prev, curr, &anon_dup); } } else if (merge_prev) { /* case 2 */ if (curr) { vma_start_write(curr); - err = dup_anon_vma(prev, curr); + err = dup_anon_vma(prev, curr, &anon_dup); if (end == curr->vm_end) { /* case 7 */ remove = curr; } else { /* case 5 */ @@ -954,7 +965,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm, vma_end = addr; adjust = next; adj_start = -(prev->vm_end - addr); - err = dup_anon_vma(next, prev); + err = dup_anon_vma(next, prev, &anon_dup); } else { /* * Note that cases 3 and 8 are the ONLY ones where prev @@ -968,7 +979,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm, vma_pgoff = curr->vm_pgoff; vma_start_write(curr); remove = curr; - err = dup_anon_vma(next, curr); + err = dup_anon_vma(next, curr, &anon_dup); } } } @@ -1018,6 +1029,9 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm, return res; prealloc_fail: + if (anon_dup) + unlink_anon_vmas(anon_dup); + anon_vma_fail: vma_iter_set(vmi, addr); vma_iter_load(vmi); -- 2.40.1

2 years, 3 months

[PATCH] fs/ext4/acl: apply umask if ACL support is disabled

by Max Kellermann

The function ext4_init_acl() calls posix_acl_create() which is responsible for applying the umask. But without CONFIG_EXT4_FS_POSIX_ACL, ext4_init_acl() is an empty inline function, and nobody applies the umask. This fixes a bug which causes the umask to be ignored with O_TMPFILE on ext4: https://github.com/MusicPlayerDaemon/MPD/issues/558 https://bugs.gentoo.org/show_bug.cgi?id=686142#c3 https://bugzilla.kernel.org/show_bug.cgi?id=203625 Reviewed-by: J. Bruce Fields <bfields(a)redhat.com> Cc: stable(a)vger.kernel.org Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com> --- fs/ext4/acl.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/ext4/acl.h b/fs/ext4/acl.h index 0c5a79c3b5d4..ef4c19e5f570 100644 --- a/fs/ext4/acl.h +++ b/fs/ext4/acl.h @@ -68,6 +68,11 @@ extern int ext4_init_acl(handle_t *, struct inode *, struct inode *); static inline int ext4_init_acl(handle_t *handle, struct inode *inode, struct inode *dir) { + /* usually, the umask is applied by posix_acl_create(), but if + ext4 ACL support is disabled at compile time, we need to do + it here, because posix_acl_create() will never be called */ + inode->i_mode &= ~current_umask(); + return 0; } #endif /* CONFIG_EXT4_FS_POSIX_ACL */ -- 2.39.2

2 years, 3 months

[PATCH v4] ext4: fix race between writepages and remount

by Baokun Li

We got a WARNING in ext4_add_complete_io: ================================================================== WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250 CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85 RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4] [...] Call Trace: <TASK> ext4_end_bio+0xa8/0x240 [ext4] bio_endio+0x195/0x310 blk_update_request+0x184/0x770 scsi_end_request+0x2f/0x240 scsi_io_completion+0x75/0x450 scsi_finish_command+0xef/0x160 scsi_complete+0xa3/0x180 blk_complete_reqs+0x60/0x80 blk_done_softirq+0x25/0x40 __do_softirq+0x119/0x4c8 run_ksoftirqd+0x42/0x70 smpboot_thread_fn+0x136/0x3c0 kthread+0x140/0x1a0 ret_from_fork+0x2c/0x50 ================================================================== Above issue may happen as follows: cpu1 cpu2 ----------------------------|---------------------------- mount -o dioread_lock ext4_writepages ext4_do_writepages *if (ext4_should_dioread_nolock(inode))* // rsv_blocks is not assigned here mount -o remount,dioread_nolock ext4_journal_start_with_reserve __ext4_journal_start __ext4_journal_start_sb jbd2__journal_start *if (rsv_blocks)* // h_rsv_handle is not initialized here mpage_map_and_submit_extent mpage_map_one_extent dioread_nolock = ext4_should_dioread_nolock(inode) if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN)) mpd->io_submit.io_end->handle = handle->h_rsv_handle ext4_set_io_unwritten_flag io_end->flag |= EXT4_IO_END_UNWRITTEN // now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag scsi_finish_command scsi_io_completion scsi_io_completion_action scsi_end_request blk_update_request req_bio_endio bio_endio bio->bi_end_io > ext4_end_bio ext4_put_io_end_defer ext4_add_complete_io // trigger WARN_ON(!io_end->handle && sbi->s_journal); The immediate cause of this problem is that ext4_should_dioread_nolock() function returns inconsistent values in the ext4_do_writepages() and mpage_map_one_extent(). There are four conditions in this function that can be changed at mount time to cause this problem. These four conditions can be divided into two categories: (1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl (2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount The two in the first category have been fixed by commit c8585c6fcaf2 ("ext4: fix races between changing inode journal mode and ext4_writepages") and commit cb85f4d23f79 ("ext4: fix race between writepages and enabling EXT4_EXTENTS_FL") respectively. Two cases in the other category have not yet been fixed, and the above issue is caused by this situation. We refer to the fix for the first category, when applying options during remount, we grab s_writepages_rwsem to avoid racing with writepages ops to trigger this problem. Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io") Cc: stable(a)vger.kernel.org Signed-off-by: Baokun Li <libaokun1(a)huawei.com> --- V1->V2: Grab s_writepages_rwsem unconditionally during remount. Remove patches 1,2 that are no longer needed. V2->V3: Also grab s_writepages_rwsem when restoring options. V3->V4: Rebased on top of mainline. Reference 00d873c17e29 ("ext4: avoid deadlock in fs reclaim with page writeback") to use s_writepages_rwsem. fs/ext4/ext4.h | 3 ++- fs/ext4/super.c | 14 ++++++++++++++ 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 6948d673bba2..97ef99c7f296 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1613,7 +1613,8 @@ struct ext4_sb_info { /* * Barrier between writepages ops and changing any inode's JOURNAL_DATA - * or EXTENTS flag. + * or EXTENTS flag or between writepages ops and changing DELALLOC or + * DIOREAD_NOLOCK mount options on remount. */ struct percpu_rw_semaphore s_writepages_rwsem; struct dax_device *s_daxdev; diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 9680fe753e59..fff42682e4e0 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -6389,6 +6389,7 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) ext4_group_t g; int err = 0; int enable_rw = 0; + int alloc_ctx; #ifdef CONFIG_QUOTA int enable_quota = 0; int i, j; @@ -6429,7 +6430,16 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) } + /* + * Changing the DIOREAD_NOLOCK or DELALLOC mount options may cause + * two calls to ext4_should_dioread_nolock() to return inconsistent + * values, triggering WARN_ON in ext4_add_complete_io(). we grab + * here s_writepages_rwsem to avoid race between writepages ops and + * remount. + */ + alloc_ctx = ext4_writepages_down_write(sb); ext4_apply_options(fc, sb); + ext4_writepages_up_write(sb, alloc_ctx); if ((old_opts.s_mount_opt & EXT4_MOUNT_JOURNAL_CHECKSUM) ^ test_opt(sb, JOURNAL_CHECKSUM)) { @@ -6650,6 +6660,8 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) if ((sb->s_flags & SB_RDONLY) && !(old_sb_flags & SB_RDONLY) && sb_any_quota_suspended(sb)) dquot_resume(sb, -1); + + alloc_ctx = ext4_writepages_down_write(sb); sb->s_flags = old_sb_flags; sbi->s_mount_opt = old_opts.s_mount_opt; sbi->s_mount_opt2 = old_opts.s_mount_opt2; @@ -6658,6 +6670,8 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb) sbi->s_commit_interval = old_opts.s_commit_interval; sbi->s_min_batch_time = old_opts.s_min_batch_time; sbi->s_max_batch_time = old_opts.s_max_batch_time; + ext4_writepages_up_write(sb, alloc_ctx); + if (!test_opt(sb, BLOCK_VALIDITY) && sbi->s_system_blks) ext4_release_system_zone(sb); #ifdef CONFIG_QUOTA -- 2.31.1

2 years, 3 months

← Newer
1
...
6
7
8
9
10
11
12
...
108
Older →

Jump to page:

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror September 2023