[BUG]
Currently btrfs can create subvolume with an invalid qgroup inherit
without triggering any error:
# mkfs.btrfs -O quota -f $dev
# mount $dev $mnt
# btrfs subvolume create -i 2/0 $mnt/subv1
# btrfs qgroup show -prce --sync $mnt
Qgroupid Referenced Exclusive Path
-------- ---------- --------- ----
0/5 16.00KiB 16.00KiB <toplevel>
0/256 16.00KiB 16.00KiB subv1
[CAUSE]
We only do a very basic size check for btrfs_qgroup_inherit structure,
but never really verify if the values are correct.
Thus in btrfs_qgroup_inherit() function, we have to skip non-existing
qgroups, and never return any error.
[FIX]
Fix the behavior and introduce extra checks:
- Introduce early check for btrfs_qgroup_inherit structure
Not only the size, but also all the qgroup ids would be verifyed.
And the timing is very early, so we can return error early.
This early check is very important for snapshot creation, as snapshot
is delayed to transaction commit.
- Drop support for btrfs_qgroup_inherit::num_ref_copies and
num_excl_copies
Those two members are used to specify to copy refr/excl numbers from
other qgroups.
This would definitely mark qgroup inconsistent, and btrfs-progs has
dropped the support for them for a long time.
It's time to drop the support for kernel.
- Verify the supported btrfs_qgroup_inherit::flags
Just in case we want to add extra flags for btrfs_qgroup_inherit.
Now above subvolume creation would fail with -ENOENT other than silently
ignore the non-existing qgroup.
CC: stable(a)vger.kernel.org
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/ioctl.c | 16 +++---------
fs/btrfs/qgroup.c | 52 ++++++++++++++++++++++++++++++++++++++
fs/btrfs/qgroup.h | 3 +++
include/uapi/linux/btrfs.h | 1 +
4 files changed, 59 insertions(+), 13 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 8b80fbea1e72..c19ce2e292dc 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1382,7 +1382,7 @@ static noinline int btrfs_ioctl_snap_create_v2(struct file *file,
if (vol_args->flags & BTRFS_SUBVOL_RDONLY)
readonly = true;
if (vol_args->flags & BTRFS_SUBVOL_QGROUP_INHERIT) {
- u64 nums;
+ struct btrfs_fs_info *fs_info = inode_to_fs_info(file_inode(file));
if (vol_args->size < sizeof(*inherit) ||
vol_args->size > PAGE_SIZE) {
@@ -1395,19 +1395,9 @@ static noinline int btrfs_ioctl_snap_create_v2(struct file *file,
goto free_args;
}
- if (inherit->num_qgroups > PAGE_SIZE ||
- inherit->num_ref_copies > PAGE_SIZE ||
- inherit->num_excl_copies > PAGE_SIZE) {
- ret = -EINVAL;
+ ret = btrfs_qgroup_check_inherit(fs_info, inherit, vol_args->size);
+ if (ret < 0)
goto free_inherit;
- }
-
- nums = inherit->num_qgroups + 2 * inherit->num_ref_copies +
- 2 * inherit->num_excl_copies;
- if (vol_args->size != struct_size(inherit, qgroups, nums)) {
- ret = -EINVAL;
- goto free_inherit;
- }
}
ret = __btrfs_ioctl_snap_create(file, file_mnt_idmap(file),
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 4fa83c76b37b..66968092b554 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3046,6 +3046,58 @@ int btrfs_run_qgroups(struct btrfs_trans_handle *trans)
return ret;
}
+int btrfs_qgroup_check_inherit(struct btrfs_fs_info *fs_info,
+ struct btrfs_qgroup_inherit *inherit,
+ size_t size)
+{
+ if (inherit->flags & ~BTRFS_QGROUP_INHERIT_FLAGS_SUPP)
+ return -EOPNOTSUPP;
+ if (size < sizeof(*inherit) || size > PAGE_SIZE)
+ return -EINVAL;
+
+ /*
+ * In the past we allow btrfs_qgroup_inherit to specify to copy
+ * refr/excl numbers directly from other qgroups.
+ * This behavior has been disable in btrfs-progs for a very long time,
+ * but here we should also disable them for kernel, as this behavior
+ * is known to mark qgroup inconsistent, and a rescan would wipe out the
+ * change anyway.
+ *
+ * So here we just reject any btrfs_qgroup_inherit with num_ref_copies or
+ * num_excl_copies.
+ */
+ if (inherit->num_ref_copies || inherit->num_excl_copies)
+ return -EINVAL;
+
+ if (inherit->num_qgroups > PAGE_SIZE)
+ return -EINVAL;
+
+ if (size != struct_size(inherit, qgroups, inherit->num_qgroups))
+ return -EINVAL;
+
+ /*
+ * Now check all the remaining qgroups, they should all:
+ * - Exist
+ * - Be higher level qgroups.
+ */
+ for (int i = 0; i < inherit->num_qgroups; i++) {
+ struct btrfs_qgroup *qgroup;
+ u64 qgroupid = inherit->qgroups[i];
+
+ if (btrfs_qgroup_level(qgroupid) == 0)
+ return -EINVAL;
+
+ spin_lock(&fs_info->qgroup_lock);
+ qgroup = find_qgroup_rb(fs_info, qgroupid);
+ if (!qgroup) {
+ spin_unlock(&fs_info->qgroup_lock);
+ return -ENOENT;
+ }
+ spin_unlock(&fs_info->qgroup_lock);
+ }
+ return 0;
+}
+
static int qgroup_auto_inherit(struct btrfs_fs_info *fs_info,
u64 inode_rootid,
struct btrfs_qgroup_inherit **inherit)
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 1f664261c064..706640be0ec2 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -350,6 +350,9 @@ int btrfs_qgroup_account_extent(struct btrfs_trans_handle *trans, u64 bytenr,
struct ulist *new_roots);
int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans);
int btrfs_run_qgroups(struct btrfs_trans_handle *trans);
+int btrfs_qgroup_check_inherit(struct btrfs_fs_info *fs_info,
+ struct btrfs_qgroup_inherit *inherit,
+ size_t size);
int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid,
u64 objectid, u64 inode_rootid,
struct btrfs_qgroup_inherit *inherit);
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index f8bc34a6bcfa..cdf6ad872149 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -92,6 +92,7 @@ struct btrfs_qgroup_limit {
* struct btrfs_qgroup_inherit.flags
*/
#define BTRFS_QGROUP_INHERIT_SET_LIMITS (1ULL << 0)
+#define BTRFS_QGROUP_INHERIT_FLAGS_SUPP (BTRFS_QGROUP_INHERIT_SET_LIMITS)
struct btrfs_qgroup_inherit {
__u64 flags;
--
2.43.2
From: Xiubo Li <xiubli(a)redhat.com>
The osd code has remove cursor initilizing code and this will make
the sparse read state into a infinite loop. We should initialize
the cursor just before each sparse-read in messnger v2.
Cc: stable(a)vger.kernel.org
URL: https://tracker.ceph.com/issues/64607
Fixes: 8e46a2d068c9 ("libceph: just wait for more data to be available on the socket")
Reported-by: Luis Henriques <lhenriques(a)suse.de>
Signed-off-by: Xiubo Li <xiubli(a)redhat.com>
---
net/ceph/messenger_v2.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
index a0ca5414b333..7ae0f80100f4 100644
--- a/net/ceph/messenger_v2.c
+++ b/net/ceph/messenger_v2.c
@@ -2025,6 +2025,7 @@ static int prepare_sparse_read_cont(struct ceph_connection *con)
static int prepare_sparse_read_data(struct ceph_connection *con)
{
struct ceph_msg *msg = con->in_msg;
+ u64 len = con->in_msg->sparse_read_total ? : data_len(con->in_msg);
dout("%s: starting sparse read\n", __func__);
@@ -2034,6 +2035,8 @@ static int prepare_sparse_read_data(struct ceph_connection *con)
if (!con_secure(con))
con->in_data_crc = -1;
+ ceph_msg_data_cursor_init(&con->v2.in_cursor, con->in_msg, len);
+
reset_in_kvecs(con);
con->v2.in_state = IN_S_PREPARE_SPARSE_DATA_CONT;
con->v2.data_len_remain = data_len(msg);
--
2.43.0
when AOP_WRITEPAGE_ACTIVATE is returned (as NFS does when it detects
congestion) it is important that the page is redirtied.
nfs_writepage_locked() doesn't do this, so files can become corrupted as
writes can be lost.
Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
returned. It is needed for kernels v5.18..v6.7. From 6.3 onward the patch
is different as it needs to mention "folio", not "page".
Reported-and-tested-by: Jacek Tomaka <Jacek.Tomaka(a)poczta.fm>
Fixes: 6df25e58532b ("nfs: remove reliance on bdi congestion")
Signed-off-by: NeilBrown <neilb(a)suse.de>
---
fs/nfs/write.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index f41d24b54fd1..6a0606668417 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -667,8 +667,10 @@ static int nfs_writepage_locked(struct page *page,
int err;
if (wbc->sync_mode == WB_SYNC_NONE &&
- NFS_SERVER(inode)->write_congested)
+ NFS_SERVER(inode)->write_congested) {
+ redirty_page_for_writepage(wbc, page);
return AOP_WRITEPAGE_ACTIVATE;
+ }
nfs_inc_stats(inode, NFSIOS_VFSWRITEPAGE);
nfs_pageio_init_write(&pgio, inode, 0,
--
2.43.0
From: Max Krummenacher <max.krummenacher(a)toradex.com>
This reverts commit ef4a40953c8076626875ff91c41e210fcee7a6fd.
The commit was applied to make further commits apply cleanly, but the
commit depends on other commits in the same patchset. I.e. the
controlling DSI host would need a change too. Thus one would need to
backport the full patchset changing the DSI hosts and all downstream
DSI device drivers.
Revert the commit and fix up the conflicts with the backported fixes
to the lt8912b driver.
Signed-off-by: Max Krummenacher <max.krummenacher(a)toradex.com>
Conflicts:
drivers/gpu/drm/bridge/lontium-lt8912b.c
---
drivers/gpu/drm/bridge/lontium-lt8912b.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c b/drivers/gpu/drm/bridge/lontium-lt8912b.c
index 6891863ed5104..e16b0fc0cda0f 100644
--- a/drivers/gpu/drm/bridge/lontium-lt8912b.c
+++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c
@@ -571,6 +571,10 @@ static int lt8912_bridge_attach(struct drm_bridge *bridge,
if (ret)
goto error;
+ ret = lt8912_attach_dsi(lt);
+ if (ret)
+ goto error;
+
return 0;
error:
@@ -726,15 +730,8 @@ static int lt8912_probe(struct i2c_client *client,
drm_bridge_add(<->bridge);
- ret = lt8912_attach_dsi(lt);
- if (ret)
- goto err_attach;
-
return 0;
-err_attach:
- drm_bridge_remove(<->bridge);
- lt8912_free_i2c(lt);
err_i2c:
lt8912_put_dt(lt);
err_dt_parse:
--
2.42.0
Please consider the patches below for backporting to v6.1. They should
all apply cleanly in the given order.
These are prerequisites for NX compat support on x86, but the
remaining changes do not apply cleanly and will be sent as a patch
series at a later date.
By themselves, these changes not only constitute a reasonable cleanup,
they are also needed for future support of x86s [0] CPUs that are no
longer able to transition out of long mode.
Documentation/x86/boot.rst | 2 +-
arch/x86/Kconfig | 17 +
arch/x86/boot/compressed/Makefile | 8 +-
arch/x86/boot/compressed/efi_mixed.S | 383 +++++++++++++++++++
arch/x86/boot/compressed/efi_thunk_64.S | 195 ----------
arch/x86/boot/compressed/head_32.S | 25 +-
arch/x86/boot/compressed/head_64.S | 566 ++++++-----------------------
arch/x86/boot/compressed/mem_encrypt.S | 152 +++++++-
arch/x86/boot/compressed/misc.c | 34 +-
arch/x86/boot/compressed/misc.h | 2 -
arch/x86/boot/compressed/pgtable.h | 10 +-
arch/x86/boot/compressed/pgtable_64.c | 87 ++---
arch/x86/boot/header.S | 2 +-
arch/x86/boot/tools/build.c | 2 +
drivers/firmware/efi/efi.c | 22 ++
drivers/firmware/efi/libstub/alignedmem.c | 5 +-
drivers/firmware/efi/libstub/arm64-stub.c | 6 +-
drivers/firmware/efi/libstub/efistub.h | 6 +-
drivers/firmware/efi/libstub/mem.c | 3 +-
drivers/firmware/efi/libstub/randomalloc.c | 5 +-
drivers/firmware/efi/libstub/x86-stub.c | 53 ++-
drivers/firmware/efi/vars.c | 13 +-
include/linux/decompress/mm.h | 2 +-
23 files changed, 805 insertions(+), 795 deletions(-)
[0] https://www.intel.com/content/www/us/en/developer/articles/technical/envisi…
9cf42bca30e9 efi: libstub: use EFI_LOADER_CODE region when moving the
kernel in memory
cb8bda8ad443 x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S
e2ab9eab324c x86/boot/compressed: Move 32-bit entrypoint code into .text section
5c3a85f35b58 x86/boot/compressed: Move bootargs parsing out of 32-bit
startup code
91592b5c0c2f x86/boot/compressed: Move efi32_pe_entry into .text section
73a6dec80e2a x86/boot/compressed: Move efi32_entry out of head_64.S
7f22ca396778 x86/boot/compressed: Move efi32_pe_entry() out of head_64.S
4b52016247ae x86/boot/compressed, efi: Merge multiple definitions of
image_offset into one
630f337f0c4f x86/boot/compressed: Simplify IDT/GDT preserve/restore in
the EFI thunk
6aac80a8da46 x86/boot/compressed: Avoid touching ECX in
startup32_set_idt_entry()
d73a257f7f86 x86/boot/compressed: Pull global variable reference into
startup32_load_idt()
c6355995ba47 x86/boot/compressed: Move startup32_load_idt() into .text section
9ea813be3d34 x86/boot/compressed: Move startup32_load_idt() out of head_64.S
b5d854cd4b6a x86/boot/compressed: Move startup32_check_sev_cbit() into .text
9d7eaae6a071 x86/boot/compressed: Move startup32_check_sev_cbit() out
of head_64.S
30c9ca16a527 x86/boot/compressed: Adhere to calling convention in
get_sev_encryption_bit()
61de13df9590 x86/boot/compressed: Only build mem_encrypt.S if AMD_MEM_ENCRYPT=y
bad267f9e18f efi: verify that variable services are supported
0217a40d7ba6 efi: efivars: prevent double registration
cc3fdda2876e x86/efi: Make the deprecated EFI handover protocol optional
7734a0f31e99 x86/boot: Robustify calling startup_{32,64}() from the
decompressor code
d2d7a54f69b6 x86/efistub: Branch straight to kernel entry point from C code
df9215f15206 x86/efistub: Simplify and clean up handover entry code
127920645876 x86/decompressor: Avoid magic offsets for EFI handover entrypoint
d7156b986d4c x86/efistub: Clear BSS in EFI handover protocol entrypoint
8b63cba746f8 x86/decompressor: Store boot_params pointer in callee save register
00c6b0978ec1 x86/decompressor: Assign paging related global variables earlier
e8972a76aa90 x86/decompressor: Call trampoline as a normal function
918a7a04e717 x86/decompressor: Use standard calling convention for trampoline
bd328aa01ff7 x86/decompressor: Avoid the need for a stack in the
32-bit trampoline
64ef578b6b68 x86/decompressor: Call trampoline directly from C code
f97b67a773cd x86/decompressor: Only call the trampoline when changing
paging levels
cb83cece57e1 x86/decompressor: Pass pgtable address to trampoline directly
03dda95137d3 x86/decompressor: Merge trampoline cleanup with switching code
24388292e2d7 x86/decompressor: Move global symbol references to C code
8217ad0a435f decompress: Use 8 byte alignment
An anon THP page is first added to swap cache before reclaiming it.
Initially, each tail page contains the proper swap entry value(stored in
->private field) which is filled from add_to_swap_cache(). After
migrating the THP page sitting on the swap cache, only the swap entry of
the head page is filled(see folio_migrate_mapping()).
Now when this page is tried to split(one case is when this page is again
migrated, see migrate_pages()->try_split_thp()), the tail pages
->private is not stored with proper swap entry values. When this tail
page is now try to be freed, as part of it delete_from_swap_cache() is
called which operates on the wrong swap cache index and eventually
replaces the wrong swap cache index with shadow/NULL value, frees the
page.
This leads to the state with a swap cache containing the freed page.
This issue can manifest in many forms and the most common thing observed
is the rcu stall during the swapin (see mapping_get_entry()).
On the recent kernels, this issues is indirectly getting fixed with the
series[1], to be specific[2].
When tried to back port this series, it is observed many merge
conflicts and also seems dependent on many other changes. As backporting
to LTS branches is not a trivial one, the similar change from [2] is
picked as a fix.
[1] https://lore.kernel.org/all/20230821160849.531668-1-david@redhat.com/
[2] https://lore.kernel.org/all/20230821160849.531668-5-david@redhat.com/
Closes: https://lore.kernel.org/linux-mm/69cb784f-578d-ded1-cd9f-c6db04696336@quici…
Fixes: 3417013e0d18 ("mm/migrate: Add folio_migrate_mapping()")
Cc: <stable(a)vger.kernel.org> # see patch description, applicable to <=6.1
Signed-off-by: Charan Teja Kalla <quic_charante(a)quicinc.com>
---
mm/huge_memory.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5957794..cc5273f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2477,6 +2477,8 @@ static void __split_huge_page_tail(struct page *head, int tail,
if (!folio_test_swapcache(page_folio(head))) {
VM_WARN_ON_ONCE_PAGE(page_tail->private != 0, page_tail);
page_tail->private = 0;
+ } else {
+ set_page_private(page_tail, (unsigned long)head->private + tail);
}
/* Page flags must be visible before we make the page non-compound. */
--
2.7.4
Hi Greg and Sasha,
Please apply upstream commit 5b750b22530f ("drm/amd/display: Increase
frame warning limit with KASAN or KCSAN in dml") to linux-6.1.y, as it
is needed to avoid instances of -Wframe-larger-than in allmodconfig,
which has -Werror enabled. It applies cleanly for me and it is already
in 6.6 and 6.7. The fixes tag is not entirely accurate and commit
e63e35f0164c ("drm/amd/display: Increase frame-larger-than for all
display_mode_vba files"), which was recently applied to that tree,
depends on it (I should have made that clearer in the patch).
If there are any issues, please let me know.
Cheers,
Nathan
Hi stable maintainers,
The following patch in mainline is listed as a fix for CVE-2023-2176:
8d037973d48c026224ab285e6a06985ccac6f7bf (RDMA/core: Refactor rdma_bind_addr)
And the following is a fix for a regression in the above patch:
0e15863015d97c1ee2cc29d599abcc7fa2dc3e95 (RDMA/core: Update CMA destination address on rdma_resolve_addr)
To my knowledge, at least back to v6.1 is vulnerable to this same bug.
Since these should apply directly to 6.1.y, can these be picked up for that branch?
Regards,
Brennan