Hello,
The following change seems to cause kexec to sometimes fail (not every
time but about 50% chance) for the stable 6.1 kernels, 6.1.129 and later:
6821918f4519 ("x86/kexec: Allocate PGD for x86_64 transition page tables
separately")
The commit message for that commit states that it is dependent on
another change "x86/mm: Add _PAGE_NOPTISHADOW bit to avoid updating
userspace page tables" but that change does not seem to have been done
in the 6.1 kernel series, which could explain why the 6821918f4519
change causes problems for 6.1.
This appears to be a problem only for 6.1, for the 6.6 and later stable
kernels there is no problem.
I think the reason this problem is seen only for 6.1 and not for 6.6 and
later is that the change "x86/kexec: Allocate PGD for x86_64 transition
page tables separately" relies on things that are not available in 6.1.
In the tests I have done, kexec is called via u-root.
More details are available here:
https://git.glasklar.is/system-transparency/core/stboot/-/issues/227
Cheers,
Elias
Hello,
Outdated or inaccurate contact data wastes valuable time. We provide highly
accurate, *verified contact lists of Engineers, Manufacturers and
Purchasing Managers Contacts*, ensuring you connect with the right people.
Would you like to see a sample? If so, please let me know your *target
geography* I’ll be happy to share details.
Just a note that we can customize the lists to meet any specific
requirements based on your criteria.
Wishing you a wonderful day!
Debbie McCoy
Manager, Demand Generation
If you ever feel like opting out, simply reply with "remove me" in the
subject line
Hello all,
I encountered an issue with the CPU affinity of tasks launched by
systemd in a
slice, after updating from systemd 254 to by systemd >= 256, on the LTS
5.15.x
branch (tested on v5.15.179).
Despite the slice file stipulating AllowedCPUS=2 (and confirming this
was set in
/sys/fs/cgroup/test.slice/cpuset.cpus) tasks launched in the slice would
have
the CPU affinity of the system.slice (i.e all by default) rather than 2.
To reproduce:
* Check kernel version and systemd version (I used a debian testing
image for
testing)
```
# uname -r
5.15.179
# systemctl --version
systemd 257 (257.4-3)
...
```
* Create a test.slice with AllowedCPUS=2
```
# cat <<EOF > /usr/lib/systemd/system/test.slice
[Unit]
Description=Test slice
Before=slices.target
[Slice]
AllowedCPUs=2
[Install]
WantedBy=slices.target
EOF
# systemctl daemon-reload && systemctl start test.slice
```
* Confirm cpuset
```
# cat /sys/fs/cgroup/test.slice/cpuset.cpus
2
```
* Launch task in slice
```
# systemd-run --slice test.slice yes
Running as unit: run-r9187b97c6958498aad5bba213289ac56.service;
invocation ID:
f470f74047ac43b7a60861d03a7ef6f9
# cat
/sys/fs/cgroup/test.slice/run-r9187b97c6958498aad5bba213289ac56.service/cgroup.procs
317
```
# Check affinity
```
# taskset -pc 317
pid 317's current affinity list: 0-7
```
This issue is fixed by applying upstream commits:
18f9a4d47527772515ad6cbdac796422566e6440
cgroup/cpuset: Skip spread flags update on v2
and
42a11bf5c5436e91b040aeb04063be1710bb9f9c
cgroup/cpuset: Make cpuset_fork() handle CLONE_INTO_CGROUP properly
With these applied:
```
# systemd-run --slice test.slice yes
Running as unit: run-r442c444559ff49f48c6c2b8325b3b500.service;
invocation ID:
5211167267154e9292cb6b854585cb91
# cat
/sys/fs/cgroup/test.slice/run-r442c444559ff49f48c6c2b8325b3b500.service/cgroup.procs
291
# taskset -pc 291
pid 291's current affinity list: 2
```
Perhaps these are a good candidate for backport onto the 5.15 LTS
branch?
Thanks
James
When get_block is called with a buffer_head allocated on the stack, such
as do_mpage_readpage, stack corruption due to buffer_head UAF may occur in
the following race condition situation.
<CPU 0> <CPU 1>
mpage_read_folio
<<bh on stack>>
do_mpage_readpage
exfat_get_block
bh_read
__bh_read
get_bh(bh)
submit_bh
wait_on_buffer
...
end_buffer_read_sync
__end_buffer_read_notouch
unlock_buffer
<<keep going>>
...
...
...
...
<<bh is not valid out of mpage_read_folio>>
.
.
another_function
<<variable A on stack>>
put_bh(bh)
atomic_dec(bh->b_count)
* stack corruption here *
This patch returns -EAGAIN if a folio does not have buffers when bh_read
needs to be called. By doing this, the caller can fallback to functions
like block_read_full_folio(), create a buffer_head in the folio, and then
call get_block again.
Let's do not call bh_read() with on-stack buffer_head.
Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength")
Cc: stable(a)vger.kernel.org
Tested-by: Yeongjin Gil <youngjin.gil(a)samsung.com>
Signed-off-by: Sungjong Seo <sj1557.seo(a)samsung.com>
---
v2:
- clear_buffer_mapped if there is any errors.
- remove unnecessary BUG_ON()
---
fs/exfat/inode.c | 39 +++++++++++++++++++++++++++++++++------
1 file changed, 33 insertions(+), 6 deletions(-)
diff --git a/fs/exfat/inode.c b/fs/exfat/inode.c
index 96952d4acb50..f3fdba9f4d21 100644
--- a/fs/exfat/inode.c
+++ b/fs/exfat/inode.c
@@ -344,7 +344,8 @@ static int exfat_get_block(struct inode *inode, sector_t iblock,
* The block has been partially written,
* zero the unwritten part and map the block.
*/
- loff_t size, off, pos;
+ loff_t size, pos;
+ void *addr;
max_blocks = 1;
@@ -355,17 +356,41 @@ static int exfat_get_block(struct inode *inode, sector_t iblock,
if (!bh_result->b_folio)
goto done;
+ /*
+ * No buffer_head is allocated.
+ * (1) bmap: It's enough to fill bh_result without I/O.
+ * (2) read: The unwritten part should be filled with 0
+ * If a folio does not have any buffers,
+ * let's returns -EAGAIN to fallback to
+ * per-bh IO like block_read_full_folio().
+ */
+ if (!folio_buffers(bh_result->b_folio)) {
+ err = -EAGAIN;
+ goto done;
+ }
+
pos = EXFAT_BLK_TO_B(iblock, sb);
size = ei->valid_size - pos;
- off = pos & (PAGE_SIZE - 1);
+ addr = folio_address(bh_result->b_folio) +
+ offset_in_folio(bh_result->b_folio, pos);
+
+ /* Check if bh->b_data points to proper addr in folio */
+ if (bh_result->b_data != addr) {
+ exfat_fs_error_ratelimit(sb,
+ "b_data(%p) != folio_addr(%p)",
+ bh_result->b_data, addr);
+ err = -EINVAL;
+ goto done;
+ }
- folio_set_bh(bh_result, bh_result->b_folio, off);
+ /* Read a block */
err = bh_read(bh_result, 0);
if (err < 0)
- goto unlock_ret;
+ goto done;
- folio_zero_segment(bh_result->b_folio, off + size,
- off + sb->s_blocksize);
+ /* Zero unwritten part of a block */
+ memset(bh_result->b_data + size, 0,
+ bh_result->b_size - size);
} else {
/*
* The range has not been written, clear the mapped flag
@@ -376,6 +401,8 @@ static int exfat_get_block(struct inode *inode, sector_t iblock,
}
done:
bh_result->b_size = EXFAT_BLK_TO_B(max_blocks, sb);
+ if (err < 0)
+ clear_buffer_mapped(bh_result);
unlock_ret:
mutex_unlock(&sbi->s_lock);
return err;
--
2.25.1