Greg,
A Gentoo user found an issue with btrfs where:
"... upstream commit a5d74fa24752
("btrfs: avoid unnecessary device path update for the same device") breaks
6.6 LTS kernel behavior where previously lsblk can properly show multiple
mount points of the same btrfs"
There's a revert on the bug report [1].
The fix by the author can be found on linux-btrfs [2]
[1] https://bugs.gentoo.org/947126
[2] https://lore.kernel.org/linux-btrfs/30aefd8b4e8c1f0c5051630b106a1ff3570d28e…
Mike
--
Mike Pagano
Gentoo Developer - Kernel Project
E-Mail : mpagano(a)gentoo.org
GnuPG FP : 52CC A0B0 F631 0B17 0142 F83F 92A6 DBEC 81F2 B137
Public Key : http://pgp.mit.edu/pks/lookup?search=0x92A6DBEC81F2B137&op=index
The field "eip" (instruction pointer) and "esp" (stack pointer) of a task
can be read from /proc/PID/stat. These fields can be interesting for
coredump.
However, these fields were disabled by commit 0a1eb2d474ed ("fs/proc: Stop
reporting eip and esp in /proc/PID/stat"), because it is generally unsafe
to do so. But it is safe for a coredumping process, and therefore
exceptions were made:
- for a coredumping thread by commit fd7d56270b52 ("fs/proc: Report
eip/esp in /prod/PID/stat for coredumping").
- for all other threads in a coredumping process by commit cb8f381f1613
("fs/proc/array.c: allow reporting eip/esp for all coredumping
threads").
The above two commits check the PF_DUMPCORE flag to determine a coredump thread
and the PF_EXITING flag for the other threads.
Unfortunately, commit 92307383082d ("coredump: Don't perform any cleanups
before dumping core") moved coredump to happen earlier and before PF_EXITING is
set. Thus, checking PF_EXITING is no longer the correct way to determine
threads in a coredumping process.
Instead of PF_EXITING, use PF_POSTCOREDUMP to determine the other threads.
Checking of PF_EXITING was added for coredumping, so it probably can now be
removed. But it doesn't hurt to keep.
Fixes: 92307383082d ("coredump: Don't perform any cleanups before dumping core")
Cc: stable(a)vger.kernel.org
Cc: Eric W. Biederman <ebiederm(a)xmission.com>
Acked-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
---
fs/proc/array.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 55ed3510d2bb..d6a0369caa93 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -500,7 +500,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
* a program is not able to use ptrace(2) in that case. It is
* safe because the task has stopped executing permanently.
*/
- if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE))) {
+ if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE|PF_POSTCOREDUMP))) {
if (try_get_task_stack(task)) {
eip = KSTK_EIP(task);
esp = KSTK_ESP(task);
--
2.39.5
Hi, I'm experiencing UBSAN array-index-out-of-bounds errors while using
my Framework 13" AMD laptop with its Mediatek MT7922 wifi adapter
(mt7921e).
It seems to happen only once on boot, and occurs with both kernel
versions 6.12.7 and 6.13-rc4, both compiled from vanilla upstream kernel
sources on Fedora 41 using the kernel.org LLVM toolchain (19.1.6).
I can try some other kernel series if necessary, and also a bisect if I
find a working version, but that may take me a while.
I wasn't sure if I should mark this as a regression, as I'm not sure
which/if there is a working kernel version at this point.
Thanks.
----
[ 17.754417] UBSAN: array-index-out-of-bounds in /data/linux/net/wireless/scan.c:766:2
[ 17.754423] index 0 is out of range for type 'struct ieee80211_channel *[] __counted_by(n_channels)' (aka 'struct ieee80211_channel *[]')
[ 17.754427] CPU: 13 UID: 0 PID: 620 Comm: kworker/u64:10 Tainted: G T 6.13.0-rc4 #9
[ 17.754433] Tainted: [T]=RANDSTRUCT
[ 17.754435] Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.05 03/29/2024
[ 17.754438] Workqueue: events_unbound cfg80211_wiphy_work
[ 17.754446] Call Trace:
[ 17.754449] <TASK>
[ 17.754452] dump_stack_lvl+0x82/0xc0
[ 17.754459] __ubsan_handle_out_of_bounds+0xe7/0x110
[ 17.754464] ? srso_alias_return_thunk+0x5/0xfbef5
[ 17.754470] ? __kmalloc_noprof+0x1a7/0x280
[ 17.754477] cfg80211_scan_6ghz+0x3bb/0xfd0
[ 17.754482] ? srso_alias_return_thunk+0x5/0xfbef5
[ 17.754486] ? try_to_wake_up+0x368/0x4c0
[ 17.754491] ? try_to_wake_up+0x1a9/0x4c0
[ 17.754496] ___cfg80211_scan_done+0xa9/0x1e0
[ 17.754500] cfg80211_wiphy_work+0xb7/0xe0
[ 17.754504] process_scheduled_works+0x205/0x3a0
[ 17.754509] worker_thread+0x24a/0x300
[ 17.754514] ? __cfi_worker_thread+0x10/0x10
[ 17.754519] kthread+0x158/0x180
[ 17.754524] ? __cfi_kthread+0x10/0x10
[ 17.754528] ret_from_fork+0x40/0x50
[ 17.754534] ? __cfi_kthread+0x10/0x10
[ 17.754538] ret_from_fork_asm+0x11/0x30
[ 17.754544] </TASK>
The following call-chain leads to misuse of spinlock_irq
when spinlock_irqsave was hold.
irq_set_vcpu_affinity
-> irq_get_desc_lock (spinlock_irqsave)
-> its_irq_set_vcpu_affinity
-> guard(raw_spin_lock_irq) <--- this enables interrupts
-> irq_put_desc_unlock // <--- WARN IRQs enabled
Fix the issue by using guard(raw_spinlock), since the function is
already called with irqsave and raw_spin_lock was used before the commit
b97e8a2f7130 ("irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()")
introducing the guard as well.
This was discovered through the lock debugging, and the corresponding
log is as follows:
raw_local_irq_restore() called with IRQs enabled
WARNING: CPU: 38 PID: 444 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x2c/0x38
Call trace:
warn_bogus_irq_restore+0x2c/0x38
_raw_spin_unlock_irqrestore+0x68/0x88
__irq_put_desc_unlock+0x1c/0x48
irq_set_vcpu_affinity+0x74/0xc0
its_map_vlpi+0x44/0x88
kvm_vgic_v4_set_forwarding+0x148/0x230
kvm_arch_irq_bypass_add_producer+0x20/0x28
__connect+0x98/0xb8
irq_bypass_register_consumer+0x150/0x178
kvm_irqfd+0x6dc/0x744
kvm_vm_ioctl+0xe44/0x16b0
Fixes: b97e8a2f7130 ("irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()")
Signed-off-by: Tomas Krcka <krckatom(a)amazon.de>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
drivers/irqchip/irq-gic-v3-its.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 92244cfa0464..8c3ec5734f1e 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -2045,7 +2045,7 @@ static int its_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu_info)
if (!is_v4(its_dev->its))
return -EINVAL;
- guard(raw_spinlock_irq)(&its_dev->event_map.vlpi_lock);
+ guard(raw_spinlock)(&its_dev->event_map.vlpi_lock);
/* Unmap request? */
if (!info)
--
2.40.1
Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
On 28/12/2024 15:20, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> watchdog: s3c2410_wdt: add support for exynosautov920 SoC
>
> to the 6.12-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> watchdog-s3c2410_wdt-add-support-for-exynosautov920-.patch
> and it can be found in the queue-6.12 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
This is a new device support, depending on several other pieces like
syscon support, Exynos PMU and of course arm64/boot/dts patches. It
really relies on the rest of SoC support and alone is useless.
This is way beyond simple quirks thus I believe it does not meet stable
criteria at all.
Based on that I recommend to drop this patch from stable backports.
Best regards,
Krzysztof
From: Peilin He<he.peilin(a)zte.com.cn>
Upstream commit 6c24a03a61a2 ("net: dsa: improve shutdown sequence")
Issue
=====
Repeatedly accessing the DSA Ethernet controller via the ethtool command,
followed by a system reboot, may trigger a DSA null pointer dereference,
causing a kernel panic and preventing the system from rebooting properly.
This can lead to data loss or denial-of-service, resulting in serious
consequences.
The following is the panic log:
[ 172.523467] Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000020
[ 172.706923] Call trace:
[ 172.709371] dsa_master_get_sset_count+0x24/0xa4
[ 172.714000] ethtool_get_drvinfo+0x8c/0x210
[ 172.718193] dev_ethtool+0x780/0x2120
[ 172.721863] dev_ioctl+0x1b0/0x580
[ 172.725273] sock_do_ioctl+0xc0/0x100
[ 172.728944] sock_ioctl+0x130/0x3c0
[ 172.732440] __arm64_sys_ioctl+0xb4/0x100
[ 172.736460] invoke_syscall+0x50/0x120
[ 172.740219] el0_svc_common.constprop.0+0x4c/0xf4
[ 172.744936] do_el0_svc+0x2c/0xa0
[ 172.748257] el0_svc+0x20/0x60
[ 172.751318] el0t_64_sync_handler+0xe8/0x114
[ 172.755599] el0t_64_sync+0x180/0x184
[ 172.759271] Code: a90153f3 2a0103f4 a9025bf5 f9418015 (f94012b6)
[ 172.765383] ---[ end trace 0000000000000002 ]---
Root Cause
==========
Based on analysis of the Linux 5.15 stable version, the function
dsa_master_get_sset_count() accesses members of the structure pointed
to by cpu_dp without checking for a null pointer. If cpu_dp is a
null pointer, this will cause a kernel panic.
static int dsa_master_get_sset_count(struct net_device *dev, int sset)
{
struct dsa_port *cpu_dp = dev->dsa_ptr;
const struct ethtool_ops *ops = cpu_dp->orig_ethtool_ops;
struct dsa_switch *ds = cpu_dp->ds;
...
}
dev->dsa_ptr is set to NULL in the dsa_switch_shutdown() or
dsa_master_teardown() functions. When the DSA module unloads,
dsa_master_ethtool_teardown(dev) restores the original copy of
the DSA device's ethtool_ops using "dev->ethtool_ops =
cpu_dp->orig_ethtool_ops;" before setting dev->dsa_ptr to NULL.
This ensures that ethtool_ops remains accessible after DSA unloads.
However, dsa_switch_shutdown does not restore the original copy of
the DSA device's ethtool_ops, potentially leading to a null pointer
dereference of dsa_ptr and causing a system panic. Essentially,
when we set master->dsa_ptr to NULL, we need to ensure that
no user ports are making requests to the DSA driver.
Solution
========
The addition of the netif_device_detach() function is to ensure that
ioctls, rtnetlinks and ethtool requests on the user ports no longer
propagate down to the driver - we're no longer prepared to handle them.
Fixes: ee534378f005 ("net: dsa: fix panic when DSA master device unbinds on shutdown")
Suggested-by: Vladimir Oltean <vladimir.oltean(a)nxp.com>
Signed-off-by: Peilin He <he.peilin(a)zte.com.cn>
Reviewed-by: xu xin <xu.xin16(a)zte.com.cn>
Signed-off-by: Kun Jiang <jiang.kun2(a)zte.com.cn>
Cc: Fan Yu <fan.yu9(a)zte.com.cn>
Cc: Yutan Qiu <qiu.yutan(a)zte.com.cn>
Cc: Yaxin Wang <wang.yaxin(a)zte.com.cn>
Cc: tuqiang <tu.qiang35(a)zte.com.cn>
Cc: Yang Yang <yang.yang29(a)zte.com.cn>
Cc: ye xingchen <ye.xingchen(a)zte.com.cn>
Cc: Yunkai Zhang <zhang.yunkai(a)zte.com.cn>
---
net/dsa/dsa2.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 543834e31298..bf384b30ec0a 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -1656,6 +1656,7 @@ EXPORT_SYMBOL_GPL(dsa_unregister_switch);
void dsa_switch_shutdown(struct dsa_switch *ds)
{
struct net_device *master, *slave_dev;
+ LIST_HEAD(close_list);
struct dsa_port *dp;
mutex_lock(&dsa2_mutex);
@@ -1665,6 +1666,11 @@ void dsa_switch_shutdown(struct dsa_switch *ds)
rtnl_lock();
+ dsa_switch_for_each_cpu_port(dp, ds)
+ list_add(&dp->master->close_list, &close_list);
+
+ dev_close_many(&close_list, true);
+
list_for_each_entry(dp, &ds->dst->ports, list) {
if (dp->ds != ds)
continue;
@@ -1675,6 +1681,7 @@ void dsa_switch_shutdown(struct dsa_switch *ds)
master = dp->cpu_dp->master;
slave_dev = dp->slave;
+ netif_device_detach(slave_dev);
netdev_upper_dev_unlink(master, slave_dev);
}
--
2.25.1
From: Jennifer Berringer <jberring(a)redhat.com>
When __nvmem_cell_entry_write() is called for an nvmem cell that does
not need bit shifting, it requires that the len parameter exactly
matches the nvmem cell size. However, when the nvmem cell has a nonzero
bit_offset, it was skipping this check.
Accepting values of len larger than the cell size results in
nvmem_cell_prepare_write_buffer() trying to write past the end of a heap
buffer that it allocates. Add a check to avoid that problem and instead
return -EINVAL when len doesn't match the number of bits expected by the
nvmem cell when bit_offset is nonzero.
This check uses cell->nbits in order to allow providing the smaller size
to cells that are shifted into another byte by bit_offset. For example,
a cell with nbits=8 and nonzero bit_offset would have bytes=2 but should
accept a 1-byte write here, although no current callers depend on this.
Fixes: 69aba7948cbe ("nvmem: Add a simple NVMEM framework for consumers")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jennifer Berringer <jberring(a)redhat.com>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla(a)linaro.org>
---
drivers/nvmem/core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index d6494dfc20a7..845540b57e68 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -1790,6 +1790,8 @@ static int __nvmem_cell_entry_write(struct nvmem_cell_entry *cell, void *buf, si
return -EINVAL;
if (cell->bit_offset || cell->nbits) {
+ if (len != BITS_TO_BYTES(cell->nbits) && len != cell->bytes)
+ return -EINVAL;
buf = nvmem_cell_prepare_write_buffer(cell, buf, len);
if (IS_ERR(buf))
return PTR_ERR(buf);
--
2.25.1
On 6.12 there is a kernel crash during the release of btusb Mediatek
device.
list_del corruption, ffff8aae1f024000->next is LIST_POISON1 (dead000000000100)
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:56!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 UID: 0 PID: 3770 Comm: qemu-system-x86 Tainted: G W 6.12.5-200.fc41.x86_64 #1
Tainted: [W]=WARN
Hardware name: ASUS System Product Name/PRIME X670E-PRO WIFI, BIOS 3035 09/05/2024
RIP: 0010:__list_del_entry_valid_or_report.cold+0x5c/0x6f
Call Trace:
<TASK>
hci_unregister_dev+0x46/0x1f0 [bluetooth]
btusb_disconnect+0x67/0x170 [btusb]
usb_unbind_interface+0x95/0x2d0
device_release_driver_internal+0x19c/0x200
proc_ioctl+0x1be/0x230
usbdev_ioctl+0x6bd/0x1430
__x64_sys_ioctl+0x91/0xd0
do_syscall_64+0x82/0x160
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Note: Taint is due to the amdgpu warnings, totally unrelated to the
issue.
The bug has been fixed "silently" in upstream with the following series
of 4 commits [1]:
ad0c6f603bb0 ("Bluetooth: btusb: mediatek: move Bluetooth power off command position")
cea1805f165c ("Bluetooth: btusb: mediatek: add callback function in btusb_disconnect")
489304e67087 ("Bluetooth: btusb: mediatek: add intf release flow when usb disconnect")
defc33b5541e ("Bluetooth: btusb: mediatek: change the conditions for ISO interface")
These commits can be cleanly cherry-picked to 6.12.y and I may confirm
they fix the problem.
FWIW, the offending commit is ceac1cb0259d ("Bluetooth: btusb: mediatek:
add ISO data transmission functions") and it is present in 6.11.y and
6.12.y.
6.11.y is EOL, so please apply the patches to 6.12.y.
[1]: https://lore.kernel.org/linux-bluetooth/20240923084705.14123-1-chris.lu@med…
--
Thanks,
Fedor