Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
allowed newer ASICs to mix GTT and VRAM. That change also noted that
some older boards, such as Stoney and Carrizo, do not support this.
It appears that at least one additional ASIC, Raven, does not support it
either.
We observed this issue when migrating a device from a 5.4 to 6.6 kernel
and have confirmed that Raven also needs to be excluded from mixing GTT
and VRAM.
Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
Cc: Luben Tuikov <luben.tuikov(a)amd.com>
Cc: Christian König <christian.koenig(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 6.1+
Tested-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
Signed-off-by: Brian Geffon <bgeffon(a)google.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 73403744331a..5d7f13e25b7c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
uint32_t domain)
{
if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
- ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
+ ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
+ (adev->asic_type == CHIP_RAVEN))) {
domain = AMDGPU_GEM_DOMAIN_VRAM;
if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
domain = AMDGPU_GEM_DOMAIN_GTT;
--
2.50.0.727.gbf7dc18ff4-goog
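For readers skimming the hunk above, the selection rule can be sketched in isolation. This is a simplified user-space illustration only: the enum, the DOMAIN_* constants, the SG_THRESHOLD value and the function names are stand-ins made up for the example, not the driver's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's definitions (illustration only). */
enum chip { CHIP_CARRIZO, CHIP_STONEY, CHIP_RAVEN, CHIP_OTHER };
#define DOMAIN_GTT   0x2u
#define DOMAIN_VRAM  0x4u
#define SG_THRESHOLD (256ull << 20)	/* assumed value for the example */

/* ASICs that, per the commit message, cannot mix GTT and VRAM. */
static int chip_needs_single_domain(enum chip c)
{
	return c == CHIP_CARRIZO || c == CHIP_STONEY || c == CHIP_RAVEN;
}

/* Collapse a mixed VRAM|GTT request to one domain on the affected ASICs. */
static uint32_t preferred_domain(enum chip c, uint64_t real_vram_size,
				 uint32_t domain)
{
	if (domain == (DOMAIN_VRAM | DOMAIN_GTT) &&
	    chip_needs_single_domain(c)) {
		domain = DOMAIN_VRAM;
		if (real_vram_size <= SG_THRESHOLD)
			domain = DOMAIN_GTT;
	}
	return domain;
}
```

With these assumptions, a Raven device with a small VRAM carve-out ends up in GTT only, a Raven device with more VRAM ends up in VRAM only, and other ASICs keep the mixed request unchanged.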
The process of adding an I2C adapter can invoke I2C accesses on that new
adapter (see i2c_detect()).
Ensure we have set the adapter's driver data to avoid null pointer
dereferences in the xfer functions during the adapter add.
This has been noted in the past and the same fix proposed but not
completed. See:
https://lore.kernel.org/lkml/ef597e73-ed71-168e-52af-0d19b03734ac@vigem.de/
Signed-off-by: Hamish Martin <hamish.martin(a)alliedtelesis.co.nz>
Signed-off-by: Jiri Kosina <jkosina(a)suse.cz>
Signed-off-by: Sumanth Gavini <sumanth.gavini(a)yahoo.com>
---
drivers/hid/hid-mcp2221.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hid/hid-mcp2221.c b/drivers/hid/hid-mcp2221.c
index de52e9f7bb8c..9973545c1c4b 100644
--- a/drivers/hid/hid-mcp2221.c
+++ b/drivers/hid/hid-mcp2221.c
@@ -873,12 +873,12 @@ static int mcp2221_probe(struct hid_device *hdev,
"MCP2221 usb-i2c bridge on hidraw%d",
((struct hidraw *)hdev->hidraw)->minor);
+ i2c_set_adapdata(&mcp->adapter, mcp);
ret = i2c_add_adapter(&mcp->adapter);
if (ret) {
hid_err(hdev, "can't add usb-i2c adapter: %d\n", ret);
goto err_i2c;
}
- i2c_set_adapdata(&mcp->adapter, mcp);
/* Setup GPIO chip */
mcp->gc = devm_kzalloc(&hdev->dev, sizeof(*mcp->gc), GFP_KERNEL);
--
2.43.0
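The ordering problem described above can be sketched with a miniature, user-space model. Everything here (struct adapter, struct bridge, add_adapter(), probe()) is hypothetical and only mimics the shape of the real code: registration may call the transfer callback immediately, so the driver data has to be in place first.

```c
#include <stddef.h>

/* Hypothetical miniature of an adapter whose registration probes the bus. */
struct adapter {
	void *drvdata;
	int (*xfer)(struct adapter *adap);
};

struct bridge {
	int transfers;
};

/* xfer callback: relies on the driver data, like a real xfer routine. */
static int bridge_xfer(struct adapter *adap)
{
	struct bridge *b = adap->drvdata;

	if (!b)
		return -1;	/* the real driver would dereference NULL here */
	b->transfers++;
	return 0;
}

/* Registration may issue transfers right away (cf. i2c_detect()). */
static int add_adapter(struct adapter *adap)
{
	return adap->xfer(adap);
}

static int probe(struct bridge *b, struct adapter *adap, int set_data_first)
{
	int ret;

	adap->xfer = bridge_xfer;
	adap->drvdata = NULL;
	if (set_data_first)
		adap->drvdata = b;	/* order used by the patch */
	ret = add_adapter(adap);
	if (!set_data_first)
		adap->drvdata = b;	/* old order: too late, xfer already ran */
	return ret;
}
```

In this model the old order makes the first transfer fail, whereas setting the data before registration lets it succeed.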
Changes from v1:
- Updated comment for nvmet_pci_epf_queue_response() per Damien's suggestion.
- Fixed typo in commit message.
- Added 3 tags in commit message:
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver")
Cc: stable(a)vger.kernel.org
Best regards,
Rick
Rick Wertenbroek (1):
nvmet: pci-epf: Do not complete commands twice if nvmet_req_init()
fails
drivers/nvme/target/pci-epf.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
--
2.25.1
From: Yang Xiwen <forbidden405(a)outlook.com>
The original logic only sets the return value but doesn't jump out of the
loop if the bus is kept active by a client. This is not expected. A
malicious or buggy i2c client can hang the kernel in this case, which
should be avoided. This was observed during a long-running test with a
PCA953x GPIO extender.
Fix it by changing the logic to not only set the return value, but also
jump out of the loop and return to the caller with -ETIMEDOUT.
Cc: stable(a)vger.kernel.org
Signed-off-by: Yang Xiwen <forbidden405(a)outlook.com>
---
drivers/i2c/busses/i2c-qup.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c
index 3a36d682ed57..5b053e51f4c9 100644
--- a/drivers/i2c/busses/i2c-qup.c
+++ b/drivers/i2c/busses/i2c-qup.c
@@ -452,8 +452,10 @@ static int qup_i2c_bus_active(struct qup_i2c_dev *qup, int len)
if (!(status & I2C_STATUS_BUS_ACTIVE))
break;
- if (time_after(jiffies, timeout))
+ if (time_after(jiffies, timeout)) {
ret = -ETIMEDOUT;
+ break;
+ }
usleep_range(len, len * 2);
}
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250615-qca-i2c-d41bb61aa59e
Best regards,
--
Yang Xiwen <forbidden405(a)outlook.com>
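The loop shape fixed by the qup patch above can be sketched in user space. The fake clock and bus state below (struct fake_bus, bus_active(), wait_bus_idle()) are invented for the illustration and are not part of the driver; the point is only that the timeout branch must leave the loop as well as set the error code.

```c
#include <errno.h>

/* Hypothetical fake clock and bus state, for illustration only. */
struct fake_bus {
	unsigned long now;	/* current "jiffies" */
	unsigned long idle_at;	/* time at which the bus goes idle */
};

static int bus_active(const struct fake_bus *bus)
{
	return bus->now < bus->idle_at;
}

/* Poll until idle; give up with -ETIMEDOUT once the deadline has passed. */
static int wait_bus_idle(struct fake_bus *bus, unsigned long timeout_ticks)
{
	unsigned long deadline = bus->now + timeout_ticks;
	int ret = 0;

	for (;;) {
		if (!bus_active(bus))
			break;

		if (bus->now > deadline) {
			ret = -ETIMEDOUT;
			break;	/* without this, a stuck bus loops forever */
		}

		bus->now++;	/* stands in for usleep_range() */
	}

	return ret;
}
```

A bus that goes idle before the deadline yields 0, and a bus that a client holds active indefinitely yields -ETIMEDOUT instead of spinning.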
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
On g4x we currently use the 96MHz non-SSC refclk, which can't actually
generate an exact 2.7 Gbps link rate. In practice we end up with 2.688
Gbps which seems to be close enough to actually work, but link training
is currently failing due to miscalculating the DP_LINK_BW value (we
calculate it directly from port_clock which reflects the actual PLL
output frequency).
Ideas how to fix this:
- nudge port_clock back up to 270000 during PLL computation/readout
- track port_clock and the nominal link rate separately so they might
differ a bit
- switch to the 100MHz refclk, but that one should be SSC so perhaps
not something we want
While we ponder a better solution, apply a band-aid to the
immediate issue of the miscalculated DP_LINK_BW value. With this
I can again use the 2.7 Gbps link rate on g4x.
Cc: stable(a)vger.kernel.org
Fixes: 665a7b04092c ("drm/i915: Feed the DPLL output freq back into crtc_state")
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_dp.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index f48912f308df..7976fec88606 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -1606,6 +1606,12 @@ int intel_dp_rate_select(struct intel_dp *intel_dp, int rate)
void intel_dp_compute_rate(struct intel_dp *intel_dp, int port_clock,
u8 *link_bw, u8 *rate_select)
{
+ struct intel_display *display = to_intel_display(intel_dp);
+
+ /* FIXME g4x can't generate an exact 2.7GHz with the 96MHz non-SSC refclk */
+ if (display->platform.g4x && port_clock == 268800)
+ port_clock = 270000;
+
/* eDP 1.4 rate select method. */
if (intel_dp->use_rate_select) {
*link_bw = 0;
--
2.49.0
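The arithmetic behind the g4x miscalculation can be worked through in a few lines. This sketch assumes the bandwidth code is obtained by integer division of port_clock by 27000, matching the DPCD encoding in units of 0.27 Gbps where 0x0a means 2.7 Gbps; link_bw_code() and nominal_port_clock() are hypothetical helper names, not functions from the driver.

```c
/*
 * Assumption for this sketch: the link bandwidth code is port_clock / 27000
 * (integer division), so the nominal 270000 maps to 0x0a.
 */
static unsigned int link_bw_code(int port_clock)
{
	return (unsigned int)port_clock / 27000;
}

/* Band-aid from the patch: treat the 268800 PLL output as nominal 270000. */
static int nominal_port_clock(int is_g4x, int port_clock)
{
	if (is_g4x && port_clock == 268800)
		return 270000;
	return port_clock;
}
```

Under that assumption 268800 / 27000 truncates to 9, which is not a valid code for 2.7 Gbps, while the nudged value gives 10 (0x0a).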
Having a Gemalto Cinterion PLS83-W modem attached to USB and activating the
cellular data link would sometimes yield the following RCU stall, leading
to a system freeze:
rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 0-.... } 33108 jiffies s: 201 root: 0x1/.
rcu: blocking rcu_node structures (internal RCU debug):
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
Call trace:
arch_local_irq_enable+0x4/0x8
local_bh_enable+0x18/0x20
__netdev_alloc_skb+0x18c/0x1cc
rx_submit+0x68/0x1f8 [usbnet]
rx_alloc_submit+0x4c/0x74 [usbnet]
usbnet_bh+0x1d8/0x218 [usbnet]
usbnet_bh_tasklet+0x10/0x18 [usbnet]
tasklet_action_common+0xa8/0x110
tasklet_action+0x2c/0x34
handle_softirqs+0x2cc/0x3a0
__do_softirq+0x10/0x18
____do_softirq+0xc/0x14
call_on_irq_stack+0x24/0x34
do_softirq_own_stack+0x18/0x20
__irq_exit_rcu+0xa8/0xb8
irq_exit_rcu+0xc/0x30
el1_interrupt+0x34/0x48
el1h_64_irq_handler+0x14/0x1c
el1h_64_irq+0x68/0x6c
_raw_spin_unlock_irqrestore+0x38/0x48
xhci_urb_dequeue+0x1ac/0x45c [xhci_hcd]
unlink1+0xd4/0xdc [usbcore]
usb_hcd_unlink_urb+0x70/0xb0 [usbcore]
usb_unlink_urb+0x24/0x44 [usbcore]
unlink_urbs.constprop.0.isra.0+0x64/0xa8 [usbnet]
__handle_link_change+0x34/0x70 [usbnet]
usbnet_deferred_kevent+0x1c0/0x320 [usbnet]
process_scheduled_works+0x2d0/0x48c
worker_thread+0x150/0x1dc
kthread+0xd8/0xe8
ret_from_fork+0x10/0x20
It turns out that during link activation a LINK_CHANGE event is emitted,
which causes the active RX URBs to be unlinked. While that is happening,
rx_submit() may begin pushing new URBs to the queue being emptied,
so the queue being unlinked never becomes empty.
Use the same approach as commit 43daa96b166c ("usbnet: Stop RX Q on MTU
change") and pause the RX queue while unlinking the URBs on LINK_CHANGE
as well.
Fixes: 4b49f58fff00 ("usbnet: handle link change")
Cc: stable(a)vger.kernel.org
Signed-off-by: John Ernberg <john.ernberg(a)actia.se>
---
Tested on 6.12.20 and forward ported.
---
drivers/net/usb/usbnet.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index c04e715a4c2a..156f0e85a135 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1115,7 +1115,9 @@ static void __handle_link_change(struct usbnet *dev)
if (!netif_carrier_ok(dev->net)) {
/* kill URBs for reading packets to save bus bandwidth */
+ usbnet_pause_rx(dev);
unlink_urbs(dev, &dev->rxq);
+ usbnet_resume_rx(dev);
/*
* tx_timeout will unlink URBs for sending packets and
--
2.49.0
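Why pausing RX lets the unlink finish can be shown with a toy model. The sketch below serialises what is really a concurrent race, and struct rxq, bh_refill() and unlink_all() are made-up names for the illustration; max_iters exists only so the non-terminating case can be observed.

```c
/* Hypothetical model of an RX queue: a counter plus a pause flag. */
struct rxq {
	int queued;
	int paused;
};

/* Stand-in for rx_submit() from the bottom half: refills unless paused. */
static void bh_refill(struct rxq *q)
{
	if (!q->paused)
		q->queued++;
}

/* Unlink until empty or until max_iters passes; returns iterations used. */
static int unlink_all(struct rxq *q, int max_iters)
{
	int iters = 0;

	while (q->queued > 0 && iters < max_iters) {
		q->queued--;	/* unlink one URB */
		bh_refill(q);	/* concurrent refill, serialised for the sketch */
		iters++;
	}
	return iters;
}
```

With the queue unpaused the loop uses up its whole iteration budget without draining anything; with the pause flag set it drains in as many steps as there were queued URBs.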
The patch titled
Subject: kasan: use vmalloc_dump_obj() for vmalloc error reports
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
kasan-use-vmalloc_dump_obj-for-vmalloc-error-reports.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Marco Elver <elver(a)google.com>
Subject: kasan: use vmalloc_dump_obj() for vmalloc error reports
Date: Wed, 16 Jul 2025 17:23:28 +0200
Since 6ee9b3d84775 ("kasan: remove kasan_find_vm_area() to prevent
possible deadlock"), more detailed info about the vmalloc mapping and the
origin was dropped due to potential deadlocks.
While fixing the deadlock is necessary, that patch was too quick in
killing an otherwise useful feature, and did no due-diligence in
understanding if an alternative option is available.
Restore printing more helpful vmalloc allocation info in KASAN reports
with the help of vmalloc_dump_obj(). Example report:
| BUG: KASAN: vmalloc-out-of-bounds in vmalloc_oob+0x4c9/0x610
| Read of size 1 at addr ffffc900002fd7f3 by task kunit_try_catch/493
|
| CPU: [...]
| Call Trace:
| <TASK>
| dump_stack_lvl+0xa8/0xf0
| print_report+0x17e/0x810
| kasan_report+0x155/0x190
| vmalloc_oob+0x4c9/0x610
| [...]
|
| The buggy address belongs to a 1-page vmalloc region starting at 0xffffc900002fd000 allocated at vmalloc_oob+0x36/0x610
| The buggy address belongs to the physical page:
| page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x126364
| flags: 0x200000000000000(node=0|zone=2)
| raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
| raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
| page dumped because: kasan: bad access detected
|
| [..]
Link: https://lkml.kernel.org/r/20250716152448.3877201-1-elver@google.com
Fixes: 6ee9b3d84775 ("kasan: remove kasan_find_vm_area() to prevent possible deadlock")
Signed-off-by: Marco Elver <elver(a)google.com>
Suggested-by: Uladzislau Rezki <urezki(a)gmail.com>
Acked-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Yeoreum Yun <yeoreum.yun(a)arm.com>
Cc: Yunseong Kim <ysk(a)kzalloc.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/report.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/kasan/report.c~kasan-use-vmalloc_dump_obj-for-vmalloc-error-reports
+++ a/mm/kasan/report.c
@@ -399,7 +399,9 @@ static void print_address_description(vo
}
if (is_vmalloc_addr(addr)) {
- pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
+ pr_err("The buggy address belongs to a");
+ if (!vmalloc_dump_obj(addr))
+ pr_cont(" vmalloc virtual mapping\n");
page = vmalloc_to_page(addr);
}
_
Patches currently in -mm which might be from elver(a)google.com are
kasan-use-vmalloc_dump_obj-for-vmalloc-error-reports.patch
From: "Michael C. Pratt" <mcpratt(a)pm.me>
On 11 Oct 2022, it was reported that the crc32 verification
of the u-boot environment failed only on big-endian systems
for the u-boot-env nvmem layout driver with the following error.
Invalid calculated CRC32: 0x88cd6f09 (expected: 0x096fcd88)
This problem has been present since the driver was introduced,
and before it was made into a layout driver.
The suggested fix at the time was to use further endianness
conversion macros in order to have both the stored and calculated
crc32 values to compare always represented in the system's endianness.
This was not accepted due to sparse warnings
and some disagreement on how to handle the situation.
Later on in a newer revision of the patch, it was proposed to use
cpu_to_le32() for both values to compare instead of le32_to_cpu()
and store the values as __le32 type to remove compilation errors.
The necessity of this is based on the assumption that the use of crc32()
requires endianness conversion because the algorithm uses little-endian,
however, this does not prove to be the case and the issue is unrelated.
Upon inspecting the current kernel code,
there already is an existing use of le32_to_cpu() in this driver,
which suggests there already is special handling for big-endian systems,
however, it is big-endian systems that have the problem.
This, being the only functional difference between architectures
in the driver combined with the fact that the suggested fix
was to use the exact same endianness conversion for the values
brings up the possibility that it was not necessary to begin with,
as the same endianness conversion for two values expected to be the same
is expected to be equivalent to no conversion at all.
After inspecting the u-boot environment of devices of both endianness
and trying to remove the existing endianness conversion,
the problem is resolved in an equivalent way as the other suggested fixes.
Ultimately, it seems that u-boot is agnostic to endianness
at least for the purpose of environment variables.
In other words, u-boot reads and writes the stored crc32 value
with the same endianness that the crc32 value is calculated with
in whichever endianness a certain architecture runs on.
Therefore, the u-boot-env driver does not need to convert endianness.
Remove the usage of endianness macros in the u-boot-env driver,
and change the type of local variables to maintain the same return type.
If there is a special situation in the case of endianness,
it would be a corner case and should be handled by a unique "compatible".
Even though it is not necessary to use endianness conversion macros here,
it may be useful to use them in the future for consistent error printing.
Fixes: d5542923f200 ("nvmem: add driver handling U-Boot environment variables")
Reported-by: INAGAKI Hiroshi <musashino.open(a)gmail.com>
Link: https://lore.kernel.org/all/20221011024928.1807-1-musashino.open@gmail.com
Cc: stable(a)vger.kernel.org # 6.12.x
Cc: stable(a)vger.kernel.org # 6.6.x: f4cf4e5: Revert "nvmem: add new config option"
Cc: stable(a)vger.kernel.org # 6.6.x: 7f38b70: of: device: Export of_device_make_bus_id()
Cc: stable(a)vger.kernel.org # 6.6.x: 4a1a402: nvmem: Move of_nvmem_layout_get_container() in another header
Cc: stable(a)vger.kernel.org # 6.6.x: fc29fd8: nvmem: core: Rework layouts to become regular devices
Cc: stable(a)vger.kernel.org # 6.6.x: 0331c61: nvmem: core: Expose cells through sysfs
Cc: stable(a)vger.kernel.org # 6.6.x: 401df0d: nvmem: layouts: refactor .add_cells() callback arguments
Cc: stable(a)vger.kernel.org # 6.6.x: 6d0ca4a: nvmem: layouts: store owner from modules with nvmem_layout_driver_register()
Cc: stable(a)vger.kernel.org # 6.6.x: 5f15811: nvmem: layouts: add U-Boot env layout
Cc: stable(a)vger.kernel.org # 6.6.x
Signed-off-by: Michael C. Pratt <mcpratt(a)pm.me>
Signed-off-by: Srinivas Kandagatla <srini(a)kernel.org>
---
drivers/nvmem/layouts/u-boot-env.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvmem/layouts/u-boot-env.c b/drivers/nvmem/layouts/u-boot-env.c
index 436426d4e8f9..8571aac56295 100644
--- a/drivers/nvmem/layouts/u-boot-env.c
+++ b/drivers/nvmem/layouts/u-boot-env.c
@@ -92,7 +92,7 @@ int u_boot_env_parse(struct device *dev, struct nvmem_device *nvmem,
size_t crc32_data_offset;
size_t crc32_data_len;
size_t crc32_offset;
- __le32 *crc32_addr;
+ uint32_t *crc32_addr;
size_t data_offset;
size_t data_len;
size_t dev_size;
@@ -143,8 +143,8 @@ int u_boot_env_parse(struct device *dev, struct nvmem_device *nvmem,
goto err_kfree;
}
- crc32_addr = (__le32 *)(buf + crc32_offset);
- crc32 = le32_to_cpu(*crc32_addr);
+ crc32_addr = (uint32_t *)(buf + crc32_offset);
+ crc32 = *crc32_addr;
crc32_data_len = dev_size - crc32_data_offset;
data_len = dev_size - data_offset;
--
2.43.0
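The two values in the reported error message are byte-swapped copies of each other, which is what one extra endianness conversion on a big-endian host would produce. A minimal check of that observation, using a hand-written swap32() helper (a name chosen for this sketch, not a kernel function):

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) |
	       ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) |
	       (v << 24);
}
```

Applying it to the expected value 0x096fcd88 gives 0x88cd6f09, the "invalid" calculated value from the report, and swapping twice returns the original.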
Hi Greg,
please consider backporting
a5a441ae283d ("ice/ptp: fix crosstimestamp reporting")
into linux-6.12.y
It fixes a regression from the series around
d4bea547ebb57 ("ice/ptp: Remove convert_art_to_tsc()")
which affected multiple drivers and occasionally
caused phc2sys to fail on ioctl(fd, PTP_SYS_OFFSET_PRECISE, ...).
This was the initial fix for ice but apparently tagging it
for stable was forgotten during submission.
A similar fix for e1000e can be found here:
Link: https://lore.kernel.org/lkml/20250709-e1000e_crossts-v2-1-2aae94384c59@bloc…
The hunk was moved around slightly in the upstream commit
92456e795ac6 ("ice: Add unified ice_capture_crosststamp").
Let me know if you therefore want a separate patch,
I just didn't want to steal the credits here.
Thanks a lot!
Markus
--
Commit 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic") has
accidentally removed the critical piece of commit c730fce7c70c
("s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL"), causing
intermittent kernel panics in e.g. perf's on_switch() prog to reappear.
Restore the fix and add a comment.
Fixes: 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ilya Leoshkevich <iii(a)linux.ibm.com>
---
arch/s390/net/bpf_jit_comp.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 8bb738f1b1b6..bb17efe29d65 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -576,7 +576,15 @@ static void bpf_jit_plt(struct bpf_plt *plt, void *ret, void *target)
{
memcpy(plt, &bpf_plt, sizeof(*plt));
plt->ret = ret;
- plt->target = target;
+ /*
+ * (target == NULL) implies that the branch to this PLT entry was
+ * patched and became a no-op. However, some CPU could have jumped
+ * to this PLT entry before patching and may be still executing it.
+ *
+ * Since the intention in this case is to make the PLT entry a no-op,
+ * make the target point to the return label instead of NULL.
+ */
+ plt->target = target ?: ret;
}
/*
--
2.50.1
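The "target ?: ret" expression in the hunk above uses the GNU C conditional with an omitted middle operand. A portable sketch of the same fallback, with struct plt_sketch and plt_init() as hypothetical stand-ins for the real PLT structure and helper:

```c
#include <stddef.h>

/* Hypothetical reduced PLT entry: only the two pointers that matter here. */
struct plt_sketch {
	void *ret;
	void *target;
};

static void plt_init(struct plt_sketch *plt, void *ret, void *target)
{
	plt->ret = ret;
	/* Portable spelling of the GNU C "target ?: ret" used in the patch. */
	plt->target = target ? target : ret;
}
```

A NULL target therefore makes the entry branch to its own return label, which is the no-op behaviour the restored comment describes.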
When a call is released, rxrpc takes the spinlock and removes it from
->recvmsg_q in an effort to prevent racing recvmsg() invocations from
seeing the same call. Now, rxrpc_recvmsg() only takes the spinlock when
actually removing a call from the queue; it doesn't, however, take it in
the lead up to that when it checks to see if the queue is empty. It *does*
hold the socket lock, which prevents a recvmsg/recvmsg race - but this
doesn't prevent sendmsg from ending the call because sendmsg() drops the
socket lock and relies on the call->user_mutex.
Fix this by firstly removing the bit in rxrpc_release_call() that dequeues
the released call and, instead, rely on recvmsg() to simply discard
released calls (done in a preceding fix).
Secondly, rxrpc_notify_socket() is abandoned if the call is already marked
as released rather than trying to be clever by setting both pointers in
call->recvmsg_link to NULL to trick list_empty(). This isn't perfect and
can still race, resulting in a released call on the queue, but recvmsg()
will now clean that up.
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jeffrey Altman <jaltman(a)auristor.com>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Jakub Kicinski <kuba(a)kernel.org>
cc: Paolo Abeni <pabeni(a)redhat.com>
cc: "David S. Miller" <davem(a)davemloft.net>
cc: Eric Dumazet <edumazet(a)google.com>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
cc: netdev(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
include/trace/events/rxrpc.h | 2 +-
net/rxrpc/call_object.c | 28 ++++++++++++----------------
net/rxrpc/recvmsg.c | 4 ++++
3 files changed, 17 insertions(+), 17 deletions(-)
diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h
index e7dcfb1369b6..8e5a73eb5268 100644
--- a/include/trace/events/rxrpc.h
+++ b/include/trace/events/rxrpc.h
@@ -325,7 +325,6 @@
EM(rxrpc_call_put_release_sock, "PUT rls-sock") \
EM(rxrpc_call_put_release_sock_tba, "PUT rls-sk-a") \
EM(rxrpc_call_put_sendmsg, "PUT sendmsg ") \
- EM(rxrpc_call_put_unnotify, "PUT unnotify") \
EM(rxrpc_call_put_userid_exists, "PUT u-exists") \
EM(rxrpc_call_put_userid, "PUT user-id ") \
EM(rxrpc_call_see_accept, "SEE accept ") \
@@ -338,6 +337,7 @@
EM(rxrpc_call_see_disconnected, "SEE disconn ") \
EM(rxrpc_call_see_distribute_error, "SEE dist-err") \
EM(rxrpc_call_see_input, "SEE input ") \
+ EM(rxrpc_call_see_notify_released, "SEE nfy-rlsd") \
EM(rxrpc_call_see_recvmsg, "SEE recvmsg ") \
EM(rxrpc_call_see_release, "SEE release ") \
EM(rxrpc_call_see_userid_exists, "SEE u-exists") \
diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
index 15067ff7b1f2..918f41d97a2f 100644
--- a/net/rxrpc/call_object.c
+++ b/net/rxrpc/call_object.c
@@ -561,7 +561,7 @@ static void rxrpc_cleanup_rx_buffers(struct rxrpc_call *call)
void rxrpc_release_call(struct rxrpc_sock *rx, struct rxrpc_call *call)
{
struct rxrpc_connection *conn = call->conn;
- bool put = false, putu = false;
+ bool putu = false;
_enter("{%d,%d}", call->debug_id, refcount_read(&call->ref));
@@ -573,23 +573,13 @@ void rxrpc_release_call(struct rxrpc_sock *rx, struct rxrpc_call *call)
rxrpc_put_call_slot(call);
- /* Make sure we don't get any more notifications */
+ /* Note that at this point, the call may still be on or may have been
+ * added back on to the socket receive queue. recvmsg() must discard
+ * released calls. The CALL_RELEASED flag should prevent further
+ * notifications.
+ */
spin_lock_irq(&rx->recvmsg_lock);
-
- if (!list_empty(&call->recvmsg_link)) {
- _debug("unlinking once-pending call %p { e=%lx f=%lx }",
- call, call->events, call->flags);
- list_del(&call->recvmsg_link);
- put = true;
- }
-
- /* list_empty() must return false in rxrpc_notify_socket() */
- call->recvmsg_link.next = NULL;
- call->recvmsg_link.prev = NULL;
-
spin_unlock_irq(&rx->recvmsg_lock);
- if (put)
- rxrpc_put_call(call, rxrpc_call_put_unnotify);
write_lock(&rx->call_lock);
@@ -638,6 +628,12 @@ void rxrpc_release_calls_on_socket(struct rxrpc_sock *rx)
rxrpc_put_call(call, rxrpc_call_put_release_sock);
}
+ while ((call = list_first_entry_or_null(&rx->recvmsg_q,
+ struct rxrpc_call, recvmsg_link))) {
+ list_del_init(&call->recvmsg_link);
+ rxrpc_put_call(call, rxrpc_call_put_release_recvmsg_q);
+ }
+
_leave("");
}
diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
index 6990e37697de..7fa7e77f6bb9 100644
--- a/net/rxrpc/recvmsg.c
+++ b/net/rxrpc/recvmsg.c
@@ -29,6 +29,10 @@ void rxrpc_notify_socket(struct rxrpc_call *call)
if (!list_empty(&call->recvmsg_link))
return;
+ if (test_bit(RXRPC_CALL_RELEASED, &call->flags)) {
+ rxrpc_see_call(call, rxrpc_call_see_notify_released);
+ return;
+ }
rcu_read_lock();
The hwprobe vDSO data for some keys, like MISALIGNED_VECTOR_PERF,
is determined by an asynchronous kthread. This can create a race
condition where the kthread finishes after the vDSO data has
already been populated, causing userspace to read stale values.
To fix this, a completion-based framework is introduced to robustly
synchronize the async probes with the vDSO data population. The
waiting function, init_hwprobe_vdso_data(), now blocks on
wait_for_completion() until all probes signal they are done.
Furthermore, to prevent this potential blocking from impacting boot
performance, the initialization is deferred to late_initcall. This
is safe as the data is only required by userspace (which starts
much later) and moves the synchronization delay off the critical
boot path.
Reported-by: Tsukasa OI <research_trasio(a)irq.a4lg.com>
Closes: https://lore.kernel.org/linux-riscv/760d637b-b13b-4518-b6bf-883d55d44e7f@ir…
Fixes: e7c9d66e313b ("RISC-V: Report vector unaligned access speed hwprobe")
Cc: Palmer Dabbelt <palmer(a)dabbelt.com>
Cc: Alexandre Ghiti <alexghiti(a)rivosinc.com>
Cc: Olof Johansson <olof(a)lixom.net>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jingwei Wang <wangjingwei(a)iscas.ac.cn>
---
Changes in v5:
- Reworked the synchronization logic to a robust "sentinel-count"
pattern based on feedback from Alexandre.
- Fixed a "multiple definition" linker error for nommu builds by changing
the header-file stub functions to `static inline`, as pointed out by Olof.
- Updated the commit message to better explain the rationale for moving
the vDSO initialization to `late_initcall`.
Changes in v4:
- Reworked the synchronization mechanism based on feedback from Palmer
and Alexandre.
- Instead of a post-hoc refresh, this version introduces a robust
completion-based framework using an atomic counter to ensure async
probes are finished before populating the vDSO.
- Moved the vdso data initialization to a late_initcall to avoid
impacting boot time.
Changes in v3:
- Retained existing blank line.
Changes in v2:
- Addressed feedback from Yixun regarding #ifdef CONFIG_MMU usage.
- Updated commit message to provide a high-level summary.
- Added Fixes tag for commit e7c9d66e313b.
v1: https://lore.kernel.org/linux-riscv/20250521052754.185231-1-wangjingwei@isc…
arch/riscv/include/asm/hwprobe.h | 8 +++++++-
arch/riscv/kernel/sys_hwprobe.c | 20 +++++++++++++++++++-
arch/riscv/kernel/unaligned_access_speed.c | 9 +++++++--
3 files changed, 33 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 7fe0a379474ae2c6..3b2888126e659ea1 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -40,5 +40,11 @@ static inline bool riscv_hwprobe_pair_cmp(struct riscv_hwprobe *pair,
return pair->value == other_pair->value;
}
-
+#ifdef CONFIG_MMU
+void riscv_hwprobe_register_async_probe(void);
+void riscv_hwprobe_complete_async_probe(void);
+#else
+static inline void riscv_hwprobe_register_async_probe(void) {}
+static inline void riscv_hwprobe_complete_async_probe(void) {}
+#endif
#endif
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 0b170e18a2beba57..ee02aeb03e7bd3d8 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -5,6 +5,8 @@
* more details.
*/
#include <linux/syscalls.h>
+#include <linux/completion.h>
+#include <linux/atomic.h>
#include <asm/cacheflush.h>
#include <asm/cpufeature.h>
#include <asm/hwprobe.h>
@@ -467,6 +469,20 @@ static int do_riscv_hwprobe(struct riscv_hwprobe __user *pairs,
#ifdef CONFIG_MMU
+static DECLARE_COMPLETION(boot_probes_done);
+static atomic_t pending_boot_probes = ATOMIC_INIT(1);
+
+void riscv_hwprobe_register_async_probe(void)
+{
+ atomic_inc(&pending_boot_probes);
+}
+
+void riscv_hwprobe_complete_async_probe(void)
+{
+ if (atomic_dec_and_test(&pending_boot_probes))
+ complete(&boot_probes_done);
+}
+
static int __init init_hwprobe_vdso_data(void)
{
struct vdso_arch_data *avd = vdso_k_arch_data;
@@ -474,6 +490,8 @@ static int __init init_hwprobe_vdso_data(void)
struct riscv_hwprobe pair;
int key;
+ if (unlikely(!atomic_dec_and_test(&pending_boot_probes)))
+ wait_for_completion(&boot_probes_done);
/*
* Initialize vDSO data with the answers for the "all CPUs" case, to
* save a syscall in the common case.
@@ -504,7 +522,7 @@ static int __init init_hwprobe_vdso_data(void)
return 0;
}
-arch_initcall_sync(init_hwprobe_vdso_data);
+late_initcall(init_hwprobe_vdso_data);
#endif /* CONFIG_MMU */
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index ae2068425fbcd207..4b8ad2673b0f7470 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -379,6 +379,7 @@ static void check_vector_unaligned_access(struct work_struct *work __always_unus
static int __init vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
{
schedule_on_each_cpu(check_vector_unaligned_access);
+ riscv_hwprobe_complete_async_probe();
return 0;
}
@@ -473,8 +474,12 @@ static int __init check_unaligned_access_all_cpus(void)
per_cpu(vector_misaligned_access, cpu) = unaligned_vector_speed_param;
} else if (!check_vector_unaligned_access_emulated_all_cpus() &&
IS_ENABLED(CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS)) {
- kthread_run(vec_check_unaligned_access_speed_all_cpus,
- NULL, "vec_check_unaligned_access_speed_all_cpus");
+ riscv_hwprobe_register_async_probe();
+ if (IS_ERR(kthread_run(vec_check_unaligned_access_speed_all_cpus,
+ NULL, "vec_check_unaligned_access_speed_all_cpus"))) {
+ pr_warn("Failed to create vec_unalign_check kthread\n");
+ riscv_hwprobe_complete_async_probe();
+ }
}
/*
--
2.50.0
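The "sentinel-count" pattern used by the hwprobe patch above can be sketched in user space with C11 atomics. This is a single-threaded illustration under stated assumptions: done_signalled stands in for complete(), waiter_must_block() reports whether the real code would call wait_for_completion(), and all names are invented for the sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int pending = 1;	/* starts at 1: the waiter's own sentinel */
static bool done_signalled;	/* stands in for complete(&boot_probes_done) */

/* Called before an asynchronous probe is started. */
static void register_async_probe(void)
{
	atomic_fetch_add(&pending, 1);
}

/* Returns true if this decrement took the count to zero. */
static bool dec_and_test(void)
{
	return atomic_fetch_sub(&pending, 1) == 1;
}

/* Called when an asynchronous probe finishes (or fails to start). */
static void complete_async_probe(void)
{
	if (dec_and_test())
		done_signalled = true;
}

/* Drops the sentinel; true means the waiter would have to block. */
static bool waiter_must_block(void)
{
	return !dec_and_test();
}
```

Because the counter starts at one, it cannot reach zero until the waiter has dropped its sentinel, so the completion is signalled only by whichever side finishes last; if no probe was ever registered, the waiter's own decrement reaches zero and it never blocks.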
This is the start of the stable review cycle for the 6.6.99 release.
There are 111 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 17 Jul 2025 16:35:12 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.99-rc2…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.6.99-rc2
Michael Jeanson <mjeanson(a)efficios.com>
rseq: Fix segfault on registration when rseq_cs is non-zero
Lukas Wunner <lukas(a)wunner.de>
crypto: ecdsa - Harden against integer overflows in DIV_ROUND_UP()
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix potential use-after-free in oplock/lease break ack
Yeoreum Yun <yeoreum.yun(a)arm.com>
kasan: remove kasan_find_vm_area() to prevent possible deadlock
Paulo Alcantara <pc(a)manguebit.com>
smb: client: fix potential race in cifs_put_tcon()
Willem de Bruijn <willemb(a)google.com>
selftests/bpf: adapt one more case in test_lru_map to the new target_free
Hans de Goede <hdegoede(a)redhat.com>
Input: atkbd - do not skip atkbd_deactivate() when skipping ATKBD_CMD_GETID
Chia-Lin Kao (AceLan) <acelan.kao(a)canonical.com>
HID: quirks: Add quirk for 2 Chicony Electronics HP 5MP Cameras
Zhang Heng <zhangheng(a)kylinos.cn>
HID: Add IGNORE quirk for SMARTLINKTECHNOLOGY
Willem de Bruijn <willemb(a)google.com>
bpf: Adjust free target to avoid global starvation of LRU map
Nicolas Pitre <npitre(a)baylibre.com>
vt: add missing notification when switching back to text mode
Filipe Manana <fdmanana(a)suse.com>
btrfs: fix assertion when building free space tree
Long Li <longli(a)microsoft.com>
net: mana: Record doorbell physical address in PF mode
Akira Inoue <niyarium(a)gmail.com>
HID: lenovo: Add support for ThinkPad X1 Tablet Thin Keyboard Gen2
Xiaowei Li <xiaowei.li(a)simcom.com>
net: usb: qmi_wwan: add SIMCom 8230C composition
Yasmin Fitzgerald <sunoflife1.git(a)gmail.com>
ALSA: hda/realtek - Enable mute LED on HP Pavilion Laptop 15-eg100
Yuzuru10 <yuzuru_10(a)proton.me>
ASoC: amd: yc: add quirk for Acer Nitro ANV15-41 internal mic
Fengnan Chang <changfengnan(a)bytedance.com>
io_uring: make fallocate be hashed work
Tiwei Bie <tiwei.btw(a)antgroup.com>
um: vector: Reduce stack usage in vector_eth_configure()
Thomas Fourier <fourier.thomas(a)gmail.com>
atm: idt77252: Add missing `dma_map_error()`
Ronnie Sahlberg <rsahlberg(a)whamcloud.com>
ublk: sanity check add_dev input for underflow
Somnath Kotur <somnath.kotur(a)broadcom.com>
bnxt_en: Set DMA unmap len correctly for XDP_REDIRECT
Shravya KN <shravya.k-n(a)broadcom.com>
bnxt_en: Fix DCB ETS validation
Alok Tiwari <alok.a.tiwari(a)oracle.com>
net: ll_temac: Fix missing tx_pending check in ethtools_set_ringparam()
Sean Nyekjaer <sean(a)geanix.com>
can: m_can: m_can_handle_lost_msg(): downgrade msg lost in rx message to debug level
Oleksij Rempel <o.rempel(a)pengutronix.de>
net: phy: microchip: limit 100M workaround to link-down events on LAN88xx
Mingming Cao <mmc(a)linux.ibm.com>
ibmvnic: Fix hardcoded NUM_RX_STATS/NUM_TX_STATS with dynamic sizeof
Kito Xu <veritas501(a)foxmail.com>
net: appletalk: Fix device refcount leak in atrtr_create()
Eric Dumazet <edumazet(a)google.com>
netfilter: flowtable: account for Ethernet header in nf_flow_pppoe_proto()
Zheng Qixing <zhengqixing(a)huawei.com>
nbd: fix uaf in nbd_genl_connect() error path
Nigel Croxon <ncroxon(a)redhat.com>
raid10: cleanup memleak at raid10_make_request
Wang Jinchao <wangjinchao600(a)gmail.com>
md/raid1: Fix stack memory use after return in raid1_reshape
Mikko Perttunen <mperttunen(a)nvidia.com>
drm/tegra: nvdec: Fix dma_alloc_coherent error check
Daniil Dulov <d.dulov(a)aladdin.ru>
wifi: zd1211rw: Fix potential NULL pointer dereference in zd_mac_tx_to_dev()
Shyam Prasad N <sprasad(a)microsoft.com>
cifs: all initializations for tcon should happen in tcon_info_alloc
Paulo Alcantara <pc(a)manguebit.com>
smb: client: fix DFS interlink failover
Paulo Alcantara <pc(a)manguebit.com>
smb: client: avoid unnecessary reconnects when refreshing referrals
Kuen-Han Tsai <khtsai(a)google.com>
usb: dwc3: Abort suspend on soft disconnect failure
Pawel Laszczak <pawell(a)cadence.com>
usb: cdnsp: Fix issue with CV Bad Descriptor test
Lee Jones <lee(a)kernel.org>
usb: cdnsp: Replace snprintf() with the safer scnprintf() variant
Pawel Laszczak <pawell(a)cadence.com>
usb:cdnsp: remove TRB_FLUSH_ENDPOINT command
Filipe Manana <fdmanana(a)suse.com>
btrfs: fix inode lookup error handling during log replay
Filipe Manana <fdmanana(a)suse.com>
btrfs: return a btrfs_inode from btrfs_iget_logging()
Filipe Manana <fdmanana(a)suse.com>
btrfs: remove redundant root argument from fixup_inode_link_count()
Filipe Manana <fdmanana(a)suse.com>
btrfs: remove redundant root argument from btrfs_update_inode_fallback()
Filipe Manana <fdmanana(a)suse.com>
btrfs: remove noinline from btrfs_update_inode()
Jakub Kicinski <kuba(a)kernel.org>
netlink: make sure we allow at least one dump skb
Kuniyuki Iwashima <kuniyu(a)google.com>
netlink: Fix rmem check in netlink_broadcast_deliver().
Chao Yu <chao(a)kernel.org>
erofs: fix to add missing tracepoint in erofs_read_folio()
Al Viro <viro(a)zeniv.linux.org.uk>
ksmbd: fix a mount write count leak in ksmbd_vfs_kern_path_locked()
Stefan Metzmacher <metze(a)samba.org>
smb: server: make use of rdma_destroy_qp()
Jann Horn <jannh(a)google.com>
x86/mm: Disable hugetlb page table sharing on 32-bit
Mikhail Paulyshka <me(a)mixaill.net>
x86/rdrand: Disable RDSEED on AMD Cyan Skillfish
Uwe Kleine-König <u.kleine-koenig(a)baylibre.com>
pwm: mediatek: Ensure to disable clocks in error path
Alexander Gordeev <agordeev(a)linux.ibm.com>
mm/vmalloc: leave lazy MMU mode on PTE mapping error
Florian Fainelli <florian.fainelli(a)broadcom.com>
scripts/gdb: fix interrupts.py after maple tree conversion
Florian Fainelli <florian.fainelli(a)broadcom.com>
scripts/gdb: de-reference per-CPU MCE interrupts
Florian Fainelli <florian.fainelli(a)broadcom.com>
scripts/gdb: fix interrupts display after MCP on x86
Baolin Wang <baolin.wang(a)linux.alibaba.com>
mm: fix the inaccurate memory statistics issue for users
Wei Yang <richard.weiyang(a)gmail.com>
maple_tree: fix mt_destroy_walk() on root leaf node
Achill Gilgenast <fossdd(a)pwned.life>
kallsyms: fix build without execinfo
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Revert "ACPI: battery: negate current when discharging"
Thomas Zimmermann <tzimmermann(a)suse.de>
drm/framebuffer: Acquire internal references on GEM handles
Kuen-Han Tsai <khtsai(a)google.com>
Revert "usb: gadget: u_serial: Add null pointer check in gs_start_io"
Kuen-Han Tsai <khtsai(a)google.com>
usb: gadget: u_serial: Fix race condition in TTY wakeup
Simona Vetter <simona.vetter(a)ffwll.ch>
drm/gem: Fix race in drm_gem_handle_create_tail()
Christian König <christian.koenig(a)amd.com>
drm/ttm: fix error handling in ttm_buffer_object_transfer
Matthew Brost <matthew.brost(a)intel.com>
drm/sched: Increment job count before swapping tail spsc queue
Thomas Zimmermann <tzimmermann(a)suse.de>
drm/gem: Acquire references on GEM handles for framebuffers
Mathy Vanhoef <Mathy.Vanhoef(a)kuleuven.be>
wifi: prevent A-MSDU attacks in mesh networks
Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
pinctrl: qcom: msm: mark certain pins as invalid for interrupts
Håkon Bugge <haakon.bugge(a)oracle.com>
md/md-bitmap: fix GPF in bitmap_get_stats()
Guillaume Nault <gnault(a)redhat.com>
gre: Fix IPv6 multicast route creation.
Sean Christopherson <seanjc(a)google.com>
KVM: SVM: Reject SEV{-ES} intra host migration if vCPU creation is in-flight
David Woodhouse <dwmw(a)amazon.co.uk>
KVM: x86/xen: Allow 'out of range' event channel ports in IRQ routing table.
JP Kobryn <inwardvessel(a)gmail.com>
x86/mce: Make sure CMCI banks are cleared during shutdown on Intel
Yazen Ghannam <yazen.ghannam(a)amd.com>
x86/mce: Don't remove sysfs if thresholding sysfs init fails
Yazen Ghannam <yazen.ghannam(a)amd.com>
x86/mce/amd: Fix threshold limit reset
Yazen Ghannam <yazen.ghannam(a)amd.com>
x86/mce/amd: Add default names for MCA banks and blocks
Dan Carpenter <dan.carpenter(a)linaro.org>
ipmi:msghandler: Fix potential memory corruption in ipmi_create_user()
David Howells <dhowells(a)redhat.com>
rxrpc: Fix oops due to non-existence of prealloc backlog struct
Christian Eggers <ceggers(a)arri.de>
Bluetooth: HCI: Set extended advertising data synchronously
Leo Yan <leo.yan(a)arm.com>
perf: build: Setup PKG_CONFIG_LIBDIR for cross compilation
Liam R. Howlett <Liam.Howlett(a)oracle.com>
maple_tree: fix MA_STATE_PREALLOC flag in mas_preallocate()
David Howells <dhowells(a)redhat.com>
rxrpc: Fix bug due to prealloc collision
Victor Nogueira <victor(a)mojatatu.com>
net/sched: Abort __tc_modify_qdisc if parent class does not exist
Yue Haibing <yuehaibing(a)huawei.com>
atm: clip: Fix NULL pointer dereference in vcc_sendmsg()
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: clip: Fix infinite recursive call of clip_push().
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: clip: Fix memory leak of struct clip_vcc.
Kuniyuki Iwashima <kuniyu(a)google.com>
atm: clip: Fix potential null-ptr-deref in to_atmarpd().
Oleksij Rempel <o.rempel(a)pengutronix.de>
net: phy: smsc: Fix link failure in forced mode with Auto-MDIX
Oleksij Rempel <o.rempel(a)pengutronix.de>
net: phy: smsc: Force predictable MDI-X state on LAN87xx
Oleksij Rempel <o.rempel(a)pengutronix.de>
net: phy: smsc: Fix Auto-MDIX configuration when disabled by strap
EricChan <chenchuangyu(a)xiaomi.com>
net: stmmac: Fix interrupt handling for level-triggered mode in DWC_XGMAC2
Michal Luczaj <mhal(a)rbox.co>
vsock: Fix IOCTL_VM_SOCKETS_GET_LOCAL_CID to check also `transport_local`
Michal Luczaj <mhal(a)rbox.co>
vsock: Fix transport_* TOCTOU
Michal Luczaj <mhal(a)rbox.co>
vsock: Fix transport_{g2h,h2g} TOCTOU
Jiayuan Chen <jiayuan.chen(a)linux.dev>
tcp: Correct signedness in skb remaining space calculation
Kuniyuki Iwashima <kuniyu(a)google.com>
tipc: Fix use-after-free in tipc_conn_close().
Stefano Garzarella <sgarzare(a)redhat.com>
vsock: fix `vsock_proto` declaration
Kuniyuki Iwashima <kuniyu(a)google.com>
netlink: Fix wraparounds of sk->sk_rmem_alloc.
Al Viro <viro(a)zeniv.linux.org.uk>
fix proc_sys_compare() handling of in-lookup dentries
Mario Limonciello <mario.limonciello(a)amd.com>
pinctrl: amd: Clear GPIO debounce for suspend
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_event: Fix not marking Broadcast Sink BIS as connected
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_sync: Fix not disabling advertising instance
Richard Fitzgerald <rf(a)opensource.cirrus.com>
ASoC: cs35l56: probe() should fail if the device ID is not recognized
Peter Zijlstra <peterz(a)infradead.org>
perf: Revert to requiring CAP_SYS_ADMIN for uprobes
Luo Gengkun <luogengkun(a)huaweicloud.com>
perf/core: Fix the WARN_ON_ONCE is out of lock protected region
Shengjiu Wang <shengjiu.wang(a)nxp.com>
ASoC: fsl_asrc: use internal measured ratio for non-ideal ratio mode
Kaustabh Chakraborty <kauschluss(a)disroot.org>
drm/exynos: exynos7_drm_decon: add vblank check in IRQ handling
Linus Torvalds <torvalds(a)linux-foundation.org>
eventpoll: don't decrement ep refcount while still holding the ep mutex
-------------
Diffstat:
Documentation/bpf/map_hash.rst | 8 +-
Documentation/bpf/map_lru_hash_update.dot | 6 +-
Makefile | 4 +-
arch/um/drivers/vector_kern.c | 42 +--
arch/x86/Kconfig | 2 +-
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/amd.c | 7 +
arch/x86/kernel/cpu/mce/amd.c | 28 +-
arch/x86/kernel/cpu/mce/core.c | 8 +-
arch/x86/kernel/cpu/mce/intel.c | 1 +
arch/x86/kvm/svm/sev.c | 4 +
arch/x86/kvm/xen.c | 15 +-
crypto/ecc.c | 2 +-
drivers/acpi/battery.c | 19 +-
drivers/atm/idt77252.c | 5 +
drivers/block/nbd.c | 6 +-
drivers/block/ublk_drv.c | 3 +-
drivers/char/ipmi/ipmi_msghandler.c | 3 +-
drivers/gpu/drm/drm_framebuffer.c | 31 +-
drivers/gpu/drm/drm_gem.c | 74 ++++-
drivers/gpu/drm/drm_internal.h | 2 +
drivers/gpu/drm/exynos/exynos7_drm_decon.c | 4 +
drivers/gpu/drm/tegra/nvdec.c | 6 +-
drivers/gpu/drm/ttm/ttm_bo_util.c | 13 +-
drivers/hid/hid-ids.h | 6 +
drivers/hid/hid-lenovo.c | 8 +
drivers/hid/hid-multitouch.c | 8 +-
drivers/hid/hid-quirks.c | 3 +
drivers/input/keyboard/atkbd.c | 3 +-
drivers/md/md-bitmap.c | 3 +-
drivers/md/raid1.c | 1 +
drivers/md/raid10.c | 10 +-
drivers/net/can/m_can/m_can.c | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 2 +
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 2 +-
drivers/net/ethernet/ibm/ibmvnic.h | 8 +-
drivers/net/ethernet/microsoft/mana/gdma_main.c | 3 +
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c | 24 +-
drivers/net/ethernet/xilinx/ll_temac_main.c | 2 +-
drivers/net/phy/microchip.c | 2 +-
drivers/net/phy/smsc.c | 57 +++-
drivers/net/usb/qmi_wwan.c | 1 +
drivers/net/wireless/zydas/zd1211rw/zd_mac.c | 6 +-
drivers/pinctrl/pinctrl-amd.c | 11 +
drivers/pinctrl/qcom/pinctrl-msm.c | 20 ++
drivers/pwm/pwm-mediatek.c | 13 +-
drivers/tty/vt/vt.c | 1 +
drivers/usb/cdns3/cdnsp-debug.h | 358 ++++++++++-----------
drivers/usb/cdns3/cdnsp-ep0.c | 18 +-
drivers/usb/cdns3/cdnsp-gadget.c | 6 +-
drivers/usb/cdns3/cdnsp-gadget.h | 11 +-
drivers/usb/cdns3/cdnsp-ring.c | 27 +-
drivers/usb/dwc3/core.c | 9 +-
drivers/usb/dwc3/gadget.c | 22 +-
drivers/usb/gadget/function/u_serial.c | 12 +-
fs/btrfs/btrfs_inode.h | 2 +-
fs/btrfs/free-space-tree.c | 16 +-
fs/btrfs/inode.c | 18 +-
fs/btrfs/transaction.c | 2 +-
fs/btrfs/tree-log.c | 331 +++++++++++--------
fs/erofs/data.c | 2 +
fs/eventpoll.c | 12 +-
fs/proc/inode.c | 2 +-
fs/proc/proc_sysctl.c | 18 +-
fs/proc/task_mmu.c | 14 +-
fs/smb/client/cifsglob.h | 3 +
fs/smb/client/cifsproto.h | 13 +-
fs/smb/client/connect.c | 47 ++-
fs/smb/client/dfs.c | 73 ++---
fs/smb/client/dfs.h | 42 ++-
fs/smb/client/dfs_cache.c | 198 +++++++-----
fs/smb/client/fs_context.h | 1 +
fs/smb/client/misc.c | 9 +
fs/smb/client/namespace.c | 2 +-
fs/smb/server/smb2pdu.c | 29 +-
fs/smb/server/transport_rdma.c | 5 +-
fs/smb/server/vfs.c | 1 +
include/drm/drm_file.h | 3 +
include/drm/drm_framebuffer.h | 7 +
include/drm/spsc_queue.h | 4 +-
include/linux/math.h | 12 +
include/linux/mm.h | 5 +
include/net/af_vsock.h | 2 +-
include/net/netfilter/nf_flow_table.h | 2 +-
io_uring/opdef.c | 1 +
kernel/bpf/bpf_lru_list.c | 9 +-
kernel/bpf/bpf_lru_list.h | 1 +
kernel/events/core.c | 6 +-
kernel/rseq.c | 60 +++-
lib/maple_tree.c | 14 +-
mm/kasan/report.c | 13 +-
mm/vmalloc.c | 22 +-
net/appletalk/ddp.c | 1 +
net/atm/clip.c | 64 +++-
net/bluetooth/hci_event.c | 39 +--
net/bluetooth/hci_sync.c | 215 ++++++++-----
net/ipv4/tcp.c | 2 +-
net/ipv6/addrconf.c | 9 +-
net/netlink/af_netlink.c | 90 +++---
net/rxrpc/call_accept.c | 4 +
net/sched/sch_api.c | 23 +-
net/tipc/topsrv.c | 2 +
net/vmw_vsock/af_vsock.c | 57 +++-
net/wireless/util.c | 52 ++-
scripts/gdb/linux/constants.py.in | 7 +
scripts/gdb/linux/interrupts.py | 16 +-
scripts/gdb/linux/mapletree.py | 252 +++++++++++++++
scripts/gdb/linux/xarray.py | 28 ++
sound/pci/hda/patch_realtek.c | 1 +
sound/soc/amd/yc/acp6x-mach.c | 7 +
sound/soc/codecs/cs35l56-shared.c | 2 +-
sound/soc/fsl/fsl_asrc.c | 3 +-
tools/arch/x86/include/asm/msr-index.h | 1 +
tools/build/feature/Makefile | 25 +-
tools/include/linux/kallsyms.h | 4 +
tools/perf/Makefile.perf | 27 +-
tools/testing/selftests/bpf/test_lru_map.c | 105 +++---
117 files changed, 1948 insertions(+), 1042 deletions(-)
From gregkh(a)linuxfoundation.org Tue Jul 15 18:35:42 2025
Message-ID: <20250715163542.121531643(a)linuxfoundation.org>
User-Agent: quilt/0.68
Date: Tue, 15 Jul 2025 18:35:43 +0200
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
To: stable(a)vger.kernel.org
Cc: patches(a)lists.linux.dev, linux-kernel(a)vger.kernel.org, torvalds(a)linux-foundation.org, akpm(a)linux-foundation.org, linux(a)roeck-us.net, shuah(a)kernel.org, patches(a)kernelci.org, lkft-triage(a)lists.linaro.org, pavel(a)denx.de, jonathanh(a)nvidia.com, f.fainelli(a)gmail.com, sudipm.mukherjee(a)gmail.com, srw(a)sladewatkins.net, rwarsow(a)gmx.de, conor(a)kernel.org, hargar(a)microsoft.com, broonie(a)kernel.org,
Jann Horn <jannh(a)google.com>,
Alexander Viro <viro(a)zeniv.linux.org.uk>,
Christian Brauner <brauner(a)kernel.org>,
Jan Kara <jack(a)suse.cz>,
Linus Torvalds <torvalds(a)linux-foundation.org>
X-stable: review
X-Patchwork-Hint: ignore
Subject: [PATCH 6.6 001/111] eventpoll: don't decrement ep refcount while still holding the ep mutex
MIME-Version: 1.0
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Torvalds <torvalds(a)linux-foundation.org>
commit 8c2e52ebbe885c7eeaabd3b7ddcdc1246fc400d2 upstream.
Jann Horn points out that epoll is decrementing the ep refcount and then
doing a
mutex_unlock(&ep->mtx);
afterwards. That's very wrong, because it can lead to a use-after-free.
That pattern is actually fine for the very last reference, because the
code in question will delay the actual call to "ep_free(ep)" until after
it has unlocked the mutex.
But it's wrong for the much subtler "next to last" case when somebody
*else* may also be dropping their reference and free the ep while we're
still using the mutex.
Note that this is true even if that other user is also using the same ep
mutex: mutexes, unlike spinlocks, can not be used for object ownership,
even if they guarantee mutual exclusion.
A mutex "unlock" operation is not atomic, and as one user is still
accessing the mutex as part of unlocking it, another user can come in
and get the now released mutex and free the data structure while the
first user is still cleaning up.
See our mutex documentation in Documentation/locking/mutex-design.rst,
in particular the section [1] about semantics:
"mutex_unlock() may access the mutex structure even after it has
internally released the lock already - so it's not safe for
another context to acquire the mutex and assume that the
mutex_unlock() context is not using the structure anymore"
So if we drop our ep ref before the mutex unlock, but we weren't the
last one, we may then unlock the mutex, another user comes in, drops
_their_ reference and releases the 'ep' as it now has no users - all
while the mutex_unlock() is still accessing it.
Fix this by simply moving the ep refcount dropping to outside the mutex:
the refcount itself is atomic, and doesn't need mutex protection (that's
the whole _point_ of refcounts: unlike mutexes, they are inherently
about object lifetimes).
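The ordering described above can be illustrated with a minimal userspace
sketch. It is not the eventpoll code: struct obj, obj_new(),
obj_ref_dec_and_test() and obj_work_and_put() are hypothetical names, and
pthreads plus C11 atomics stand in for the kernel primitives. The only
point shown is that the reference is dropped after the unlock, so the
mutex is never touched once another holder could free the object.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct eventpoll: a lock plus a lifetime count. */
struct obj {
	pthread_mutex_t mtx;
	atomic_int refcount;
	int items;
};

static struct obj *obj_new(int refs)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	pthread_mutex_init(&o->mtx, NULL);
	atomic_init(&o->refcount, refs);
	return o;
}

/* Returns true when the caller dropped the last reference. */
static bool obj_ref_dec_and_test(struct obj *o)
{
	return atomic_fetch_sub(&o->refcount, 1) == 1;
}

/*
 * Do the locked work, unlock, and only then drop the reference.
 * Returns true if this call freed the object.
 */
static bool obj_work_and_put(struct obj *o)
{
	pthread_mutex_lock(&o->mtx);
	o->items++;
	pthread_mutex_unlock(&o->mtx);

	/* The mutex is not touched again by this caller past this point. */
	if (obj_ref_dec_and_test(o)) {
		pthread_mutex_destroy(&o->mtx);
		free(o);
		return true;
	}
	return false;
}
```

With two references held, the first obj_work_and_put() call returns false
and the second one frees the object. Swapping the unlock and the decrement
would reintroduce the window described in the commit message.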
Reported-by: Jann Horn <jannh(a)google.com>
Link: https://docs.kernel.org/locking/mutex-design.html#semantics [1]
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Jan Kara <jack(a)suse.cz>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/eventpoll.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -772,7 +772,7 @@ static bool __ep_remove(struct eventpoll
call_rcu(&epi->rcu, epi_rcu_free);
percpu_counter_dec(&ep->user->epoll_watches);
- return ep_refcount_dec_and_test(ep);
+ return true;
}
/*
@@ -780,14 +780,14 @@ static bool __ep_remove(struct eventpoll
*/
static void ep_remove_safe(struct eventpoll *ep, struct epitem *epi)
{
- WARN_ON_ONCE(__ep_remove(ep, epi, false));
+ if (__ep_remove(ep, epi, false))
+ WARN_ON_ONCE(ep_refcount_dec_and_test(ep));
}
static void ep_clear_and_put(struct eventpoll *ep)
{
struct rb_node *rbp, *next;
struct epitem *epi;
- bool dispose;
/* We need to release all tasks waiting for these file */
if (waitqueue_active(&ep->poll_wait))
@@ -820,10 +820,8 @@ static void ep_clear_and_put(struct even
cond_resched();
}
- dispose = ep_refcount_dec_and_test(ep);
mutex_unlock(&ep->mtx);
-
- if (dispose)
+ if (ep_refcount_dec_and_test(ep))
ep_free(ep);
}
@@ -1003,7 +1001,7 @@ again:
dispose = __ep_remove(ep, epi, true);
mutex_unlock(&ep->mtx);
- if (dispose)
+ if (dispose && ep_refcount_dec_and_test(ep))
ep_free(ep);
goto again;
}
Since 6ee9b3d84775 ("kasan: remove kasan_find_vm_area() to prevent
possible deadlock"), more detailed info about the vmalloc mapping and
the origin was dropped due to potential deadlocks.
While fixing the deadlock is necessary, that patch was too quick in
killing an otherwise useful feature, and did no due-diligence in
understanding if an alternative option is available.
Restore printing more helpful vmalloc allocation info in KASAN reports
with the help of vmalloc_dump_obj(). Example report:
| BUG: KASAN: vmalloc-out-of-bounds in vmalloc_oob+0x4c9/0x610
| Read of size 1 at addr ffffc900002fd7f3 by task kunit_try_catch/493
|
| CPU: [...]
| Call Trace:
| <TASK>
| dump_stack_lvl+0xa8/0xf0
| print_report+0x17e/0x810
| kasan_report+0x155/0x190
| vmalloc_oob+0x4c9/0x610
| [...]
|
| The buggy address belongs to a 1-page vmalloc region starting at 0xffffc900002fd000 allocated at vmalloc_oob+0x36/0x610
| The buggy address belongs to the physical page:
| page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x126364
| flags: 0x200000000000000(node=0|zone=2)
| raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
| raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
| page dumped because: kasan: bad access detected
|
| [..]
Fixes: 6ee9b3d84775 ("kasan: remove kasan_find_vm_area() to prevent possible deadlock")
Suggested-by: Uladzislau Rezki <urezki(a)gmail.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Yeoreum Yun <yeoreum.yun(a)arm.com>
Cc: Yunseong Kim <ysk(a)kzalloc.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Marco Elver <elver(a)google.com>
---
mm/kasan/report.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index b0877035491f..62c01b4527eb 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -399,7 +399,9 @@ static void print_address_description(void *addr, u8 tag,
 	}
 	if (is_vmalloc_addr(addr)) {
-		pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
+		pr_err("The buggy address belongs to a");
+		if (!vmalloc_dump_obj(addr))
+			pr_cont(" vmalloc virtual mapping\n");
 		page = vmalloc_to_page(addr);
 	}
--
2.50.0.727.gbf7dc18ff4-goog
After a recent change in clang to expose uninitialized warnings from
const variables [1], there is a warning in cxacru_heavy_init():
drivers/usb/atm/cxacru.c:1104:6: error: variable 'bp' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
1104 | if (instance->modem_type->boot_rom_patch) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/usb/atm/cxacru.c:1113:39: note: uninitialized use occurs here
1113 | cxacru_upload_firmware(instance, fw, bp);
| ^~
drivers/usb/atm/cxacru.c:1104:2: note: remove the 'if' if its condition is always true
1104 | if (instance->modem_type->boot_rom_patch) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/usb/atm/cxacru.c:1095:32: note: initialize the variable 'bp' to silence this warning
1095 | const struct firmware *fw, *bp;
| ^
| = NULL
This warning occurs in clang's frontend before inlining occurs, so it
cannot notice that bp is only used within cxacru_upload_firmware() under
the same condition that initializes it in cxacru_heavy_init(). Just
initialize bp to NULL to silence the warning without functionally
changing the code, which is what happens with modern compilers when they
support '-ftrivial-auto-var-init=zero' (CONFIG_INIT_STACK_ALL_ZERO=y).
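A reduced, standalone sketch of the same pattern is shown here for
illustration only. struct blob, upload() and heavy_init() are
hypothetical names and not the driver's code; the shape mirrors a pointer
that is assigned under one condition and dereferenced in a callee under
the same condition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct firmware. */
struct blob {
	int size;
};

/* Only dereferences bp when has_patch is set, like the callee in the driver. */
static int upload(const struct blob *fw, const struct blob *bp, int has_patch)
{
	int total = fw->size;

	if (has_patch)
		total += bp->size;
	return total;
}

static int heavy_init(int has_patch)
{
	static const struct blob main_fw = { .size = 100 };
	static const struct blob patch_fw = { .size = 8 };
	/*
	 * Without "= NULL", an analysis that runs before inlining sees bp
	 * passed to upload() on a path where it was never assigned.
	 */
	const struct blob *fw = &main_fw, *bp = NULL;

	if (has_patch)
		bp = &patch_fw;

	return upload(fw, bp, has_patch);
}
```

The NULL initializer does not change behavior on either path, since the
callee only reads bp when the flag that caused it to be assigned is set.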
Cc: stable(a)vger.kernel.org
Fixes: 1b0e61465234 ("[PATCH] USB ATM: driver for the Conexant AccessRunner chipset cxacru")
Closes: https://github.com/ClangBuiltLinux/linux/issues/2102
Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cd… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
drivers/usb/atm/cxacru.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/atm/cxacru.c b/drivers/usb/atm/cxacru.c
index a12ab90b3db7..b7c3b224a759 100644
--- a/drivers/usb/atm/cxacru.c
+++ b/drivers/usb/atm/cxacru.c
@@ -1092,7 +1092,7 @@ static int cxacru_find_firmware(struct cxacru_data *instance,
 static int cxacru_heavy_init(struct usbatm_data *usbatm_instance,
 			     struct usb_interface *usb_intf)
 {
-	const struct firmware *fw, *bp;
+	const struct firmware *fw, *bp = NULL;
 	struct cxacru_data *instance = usbatm_instance->driver_data;
 	int ret = cxacru_find_firmware(instance, "fw", &fw);
---
base-commit: fdfa018c6962c86d2faa183187669569be4d513f
change-id: 20250715-usb-cxacru-fix-clang-21-uninit-warning-9430d96c6bc1
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
mptcp_connect.sh can be executed manually with "-m <MODE>" and "-C" to
make sure everything works as expected when using "mmap" and "sendfile"
modes instead of "poll", and with the MPTCP checksum support.
These modes should be validated, but they are not when the selftests are
executed via the kselftest helpers. It means that most CIs validating
these selftests, like NIPA for the net development trees and LKFT for
the stable ones, are not covering these modes.
To fix that, new test programs have been added, simply calling
mptcp_connect.sh with the right parameters.
The first patch can be backported up to v5.6, and the second one up to
v5.14.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Changes in v2:
- force using a different prefix in the subtests to avoid having the
same test names in all mptcp_connect*.sh selftests.
- Link to v1: https://lore.kernel.org/r/20250714-net-mptcp-sft-connect-alt-v1-0-bf1c5abbe…
---
Matthieu Baerts (NGI0) (2):
selftests: mptcp: connect: also cover alt modes
selftests: mptcp: connect: also cover checksum
tools/testing/selftests/net/mptcp/Makefile | 3 ++-
tools/testing/selftests/net/mptcp/mptcp_connect_checksum.sh | 5 +++++
tools/testing/selftests/net/mptcp/mptcp_connect_mmap.sh | 5 +++++
tools/testing/selftests/net/mptcp/mptcp_connect_sendfile.sh | 5 +++++
4 files changed, 17 insertions(+), 1 deletion(-)
---
base-commit: b640daa2822a39ff76e70200cb2b7b892b896dce
change-id: 20250714-net-mptcp-sft-connect-alt-c1aaf073ef4e
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
When operating on struct vhost_net_ubuf_ref, the following execution
sequence is theoretically possible:
CPU0 is finalizing a DMA operation; CPU1 is doing VHOST_NET_SET_BACKEND.
// &ubufs->refcount == 2
CPU0: vhost_net_ubuf_put()
CPU1: vhost_net_ubuf_put_wait_and_free(oldubufs)
CPU1:   vhost_net_ubuf_put_and_wait()
CPU1:     vhost_net_ubuf_put()
CPU1:       int r = atomic_sub_return(1, &ubufs->refcount);
CPU1:       // r = 1
CPU0:   int r = atomic_sub_return(1, &ubufs->refcount);
CPU0:   // r = 0
CPU1:     wait_event(ubufs->wait, !atomic_read(&ubufs->refcount));
CPU1:     // no wait occurs here because condition is already true
CPU1:   kfree(ubufs);
CPU0:   if (unlikely(!r))
CPU0:     wake_up(&ubufs->wait); // use-after-free
This leads to a use-after-free on the ubufs access. It happens because
CPU1 skips waiting for wake_up() when the refcount is already zero.
To prevent that, use a completion instead of a wait queue as the ubufs
notification mechanism. wait_for_completion() guarantees that a
complete() call has happened before it returns.
We also need to reinitialize the completion, because refcount == 0 does
not imply freeing in the vhost_net_flush() case: flush then sets the
refcount back to 1.
AFAIK concurrent calls to vhost_net_ubuf_put_and_wait() with the same
ubufs object aren't possible since those calls (through vhost_net_flush()
or vhost_net_set_backend()) are protected by the device mutex.
So reinit_completion() right after wait_for_completion() should be fine.
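For illustration, the idea can be modelled in userspace as follows. This
is a sketch under assumptions, not the vhost code: struct
completion_model and struct ubuf_ref_model are hypothetical, and a
pthread mutex, condition variable and flag stand in for the kernel
completion. The waiter blocks on the signal raised by the final put
rather than on the refcount value.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a kernel completion. */
struct completion_model {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

struct ubuf_ref_model {
	atomic_int refcount;
	struct completion_model wait;
};

static void ubuf_init(struct ubuf_ref_model *u, int refs)
{
	atomic_init(&u->refcount, refs);
	pthread_mutex_init(&u->wait.lock, NULL);
	pthread_cond_init(&u->wait.cond, NULL);
	u->wait.done = false;
}

/* Models vhost_net_ubuf_put(): the final put signals the completion. */
static int ubuf_put(struct ubuf_ref_model *u)
{
	int r = atomic_fetch_sub(&u->refcount, 1) - 1;

	if (r == 0) {
		pthread_mutex_lock(&u->wait.lock);
		u->wait.done = true;
		pthread_cond_broadcast(&u->wait.cond);
		pthread_mutex_unlock(&u->wait.lock);
	}
	return r;
}

/* Models vhost_net_ubuf_put_and_wait(): wait on the signal, not the count. */
static void ubuf_put_and_wait(struct ubuf_ref_model *u)
{
	ubuf_put(u);

	pthread_mutex_lock(&u->wait.lock);
	while (!u->wait.done)
		pthread_cond_wait(&u->wait.cond, &u->wait.lock);
	u->wait.done = false;	/* plays the role of reinit_completion() */
	pthread_mutex_unlock(&u->wait.lock);
}
```

In this model the waiter cannot leave ubuf_put_and_wait() until the thread
that performed the final put has set the flag and released the lock, which
is the ordering the wait_event() version did not provide.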
Cc: stable(a)vger.kernel.org
Fixes: 0ad8b480d6ee9 ("vhost: fix ref cnt checking deadlock")
Reported-by: Andrey Ryabinin <arbn(a)yandex-team.com>
Suggested-by: Andrey Smetanin <asmetanin(a)yandex-team.ru>
Signed-off-by: Nikolay Kuratov <kniv(a)yandex-team.ru>
---
drivers/vhost/net.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 7cbfc7d718b3..454d179fffeb 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -94,7 +94,7 @@ struct vhost_net_ubuf_ref {
 	 * >1: outstanding ubufs
 	 */
 	atomic_t refcount;
-	wait_queue_head_t wait;
+	struct completion wait;
 	struct vhost_virtqueue *vq;
 };
@@ -240,7 +240,7 @@ vhost_net_ubuf_alloc(struct vhost_virtqueue *vq, bool zcopy)
 	if (!ubufs)
 		return ERR_PTR(-ENOMEM);
 	atomic_set(&ubufs->refcount, 1);
-	init_waitqueue_head(&ubufs->wait);
+	init_completion(&ubufs->wait);
 	ubufs->vq = vq;
 	return ubufs;
 }
@@ -249,14 +249,15 @@ static int vhost_net_ubuf_put(struct vhost_net_ubuf_ref *ubufs)
 {
 	int r = atomic_sub_return(1, &ubufs->refcount);
 	if (unlikely(!r))
-		wake_up(&ubufs->wait);
+		complete_all(&ubufs->wait);
 	return r;
 }
 static void vhost_net_ubuf_put_and_wait(struct vhost_net_ubuf_ref *ubufs)
 {
 	vhost_net_ubuf_put(ubufs);
-	wait_event(ubufs->wait, !atomic_read(&ubufs->refcount));
+	wait_for_completion(&ubufs->wait);
+	reinit_completion(&ubufs->wait);
 }
static void vhost_net_ubuf_put_wait_and_free(struct vhost_net_ubuf_ref *ubufs)
--
2.34.1
Hi Andrii and Yonghong,
On Fri, May 23, 2025 at 09:13:40PM -0700, Yonghong Song wrote:
> Add two tests:
> - one test has 'rX <op> r10' where rX is not r10, and
> - another test has 'rX <op> rY' where rX and rY are not r10
> but there is an early insn 'rX = r10'.
>
> Without previous verifier change, both tests will fail.
>
> Signed-off-by: Yonghong Song <yonghong.song(a)linux.dev>
> ---
> .../selftests/bpf/progs/verifier_precision.c | 53 +++++++++++++++++++
> 1 file changed, 53 insertions(+)
I was looking at this commit (5ffb537e416e) since it is the BPF selftest
for CVE-2025-38279, but on closer inspection I found that the commit
differs from the patch: there is an extra hunk changing
kernel/bpf/verifier.c that was not in Yonghong's original patch.
I suppose it was meant to be squashed into the previous commit
e2d2115e56c4 "bpf: Do not include stack ptr register in precision
backtracking bookkeeping"?
Since stable backports got only e2d2115e56c4, but not the 5ffb537e416e
here with the extra change for kernel/bpf/verifier.c, I'd guess the
backtracking logic in the stable kernel isn't correct at the moment,
so I'll send 5ffb537e416e "selftests/bpf: Add tests with stack ptr
register in conditional jmp" to stable as well. Let me know if that's
not the right thing to do.
Shung-Hsi
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 98c52829936e..a7d6e0c5928b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16456,6 +16456,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
if (src_reg->type == PTR_TO_STACK)
insn_flags |= INSN_F_SRC_REG_STACK;
+ if (dst_reg->type == PTR_TO_STACK)
+ insn_flags |= INSN_F_DST_REG_STACK;
} else {
if (insn->src_reg != BPF_REG_0) {
verbose(env, "BPF_JMP/JMP32 uses reserved fields\n");
@@ -16465,10 +16467,11 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
memset(src_reg, 0, sizeof(*src_reg));
src_reg->type = SCALAR_VALUE;
__mark_reg_known(src_reg, insn->imm);
+
+ if (dst_reg->type == PTR_TO_STACK)
+ insn_flags |= INSN_F_DST_REG_STACK;
}
- if (dst_reg->type == PTR_TO_STACK)
- insn_flags |= INSN_F_DST_REG_STACK;
if (insn_flags) {
err = push_insn_history(env, this_branch, insn_flags, 0);
if (err)
> diff --git a/tools/testing/selftests/bpf/progs/verifier_precision.c b/tools/testing/selftests/bpf/progs/verifier_precision.c
...
From: Lance Yang <lance.yang(a)linux.dev>
As pointed out by David[1], the batched unmap logic in try_to_unmap_one()
may read past the end of a PTE table when a large folio's PTE mappings
are not fully contained within a single page table.
While this scenario might be rare, an issue triggerable from userspace must
be fixed regardless of its likelihood. This patch fixes the out-of-bounds
access by refactoring the logic into a new helper, folio_unmap_pte_batch().
The new helper correctly calculates the safe batch size by capping the scan
at both the VMA and PMD boundaries. To simplify the code, it also supports
partial batching (i.e., any number of pages from 1 up to the calculated
safe maximum), as there is no strong reason to special-case for fully
mapped folios.
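The capping arithmetic can be sketched in isolation as below. This is an
illustration under stated assumptions rather than the kernel helper: it
assumes 4 KiB pages and 2 MiB of address space per PTE table, and
model_pmd_addr_end() and model_max_batch() are hypothetical names that
mimic what pmd_addr_end() and the max_nr computation do.

```c
#include <assert.h>

/* Illustrative constants, assuming 4 KiB pages and 2 MiB PMD coverage. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PMD_SIZE   (1UL << 21)
#define MODEL_PMD_MASK   (~(MODEL_PMD_SIZE - 1))

/* Models pmd_addr_end(): next PMD boundary after addr, clamped to end. */
static unsigned long model_pmd_addr_end(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + MODEL_PMD_SIZE) & MODEL_PMD_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Largest number of PTEs that may be scanned starting at addr. */
static unsigned int model_max_batch(unsigned long addr, unsigned long vma_end)
{
	unsigned long end_addr = model_pmd_addr_end(addr, vma_end);

	return (unsigned int)((end_addr - addr) >> MODEL_PAGE_SHIFT);
}
```

For an address two pages below a PMD boundary the cap is 2 regardless of
the folio size, and for a VMA that ends three pages after the address the
cap is 3, so a scan bounded this way stays inside one page table and one
VMA.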
[1] https://lore.kernel.org/linux-mm/a694398c-9f03-4737-81b9-7e49c857fcbe@redha…
Cc: <stable(a)vger.kernel.org>
Reported-by: David Hildenbrand <david(a)redhat.com>
Closes: https://lore.kernel.org/linux-mm/a694398c-9f03-4737-81b9-7e49c857fcbe@redha…
Fixes: 354dffd29575 ("mm: support batched unmap for lazyfree large folios during reclamation")
Suggested-by: Barry Song <baohua(a)kernel.org>
Acked-by: Barry Song <baohua(a)kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Lance Yang <lance.yang(a)linux.dev>
---
v3 -> v4:
- Add Reported-by + Closes tags (per David)
- Pick RB from Lorenzo - thanks!
- Pick AB from David - thanks!
- https://lore.kernel.org/linux-mm/20250630011305.23754-1-lance.yang@linux.dev
v2 -> v3:
- Tweak changelog (per Barry and David)
- Pick AB from Barry - thanks!
- https://lore.kernel.org/linux-mm/20250627062319.84936-1-lance.yang@linux.dev
v1 -> v2:
- Update subject and changelog (per Barry)
- https://lore.kernel.org/linux-mm/20250627025214.30887-1-lance.yang@linux.dev
mm/rmap.c | 46 ++++++++++++++++++++++++++++------------------
1 file changed, 28 insertions(+), 18 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index fb63d9256f09..1320b88fab74 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1845,23 +1845,32 @@ void folio_remove_rmap_pud(struct folio *folio, struct page *page,
#endif
}
-/* We support batch unmapping of PTEs for lazyfree large folios */
-static inline bool can_batch_unmap_folio_ptes(unsigned long addr,
- struct folio *folio, pte_t *ptep)
+static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
+ struct page_vma_mapped_walk *pvmw,
+ enum ttu_flags flags, pte_t pte)
{
const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
- int max_nr = folio_nr_pages(folio);
- pte_t pte = ptep_get(ptep);
+ unsigned long end_addr, addr = pvmw->address;
+ struct vm_area_struct *vma = pvmw->vma;
+ unsigned int max_nr;
+
+ if (flags & TTU_HWPOISON)
+ return 1;
+ if (!folio_test_large(folio))
+ return 1;
+ /* We may only batch within a single VMA and a single page table. */
+ end_addr = pmd_addr_end(addr, vma->vm_end);
+ max_nr = (end_addr - addr) >> PAGE_SHIFT;
+
+ /* We only support lazyfree batching for now ... */
if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
- return false;
+ return 1;
if (pte_unused(pte))
- return false;
- if (pte_pfn(pte) != folio_pfn(folio))
- return false;
+ return 1;
- return folio_pte_batch(folio, addr, ptep, pte, max_nr, fpb_flags, NULL,
- NULL, NULL) == max_nr;
+ return folio_pte_batch(folio, addr, pvmw->pte, pte, max_nr, fpb_flags,
+ NULL, NULL, NULL);
}
/*
@@ -2024,9 +2033,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
if (pte_dirty(pteval))
folio_mark_dirty(folio);
} else if (likely(pte_present(pteval))) {
- if (folio_test_large(folio) && !(flags & TTU_HWPOISON) &&
- can_batch_unmap_folio_ptes(address, folio, pvmw.pte))
- nr_pages = folio_nr_pages(folio);
+ nr_pages = folio_unmap_pte_batch(folio, &pvmw, flags, pteval);
end_addr = address + nr_pages * PAGE_SIZE;
flush_cache_range(vma, address, end_addr);
@@ -2206,13 +2213,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
hugetlb_remove_rmap(folio);
} else {
folio_remove_rmap_ptes(folio, subpage, nr_pages, vma);
- folio_ref_sub(folio, nr_pages - 1);
}
if (vma->vm_flags & VM_LOCKED)
mlock_drain_local();
- folio_put(folio);
- /* We have already batched the entire folio */
- if (nr_pages > 1)
+ folio_put_refs(folio, nr_pages);
+
+ /*
+ * If we are sure that we batched the entire folio and cleared
+ * all PTEs, we can just optimize and stop right here.
+ */
+ if (nr_pages == folio_nr_pages(folio))
goto walk_done;
continue;
walk_abort:
--
2.49.0
From: "Michael C. Pratt" <mcpratt(a)pm.me>
On 11 Oct 2022, it was reported that the crc32 verification
of the u-boot environment failed only on big-endian systems
for the u-boot-env nvmem layout driver with the following error.
Invalid calculated CRC32: 0x88cd6f09 (expected: 0x096fcd88)
This problem has been present since the driver was introduced,
and before it was made into a layout driver.
The suggested fix at the time was to use further endianness
conversion macros in order to have both the stored and calculated
crc32 values to compare always represented in the system's endianness.
This was not accepted due to sparse warnings
and some disagreement on how to handle the situation.
Later on in a newer revision of the patch, it was proposed to use
cpu_to_le32() for both values to compare instead of le32_to_cpu()
and store the values as __le32 type to remove compilation errors.
The necessity of this rests on the assumption that the use of crc32()
requires endianness conversion because the algorithm is little-endian.
However, this does not prove to be the case and the issue is unrelated.
Upon inspecting the current kernel code,
there already is an existing use of le32_to_cpu() in this driver,
which suggests there already is special handling for big-endian systems.
However, it is big-endian systems that have the problem.
This, being the only functional difference between architectures
in the driver combined with the fact that the suggested fix
was to use the exact same endianness conversion for the values
brings up the possibility that it was not necessary to begin with,
as the same endianness conversion for two values expected to be the same
is expected to be equivalent to no conversion at all.
After inspecting the u-boot environment of devices of both endiannesses
and removing the existing endianness conversion,
the problem is resolved in a way equivalent to the other suggested fixes.
Ultimately, it seems that u-boot is agnostic to endianness
at least for the purpose of environment variables.
In other words, u-boot reads and writes the stored crc32 value
with the same endianness that the crc32 value is calculated with
in whichever endianness a certain architecture runs on.
Therefore, the u-boot-env driver does not need to convert endianness.
Remove the usage of endianness macros in the u-boot-env driver,
and change the type of local variables to maintain the same return type.
If there is a special situation in the case of endianness,
it would be a corner case and should be handled by a unique "compatible".
Even though it is not necessary to use endianness conversion macros here,
it may be useful to use them in the future for consistent error printing.
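The two values in the reported error are byte-swapped copies of each other, which is what a conversion applied to only one side would produce. A small userspace sketch of that observation follows; swap32(), crc_matches() and check_reported_values() are hypothetical helpers written for illustration and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Compare two CRCs, optionally applying the same swap to both sides. */
static int crc_matches(uint32_t stored, uint32_t calculated, int swap_both)
{
	if (swap_both) {
		stored = swap32(stored);
		calculated = swap32(calculated);
	}
	return stored == calculated;
}

/* The values from the error message differ only in byte order. */
static void check_reported_values(void)
{
	assert(swap32(0x096fcd88u) == 0x88cd6f09u);
}
```

Swapping both sides never changes the outcome of the comparison, so the result is the same as doing no conversion at all; only a one-sided swap can turn a match into a mismatch.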
Fixes: d5542923f200 ("nvmem: add driver handling U-Boot environment variables")
Reported-by: INAGAKI Hiroshi <musashino.open(a)gmail.com>
Link: https://lore.kernel.org/all/20221011024928.1807-1-musashino.open@gmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael C. Pratt <mcpratt(a)pm.me>
Signed-off-by: Srinivas Kandagatla <srini(a)kernel.org>
---
Changes since v1:
- removed long list of short git ids as it was too much for
small patch.
drivers/nvmem/layouts/u-boot-env.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvmem/layouts/u-boot-env.c b/drivers/nvmem/layouts/u-boot-env.c
index 436426d4e8f9..8571aac56295 100644
--- a/drivers/nvmem/layouts/u-boot-env.c
+++ b/drivers/nvmem/layouts/u-boot-env.c
@@ -92,7 +92,7 @@ int u_boot_env_parse(struct device *dev, struct nvmem_device *nvmem,
size_t crc32_data_offset;
size_t crc32_data_len;
size_t crc32_offset;
- __le32 *crc32_addr;
+ uint32_t *crc32_addr;
size_t data_offset;
size_t data_len;
size_t dev_size;
@@ -143,8 +143,8 @@ int u_boot_env_parse(struct device *dev, struct nvmem_device *nvmem,
goto err_kfree;
}
- crc32_addr = (__le32 *)(buf + crc32_offset);
- crc32 = le32_to_cpu(*crc32_addr);
+ crc32_addr = (uint32_t *)(buf + crc32_offset);
+ crc32 = *crc32_addr;
crc32_data_len = dev_size - crc32_data_offset;
data_len = dev_size - data_offset;
--
2.43.0
From: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
[ Upstream commit 691b5b53dbcc30bb3572cbb255374990723af0d2 ]
The display connector family of bridges is used on plenty of ARM64
platforms (including, but not limited to, several Qualcomm Robotics
and Dragonboard platforms). It doesn't make sense for the DRM drivers to
select the driver, so select it via the defconfig.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Link: https://lore.kernel.org/r/20250214-arm64-display-connector-v1-1-306bca76316…
Signed-off-by: Bjorn Andersson <andersson(a)kernel.org>
[ Backport to 6.12.y ]
Signed-off-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
---
arch/arm64/configs/defconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 7e475f38f3e1..219ef05ee5a7 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -911,6 +911,7 @@ CONFIG_DRM_PANEL_SAMSUNG_ATNA33XC20=m
CONFIG_DRM_PANEL_SITRONIX_ST7703=m
CONFIG_DRM_PANEL_TRULY_NT35597_WQXGA=m
CONFIG_DRM_PANEL_VISIONOX_VTDR6130=m
+CONFIG_DRM_DISPLAY_CONNECTOR=m
CONFIG_DRM_FSL_LDB=m
CONFIG_DRM_LONTIUM_LT8912B=m
CONFIG_DRM_LONTIUM_LT9611=m
--
2.45.2
The AMD IOMMU documentation seems pretty clear that the V2 table follows
the normal CPU expectation of sign extension. This is shown in
Figure 25: AMD64 Long Mode 4-Kbyte Page Address Translation
Where bits Sign-Extend [63:57] == [56]. This is typical for x86 which
would have three regions in the page table: lower, non-canonical, upper.
The manual describes that the V1 table does not sign extend in section
2.2.4 Sharing AMD64 Processor and IOMMU Page Tables GPA-to-SPA
Further, Vasant has checked this and indicates the HW has an additional
behavior that the manual does not yet describe. The AMDv2 table does not
have the sign extended behavior when attached to PASID 0, which may
explain why this has gone unnoticed.
The iommu domain geometry does not directly support sign extended page
tables. The driver should report only one of the lower/upper spaces. Solve
this by removing the top VA bit from the geometry to use only the lower
space.
This will also make the iommu_domain work consistently on both PASID 0 and
PASID != 0.
Adjust dma_max_address() to remove the top VA bit. It now returns:
5 Level:
Before 0x1ffffffffffffff
After 0x0ffffffffffffff
4 Level:
Before 0xffffffffffff
After 0x7fffffffffff
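The figures above can be reproduced with a short userspace sketch. It assumes the shift is 12 offset bits plus 9 bits per page table level, which is consistent with the listed values; PM_LEVEL_SHIFT_SKETCH and both helper functions are illustrative names, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 12 page-offset bits plus 9 index bits per table level. */
#define PM_LEVEL_SHIFT_SKETCH(level) (12 + (level) * 9)

/* Old behaviour: the whole VA width is reported as usable. */
static uint64_t v2_max_address_before(int levels)
{
	assert(levels == 4 || levels == 5);
	return (UINT64_C(1) << PM_LEVEL_SHIFT_SKETCH(levels)) - 1;
}

/* New behaviour: drop the top VA bit so only the lower half is reported. */
static uint64_t v2_max_address_after(int levels)
{
	assert(levels == 4 || levels == 5);
	return (UINT64_C(1) << (PM_LEVEL_SHIFT_SKETCH(levels) - 1)) - 1;
}
```

Dropping one bit halves the reported aperture, which keeps every address the core code hands out inside the lower canonical region.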
Fixes: 11c439a19466 ("iommu/amd/pgtbl_v2: Fix domain max address")
Link: https://lore.kernel.org/all/8858d4d6-d360-4ef0-935c-bfd13ea54f42@amd.com/
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
---
drivers/iommu/amd/iommu.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
v2:
- Revise the commit message and comment with the new information
from Vasant.
v1: https://patch.msgid.link/r/0-v1-6925ece6b623+296-amdv2_geo_jgg@nvidia.com
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3117d99cf83d0d..1baa9d3583f369 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2526,8 +2526,21 @@ static inline u64 dma_max_address(enum protection_domain_mode pgtable)
if (pgtable == PD_MODE_V1)
return ~0ULL;
- /* V2 with 4/5 level page table */
- return ((1ULL << PM_LEVEL_SHIFT(amd_iommu_gpt_level)) - 1);
+ /*
+ * V2 with 4/5 level page table. Note that "2.2.6.5 AMD64 4-Kbyte Page
+ * Translation" shows that the V2 table sign extends the top of the
+ * address space creating a reserved region in the middle of the
+ * translation, just like the CPU does. Further Vasant says the docs are
+ * incomplete and this only applies to non-zero PASIDs. If the AMDv2
+ * page table is assigned to the 0 PASID then there is no sign extension
+ * check.
+ *
+ * Since the IOMMU must have a fixed geometry, and the core code does
+ * not understand sign extended addressing, we have to chop off the high
+ * bit to get consistent behavior with attachments of the domain to any
+ * PASID.
+ */
+ return ((1ULL << (PM_LEVEL_SHIFT(amd_iommu_gpt_level) - 1)) - 1);
}
static bool amd_iommu_hd_support(struct amd_iommu *iommu)
base-commit: eb328711b15b17987021dbb674f446b7b008dca5
--
2.43.0
Hi,
Charles Bordet reported the following issue (full context in
https://bugs.debian.org/1108860)
> Dear Maintainer,
>
> What led up to the situation?
> We run a production environment using Debian 12 VMs, with a network
> topology involving VXLAN tunnels encapsulated inside Wireguard
> interfaces. This setup has worked reliably for over a year, with MTU set
> to 1500 on all interfaces except the Wireguard interface (set to 1420).
> Wireguard kernel fragmentation allowed this configuration to function
> without issues, even though the effective path MTU is lower than 1500.
>
> What exactly did you do (or not do) that was effective (or ineffective)?
> We performed a routine system upgrade, updating all packages including the
> kernel. After the upgrade, we observed severe network issues (timeouts,
> very slow HTTP/HTTPS, and apt update failures) on all VMs behind the
> router. SSH and small-packet traffic continued to work.
>
> To diagnose, we:
>
> * Restored a backup (with the previous kernel): the problem disappeared.
> * Repeated the upgrade, confirming the issue reappeared.
> * Systematically tested each kernel version from 6.1.124-1 up to
> 6.1.140-1. The problem first appears with kernel 6.1.135-1; all earlier
> versions work as expected.
> * Kernel version from the backports (6.12.32-1) did not resolve the
> problem.
>
> What was the outcome of this action?
>
> * With kernel 6.1.135-1 or later, network timeouts occur for
> large-packet protocols (HTTP, apt, etc.), while SSH and small-packet
> protocols work.
> * With kernel 6.1.133-1 or earlier, everything works as expected.
>
> What outcome did you expect instead?
> We expected the network to function as before, with Wireguard handling
> fragmentation transparently and no application-level timeouts,
> regardless of the kernel version.
While triaging the issue we found that commit 8930424777e4
("tunnels: Accept PACKET_HOST in skb_tunnel_check_pmtu().") introduces
the issue, and Charles confirmed that it is also present in
6.12.35 and 6.15.4 (other versions could potentially still be
affected, but we wanted to check that it is not a 6.1.y-specific
regression).
Reverting the commit fixes Charles' issue.
Does that ring a bell?
Regards,
Salvatore
Under some circumstances, such as when a server socket is closing, ABORT
packets will be generated in response to incoming packets. Unfortunately,
this also may include generating aborts in response to incoming aborts -
which may cause a cycle. It appears this may be made possible by giving
the client a multicast address.
Fix this such that rxrpc_reject_packet() will refuse to generate aborts in
response to aborts.
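To illustrate why the guard breaks the cycle, here is a toy userspace model (purely a sketch; the packet types and both functions are hypothetical and do not reflect the rxrpc data structures). Two endpoints that reject everything bounce packets back and forth, and the model counts how many ABORTs get generated.

```c
#include <assert.h>
#include <stdbool.h>

enum pkt_type { PKT_DATA, PKT_ABORT };

/*
 * Hypothetical reject handler: returns true if an ABORT is sent in reply.
 * With the guard enabled, an incoming ABORT is never answered.
 */
static bool reject_sends_abort(enum pkt_type incoming, bool guard)
{
	if (guard && incoming == PKT_ABORT)
		return false;
	return true;
}

/* Count the ABORTs generated, giving up after max_steps. */
static int count_aborts(bool guard, int max_steps)
{
	enum pkt_type in_flight = PKT_DATA;
	int sent = 0;

	assert(max_steps > 0);
	while (sent < max_steps && reject_sends_abort(in_flight, guard)) {
		in_flight = PKT_ABORT;	/* the reply becomes the next input */
		sent++;
	}
	return sent;
}
```

Without the guard the model only stops because of the step limit; with it, exactly one ABORT is produced for the original packet.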
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jeffrey Altman <jaltman(a)auristor.com>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Linus Torvalds <torvalds(a)linux-foundation.org>
cc: Jakub Kicinski <kuba(a)kernel.org>
cc: Paolo Abeni <pabeni(a)redhat.com>
cc: "David S. Miller" <davem(a)davemloft.net>
cc: Eric Dumazet <edumazet(a)google.com>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
cc: netdev(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
net/rxrpc/output.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
index ef7b3096c95e..17c33b5cf7dd 100644
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -814,6 +814,9 @@ void rxrpc_reject_packet(struct rxrpc_local *local, struct sk_buff *skb)
__be32 code;
int ret, ioc;
+ if (sp->hdr.type == RXRPC_PACKET_TYPE_ABORT)
+ return; /* Never abort an abort. */
+
rxrpc_see_skb(skb, rxrpc_skb_see_reject);
iov[0].iov_base = &whdr;
If a call receives an event (such as incoming data), the call gets placed
on the socket's queue and a thread in recvmsg can be awakened to go and
process it. Once the thread has picked up the call off of the queue,
further events will cause it to be requeued, and once the socket lock is
dropped (recvmsg uses call->user_mutex to allow the socket to be used in
parallel), a second thread can come in and its recvmsg can pop the call off
the socket queue again.
In such a case, the first thread will be receiving stuff from the call and
the second thread will be blocked on call->user_mutex. The first thread
can, at this point, process both the event that it picked the call for and the
event that the second thread picked the call for and may see the call
terminate - in which case the call will be "released", decoupling the call
from the user call ID assigned to it (RXRPC_USER_CALL_ID in the control
message).
The first thread will return okay, but then the second thread will wake up
holding the user_mutex and, if it sees that the call has been released by
the first thread, it will BUG thusly:
kernel BUG at net/rxrpc/recvmsg.c:474!
Fix this by just dequeuing the call and ignoring it if it is seen to be
already released. We can't tell userspace about it anyway as the user call
ID has become stale.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jeffrey Altman <jaltman(a)auristor.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Jakub Kicinski <kuba(a)kernel.org>
cc: Paolo Abeni <pabeni(a)redhat.com>
cc: "David S. Miller" <davem(a)davemloft.net>
cc: Eric Dumazet <edumazet(a)google.com>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
cc: netdev(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
include/trace/events/rxrpc.h | 3 +++
net/rxrpc/call_accept.c | 1 +
net/rxrpc/recvmsg.c | 19 +++++++++++++++++--
3 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h
index 378d2dfc7392..e7dcfb1369b6 100644
--- a/include/trace/events/rxrpc.h
+++ b/include/trace/events/rxrpc.h
@@ -330,12 +330,15 @@
EM(rxrpc_call_put_userid, "PUT user-id ") \
EM(rxrpc_call_see_accept, "SEE accept ") \
EM(rxrpc_call_see_activate_client, "SEE act-clnt") \
+ EM(rxrpc_call_see_already_released, "SEE alrdy-rl") \
EM(rxrpc_call_see_connect_failed, "SEE con-fail") \
EM(rxrpc_call_see_connected, "SEE connect ") \
EM(rxrpc_call_see_conn_abort, "SEE conn-abt") \
+ EM(rxrpc_call_see_discard, "SEE discard ") \
EM(rxrpc_call_see_disconnected, "SEE disconn ") \
EM(rxrpc_call_see_distribute_error, "SEE dist-err") \
EM(rxrpc_call_see_input, "SEE input ") \
+ EM(rxrpc_call_see_recvmsg, "SEE recvmsg ") \
EM(rxrpc_call_see_release, "SEE release ") \
EM(rxrpc_call_see_userid_exists, "SEE u-exists") \
EM(rxrpc_call_see_waiting_call, "SEE q-conn ") \
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 226b4bf82747..a4d76f2da684 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -219,6 +219,7 @@ void rxrpc_discard_prealloc(struct rxrpc_sock *rx)
tail = b->call_backlog_tail;
while (CIRC_CNT(head, tail, size) > 0) {
struct rxrpc_call *call = b->call_backlog[tail];
+ rxrpc_see_call(call, rxrpc_call_see_discard);
rcu_assign_pointer(call->socket, rx);
if (rx->app_ops &&
rx->app_ops->discard_new_call) {
diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
index 86a27fb55a1c..6990e37697de 100644
--- a/net/rxrpc/recvmsg.c
+++ b/net/rxrpc/recvmsg.c
@@ -447,6 +447,16 @@ int rxrpc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
goto try_again;
}
+ rxrpc_see_call(call, rxrpc_call_see_recvmsg);
+ if (test_bit(RXRPC_CALL_RELEASED, &call->flags)) {
+ rxrpc_see_call(call, rxrpc_call_see_already_released);
+ list_del_init(&call->recvmsg_link);
+ spin_unlock_irq(&rx->recvmsg_lock);
+ release_sock(&rx->sk);
+ trace_rxrpc_recvmsg(call->debug_id, rxrpc_recvmsg_unqueue, 0);
+ rxrpc_put_call(call, rxrpc_call_put_recvmsg);
+ goto try_again;
+ }
if (!(flags & MSG_PEEK))
list_del_init(&call->recvmsg_link);
else
@@ -470,8 +480,13 @@ int rxrpc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
release_sock(&rx->sk);
- if (test_bit(RXRPC_CALL_RELEASED, &call->flags))
- BUG();
+ if (test_bit(RXRPC_CALL_RELEASED, &call->flags)) {
+ rxrpc_see_call(call, rxrpc_call_see_already_released);
+ mutex_unlock(&call->user_mutex);
+ if (!(flags & MSG_PEEK))
+ rxrpc_put_call(call, rxrpc_call_put_recvmsg);
+ goto try_again;
+ }
ret = rxrpc_recvmsg_user_id(call, msg, flags);
if (ret < 0)
The rxrpc_assess_MTU_size() function calls down into the IP layer to find
out the MTU size for a route. When accepting an incoming call, this is
called from rxrpc_new_incoming_call() which holds interrupts disabled
across the code that calls down to it. Unfortunately, the IP layer uses
local_bh_enable() which, config dependent, throws a warning if IRQs are
disabled:
WARNING: CPU: 1 PID: 5544 at kernel/softirq.c:387 __local_bh_enable_ip+0x43/0xd0
...
RIP: 0010:__local_bh_enable_ip+0x43/0xd0
...
Call Trace:
<TASK>
rt_cache_route+0x7e/0xa0
rt_set_nexthop.isra.0+0x3b3/0x3f0
__mkroute_output+0x43a/0x460
ip_route_output_key_hash+0xf7/0x140
ip_route_output_flow+0x1b/0x90
rxrpc_assess_MTU_size.isra.0+0x2a0/0x590
rxrpc_new_incoming_peer+0x46/0x120
rxrpc_alloc_incoming_call+0x1b1/0x400
rxrpc_new_incoming_call+0x1da/0x5e0
rxrpc_input_packet+0x827/0x900
rxrpc_io_thread+0x403/0xb60
kthread+0x2f7/0x310
ret_from_fork+0x2a/0x230
ret_from_fork_asm+0x1a/0x30
...
hardirqs last enabled at (23): _raw_spin_unlock_irq+0x24/0x50
hardirqs last disabled at (24): _raw_read_lock_irq+0x17/0x70
softirqs last enabled at (0): copy_process+0xc61/0x2730
softirqs last disabled at (25): rt_add_uncached_list+0x3c/0x90
Fix this by moving the call to rxrpc_assess_MTU_size() out of
rxrpc_init_peer() and further up the stack where it can be done without
interrupts disabled.
It shouldn't be a problem for rxrpc_new_incoming_call() to do it after the
locks are dropped as pmtud is going to be performed by the I/O thread - and
we're in the I/O thread at this point.
Fixes: a2ea9a907260 ("rxrpc: Use irq-disabling spinlocks between app and I/O thread")
Signed-off-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jeffrey Altman <jaltman(a)auristor.com>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Jakub Kicinski <kuba(a)kernel.org>
cc: Paolo Abeni <pabeni(a)redhat.com>
cc: "David S. Miller" <davem(a)davemloft.net>
cc: Eric Dumazet <edumazet(a)google.com>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
cc: netdev(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
net/rxrpc/ar-internal.h | 1 +
net/rxrpc/call_accept.c | 1 +
net/rxrpc/peer_object.c | 6 ++----
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 376e33dce8c1..df1a618dbf7d 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -1383,6 +1383,7 @@ struct rxrpc_peer *rxrpc_lookup_peer_rcu(struct rxrpc_local *,
const struct sockaddr_rxrpc *);
struct rxrpc_peer *rxrpc_lookup_peer(struct rxrpc_local *local,
struct sockaddr_rxrpc *srx, gfp_t gfp);
+void rxrpc_assess_MTU_size(struct rxrpc_local *local, struct rxrpc_peer *peer);
struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *, gfp_t,
enum rxrpc_peer_trace);
void rxrpc_new_incoming_peer(struct rxrpc_local *local, struct rxrpc_peer *peer);
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 49fccee1a726..226b4bf82747 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -406,6 +406,7 @@ bool rxrpc_new_incoming_call(struct rxrpc_local *local,
spin_unlock(&rx->incoming_lock);
read_unlock_irq(&local->services_lock);
+ rxrpc_assess_MTU_size(local, call->peer);
if (hlist_unhashed(&call->error_link)) {
spin_lock_irq(&call->peer->lock);
diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c
index e2f35e6c04d6..366431b0736c 100644
--- a/net/rxrpc/peer_object.c
+++ b/net/rxrpc/peer_object.c
@@ -149,8 +149,7 @@ struct rxrpc_peer *rxrpc_lookup_peer_rcu(struct rxrpc_local *local,
* assess the MTU size for the network interface through which this peer is
* reached
*/
-static void rxrpc_assess_MTU_size(struct rxrpc_local *local,
- struct rxrpc_peer *peer)
+void rxrpc_assess_MTU_size(struct rxrpc_local *local, struct rxrpc_peer *peer)
{
struct net *net = local->net;
struct dst_entry *dst;
@@ -277,8 +276,6 @@ static void rxrpc_init_peer(struct rxrpc_local *local, struct rxrpc_peer *peer,
peer->hdrsize += sizeof(struct rxrpc_wire_header);
peer->max_data = peer->if_mtu - peer->hdrsize;
-
- rxrpc_assess_MTU_size(local, peer);
}
/*
@@ -297,6 +294,7 @@ static struct rxrpc_peer *rxrpc_create_peer(struct rxrpc_local *local,
if (peer) {
memcpy(&peer->srx, srx, sizeof(*srx));
rxrpc_init_peer(local, peer, hash_key);
+ rxrpc_assess_MTU_size(local, peer);
}
_leave(" = %p", peer);
> This is the start of the stable review cycle for the 6.15.7 release.
> There are 192 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu, 17 Jul 2025 13:07:32 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.7-rc1…
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Tested Linux kernel 6.15.7-rc1 on Fedora 37 (x86_64) with Intel i7-11800H.
All major tests including boot, Wi-Fi, Bluetooth, audio, video, and USB
mass storage detection passed successfully.
Kernel Version : 6.15.7-rc1
Fedora Version : 37 (Thirty Seven)
Processor : 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
Build Architecture: x86_64
Test Results:
- Boot Test : PASS
- Wi-Fi Test : PASS
- Bluetooth Test : PASS
- Audio Test : PASS
- Video Test : PASS
- USB Mass Storage Drive Detect : PASS
Tested-by: Dileep Malepu <dileep.debian(a)gmail.com>
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
A modern Linux system creates many more than 20 threads at bootup.
When I booted up OpenWrt in qemu the system sometimes failed to boot up
when it wanted to create the 419th thread. The VM had 128MB RAM and the
calculation in set_max_threads() calculated that max_threads should be
set to 419. When the system booted up it tried to notify the user space
about every device it created because CONFIG_UEVENT_HELPER was set and
used. I counted 1299 calls to call_usermodehelper_setup(), all of
them try to create a new thread and call the userspace hotplug script in
it.
This fixes bootup of Linux on systems with low memory.
I saw the problem with qemu 10.0.2 using this command:
qemu-system-aarch64 -machine virt -cpu cortex-a57 -nographic
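For illustration, the effect of the lower bound can be sketched in userspace as below. This assumes the limit is derived by letting thread stacks use at most one eighth of RAM and then clamping the result; max_threads_sketch(), MAX_THREADS_SKETCH and the 16 KiB thread size used in the example are assumptions, and the RAM figure is chosen only so that the division yields 419.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical upper bound, standing in for the kernel's MAX_THREADS. */
#define MAX_THREADS_SKETCH UINT64_C(0x3fffffff)

static uint64_t max_threads_sketch(uint64_t ram_bytes, uint64_t thread_size,
				   uint64_t min_threads)
{
	uint64_t threads;

	assert(thread_size > 0);
	/* Assumed rule: thread stacks may consume at most 1/8 of RAM. */
	threads = ram_bytes / (thread_size * 8);
	/* Clamp to the configured minimum and maximum. */
	if (threads < min_threads)
		threads = min_threads;
	if (threads > MAX_THREADS_SKETCH)
		threads = MAX_THREADS_SKETCH;
	return threads;
}
```

With 54919168 bytes of usable RAM and 16 KiB stacks, a minimum of 20 leaves the limit at 419, while a minimum of 600 raises it to 600; systems with plenty of memory are unaffected because their computed value is already far above either minimum.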
Cc: stable(a)vger.kernel.org
Signed-off-by: Hauke Mehrtens <hauke(a)hauke-m.de>
---
kernel/fork.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 7966c9a1c163..388299525f3c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -115,7 +115,7 @@
/*
* Minimum number of threads to boot the kernel
*/
-#define MIN_THREADS 20
+#define MIN_THREADS 600
/*
* Maximum number of threads
--
2.50.1
When a card is present in the reader, the driver currently defers
autosuspend by returning -EAGAIN during the suspend callback to
trigger USB remote wakeup signaling. However, this does not guarantee
that the mmc child device has been resumed, which may cause issues if
it remains suspended while the card is accessible.
This patch ensures that all child devices, including the mmc host
controller, are explicitly resumed before returning -EAGAIN. This
fixes a corner case introduced by earlier remote wakeup handling,
improving reliability of runtime PM when a card is inserted.
Fixes: 883a87ddf2f1 ("misc: rtsx_usb: Use USB remote wakeup signaling for card insertion detection")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ricky Wu <ricky_wu(a)realtek.com>
---
drivers/misc/cardreader/rtsx_usb.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/misc/cardreader/rtsx_usb.c b/drivers/misc/cardreader/rtsx_usb.c
index 148107a4547c..d007a4455ce5 100644
--- a/drivers/misc/cardreader/rtsx_usb.c
+++ b/drivers/misc/cardreader/rtsx_usb.c
@@ -698,6 +698,12 @@ static void rtsx_usb_disconnect(struct usb_interface *intf)
}
#ifdef CONFIG_PM
+static int rtsx_usb_resume_child(struct device *dev, void *data)
+{
+ pm_request_resume(dev);
+ return 0;
+}
+
static int rtsx_usb_suspend(struct usb_interface *intf, pm_message_t message)
{
struct rtsx_ucr *ucr =
@@ -713,8 +719,10 @@ static int rtsx_usb_suspend(struct usb_interface *intf, pm_message_t message)
mutex_unlock(&ucr->dev_mutex);
/* Defer the autosuspend if card exists */
- if (val & (SD_CD | MS_CD))
+ if (val & (SD_CD | MS_CD)) {
+ device_for_each_child(&intf->dev, NULL, rtsx_usb_resume_child);
return -EAGAIN;
+ }
} else {
/* There is an ongoing operation*/
return -EAGAIN;
@@ -724,12 +732,6 @@ static int rtsx_usb_suspend(struct usb_interface *intf, pm_message_t message)
return 0;
}
-static int rtsx_usb_resume_child(struct device *dev, void *data)
-{
- pm_request_resume(dev);
- return 0;
-}
-
static int rtsx_usb_resume(struct usb_interface *intf)
{
device_for_each_child(&intf->dev, NULL, rtsx_usb_resume_child);
--
2.25.1