This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.184-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.184-rc1
Alexander Lobakin alexandr.lobakin@intel.com ice: arfs: fix use-after-free when freeing @rx_cpu_rmap
Florian Westphal fw@strlen.de netfilter: nf_tables: do not defer rule destruction via call_rcu
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: wait for rcu grace period on net_device removal
Florian Westphal fw@strlen.de netfilter: nf_tables: pass nft_chain to destroy function, not nft_ctx
Josef Bacik josef@toxicpanda.com btrfs: do not clean up repair bio if submit fails
Filipe Manana fdmanana@suse.com btrfs: don't BUG_ON() when 0 reference count at btrfs_lookup_extent_info()
Eric Dumazet edumazet@google.com sctp: add mutual exclusion in proc_sctp_do_udp_port()
Feng Tang feng.tang@linux.alibaba.com selftests/mm: compaction_test: support platform with huge mount of memory
GONG Ruiqi gongruiqi1@huawei.com usb: typec: fix pm usage counter imbalance in ucsi_ccg_sync_control()
Dan Carpenter dan.carpenter@linaro.org usb: typec: fix potential array underflow in ucsi_ccg_sync_control()
RD Babiera rdbabiera@google.com usb: typec: altmodes/displayport: create sysfs nodes as driver's default device attribute group
Andrei Kuchynski akuchynski@chromium.org usb: typec: ucsi: displayport: Fix deadlock
Sebastian Andrzej Siewior bigeasy@linutronix.de clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable()
Fengnan Chang changfengnan@bytedance.com block: fix direct io NOWAIT flag not work
Shuai Xue xueshuai@linux.alibaba.com dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups
Shuai Xue xueshuai@linux.alibaba.com dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines
Yemike Abhilash Chandra y-abhilashchandra@ti.com dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy
Ronald Wahl ronald.wahl@legrand.com dmaengine: ti: k3-udma: Add missing locking
Fedor Pchelkin pchelkin@ispras.ru wifi: mt76: disable napi on driver removal
Claudiu Beznea claudiu.beznea.uj@bp.renesas.com phy: renesas: rcar-gen3-usb2: Set timing registers only once
Ma Ke make24@iscas.ac.cn phy: Fix error handling in tegra_xusb_port_init
Steven Rostedt rostedt@goodmis.org tracing: samples: Initialize trace_array_printk() with the correct function
pengdonglin pengdonglin@xiaomi.com ftrace: Fix preemption accounting for stacktrace filter command
pengdonglin pengdonglin@xiaomi.com ftrace: Fix preemption accounting for stacktrace trigger command
Nicolas Chauvet kwizart@gmail.com ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera
Christian Heusel christian@heusel.eu ALSA: usb-audio: Add sample rate quirk for Audioengine D1
Wentao Liang vulab@iscas.ac.cn ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2()
Jeremy Linton jeremy.linton@arm.com ACPI: PPTT: Fix processor subtable walk
Filipe Manana fdmanana@suse.com btrfs: fix discard worker infinite loop after disabling discard
Nathan Lynch nathan.lynch@amd.com dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted"
Peter Zijlstra peterz@infradead.org x86/its: FineIBT-paranoid vs ITS
Eric Biggers ebiggers@google.com x86/its: Fix build errors when CONFIG_MODULES=n
Peter Zijlstra peterz@infradead.org x86/its: Use dynamic thunks for indirect branches
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Align RETs in BHB clear sequence to avoid thunking
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Add "vmexit" option to skip mitigation on some CPUs
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Enable Indirect Target Selection mitigation
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Add support for ITS-safe return thunk
Josh Poimboeuf jpoimboe@kernel.org x86/alternatives: Remove faulty optimization
Borislav Petkov (AMD) bp@alien8.de x86/alternative: Optimize returns patching
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Add support for ITS-safe indirect thunk
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Enumerate Indirect Target Selection (ITS) bug
Pawan Gupta pawan.kumar.gupta@linux.intel.com Documentation: x86/bugs/its: Add ITS documentation
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/speculation: Remove the extra #ifdef around CALL_NOSPEC
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/speculation: Add a conditional CS prefix to CALL_NOSPEC
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/speculation: Simplify and make CALL_NOSPEC consistent
Peter Zijlstra peterz@infradead.org x86,nospec: Simplify {JMP,CALL}_NOSPEC
Trond Myklebust trond.myklebust@hammerspace.com NFSv4/pnfs: Reset the layout state after a layoutreturn
Abdun Nihaal abdun.nihaal@gmail.com qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd()
Geert Uytterhoeven geert+renesas@glider.be ALSA: sh: SND_AICA should depend on SH_DMA_API
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING
Mathieu Othacehe othacehe@gnu.org net: cadence: macb: Fix a possible deadlock in macb_halt_tx.
Cong Wang xiyou.wangcong@gmail.com net_sched: Flush gso_skb list too during ->change()
Geert Uytterhoeven geert+renesas@glider.be spi: loopback-test: Do not split 1024-byte hexdumps
Li Lingfeng lilingfeng3@huawei.com nfs: handle failure of nfs_get_lock_context in unlock path
Zhu Yanjun yanjun.zhu@linux.dev RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug
David Lechner dlechner@baylibre.com iio: chemical: sps30: use aligned_s64 for timestamp
Jonathan Cameron Jonathan.Cameron@huawei.com iio: adc: ad7768-1: Fix insufficient alignment of timestamp.
Masami Hiramatsu (Google) mhiramat@kernel.org tracing: probes: Fix a possible race in trace_probe_log APIs
Hans de Goede hdegoede@redhat.com platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection
-------------
Diffstat:
Documentation/ABI/testing/sysfs-devices-system-cpu | 1 + Documentation/admin-guide/hw-vuln/index.rst | 1 + .../hw-vuln/indirect-target-selection.rst | 156 +++++++++++++ Documentation/admin-guide/kernel-parameters.txt | 15 ++ Makefile | 4 +- arch/x86/Kconfig | 11 + arch/x86/entry/entry_64.S | 20 +- arch/x86/include/asm/alternative.h | 32 +++ arch/x86/include/asm/cpufeatures.h | 3 + arch/x86/include/asm/msr-index.h | 8 + arch/x86/include/asm/nospec-branch.h | 57 +++-- arch/x86/kernel/alternative.c | 243 ++++++++++++++++++++- arch/x86/kernel/cpu/bugs.c | 139 +++++++++++- arch/x86/kernel/cpu/common.c | 63 +++++- arch/x86/kernel/ftrace.c | 2 +- arch/x86/kernel/module.c | 7 + arch/x86/kernel/static_call.c | 2 +- arch/x86/kernel/vmlinux.lds.S | 10 + arch/x86/kvm/x86.c | 4 +- arch/x86/lib/retpoline.S | 39 ++++ arch/x86/net/bpf_jit_comp.c | 8 +- block/fops.c | 5 +- drivers/acpi/pptt.c | 11 +- drivers/base/cpu.c | 8 + drivers/clocksource/i8253.c | 6 +- drivers/dma/dmatest.c | 6 +- drivers/dma/idxd/init.c | 8 + drivers/dma/ti/k3-udma.c | 10 +- drivers/iio/adc/ad7768-1.c | 2 +- drivers/iio/chemical/sps30.c | 2 +- drivers/infiniband/sw/rxe/rxe_cq.c | 5 +- drivers/net/dsa/sja1105/sja1105_main.c | 6 +- drivers/net/ethernet/cadence/macb_main.c | 19 +- drivers/net/ethernet/intel/ice/ice_arfs.c | 9 +- drivers/net/ethernet/intel/ice/ice_lib.c | 5 +- drivers/net/ethernet/intel/ice/ice_main.c | 20 +- .../ethernet/qlogic/qlcnic/qlcnic_sriov_common.c | 7 +- drivers/net/wireless/mediatek/mt76/dma.c | 1 + drivers/phy/renesas/phy-rcar-gen3-usb2.c | 7 +- drivers/phy/tegra/xusb.c | 8 +- drivers/platform/x86/asus-wmi.c | 3 +- drivers/spi/spi-loopback-test.c | 2 +- drivers/usb/typec/altmodes/displayport.c | 18 +- drivers/usb/typec/ucsi/displayport.c | 19 +- drivers/usb/typec/ucsi/ucsi.c | 34 +++ drivers/usb/typec/ucsi/ucsi.h | 3 + drivers/usb/typec/ucsi/ucsi_ccg.c | 5 + fs/btrfs/discard.c | 17 +- fs/btrfs/extent-tree.c | 25 ++- fs/btrfs/extent_io.c | 15 +- fs/nfs/nfs4proc.c | 9 +- fs/nfs/pnfs.c | 9 + include/linux/cpu.h | 2 + include/linux/module.h | 5 + include/net/netfilter/nf_tables.h | 2 +- include/net/sch_generic.h | 15 ++ kernel/trace/trace_dynevent.c | 16 +- kernel/trace/trace_dynevent.h | 1 + kernel/trace/trace_events_trigger.c | 2 +- kernel/trace/trace_functions.c | 6 +- kernel/trace/trace_kprobe.c | 2 +- kernel/trace/trace_probe.c | 9 + kernel/trace/trace_uprobe.c | 2 +- net/netfilter/nf_tables_api.c | 54 +++-- net/netfilter/nft_immediate.c | 2 +- net/sched/sch_codel.c | 2 +- net/sched/sch_fq.c | 2 +- net/sched/sch_fq_codel.c | 2 +- net/sched/sch_fq_pie.c | 2 +- net/sched/sch_hhf.c | 2 +- net/sched/sch_pie.c | 2 +- net/sctp/sysctl.c | 4 + samples/ftrace/sample-trace-array.c | 2 +- sound/pci/es1968.c | 6 +- sound/sh/Kconfig | 2 +- sound/usb/quirks.c | 4 + tools/testing/selftests/vm/compaction_test.c | 19 +- 77 files changed, 1112 insertions(+), 184 deletions(-)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit bfcfe6d335a967f8ea0c1980960e6f0205b5de6e ]
The wlan_ctrl_by_user detection was introduced by commit a50bd128f28c ("asus-wmi: record wlan status while controlled by userapp").
Quoting from that commit's commit message:
""" When you call WMIMethod(DSTS, 0x00010011) to get WLAN status, it may return
(1) 0x00050001 (On) (2) 0x00050000 (Off) (3) 0x00030001 (On) (4) 0x00030000 (Off) (5) 0x00000002 (Unknown)
(1), (2) means that the model has hardware GPIO for WLAN, you can call WMIMethod(DEVS, 0x00010011, 1 or 0) to turn WLAN on/off. (3), (4) means that the model doesn’t have hardware GPIO, you need to use API or driver library to turn WLAN on/off, and call WMIMethod(DEVS, 0x00010012, 1 or 0) to set WLAN LED status. After you set WLAN LED status, you can see the WLAN status is changed with WMIMethod(DSTS, 0x00010011). Because the status is recorded lastly (ex: Windows), you can use it for synchronization. (5) means that the model doesn’t have WLAN device.
WLAN is the ONLY special case with upper rule. """
The wlan_ctrl_by_user flag should be set on 0x0003000? ((3), (4) above) return values, but the flag mistakenly also gets set on laptops with 0x0005000? ((1), (2)) return values. This is causing rfkill problems on laptops where 0x0005000? is returned.
Fix the check to only set the wlan_ctrl_by_user flag for 0x0003000? return values.
Fixes: a50bd128f28c ("asus-wmi: record wlan status while controlled by userapp") Link: https://bugzilla.kernel.org/show_bug.cgi?id=219786 Signed-off-by: Hans de Goede hdegoede@redhat.com Reviewed-by: Armin Wolf W_Armin@gmx.de Link: https://lore.kernel.org/r/20250501131702.103360-2-hdegoede@redhat.com Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/asus-wmi.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c index a34d0f53ad16f..d9933d3718129 100644 --- a/drivers/platform/x86/asus-wmi.c +++ b/drivers/platform/x86/asus-wmi.c @@ -3052,7 +3052,8 @@ static int asus_wmi_add(struct platform_device *pdev) goto fail_leds;
asus_wmi_get_devstate(asus, ASUS_WMI_DEVID_WLAN, &result); - if (result & (ASUS_WMI_DSTS_PRESENCE_BIT | ASUS_WMI_DSTS_USER_BIT)) + if ((result & (ASUS_WMI_DSTS_PRESENCE_BIT | ASUS_WMI_DSTS_USER_BIT)) == + (ASUS_WMI_DSTS_PRESENCE_BIT | ASUS_WMI_DSTS_USER_BIT)) asus->driver->wlan_ctrl_by_user = 1;
if (!(asus->driver->wlan_ctrl_by_user && ashs_present())) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu (Google) mhiramat@kernel.org
[ Upstream commit fd837de3c9cb1a162c69bc1fb1f438467fe7f2f5 ]
Since the shared trace_probe_log variable can be accessed and modified via probe event create operation of kprobe_events, uprobe_events, and dynamic_events, it should be protected. In the dynamic_events, all operations are serialized by `dyn_event_ops_mutex`. But kprobe_events and uprobe_events interfaces are not serialized.
To solve this issue, introduces dyn_event_create(), which runs create() operation under the mutex, for kprobe_events and uprobe_events. This also uses lockdep to check the mutex is held when using trace_probe_log* APIs.
Link: https://lore.kernel.org/all/174684868120.551552.3068655787654268804.stgit@de...
Reported-by: Paul Cacheux paulcacheux@gmail.com Closes: https://lore.kernel.org/all/20250510074456.805a16872b591e2971a4d221@kernel.o... Fixes: ab105a4fb894 ("tracing: Use tracing error_log with probe events") Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace_dynevent.c | 16 +++++++++++++++- kernel/trace/trace_dynevent.h | 1 + kernel/trace/trace_kprobe.c | 2 +- kernel/trace/trace_probe.c | 9 +++++++++ kernel/trace/trace_uprobe.c | 2 +- 5 files changed, 27 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/trace_dynevent.c b/kernel/trace/trace_dynevent.c index d4f7137233234..6d0e9f869ad68 100644 --- a/kernel/trace/trace_dynevent.c +++ b/kernel/trace/trace_dynevent.c @@ -16,7 +16,7 @@ #include "trace_output.h" /* for trace_event_sem */ #include "trace_dynevent.h"
-static DEFINE_MUTEX(dyn_event_ops_mutex); +DEFINE_MUTEX(dyn_event_ops_mutex); static LIST_HEAD(dyn_event_ops_list);
bool trace_event_dyn_try_get_ref(struct trace_event_call *dyn_call) @@ -125,6 +125,20 @@ int dyn_event_release(const char *raw_command, struct dyn_event_operations *type return ret; }
+/* + * Locked version of event creation. The event creation must be protected by + * dyn_event_ops_mutex because of protecting trace_probe_log. + */ +int dyn_event_create(const char *raw_command, struct dyn_event_operations *type) +{ + int ret; + + mutex_lock(&dyn_event_ops_mutex); + ret = type->create(raw_command); + mutex_unlock(&dyn_event_ops_mutex); + return ret; +} + static int create_dyn_event(const char *raw_command) { struct dyn_event_operations *ops; diff --git a/kernel/trace/trace_dynevent.h b/kernel/trace/trace_dynevent.h index 936477a111d3e..beee3f8d75444 100644 --- a/kernel/trace/trace_dynevent.h +++ b/kernel/trace/trace_dynevent.h @@ -100,6 +100,7 @@ void *dyn_event_seq_next(struct seq_file *m, void *v, loff_t *pos); void dyn_event_seq_stop(struct seq_file *m, void *v); int dyn_events_release_all(struct dyn_event_operations *type); int dyn_event_release(const char *raw_command, struct dyn_event_operations *type); +int dyn_event_create(const char *raw_command, struct dyn_event_operations *type);
/* * for_each_dyn_event - iterate over the dyn_event list diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 3a1c54c9918b4..e062f4efec8d0 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -971,7 +971,7 @@ static int create_or_delete_trace_kprobe(const char *raw_command) if (raw_command[0] == '-') return dyn_event_release(raw_command, &trace_kprobe_ops);
- ret = trace_kprobe_create(raw_command); + ret = dyn_event_create(raw_command, &trace_kprobe_ops); return ret == -ECANCELED ? -EINVAL : ret; }
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c index d2a1b7f030685..38fa6cc118daf 100644 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -143,9 +143,12 @@ static const struct fetch_type *find_fetch_type(const char *type) }
static struct trace_probe_log trace_probe_log; +extern struct mutex dyn_event_ops_mutex;
void trace_probe_log_init(const char *subsystem, int argc, const char **argv) { + lockdep_assert_held(&dyn_event_ops_mutex); + trace_probe_log.subsystem = subsystem; trace_probe_log.argc = argc; trace_probe_log.argv = argv; @@ -154,11 +157,15 @@ void trace_probe_log_init(const char *subsystem, int argc, const char **argv)
void trace_probe_log_clear(void) { + lockdep_assert_held(&dyn_event_ops_mutex); + memset(&trace_probe_log, 0, sizeof(trace_probe_log)); }
void trace_probe_log_set_index(int index) { + lockdep_assert_held(&dyn_event_ops_mutex); + trace_probe_log.index = index; }
@@ -167,6 +174,8 @@ void __trace_probe_log_err(int offset, int err_type) char *command, *p; int i, len = 0, pos = 0;
+ lockdep_assert_held(&dyn_event_ops_mutex); + if (!trace_probe_log.argv) return;
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c index 720b46b34ab94..322d56661d04a 100644 --- a/kernel/trace/trace_uprobe.c +++ b/kernel/trace/trace_uprobe.c @@ -729,7 +729,7 @@ static int create_or_delete_trace_uprobe(const char *raw_command) if (raw_command[0] == '-') return dyn_event_release(raw_command, &trace_uprobe_ops);
- ret = trace_uprobe_create(raw_command); + ret = dyn_event_create(raw_command, &trace_uprobe_ops); return ret == -ECANCELED ? -EINVAL : ret; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Cameron Jonathan.Cameron@huawei.com
[ Upstream commit ffbc26bc91c1f1eb3dcf5d8776e74cbae21ee13a ]
On architectures where an s64 is not 64-bit aligned, this may result insufficient alignment of the timestamp and the structure being too small. Use aligned_s64 to force the alignment.
Fixes: a1caeebab07e ("iio: adc: ad7768-1: Fix too small buffer passed to iio_push_to_buffers_with_timestamp()") # aligned_s64 newer Reported-by: David Lechner dlechner@baylibre.com Reviewed-by: Nuno Sá nuno.sa@analog.com Reviewed-by: David Lechner dlechner@baylibre.com Link: https://patch.msgid.link/20250413103443.2420727-3-jic23@kernel.org Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/adc/ad7768-1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/adc/ad7768-1.c b/drivers/iio/adc/ad7768-1.c index c922faab4a52b..e240fac8b6b37 100644 --- a/drivers/iio/adc/ad7768-1.c +++ b/drivers/iio/adc/ad7768-1.c @@ -169,7 +169,7 @@ struct ad7768_state { union { struct { __be32 chan; - s64 timestamp; + aligned_s64 timestamp; } scan; __be32 d32; u8 d8[2];
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lechner dlechner@baylibre.com
[ Upstream commit bb49d940344bcb8e2b19e69d7ac86f567887ea9a ]
Follow the pattern of other drivers and use aligned_s64 for the timestamp. This will ensure that the timestamp is correctly aligned on all architectures.
Fixes: a5bf6fdd19c3 ("iio:chemical:sps30: Fix timestamp alignment") Signed-off-by: David Lechner dlechner@baylibre.com Reviewed-by: Nuno Sá nuno.sa@analog.com Link: https://patch.msgid.link/20250417-iio-more-timestamp-alignment-v1-5-eafac1e2... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/chemical/sps30.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/chemical/sps30.c b/drivers/iio/chemical/sps30.c index d51314505115e..43991fe2e35bf 100644 --- a/drivers/iio/chemical/sps30.c +++ b/drivers/iio/chemical/sps30.c @@ -108,7 +108,7 @@ static irqreturn_t sps30_trigger_handler(int irq, void *p) int ret; struct { s32 data[4]; /* PM1, PM2P5, PM4, PM10 */ - s64 ts; + aligned_s64 ts; } scan;
mutex_lock(&state->lock);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhu Yanjun yanjun.zhu@linux.dev
[ Upstream commit f81b33582f9339d2dc17c69b92040d3650bb4bae ]
Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x7d/0xa0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xcf/0x610 mm/kasan/report.c:489 kasan_report+0xb5/0xe0 mm/kasan/report.c:602 rxe_queue_cleanup+0xd0/0xe0 drivers/infiniband/sw/rxe/rxe_queue.c:195 rxe_cq_cleanup+0x3f/0x50 drivers/infiniband/sw/rxe/rxe_cq.c:132 __rxe_cleanup+0x168/0x300 drivers/infiniband/sw/rxe/rxe_pool.c:232 rxe_create_cq+0x22e/0x3a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1109 create_cq+0x658/0xb90 drivers/infiniband/core/uverbs_cmd.c:1052 ib_uverbs_create_cq+0xc7/0x120 drivers/infiniband/core/uverbs_cmd.c:1095 ib_uverbs_write+0x969/0xc90 drivers/infiniband/core/uverbs_main.c:679 vfs_write fs/read_write.c:677 [inline] vfs_write+0x26a/0xcc0 fs/read_write.c:659 ksys_write+0x1b8/0x200 fs/read_write.c:731 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xaa/0x1b0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f
In the function rxe_create_cq, when rxe_cq_from_init fails, the function rxe_cleanup will be called to handle the allocated resources. In fact, some memory resources have already been freed in the function rxe_cq_from_init. Thus, this problem will occur.
The solution is to let rxe_cleanup do all the work.
Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://paste.ubuntu.com/p/tJgC42wDf6/ Tested-by: liuyi liuy22@mails.tsinghua.edu.cn Signed-off-by: Zhu Yanjun yanjun.zhu@linux.dev Link: https://patch.msgid.link/20250412075714.3257358-1-yanjun.zhu@linux.dev Reviewed-by: Daisuke Matsuda matsuda-daisuke@fujitsu.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/sw/rxe/rxe_cq.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c index 4eedaa0244b39..f22f8e950baef 100644 --- a/drivers/infiniband/sw/rxe/rxe_cq.c +++ b/drivers/infiniband/sw/rxe/rxe_cq.c @@ -71,11 +71,8 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
err = do_mmap_info(rxe, uresp ? &uresp->mi : NULL, udata, cq->queue->buf, cq->queue->buf_size, &cq->queue->ip); - if (err) { - vfree(cq->queue->buf); - kfree(cq->queue); + if (err) return err; - }
if (uresp) cq->is_user = 1;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Lingfeng lilingfeng3@huawei.com
[ Upstream commit c457dc1ec770a22636b473ce5d35614adfe97636 ]
When memory is insufficient, the allocation of nfs_lock_context in nfs_get_lock_context() fails and returns -ENOMEM. If we mistakenly treat an nfs4_unlockdata structure (whose l_ctx member has been set to -ENOMEM) as valid and proceed to execute rpc_run_task(), this will trigger a NULL pointer dereference in nfs4_locku_prepare. For example:
BUG: kernel NULL pointer dereference, address: 000000000000000c PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 15 UID: 0 PID: 12 Comm: kworker/u64:0 Not tainted 6.15.0-rc2-dirty #60 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 Workqueue: rpciod rpc_async_schedule RIP: 0010:nfs4_locku_prepare+0x35/0xc2 Code: 89 f2 48 89 fd 48 c7 c7 68 69 ef b5 53 48 8b 8e 90 00 00 00 48 89 f3 RSP: 0018:ffffbbafc006bdb8 EFLAGS: 00010246 RAX: 000000000000004b RBX: ffff9b964fc1fa00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: fffffffffffffff4 RDI: ffff9ba53fddbf40 RBP: ffff9ba539934000 R08: 0000000000000000 R09: ffffbbafc006bc38 R10: ffffffffb6b689c8 R11: 0000000000000003 R12: ffff9ba539934030 R13: 0000000000000001 R14: 0000000004248060 R15: ffffffffb56d1c30 FS: 0000000000000000(0000) GS:ffff9ba5881f0000(0000) knlGS:00000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000000c CR3: 000000093f244000 CR4: 00000000000006f0 Call Trace: <TASK> __rpc_execute+0xbc/0x480 rpc_async_schedule+0x2f/0x40 process_one_work+0x232/0x5d0 worker_thread+0x1da/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0x10d/0x240 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Modules linked in: CR2: 000000000000000c ---[ end trace 0000000000000000 ]---
Free the allocated nfs4_unlockdata when nfs_get_lock_context() fails and return NULL to terminate subsequent rpc_run_task, preventing NULL pointer dereference.
Fixes: f30cb757f680 ("NFS: Always wait for I/O completion before unlock") Signed-off-by: Li Lingfeng lilingfeng3@huawei.com Reviewed-by: Jeff Layton jlayton@kernel.org Link: https://lore.kernel.org/r/20250417072508.3850532-1-lilingfeng3@huawei.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4proc.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 4a0691aeb7c1d..e4b3f25bb8e48 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -6835,10 +6835,18 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl, struct nfs4_unlockdata *p; struct nfs4_state *state = lsp->ls_state; struct inode *inode = state->inode; + struct nfs_lock_context *l_ctx;
p = kzalloc(sizeof(*p), GFP_KERNEL); if (p == NULL) return NULL; + l_ctx = nfs_get_lock_context(ctx); + if (!IS_ERR(l_ctx)) { + p->l_ctx = l_ctx; + } else { + kfree(p); + return NULL; + } p->arg.fh = NFS_FH(inode); p->arg.fl = &p->fl; p->arg.seqid = seqid; @@ -6846,7 +6854,6 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl, p->lsp = lsp; /* Ensure we don't close file until we're done freeing locks! */ p->ctx = get_nfs_open_context(ctx); - p->l_ctx = nfs_get_lock_context(ctx); locks_init_lock(&p->fl); locks_copy_lock(&p->fl, fl); p->server = NFS_SERVER(inode);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit a73fa3690a1f3014d6677e368dce4e70767a6ba2 ]
spi_test_print_hex_dump() prints buffers holding less than 1024 bytes in full. Larger buffers are truncated: only the first 512 and the last 512 bytes are printed, separated by a truncation message. The latter is confusing in case the buffer holds exactly 1024 bytes, as all data is printed anyway.
Fix this by printing buffers holding up to and including 1024 bytes in full.
Fixes: 84e0c4e5e2c4ef42 ("spi: add loopback test driver to allow for spi_master regression tests") Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://patch.msgid.link/37ee1bc90c6554c9347040adabf04188c8f704aa.1746184171... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-loopback-test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/spi/spi-loopback-test.c b/drivers/spi/spi-loopback-test.c index 4d4f77a186a98..89fccb9da1b8e 100644 --- a/drivers/spi/spi-loopback-test.c +++ b/drivers/spi/spi-loopback-test.c @@ -383,7 +383,7 @@ MODULE_LICENSE("GPL"); static void spi_test_print_hex_dump(char *pre, const void *ptr, size_t len) { /* limit the hex_dump */ - if (len < 1024) { + if (len <= 1024) { print_hex_dump(KERN_INFO, pre, DUMP_PREFIX_OFFSET, 16, 1, ptr, len, 0);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit 2d3cbfd6d54a2c39ce3244f33f85c595844bd7b8 ]
Previously, when reducing a qdisc's limit via the ->change() operation, only the main skb queue was trimmed, potentially leaving packets in the gso_skb list. This could result in NULL pointer dereference when we only check sch->limit against sch->q.qlen.
This patch introduces a new helper, qdisc_dequeue_internal(), which ensures both the gso_skb list and the main queue are properly flushed when trimming excess packets. All relevant qdiscs (codel, fq, fq_codel, fq_pie, hhf, pie) are updated to use this helper in their ->change() routines.
Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM") Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM") Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler") Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler") Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc") Fixes: d4b36210c2e6 ("net: pkt_sched: PIE AQM scheme") Reported-by: Will willsroot@protonmail.com Reported-by: Savy savy@syst3mfailure.io Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sch_generic.h | 15 +++++++++++++++ net/sched/sch_codel.c | 2 +- net/sched/sch_fq.c | 2 +- net/sched/sch_fq_codel.c | 2 +- net/sched/sch_fq_pie.c | 2 +- net/sched/sch_hhf.c | 2 +- net/sched/sch_pie.c | 2 +- 7 files changed, 21 insertions(+), 6 deletions(-)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 0919dfd3a67a6..55127305478df 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -1035,6 +1035,21 @@ static inline struct sk_buff *__qdisc_dequeue_head(struct qdisc_skb_head *qh) return skb; }
+static inline struct sk_buff *qdisc_dequeue_internal(struct Qdisc *sch, bool direct) +{ + struct sk_buff *skb; + + skb = __skb_dequeue(&sch->gso_skb); + if (skb) { + sch->q.qlen--; + return skb; + } + if (direct) + return __qdisc_dequeue_head(&sch->q); + else + return sch->dequeue(sch); +} + static inline struct sk_buff *qdisc_dequeue_head(struct Qdisc *sch) { struct sk_buff *skb = __qdisc_dequeue_head(&sch->q); diff --git a/net/sched/sch_codel.c b/net/sched/sch_codel.c index 30169b3adbbb0..d99c7386e24e6 100644 --- a/net/sched/sch_codel.c +++ b/net/sched/sch_codel.c @@ -174,7 +174,7 @@ static int codel_change(struct Qdisc *sch, struct nlattr *opt,
qlen = sch->q.qlen; while (sch->q.qlen > sch->limit) { - struct sk_buff *skb = __qdisc_dequeue_head(&sch->q); + struct sk_buff *skb = qdisc_dequeue_internal(sch, true);
dropped += qdisc_pkt_len(skb); qdisc_qstats_backlog_dec(sch, skb); diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c index 5a1274199fe33..65b12b39e2ec5 100644 --- a/net/sched/sch_fq.c +++ b/net/sched/sch_fq.c @@ -904,7 +904,7 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt, sch_tree_lock(sch); } while (sch->q.qlen > sch->limit) { - struct sk_buff *skb = fq_dequeue(sch); + struct sk_buff *skb = qdisc_dequeue_internal(sch, false);
if (!skb) break; diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c index efda894bbb78b..f954969ea8fec 100644 --- a/net/sched/sch_fq_codel.c +++ b/net/sched/sch_fq_codel.c @@ -429,7 +429,7 @@ static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt,
while (sch->q.qlen > sch->limit || q->memory_usage > q->memory_limit) { - struct sk_buff *skb = fq_codel_dequeue(sch); + struct sk_buff *skb = qdisc_dequeue_internal(sch, false);
q->cstats.drop_len += qdisc_pkt_len(skb); rtnl_kfree_skbs(skb, skb); diff --git a/net/sched/sch_fq_pie.c b/net/sched/sch_fq_pie.c index 1fb68c973f451..30259c8756451 100644 --- a/net/sched/sch_fq_pie.c +++ b/net/sched/sch_fq_pie.c @@ -360,7 +360,7 @@ static int fq_pie_change(struct Qdisc *sch, struct nlattr *opt,
/* Drop excess packets if new limit is lower */ while (sch->q.qlen > sch->limit) { - struct sk_buff *skb = fq_pie_qdisc_dequeue(sch); + struct sk_buff *skb = qdisc_dequeue_internal(sch, false);
len_dropped += qdisc_pkt_len(skb); num_dropped += 1; diff --git a/net/sched/sch_hhf.c b/net/sched/sch_hhf.c index 420ede8753229..433bddcbc0c72 100644 --- a/net/sched/sch_hhf.c +++ b/net/sched/sch_hhf.c @@ -563,7 +563,7 @@ static int hhf_change(struct Qdisc *sch, struct nlattr *opt, qlen = sch->q.qlen; prev_backlog = sch->qstats.backlog; while (sch->q.qlen > sch->limit) { - struct sk_buff *skb = hhf_dequeue(sch); + struct sk_buff *skb = qdisc_dequeue_internal(sch, false);
rtnl_kfree_skbs(skb, skb); } diff --git a/net/sched/sch_pie.c b/net/sched/sch_pie.c index 5a457ff61acd8..67ce65af52b5c 100644 --- a/net/sched/sch_pie.c +++ b/net/sched/sch_pie.c @@ -193,7 +193,7 @@ static int pie_change(struct Qdisc *sch, struct nlattr *opt, /* Drop excess packets if new limit is lower */ qlen = sch->q.qlen; while (sch->q.qlen > sch->limit) { - struct sk_buff *skb = __qdisc_dequeue_head(&sch->q); + struct sk_buff *skb = qdisc_dequeue_internal(sch, true);
dropped += qdisc_pkt_len(skb); qdisc_qstats_backlog_dec(sch, skb);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathieu Othacehe othacehe@gnu.org
[ Upstream commit c92d6089d8ad7d4d815ebcedee3f3907b539ff1f ]
There is a situation where after THALT is set high, TGO stays high as well. Because jiffies are never updated, as we are in a context with interrupts disabled, we never exit that loop and have a deadlock.
That deadlock was noticed on a sama5d4 device that stayed locked for days.
Use retries instead of jiffies so that the timeout really works and we do not have a deadlock anymore.
Fixes: e86cd53afc590 ("net/macb: better manage tx errors") Signed-off-by: Mathieu Othacehe othacehe@gnu.org Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20250509121935.16282-1-othacehe@gnu.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cadence/macb_main.c | 19 ++++++------------- 1 file changed, 6 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 275baaaea0e12..667af80a739b9 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -986,22 +986,15 @@ static void macb_update_stats(struct macb *bp)
static int macb_halt_tx(struct macb *bp) { - unsigned long halt_time, timeout; - u32 status; + u32 status;
macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(THALT));
- timeout = jiffies + usecs_to_jiffies(MACB_HALT_TIMEOUT); - do { - halt_time = jiffies; - status = macb_readl(bp, TSR); - if (!(status & MACB_BIT(TGO))) - return 0; - - udelay(250); - } while (time_before(halt_time, timeout)); - - return -ETIMEDOUT; + /* Poll TSR until TGO is cleared or timeout. */ + return read_poll_timeout_atomic(macb_readl, status, + !(status & MACB_BIT(TGO)), + 250, MACB_HALT_TIMEOUT, false, + bp, TSR); }
static void macb_tx_unmap(struct macb *bp, struct macb_tx_skb *tx_skb)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 498625a8ab2c8e1c9ab5105744310e8d6952cc01 ]
It has been reported that when under a bridge with stp_state=1, the logs get spammed with this message:
[ 251.734607] fsl_dpaa2_eth dpni.5 eth0: Couldn't decode source port
Further debugging shows the following info associated with packets: source_port=-1, switch_id=-1, vid=-1, vbid=1
In other words, they are data plane packets which are supposed to be decoded by dsa_tag_8021q_find_port_by_vbid(), but the latter (correctly) refuses to do so, because no switch port is currently in BR_STATE_LEARNING or BR_STATE_FORWARDING - so the packet is effectively unexpected.
The error goes away after the port progresses to BR_STATE_LEARNING in 15 seconds (the default forward_time of the bridge), because then, dsa_tag_8021q_find_port_by_vbid() can correctly associate the data plane packets with a plausible bridge port in a plausible STP state.
Re-reading IEEE 802.1D-1990, I see the following:
"4.4.2 Learning: (...) The Forwarding Process shall discard received frames."
IEEE 802.1D-2004 further clarifies:
"DISABLED, BLOCKING, LISTENING, and BROKEN all correspond to the DISCARDING port state. While those dot1dStpPortStates serve to distinguish reasons for discarding frames, the operation of the Forwarding and Learning processes is the same for all of them. (...) LISTENING represents a port that the spanning tree algorithm has selected to be part of the active topology (computing a Root Port or Designated Port role) but is temporarily discarding frames to guard against loops or incorrect learning."
Well, this is not what the driver does - instead it sets mac[port].ingress = true.
To get rid of the log spam, prevent unexpected data plane packets to be received by software by discarding them on ingress in the LISTENING state.
In terms of blame attribution: the prints only date back to commit d7f9787a763f ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID"). However, the settings would permit a LISTENING port to forward to a FORWARDING port, and the standard suggests that's not OK.
Fixes: 640f763f98c2 ("net: dsa: sja1105: Add support for Spanning Tree Protocol") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://patch.msgid.link/20250509113816.2221992-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/sja1105/sja1105_main.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index 888f10d93b9ab..ec1c0ad591184 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -1969,6 +1969,7 @@ static void sja1105_bridge_stp_state_set(struct dsa_switch *ds, int port, switch (state) { case BR_STATE_DISABLED: case BR_STATE_BLOCKING: + case BR_STATE_LISTENING: /* From UM10944 description of DRPDTAG (why put this there?): * "Management traffic flows to the port regardless of the state * of the INGRESS flag". So BPDUs are still be allowed to pass. @@ -1978,11 +1979,6 @@ static void sja1105_bridge_stp_state_set(struct dsa_switch *ds, int port, mac[port].egress = false; mac[port].dyn_learn = false; break; - case BR_STATE_LISTENING: - mac[port].ingress = true; - mac[port].egress = false; - mac[port].dyn_learn = false; - break; case BR_STATE_LEARNING: mac[port].ingress = true; mac[port].egress = false;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit 66e48ef6ef506c89ec1b3851c6f9f5f80b5835ff ]
If CONFIG_SH_DMA_API=n:
WARNING: unmet direct dependencies detected for G2_DMA Depends on [n]: SH_DREAMCAST [=y] && SH_DMA_API [=n] Selected by [y]: - SND_AICA [=y] && SOUND [=y] && SND [=y] && SND_SUPERH [=y] && SH_DREAMCAST [=y]
SND_AICA selects G2_DMA. As the latter depends on SH_DMA_API, the former should depend on SH_DMA_API, too.
Fixes: f477a538c14d07f8 ("sh: dma: fix kconfig dependency for G2_DMA") Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202505131320.PzgTtl9H-lkp@intel.com/ Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://patch.msgid.link/b90625f8a9078d0d304bafe862cbe3a3fab40082.1747121335... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/sh/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/sh/Kconfig b/sound/sh/Kconfig index b75fbb3236a7b..f5fa09d740b4c 100644 --- a/sound/sh/Kconfig +++ b/sound/sh/Kconfig @@ -14,7 +14,7 @@ if SND_SUPERH
config SND_AICA tristate "Dreamcast Yamaha AICA sound" - depends on SH_DREAMCAST + depends on SH_DREAMCAST && SH_DMA_API select SND_PCM select G2_DMA help
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Abdun Nihaal abdun.nihaal@gmail.com
[ Upstream commit 9d8a99c5a7c7f4f7eca2c168a4ec254409670035 ]
In one of the error paths in qlcnic_sriov_channel_cfg_cmd(), the memory allocated in qlcnic_sriov_alloc_bc_mbx_args() for mailbox arguments is not freed. Fix that by jumping to the error path that frees them, by calling qlcnic_free_mbx_args(). This was found using static analysis.
Fixes: f197a7aa6288 ("qlcnic: VF-PF communication channel implementation") Signed-off-by: Abdun Nihaal abdun.nihaal@gmail.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20250512044829.36400-1-abdun.nihaal@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c index d7c93c409a776..3bc2f83176d03 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c @@ -1485,8 +1485,11 @@ static int qlcnic_sriov_channel_cfg_cmd(struct qlcnic_adapter *adapter, u8 cmd_o }
cmd_op = (cmd.rsp.arg[0] & 0xff); - if (cmd.rsp.arg[0] >> 25 == 2) - return 2; + if (cmd.rsp.arg[0] >> 25 == 2) { + ret = 2; + goto out; + } + if (cmd_op == QLCNIC_BC_CMD_CHANNEL_INIT) set_bit(QLC_BC_VF_STATE, &vf->state); else
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit 6d6d7f91cc8c111d40416ac9240a3bb9396c5235 ]
If there are still layout segments in the layout plh_return_lsegs list after a layout return, we should be resetting the state to ensure they eventually get returned as well.
Fixes: 68f744797edd ("pNFS: Do not free layout segments that are marked for return") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/pnfs.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index 4016cc5316230..83935bb1719ad 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -729,6 +729,14 @@ pnfs_mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo, return remaining; }
+static void pnfs_reset_return_info(struct pnfs_layout_hdr *lo) +{ + struct pnfs_layout_segment *lseg; + + list_for_each_entry(lseg, &lo->plh_return_segs, pls_list) + pnfs_set_plh_return_info(lo, lseg->pls_range.iomode, 0); +} + static void pnfs_free_returned_lsegs(struct pnfs_layout_hdr *lo, struct list_head *free_me, @@ -1177,6 +1185,7 @@ void pnfs_layoutreturn_free_lsegs(struct pnfs_layout_hdr *lo, pnfs_mark_matching_lsegs_invalid(lo, &freeme, range, seq); pnfs_free_returned_lsegs(lo, &freeme, range, seq); pnfs_set_layout_stateid(lo, stateid, NULL, true); + pnfs_reset_return_info(lo); } else pnfs_mark_layout_stateid_invalid(lo, &freeme); out_unlock:
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit 09d09531a51a24635bc3331f56d92ee7092f5516 upstream.
Have {JMP,CALL}_NOSPEC generate the same code GCC does for indirect calls and rely on the objtool retpoline patching infrastructure.
There's no reason these should be alternatives while the vast bulk of compiler generated retpolines are not.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/nospec-branch.h | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-)
--- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -119,25 +119,37 @@ .endm
/* + * Equivalent to -mindirect-branch-cs-prefix; emit the 5 byte jmp/call + * to the retpoline thunk with a CS prefix when the register requires + * a RAX prefix byte to encode. Also see apply_retpolines(). + */ +.macro __CS_PREFIX reg:req + .irp rs,r8,r9,r10,r11,r12,r13,r14,r15 + .ifc \reg,\rs + .byte 0x2e + .endif + .endr +.endm + +/* * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple * indirect jmp/call which may be susceptible to the Spectre variant 2 * attack. */ .macro JMP_NOSPEC reg:req #ifdef CONFIG_RETPOLINE - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \ - __stringify(jmp __x86_indirect_thunk_\reg), X86_FEATURE_RETPOLINE, \ - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_LFENCE + __CS_PREFIX \reg + jmp __x86_indirect_thunk_\reg #else jmp *%\reg + int3 #endif .endm
.macro CALL_NOSPEC reg:req #ifdef CONFIG_RETPOLINE - ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \ - __stringify(call __x86_indirect_thunk_\reg), X86_FEATURE_RETPOLINE, \ - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_LFENCE + __CS_PREFIX \reg + call __x86_indirect_thunk_\reg #else call *%\reg #endif
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit cfceff8526a426948b53445c02bcb98453c7330d upstream.
CALL_NOSPEC macro is used to generate Spectre-v2 mitigation friendly indirect branches. At compile time the macro defaults to indirect branch, and at runtime those can be patched to thunk based mitigations.
This approach is opposite of what is done for the rest of the kernel, where the compile time default is to replace indirect calls with retpoline thunk calls.
Make CALL_NOSPEC consistent with the rest of the kernel, default to retpoline thunk at compile time when CONFIG_RETPOLINE is enabled.
[ pawan: s/CONFIG_MITIGATION_RETPOLINE/CONFIG_RETPOLINE/ ]
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Cc: Andrew Cooper <andrew.cooper3@citrix.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Link: https://lore.kernel.org/r/20250228-call-nospec-v3-1-96599fed0f33@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/nospec-branch.h | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-)
--- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -285,16 +285,11 @@ extern retpoline_thunk_t __x86_indirect_ * Inline asm uses the %V modifier which is only in newer GCC * which is ensured when CONFIG_RETPOLINE is defined. */ -# define CALL_NOSPEC \ - ALTERNATIVE_2( \ - ANNOTATE_RETPOLINE_SAFE \ - "call *%[thunk_target]\n", \ - "call __x86_indirect_thunk_%V[thunk_target]\n", \ - X86_FEATURE_RETPOLINE, \ - "lfence;\n" \ - ANNOTATE_RETPOLINE_SAFE \ - "call *%[thunk_target]\n", \ - X86_FEATURE_RETPOLINE_LFENCE) +#ifdef CONFIG_RETPOLINE +#define CALL_NOSPEC "call __x86_indirect_thunk_%V[thunk_target]\n" +#else +#define CALL_NOSPEC "call *%[thunk_target]\n" +#endif
# define THUNK_TARGET(addr) [thunk_target] "r" (addr)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 052040e34c08428a5a388b85787e8531970c0c67 upstream.
Retpoline mitigation for spectre-v2 uses thunks for indirect branches. To support this mitigation compilers add a CS prefix with -mindirect-branch-cs-prefix. For an indirect branch in asm, this needs to be added manually.
CS prefix is already being added to indirect branches in asm files, but not in inline asm. Add CS prefix to CALL_NOSPEC for inline asm as well. There is no JMP_NOSPEC for inline asm.
Reported-by: Josh Poimboeuf jpoimboe@kernel.org Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Cc: Andrew Cooper <andrew.cooper3@citrix.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Link: https://lore.kernel.org/r/20250228-call-nospec-v3-2-96599fed0f33@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/nospec-branch.h | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
--- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -119,9 +119,8 @@ .endm
/* - * Equivalent to -mindirect-branch-cs-prefix; emit the 5 byte jmp/call - * to the retpoline thunk with a CS prefix when the register requires - * a RAX prefix byte to encode. Also see apply_retpolines(). + * Emits a conditional CS prefix that is compatible with + * -mindirect-branch-cs-prefix. */ .macro __CS_PREFIX reg:req .irp rs,r8,r9,r10,r11,r12,r13,r14,r15 @@ -282,11 +281,23 @@ extern retpoline_thunk_t __x86_indirect_ #ifdef CONFIG_X86_64
/* + * Emits a conditional CS prefix that is compatible with + * -mindirect-branch-cs-prefix. + */ +#define __CS_PREFIX(reg) \ + ".irp rs,r8,r9,r10,r11,r12,r13,r14,r15\n" \ + ".ifc \rs," reg "\n" \ + ".byte 0x2e\n" \ + ".endif\n" \ + ".endr\n" + +/* * Inline asm uses the %V modifier which is only in newer GCC * which is ensured when CONFIG_RETPOLINE is defined. */ #ifdef CONFIG_RETPOLINE -#define CALL_NOSPEC "call __x86_indirect_thunk_%V[thunk_target]\n" +#define CALL_NOSPEC __CS_PREFIX("%V[thunk_target]") \ + "call __x86_indirect_thunk_%V[thunk_target]\n" #else #define CALL_NOSPEC "call *%[thunk_target]\n" #endif
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit c8c81458863ab686cda4fe1e603fccaae0f12460 upstream.
Commit:
010c4a461c1d ("x86/speculation: Simplify and make CALL_NOSPEC consistent")
added an #ifdef CONFIG_RETPOLINE around the CALL_NOSPEC definition. This is not required as this code is already under a larger #ifdef.
Remove the extra #ifdef, no functional change.
vmlinux size remains same before and after this change:
CONFIG_RETPOLINE=y: text data bss dec hex filename 25434752 7342290 2301212 35078254 217406e vmlinux.before 25434752 7342290 2301212 35078254 217406e vmlinux.after
# CONFIG_RETPOLINE is not set: text data bss dec hex filename 22943094 6214994 1550152 30708240 1d49210 vmlinux.before 22943094 6214994 1550152 30708240 1d49210 vmlinux.after
[ pawan: s/CONFIG_MITIGATION_RETPOLINE/CONFIG_RETPOLINE/ ]
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org Link: https://lore.kernel.org/r/20250320-call-nospec-extra-ifdef-v1-1-d9b084d24820... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/nospec-branch.h | 4 ---- 1 file changed, 4 deletions(-)
--- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -295,12 +295,8 @@ extern retpoline_thunk_t __x86_indirect_ * Inline asm uses the %V modifier which is only in newer GCC * which is ensured when CONFIG_RETPOLINE is defined. */ -#ifdef CONFIG_RETPOLINE #define CALL_NOSPEC __CS_PREFIX("%V[thunk_target]") \ "call __x86_indirect_thunk_%V[thunk_target]\n" -#else -#define CALL_NOSPEC "call *%[thunk_target]\n" -#endif
# define THUNK_TARGET(addr) [thunk_target] "r" (addr)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 1ac116ce6468670eeda39345a5585df308243dca upstream.
Add the admin-guide for Indirect Target Selection (ITS).
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/hw-vuln/index.rst | 1 Documentation/admin-guide/hw-vuln/indirect-target-selection.rst | 156 ++++++++++ 2 files changed, 157 insertions(+)
--- a/Documentation/admin-guide/hw-vuln/index.rst +++ b/Documentation/admin-guide/hw-vuln/index.rst @@ -22,3 +22,4 @@ are configurable at compile, boot or run gather_data_sampling.rst srso reg-file-data-sampling + indirect-target-selection --- /dev/null +++ b/Documentation/admin-guide/hw-vuln/indirect-target-selection.rst @@ -0,0 +1,156 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Indirect Target Selection (ITS) +=============================== + +ITS is a vulnerability in some Intel CPUs that support Enhanced IBRS and were +released before Alder Lake. ITS may allow an attacker to control the prediction +of indirect branches and RETs located in the lower half of a cacheline. + +ITS is assigned CVE-2024-28956 with a CVSS score of 4.7 (Medium). + +Scope of Impact +--------------- +- **eIBRS Guest/Host Isolation**: Indirect branches in KVM/kernel may still be + predicted with unintended target corresponding to a branch in the guest. + +- **Intra-Mode BTI**: In-kernel training such as through cBPF or other native + gadgets. + +- **Indirect Branch Prediction Barrier (IBPB)**: After an IBPB, indirect + branches may still be predicted with targets corresponding to direct branches + executed prior to the IBPB. This is fixed by the IPU 2025.1 microcode, which + should be available via distro updates. Alternatively microcode can be + obtained from Intel's github repository [#f1]_. + +Affected CPUs +------------- +Below is the list of ITS affected CPUs [#f2]_ [#f3]_: + + ======================== ============ ==================== =============== + Common name Family_Model eIBRS Intra-mode BTI + Guest/Host Isolation + ======================== ============ ==================== =============== + SKYLAKE_X (step >= 6) 06_55H Affected Affected + ICELAKE_X 06_6AH Not affected Affected + ICELAKE_D 06_6CH Not affected Affected + ICELAKE_L 06_7EH Not affected Affected + TIGERLAKE_L 06_8CH Not affected Affected + TIGERLAKE 06_8DH Not affected Affected + KABYLAKE_L (step >= 12) 06_8EH Affected Affected + KABYLAKE (step >= 13) 06_9EH Affected Affected + COMETLAKE 06_A5H Affected Affected + COMETLAKE_L 06_A6H Affected Affected + ROCKETLAKE 06_A7H Not affected Affected + ======================== ============ ==================== =============== + +- All affected CPUs enumerate Enhanced IBRS feature. +- IBPB isolation is affected on all ITS affected CPUs, and need a microcode + update for mitigation. +- None of the affected CPUs enumerate BHI_CTRL which was introduced in Golden + Cove (Alder Lake and Sapphire Rapids). This can help guests to determine the + host's affected status. +- Intel Atom CPUs are not affected by ITS. + +Mitigation +---------- +As only the indirect branches and RETs that have their last byte of instruction +in the lower half of the cacheline are vulnerable to ITS, the basic idea behind +the mitigation is to not allow indirect branches in the lower half. + +This is achieved by relying on existing retpoline support in the kernel, and in +compilers. ITS-vulnerable retpoline sites are runtime patched to point to newly +added ITS-safe thunks. These safe thunks consists of indirect branch in the +second half of the cacheline. Not all retpoline sites are patched to thunks, if +a retpoline site is evaluated to be ITS-safe, it is replaced with an inline +indirect branch. + +Dynamic thunks +~~~~~~~~~~~~~~ +From a dynamically allocated pool of safe-thunks, each vulnerable site is +replaced with a new thunk, such that they get a unique address. This could +improve the branch prediction accuracy. Also, it is a defense-in-depth measure +against aliasing. + +Note, for simplicity, indirect branches in eBPF programs are always replaced +with a jump to a static thunk in __x86_indirect_its_thunk_array. If required, +in future this can be changed to use dynamic thunks. + +All vulnerable RETs are replaced with a static thunk, they do not use dynamic +thunks. This is because RETs get their prediction from RSB mostly that does not +depend on source address. RETs that underflow RSB may benefit from dynamic +thunks. But, RETs significantly outnumber indirect branches, and any benefit +from a unique source address could be outweighed by the increased icache +footprint and iTLB pressure. + +Retpoline +~~~~~~~~~ +Retpoline sequence also mitigates ITS-unsafe indirect branches. For this +reason, when retpoline is enabled, ITS mitigation only relocates the RETs to +safe thunks. Unless user requested the RSB-stuffing mitigation. + +Mitigation in guests +^^^^^^^^^^^^^^^^^^^^ +All guests deploy ITS mitigation by default, irrespective of eIBRS enumeration +and Family/Model of the guest. This is because eIBRS feature could be hidden +from a guest. One exception to this is when a guest enumerates BHI_DIS_S, which +indicates that the guest is running on an unaffected host. + +To prevent guests from unnecessarily deploying the mitigation on unaffected +platforms, Intel has defined ITS_NO bit(62) in MSR IA32_ARCH_CAPABILITIES. When +a guest sees this bit set, it should not enumerate the ITS bug. Note, this bit +is not set by any hardware, but is **intended for VMMs to synthesize** it for +guests as per the host's affected status. + +Mitigation options +^^^^^^^^^^^^^^^^^^ +The ITS mitigation can be controlled using the "indirect_target_selection" +kernel parameter. The available options are: + + ======== =================================================================== + on (default) Deploy the "Aligned branch/return thunks" mitigation. + If spectre_v2 mitigation enables retpoline, aligned-thunks are only + deployed for the affected RET instructions. Retpoline mitigates + indirect branches. + + off Disable ITS mitigation. + + vmexit Equivalent to "=on" if the CPU is affected by guest/host isolation + part of ITS. Otherwise, mitigation is not deployed. This option is + useful when host userspace is not in the threat model, and only + attacks from guest to host are considered. + + force Force the ITS bug and deploy the default mitigation. + ======== =================================================================== + +Sysfs reporting +--------------- + +The sysfs file showing ITS mitigation status is: + + /sys/devices/system/cpu/vulnerabilities/indirect_target_selection + +Note, microcode mitigation status is not reported in this file. + +The possible values in this file are: + +.. list-table:: + + * - Not affected + - The processor is not vulnerable. + * - Vulnerable + - System is vulnerable and no mitigation has been applied. + * - Vulnerable, KVM: Not affected + - System is vulnerable to intra-mode BTI, but not affected by eIBRS + guest/host isolation. + * - Mitigation: Aligned branch/return thunks + - The mitigation is enabled, affected indirect branches and RETs are + relocated to safe thunks. + +References +---------- +.. [#f1] Microcode repository - https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files + +.. [#f2] Affected Processors list - https://www.intel.com/content/www/us/en/developer/topic-technology/software-... + +.. [#f3] Affected Processors list (machine readable) - https://github.com/intel/Intel-affected-processor-list
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 159013a7ca18c271ff64192deb62a689b622d860 upstream.
ITS bug in some pre-Alderlake Intel CPUs may allow indirect branches in the first half of a cache line get predicted to a target of a branch located in the second half of the cache line.
Set X86_BUG_ITS on affected CPUs. Mitigation to follow in later commits.
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/cpufeatures.h | 1 arch/x86/include/asm/msr-index.h | 8 +++++ arch/x86/kernel/cpu/common.c | 58 +++++++++++++++++++++++++++++-------- arch/x86/kvm/x86.c | 4 +- 4 files changed, 58 insertions(+), 13 deletions(-)
--- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -483,4 +483,5 @@ #define X86_BUG_RFDS X86_BUG(1*32 + 2) /* CPU is vulnerable to Register File Data Sampling */ #define X86_BUG_BHI X86_BUG(1*32 + 3) /* CPU is affected by Branch History Injection */ #define X86_BUG_IBPB_NO_RET X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */ +#define X86_BUG_ITS X86_BUG(1*32 + 5) /* CPU is affected by Indirect Target Selection */ #endif /* _ASM_X86_CPUFEATURES_H */ --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -183,6 +183,14 @@ * VERW clears CPU Register * File. */ +#define ARCH_CAP_ITS_NO BIT_ULL(62) /* + * Not susceptible to + * Indirect Target Selection. + * This bit is not set by + * HW, but is synthesized by + * VMMs for guests to know + * their affected status. + */
#define MSR_IA32_FLUSH_CMD 0x0000010b #define L1D_FLUSH BIT(0) /* --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1141,6 +1141,8 @@ static const __initconst struct x86_cpu_ #define GDS BIT(6) /* CPU is affected by Register File Data Sampling */ #define RFDS BIT(7) +/* CPU is affected by Indirect Target Selection */ +#define ITS BIT(8)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS), @@ -1152,22 +1154,25 @@ static const struct x86_cpu_id cpu_vuln_ VULNBL_INTEL_STEPPINGS(BROADWELL_G, X86_STEPPING_ANY, SRBDS), VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO), VULNBL_INTEL_STEPPINGS(BROADWELL, X86_STEPPING_ANY, SRBDS), - VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED | GDS), + VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPINGS(0x0, 0x5), MMIO | RETBLEED | GDS), + VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | ITS), VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS), VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS), - VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS), - VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS), + VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x0, 0xb), MMIO | RETBLEED | GDS | SRBDS), + VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS | ITS), + VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0xc), MMIO | RETBLEED | GDS | SRBDS), + VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS | ITS), VULNBL_INTEL_STEPPINGS(CANNONLAKE_L, X86_STEPPING_ANY, RETBLEED), - VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS), - VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS), - VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS), - VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS), - VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED), - VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS), - VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS), - VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS), + VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), + VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS | ITS), + VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS | ITS), + VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), + VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED | ITS), + VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), + VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS | ITS), + VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS | ITS), VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED), - VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS), + VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | ITS), VULNBL_INTEL_STEPPINGS(ALDERLAKE, X86_STEPPING_ANY, RFDS), VULNBL_INTEL_STEPPINGS(ALDERLAKE_L, X86_STEPPING_ANY, RFDS), VULNBL_INTEL_STEPPINGS(RAPTORLAKE, X86_STEPPING_ANY, RFDS), @@ -1231,6 +1236,32 @@ static bool __init vulnerable_to_rfds(u6 return cpu_matches(cpu_vuln_blacklist, RFDS); }
+static bool __init vulnerable_to_its(u64 x86_arch_cap_msr) +{ + /* The "immunity" bit trumps everything else: */ + if (x86_arch_cap_msr & ARCH_CAP_ITS_NO) + return false; + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) + return false; + + /* None of the affected CPUs have BHI_CTRL */ + if (boot_cpu_has(X86_FEATURE_BHI_CTRL)) + return false; + + /* + * If a VMM did not expose ITS_NO, assume that a guest could + * be running on a vulnerable hardware or may migrate to such + * hardware. + */ + if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) + return true; + + if (cpu_matches(cpu_vuln_blacklist, ITS)) + return true; + + return false; +} + static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) { u64 x86_arch_cap_msr = x86_read_arch_cap_msr(); @@ -1358,6 +1389,9 @@ static void __init cpu_set_bug_bits(stru if (cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET)) setup_force_cpu_bug(X86_BUG_IBPB_NO_RET);
+ if (vulnerable_to_its(x86_arch_cap_msr)) + setup_force_cpu_bug(X86_BUG_ITS); + if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) return;
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1499,7 +1499,7 @@ static unsigned int num_msr_based_featur ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \ ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \ ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO | \ - ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_BHI_NO) + ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR | ARCH_CAP_BHI_NO | ARCH_CAP_ITS_NO)
static u64 kvm_get_arch_capabilities(void) { @@ -1538,6 +1538,8 @@ static u64 kvm_get_arch_capabilities(voi data |= ARCH_CAP_MDS_NO; if (!boot_cpu_has_bug(X86_BUG_RFDS)) data |= ARCH_CAP_RFDS_NO; + if (!boot_cpu_has_bug(X86_BUG_ITS)) + data |= ARCH_CAP_ITS_NO;
if (!boot_cpu_has(X86_FEATURE_RTM)) { /*
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 8754e67ad4ac692c67ff1f99c0d07156f04ae40c upstream.
Due to ITS, indirect branches in the lower half of a cacheline may be vulnerable to branch target injection attack.
Introduce ITS-safe thunks to patch indirect branches in the lower half of cacheline with the thunk. Also thunk any eBPF generated indirect branches in emit_indirect_jump().
Below category of indirect branches are not mitigated:
- Indirect branches in the .init section are not mitigated because they are discarded after boot. - Indirect branches that are explicitly marked retpoline-safe.
Note that retpoline also mitigates the indirect branches against ITS. This is because the retpoline sequence fills an RSB entry before RET, and it does not suffer from RSB-underflow part of the ITS.
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/Kconfig | 11 +++++ arch/x86/include/asm/cpufeatures.h | 1 arch/x86/include/asm/nospec-branch.h | 5 ++ arch/x86/kernel/alternative.c | 77 +++++++++++++++++++++++++++++++++++ arch/x86/kernel/vmlinux.lds.S | 6 ++ arch/x86/lib/retpoline.S | 28 ++++++++++++ arch/x86/net/bpf_jit_comp.c | 6 ++ 7 files changed, 133 insertions(+), 1 deletion(-)
--- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2517,6 +2517,17 @@ config MITIGATION_SPECTRE_BHI indirect branches. See file:Documentation/admin-guide/hw-vuln/spectre.rst
+config MITIGATION_ITS + bool "Enable Indirect Target Selection mitigation" + depends on CPU_SUP_INTEL && X86_64 + depends on RETPOLINE && RETHUNK + default y + help + Enable Indirect Target Selection (ITS) mitigation. ITS is a bug in + BPU on some Intel CPUs that may allow Spectre V2 style attacks. If + disabled, mitigation cannot be enabled via cmdline. + See file:Documentation/admin-guide/hw-vuln/indirect-target-selection.rst + endif
config ARCH_HAS_ADD_PAGES --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -433,6 +433,7 @@ #define X86_FEATURE_BHI_CTRL (21*32+ 2) /* "" BHI_DIS_S HW control available */ #define X86_FEATURE_CLEAR_BHB_HW (21*32+ 3) /* "" BHI_DIS_S HW control enabled */ #define X86_FEATURE_CLEAR_BHB_LOOP_ON_VMEXIT (21*32+ 4) /* "" Clear branch history at vmexit using SW loop */ +#define X86_FEATURE_INDIRECT_THUNK_ITS (21*32 + 5) /* "" Use thunk for indirect branches in lower half of cacheline */
/* * BUG word(s) --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -271,6 +271,11 @@ extern void (*x86_return_thunk)(void);
typedef u8 retpoline_thunk_t[RETPOLINE_THUNK_SIZE];
+#define ITS_THUNK_SIZE 64 +typedef u8 its_thunk_t[ITS_THUNK_SIZE]; + +extern its_thunk_t __x86_indirect_its_thunk_array[]; + #define GEN(reg) \ extern retpoline_thunk_t __x86_indirect_thunk_ ## reg; #include <asm/GEN-for-each-reg.h> --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -395,6 +395,74 @@ static int emit_indirect(int op, int reg return i; }
+#ifdef CONFIG_MITIGATION_ITS + +static int __emit_trampoline(void *addr, struct insn *insn, u8 *bytes, + void *call_dest, void *jmp_dest) +{ + u8 op = insn->opcode.bytes[0]; + int i = 0; + + /* + * Clang does 'weird' Jcc __x86_indirect_thunk_r11 conditional + * tail-calls. Deal with them. + */ + if (is_jcc32(insn)) { + bytes[i++] = op; + op = insn->opcode.bytes[1]; + goto clang_jcc; + } + + if (insn->length == 6) + bytes[i++] = 0x2e; /* CS-prefix */ + + switch (op) { + case CALL_INSN_OPCODE: + __text_gen_insn(bytes+i, op, addr+i, + call_dest, + CALL_INSN_SIZE); + i += CALL_INSN_SIZE; + break; + + case JMP32_INSN_OPCODE: +clang_jcc: + __text_gen_insn(bytes+i, op, addr+i, + jmp_dest, + JMP32_INSN_SIZE); + i += JMP32_INSN_SIZE; + break; + + default: + WARN(1, "%pS %px %*ph\n", addr, addr, 6, addr); + return -1; + } + + WARN_ON_ONCE(i != insn->length); + + return i; +} + +static int emit_its_trampoline(void *addr, struct insn *insn, int reg, u8 *bytes) +{ + return __emit_trampoline(addr, insn, bytes, + __x86_indirect_its_thunk_array[reg], + __x86_indirect_its_thunk_array[reg]); +} + +/* Check if an indirect branch is at ITS-unsafe address */ +static bool cpu_wants_indirect_its_thunk_at(unsigned long addr, int reg) +{ + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return false; + + /* Indirect branch opcode is 2 or 3 bytes depending on reg */ + addr += 1 + reg / 8; + + /* Lower-half of the cacheline? */ + return !(addr & 0x20); +} +#endif + /* * Rewrite the compiler generated retpoline thunk calls. * @@ -466,6 +534,15 @@ static int patch_retpoline(void *addr, s bytes[i++] = 0xe8; /* LFENCE */ }
+#ifdef CONFIG_MITIGATION_ITS + /* + * Check if the address of last byte of emitted-indirect is in + * lower-half of the cacheline. Such branches need ITS mitigation. + */ + if (cpu_wants_indirect_its_thunk_at((unsigned long)addr + i, reg)) + return emit_its_trampoline(addr, insn, reg, bytes); +#endif + ret = emit_indirect(op, reg, bytes + i); if (ret < 0) return ret; --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -532,6 +532,12 @@ INIT_PER_CPU(irq_stack_backing_store); "SRSO function pair won't alias"); #endif
+#if defined(CONFIG_MITIGATION_ITS) && !defined(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B) +. = ASSERT(__x86_indirect_its_thunk_rax & 0x20, "__x86_indirect_thunk_rax not in second half of cacheline"); +. = ASSERT(((__x86_indirect_its_thunk_rcx - __x86_indirect_its_thunk_rax) % 64) == 0, "Indirect thunks are not cacheline apart"); +. = ASSERT(__x86_indirect_its_thunk_array == __x86_indirect_its_thunk_rax, "Gap in ITS thunk array"); +#endif + #endif /* CONFIG_X86_64 */
#ifdef CONFIG_KEXEC_CORE --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -254,6 +254,34 @@ SYM_FUNC_START(entry_untrain_ret) SYM_FUNC_END(entry_untrain_ret) __EXPORT_THUNK(entry_untrain_ret)
+#ifdef CONFIG_MITIGATION_ITS + +.macro ITS_THUNK reg + +SYM_INNER_LABEL(__x86_indirect_its_thunk_\reg, SYM_L_GLOBAL) + UNWIND_HINT_EMPTY + ANNOTATE_NOENDBR + ANNOTATE_RETPOLINE_SAFE + jmp *%\reg + int3 + .align 32, 0xcc /* fill to the end of the line */ + .skip 32, 0xcc /* skip to the next upper half */ +.endm + +/* ITS mitigation requires thunks be aligned to upper half of cacheline */ +.align 64, 0xcc +.skip 32, 0xcc +SYM_CODE_START(__x86_indirect_its_thunk_array) + +#define GEN(reg) ITS_THUNK reg +#include <asm/GEN-for-each-reg.h> +#undef GEN + + .align 64, 0xcc +SYM_CODE_END(__x86_indirect_its_thunk_array) + +#endif + SYM_CODE_START(__x86_return_thunk) UNWIND_HINT_FUNC ANNOTATE_NOENDBR --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -446,7 +446,11 @@ static void emit_indirect_jump(u8 **ppro u8 *prog = *pprog;
#ifdef CONFIG_RETPOLINE - if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) { + if (IS_ENABLED(CONFIG_MITIGATION_ITS) && + cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) { + OPTIMIZER_HIDE_VAR(reg); + emit_jump(&prog, &__x86_indirect_its_thunk_array[reg], ip); + } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) { EMIT_LFENCE(); EMIT2(0xFF, 0xE0 + reg); } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Borislav Petkov (AMD)" bp@alien8.de
commit d2408e043e7296017420aa5929b3bba4d5e61013 upstream.
Instead of decoding each instruction in the return sites range only to realize that that return site is a jump to the default return thunk which is needed - X86_FEATURE_RETHUNK is enabled - lift that check before the loop and get rid of that loop overhead.
Add comments about what gets patched, while at it.
Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20230512120952.7924-1-bp@alien8.de Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/alternative.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -620,13 +620,12 @@ static int patch_return(void *addr, stru { int i = 0;
+ /* Patch the custom return thunks... */ if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) { - if (x86_return_thunk == __x86_return_thunk) - return -1; - i = JMP32_INSN_SIZE; __text_gen_insn(bytes, JMP32_INSN_OPCODE, addr, x86_return_thunk, i); } else { + /* ... or patch them out if not needed. */ bytes[i++] = RET_INSN_OPCODE; }
@@ -639,6 +638,14 @@ void __init_or_module noinline apply_ret { s32 *s;
+ /* + * Do not patch out the default return thunks if those needed are the + * ones generated by the compiler. + */ + if (cpu_feature_enabled(X86_FEATURE_RETHUNK) && + (x86_return_thunk == __x86_return_thunk)) + return; + for (s = start; s < end; s++) { void *dest = NULL, *addr = (void *)s + *s; struct insn insn;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josh Poimboeuf jpoimboe@kernel.org
commit 4ba89dd6ddeca2a733bdaed7c9a5cbe4e19d9124 upstream.
The following commit
095b8303f383 ("x86/alternative: Make custom return thunk unconditional")
made '__x86_return_thunk' a placeholder value. All code setting X86_FEATURE_RETHUNK also changes the value of 'x86_return_thunk'. So the optimization at the beginning of apply_returns() is dead code.
Also, before the above-mentioned commit, the optimization actually had a bug It bypassed __static_call_fixup(), causing some raw returns to remain unpatched in static call trampolines. Thus the 'Fixes' tag.
Fixes: d2408e043e72 ("x86/alternative: Optimize returns patching") Signed-off-by: Josh Poimboeuf jpoimboe@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Acked-by: Borislav Petkov (AMD) bp@alien8.de Link: https://lore.kernel.org/r/16d19d2249d4485d8380fb215ffaae81e6b8119e.169388998... Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/alternative.c | 8 -------- 1 file changed, 8 deletions(-)
--- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -638,14 +638,6 @@ void __init_or_module noinline apply_ret { s32 *s;
- /* - * Do not patch out the default return thunks if those needed are the - * ones generated by the compiler. - */ - if (cpu_feature_enabled(X86_FEATURE_RETHUNK) && - (x86_return_thunk == __x86_return_thunk)) - return; - for (s = start; s < end; s++) { void *dest = NULL, *addr = (void *)s + *s; struct insn insn;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit a75bf27fe41abe658c53276a0c486c4bf9adecfc upstream.
RETs in the lower half of cacheline may be affected by ITS bug, specifically when the RSB-underflows. Use ITS-safe return thunk for such RETs.
RETs that are not patched:
- RET in retpoline sequence does not need to be patched, because the sequence itself fills an RSB before RET. - RETs in .init section are not reachable after init. - RETs that are explicitly marked safe with ANNOTATE_UNRET_SAFE.
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/alternative.h | 14 ++++++++++++++ arch/x86/include/asm/nospec-branch.h | 6 ++++++ arch/x86/kernel/alternative.c | 17 ++++++++++++++++- arch/x86/kernel/ftrace.c | 2 +- arch/x86/kernel/static_call.c | 2 +- arch/x86/kernel/vmlinux.lds.S | 4 ++++ arch/x86/lib/retpoline.S | 13 ++++++++++++- arch/x86/net/bpf_jit_comp.c | 2 +- 8 files changed, 55 insertions(+), 5 deletions(-)
--- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -80,6 +80,20 @@ extern void apply_returns(s32 *start, s3
struct module;
+#ifdef CONFIG_RETHUNK +extern bool cpu_wants_rethunk(void); +extern bool cpu_wants_rethunk_at(void *addr); +#else +static __always_inline bool cpu_wants_rethunk(void) +{ + return false; +} +static __always_inline bool cpu_wants_rethunk_at(void *addr) +{ + return false; +} +#endif + #ifdef CONFIG_SMP extern void alternatives_smp_module_add(struct module *mod, char *name, void *locks, void *locks_end, --- a/arch/x86/include/asm/nospec-branch.h +++ b/arch/x86/include/asm/nospec-branch.h @@ -250,6 +250,12 @@ extern void __x86_return_thunk(void); static inline void __x86_return_thunk(void) {} #endif
+#ifdef CONFIG_MITIGATION_ITS +extern void its_return_thunk(void); +#else +static inline void its_return_thunk(void) {} +#endif + extern void retbleed_return_thunk(void); extern void srso_return_thunk(void); extern void srso_alias_return_thunk(void); --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -605,6 +605,21 @@ void __init_or_module noinline apply_ret
#ifdef CONFIG_RETHUNK
+bool cpu_wants_rethunk(void) +{ + return cpu_feature_enabled(X86_FEATURE_RETHUNK); +} + +bool cpu_wants_rethunk_at(void *addr) +{ + if (!cpu_feature_enabled(X86_FEATURE_RETHUNK)) + return false; + if (x86_return_thunk != its_return_thunk) + return true; + + return !((unsigned long)addr & 0x20); +} + /* * Rewrite the compiler generated return thunk tail-calls. * @@ -621,7 +636,7 @@ static int patch_return(void *addr, stru int i = 0;
/* Patch the custom return thunks... */ - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) { + if (cpu_wants_rethunk_at(addr)) { i = JMP32_INSN_SIZE; __text_gen_insn(bytes, JMP32_INSN_OPCODE, addr, x86_return_thunk, i); } else { --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -367,7 +367,7 @@ create_trampoline(struct ftrace_ops *ops goto fail;
ip = trampoline + size; - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) + if (cpu_wants_rethunk_at(ip)) __text_gen_insn(ip, JMP32_INSN_OPCODE, ip, x86_return_thunk, JMP32_INSN_SIZE); else memcpy(ip, retq, sizeof(retq)); --- a/arch/x86/kernel/static_call.c +++ b/arch/x86/kernel/static_call.c @@ -81,7 +81,7 @@ static void __ref __static_call_transfor break;
case RET: - if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) + if (cpu_wants_rethunk_at(insn)) code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); else code = &retinsn; --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -538,6 +538,10 @@ INIT_PER_CPU(irq_stack_backing_store); . = ASSERT(__x86_indirect_its_thunk_array == __x86_indirect_its_thunk_rax, "Gap in ITS thunk array"); #endif
+#if defined(CONFIG_MITIGATION_ITS) && !defined(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B) +. = ASSERT(its_return_thunk & 0x20, "its_return_thunk not in second half of cacheline"); +#endif + #endif /* CONFIG_X86_64 */
#ifdef CONFIG_KEXEC_CORE --- a/arch/x86/lib/retpoline.S +++ b/arch/x86/lib/retpoline.S @@ -280,7 +280,18 @@ SYM_CODE_START(__x86_indirect_its_thunk_ .align 64, 0xcc SYM_CODE_END(__x86_indirect_its_thunk_array)
-#endif +.align 64, 0xcc +.skip 32, 0xcc +SYM_CODE_START(its_return_thunk) + UNWIND_HINT_FUNC + ANNOTATE_NOENDBR + ANNOTATE_UNRET_SAFE + ret + int3 +SYM_CODE_END(its_return_thunk) +EXPORT_SYMBOL(its_return_thunk) + +#endif /* CONFIG_MITIGATION_ITS */
SYM_CODE_START(__x86_return_thunk) UNWIND_HINT_FUNC --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -466,7 +466,7 @@ static void emit_return(u8 **pprog, u8 * { u8 *prog = *pprog;
- if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) { + if (cpu_wants_rethunk()) { emit_jump(&prog, x86_return_thunk, ip); } else { EMIT1(0xC3); /* ret */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit f4818881c47fd91fcb6d62373c57c7844e3de1c0 upstream.
Indirect Target Selection (ITS) is a bug in some pre-ADL Intel CPUs with eIBRS. It affects prediction of indirect branch and RETs in the lower half of cacheline. Due to ITS such branches may get wrongly predicted to a target of (direct or indirect) branch that is located in the upper half of the cacheline.
Scope of impact ===============
Guest/host isolation -------------------- When eIBRS is used for guest/host isolation, the indirect branches in the VMM may still be predicted with targets corresponding to branches in the guest.
Intra-mode ---------- cBPF or other native gadgets can be used for intra-mode training and disclosure using ITS.
User/kernel isolation --------------------- When eIBRS is enabled user/kernel isolation is not impacted.
Indirect Branch Prediction Barrier (IBPB) ----------------------------------------- After an IBPB, indirect branches may be predicted with targets corresponding to direct branches which were executed prior to IBPB. This is mitigated by a microcode update.
Add cmdline parameter indirect_target_selection=off|on|force to control the mitigation to relocate the affected branches to an ITS-safe thunk i.e. located in the upper half of cacheline. Also add the sysfs reporting.
When retpoline mitigation is deployed, ITS safe-thunks are not needed, because retpoline sequence is already ITS-safe. Similarly, when call depth tracking (CDT) mitigation is deployed (retbleed=stuff), ITS safe return thunk is not used, as CDT prevents RSB-underflow.
To not overcomplicate things, ITS mitigation is not supported with spectre-v2 lfence;jmp mitigation. Moreover, it is less practical to deploy lfence;jmp mitigation on ITS affected parts anyways.
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/ABI/testing/sysfs-devices-system-cpu | 1 Documentation/admin-guide/kernel-parameters.txt | 13 ++ arch/x86/kernel/cpu/bugs.c | 128 ++++++++++++++++++++- drivers/base/cpu.c | 8 + include/linux/cpu.h | 2 5 files changed, 149 insertions(+), 3 deletions(-)
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -512,6 +512,7 @@ Description: information about CPUs hete
What: /sys/devices/system/cpu/vulnerabilities /sys/devices/system/cpu/vulnerabilities/gather_data_sampling + /sys/devices/system/cpu/vulnerabilities/indirect_target_selection /sys/devices/system/cpu/vulnerabilities/itlb_multihit /sys/devices/system/cpu/vulnerabilities/l1tf /sys/devices/system/cpu/vulnerabilities/mds --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1926,6 +1926,18 @@ different crypto accelerators. This option can be used to achieve best performance for particular HW.
+ indirect_target_selection= [X86,Intel] Mitigation control for Indirect + Target Selection(ITS) bug in Intel CPUs. Updated + microcode is also required for a fix in IBPB. + + on: Enable mitigation (default). + off: Disable mitigation. + force: Force the ITS bug and deploy default + mitigation. + + For details see: + Documentation/admin-guide/hw-vuln/indirect-target-selection.rst + init= [KNL] Format: <full_path> Run specified binary instead of /sbin/init as init @@ -3073,6 +3085,7 @@ improves system performance, but it may also expose users to several CPU vulnerabilities. Equivalent to: gather_data_sampling=off [X86] + indirect_target_selection=off [X86] kpti=0 [ARM64] kvm.nx_huge_pages=off [X86] l1tf=off [X86] --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -48,6 +48,7 @@ static void __init srbds_select_mitigati static void __init l1d_flush_select_mitigation(void); static void __init gds_select_mitigation(void); static void __init srso_select_mitigation(void); +static void __init its_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR without task-specific bits set */ u64 x86_spec_ctrl_base; @@ -66,6 +67,14 @@ static DEFINE_MUTEX(spec_ctrl_mutex);
void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;
+static void __init set_return_thunk(void *thunk) +{ + if (x86_return_thunk != __x86_return_thunk) + pr_warn("x86/bugs: return thunk changed\n"); + + x86_return_thunk = thunk; +} + /* Update SPEC_CTRL MSR and its cached copy unconditionally */ static void update_spec_ctrl(u64 val) { @@ -174,6 +183,7 @@ void __init cpu_select_mitigations(void) */ srso_select_mitigation(); gds_select_mitigation(); + its_select_mitigation(); }
/* @@ -1081,7 +1091,7 @@ do_cmd_auto: setup_force_cpu_cap(X86_FEATURE_UNRET);
if (IS_ENABLED(CONFIG_RETHUNK)) - x86_return_thunk = retbleed_return_thunk; + set_return_thunk(retbleed_return_thunk);
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD && boot_cpu_data.x86_vendor != X86_VENDOR_HYGON) @@ -1143,6 +1153,105 @@ do_cmd_auto: }
#undef pr_fmt +#define pr_fmt(fmt) "ITS: " fmt + +enum its_mitigation_cmd { + ITS_CMD_OFF, + ITS_CMD_ON, +}; + +enum its_mitigation { + ITS_MITIGATION_OFF, + ITS_MITIGATION_ALIGNED_THUNKS, +}; + +static const char * const its_strings[] = { + [ITS_MITIGATION_OFF] = "Vulnerable", + [ITS_MITIGATION_ALIGNED_THUNKS] = "Mitigation: Aligned branch/return thunks", +}; + +static enum its_mitigation its_mitigation __ro_after_init = ITS_MITIGATION_ALIGNED_THUNKS; + +static enum its_mitigation_cmd its_cmd __ro_after_init = + IS_ENABLED(CONFIG_MITIGATION_ITS) ? ITS_CMD_ON : ITS_CMD_OFF; + +static int __init its_parse_cmdline(char *str) +{ + if (!str) + return -EINVAL; + + if (!IS_ENABLED(CONFIG_MITIGATION_ITS)) { + pr_err("Mitigation disabled at compile time, ignoring option (%s)", str); + return 0; + } + + if (!strcmp(str, "off")) { + its_cmd = ITS_CMD_OFF; + } else if (!strcmp(str, "on")) { + its_cmd = ITS_CMD_ON; + } else if (!strcmp(str, "force")) { + its_cmd = ITS_CMD_ON; + setup_force_cpu_bug(X86_BUG_ITS); + } else { + pr_err("Ignoring unknown indirect_target_selection option (%s).", str); + } + + return 0; +} +early_param("indirect_target_selection", its_parse_cmdline); + +static void __init its_select_mitigation(void) +{ + enum its_mitigation_cmd cmd = its_cmd; + + if (!boot_cpu_has_bug(X86_BUG_ITS) || cpu_mitigations_off()) { + its_mitigation = ITS_MITIGATION_OFF; + return; + } + + /* Exit early to avoid irrelevant warnings */ + if (cmd == ITS_CMD_OFF) { + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (spectre_v2_enabled == SPECTRE_V2_NONE) { + pr_err("WARNING: Spectre-v2 mitigation is off, disabling ITS\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (!IS_ENABLED(CONFIG_RETPOLINE) || !IS_ENABLED(CONFIG_RETHUNK)) { + pr_err("WARNING: ITS mitigation depends on retpoline and rethunk support\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (IS_ENABLED(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B)) { + pr_err("WARNING: ITS mitigation is not compatible with CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + if (boot_cpu_has(X86_FEATURE_RETPOLINE_LFENCE)) { + pr_err("WARNING: ITS mitigation is not compatible with lfence mitigation\n"); + its_mitigation = ITS_MITIGATION_OFF; + goto out; + } + + switch (cmd) { + case ITS_CMD_OFF: + its_mitigation = ITS_MITIGATION_OFF; + break; + case ITS_CMD_ON: + its_mitigation = ITS_MITIGATION_ALIGNED_THUNKS; + if (!boot_cpu_has(X86_FEATURE_RETPOLINE)) + setup_force_cpu_cap(X86_FEATURE_INDIRECT_THUNK_ITS); + setup_force_cpu_cap(X86_FEATURE_RETHUNK); + set_return_thunk(its_return_thunk); + break; + } +out: + pr_info("%s\n", its_strings[its_mitigation]); +} + +#undef pr_fmt #define pr_fmt(fmt) "Spectre V2 : " fmt
static enum spectre_v2_user_mitigation spectre_v2_user_stibp __ro_after_init = @@ -2592,10 +2701,10 @@ static void __init srso_select_mitigatio
if (boot_cpu_data.x86 == 0x19) { setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS); - x86_return_thunk = srso_alias_return_thunk; + set_return_thunk(srso_alias_return_thunk); } else { setup_force_cpu_cap(X86_FEATURE_SRSO); - x86_return_thunk = srso_return_thunk; + set_return_thunk(srso_return_thunk); } srso_mitigation = SRSO_MITIGATION_SAFE_RET; } else { @@ -2775,6 +2884,11 @@ static ssize_t rfds_show_state(char *buf return sysfs_emit(buf, "%s\n", rfds_strings[rfds_mitigation]); }
+static ssize_t its_show_state(char *buf) +{ + return sysfs_emit(buf, "%s\n", its_strings[its_mitigation]); +} + static char *stibp_state(void) { if (spectre_v2_in_eibrs_mode(spectre_v2_enabled) && @@ -2959,6 +3073,9 @@ static ssize_t cpu_show_common(struct de case X86_BUG_RFDS: return rfds_show_state(buf);
+ case X86_BUG_ITS: + return its_show_state(buf); + default: break; } @@ -3038,4 +3155,9 @@ ssize_t cpu_show_reg_file_data_sampling( { return cpu_show_common(dev, attr, buf, X86_BUG_RFDS); } + +ssize_t cpu_show_indirect_target_selection(struct device *dev, struct device_attribute *attr, char *buf) +{ + return cpu_show_common(dev, attr, buf, X86_BUG_ITS); +} #endif --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -595,6 +595,12 @@ ssize_t __weak cpu_show_reg_file_data_sa return sysfs_emit(buf, "Not affected\n"); }
+ssize_t __weak cpu_show_indirect_target_selection(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sysfs_emit(buf, "Not affected\n"); +} + static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); @@ -609,6 +615,7 @@ static DEVICE_ATTR(retbleed, 0444, cpu_s static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL); static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL); static DEVICE_ATTR(reg_file_data_sampling, 0444, cpu_show_reg_file_data_sampling, NULL); +static DEVICE_ATTR(indirect_target_selection, 0444, cpu_show_indirect_target_selection, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = { &dev_attr_meltdown.attr, @@ -625,6 +632,7 @@ static struct attribute *cpu_root_vulner &dev_attr_gather_data_sampling.attr, &dev_attr_spec_rstack_overflow.attr, &dev_attr_reg_file_data_sampling.attr, + &dev_attr_indirect_target_selection.attr, NULL };
--- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -76,6 +76,8 @@ extern ssize_t cpu_show_gds(struct devic struct device_attribute *attr, char *buf); extern ssize_t cpu_show_reg_file_data_sampling(struct device *dev, struct device_attribute *attr, char *buf); +extern ssize_t cpu_show_indirect_target_selection(struct device *dev, + struct device_attribute *attr, char *buf);
extern __printf(4, 5) struct device *cpu_device_create(struct device *parent, void *drvdata,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 2665281a07e19550944e8354a2024635a7b2714a upstream.
Ice Lake generation CPUs are not affected by guest/host isolation part of ITS. If a user is only concerned about KVM guests, they can now choose a new cmdline option "vmexit" that will not deploy the ITS mitigation when CPU is not affected by guest/host isolation. This saves the performance overhead of ITS mitigation on Ice Lake gen CPUs.
When "vmexit" option selected, if the CPU is affected by ITS guest/host isolation, the default ITS mitigation is deployed.
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Josh Poimboeuf jpoimboe@kernel.org Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/kernel-parameters.txt | 2 ++ arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/kernel/cpu/bugs.c | 11 +++++++++++ arch/x86/kernel/cpu/common.c | 19 ++++++++++++------- 4 files changed, 26 insertions(+), 7 deletions(-)
--- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1934,6 +1934,8 @@ off: Disable mitigation. force: Force the ITS bug and deploy default mitigation. + vmexit: Only deploy mitigation if CPU is affected by + guest/host isolation part of ITS.
For details see: Documentation/admin-guide/hw-vuln/indirect-target-selection.rst --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -485,4 +485,5 @@ #define X86_BUG_BHI X86_BUG(1*32 + 3) /* CPU is affected by Branch History Injection */ #define X86_BUG_IBPB_NO_RET X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */ #define X86_BUG_ITS X86_BUG(1*32 + 5) /* CPU is affected by Indirect Target Selection */ +#define X86_BUG_ITS_NATIVE_ONLY X86_BUG(1*32 + 6) /* CPU is affected by ITS, VMX is not affected */ #endif /* _ASM_X86_CPUFEATURES_H */ --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1158,15 +1158,18 @@ do_cmd_auto: enum its_mitigation_cmd { ITS_CMD_OFF, ITS_CMD_ON, + ITS_CMD_VMEXIT, };
enum its_mitigation { ITS_MITIGATION_OFF, + ITS_MITIGATION_VMEXIT_ONLY, ITS_MITIGATION_ALIGNED_THUNKS, };
static const char * const its_strings[] = { [ITS_MITIGATION_OFF] = "Vulnerable", + [ITS_MITIGATION_VMEXIT_ONLY] = "Mitigation: Vulnerable, KVM: Not affected", [ITS_MITIGATION_ALIGNED_THUNKS] = "Mitigation: Aligned branch/return thunks", };
@@ -1192,6 +1195,8 @@ static int __init its_parse_cmdline(char } else if (!strcmp(str, "force")) { its_cmd = ITS_CMD_ON; setup_force_cpu_bug(X86_BUG_ITS); + } else if (!strcmp(str, "vmexit")) { + its_cmd = ITS_CMD_VMEXIT; } else { pr_err("Ignoring unknown indirect_target_selection option (%s).", str); } @@ -1239,6 +1244,12 @@ static void __init its_select_mitigation case ITS_CMD_OFF: its_mitigation = ITS_MITIGATION_OFF; break; + case ITS_CMD_VMEXIT: + if (boot_cpu_has_bug(X86_BUG_ITS_NATIVE_ONLY)) { + its_mitigation = ITS_MITIGATION_VMEXIT_ONLY; + goto out; + } + fallthrough; case ITS_CMD_ON: its_mitigation = ITS_MITIGATION_ALIGNED_THUNKS; if (!boot_cpu_has(X86_FEATURE_RETPOLINE)) --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1143,6 +1143,8 @@ static const __initconst struct x86_cpu_ #define RFDS BIT(7) /* CPU is affected by Indirect Target Selection */ #define ITS BIT(8) +/* CPU is affected by Indirect Target Selection, but guest-host isolation is not affected */ +#define ITS_NATIVE_ONLY BIT(9)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS), @@ -1163,16 +1165,16 @@ static const struct x86_cpu_id cpu_vuln_ VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0xc), MMIO | RETBLEED | GDS | SRBDS), VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS | ITS), VULNBL_INTEL_STEPPINGS(CANNONLAKE_L, X86_STEPPING_ANY, RETBLEED), - VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), - VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS | ITS), - VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS | ITS), + VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY), + VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS | ITS | ITS_NATIVE_ONLY), + VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED | ITS), VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS), - VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS | ITS), - VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS | ITS), + VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS | ITS | ITS_NATIVE_ONLY), + VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED), - VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | ITS), + VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY), VULNBL_INTEL_STEPPINGS(ALDERLAKE, X86_STEPPING_ANY, RFDS), VULNBL_INTEL_STEPPINGS(ALDERLAKE_L, X86_STEPPING_ANY, RFDS), VULNBL_INTEL_STEPPINGS(RAPTORLAKE, X86_STEPPING_ANY, RFDS), @@ -1389,8 +1391,11 @@ static void __init cpu_set_bug_bits(stru if (cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET)) setup_force_cpu_bug(X86_BUG_IBPB_NO_RET);
- if (vulnerable_to_its(x86_arch_cap_msr)) + if (vulnerable_to_its(x86_arch_cap_msr)) { setup_force_cpu_bug(X86_BUG_ITS); + if (cpu_matches(cpu_vuln_blacklist, ITS_NATIVE_ONLY)) + setup_force_cpu_bug(X86_BUG_ITS_NATIVE_ONLY); + }
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) return;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit f0cd7091cc5a032c8870b4285305d9172569d126 upstream.
The software mitigation for BHI is to execute BHB clear sequence at syscall entry, and possibly after a cBPF program. ITS mitigation thunks RETs in the lower half of the cacheline. This causes the RETs in the BHB clear sequence to be thunked as well, adding unnecessary branches to the BHB clear sequence.
Since the sequence is in hot path, align the RET instructions in the sequence to avoid thunking.
This is how disassembly clear_bhb_loop() looks like after this change:
0x44 <+4>: mov $0x5,%ecx 0x49 <+9>: call 0xffffffff81001d9b <clear_bhb_loop+91> 0x4e <+14>: jmp 0xffffffff81001de5 <clear_bhb_loop+165> 0x53 <+19>: int3 ... 0x9b <+91>: call 0xffffffff81001dce <clear_bhb_loop+142> 0xa0 <+96>: ret 0xa1 <+97>: int3 ... 0xce <+142>: mov $0x5,%eax 0xd3 <+147>: jmp 0xffffffff81001dd6 <clear_bhb_loop+150> 0xd5 <+149>: nop 0xd6 <+150>: sub $0x1,%eax 0xd9 <+153>: jne 0xffffffff81001dd3 <clear_bhb_loop+147> 0xdb <+155>: sub $0x1,%ecx 0xde <+158>: jne 0xffffffff81001d9b <clear_bhb_loop+91> 0xe0 <+160>: ret 0xe1 <+161>: int3 0xe2 <+162>: int3 0xe3 <+163>: int3 0xe4 <+164>: int3 0xe5 <+165>: lfence 0xe8 <+168>: pop %rbp 0xe9 <+169>: ret
Suggested-by: Andrew Cooper andrew.cooper3@citrix.com Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/entry/entry_64.S | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-)
--- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1530,7 +1530,9 @@ SYM_CODE_END(rewind_stack_and_make_dead) * ORC to unwind properly. * * The alignment is for performance and not for safety, and may be safely - * refactored in the future if needed. + * refactored in the future if needed. The .skips are for safety, to ensure + * that all RETs are in the second half of a cacheline to mitigate Indirect + * Target Selection, rather than taking the slowpath via its_return_thunk. */ SYM_FUNC_START(clear_bhb_loop) push %rbp @@ -1540,10 +1542,22 @@ SYM_FUNC_START(clear_bhb_loop) call 1f jmp 5f .align 64, 0xcc + /* + * Shift instructions so that the RET is in the upper half of the + * cacheline and don't take the slowpath to its_return_thunk. + */ + .skip 32 - (.Lret1 - 1f), 0xcc ANNOTATE_INTRA_FUNCTION_CALL 1: call 2f - RET +.Lret1: RET .align 64, 0xcc + /* + * As above shift instructions for RET at .Lret2 as well. + * + * This should be ideally be: .skip 32 - (.Lret2 - 2f), 0xcc + * but some Clang versions (e.g. 18) don't like this. + */ + .skip 32 - 18, 0xcc 2: movl $5, %eax 3: jmp 4f nop @@ -1551,7 +1565,7 @@ SYM_FUNC_START(clear_bhb_loop) jnz 3b sub $1, %ecx jnz 1b - RET +.Lret2: RET 5: lfence pop %rbp RET
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit 872df34d7c51a79523820ea6a14860398c639b87 upstream.
ITS mitigation moves the unsafe indirect branches to a safe thunk. This could degrade the prediction accuracy as the source address of indirect branches becomes same for different execution paths.
To improve the predictions, and hence the performance, assign a separate thunk for each indirect callsite. This is also a defense-in-depth measure to avoid indirect branches aliasing with each other.
As an example, 5000 dynamic thunks would utilize around 16 bits of the address space, thereby gaining entropy. For a BTB that uses 32 bits for indexing, dynamic thunks could provide better prediction accuracy over fixed thunks.
Have ITS thunks be variable sized and use EXECMEM_MODULE_TEXT such that they are both more flexible (got to extend them later) and live in 2M TLBs, just like kernel code, avoiding undue TLB pressure.
[ pawan: CONFIG_EXECMEM and CONFIG_EXECMEM_ROX are not supported on backport kernel, made changes to use module_alloc() and set_memory_*() for dynamic thunks. ]
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/alternative.h | 10 ++ arch/x86/kernel/alternative.c | 133 ++++++++++++++++++++++++++++++++++++- arch/x86/kernel/module.c | 7 + include/linux/module.h | 5 + 4 files changed, 152 insertions(+), 3 deletions(-)
--- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -80,6 +80,16 @@ extern void apply_returns(s32 *start, s3
struct module;
+#ifdef CONFIG_MITIGATION_ITS +extern void its_init_mod(struct module *mod); +extern void its_fini_mod(struct module *mod); +extern void its_free_mod(struct module *mod); +#else /* CONFIG_MITIGATION_ITS */ +static inline void its_init_mod(struct module *mod) { } +static inline void its_fini_mod(struct module *mod) { } +static inline void its_free_mod(struct module *mod) { } +#endif + #ifdef CONFIG_RETHUNK extern bool cpu_wants_rethunk(void); extern bool cpu_wants_rethunk_at(void *addr); --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -18,6 +18,7 @@ #include <linux/mmu_context.h> #include <linux/bsearch.h> #include <linux/sync_core.h> +#include <linux/moduleloader.h> #include <asm/text-patching.h> #include <asm/alternative.h> #include <asm/sections.h> @@ -30,6 +31,7 @@ #include <asm/fixmap.h> #include <asm/paravirt.h> #include <asm/asm-prototypes.h> +#include <asm/set_memory.h>
int __read_mostly alternatives_patched;
@@ -397,6 +399,127 @@ static int emit_indirect(int op, int reg
#ifdef CONFIG_MITIGATION_ITS
+static struct module *its_mod; +static void *its_page; +static unsigned int its_offset; + +/* Initialize a thunk with the "jmp *reg; int3" instructions. */ +static void *its_init_thunk(void *thunk, int reg) +{ + u8 *bytes = thunk; + int i = 0; + + if (reg >= 8) { + bytes[i++] = 0x41; /* REX.B prefix */ + reg -= 8; + } + bytes[i++] = 0xff; + bytes[i++] = 0xe0 + reg; /* jmp *reg */ + bytes[i++] = 0xcc; + + return thunk; +} + +void its_init_mod(struct module *mod) +{ + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return; + + mutex_lock(&text_mutex); + its_mod = mod; + its_page = NULL; +} + +void its_fini_mod(struct module *mod) +{ + int i; + + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return; + + WARN_ON_ONCE(its_mod != mod); + + its_mod = NULL; + its_page = NULL; + mutex_unlock(&text_mutex); + + for (i = 0; i < mod->its_num_pages; i++) { + void *page = mod->its_page_array[i]; + set_memory_ro((unsigned long)page, 1); + set_memory_x((unsigned long)page, 1); + } +} + +void its_free_mod(struct module *mod) +{ + int i; + + if (!cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) + return; + + for (i = 0; i < mod->its_num_pages; i++) { + void *page = mod->its_page_array[i]; + module_memfree(page); + } + kfree(mod->its_page_array); +} + +static void *its_alloc(void) +{ + void *page = module_alloc(PAGE_SIZE); + + if (!page) + return NULL; + + if (its_mod) { + void *tmp = krealloc(its_mod->its_page_array, + (its_mod->its_num_pages+1) * sizeof(void *), + GFP_KERNEL); + if (!tmp) { + module_memfree(page); + return NULL; + } + + its_mod->its_page_array = tmp; + its_mod->its_page_array[its_mod->its_num_pages++] = page; + } + + return page; +} + +static void *its_allocate_thunk(int reg) +{ + int size = 3 + (reg / 8); + void *thunk; + + if (!its_page || (its_offset + size - 1) >= PAGE_SIZE) { + its_page = its_alloc(); + if (!its_page) { + pr_err("ITS page allocation failed\n"); + return NULL; + } + memset(its_page, INT3_INSN_OPCODE, PAGE_SIZE); + its_offset = 32; + } + + /* + * If the indirect branch instruction will be in the lower half + * of a cacheline, then update the offset to reach the upper half. + */ + if ((its_offset + size - 1) % 64 < 32) + its_offset = ((its_offset - 1) | 0x3F) + 33; + + thunk = its_page + its_offset; + its_offset += size; + + set_memory_rw((unsigned long)its_page, 1); + thunk = its_init_thunk(thunk, reg); + set_memory_ro((unsigned long)its_page, 1); + set_memory_x((unsigned long)its_page, 1); + + return thunk; +} + static int __emit_trampoline(void *addr, struct insn *insn, u8 *bytes, void *call_dest, void *jmp_dest) { @@ -444,9 +567,13 @@ clang_jcc:
static int emit_its_trampoline(void *addr, struct insn *insn, int reg, u8 *bytes) { - return __emit_trampoline(addr, insn, bytes, - __x86_indirect_its_thunk_array[reg], - __x86_indirect_its_thunk_array[reg]); + u8 *thunk = __x86_indirect_its_thunk_array[reg]; + u8 *tmp = its_allocate_thunk(reg); + + if (tmp) + thunk = tmp; + + return __emit_trampoline(addr, insn, bytes, thunk, thunk); }
/* Check if an indirect branch is at ITS-unsafe address */ --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -283,10 +283,16 @@ int module_finalize(const Elf_Ehdr *hdr, void *pseg = (void *)para->sh_addr; apply_paravirt(pseg, pseg + para->sh_size); } + + its_init_mod(me); + if (retpolines) { void *rseg = (void *)retpolines->sh_addr; apply_retpolines(rseg, rseg + retpolines->sh_size); } + + its_fini_mod(me); + if (returns) { void *rseg = (void *)returns->sh_addr; apply_returns(rseg, rseg + returns->sh_size); @@ -317,4 +323,5 @@ int module_finalize(const Elf_Ehdr *hdr, void module_arch_cleanup(struct module *mod) { alternatives_smp_module_del(mod); + its_free_mod(mod); } --- a/include/linux/module.h +++ b/include/linux/module.h @@ -528,6 +528,11 @@ struct module { atomic_t refcnt; #endif
+#ifdef CONFIG_MITIGATION_ITS + int its_num_pages; + void **its_page_array; +#endif + #ifdef CONFIG_CONSTRUCTORS /* Constructor functions. */ ctor_fn_t *ctors;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 9f35e33144ae5377d6a8de86dd3bd4d995c6ac65 upstream.
Fix several build errors when CONFIG_MODULES=n, including the following:
../arch/x86/kernel/alternative.c:195:25: error: incomplete definition of type 'struct module' 195 | for (int i = 0; i < mod->its_num_pages; i++) {
[ pawan: backport: Bring ITS dynamic thunk code under CONFIG_MODULES ]
Fixes: 872df34d7c51 ("x86/its: Use dynamic thunks for indirect branches") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Acked-by: Dave Hansen dave.hansen@intel.com Tested-by: Steven Rostedt (Google) rostedt@goodmis.org Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/alternative.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -399,6 +399,7 @@ static int emit_indirect(int op, int reg
#ifdef CONFIG_MITIGATION_ITS
+#ifdef CONFIG_MODULES static struct module *its_mod; static void *its_page; static unsigned int its_offset; @@ -519,6 +520,14 @@ static void *its_allocate_thunk(int reg)
return thunk; } +#else /* CONFIG_MODULES */ + +static void *its_allocate_thunk(int reg) +{ + return NULL; +} + +#endif /* CONFIG_MODULES */
static int __emit_trampoline(void *addr, struct insn *insn, u8 *bytes, void *call_dest, void *jmp_dest)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit e52c1dc7455d32c8a55f9949d300e5e87d011fa6 upstream.
FineIBT-paranoid was using the retpoline bytes for the paranoid check, disabling retpolines, because all parts that have IBT also have eIBRS and thus don't need no stinking retpolines.
Except... ITS needs the retpolines for indirect calls must not be in the first half of a cacheline :-/
So what was the paranoid call sequence:
<fineibt_paranoid_start>: 0: 41 ba 78 56 34 12 mov $0x12345678, %r10d 6: 45 3b 53 f7 cmp -0x9(%r11), %r10d a: 4d 8d 5b <f0> lea -0x10(%r11), %r11 e: 75 fd jne d <fineibt_paranoid_start+0xd> 10: 41 ff d3 call *%r11 13: 90 nop
Now becomes:
<fineibt_paranoid_start>: 0: 41 ba 78 56 34 12 mov $0x12345678, %r10d 6: 45 3b 53 f7 cmp -0x9(%r11), %r10d a: 4d 8d 5b f0 lea -0x10(%r11), %r11 e: 2e e8 XX XX XX XX cs call __x86_indirect_paranoid_thunk_r11
Where the paranoid_thunk looks like:
1d: <ea> (bad) __x86_indirect_paranoid_thunk_r11: 1e: 75 fd jne 1d __x86_indirect_its_thunk_r11: 20: 41 ff eb jmp *%r11 23: cc int3
[ dhansen: remove initialization to false ]
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Alexandre Chartre alexandre.chartre@oracle.com [ Just a portion of the original commit, in order to fix a build issue in stable kernels due to backports ] Tested-by: Holger Hoffstätte holger@applied-asynchrony.com Link: https://lore.kernel.org/r/20250514113952.GB16434@noisy.programming.kicks-ass... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/alternative.h | 8 ++++++++ arch/x86/kernel/alternative.c | 8 ++++++++ arch/x86/net/bpf_jit_comp.c | 2 +- 3 files changed, 17 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -5,6 +5,7 @@ #include <linux/types.h> #include <linux/stringify.h> #include <asm/asm.h> +#include <asm/bug.h>
#define ALTINSTR_FLAG_INV (1 << 15) #define ALT_NOT(feat) ((feat) | ALTINSTR_FLAG_INV) @@ -84,10 +85,17 @@ struct module; extern void its_init_mod(struct module *mod); extern void its_fini_mod(struct module *mod); extern void its_free_mod(struct module *mod); +extern u8 *its_static_thunk(int reg); #else /* CONFIG_MITIGATION_ITS */ static inline void its_init_mod(struct module *mod) { } static inline void its_fini_mod(struct module *mod) { } static inline void its_free_mod(struct module *mod) { } +static inline u8 *its_static_thunk(int reg) +{ + WARN_ONCE(1, "ITS not compiled in"); + + return NULL; +} #endif
#ifdef CONFIG_RETHUNK --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -597,6 +597,14 @@ static bool cpu_wants_indirect_its_thunk /* Lower-half of the cacheline? */ return !(addr & 0x20); } + +u8 *its_static_thunk(int reg) +{ + u8 *thunk = __x86_indirect_its_thunk_array[reg]; + + return thunk; +} + #endif
/* --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -449,7 +449,7 @@ static void emit_indirect_jump(u8 **ppro if (IS_ENABLED(CONFIG_MITIGATION_ITS) && cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) { OPTIMIZER_HIDE_VAR(reg); - emit_jump(&prog, &__x86_indirect_its_thunk_array[reg], ip); + emit_jump(&prog, its_static_thunk(reg), ip); } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) { EMIT_LFENCE(); EMIT2(0xFF, 0xE0 + reg);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Lynch nathan.lynch@amd.com
commit df180e65305f8c1e020d54bfc2132349fd693de1 upstream.
Several issues with this change:
* The analysis is flawed and it's unclear what problem is being fixed. There is no difference between wait_event_freezable_timeout() and wait_event_timeout() with respect to device interrupts. And of course "the interrupt notifying the finish of an operation happens during wait_event_freezable_timeout()" -- that's how it's supposed to work.
* The link at the "Closes:" tag appears to be an unrelated use-after-free in idxd.
* It introduces a regression: dmatest threads are meant to be freezable and this change breaks that.
See discussion here: https://lore.kernel.org/dmaengine/878qpa13fe.fsf@AUSNATLYNCH.amd.com/
Fixes: e87ca16e9911 ("dmaengine: dmatest: Fix dmatest waiting less when interrupted") Signed-off-by: Nathan Lynch nathan.lynch@amd.com Link: https://lore.kernel.org/r/20250403-dmaengine-dmatest-revert-waiting-less-v1-... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/dmatest.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/dma/dmatest.c +++ b/drivers/dma/dmatest.c @@ -828,9 +828,9 @@ static int dmatest_func(void *data) } else { dma_async_issue_pending(chan);
- wait_event_timeout(thread->done_wait, - done->done, - msecs_to_jiffies(params->timeout)); + wait_event_freezable_timeout(thread->done_wait, + done->done, + msecs_to_jiffies(params->timeout));
status = dma_async_is_tx_complete(chan, cookie, NULL, NULL);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit 54db6d1bdd71fa90172a2a6aca3308bbf7fa7eb5 upstream.
If the discard worker is running and there's currently only one block group, that block group is a data block group, it's in the unused block groups discard list and is being used (it got an extent allocated from it after becoming unused), the worker can end up in an infinite loop if a transaction abort happens or the async discard is disabled (during remount or unmount for example).
This happens like this:
1) Task A, the discard worker, is at peek_discard_list() and find_next_block_group() returns block group X;
2) Block group X is in the unused block groups discard list (its discard index is BTRFS_DISCARD_INDEX_UNUSED) since at some point in the past it become an unused block group and was added to that list, but then later it got an extent allocated from it, so its ->used counter is not zero anymore;
3) The current transaction is aborted by task B and we end up at __btrfs_handle_fs_error() in the transaction abort path, where we call btrfs_discard_stop(), which clears BTRFS_FS_DISCARD_RUNNING from fs_info, and then at __btrfs_handle_fs_error() we set the fs to RO mode (setting SB_RDONLY in the super block's s_flags field);
4) Task A calls __add_to_discard_list() with the goal of moving the block group from the unused block groups discard list into another discard list, but at __add_to_discard_list() we end up doing nothing because btrfs_run_discard_work() returns false, since the super block has SB_RDONLY set in its flags and BTRFS_FS_DISCARD_RUNNING is not set anymore in fs_info->flags. So block group X remains in the unused block groups discard list;
5) Task A then does a goto into the 'again' label, calls find_next_block_group() again we gets block group X again. Then it repeats the previous steps over and over since there are not other block groups in the discard lists and block group X is never moved out of the unused block groups discard list since btrfs_run_discard_work() keeps returning false and therefore __add_to_discard_list() doesn't move block group X out of that discard list.
When this happens we can get a soft lockup report like this:
[71.957] watchdog: BUG: soft lockup - CPU#0 stuck for 27s! [kworker/u4:3:97] [71.957] Modules linked in: xfs af_packet rfkill (...) [71.957] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G W 6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa [71.957] Tainted: [W]=WARN [71.957] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [71.957] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs] [71.957] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs] [71.957] Code: c1 01 48 83 (...) [71.957] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246 [71.957] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000 [71.957] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad [71.957] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080 [71.957] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860 [71.957] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000 [71.957] FS: 0000000000000000(0000) GS:ffff89707bc00000(0000) knlGS:0000000000000000 [71.957] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [71.957] CR2: 00005605bcc8d2f0 CR3: 000000010376a001 CR4: 0000000000770ef0 [71.957] PKRU: 55555554 [71.957] Call Trace: [71.957] <TASK> [71.957] process_one_work+0x17e/0x330 [71.957] worker_thread+0x2ce/0x3f0 [71.957] ? __pfx_worker_thread+0x10/0x10 [71.957] kthread+0xef/0x220 [71.957] ? __pfx_kthread+0x10/0x10 [71.957] ret_from_fork+0x34/0x50 [71.957] ? __pfx_kthread+0x10/0x10 [71.957] ret_from_fork_asm+0x1a/0x30 [71.957] </TASK> [71.957] Kernel panic - not syncing: softlockup: hung tasks [71.987] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G W L 6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa [71.989] Tainted: [W]=WARN, [L]=SOFTLOCKUP [71.989] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [71.991] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs] [71.992] Call Trace: [71.993] <IRQ> [71.994] dump_stack_lvl+0x5a/0x80 [71.994] panic+0x10b/0x2da [71.995] watchdog_timer_fn.cold+0x9a/0xa1 [71.996] ? __pfx_watchdog_timer_fn+0x10/0x10 [71.997] __hrtimer_run_queues+0x132/0x2a0 [71.997] hrtimer_interrupt+0xff/0x230 [71.998] __sysvec_apic_timer_interrupt+0x55/0x100 [71.999] sysvec_apic_timer_interrupt+0x6c/0x90 [72.000] </IRQ> [72.000] <TASK> [72.001] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [72.002] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs] [72.002] Code: c1 01 48 83 (...) [72.005] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246 [72.006] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000 [72.006] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad [72.007] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080 [72.008] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860 [72.009] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000 [72.010] ? btrfs_discard_workfn+0x51/0x400 [btrfs 23b01089228eb964071fb7ca156eee8cd3bf996f] [72.011] process_one_work+0x17e/0x330 [72.012] worker_thread+0x2ce/0x3f0 [72.013] ? __pfx_worker_thread+0x10/0x10 [72.014] kthread+0xef/0x220 [72.014] ? __pfx_kthread+0x10/0x10 [72.015] ret_from_fork+0x34/0x50 [72.015] ? __pfx_kthread+0x10/0x10 [72.016] ret_from_fork_asm+0x1a/0x30 [72.017] </TASK> [72.017] Kernel Offset: 0x15000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [72.019] Rebooting in 90 seconds..
So fix this by making sure we move a block group out of the unused block groups discard list when calling __add_to_discard_list().
Fixes: 2bee7eb8bb81 ("btrfs: discard one region at a time in async discard") Link: https://bugzilla.suse.com/show_bug.cgi?id=1242012 CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Boris Burkov boris@bur.io Reviewed-by: Daniel Vacek neelx@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/discard.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
--- a/fs/btrfs/discard.c +++ b/fs/btrfs/discard.c @@ -78,8 +78,6 @@ static void __add_to_discard_list(struct struct btrfs_block_group *block_group) { lockdep_assert_held(&discard_ctl->lock); - if (!btrfs_run_discard_work(discard_ctl)) - return;
if (list_empty(&block_group->discard_list) || block_group->discard_index == BTRFS_DISCARD_INDEX_UNUSED) { @@ -102,6 +100,9 @@ static void add_to_discard_list(struct b if (!btrfs_is_block_group_data_only(block_group)) return;
+ if (!btrfs_run_discard_work(discard_ctl)) + return; + spin_lock(&discard_ctl->lock); __add_to_discard_list(discard_ctl, block_group); spin_unlock(&discard_ctl->lock); @@ -233,6 +234,18 @@ again: block_group->used != 0) { if (btrfs_is_block_group_data_only(block_group)) { __add_to_discard_list(discard_ctl, block_group); + /* + * The block group must have been moved to other + * discard list even if discard was disabled in + * the meantime or a transaction abort happened, + * otherwise we can end up in an infinite loop, + * always jumping into the 'again' label and + * keep getting this block group over and over + * in case there are no other block groups in + * the discard lists. + */ + ASSERT(block_group->discard_index != + BTRFS_DISCARD_INDEX_UNUSED); } else { list_del_init(&block_group->discard_list); btrfs_put_block_group(block_group);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeremy Linton jeremy.linton@arm.com
commit adfab6b39202481bb43286fff94def4953793fdb upstream.
The original PPTT code had a bug where the processor subtable length was not correctly validated when encountering a truncated acpi_pptt_processor node.
Commit 7ab4f0e37a0f4 ("ACPI PPTT: Fix coding mistakes in a couple of sizeof() calls") attempted to fix this by validating the size is as large as the acpi_pptt_processor node structure. This introduced a regression where the last processor node in the PPTT table is ignored if it doesn't contain any private resources. That results errors like:
ACPI PPTT: PPTT table found, but unable to locate core XX (XX) ACPI: SPE must be homogeneous
Furthermore, it fails in a common case where the node length isn't equal to the acpi_pptt_processor structure size, leaving the original bug in a modified form.
Correct the regression by adjusting the loop termination conditions as suggested by the bug reporters. An additional check performed after the subtable node type is detected, validates the acpi_pptt_processor node is fully contained in the PPTT table. Repeating the check in acpi_pptt_leaf_node() is largely redundant as the node is already known to be fully contained in the table.
The case where a final truncated node's parent property is accepted, but the node itself is rejected should not be considered a bug.
Fixes: 7ab4f0e37a0f4 ("ACPI PPTT: Fix coding mistakes in a couple of sizeof() calls") Reported-by: Maximilian Heyne mheyne@amazon.de Closes: https://lore.kernel.org/linux-acpi/20250506-draco-taped-15f475cd@mheyne-amaz... Reported-by: Yicong Yang yangyicong@hisilicon.com Closes: https://lore.kernel.org/linux-acpi/20250507035124.28071-1-yangyicong@huawei.... Signed-off-by: Jeremy Linton jeremy.linton@arm.com Tested-by: Yicong Yang yangyicong@hisilicon.com Reviewed-by: Sudeep Holla sudeep.holla@arm.com Tested-by: Maximilian Heyne mheyne@amazon.de Cc: All applicable stable@vger.kernel.org # 7ab4f0e37a0f4: ACPI PPTT: Fix coding mistakes ... Link: https://patch.msgid.link/20250508023025.1301030-1-jeremy.linton@arm.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/pptt.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/drivers/acpi/pptt.c +++ b/drivers/acpi/pptt.c @@ -219,16 +219,18 @@ static int acpi_pptt_leaf_node(struct ac sizeof(struct acpi_table_pptt)); proc_sz = sizeof(struct acpi_pptt_processor);
- while ((unsigned long)entry + proc_sz < table_end) { + /* ignore subtable types that are smaller than a processor node */ + while ((unsigned long)entry + proc_sz <= table_end) { cpu_node = (struct acpi_pptt_processor *)entry; + if (entry->type == ACPI_PPTT_TYPE_PROCESSOR && cpu_node->parent == node_entry) return 0; if (entry->length == 0) return 0; + entry = ACPI_ADD_PTR(struct acpi_subtable_header, entry, entry->length); - } return 1; } @@ -261,15 +263,18 @@ static struct acpi_pptt_processor *acpi_ proc_sz = sizeof(struct acpi_pptt_processor);
/* find the processor structure associated with this cpuid */ - while ((unsigned long)entry + proc_sz < table_end) { + while ((unsigned long)entry + proc_sz <= table_end) { cpu_node = (struct acpi_pptt_processor *)entry;
if (entry->length == 0) { pr_warn("Invalid zero length subtable\n"); break; } + /* entry->length may not equal proc_sz, revalidate the processor structure length */ if (entry->type == ACPI_PPTT_TYPE_PROCESSOR && acpi_cpu_id == cpu_node->acpi_processor_id && + (unsigned long)entry + entry->length <= table_end && + entry->length == proc_sz + cpu_node->number_of_priv_resources * sizeof(u32) && acpi_pptt_leaf_node(table_hdr, cpu_node)) { return (struct acpi_pptt_processor *)entry; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wentao Liang vulab@iscas.ac.cn
commit 9e000f1b7f31684cc5927e034360b87ac7919593 upstream.
The function snd_es1968_capture_open() calls the function snd_pcm_hw_constraint_pow2(), but does not check its return value. A proper implementation can be found in snd_cx25821_pcm_open().
Add error handling for snd_pcm_hw_constraint_pow2() and propagate its error code.
Fixes: b942cf815b57 ("[ALSA] es1968 - Fix stuttering capture") Cc: stable@vger.kernel.org # v2.6.22 Signed-off-by: Wentao Liang vulab@iscas.ac.cn Link: https://patch.msgid.link/20250514092444.331-1-vulab@iscas.ac.cn Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/es1968.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/sound/pci/es1968.c +++ b/sound/pci/es1968.c @@ -1569,7 +1569,7 @@ static int snd_es1968_capture_open(struc struct snd_pcm_runtime *runtime = substream->runtime; struct es1968 *chip = snd_pcm_substream_chip(substream); struct esschan *es; - int apu1, apu2; + int err, apu1, apu2;
apu1 = snd_es1968_alloc_apu_pair(chip, ESM_APU_PCM_CAPTURE); if (apu1 < 0) @@ -1613,7 +1613,9 @@ static int snd_es1968_capture_open(struc runtime->hw = snd_es1968_capture; runtime->hw.buffer_bytes_max = runtime->hw.period_bytes_max = calc_available_memory_size(chip) - 1024; /* keep MIXBUF size */ - snd_pcm_hw_constraint_pow2(runtime, 0, SNDRV_PCM_HW_PARAM_BUFFER_BYTES); + err = snd_pcm_hw_constraint_pow2(runtime, 0, SNDRV_PCM_HW_PARAM_BUFFER_BYTES); + if (err < 0) + return err;
spin_lock_irq(&chip->substream_lock); list_add(&es->list, &chip->substream_list);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Heusel christian@heusel.eu
commit 2b24eb060c2bb9ef79e1d3bcf633ba1bc95215d6 upstream.
A user reported on the Arch Linux Forums that their device is emitting the following message in the kernel journal, which is fixed by adding the quirk as submitted in this patch:
> kernel: usb 1-2: current rate 8436480 is different from the runtime rate 48000
There also is an entry for this product line added long time ago. Their specific device has the following ID:
$ lsusb | grep Audio Bus 001 Device 002: ID 1101:0003 EasyPass Industrial Co., Ltd Audioengine D1
Link: https://bbs.archlinux.org/viewtopic.php?id=305494 Fixes: 93f9d1a4ac593 ("ALSA: usb-audio: Apply sample rate quirk for Audioengine D1") Cc: stable@vger.kernel.org Signed-off-by: Christian Heusel christian@heusel.eu Link: https://patch.msgid.link/20250512-audioengine-quirk-addition-v1-1-4c370af6ef... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/quirks.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1845,6 +1845,8 @@ static const struct usb_audio_quirk_flag QUIRK_FLAG_FIXED_RATE), DEVICE_FLG(0x0fd9, 0x0008, /* Hauppauge HVR-950Q */ QUIRK_FLAG_SHARE_MEDIA_DEVICE | QUIRK_FLAG_ALIGN_TRANSFER), + DEVICE_FLG(0x1101, 0x0003, /* Audioengine D1 */ + QUIRK_FLAG_GET_SAMPLE_RATE), DEVICE_FLG(0x1224, 0x2a25, /* Jieli Technology USB PHY 2.0 */ QUIRK_FLAG_GET_SAMPLE_RATE), DEVICE_FLG(0x1395, 0x740a, /* Sennheiser DECT */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Chauvet kwizart@gmail.com
commit 7b9938a14460e8ec7649ca2e80ac0aae9815bf02 upstream.
Microdia JP001 does not support reading the sample rate which leads to many lines of "cannot get freq at ep 0x84". This patch adds the USB ID to quirks.c and avoids those error messages.
usb 7-4: New USB device found, idVendor=0c45, idProduct=636b, bcdDevice= 1.00 usb 7-4: New USB device strings: Mfr=2, Product=1, SerialNumber=3 usb 7-4: Product: JP001 usb 7-4: Manufacturer: JP001 usb 7-4: SerialNumber: JP001 usb 7-4: 3:1: cannot get freq at ep 0x84
Cc: stable@vger.kernel.org Signed-off-by: Nicolas Chauvet kwizart@gmail.com Link: https://patch.msgid.link/20250515102132.73062-1-kwizart@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/quirks.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1837,6 +1837,8 @@ static const struct usb_audio_quirk_flag QUIRK_FLAG_CTL_MSG_DELAY_1M), DEVICE_FLG(0x0c45, 0x6340, /* Sonix HD USB Camera */ QUIRK_FLAG_GET_SAMPLE_RATE), + DEVICE_FLG(0x0c45, 0x636b, /* Microdia JP001 USB Camera */ + QUIRK_FLAG_GET_SAMPLE_RATE), DEVICE_FLG(0x0d8c, 0x0014, /* USB Audio Device */ QUIRK_FLAG_CTL_MSG_DELAY_1M), DEVICE_FLG(0x0ecb, 0x205c, /* JBL Quantum610 Wireless */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: pengdonglin pengdonglin@xiaomi.com
commit e333332657f615ac2b55aa35565c4a882018bbe9 upstream.
When using the stacktrace trigger command to trace syscalls, the preemption count was consistently reported as 1 when the system call event itself had 0 (".").
For example:
root@ubuntu22-vm:/sys/kernel/tracing/events/syscalls/sys_enter_read $ echo stacktrace > trigger $ echo 1 > enable
sshd-416 [002] ..... 232.864910: sys_read(fd: a, buf: 556b1f3221d0, count: 8000) sshd-416 [002] ...1. 232.864913: <stack trace> => ftrace_syscall_enter => syscall_trace_enter => do_syscall_64 => entry_SYSCALL_64_after_hwframe
The root cause is that the trace framework disables preemption in __DO_TRACE before invoking the trigger callback.
Use the tracing_gen_ctx_dec() that will accommodate for the increase of the preemption count in __DO_TRACE when calling the callback. The result is the accurate reporting of:
sshd-410 [004] ..... 210.117660: sys_read(fd: 4, buf: 559b725ba130, count: 40000) sshd-410 [004] ..... 210.117662: <stack trace> => ftrace_syscall_enter => syscall_trace_enter => do_syscall_64 => entry_SYSCALL_64_after_hwframe
Cc: stable@vger.kernel.org Fixes: ce33c845b030c ("tracing: Dump stacktrace trigger to the corresponding instance") Link: https://lore.kernel.org/20250512094246.1167956-1-dolinux.peng@gmail.com Signed-off-by: pengdonglin dolinux.peng@gmail.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_trigger.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/trace/trace_events_trigger.c +++ b/kernel/trace/trace_events_trigger.c @@ -1244,7 +1244,7 @@ stacktrace_trigger(struct event_trigger_ struct trace_event_file *file = data->private_data;
if (file) - __trace_stack(file->tr, tracing_gen_ctx(), STACK_SKIP); + __trace_stack(file->tr, tracing_gen_ctx_dec(), STACK_SKIP); else trace_dump_stack(STACK_SKIP); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: pengdonglin pengdonglin@xiaomi.com
commit 11aff32439df6ca5b3b891b43032faf88f4a6a29 upstream.
The preemption count of the stacktrace filter command to trace ksys_read is consistently incorrect:
$ echo ksys_read:stacktrace > set_ftrace_filter
<...>-453 [004] ...1. 38.308956: <stack trace> => ksys_read => do_syscall_64 => entry_SYSCALL_64_after_hwframe
The root cause is that the trace framework disables preemption when invoking the filter command callback in function_trace_probe_call:
preempt_disable_notrace(); probe_ops->func(ip, parent_ip, probe_opsbe->tr, probe_ops, probe->data); preempt_enable_notrace();
Use tracing_gen_ctx_dec() to account for the preempt_disable_notrace(), which will output the correct preemption count:
$ echo ksys_read:stacktrace > set_ftrace_filter
<...>-410 [006] ..... 31.420396: <stack trace> => ksys_read => do_syscall_64 => entry_SYSCALL_64_after_hwframe
Cc: stable@vger.kernel.org Fixes: 36590c50b2d07 ("tracing: Merge irqflags + preempt counter.") Link: https://lore.kernel.org/20250512094246.1167956-2-dolinux.peng@gmail.com Signed-off-by: pengdonglin dolinux.peng@gmail.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_functions.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
--- a/kernel/trace/trace_functions.c +++ b/kernel/trace/trace_functions.c @@ -568,11 +568,7 @@ ftrace_traceoff(unsigned long ip, unsign
static __always_inline void trace_stack(struct trace_array *tr) { - unsigned int trace_ctx; - - trace_ctx = tracing_gen_ctx(); - - __trace_stack(tr, trace_ctx, FTRACE_STACK_SKIP); + __trace_stack(tr, tracing_gen_ctx_dec(), FTRACE_STACK_SKIP); }
static void
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt rostedt@goodmis.org
commit 1b0c192c92ea1fe2dcb178f84adf15fe37c3e7c8 upstream.
When using trace_array_printk() on a created instance, the correct function to use to initialize it is:
trace_array_init_printk()
Not
trace_printk_init_buffer()
The former is a proper function to use, the latter is for initializing trace_printk() and causes the NOTICE banner to be displayed.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: Divya Indi divya.indi@oracle.com Link: https://lore.kernel.org/20250509152657.0f6744d9@gandalf.local.home Fixes: 89ed42495ef4a ("tracing: Sample module to demonstrate kernel access to Ftrace instances.") Fixes: 38ce2a9e33db6 ("tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers") Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- samples/ftrace/sample-trace-array.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/samples/ftrace/sample-trace-array.c +++ b/samples/ftrace/sample-trace-array.c @@ -112,7 +112,7 @@ static int __init sample_trace_array_ini /* * If context specific per-cpu buffers havent already been allocated. */ - trace_printk_init_buffers(); + trace_array_init_printk(tr);
simple_tsk = kthread_run(simple_thread, NULL, "sample-instance"); if (IS_ERR(simple_tsk)) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
commit b2ea5f49580c0762d17d80d8083cb89bc3acf74f upstream.
If device_add() fails, do not use device_unregister() for error handling. device_unregister() consists two functions: device_del() and put_device(). device_unregister() should only be called after device_add() succeeded because device_del() undoes what device_add() does if successful. Change device_unregister() to put_device() call before returning from the function.
As comment of device_add() says, 'if device_add() succeeds, you should call device_del() when you want to get rid of it. If device_add() has not succeeded, use only put_device() to drop the reference count'.
Found by code review.
Cc: stable@vger.kernel.org Fixes: 53d2a715c240 ("phy: Add Tegra XUSB pad controller support") Signed-off-by: Ma Ke make24@iscas.ac.cn Acked-by: Thierry Reding treding@nvidia.com Link: https://lore.kernel.org/r/20250303072739.3874987-1-make24@iscas.ac.cn Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/phy/tegra/xusb.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/phy/tegra/xusb.c +++ b/drivers/phy/tegra/xusb.c @@ -542,16 +542,16 @@ static int tegra_xusb_port_init(struct t
err = dev_set_name(&port->dev, "%s-%u", name, index); if (err < 0) - goto unregister; + goto put_device;
err = device_add(&port->dev); if (err < 0) - goto unregister; + goto put_device;
return 0;
-unregister: - device_unregister(&port->dev); +put_device: + put_device(&port->dev); return err; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com
commit 86e70849f4b2b4597ac9f7c7931f2a363774be25 upstream.
phy-rcar-gen3-usb2 driver exports 4 PHYs. The timing registers are common to all PHYs. There is no need to set them every time a PHY is initialized. Set timing register only when the 1st PHY is initialized.
Fixes: f3b5a8d9b50d ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver") Cc: stable@vger.kernel.org Reviewed-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Tested-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Reviewed-by: Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com Link: https://lore.kernel.org/r/20250507125032.565017-6-claudiu.beznea.uj@bp.renes... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/phy/renesas/phy-rcar-gen3-usb2.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/phy/renesas/phy-rcar-gen3-usb2.c +++ b/drivers/phy/renesas/phy-rcar-gen3-usb2.c @@ -453,8 +453,11 @@ static int rcar_gen3_phy_usb2_init(struc val = readl(usb2_base + USB2_INT_ENABLE); val |= USB2_INT_ENABLE_UCOM_INTEN | rphy->int_enable_bits; writel(val, usb2_base + USB2_INT_ENABLE); - writel(USB2_SPD_RSM_TIMSET_INIT, usb2_base + USB2_SPD_RSM_TIMSET); - writel(USB2_OC_TIMSET_INIT, usb2_base + USB2_OC_TIMSET); + + if (!rcar_gen3_is_any_rphy_initialized(channel)) { + writel(USB2_SPD_RSM_TIMSET_INIT, usb2_base + USB2_SPD_RSM_TIMSET); + writel(USB2_OC_TIMSET_INIT, usb2_base + USB2_OC_TIMSET); + }
/* Initialize otg part */ if (channel->is_otg_channel) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fedor Pchelkin pchelkin@ispras.ru
commit 78ab4be549533432d97ea8989d2f00b508fa68d8 upstream.
A warning on driver removal started occurring after commit 9dd05df8403b ("net: warn if NAPI instance wasn't shut down"). Disable tx napi before deleting it in mt76_dma_cleanup().
WARNING: CPU: 4 PID: 18828 at net/core/dev.c:7288 __netif_napi_del_locked+0xf0/0x100 CPU: 4 UID: 0 PID: 18828 Comm: modprobe Not tainted 6.15.0-rc4 #4 PREEMPT(lazy) Hardware name: ASUS System Product Name/PRIME X670E-PRO WIFI, BIOS 3035 09/05/2024 RIP: 0010:__netif_napi_del_locked+0xf0/0x100 Call Trace: <TASK> mt76_dma_cleanup+0x54/0x2f0 [mt76] mt7921_pci_remove+0xd5/0x190 [mt7921e] pci_device_remove+0x47/0xc0 device_release_driver_internal+0x19e/0x200 driver_detach+0x48/0x90 bus_remove_driver+0x6d/0xf0 pci_unregister_driver+0x2e/0xb0 __do_sys_delete_module.isra.0+0x197/0x2e0 do_syscall_64+0x7b/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Tested with mt7921e but the same pattern can be actually applied to other mt76 drivers calling mt76_dma_cleanup() during removal. Tx napi is enabled in their *_dma_init() functions and only toggled off and on again inside their suspend/resume/reset paths. So it should be okay to disable tx napi in such a generic way.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 2ac515a5d74f ("mt76: mt76x02: use napi polling for tx cleanup") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Tested-by: Ming Yen Hsieh mingyen.hsieh@mediatek.com Link: https://patch.msgid.link/20250506115540.19045-1-pchelkin@ispras.ru Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/mediatek/mt76/dma.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/wireless/mediatek/mt76/dma.c +++ b/drivers/net/wireless/mediatek/mt76/dma.c @@ -684,6 +684,7 @@ void mt76_dma_cleanup(struct mt76_dev *d int i;
mt76_worker_disable(&dev->tx_worker); + napi_disable(&dev->tx_napi); netif_napi_del(&dev->tx_napi);
for (i = 0; i < ARRAY_SIZE(dev->phy.q_tx); i++) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ronald Wahl ronald.wahl@legrand.com
commit fca280992af8c2fbd511bc43f65abb4a17363f2f upstream.
Recent kernels complain about a missing lock in k3-udma.c when the lock validator is enabled:
[ 4.128073] WARNING: CPU: 0 PID: 746 at drivers/dma/ti/../virt-dma.h:169 udma_start.isra.0+0x34/0x238 [ 4.137352] CPU: 0 UID: 0 PID: 746 Comm: kworker/0:3 Not tainted 6.12.9-arm64 #28 [ 4.144867] Hardware name: pp-v12 (DT) [ 4.148648] Workqueue: events udma_check_tx_completion [ 4.153841] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 4.160834] pc : udma_start.isra.0+0x34/0x238 [ 4.165227] lr : udma_start.isra.0+0x30/0x238 [ 4.169618] sp : ffffffc083cabcf0 [ 4.172963] x29: ffffffc083cabcf0 x28: 0000000000000000 x27: ffffff800001b005 [ 4.180167] x26: ffffffc0812f0000 x25: 0000000000000000 x24: 0000000000000000 [ 4.187370] x23: 0000000000000001 x22: 00000000e21eabe9 x21: ffffff8000fa0670 [ 4.194571] x20: ffffff8001b6bf00 x19: ffffff8000fa0430 x18: ffffffc083b95030 [ 4.201773] x17: 0000000000000000 x16: 00000000f0000000 x15: 0000000000000048 [ 4.208976] x14: 0000000000000048 x13: 0000000000000000 x12: 0000000000000001 [ 4.216179] x11: ffffffc08151a240 x10: 0000000000003ea1 x9 : ffffffc08046ab68 [ 4.223381] x8 : ffffffc083cabac0 x7 : ffffffc081df3718 x6 : 0000000000029fc8 [ 4.230583] x5 : ffffffc0817ee6d8 x4 : 0000000000000bc0 x3 : 0000000000000000 [ 4.237784] x2 : 0000000000000000 x1 : 00000000001fffff x0 : 0000000000000000 [ 4.244986] Call trace: [ 4.247463] udma_start.isra.0+0x34/0x238 [ 4.251509] udma_check_tx_completion+0xd0/0xdc [ 4.256076] process_one_work+0x244/0x3fc [ 4.260129] process_scheduled_works+0x6c/0x74 [ 4.264610] worker_thread+0x150/0x1dc [ 4.268398] kthread+0xd8/0xe8 [ 4.271492] ret_from_fork+0x10/0x20 [ 4.275107] irq event stamp: 220 [ 4.278363] hardirqs last enabled at (219): [<ffffffc080a27c7c>] _raw_spin_unlock_irq+0x38/0x50 [ 4.287183] hardirqs last disabled at (220): [<ffffffc080a1c154>] el1_dbg+0x24/0x50 [ 4.294879] softirqs last enabled at (182): [<ffffffc080037e68>] handle_softirqs+0x1c0/0x3cc [ 4.303437] softirqs last disabled at (177): [<ffffffc080010170>] __do_softirq+0x1c/0x28 [ 4.311559] ---[ end trace 0000000000000000 ]---
This commit adds the missing locking.
Fixes: 25dcb5dd7b7c ("dmaengine: ti: New driver for K3 UDMA") Cc: Peter Ujfalusi peter.ujfalusi@gmail.com Cc: Vignesh Raghavendra vigneshr@ti.com Cc: Vinod Koul vkoul@kernel.org Cc: dmaengine@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Ronald Wahl ronald.wahl@legrand.com Acked-by: Peter Ujfalusi peter.ujfalusi@gmail.com Link: https://lore.kernel.org/r/20250414173113.80677-1-rwahl@gmx.de Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/ti/k3-udma.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -1082,8 +1082,11 @@ static void udma_check_tx_completion(str u32 residue_diff; ktime_t time_diff; unsigned long delay; + unsigned long flags;
while (1) { + spin_lock_irqsave(&uc->vc.lock, flags); + if (uc->desc) { /* Get previous residue and time stamp */ residue_diff = uc->tx_drain.residue; @@ -1118,6 +1121,8 @@ static void udma_check_tx_completion(str break; }
+ spin_unlock_irqrestore(&uc->vc.lock, flags); + usleep_range(ktime_to_us(delay), ktime_to_us(delay) + 10); continue; @@ -1134,6 +1139,8 @@ static void udma_check_tx_completion(str
break; } + + spin_unlock_irqrestore(&uc->vc.lock, flags); }
static irqreturn_t udma_ring_irq_handler(int irq, void *data)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yemike Abhilash Chandra y-abhilashchandra@ti.com
commit 8ca9590c39b69b55a8de63d2b21b0d44f523b43a upstream.
Currently, a local dma_cap_mask_t variable is used to store device cap_mask within udma_of_xlate(). However, the DMA_PRIVATE flag in the device cap_mask can get cleared when the last channel is released. This can happen right after storing the cap_mask locally in udma_of_xlate(), and subsequent dma_request_channel() can fail due to mismatch in the cap_mask. Fix this by removing the local dma_cap_mask_t variable and directly using the one from the dma_device structure.
Fixes: 25dcb5dd7b7c ("dmaengine: ti: New driver for K3 UDMA") Cc: stable@vger.kernel.org Signed-off-by: Vaishnav Achath vaishnav.a@ti.com Acked-by: Peter Ujfalusi peter.ujfalusi@gmail.com Reviewed-by: Udit Kumar u-kumar1@ti.com Signed-off-by: Yemike Abhilash Chandra y-abhilashchandra@ti.com Link: https://lore.kernel.org/r/20250417075521.623651-1-y-abhilashchandra@ti.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/ti/k3-udma.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -4210,7 +4210,6 @@ static struct dma_chan *udma_of_xlate(st struct of_dma *ofdma) { struct udma_dev *ud = ofdma->of_dma_data; - dma_cap_mask_t mask = ud->ddev.cap_mask; struct udma_filter_param filter_param; struct dma_chan *chan;
@@ -4242,7 +4241,7 @@ static struct dma_chan *udma_of_xlate(st } }
- chan = __dma_request_channel(&mask, udma_dma_filter_fn, &filter_param, + chan = __dma_request_channel(&ud->ddev.cap_mask, udma_dma_filter_fn, &filter_param, ofdma->of_node); if (!chan) { dev_err(ud->dev, "get channel fail in %s.\n", __func__);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuai Xue xueshuai@linux.alibaba.com
commit 817bced19d1dbdd0b473580d026dc0983e30e17b upstream.
Memory allocated for engines is not freed if an error occurs during idxd_setup_engines(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error.
Fixes: 75b911309060 ("dmaengine: idxd: fix engine conf_dev lifetime") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue xueshuai@linux.alibaba.com Reviewed-by: Dave Jiang dave.jiang@intel.com Reviewed-by: Fenghua Yu fenghuay@nvidia.com Link: https://lore.kernel.org/r/20250404120217.48772-3-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/idxd/init.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/dma/idxd/init.c +++ b/drivers/dma/idxd/init.c @@ -289,6 +289,7 @@ static int idxd_setup_engines(struct idx rc = dev_set_name(conf_dev, "engine%d.%d", idxd->id, engine->id); if (rc < 0) { put_device(conf_dev); + kfree(engine); goto err; }
@@ -302,7 +303,10 @@ static int idxd_setup_engines(struct idx engine = idxd->engines[i]; conf_dev = engine_confdev(engine); put_device(conf_dev); + kfree(engine); } + kfree(idxd->engines); + return rc; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuai Xue xueshuai@linux.alibaba.com
commit aa6f4f945b10eac57aed46154ae7d6fada7fccc7 upstream.
Memory allocated for groups is not freed if an error occurs during idxd_setup_groups(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error.
Fixes: defe49f96012 ("dmaengine: idxd: fix group conf_dev lifetime") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue xueshuai@linux.alibaba.com Reviewed-by: Dave Jiang dave.jiang@intel.com Reviewed-by: Fenghua Yu fenghuay@nvidia.com Link: https://lore.kernel.org/r/20250404120217.48772-4-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/idxd/init.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/dma/idxd/init.c +++ b/drivers/dma/idxd/init.c @@ -340,6 +340,7 @@ static int idxd_setup_groups(struct idxd rc = dev_set_name(conf_dev, "group%d.%d", idxd->id, group->id); if (rc < 0) { put_device(conf_dev); + kfree(group); goto err; }
@@ -359,7 +360,10 @@ static int idxd_setup_groups(struct idxd while (--i >= 0) { group = idxd->groups[i]; put_device(group_confdev(group)); + kfree(group); } + kfree(idxd->groups); + return rc; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fengnan Chang changfengnan@bytedance.com
commit 8b44b4d81598 ("block: don't allow multiple bios for IOCB_NOWAIT issue") backport a upstream fix, but miss commit b77c88c2100c ("block: pass a block_device and opf to bio_alloc_kiocb"), and introduce this bug. commit b77c88c2100c ("block: pass a block_device and opf to bio_alloc_kiocb") have other depend patch, so just fix it.
Fixes: 8b44b4d81598 ("block: don't allow multiple bios for IOCB_NOWAIT issue") Signed-off-by: Fengnan Chang changfengnan@bytedance.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/fops.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/block/fops.c +++ b/block/fops.c @@ -259,7 +259,6 @@ static ssize_t __blkdev_direct_IO(struct blk_finish_plug(&plug); return -EAGAIN; } - bio->bi_opf |= REQ_NOWAIT; }
if (is_read) { @@ -270,6 +269,10 @@ static ssize_t __blkdev_direct_IO(struct bio->bi_opf = dio_bio_write_op(iocb); task_io_account_write(bio->bi_iter.bi_size); } + + if (iocb->ki_flags & IOCB_NOWAIT) + bio->bi_opf |= REQ_NOWAIT; + dio->size += bio->bi_iter.bi_size; pos += bio->bi_iter.bi_size;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sebastian Andrzej Siewior bigeasy@linutronix.de
commit 94cff94634e506a4a44684bee1875d2dbf782722 upstream.
On x86 during boot, clockevent_i8253_disable() can be invoked via x86_late_time_init -> hpet_time_init() -> pit_timer_init() which happens with enabled interrupts.
If some of the old i8253 hardware is actually used then lockdep will notice that i8253_lock is used in hard interrupt context. This causes lockdep to complain because it observed the lock being acquired with interrupts enabled and in hard interrupt context.
Make clockevent_i8253_disable() acquire the lock with raw_spinlock_irqsave() to cure this.
[ tglx: Massage change log and use guard() ]
Fixes: c8c4076723dac ("x86/timer: Skip PIT initialization on modern chipsets") Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20250404133116.p-XRWJXf@linutronix.de Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clocksource/i8253.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/clocksource/i8253.c +++ b/drivers/clocksource/i8253.c @@ -103,7 +103,9 @@ int __init clocksource_i8253_init(void) #ifdef CONFIG_CLKEVT_I8253 void clockevent_i8253_disable(void) { - raw_spin_lock(&i8253_lock); + unsigned long flags; + + raw_spin_lock_irqsave(&i8253_lock, flags);
/* * Writing the MODE register should stop the counter, according to @@ -133,7 +135,7 @@ void clockevent_i8253_disable(void)
outb_p(0x30, PIT_MODE);
- raw_spin_unlock(&i8253_lock); + raw_spin_unlock_irqrestore(&i8253_lock, flags); }
static int pit_shutdown(struct clock_event_device *evt)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrei Kuchynski akuchynski@chromium.org
commit 364618c89d4c57c85e5fc51a2446cd939bf57802 upstream.
This patch introduces the ucsi_con_mutex_lock / ucsi_con_mutex_unlock functions to the UCSI driver. ucsi_con_mutex_lock ensures the connector mutex is only locked if a connection is established and the partner pointer is valid. This resolves a deadlock scenario where ucsi_displayport_remove_partner holds con->mutex waiting for dp_altmode_work to complete while dp_altmode_work attempts to acquire it.
Cc: stable stable@kernel.org Fixes: af8622f6a585 ("usb: typec: ucsi: Support for DisplayPort alt mode") Signed-off-by: Andrei Kuchynski akuchynski@chromium.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20250424084429.3220757-2-akuchynski@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/ucsi/displayport.c | 19 +++++++++++-------- drivers/usb/typec/ucsi/ucsi.c | 34 ++++++++++++++++++++++++++++++++++ drivers/usb/typec/ucsi/ucsi.h | 3 +++ 3 files changed, 48 insertions(+), 8 deletions(-)
--- a/drivers/usb/typec/ucsi/displayport.c +++ b/drivers/usb/typec/ucsi/displayport.c @@ -54,7 +54,8 @@ static int ucsi_displayport_enter(struct u8 cur = 0; int ret;
- mutex_lock(&dp->con->lock); + if (!ucsi_con_mutex_lock(dp->con)) + return -ENOTCONN;
if (!dp->override && dp->initialized) { const struct typec_altmode *p = typec_altmode_get_partner(alt); @@ -100,7 +101,7 @@ static int ucsi_displayport_enter(struct schedule_work(&dp->work); ret = 0; err_unlock: - mutex_unlock(&dp->con->lock); + ucsi_con_mutex_unlock(dp->con);
return ret; } @@ -112,7 +113,8 @@ static int ucsi_displayport_exit(struct u64 command; int ret = 0;
- mutex_lock(&dp->con->lock); + if (!ucsi_con_mutex_lock(dp->con)) + return -ENOTCONN;
if (!dp->override) { const struct typec_altmode *p = typec_altmode_get_partner(alt); @@ -144,7 +146,7 @@ static int ucsi_displayport_exit(struct schedule_work(&dp->work);
out_unlock: - mutex_unlock(&dp->con->lock); + ucsi_con_mutex_unlock(dp->con);
return ret; } @@ -202,20 +204,21 @@ static int ucsi_displayport_vdm(struct t int cmd = PD_VDO_CMD(header); int svdm_version;
- mutex_lock(&dp->con->lock); + if (!ucsi_con_mutex_lock(dp->con)) + return -ENOTCONN;
if (!dp->override && dp->initialized) { const struct typec_altmode *p = typec_altmode_get_partner(alt);
dev_warn(&p->dev, "firmware doesn't support alternate mode overriding\n"); - mutex_unlock(&dp->con->lock); + ucsi_con_mutex_unlock(dp->con); return -EOPNOTSUPP; }
svdm_version = typec_altmode_get_svdm_version(alt); if (svdm_version < 0) { - mutex_unlock(&dp->con->lock); + ucsi_con_mutex_unlock(dp->con); return svdm_version; }
@@ -259,7 +262,7 @@ static int ucsi_displayport_vdm(struct t break; }
- mutex_unlock(&dp->con->lock); + ucsi_con_mutex_unlock(dp->con);
return 0; } --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -1352,6 +1352,40 @@ void ucsi_set_drvdata(struct ucsi *ucsi, EXPORT_SYMBOL_GPL(ucsi_set_drvdata);
/** + * ucsi_con_mutex_lock - Acquire the connector mutex + * @con: The connector interface to lock + * + * Returns true on success, false if the connector is disconnected + */ +bool ucsi_con_mutex_lock(struct ucsi_connector *con) +{ + bool mutex_locked = false; + bool connected = true; + + while (connected && !mutex_locked) { + mutex_locked = mutex_trylock(&con->lock) != 0; + connected = con->status.flags & UCSI_CONSTAT_CONNECTED; + if (connected && !mutex_locked) + msleep(20); + } + + connected = connected && con->partner; + if (!connected && mutex_locked) + mutex_unlock(&con->lock); + + return connected; +} + +/** + * ucsi_con_mutex_unlock - Release the connector mutex + * @con: The connector interface to unlock + */ +void ucsi_con_mutex_unlock(struct ucsi_connector *con) +{ + mutex_unlock(&con->lock); +} + +/** * ucsi_create - Allocate UCSI instance * @dev: Device interface to the PPM (Platform Policy Manager) * @ops: I/O routines --- a/drivers/usb/typec/ucsi/ucsi.h +++ b/drivers/usb/typec/ucsi/ucsi.h @@ -15,6 +15,7 @@
struct ucsi; struct ucsi_altmode; +struct ucsi_connector;
/* UCSI offsets (Bytes) */ #define UCSI_VERSION 0 @@ -62,6 +63,8 @@ int ucsi_register(struct ucsi *ucsi); void ucsi_unregister(struct ucsi *ucsi); void *ucsi_get_drvdata(struct ucsi *ucsi); void ucsi_set_drvdata(struct ucsi *ucsi, void *data); +bool ucsi_con_mutex_lock(struct ucsi_connector *con); +void ucsi_con_mutex_unlock(struct ucsi_connector *con);
void ucsi_connector_change(struct ucsi *ucsi, u8 num);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: RD Babiera rdbabiera@google.com
commit 165376f6b23e9a779850e750fb2eb06622e5a531 upstream.
The DisplayPort driver's sysfs nodes may be present to the userspace before typec_altmode_set_drvdata() completes in dp_altmode_probe. This means that a sysfs read can trigger a NULL pointer error by deferencing dp->hpd in hpd_show or dp->lock in pin_assignment_show, as dev_get_drvdata() returns NULL in those cases.
Remove manual sysfs node creation in favor of adding attribute group as default for devices bound to the driver. The ATTRIBUTE_GROUPS() macro is not used here otherwise the path to the sysfs nodes is no longer compliant with the ABI.
Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode") Cc: stable@vger.kernel.org Signed-off-by: RD Babiera rdbabiera@google.com Link: https://lore.kernel.org/r/20240229001101.3889432-2-rdbabiera@google.com [Minor conflict resolved due to code context change.] Signed-off-by: Jianqi Ren jianqi.ren.cn@windriver.com Signed-off-by: He Zhe zhe.he@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/altmodes/displayport.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
--- a/drivers/usb/typec/altmodes/displayport.c +++ b/drivers/usb/typec/altmodes/displayport.c @@ -521,22 +521,26 @@ static ssize_t pin_assignment_show(struc } static DEVICE_ATTR_RW(pin_assignment);
-static struct attribute *dp_altmode_attrs[] = { +static struct attribute *displayport_attrs[] = { &dev_attr_configuration.attr, &dev_attr_pin_assignment.attr, NULL };
-static const struct attribute_group dp_altmode_group = { +static const struct attribute_group displayport_group = { .name = "displayport", - .attrs = dp_altmode_attrs, + .attrs = displayport_attrs, +}; + +static const struct attribute_group *displayport_groups[] = { + &displayport_group, + NULL, };
int dp_altmode_probe(struct typec_altmode *alt) { const struct typec_altmode *port = typec_altmode_get_partner(alt); struct dp_altmode *dp; - int ret;
/* FIXME: Port can only be DFP_U. */
@@ -547,10 +551,6 @@ int dp_altmode_probe(struct typec_altmod DP_CAP_PIN_ASSIGN_DFP_D(alt->vdo))) return -ENODEV;
- ret = sysfs_create_group(&alt->dev.kobj, &dp_altmode_group); - if (ret) - return ret; - dp = devm_kzalloc(&alt->dev, sizeof(*dp), GFP_KERNEL); if (!dp) return -ENOMEM; @@ -576,7 +576,6 @@ void dp_altmode_remove(struct typec_altm { struct dp_altmode *dp = typec_altmode_get_drvdata(alt);
- sysfs_remove_group(&alt->dev.kobj, &dp_altmode_group); cancel_work_sync(&dp->work); } EXPORT_SYMBOL_GPL(dp_altmode_remove); @@ -594,6 +593,7 @@ static struct typec_altmode_driver dp_al .driver = { .name = "typec_displayport", .owner = THIS_MODULE, + .dev_groups = displayport_groups, }, }; module_typec_altmode_driver(dp_altmode_driver);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit e56aac6e5a25630645607b6856d4b2a17b2311a5 upstream.
The "command" variable can be controlled by the user via debugfs. The worry is that if con_index is zero then "&uc->ucsi->connector[con_index - 1]" would be an array underflow.
Fixes: 170a6726d0e2 ("usb: typec: ucsi: add support for separate DP altmode devices") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/c69ef0b3-61b0-4dde-98dd-97b97f81d912@stanley.mount... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [ The function ucsi_ccg_sync_write() is renamed to ucsi_ccg_sync_control() in commit 13f2ec3115c8 ("usb: typec: ucsi:simplify command sending API"). Apply this patch to ucsi_ccg_sync_write() in 6.1.y accordingly. ] Signed-off-by: Bin Lan bin.lan.cn@windriver.com Signed-off-by: He Zhe zhe.he@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/ucsi/ucsi_ccg.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/usb/typec/ucsi/ucsi_ccg.c +++ b/drivers/usb/typec/ucsi/ucsi_ccg.c @@ -573,6 +573,10 @@ static int ucsi_ccg_sync_write(struct uc uc->has_multiple_dp) { con_index = (uc->last_cmd_sent >> 16) & UCSI_CMD_CONNECTOR_MASK; + if (con_index == 0) { + ret = -EINVAL; + goto unlock; + } con = &uc->ucsi->connector[con_index - 1]; ucsi_ccg_update_set_new_cam_cmd(uc, con, (u64 *)val); } @@ -588,6 +592,7 @@ static int ucsi_ccg_sync_write(struct uc err_clear_bit: clear_bit(DEV_CMD_PENDING, &uc->flags); pm_runtime_put_sync(uc->dev); +unlock: mutex_unlock(&uc->lock);
return ret;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: GONG Ruiqi gongruiqi1@huawei.com
commit b0e525d7a22ea350e75e2aec22e47fcfafa4cacd upstream.
The error handling for the case `con_index == 0` should involve dropping the pm usage counter, as ucsi_ccg_sync_control() gets it at the beginning. Fix it.
Cc: stable stable@kernel.org Fixes: e56aac6e5a25 ("usb: typec: fix potential array underflow in ucsi_ccg_sync_control()") Signed-off-by: GONG Ruiqi gongruiqi1@huawei.com Reviewed-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20250107015750.2778646-1-gongruiqi1@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [Minor context change fixed.] Signed-off-by: Bin Lan bin.lan.cn@windriver.com Signed-off-by: He Zhe zhe.he@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/ucsi/ucsi_ccg.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/typec/ucsi/ucsi_ccg.c +++ b/drivers/usb/typec/ucsi/ucsi_ccg.c @@ -575,7 +575,7 @@ static int ucsi_ccg_sync_write(struct uc UCSI_CMD_CONNECTOR_MASK; if (con_index == 0) { ret = -EINVAL; - goto unlock; + goto err_put; } con = &uc->ucsi->connector[con_index - 1]; ucsi_ccg_update_set_new_cam_cmd(uc, con, (u64 *)val); @@ -591,8 +591,8 @@ static int ucsi_ccg_sync_write(struct uc
err_clear_bit: clear_bit(DEV_CMD_PENDING, &uc->flags); +err_put: pm_runtime_put_sync(uc->dev); -unlock: mutex_unlock(&uc->lock);
return ret;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Feng Tang feng.tang@linux.alibaba.com
commit ab00ddd802f80e31fc9639c652d736fe3913feae upstream.
When running mm selftest to verify mm patches, 'compaction_test' case failed on an x86 server with 1TB memory. And the root cause is that it has too much free memory than what the test supports.
The test case tries to allocate 100000 huge pages, which is about 200 GB for that x86 server, and when it succeeds, it expects it's large than 1/3 of 80% of the free memory in system. This logic only works for platform with 750 GB ( 200 / (1/3) / 80% ) or less free memory, and may raise false alarm for others.
Fix it by changing the fixed page number to self-adjustable number according to the real number of free memory.
Link: https://lkml.kernel.org/r/20250423103645.2758-1-feng.tang@linux.alibaba.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Feng Tang feng.tang@linux.alibaba.com Acked-by: Dev Jain dev.jain@arm.com Reviewed-by: Baolin Wang baolin.wang@linux.alibaba.com Tested-by: Baolin Wang baolin.wang@inux.alibaba.com Cc: Shuah Khan shuah@kernel.org Cc: Sri Jayaramappa sjayaram@akamai.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/vm/compaction_test.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
--- a/tools/testing/selftests/vm/compaction_test.c +++ b/tools/testing/selftests/vm/compaction_test.c @@ -89,6 +89,8 @@ int check_compaction(unsigned long mem_f int compaction_index = 0; char initial_nr_hugepages[20] = {0}; char nr_hugepages[20] = {0}; + char target_nr_hugepages[24] = {0}; + int slen;
/* We want to test with 80% of available memory. Else, OOM killer comes in to play */ @@ -118,11 +120,18 @@ int check_compaction(unsigned long mem_f
lseek(fd, 0, SEEK_SET);
- /* Request a large number of huge pages. The Kernel will allocate - as much as it can */ - if (write(fd, "100000", (6*sizeof(char))) != (6*sizeof(char))) { - ksft_test_result_fail("Failed to write 100000 to /proc/sys/vm/nr_hugepages: %s\n", - strerror(errno)); + /* + * Request huge pages for about half of the free memory. The Kernel + * will allocate as much as it can, and we expect it will get at least 1/3 + */ + nr_hugepages_ul = mem_free / hugepage_size / 2; + snprintf(target_nr_hugepages, sizeof(target_nr_hugepages), + "%lu", nr_hugepages_ul); + + slen = strlen(target_nr_hugepages); + if (write(fd, target_nr_hugepages, slen) != slen) { + ksft_test_result_fail("Failed to write %lu to /proc/sys/vm/nr_hugepages: %s\n", + nr_hugepages_ul, strerror(errno)); goto close_fd; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
commit 10206302af856791fbcc27a33ed3c3eb09b2793d upstream.
We must serialize calls to sctp_udp_sock_stop() and sctp_udp_sock_start() or risk a crash as syzbot reported:
Oops: general protection fault, probably for non-canonical address 0xdffffc000000000d: 0000 [#1] SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f] CPU: 1 UID: 0 PID: 6551 Comm: syz.1.44 Not tainted 6.14.0-syzkaller-g7f2ff7b62617 #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 RIP: 0010:kernel_sock_shutdown+0x47/0x70 net/socket.c:3653 Call Trace: <TASK> udp_tunnel_sock_release+0x68/0x80 net/ipv4/udp_tunnel_core.c:181 sctp_udp_sock_stop+0x71/0x160 net/sctp/protocol.c:930 proc_sctp_do_udp_port+0x264/0x450 net/sctp/sysctl.c:553 proc_sys_call_handler+0x3d0/0x5b0 fs/proc/proc_sysctl.c:601 iter_file_splice_write+0x91c/0x1150 fs/splice.c:738 do_splice_from fs/splice.c:935 [inline] direct_splice_actor+0x18f/0x6c0 fs/splice.c:1158 splice_direct_to_actor+0x342/0xa30 fs/splice.c:1102 do_splice_direct_actor fs/splice.c:1201 [inline] do_splice_direct+0x174/0x240 fs/splice.c:1227 do_sendfile+0xafd/0xe50 fs/read_write.c:1368 __do_sys_sendfile64 fs/read_write.c:1429 [inline] __se_sys_sendfile64 fs/read_write.c:1415 [inline] __x64_sys_sendfile64+0x1d8/0x220 fs/read_write.c:1415 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
Fixes: 046c052b475e ("sctp: enable udp tunneling socks") Reported-by: syzbot+fae49d997eb56fa7c74d@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/67ea5c01.050a0220.1547ec.012b.GAE@google.com/... Signed-off-by: Eric Dumazet edumazet@google.com Cc: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Acked-by: Xin Long lucien.xin@gmail.com Link: https://patch.msgid.link/20250331091532.224982-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org [Minor conflict resolved due to code context change.] Signed-off-by: Jianqi Ren jianqi.ren.cn@windriver.com Signed-off-by: He Zhe zhe.he@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/sysctl.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/net/sctp/sysctl.c +++ b/net/sctp/sysctl.c @@ -518,6 +518,8 @@ static int proc_sctp_do_auth(struct ctl_ return ret; }
+static DEFINE_MUTEX(sctp_sysctl_mutex); + static int proc_sctp_do_udp_port(struct ctl_table *ctl, int write, void *buffer, size_t *lenp, loff_t *ppos) { @@ -542,6 +544,7 @@ static int proc_sctp_do_udp_port(struct if (new_value > max || new_value < min) return -EINVAL;
+ mutex_lock(&sctp_sysctl_mutex); net->sctp.udp_port = new_value; sctp_udp_sock_stop(net); if (new_value) { @@ -554,6 +557,7 @@ static int proc_sctp_do_udp_port(struct lock_sock(sk); sctp_sk(sk)->udp_port = htons(net->sctp.udp_port); release_sock(sk); + mutex_unlock(&sctp_sysctl_mutex); }
return ret;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit 28cb13f29faf6290597b24b728dc3100c019356f upstream.
Instead of doing a BUG_ON() handle the error by returning -EUCLEAN, aborting the transaction and logging an error message.
Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com [Minor conflict resolved due to code context change.] Signed-off-by: Jianqi Ren jianqi.ren.cn@windriver.com Signed-off-by: He Zhe zhe.he@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent-tree.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-)
--- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -176,6 +176,14 @@ search_again: ei = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_extent_item); num_refs = btrfs_extent_refs(leaf, ei); + if (unlikely(num_refs == 0)) { + ret = -EUCLEAN; + btrfs_err(fs_info, + "unexpected zero reference count for extent item (%llu %u %llu)", + key.objectid, key.type, key.offset); + btrfs_abort_transaction(trans, ret); + goto out_free; + } extent_flags = btrfs_extent_flags(leaf, ei); } else { ret = -EINVAL; @@ -187,8 +195,6 @@ search_again:
goto out_free; } - - BUG_ON(num_refs == 0); } else { num_refs = 0; extent_flags = 0; @@ -218,10 +224,19 @@ search_again: goto search_again; } spin_lock(&head->lock); - if (head->extent_op && head->extent_op->update_flags) + if (head->extent_op && head->extent_op->update_flags) { extent_flags |= head->extent_op->flags_to_set; - else - BUG_ON(num_refs == 0); + } else if (unlikely(num_refs == 0)) { + spin_unlock(&head->lock); + mutex_unlock(&head->mutex); + spin_unlock(&delayed_refs->lock); + ret = -EUCLEAN; + btrfs_err(fs_info, + "unexpected zero reference count for extent %llu (%s)", + bytenr, metadata ? "metadata" : "data"); + btrfs_abort_transaction(trans, ret); + goto out_free; + }
num_refs += head->ref_mod; spin_unlock(&head->lock);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
commit 8cbc3001a3264d998d6b6db3e23f935c158abd4d upstream.
The submit helper will always run bio_endio() on the bio if it fails to submit, so cleaning up the bio just leads to a variety of use-after-free and NULL pointer dereference bugs because we race with the endio function that is cleaning up the bio. Instead just return BLK_STS_OK as the repair function has to continue to process the rest of the pages, and the endio for the repair bio will do the appropriate cleanup for the page that it was given.
Reviewed-by: Boris Burkov boris@bur.io Signed-off-by: Josef Bacik josef@toxicpanda.com Signed-off-by: David Sterba dsterba@suse.com [Minor context change fixed.] Signed-off-by: Bin Lan bin.lan.cn@windriver.com Signed-off-by: He Zhe zhe.he@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent_io.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
--- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2624,7 +2624,6 @@ int btrfs_repair_one_sector(struct inode const int icsum = bio_offset >> fs_info->sectorsize_bits; struct bio *repair_bio; struct btrfs_io_bio *repair_io_bio; - blk_status_t status;
btrfs_debug(fs_info, "repair read error: read error at %llu", start); @@ -2664,13 +2663,13 @@ int btrfs_repair_one_sector(struct inode "repair read error: submitting new read to mirror %d", failrec->this_mirror);
- status = submit_bio_hook(inode, repair_bio, failrec->this_mirror, - failrec->bio_flags); - if (status) { - free_io_failure(failure_tree, tree, failrec); - bio_put(repair_bio); - } - return blk_status_to_errno(status); + /* + * At this point we have a bio, so any errors from submit_bio_hook() + * will be handled by the endio on the repair_bio, so we can't return an + * error here. + */ + submit_bio_hook(inode, repair_bio, failrec->this_mirror, failrec->bio_flags); + return BLK_STS_OK; }
static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit 8965d42bcf54d42cbc72fe34a9d0ec3f8527debd upstream.
It would be better to not store nft_ctx inside nft_trans object, the netlink ctx strucutre is huge and most of its information is never needed in places that use trans->ctx.
Avoid/reduce its usage if possible, no runtime behaviour change intended.
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/netfilter/nf_tables.h | 2 +- net/netfilter/nf_tables_api.c | 17 ++++++++--------- net/netfilter/nft_immediate.c | 2 +- 3 files changed, 10 insertions(+), 11 deletions(-)
--- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -1088,7 +1088,7 @@ static inline bool nft_chain_is_bound(st
int nft_chain_add(struct nft_table *table, struct nft_chain *chain); void nft_chain_del(struct nft_chain *chain); -void nf_tables_chain_destroy(struct nft_ctx *ctx); +void nf_tables_chain_destroy(struct nft_chain *chain);
struct nft_stats { u64 bytes; --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -1981,9 +1981,9 @@ static void nf_tables_chain_free_chain_r kvfree(chain->rules_next); }
-void nf_tables_chain_destroy(struct nft_ctx *ctx) +void nf_tables_chain_destroy(struct nft_chain *chain) { - struct nft_chain *chain = ctx->chain; + const struct nft_table *table = chain->table; struct nft_hook *hook, *next;
if (WARN_ON(chain->use > 0)) @@ -1995,7 +1995,7 @@ void nf_tables_chain_destroy(struct nft_ if (nft_is_base_chain(chain)) { struct nft_base_chain *basechain = nft_base_chain(chain);
- if (nft_base_chain_netdev(ctx->family, basechain->ops.hooknum)) { + if (nft_base_chain_netdev(table->family, basechain->ops.hooknum)) { list_for_each_entry_safe(hook, next, &basechain->hook_list, list) { list_del_rcu(&hook->list); @@ -2445,7 +2445,7 @@ err_unregister_hook: err_use: nf_tables_unregister_hook(net, table, chain); err_destroy_chain: - nf_tables_chain_destroy(ctx); + nf_tables_chain_destroy(chain);
return err; } @@ -8809,7 +8809,7 @@ static void nft_commit_release(struct nf kfree(nft_trans_chain_name(trans)); break; case NFT_MSG_DELCHAIN: - nf_tables_chain_destroy(&trans->ctx); + nf_tables_chain_destroy(nft_trans_chain(trans)); break; case NFT_MSG_DELRULE: nf_tables_rule_destroy(&trans->ctx, nft_trans_rule(trans)); @@ -9721,7 +9721,7 @@ static void nf_tables_abort_release(stru nf_tables_table_destroy(&trans->ctx); break; case NFT_MSG_NEWCHAIN: - nf_tables_chain_destroy(&trans->ctx); + nf_tables_chain_destroy(nft_trans_chain(trans)); break; case NFT_MSG_NEWRULE: nf_tables_rule_destroy(&trans->ctx, nft_trans_rule(trans)); @@ -10443,7 +10443,7 @@ int __nft_release_basechain(struct nft_c } nft_chain_del(ctx->chain); nft_use_dec(&ctx->table->use); - nf_tables_chain_destroy(ctx); + nf_tables_chain_destroy(ctx->chain);
return 0; } @@ -10519,10 +10519,9 @@ static void __nft_release_table(struct n nft_obj_destroy(&ctx, obj); } list_for_each_entry_safe(chain, nc, &table->chains, list) { - ctx.chain = chain; nft_chain_del(chain); nft_use_dec(&table->use); - nf_tables_chain_destroy(&ctx); + nf_tables_chain_destroy(chain); } nf_tables_table_destroy(&ctx); } --- a/net/netfilter/nft_immediate.c +++ b/net/netfilter/nft_immediate.c @@ -221,7 +221,7 @@ static void nft_immediate_destroy(const list_del(&rule->list); nf_tables_rule_destroy(&chain_ctx, rule); } - nf_tables_chain_destroy(&chain_ctx); + nf_tables_chain_destroy(chain); break; default: break;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit c03d278fdf35e73dd0ec543b9b556876b9d9a8dc upstream.
8c873e219970 ("netfilter: core: free hooks with call_rcu") removed synchronize_net() call when unregistering basechain hook, however, net_device removal event handler for the NFPROTO_NETDEV was not updated to wait for RCU grace period.
Note that 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks on net_device removal") does not remove basechain rules on device removal, I was hinted to remove rules on net_device removal later, see 5ebe0b0eec9d ("netfilter: nf_tables: destroy basechain and rules on netdevice removal").
Although NETDEV_UNREGISTER event is guaranteed to be handled after synchronize_net() call, this path needs to wait for rcu grace period via rcu callback to release basechain hooks if netns is alive because an ongoing netlink dump could be in progress (sockets hold a reference on the netns).
Note that nf_tables_pre_exit_net() unregisters and releases basechain hooks but it is possible to see NETDEV_UNREGISTER at a later stage in the netns exit path, eg. veth peer device in another netns:
cleanup_net() default_device_exit_batch() unregister_netdevice_many_notify() notifier_call_chain() nf_tables_netdev_event() __nft_release_basechain()
In this particular case, same rule of thumb applies: if netns is alive, then wait for rcu grace period because netlink dump in the other netns could be in progress. Otherwise, if the other netns is going away then no netlink dump can be in progress and basechain hooks can be released inmediately.
While at it, turn WARN_ON() into WARN_ON_ONCE() for the basechain validation, which should not ever happen.
Fixes: 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks on net_device removal") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/netfilter/nf_tables.h | 3 ++ net/netfilter/nf_tables_api.c | 41 +++++++++++++++++++++++++++++++------- 2 files changed, 37 insertions(+), 7 deletions(-)
--- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -1028,6 +1028,7 @@ struct nft_chain { char *name; u16 udlen; u8 *udata; + struct rcu_head rcu_head;
/* Only used during control plane commit phase: */ struct nft_rule **rules_next; @@ -1170,6 +1171,7 @@ static inline void nft_use_inc_restore(u * @sets: sets in the table * @objects: stateful objects in the table * @flowtables: flow tables in the table + * @net: netnamespace this table belongs to * @hgenerator: handle generator state * @handle: table handle * @use: number of chain references to this table @@ -1185,6 +1187,7 @@ struct nft_table { struct list_head sets; struct list_head objects; struct list_head flowtables; + possible_net_t net; u64 hgenerator; u64 handle; u32 use; --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -1360,6 +1360,7 @@ static int nf_tables_newtable(struct sk_ INIT_LIST_HEAD(&table->sets); INIT_LIST_HEAD(&table->objects); INIT_LIST_HEAD(&table->flowtables); + write_pnet(&table->net, net); table->family = family; table->flags = flags; table->handle = ++nft_net->table_handle; @@ -10428,22 +10429,48 @@ int nft_data_dump(struct sk_buff *skb, i } EXPORT_SYMBOL_GPL(nft_data_dump);
-int __nft_release_basechain(struct nft_ctx *ctx) +static void __nft_release_basechain_now(struct nft_ctx *ctx) { struct nft_rule *rule, *nr;
- if (WARN_ON(!nft_is_base_chain(ctx->chain))) - return 0; - - nf_tables_unregister_hook(ctx->net, ctx->chain->table, ctx->chain); list_for_each_entry_safe(rule, nr, &ctx->chain->rules, list) { list_del(&rule->list); - nft_use_dec(&ctx->chain->use); nf_tables_rule_release(ctx, rule); } + nf_tables_chain_destroy(ctx->chain); +} + +static void nft_release_basechain_rcu(struct rcu_head *head) +{ + struct nft_chain *chain = container_of(head, struct nft_chain, rcu_head); + struct nft_ctx ctx = { + .family = chain->table->family, + .chain = chain, + .net = read_pnet(&chain->table->net), + }; + + __nft_release_basechain_now(&ctx); + put_net(ctx.net); +} + +int __nft_release_basechain(struct nft_ctx *ctx) +{ + struct nft_rule *rule; + + if (WARN_ON_ONCE(!nft_is_base_chain(ctx->chain))) + return 0; + + nf_tables_unregister_hook(ctx->net, ctx->chain->table, ctx->chain); + list_for_each_entry(rule, &ctx->chain->rules, list) + nft_use_dec(&ctx->chain->use); + nft_chain_del(ctx->chain); nft_use_dec(&ctx->table->use); - nf_tables_chain_destroy(ctx->chain); + + if (maybe_get_net(ctx->net)) + call_rcu(&ctx->chain->rcu_head, nft_release_basechain_rcu); + else + __nft_release_basechain_now(ctx);
return 0; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit b04df3da1b5c6f6dc7cdccc37941740c078c4043 upstream.
nf_tables_chain_destroy can sleep, it can't be used from call_rcu callbacks.
Moreover, nf_tables_rule_release() is only safe for error unwinding, while transaction mutex is held and the to-be-desroyed rule was not exposed to either dataplane or dumps, as it deactives+frees without the required synchronize_rcu() in-between.
nft_rule_expr_deactivate() callbacks will change ->use counters of other chains/sets, see e.g. nft_lookup .deactivate callback, these must be serialized via transaction mutex.
Also add a few lockdep asserts to make this more explicit.
Calling synchronize_rcu() isn't ideal, but fixing this without is hard and way more intrusive. As-is, we can get:
WARNING: .. net/netfilter/nf_tables_api.c:5515 nft_set_destroy+0x.. Workqueue: events nf_tables_trans_destroy_work RIP: 0010:nft_set_destroy+0x3fe/0x5c0 Call Trace: <TASK> nf_tables_trans_destroy_work+0x6b7/0xad0 process_one_work+0x64a/0xce0 worker_thread+0x613/0x10d0
In case the synchronize_rcu becomes an issue, we can explore alternatives.
One way would be to allocate nft_trans_rule objects + one nft_trans_chain object, deactivate the rules + the chain and then defer the freeing to the nft destroy workqueue. We'd still need to keep the synchronize_rcu path as a fallback to handle -ENOMEM corner cases though.
Reported-by: syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/67478d92.050a0220.253251.0062.GAE@google.com/T/ Fixes: c03d278fdf35 ("netfilter: nf_tables: wait for rcu grace period on net_device removal") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/netfilter/nf_tables.h | 3 --- net/netfilter/nf_tables_api.c | 32 +++++++++++++++----------------- 2 files changed, 15 insertions(+), 20 deletions(-)
--- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -1028,7 +1028,6 @@ struct nft_chain { char *name; u16 udlen; u8 *udata; - struct rcu_head rcu_head;
/* Only used during control plane commit phase: */ struct nft_rule **rules_next; @@ -1171,7 +1170,6 @@ static inline void nft_use_inc_restore(u * @sets: sets in the table * @objects: stateful objects in the table * @flowtables: flow tables in the table - * @net: netnamespace this table belongs to * @hgenerator: handle generator state * @handle: table handle * @use: number of chain references to this table @@ -1187,7 +1185,6 @@ struct nft_table { struct list_head sets; struct list_head objects; struct list_head flowtables; - possible_net_t net; u64 hgenerator; u64 handle; u32 use; --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -1360,7 +1360,6 @@ static int nf_tables_newtable(struct sk_ INIT_LIST_HEAD(&table->sets); INIT_LIST_HEAD(&table->objects); INIT_LIST_HEAD(&table->flowtables); - write_pnet(&table->net, net); table->family = family; table->flags = flags; table->handle = ++nft_net->table_handle; @@ -3441,8 +3440,11 @@ void nf_tables_rule_destroy(const struct kfree(rule); }
+/* can only be used if rule is no longer visible to dumps */ static void nf_tables_rule_release(const struct nft_ctx *ctx, struct nft_rule *rule) { + lockdep_commit_lock_is_held(ctx->net); + nft_rule_expr_deactivate(ctx, rule, NFT_TRANS_RELEASE); nf_tables_rule_destroy(ctx, rule); } @@ -5178,6 +5180,8 @@ void nf_tables_deactivate_set(const stru struct nft_set_binding *binding, enum nft_trans_phase phase) { + lockdep_commit_lock_is_held(ctx->net); + switch (phase) { case NFT_TRANS_PREPARE_ERROR: nft_set_trans_unbind(ctx, set); @@ -10440,19 +10444,6 @@ static void __nft_release_basechain_now( nf_tables_chain_destroy(ctx->chain); }
-static void nft_release_basechain_rcu(struct rcu_head *head) -{ - struct nft_chain *chain = container_of(head, struct nft_chain, rcu_head); - struct nft_ctx ctx = { - .family = chain->table->family, - .chain = chain, - .net = read_pnet(&chain->table->net), - }; - - __nft_release_basechain_now(&ctx); - put_net(ctx.net); -} - int __nft_release_basechain(struct nft_ctx *ctx) { struct nft_rule *rule; @@ -10467,11 +10458,18 @@ int __nft_release_basechain(struct nft_c nft_chain_del(ctx->chain); nft_use_dec(&ctx->table->use);
- if (maybe_get_net(ctx->net)) - call_rcu(&ctx->chain->rcu_head, nft_release_basechain_rcu); - else + if (!maybe_get_net(ctx->net)) { __nft_release_basechain_now(ctx); + return 0; + } + + /* wait for ruleset dumps to complete. Owning chain is no longer in + * lists, so new dumps can't find any of these rules anymore. + */ + synchronize_rcu();
+ __nft_release_basechain_now(ctx); + put_net(ctx->net); return 0; } EXPORT_SYMBOL_GPL(__nft_release_basechain);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Lobakin alexandr.lobakin@intel.com
commit d7442f512b71fc63a99c8a801422dde4fbbf9f93 upstream.
The CI testing bots triggered the following splat:
[ 718.203054] BUG: KASAN: use-after-free in free_irq_cpu_rmap+0x53/0x80 [ 718.206349] Read of size 4 at addr ffff8881bd127e00 by task sh/20834 [ 718.212852] CPU: 28 PID: 20834 Comm: sh Kdump: loaded Tainted: G S W IOE 5.17.0-rc8_nextqueue-devqueue-02643-g23f3121aca93 #1 [ 718.219695] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0012.070720200218 07/07/2020 [ 718.223418] Call Trace: [ 718.227139] [ 718.230783] dump_stack_lvl+0x33/0x42 [ 718.234431] print_address_description.constprop.9+0x21/0x170 [ 718.238177] ? free_irq_cpu_rmap+0x53/0x80 [ 718.241885] ? free_irq_cpu_rmap+0x53/0x80 [ 718.245539] kasan_report.cold.18+0x7f/0x11b [ 718.249197] ? free_irq_cpu_rmap+0x53/0x80 [ 718.252852] free_irq_cpu_rmap+0x53/0x80 [ 718.256471] ice_free_cpu_rx_rmap.part.11+0x37/0x50 [ice] [ 718.260174] ice_remove_arfs+0x5f/0x70 [ice] [ 718.263810] ice_rebuild_arfs+0x3b/0x70 [ice] [ 718.267419] ice_rebuild+0x39c/0xb60 [ice] [ 718.270974] ? asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 718.274472] ? ice_init_phy_user_cfg+0x360/0x360 [ice] [ 718.278033] ? delay_tsc+0x4a/0xb0 [ 718.281513] ? preempt_count_sub+0x14/0xc0 [ 718.284984] ? delay_tsc+0x8f/0xb0 [ 718.288463] ice_do_reset+0x92/0xf0 [ice] [ 718.292014] ice_pci_err_resume+0x91/0xf0 [ice] [ 718.295561] pci_reset_function+0x53/0x80 <...> [ 718.393035] Allocated by task 690: [ 718.433497] Freed by task 20834: [ 718.495688] Last potentially related work creation: [ 718.568966] The buggy address belongs to the object at ffff8881bd127e00 which belongs to the cache kmalloc-96 of size 96 [ 718.574085] The buggy address is located 0 bytes inside of 96-byte region [ffff8881bd127e00, ffff8881bd127e60) [ 718.579265] The buggy address belongs to the page: [ 718.598905] Memory state around the buggy address: [ 718.601809] ffff8881bd127d00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ 718.604796] ffff8881bd127d80: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc [ 718.607794] >ffff8881bd127e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ 718.610811] ^ [ 718.613819] ffff8881bd127e80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc [ 718.617107] ffff8881bd127f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
This is due to that free_irq_cpu_rmap() is always being called *after* (devm_)free_irq() and thus it tries to work with IRQ descs already freed. For example, on device reset the driver frees the rmap right before allocating a new one (the splat above). Make rmap creation and freeing function symmetrical with {request,free}_irq() calls i.e. do that on ifup/ifdown instead of device probe/remove/resume. These operations can be performed independently from the actual device aRFS configuration. Also, make sure ice_vsi_free_irq() clears IRQ affinity notifiers only when aRFS is disabled -- otherwise, CPU rmap sets and clears its own and they must not be touched manually.
Fixes: 28bf26724fdb0 ("ice: Implement aRFS") Co-developed-by: Ivan Vecera ivecera@redhat.com Signed-off-by: Ivan Vecera ivecera@redhat.com Signed-off-by: Alexander Lobakin alexandr.lobakin@intel.com Tested-by: Ivan Vecera ivecera@redhat.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Suraj Jitindar Singh surajjs@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice_arfs.c | 9 ++------- drivers/net/ethernet/intel/ice/ice_lib.c | 5 ++++- drivers/net/ethernet/intel/ice/ice_main.c | 20 ++++++++------------ 3 files changed, 14 insertions(+), 20 deletions(-)
--- a/drivers/net/ethernet/intel/ice/ice_arfs.c +++ b/drivers/net/ethernet/intel/ice/ice_arfs.c @@ -577,7 +577,7 @@ void ice_free_cpu_rx_rmap(struct ice_vsi { struct net_device *netdev;
- if (!vsi || vsi->type != ICE_VSI_PF || !vsi->arfs_fltr_list) + if (!vsi || vsi->type != ICE_VSI_PF) return;
netdev = vsi->netdev; @@ -599,7 +599,7 @@ int ice_set_cpu_rx_rmap(struct ice_vsi * int base_idx, i;
if (!vsi || vsi->type != ICE_VSI_PF) - return -EINVAL; + return 0;
pf = vsi->back; netdev = vsi->netdev; @@ -636,7 +636,6 @@ void ice_remove_arfs(struct ice_pf *pf) if (!pf_vsi) return;
- ice_free_cpu_rx_rmap(pf_vsi); ice_clear_arfs(pf_vsi); }
@@ -653,9 +652,5 @@ void ice_rebuild_arfs(struct ice_pf *pf) return;
ice_remove_arfs(pf); - if (ice_set_cpu_rx_rmap(pf_vsi)) { - dev_err(ice_pf_to_dev(pf), "Failed to rebuild aRFS\n"); - return; - } ice_init_arfs(pf_vsi); } --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -2645,6 +2645,8 @@ void ice_vsi_free_irq(struct ice_vsi *vs return;
vsi->irqs_ready = false; + ice_free_cpu_rx_rmap(vsi); + ice_for_each_q_vector(vsi, i) { u16 vector = i + base; int irq_num; @@ -2658,7 +2660,8 @@ void ice_vsi_free_irq(struct ice_vsi *vs continue;
/* clear the affinity notifier in the IRQ descriptor */ - irq_set_affinity_notifier(irq_num, NULL); + if (!IS_ENABLED(CONFIG_RFS_ACCEL)) + irq_set_affinity_notifier(irq_num, NULL);
/* clear the affinity_mask in the IRQ descriptor */ irq_set_affinity_hint(irq_num, NULL); --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -2393,6 +2393,13 @@ static int ice_vsi_req_irq_msix(struct i irq_set_affinity_hint(irq_num, &q_vector->affinity_mask); }
+ err = ice_set_cpu_rx_rmap(vsi); + if (err) { + netdev_err(vsi->netdev, "Failed to setup CPU RMAP on VSI %u: %pe\n", + vsi->vsi_num, ERR_PTR(err)); + goto free_q_irqs; + } + vsi->irqs_ready = true; return 0;
@@ -3380,22 +3387,12 @@ static int ice_setup_pf_sw(struct ice_pf */ ice_napi_add(vsi);
- status = ice_set_cpu_rx_rmap(vsi); - if (status) { - dev_err(ice_pf_to_dev(pf), "Failed to set CPU Rx map VSI %d error %d\n", - vsi->vsi_num, status); - status = -EINVAL; - goto unroll_napi_add; - } status = ice_init_mac_fltr(pf); if (status) - goto free_cpu_rx_map; + goto unroll_napi_add;
return status;
-free_cpu_rx_map: - ice_free_cpu_rx_rmap(vsi); - unroll_napi_add: if (vsi) { ice_napi_del(vsi); @@ -4886,7 +4883,6 @@ static int __maybe_unused ice_suspend(st continue; ice_vsi_free_q_vectors(pf->vsi[v]); } - ice_free_cpu_rx_rmap(ice_get_main_vsi(pf)); ice_clear_interrupt_scheme(pf);
pci_save_state(pdev);
On 5/20/25 06:49, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.184-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
On 5/20/25 07:49, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.184-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 5/20/25 06:49, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.184-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On 20/05/25 7:19 pm, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
No issues were seen on x86_64 and aarch64 platforms with our testing.
Tested-by: Vijayendra Suman vijayendra.suman@oracle.com
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/ patch-5.15.184-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Thanks, Vijay
On Tue, 20 May 2025 15:49:51 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.184-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.15: 10 builds: 10 pass, 0 fail 28 boots: 28 pass, 0 fail 101 tests: 101 pass, 0 fail
Linux version: 5.15.184-rc1-gba6ee53cdfad Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Tue, 20 May 2025 at 19:23, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.184-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 5.15.184-rc1 * git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git * git commit: ba6ee53cdfadb92bab1c005dfb67a4397a8a7219 * git describe: v5.15.183-60-gba6ee53cdfad * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15....
## Test Regressions (compared to v5.15.182-55-g5aa355897d1b)
## Metric Regressions (compared to v5.15.182-55-g5aa355897d1b)
## Test Fixes (compared to v5.15.182-55-g5aa355897d1b)
## Metric Fixes (compared to v5.15.182-55-g5aa355897d1b)
## Test result summary total: 60301, pass: 46648, fail: 2204, skip: 11039, xfail: 410
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 101 total, 101 passed, 0 failed * arm64: 28 total, 28 passed, 0 failed * i386: 18 total, 18 passed, 0 failed * mips: 22 total, 22 passed, 0 failed * parisc: 3 total, 3 passed, 0 failed * powerpc: 22 total, 22 passed, 0 failed * riscv: 8 total, 8 passed, 0 failed * s390: 9 total, 9 passed, 0 failed * sh: 10 total, 10 passed, 0 failed * sparc: 6 total, 6 passed, 0 failed * x86_64: 24 total, 24 passed, 0 failed
## Test suites summary * boot * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-exec * kselftest-fpu * kselftest-futex * kselftest-intel_pstate * kselftest-kcmp * kselftest-livepatch * kselftest-membarrier * kselftest-mincore * kselftest-mm * kselftest-mqueue * kselftest-net * kselftest-net-mptcp * kselftest-openat2 * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-tc-testing * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user_events * kselftest-vDSO * kselftest-x86 * kunit * kvm-unit-tests * lava * libgpiod * libhugetlbfs * log-parser-boot * log-parser-build-clang * log-parser-build-gcc * log-parser-test * ltp-capability * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-hugetlb * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-smoke * ltp-syscalls * ltp-tracing * perf * rcutorture
-- Linaro LKFT https://lkft.linaro.org
On Tue, May 20, 2025 at 03:49:51PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Tested-by: Mark Brown broonie@kernel.org
On 5/20/25 15:49, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.184-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
It's crashing at boot for me when the ITS mitigation is used (tested on Icelake):
[ OK ] Started udev Coldplug all Devices. Starting udev Wait for Complete Device Initialization... [ 3.567527] BUG: unable to handle page fault for address: ff4fa48f82b9a000 [ 3.575207] #PF: supervisor write access in kernel mode [ 3.581040] #PF: error_code(0x0003) - permissions violation [ 3.587262] PGD 1007f401067 P4D 1007f402067 PUD 3024b3063 PMD 302b99063 PTE 8000000302b9a161 [ 3.596685] Oops: 0003 [#1] SMP NOPTI [ 3.600775] CPU: 73 PID: 1672 Comm: systemd-udevd Not tainted 5.15.184-rc1.its.1.el8.dev.x86_64 #1 [ 3.610779] Hardware name: Oracle Corporation ORACLE SERVER X9-2c/TLA,MB TRAY,X9-2c, BIOS 66110100 07/17/2024 [ 3.621848] RIP: 0010:clear_page_erms+0x7/0x10 [ 3.626813] Code: 48 89 47 18 48 89 47 20 48 89 47 28 48 89 47 30 48 89 47 38 48 8d 7f 40 75 d9 90 e9 13 7f a5 00 0f 1f 00 b9 00 10 00 00 31 c0 <f3> aa e9 02 7f a5 00 cc cc 48 85 ff 0f 84 e5 00 00 00 0f b6 0f 4c [ 3.647774] RSP: 0000:ff63a55d1b8f3cb8 EFLAGS: 00010246 [ 3.653608] RAX: 0000000000000000 RBX: ff63a55d1b8f3d38 RCX: 0000000000001000 [ 3.661565] RDX: ffc82ea4cc0ae680 RSI: ff4fa48d971b1fc0 RDI: ff4fa48f82b9a000 [ 3.669529] RBP: ff4fa50cfffd5d80 R08: ffc82ea4cc0ae6c0 R09: 0000000000000000 [ 3.677496] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [ 3.685460] R13: 0000000000000901 R14: 0000000000000000 R15: 000000000002414b [ 3.693425] FS: 00007f525eb73280(0000) GS:ff4fa50affc40000(0000) knlGS:0000000000000000 [ 3.702451] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.708864] CR2: ff4fa48f82b9a000 CR3: 0000000401476006 CR4: 0000000000771ee0 [ 3.716830] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3.724796] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3.732753] PKRU: 55555554 [ 3.735773] Call Trace: [ 3.738504] <TASK> [ 3.740847] kernel_init_free_pages.part.0+0x46/0x70 [ 3.746394] get_page_from_freelist+0x3df/0x510 [ 3.751453] ? do_set_pte+0xa5/0x100 [ 3.755446] __alloc_pages+0x19a/0x350 [ 3.759631] pte_alloc_one+0x14/0x50 [ 3.763623] do_read_fault+0x12d/0x160 [ 3.767802] do_fault+0x9a/0x2e0 [ 3.771403] __handle_mm_fault+0x3e8/0x6c0 [ 3.775978] handle_mm_fault+0xcf/0x2c0 [ 3.780261] do_user_addr_fault+0x1d2/0x680 [ 3.784932] exc_page_fault+0x68/0x140 [ 3.789119] asm_exc_page_fault+0x22/0x30 [ 3.793598] RIP: 0033:0x557a550175bd [ 3.797591] Code: Unable to access opcode bytes at RIP 0x557a55017593. [ 3.804878] RSP: 002b:00007ffd57006600 EFLAGS: 00010206 [ 3.810710] RAX: 0000000000000000 RBX: 0000557a6a620e40 RCX: 00007f525da098b8 [ 3.818676] RDX: 0000000000000003 RSI: 00007f525da09908 RDI: 0000000000000003 [ 3.826642] RBP: 00007ffd570067d0 R08: 0000000000000000 R09: 000000000000000a [ 3.834607] R10: 00007f525eb73280 R11: 0000000000000206 R12: 0000557a6a620f00 [ 3.842573] R13: 0000557a6a6b76d0 R14: 0000000000000000 R15: 0000557a6a6b87d0 [ 3.850533] </TASK> [ 3.852972] Modules linked in: psample pci_hyperv_intf wmi dm_multipath sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi [ 3.879765] CR2: ff4fa48f82b9a000 [ 3.883463] ---[ end trace 5c8bb91d889112a9 ]--- [ 4.498240] RIP: 0010:clear_page_erms+0x7/0x10 [ 4.503205] Code: 48 89 47 18 48 89 47 20 48 89 47 28 48 89 47 30 48 89 47 38 48 8d 7f 40 75 d9 90 e9 13 7f a5 00 0f 1f 00 b9 00 10 00 00 31 c0 <f3> aa e9 02 7f a5 00 cc cc 48 85 ff 0f 84 e5 00 00 00 0f b6 0f 4c [ 4.524155] RSP: 0000:ff63a55d1b8f3cb8 EFLAGS: 00010246 [ 4.529978] RAX: 0000000000000000 RBX: ff63a55d1b8f3d38 RCX: 0000000000001000 [ 4.537945] RDX: ffc82ea4cc0ae680 RSI: ff4fa48d971b1fc0 RDI: ff4fa48f82b9a000 [ 4.545910] RBP: ff4fa50cfffd5d80 R08: ffc82ea4cc0ae6c0 R09: 0000000000000000 [ 4.553874] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [ 4.561840] R13: 0000000000000901 R14: 0000000000000000 R15: 000000000002414b [ 4.569798] FS: 00007f525eb73280(0000) GS:ff4fa50affc40000(0000) knlGS:0000000000000000 [ 4.578831] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4.585235] CR2: 0000557a55017593 CR3: 0000000401476006 CR4: 0000000000771ee0 [ 4.593202] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4.601158] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4.609122] PKRU: 55555554 [ 4.612143] Kernel panic - not syncing: Fatal exception [ 4.618980] Kernel Offset: 0x39e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 4.686287] ---[ end Kernel panic - not syncing: Fatal exception ]---
There's no problem when disabling the ITS mitigation.
It looks the problem comes from pages allocated for dynamic thunks for modules, and this patch appears to fix the problem:
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 43ec73da66d8b..9ca6973e56547 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -460,6 +460,8 @@ void its_free_mod(struct module *mod)
for (i = 0; i < mod->its_num_pages; i++) { void *page = mod->its_page_array[i]; + set_memory_nx((unsigned long)page, 1); + set_memory_rw((unsigned long)page, 1); module_memfree(page); } kfree(mod->its_page_array);
I don't know the exact underlying issue but I suspect that the kernel doesn't correctly handle pages freed without the write permission, and restoring page permissions to rw (instead of rox) before freeing prevent the problem.
alex.
On Wed, May 21, 2025 at 09:10:58PM +0200, Alexandre Chartre wrote:
It looks the problem comes from pages allocated for dynamic thunks for modules, and this patch appears to fix the problem:
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 43ec73da66d8b..9ca6973e56547 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -460,6 +460,8 @@ void its_free_mod(struct module *mod) for (i = 0; i < mod->its_num_pages; i++) { void *page = mod->its_page_array[i];
set_memory_nx((unsigned long)page, 1);
set_memory_rw((unsigned long)page, 1); module_memfree(page); } kfree(mod->its_page_array);
I don't know the exact underlying issue but I suspect that the kernel doesn't correctly handle pages freed without the write permission, and restoring page permissions to rw (instead of rox) before freeing prevent the problem.
Your analysis aligns with the proposed fix to backport below patch as well:
x86/modules: Set VM_FLUSH_RESET_PERMS in module_alloc() https://lore.kernel.org/stable/20250521171635.848656-1-pchelkin@ispras.ru/
The kernel, bpf tool and perf tool builds fine for v5.15.184-rc1 on x86 and arm64 Azure VM.
Tested-by: Hardik Garg hargar@linux.microsoft.com
Thanks, Hardik
On 5/20/2025 6:49 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.184-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.184-rc1
Alexander Lobakin alexandr.lobakin@intel.com ice: arfs: fix use-after-free when freeing @rx_cpu_rmap
Florian Westphal fw@strlen.de netfilter: nf_tables: do not defer rule destruction via call_rcu
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: wait for rcu grace period on net_device removal
Florian Westphal fw@strlen.de netfilter: nf_tables: pass nft_chain to destroy function, not nft_ctx
Josef Bacik josef@toxicpanda.com btrfs: do not clean up repair bio if submit fails
Filipe Manana fdmanana@suse.com btrfs: don't BUG_ON() when 0 reference count at btrfs_lookup_extent_info()
Eric Dumazet edumazet@google.com sctp: add mutual exclusion in proc_sctp_do_udp_port()
Feng Tang feng.tang@linux.alibaba.com selftests/mm: compaction_test: support platform with huge mount of memory
GONG Ruiqi gongruiqi1@huawei.com usb: typec: fix pm usage counter imbalance in ucsi_ccg_sync_control()
Dan Carpenter dan.carpenter@linaro.org usb: typec: fix potential array underflow in ucsi_ccg_sync_control()
RD Babiera rdbabiera@google.com usb: typec: altmodes/displayport: create sysfs nodes as driver's default device attribute group
Andrei Kuchynski akuchynski@chromium.org usb: typec: ucsi: displayport: Fix deadlock
Sebastian Andrzej Siewior bigeasy@linutronix.de clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable()
Fengnan Chang changfengnan@bytedance.com block: fix direct io NOWAIT flag not work
Shuai Xue xueshuai@linux.alibaba.com dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups
Shuai Xue xueshuai@linux.alibaba.com dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines
Yemike Abhilash Chandra y-abhilashchandra@ti.com dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy
Ronald Wahl ronald.wahl@legrand.com dmaengine: ti: k3-udma: Add missing locking
Fedor Pchelkin pchelkin@ispras.ru wifi: mt76: disable napi on driver removal
Claudiu Beznea claudiu.beznea.uj@bp.renesas.com phy: renesas: rcar-gen3-usb2: Set timing registers only once
Ma Ke make24@iscas.ac.cn phy: Fix error handling in tegra_xusb_port_init
Steven Rostedt rostedt@goodmis.org tracing: samples: Initialize trace_array_printk() with the correct function
pengdonglin pengdonglin@xiaomi.com ftrace: Fix preemption accounting for stacktrace filter command
pengdonglin pengdonglin@xiaomi.com ftrace: Fix preemption accounting for stacktrace trigger command
Nicolas Chauvet kwizart@gmail.com ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera
Christian Heusel christian@heusel.eu ALSA: usb-audio: Add sample rate quirk for Audioengine D1
Wentao Liang vulab@iscas.ac.cn ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2()
Jeremy Linton jeremy.linton@arm.com ACPI: PPTT: Fix processor subtable walk
Filipe Manana fdmanana@suse.com btrfs: fix discard worker infinite loop after disabling discard
Nathan Lynch nathan.lynch@amd.com dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted"
Peter Zijlstra peterz@infradead.org x86/its: FineIBT-paranoid vs ITS
Eric Biggers ebiggers@google.com x86/its: Fix build errors when CONFIG_MODULES=n
Peter Zijlstra peterz@infradead.org x86/its: Use dynamic thunks for indirect branches
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Align RETs in BHB clear sequence to avoid thunking
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Add "vmexit" option to skip mitigation on some CPUs
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Enable Indirect Target Selection mitigation
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Add support for ITS-safe return thunk
Josh Poimboeuf jpoimboe@kernel.org x86/alternatives: Remove faulty optimization
Borislav Petkov (AMD) bp@alien8.de x86/alternative: Optimize returns patching
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Add support for ITS-safe indirect thunk
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/its: Enumerate Indirect Target Selection (ITS) bug
Pawan Gupta pawan.kumar.gupta@linux.intel.com Documentation: x86/bugs/its: Add ITS documentation
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/speculation: Remove the extra #ifdef around CALL_NOSPEC
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/speculation: Add a conditional CS prefix to CALL_NOSPEC
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/speculation: Simplify and make CALL_NOSPEC consistent
Peter Zijlstra peterz@infradead.org x86,nospec: Simplify {JMP,CALL}_NOSPEC
Trond Myklebust trond.myklebust@hammerspace.com NFSv4/pnfs: Reset the layout state after a layoutreturn
Abdun Nihaal abdun.nihaal@gmail.com qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd()
Geert Uytterhoeven geert+renesas@glider.be ALSA: sh: SND_AICA should depend on SH_DMA_API
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING
Mathieu Othacehe othacehe@gnu.org net: cadence: macb: Fix a possible deadlock in macb_halt_tx.
Cong Wang xiyou.wangcong@gmail.com net_sched: Flush gso_skb list too during ->change()
Geert Uytterhoeven geert+renesas@glider.be spi: loopback-test: Do not split 1024-byte hexdumps
Li Lingfeng lilingfeng3@huawei.com nfs: handle failure of nfs_get_lock_context in unlock path
Zhu Yanjun yanjun.zhu@linux.dev RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug
David Lechner dlechner@baylibre.com iio: chemical: sps30: use aligned_s64 for timestamp
Jonathan Cameron Jonathan.Cameron@huawei.com iio: adc: ad7768-1: Fix insufficient alignment of timestamp.
Masami Hiramatsu (Google) mhiramat@kernel.org tracing: probes: Fix a possible race in trace_probe_log APIs
Hans de Goede hdegoede@redhat.com platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection
Diffstat:
Documentation/ABI/testing/sysfs-devices-system-cpu | 1 + Documentation/admin-guide/hw-vuln/index.rst | 1 + .../hw-vuln/indirect-target-selection.rst | 156 +++++++++++++ Documentation/admin-guide/kernel-parameters.txt | 15 ++ Makefile | 4 +- arch/x86/Kconfig | 11 + arch/x86/entry/entry_64.S | 20 +- arch/x86/include/asm/alternative.h | 32 +++ arch/x86/include/asm/cpufeatures.h | 3 + arch/x86/include/asm/msr-index.h | 8 + arch/x86/include/asm/nospec-branch.h | 57 +++-- arch/x86/kernel/alternative.c | 243 ++++++++++++++++++++- arch/x86/kernel/cpu/bugs.c | 139 +++++++++++- arch/x86/kernel/cpu/common.c | 63 +++++- arch/x86/kernel/ftrace.c | 2 +- arch/x86/kernel/module.c | 7 + arch/x86/kernel/static_call.c | 2 +- arch/x86/kernel/vmlinux.lds.S | 10 + arch/x86/kvm/x86.c | 4 +- arch/x86/lib/retpoline.S | 39 ++++ arch/x86/net/bpf_jit_comp.c | 8 +- block/fops.c | 5 +- drivers/acpi/pptt.c | 11 +- drivers/base/cpu.c | 8 + drivers/clocksource/i8253.c | 6 +- drivers/dma/dmatest.c | 6 +- drivers/dma/idxd/init.c | 8 + drivers/dma/ti/k3-udma.c | 10 +- drivers/iio/adc/ad7768-1.c | 2 +- drivers/iio/chemical/sps30.c | 2 +- drivers/infiniband/sw/rxe/rxe_cq.c | 5 +- drivers/net/dsa/sja1105/sja1105_main.c | 6 +- drivers/net/ethernet/cadence/macb_main.c | 19 +- drivers/net/ethernet/intel/ice/ice_arfs.c | 9 +- drivers/net/ethernet/intel/ice/ice_lib.c | 5 +- drivers/net/ethernet/intel/ice/ice_main.c | 20 +- .../ethernet/qlogic/qlcnic/qlcnic_sriov_common.c | 7 +- drivers/net/wireless/mediatek/mt76/dma.c | 1 + drivers/phy/renesas/phy-rcar-gen3-usb2.c | 7 +- drivers/phy/tegra/xusb.c | 8 +- drivers/platform/x86/asus-wmi.c | 3 +- drivers/spi/spi-loopback-test.c | 2 +- drivers/usb/typec/altmodes/displayport.c | 18 +- drivers/usb/typec/ucsi/displayport.c | 19 +- drivers/usb/typec/ucsi/ucsi.c | 34 +++ drivers/usb/typec/ucsi/ucsi.h | 3 + drivers/usb/typec/ucsi/ucsi_ccg.c | 5 + fs/btrfs/discard.c | 17 +- fs/btrfs/extent-tree.c | 25 ++- fs/btrfs/extent_io.c | 15 +- fs/nfs/nfs4proc.c | 9 +- fs/nfs/pnfs.c | 9 + include/linux/cpu.h | 2 + include/linux/module.h | 5 + include/net/netfilter/nf_tables.h | 2 +- include/net/sch_generic.h | 15 ++ kernel/trace/trace_dynevent.c | 16 +- kernel/trace/trace_dynevent.h | 1 + kernel/trace/trace_events_trigger.c | 2 +- kernel/trace/trace_functions.c | 6 +- kernel/trace/trace_kprobe.c | 2 +- kernel/trace/trace_probe.c | 9 + kernel/trace/trace_uprobe.c | 2 +- net/netfilter/nf_tables_api.c | 54 +++-- net/netfilter/nft_immediate.c | 2 +- net/sched/sch_codel.c | 2 +- net/sched/sch_fq.c | 2 +- net/sched/sch_fq_codel.c | 2 +- net/sched/sch_fq_pie.c | 2 +- net/sched/sch_hhf.c | 2 +- net/sched/sch_pie.c | 2 +- net/sctp/sysctl.c | 4 + samples/ftrace/sample-trace-array.c | 2 +- sound/pci/es1968.c | 6 +- sound/sh/Kconfig | 2 +- sound/usb/quirks.c | 4 + tools/testing/selftests/vm/compaction_test.c | 19 +- 77 files changed, 1112 insertions(+), 184 deletions(-)
On 5/20/25 06:49, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
Build reference: v5.15.184 Compiler version: x86_64-linux-gcc (GCC) 12.4.0 Assembler version: GNU assembler (GNU Binutils) 2.40
Configuration file workarounds: "s/CONFIG_FRAME_WARN=.*/CONFIG_FRAME_WARN=0/"
Building i386:defconfig ... passed Building i386:allyesconfig ... failed -------------- Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored) -------------- Building i386:allmodconfig ... failed -------------- Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored) --------------
In v5.15.y, cpu_wants_rethunk_at is only built if CONFIG_STACK_VALIDATION=y, but that is not supported for i386 builds. The dummy function in arch/x86/include/asm/alternative.h doesn't take that dependency into account.
Guenter
On Fri, 23 May 2025, Guenter Roeck wrote:
On 5/20/25 06:49, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
Build reference: v5.15.184 Compiler version: x86_64-linux-gcc (GCC) 12.4.0 Assembler version: GNU assembler (GNU Binutils) 2.40
Configuration file workarounds: "s/CONFIG_FRAME_WARN=.*/CONFIG_FRAME_WARN=0/"
Building i386:defconfig ... passed Building i386:allyesconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
Building i386:allmodconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
In v5.15.y, cpu_wants_rethunk_at is only built if CONFIG_STACK_VALIDATION=y, but that is not supported for i386 builds. The dummy function in arch/x86/include/asm/alternative.h doesn't take that dependency into account.
I found this bug too using the Slackware 15.0 32-bit kernel configuration.
Here is a simple work around patch, but there may be a better solution...
--- arch/x86/kernel/static_call.c.orig 2025-05-22 05:08:28.000000000 -0700 +++ arch/x86/kernel/static_call.c 2025-05-27 10:25:27.630496538 -0700 @@ -81,9 +81,12 @@ break;
case RET: + +#ifdef CONFIG_64BIT if (cpu_wants_rethunk_at(insn)) code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); else +#endif code = &retinsn; break;
-------------- Richard Narron
On Tue, May 27, 2025 at 12:31:47PM -0700, Richard Narron wrote:
On Fri, 23 May 2025, Guenter Roeck wrote:
On 5/20/25 06:49, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
Build reference: v5.15.184 Compiler version: x86_64-linux-gcc (GCC) 12.4.0 Assembler version: GNU assembler (GNU Binutils) 2.40
Configuration file workarounds: "s/CONFIG_FRAME_WARN=.*/CONFIG_FRAME_WARN=0/"
Building i386:defconfig ... passed Building i386:allyesconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
Building i386:allmodconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
In v5.15.y, cpu_wants_rethunk_at is only built if CONFIG_STACK_VALIDATION=y, but that is not supported for i386 builds. The dummy function in arch/x86/include/asm/alternative.h doesn't take that dependency into account.
I found this bug too using the Slackware 15.0 32-bit kernel configuration.
Here is a simple work around patch, but there may be a better solution...
--- arch/x86/kernel/static_call.c.orig 2025-05-22 05:08:28.000000000 -0700 +++ arch/x86/kernel/static_call.c 2025-05-27 10:25:27.630496538 -0700 @@ -81,9 +81,12 @@ break;
case RET:
+#ifdef CONFIG_64BIT if (cpu_wants_rethunk_at(insn)) code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); else +#endif code = &retinsn; break;
Another option is to define the empty function when CONFIG_STACK_VALIDATION=n as below:
--- 8< --- From: Pawan Gupta pawan.kumar.gupta@linux.intel.com Subject: [PATCH] x86/its: Fix undefined reference to cpu_wants_rethunk_at()
Below error was reported in 32-bit kernel build:
static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error
This is because the definition of cpu_wants_rethunk_at() depends on CONFIG_STACK_VALIDATION which is only enabled in 64-bit mode.
Define the empty function when CONFIG_STACK_VALIDATION=n, rethunk mitigation is anyways not supported without it.
Reported-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/stable/0f597436-5da6-4319-b918-9f57bde5634a@roeck-us... Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com --- arch/x86/include/asm/alternative.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 1797f80c10de..a5f704dbb4a1 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -98,7 +98,7 @@ static inline u8 *its_static_thunk(int reg) } #endif
-#ifdef CONFIG_RETHUNK +#if defined(CONFIG_RETHUNK) && defined(CONFIG_STACK_VALIDATION) extern bool cpu_wants_rethunk(void); extern bool cpu_wants_rethunk_at(void *addr); #else
On Tue, 27 May 2025, Pawan Gupta wrote:
On Tue, May 27, 2025 at 12:31:47PM -0700, Richard Narron wrote:
On Fri, 23 May 2025, Guenter Roeck wrote:
On 5/20/25 06:49, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
Build reference: v5.15.184 Compiler version: x86_64-linux-gcc (GCC) 12.4.0 Assembler version: GNU assembler (GNU Binutils) 2.40
Configuration file workarounds: "s/CONFIG_FRAME_WARN=.*/CONFIG_FRAME_WARN=0/"
Building i386:defconfig ... passed Building i386:allyesconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
Building i386:allmodconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
In v5.15.y, cpu_wants_rethunk_at is only built if CONFIG_STACK_VALIDATION=y, but that is not supported for i386 builds. The dummy function in arch/x86/include/asm/alternative.h doesn't take that dependency into account.
I found this bug too using the Slackware 15.0 32-bit kernel configuration.
Here is a simple work around patch, but there may be a better solution...
--- arch/x86/kernel/static_call.c.orig 2025-05-22 05:08:28.000000000 -0700 +++ arch/x86/kernel/static_call.c 2025-05-27 10:25:27.630496538 -0700 @@ -81,9 +81,12 @@ break;
case RET:
+#ifdef CONFIG_64BIT if (cpu_wants_rethunk_at(insn)) code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); else +#endif code = &retinsn; break;
Another option is to define the empty function when CONFIG_STACK_VALIDATION=n as below:
--- 8< --- From: Pawan Gupta pawan.kumar.gupta@linux.intel.com Subject: [PATCH] x86/its: Fix undefined reference to cpu_wants_rethunk_at()
Below error was reported in 32-bit kernel build:
static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error
This is because the definition of cpu_wants_rethunk_at() depends on CONFIG_STACK_VALIDATION which is only enabled in 64-bit mode.
Define the empty function when CONFIG_STACK_VALIDATION=n, rethunk mitigation is anyways not supported without it.
Reported-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/stable/0f597436-5da6-4319-b918-9f57bde5634a@roeck-us... Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
arch/x86/include/asm/alternative.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 1797f80c10de..a5f704dbb4a1 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -98,7 +98,7 @@ static inline u8 *its_static_thunk(int reg) } #endif
-#ifdef CONFIG_RETHUNK +#if defined(CONFIG_RETHUNK) && defined(CONFIG_STACK_VALIDATION) extern bool cpu_wants_rethunk(void); extern bool cpu_wants_rethunk_at(void *addr);
#else
2.34.1
Thanks for looking at this Pawan.
Your new patch works fine on both Slackware 15.0 32-bit and 64-bit systems.
On Wed, May 28, 2025 at 09:49:40PM -0700, Richard Narron wrote:
On Tue, 27 May 2025, Pawan Gupta wrote:
On Tue, May 27, 2025 at 12:31:47PM -0700, Richard Narron wrote:
On Fri, 23 May 2025, Guenter Roeck wrote:
On 5/20/25 06:49, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
Build reference: v5.15.184 Compiler version: x86_64-linux-gcc (GCC) 12.4.0 Assembler version: GNU assembler (GNU Binutils) 2.40
Configuration file workarounds: "s/CONFIG_FRAME_WARN=.*/CONFIG_FRAME_WARN=0/"
Building i386:defconfig ... passed Building i386:allyesconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
Building i386:allmodconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
In v5.15.y, cpu_wants_rethunk_at is only built if CONFIG_STACK_VALIDATION=y, but that is not supported for i386 builds. The dummy function in arch/x86/include/asm/alternative.h doesn't take that dependency into account.
I found this bug too using the Slackware 15.0 32-bit kernel configuration.
Here is a simple work around patch, but there may be a better solution...
--- arch/x86/kernel/static_call.c.orig 2025-05-22 05:08:28.000000000 -0700 +++ arch/x86/kernel/static_call.c 2025-05-27 10:25:27.630496538 -0700 @@ -81,9 +81,12 @@ break;
case RET:
+#ifdef CONFIG_64BIT if (cpu_wants_rethunk_at(insn)) code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); else +#endif code = &retinsn; break;
Another option is to define the empty function when CONFIG_STACK_VALIDATION=n as below:
--- 8< --- From: Pawan Gupta pawan.kumar.gupta@linux.intel.com Subject: [PATCH] x86/its: Fix undefined reference to cpu_wants_rethunk_at()
Below error was reported in 32-bit kernel build:
static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error
This is because the definition of cpu_wants_rethunk_at() depends on CONFIG_STACK_VALIDATION which is only enabled in 64-bit mode.
Define the empty function when CONFIG_STACK_VALIDATION=n, rethunk mitigation is anyways not supported without it.
Reported-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/stable/0f597436-5da6-4319-b918-9f57bde5634a@roeck-us... Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
arch/x86/include/asm/alternative.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 1797f80c10de..a5f704dbb4a1 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -98,7 +98,7 @@ static inline u8 *its_static_thunk(int reg) } #endif
-#ifdef CONFIG_RETHUNK +#if defined(CONFIG_RETHUNK) && defined(CONFIG_STACK_VALIDATION) extern bool cpu_wants_rethunk(void); extern bool cpu_wants_rethunk_at(void *addr);
#else
2.34.1
Thanks for looking at this Pawan.
Your new patch works fine on both Slackware 15.0 32-bit and 64-bit systems.
Thank you for verifying the patch.
On Thu, May 29, 2025 at 10:40:17AM -0700, Pawan Gupta wrote:
On Wed, May 28, 2025 at 09:49:40PM -0700, Richard Narron wrote:
On Tue, 27 May 2025, Pawan Gupta wrote:
On Tue, May 27, 2025 at 12:31:47PM -0700, Richard Narron wrote:
On Fri, 23 May 2025, Guenter Roeck wrote:
On 5/20/25 06:49, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.184 release. There are 59 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 22 May 2025 12:57:37 +0000. Anything received after that time might be too late.
Build reference: v5.15.184 Compiler version: x86_64-linux-gcc (GCC) 12.4.0 Assembler version: GNU assembler (GNU Binutils) 2.40
Configuration file workarounds: "s/CONFIG_FRAME_WARN=.*/CONFIG_FRAME_WARN=0/"
Building i386:defconfig ... passed Building i386:allyesconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
Building i386:allmodconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
In v5.15.y, cpu_wants_rethunk_at is only built if CONFIG_STACK_VALIDATION=y, but that is not supported for i386 builds. The dummy function in arch/x86/include/asm/alternative.h doesn't take that dependency into account.
I found this bug too using the Slackware 15.0 32-bit kernel configuration.
Here is a simple work around patch, but there may be a better solution...
--- arch/x86/kernel/static_call.c.orig 2025-05-22 05:08:28.000000000 -0700 +++ arch/x86/kernel/static_call.c 2025-05-27 10:25:27.630496538 -0700 @@ -81,9 +81,12 @@ break;
case RET:
+#ifdef CONFIG_64BIT if (cpu_wants_rethunk_at(insn)) code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); else +#endif code = &retinsn; break;
Another option is to define the empty function when CONFIG_STACK_VALIDATION=n as below:
--- 8< --- From: Pawan Gupta pawan.kumar.gupta@linux.intel.com Subject: [PATCH] x86/its: Fix undefined reference to cpu_wants_rethunk_at()
Below error was reported in 32-bit kernel build:
static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error
This is because the definition of cpu_wants_rethunk_at() depends on CONFIG_STACK_VALIDATION which is only enabled in 64-bit mode.
Define the empty function when CONFIG_STACK_VALIDATION=n, rethunk mitigation is anyways not supported without it.
Reported-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/stable/0f597436-5da6-4319-b918-9f57bde5634a@roeck-us... Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
arch/x86/include/asm/alternative.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 1797f80c10de..a5f704dbb4a1 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -98,7 +98,7 @@ static inline u8 *its_static_thunk(int reg) } #endif
-#ifdef CONFIG_RETHUNK +#if defined(CONFIG_RETHUNK) && defined(CONFIG_STACK_VALIDATION) extern bool cpu_wants_rethunk(void); extern bool cpu_wants_rethunk_at(void *addr);
#else
2.34.1
Thanks for looking at this Pawan.
Your new patch works fine on both Slackware 15.0 32-bit and 64-bit systems.
Thank you for verifying the patch.
Is someone going to submit this in a form we can apply it in? (hint...)
On Fri, May 30, 2025 at 07:11:58AM +0200, Greg Kroah-Hartman wrote:
On Thu, May 29, 2025 at 10:40:17AM -0700, Pawan Gupta wrote:
On Wed, May 28, 2025 at 09:49:40PM -0700, Richard Narron wrote:
On Tue, 27 May 2025, Pawan Gupta wrote:
On Tue, May 27, 2025 at 12:31:47PM -0700, Richard Narron wrote:
On Fri, 23 May 2025, Guenter Roeck wrote:
On 5/20/25 06:49, Greg Kroah-Hartman wrote: > This is the start of the stable review cycle for the 5.15.184 release. > There are 59 patches in this series, all will be posted as a response > to this one. If anyone has any issues with these being applied, please > let me know. > > Responses should be made by Thu, 22 May 2025 12:57:37 +0000. > Anything received after that time might be too late. >
Build reference: v5.15.184 Compiler version: x86_64-linux-gcc (GCC) 12.4.0 Assembler version: GNU assembler (GNU Binutils) 2.40
Configuration file workarounds: "s/CONFIG_FRAME_WARN=.*/CONFIG_FRAME_WARN=0/"
Building i386:defconfig ... passed Building i386:allyesconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
Building i386:allmodconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
In v5.15.y, cpu_wants_rethunk_at is only built if CONFIG_STACK_VALIDATION=y, but that is not supported for i386 builds. The dummy function in arch/x86/include/asm/alternative.h doesn't take that dependency into account.
I found this bug too using the Slackware 15.0 32-bit kernel configuration.
Here is a simple work around patch, but there may be a better solution...
--- arch/x86/kernel/static_call.c.orig 2025-05-22 05:08:28.000000000 -0700 +++ arch/x86/kernel/static_call.c 2025-05-27 10:25:27.630496538 -0700 @@ -81,9 +81,12 @@ break;
case RET:
+#ifdef CONFIG_64BIT if (cpu_wants_rethunk_at(insn)) code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); else +#endif code = &retinsn; break;
Another option is to define the empty function when CONFIG_STACK_VALIDATION=n as below:
--- 8< --- From: Pawan Gupta pawan.kumar.gupta@linux.intel.com Subject: [PATCH] x86/its: Fix undefined reference to cpu_wants_rethunk_at()
Below error was reported in 32-bit kernel build:
static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error
This is because the definition of cpu_wants_rethunk_at() depends on CONFIG_STACK_VALIDATION which is only enabled in 64-bit mode.
Define the empty function when CONFIG_STACK_VALIDATION=n, rethunk mitigation is anyways not supported without it.
Reported-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/stable/0f597436-5da6-4319-b918-9f57bde5634a@roeck-us... Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
arch/x86/include/asm/alternative.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 1797f80c10de..a5f704dbb4a1 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -98,7 +98,7 @@ static inline u8 *its_static_thunk(int reg) } #endif
-#ifdef CONFIG_RETHUNK +#if defined(CONFIG_RETHUNK) && defined(CONFIG_STACK_VALIDATION) extern bool cpu_wants_rethunk(void); extern bool cpu_wants_rethunk_at(void *addr);
#else
2.34.1
Thanks for looking at this Pawan.
Your new patch works fine on both Slackware 15.0 32-bit and 64-bit systems.
Thank you for verifying the patch.
Is someone going to submit this in a form we can apply it in? (hint...)
Oops, sending it now.
On Fri, May 30, 2025 at 07:11:58AM +0200, Greg Kroah-Hartman wrote:
On Thu, May 29, 2025 at 10:40:17AM -0700, Pawan Gupta wrote:
On Wed, May 28, 2025 at 09:49:40PM -0700, Richard Narron wrote:
On Tue, 27 May 2025, Pawan Gupta wrote:
On Tue, May 27, 2025 at 12:31:47PM -0700, Richard Narron wrote:
On Fri, 23 May 2025, Guenter Roeck wrote:
On 5/20/25 06:49, Greg Kroah-Hartman wrote: > This is the start of the stable review cycle for the 5.15.184 release. > There are 59 patches in this series, all will be posted as a response > to this one. If anyone has any issues with these being applied, please > let me know. > > Responses should be made by Thu, 22 May 2025 12:57:37 +0000. > Anything received after that time might be too late. >
Build reference: v5.15.184 Compiler version: x86_64-linux-gcc (GCC) 12.4.0 Assembler version: GNU assembler (GNU Binutils) 2.40
Configuration file workarounds: "s/CONFIG_FRAME_WARN=.*/CONFIG_FRAME_WARN=0/"
Building i386:defconfig ... passed Building i386:allyesconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
Building i386:allmodconfig ... failed
Error log: x86_64-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform': static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error 1 (ignored)
In v5.15.y, cpu_wants_rethunk_at is only built if CONFIG_STACK_VALIDATION=y, but that is not supported for i386 builds. The dummy function in arch/x86/include/asm/alternative.h doesn't take that dependency into account.
I found this bug too using the Slackware 15.0 32-bit kernel configuration.
Here is a simple work around patch, but there may be a better solution...
--- arch/x86/kernel/static_call.c.orig 2025-05-22 05:08:28.000000000 -0700 +++ arch/x86/kernel/static_call.c 2025-05-27 10:25:27.630496538 -0700 @@ -81,9 +81,12 @@ break;
case RET:
+#ifdef CONFIG_64BIT if (cpu_wants_rethunk_at(insn)) code = text_gen_insn(JMP32_INSN_OPCODE, insn, x86_return_thunk); else +#endif code = &retinsn; break;
Another option is to define the empty function when CONFIG_STACK_VALIDATION=n as below:
--- 8< --- From: Pawan Gupta pawan.kumar.gupta@linux.intel.com Subject: [PATCH] x86/its: Fix undefined reference to cpu_wants_rethunk_at()
Below error was reported in 32-bit kernel build:
static_call.c:(.ref.text+0x46): undefined reference to `cpu_wants_rethunk_at' make[1]: [Makefile:1234: vmlinux] Error
This is because the definition of cpu_wants_rethunk_at() depends on CONFIG_STACK_VALIDATION which is only enabled in 64-bit mode.
Define the empty function when CONFIG_STACK_VALIDATION=n, rethunk mitigation is anyways not supported without it.
Reported-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/stable/0f597436-5da6-4319-b918-9f57bde5634a@roeck-us... Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
arch/x86/include/asm/alternative.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index 1797f80c10de..a5f704dbb4a1 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -98,7 +98,7 @@ static inline u8 *its_static_thunk(int reg) } #endif
-#ifdef CONFIG_RETHUNK +#if defined(CONFIG_RETHUNK) && defined(CONFIG_STACK_VALIDATION) extern bool cpu_wants_rethunk(void); extern bool cpu_wants_rethunk_at(void *addr);
#else
2.34.1
Thanks for looking at this Pawan.
Your new patch works fine on both Slackware 15.0 32-bit and 64-bit systems.
Thank you for verifying the patch.
Is someone going to submit this in a form we can apply it in? (hint...)
Now submitted here:
https://lore.kernel.org/all/8c84125f71aec2fd81adf423dbc12156ac11706a.1748584...
linux-stable-mirror@lists.linaro.org