Commit ef0ff68351be ("driver core: Probe devices asynchronously instead of
the driver") speeds up the loading of large numbers of device drivers by
submitting asynchronous probe workers to an unbounded workqueue and binding
each worker to the CPU near the device’s NUMA node. These workers are not
scheduled on isolated CPUs because their cpumask is restricted to
housekeeping_cpumask(HK_TYPE_WQ) and housekeeping_cpumask(HK_TYPE_DOMAIN).
However, when PCI devices reside on the same NUMA node, all their
drivers’ probe workers are bound to the same CPU within that node, yet
the probes still run in parallel because pci_call_probe() invokes
work_on_cpu(). Introduced by commit 873392ca514f ("PCI: work_on_cpu: use
in drivers/pci/pci-driver.c"), work_on_cpu() queues a worker on
system_percpu_wq to bind the probe thread to the first CPU in the
device’s NUMA node (chosen via cpumask_any_and() in pci_call_probe()).

1. The function __driver_attach() submits an asynchronous worker with
   callback __driver_attach_async_helper().

   __driver_attach()
     async_schedule_dev(__driver_attach_async_helper, dev)
       async_schedule_node(func, dev, dev_to_node(dev))
         async_schedule_node_domain(func, data, node, &async_dfl_domain)
           __async_schedule_node_domain(func, data, node, domain, entry)
             queue_work_node(node, async_wq, &entry->work)

2. The asynchronous probe worker ultimately calls work_on_cpu() in
   pci_call_probe(), binding the worker to the same CPU within the
   device’s NUMA node.

   __driver_attach_async_helper()
     driver_probe_device(drv, dev)
       __driver_probe_device(drv, dev)
         really_probe(dev, drv)
           call_driver_probe(dev, drv)
             dev->bus->probe(dev)
               pci_device_probe(dev)
                 __pci_device_probe(drv, pci_dev)
                   pci_call_probe(drv, pci_dev, id)
                     cpu = cpumask_any_and(cpumask_of_node(node), wq_domain_mask)
                     error = work_on_cpu(cpu, local_pci_probe, &ddi)
                       schedule_work_on(cpu, &wfc.work);
                         queue_work_on(cpu, system_percpu_wq, work)

To fix the issue, pci_call_probe() must not call work_on_cpu() when it is
already running inside an unbounded asynchronous worker. Because a driver
can be probed asynchronously either by probe_type or by the kernel command
line, we cannot rely on PROBE_PREFER_ASYNCHRONOUS alone. Instead, we test
the PF_WQ_WORKER flag in current->flags; if it is set, pci_call_probe() is
executing within an unbounded workqueue worker and should skip the extra
work_on_cpu() call.
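
The decision described above can be sketched in userspace C. This is only a simplified model, not the kernel code: struct demo_probe_ctx, DEMO_PF_WQ_WORKER and demo_probe_locally() are hypothetical names, and the sketch assumes that skipping work_on_cpu() means the probe runs directly in the calling context.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PF_WQ_WORKER task flag. */
#define DEMO_PF_WQ_WORKER 0x20u

/* Simplified model of the inputs that pci_call_probe() examines. */
struct demo_probe_ctx {
	int node;                /* NUMA node of the device, negative if unknown */
	int max_nodes;           /* stand-in for MAX_NUMNODES */
	bool node_online;        /* stand-in for node_online(node) */
	bool physfn_is_probed;   /* stand-in for pci_physfn_is_probed(dev) */
	unsigned int task_flags; /* stand-in for current->flags */
};

/*
 * Return true when the probe should run in the calling context, false
 * when it would be handed to work_on_cpu().
 */
static bool demo_probe_locally(const struct demo_probe_ctx *ctx)
{
	if (ctx->node < 0 || ctx->node >= ctx->max_nodes || !ctx->node_online)
		return true;
	if (ctx->physfn_is_probed)
		return true;
	/* Condition added by the patch: already inside a workqueue worker. */
	return (ctx->task_flags & DEMO_PF_WQ_WORKER) != 0;
}
```

With a valid online node and no worker flag, the model still takes the work_on_cpu() path; setting the flag alone is enough to keep the probe in the current worker.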
Testing three NVMe devices on the same NUMA node of an AMD EPYC 9A64
2.4 GHz processor shows a 35% probe-time improvement with the patch:

Before (all on CPU 0):
  nvme 0000:01:00.0: CPU: 0, COMM: kworker/0:1, probe cost: 53372612 ns
  nvme 0000:02:00.0: CPU: 0, COMM: kworker/0:2, probe cost: 49532941 ns
  nvme 0000:03:00.0: CPU: 0, COMM: kworker/0:3, probe cost: 47315175 ns

After (spread across CPUs 1, 2, 5):
  nvme 0000:01:00.0: CPU: 5, COMM: kworker/u1025:5, probe cost: 34765890 ns
  nvme 0000:02:00.0: CPU: 1, COMM: kworker/u1025:2, probe cost: 34696433 ns
  nvme 0000:03:00.0: CPU: 2, COMM: kworker/u1025:3, probe cost: 33233323 ns

The improvement grows with more PCI devices because fewer probes contend
for the same CPU.
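
As a worked check of the numbers above, the sketch below computes the relative reduction between the slowest probe before and after the patch. Comparing the slowest probes is an assumption about how the 35% figure was derived; the message does not say. The demo_* names are hypothetical.

```c
#include <stddef.h>

/* Probe costs in nanoseconds, copied from the log lines above. */
static const unsigned long long demo_before_ns[] = {
	53372612ULL, 49532941ULL, 47315175ULL
};
static const unsigned long long demo_after_ns[] = {
	34765890ULL, 34696433ULL, 33233323ULL
};

/* Largest value in an array of n elements (n must be at least 1). */
static unsigned long long demo_max(const unsigned long long *v, size_t n)
{
	unsigned long long max = v[0];

	for (size_t i = 1; i < n; i++)
		if (v[i] > max)
			max = v[i];
	return max;
}

/* Relative reduction in percent: 100 * (before - after) / before. */
static double demo_reduction_pct(unsigned long long before,
				 unsigned long long after)
{
	return 100.0 * (double)(before - after) / (double)before;
}
```

Calling demo_reduction_pct(demo_max(demo_before_ns, 3), demo_max(demo_after_ns, 3)) gives a value a little under 35, which is consistent with the rounded figure quoted above.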
Fixes: ef0ff68351be ("driver core: Probe devices asynchronously instead of the driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jinhui Guo <guojinhui.liam(a)bytedance.com>
---
drivers/pci/pci-driver.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 7c2d9d596258..4bc47a84d330 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -366,9 +366,11 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
/*
* Prevent nesting work_on_cpu() for the case where a Virtual Function
* device is probed from work_on_cpu() of the Physical device.
+ * Check PF_WQ_WORKER to prevent invoking work_on_cpu() in an asynchronous
+ * probe worker when the driver allows asynchronous probing.
*/
if (node < 0 || node >= MAX_NUMNODES || !node_online(node) ||
- pci_physfn_is_probed(dev)) {
+ pci_physfn_is_probed(dev) || (current->flags & PF_WQ_WORKER)) {
cpu = nr_cpu_ids;
} else {
cpumask_var_t wq_domain_mask;
--
2.20.1
When software issues a Cache Maintenance Operation (CMO) targeting a
dirty cache line, the CPU and DSU cluster may optimize the operation by
combining the CopyBack Write and CMO into a single combined CopyBack
Write plus CMO transaction presented to the interconnect (MCN).
For these combined transactions, the MCN splits the operation into two
separate transactions, one Write and one CMO, and then propagates the
write and optionally the CMO to the downstream memory system or external
Point of Serialization (PoS).
However, the MCN may return an early CompCMO response to the DSU cluster
before the corresponding Write and CMO transactions have completed at
the external PoS or downstream memory. As a result, stale data may be
observed by external observers that are directly connected to the
external PoS or downstream memory.
This erratum affects any system topology in which the following
conditions apply:
- The Point of Serialization (PoS) is located downstream of the
interconnect.
- A downstream observer accesses memory directly, bypassing the
interconnect.
Conditions:
This erratum occurs only when all of the following conditions are met:
1. Software executes a data cache maintenance operation, specifically,
a clean or invalidate by virtual address (DC CVAC, DC CIVAC, or DC
IVAC), that hits on unique dirty data in the CPU or DSU cache. This
results in a combined CopyBack and CMO being issued to the
interconnect.
2. The interconnect splits the combined transaction into separate Write
and CMO transactions and returns an early completion response to the
CPU or DSU before the write has completed at the downstream memory
or PoS.
3. A downstream observer accesses the affected memory address after the
early completion response is issued but before the actual memory
write has completed. This allows the observer to read stale data
that has not yet been updated at the PoS or downstream memory.
The workaround issues a second loop of CMOs over the same virtual
addresses whenever the operation meets the erratum conditions, waiting
until the cache data has been cleaned to the PoC. This approach reduces
the performance penalty compared with simply duplicating each original CMO.
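
The two-pass idea can be modelled in plain C as follows. This is a sketch under stated assumptions, not the assembly from the patch: demo_dcache_by_line(), demo_cmo_fn and demo_count_cmo() are hypothetical names, and the callback merely stands in for one "DC <op>, <address>" instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for a single cache maintenance op. */
typedef void (*demo_cmo_fn)(uintptr_t line_addr, void *cookie);

/*
 * Walk [start, end) one cache line at a time and issue the CMO for each
 * line. When workaround is non-zero, walk the whole range a second time
 * before the caller would issue its barrier.
 */
static void demo_dcache_by_line(uintptr_t start, uintptr_t end,
				size_t line_size, int workaround,
				demo_cmo_fn cmo, void *cookie)
{
	int passes = workaround ? 2 : 1;

	start &= ~((uintptr_t)line_size - 1);	/* align down to a line */
	for (int pass = 0; pass < passes; pass++)
		for (uintptr_t addr = start; addr < end; addr += line_size)
			cmo(addr, cookie);
	/* The real code issues "dsb <domain>" at this point. */
}

/* Example callback that only counts how many CMOs were issued. */
static void demo_count_cmo(uintptr_t line_addr, void *cookie)
{
	(void)line_addr;
	(*(unsigned int *)cookie)++;
}
```

For a 256-byte range with 64-byte lines, the model issues four CMOs without the workaround and eight with it: the whole range is walked twice rather than each line being issued twice back to back.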
Cc: stable(a)vger.kernel.org # 6.12.x
Signed-off-by: Lucas Wei <lucaswei(a)google.com>
---
Documentation/arch/arm64/silicon-errata.rst | 3 ++
arch/arm64/Kconfig | 19 +++++++++++++
arch/arm64/include/asm/assembler.h | 10 +++++++
arch/arm64/kernel/cpu_errata.c | 31 +++++++++++++++++++++
arch/arm64/mm/cache.S | 13 ++++++++-
arch/arm64/tools/cpucaps | 1 +
6 files changed, 76 insertions(+), 1 deletion(-)
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst
index a7ec57060f64..98efdf528719 100644
--- a/Documentation/arch/arm64/silicon-errata.rst
+++ b/Documentation/arch/arm64/silicon-errata.rst
@@ -213,6 +213,9 @@ stable kernels.
| ARM | GIC-700 | #2941627 | ARM64_ERRATUM_2941627 |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | SI L1 | #4311569 | ARM64_ERRATUM_4311569 |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 |
+----------------+-----------------+-----------------+-----------------------------+
| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_843419 |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 93173f0a09c7..89326bb26f48 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1155,6 +1155,25 @@ config ARM64_ERRATUM_3194386
If unsure, say Y.
+config ARM64_ERRATUM_4311569
+ bool "SI L1: 4311569: workaround for premature CMO completion erratum"
+ default y
+ help
+ This option adds the workaround for ARM SI L1 erratum 4311569.
+
+ The erratum of SI L1 can cause an early response to a combined write
+ and cache maintenance operation (WR+CMO) before the operation is fully
+ completed to the Point of Serialization (POS).
+ This can result in a non-I/O coherent agent observing stale data,
+ potentially leading to system instability or incorrect behavior.
+
+ Enabling this option implements a software workaround by inserting a
+ second loop of Cache Maintenance Operation (CMO) immediately following the
+ end of function to do CMOs. This ensures that the data is correctly serialized
+ before the buffer is handed off to a non-coherent agent.
+
+ If unsure, say Y.
+
config CAVIUM_ERRATUM_22375
bool "Cavium erratum 22375, 24313"
default y
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index f0ca7196f6fa..d3d46e5f7188 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -381,6 +381,9 @@ alternative_endif
.macro dcache_by_myline_op op, domain, start, end, linesz, tmp, fixup
sub \tmp, \linesz, #1
bic \start, \start, \tmp
+alternative_if ARM64_WORKAROUND_4311569
+ mov \tmp, \start
+alternative_else_nop_endif
.Ldcache_op\@:
.ifc \op, cvau
__dcache_op_workaround_clean_cache \op, \start
@@ -402,6 +405,13 @@ alternative_endif
add \start, \start, \linesz
cmp \start, \end
b.lo .Ldcache_op\@
+alternative_if ARM64_WORKAROUND_4311569
+ .ifnc \op, cvau
+ mov \start, \tmp
+ mov \tmp, xzr
+ cbnz \start, .Ldcache_op\@
+ .endif
+alternative_else_nop_endif
dsb \domain
_cond_uaccess_extable .Ldcache_op\@, \fixup
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 8cb3b575a031..c69678c512f1 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -141,6 +141,30 @@ has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
return (ctr_real != sys) && (ctr_raw != sys);
}
+#ifdef CONFIG_ARM64_ERRATUM_4311569
+DEFINE_STATIC_KEY_FALSE(arm_si_l1_workaround_4311569);
+static int __init early_arm_si_l1_workaround_4311569_cfg(char *arg)
+{
+ static_branch_enable(&arm_si_l1_workaround_4311569);
+ pr_info("Enabling cache maintenance workaround for ARM SI-L1 erratum 4311569\n");
+
+ return 0;
+}
+early_param("arm_si_l1_workaround_4311569", early_arm_si_l1_workaround_4311569_cfg);
+
+/*
+ * We have some earlier use cases to call cache maintenance operation functions, for example,
+ * dcache_inval_poc() and dcache_clean_poc() in head.S, before making decision to turn on this
+ * workaround. Since the scope of this workaround is limited to non-coherent DMA agents, it's
+ * safe to have the workaround off by default.
+ */
+static bool
+need_arm_si_l1_workaround_4311569(const struct arm64_cpu_capabilities *entry, int scope)
+{
+ return static_branch_unlikely(&arm_si_l1_workaround_4311569);
+}
+#endif
+
static void
cpu_enable_trap_ctr_access(const struct arm64_cpu_capabilities *cap)
{
@@ -870,6 +894,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
ERRATA_MIDR_RANGE_LIST(erratum_spec_ssbs_list),
},
#endif
+#ifdef CONFIG_ARM64_ERRATUM_4311569
+ {
+ .capability = ARM64_WORKAROUND_4311569,
+ .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .matches = need_arm_si_l1_workaround_4311569,
+ },
+#endif
#ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
{
.desc = "ARM errata 2966298, 3117295",
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 503567c864fd..ddf0097624ed 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -143,9 +143,14 @@ SYM_FUNC_END(dcache_clean_pou)
* - end - kernel end address of region
*/
SYM_FUNC_START(__pi_dcache_inval_poc)
+alternative_if ARM64_WORKAROUND_4311569
+ mov x4, x0
+ mov x5, x1
+ mov x6, #1
+alternative_else_nop_endif
dcache_line_size x2, x3
sub x3, x2, #1
- tst x1, x3 // end cache line aligned?
+again: tst x1, x3 // end cache line aligned?
bic x1, x1, x3
b.eq 1f
dc civac, x1 // clean & invalidate D / U line
@@ -158,6 +163,12 @@ SYM_FUNC_START(__pi_dcache_inval_poc)
3: add x0, x0, x2
cmp x0, x1
b.lo 2b
+alternative_if ARM64_WORKAROUND_4311569
+ mov x0, x4
+ mov x1, x5
+ sub x6, x6, #1
+ cbz x6, again
+alternative_else_nop_endif
dsb sy
ret
SYM_FUNC_END(__pi_dcache_inval_poc)
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 0fac75f01534..856b6cf6e71e 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -103,6 +103,7 @@ WORKAROUND_2077057
WORKAROUND_2457168
WORKAROUND_2645198
WORKAROUND_2658417
+WORKAROUND_4311569
WORKAROUND_AMPERE_AC03_CPU_38
WORKAROUND_AMPERE_AC04_CPU_23
WORKAROUND_TRBE_OVERWRITE_FILL_MODE
--
2.52.0.358.g0dd7633a29-goog
If SMT is disabled or a partial SMT state is enabled, when a new kernel
image is loaded for kexec, on reboot the following warning is observed:
kexec: Waking offline cpu 228.
WARNING: CPU: 0 PID: 9062 at arch/powerpc/kexec/core_64.c:223 kexec_prepare_cpus+0x1b0/0x1bc
[snip]
NIP kexec_prepare_cpus+0x1b0/0x1bc
LR kexec_prepare_cpus+0x1a0/0x1bc
Call Trace:
kexec_prepare_cpus+0x1a0/0x1bc (unreliable)
default_machine_kexec+0x160/0x19c
machine_kexec+0x80/0x88
kernel_kexec+0xd0/0x118
__do_sys_reboot+0x210/0x2c4
system_call_exception+0x124/0x320
system_call_vectored_common+0x15c/0x2ec
This occurs because add_cpu() fails when cpu_bootable() returns false,
which happens for CPUs that fail the cpu_smt_thread_allowed() check, or
for non-primary threads if SMT is disabled.
Fix the issue by enabling SMT and resetting the number of SMT threads to
the number of threads per core, before attempting to wake up all present
CPUs.
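
The effect of the fix can be illustrated with a deliberately simplified model of the bootability check. The demo_* names are hypothetical, the real cpu_bootable() handles more cases, and the model assumes that threads of a core are numbered consecutively.

```c
#include <stdbool.h>

enum demo_smt_control { DEMO_SMT_ENABLED, DEMO_SMT_DISABLED };

struct demo_smt_state {
	enum demo_smt_control control;	/* models cpu_smt_control */
	unsigned int num_threads;	/* models cpu_smt_num_threads */
	unsigned int threads_per_core;
};

/* Thread index of a CPU within its core (consecutive numbering assumed). */
static unsigned int demo_thread_in_core(const struct demo_smt_state *s,
					unsigned int cpu)
{
	return cpu % s->threads_per_core;
}

/* Simplified model of the check deciding whether a CPU may be onlined. */
static bool demo_cpu_bootable(const struct demo_smt_state *s, unsigned int cpu)
{
	unsigned int thread = demo_thread_in_core(s, cpu);

	if (s->control == DEMO_SMT_ENABLED && thread < s->num_threads)
		return true;
	return thread == 0;	/* primary threads stay bootable */
}

/* What the fix does before waking CPUs: allow every thread again. */
static void demo_allow_all_threads(struct demo_smt_state *s)
{
	s->num_threads = s->threads_per_core;
	s->control = DEMO_SMT_ENABLED;
}
```

In this model, with eight threads per core and only two SMT threads allowed, CPU 228 (thread 4 of its core) is not bootable until demo_allow_all_threads() has run.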
Fixes: 38253464bc82 ("cpu/SMT: Create topology_smt_thread_allowed()")
Reported-by: Sachin P Bappalige <sachinpb(a)linux.ibm.com>
Cc: stable(a)vger.kernel.org # v6.6+
Signed-off-by: Nysal Jan K.A. <nysal(a)linux.ibm.com>
---
arch/powerpc/kexec/core_64.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 222aa326dace..ff6df43720c4 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -216,6 +216,11 @@ static void wake_offline_cpus(void)
{
int cpu = 0;
+ lock_device_hotplug();
+ cpu_smt_num_threads = threads_per_core;
+ cpu_smt_control = CPU_SMT_ENABLED;
+ unlock_device_hotplug();
+
for_each_present_cpu(cpu) {
if (!cpu_online(cpu)) {
printk(KERN_INFO "kexec: Waking offline cpu %d.\n",
--
2.51.0
Since Linux v6.7, booting using BootX on an Old World PowerMac produces
an early crash. Stan Johnson writes, "the symptoms are that the screen
goes blank and the backlight stays on, and the system freezes (Linux
doesn't boot)."
Further testing revealed that the failure can be avoided by disabling
CONFIG_BOOTX_TEXT. Bisection revealed that the regression was caused by
a change to the font bitmap pointer that's used when btext_init() begins
painting characters on the display, early in the boot process.
Christophe Leroy explains, "before kernel text is relocated to its final
location ... data is addressed with an offset which is added to the
Global Offset Table (GOT) entries at the start of bootx_init()
by function reloc_got2(). But the pointers that are located inside a
structure are not referenced in the GOT and are therefore not updated by
reloc_got2(). It is therefore needed to apply the offset manually by using
PTRRELOC() macro."
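
The need for the manual offset can be shown with a small userspace model. It is only a sketch: memory is a byte array, addresses are indices into it, and struct demo_font_desc, demo_ptrreloc() and demo_read_glyph() are hypothetical names rather than the kernel's definitions.

```c
#include <stddef.h>

/* A font descriptor whose data field holds a link-time address. */
struct demo_font_desc {
	size_t data;	/* address of the glyph bytes as the linker saw it */
};

/*
 * Model of PTRRELOC(): add the load offset to a link-time address to get
 * the address where the bytes really live before relocation is complete.
 */
static size_t demo_ptrreloc(size_t link_addr, size_t reloc_offset)
{
	return link_addr + reloc_offset;
}

/* Read one glyph byte through the descriptor, applying the offset. */
static unsigned char demo_read_glyph(const unsigned char *mem,
				     const struct demo_font_desc *font,
				     size_t index, size_t reloc_offset)
{
	return mem[demo_ptrreloc(font->data, reloc_offset) + index];
}
```

If the image is loaded 16 bytes above its link address, reading through the descriptor with an offset of 0 returns whatever happens to sit at the link-time address, while an offset of 16 returns the real glyph byte.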
Cc: Cedar Maxwell <cedarmaxwell(a)mac.com>
Cc: Stan Johnson <userm57(a)yahoo.com>
Cc: "Dr. David Alan Gilbert" <linux(a)treblig.org>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: stable(a)vger.kernel.org
Link: https://lists.debian.org/debian-powerpc/2025/10/msg00111.html
Link: https://lore.kernel.org/linuxppc-dev/d81ddca8-c5ee-d583-d579-02b19ed95301@y…
Reported-by: Cedar Maxwell <cedarmaxwell(a)mac.com>
Closes: https://lists.debian.org/debian-powerpc/2025/09/msg00031.html
Bisected-by: Stan Johnson <userm57(a)yahoo.com>
Tested-by: Stan Johnson <userm57(a)yahoo.com>
Fixes: 0ebc7feae79a ("powerpc: Use shared font data")
Suggested-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Finn Thain <fthain(a)linux-m68k.org>
---
Changed since v1:
- Improved commit log entry to better explain the need for PTRRELOC().
---
arch/powerpc/kernel/btext.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/btext.c b/arch/powerpc/kernel/btext.c
index 7f63f1cdc6c3..ca00c4824e31 100644
--- a/arch/powerpc/kernel/btext.c
+++ b/arch/powerpc/kernel/btext.c
@@ -20,6 +20,7 @@
#include <asm/io.h>
#include <asm/processor.h>
#include <asm/udbg.h>
+#include <asm/setup.h>
#define NO_SCROLL
@@ -463,7 +464,7 @@ static noinline void draw_byte(unsigned char c, long locX, long locY)
{
unsigned char *base = calc_base(locX << 3, locY << 4);
unsigned int font_index = c * 16;
- const unsigned char *font = font_sun_8x16.data + font_index;
+ const unsigned char *font = PTRRELOC(font_sun_8x16.data) + font_index;
int rb = dispDeviceRowBytes;
rmci_maybe_on();
--
2.49.1
Dear linux-fbdev, stable,
On Mon, Oct 20, 2025 at 09:47:01PM +0800, Junjie Cao wrote:
> bit_putcs_aligned()/unaligned() derived the glyph pointer from the
> character value masked by 0xff/0x1ff, which may exceed the actual font's
> glyph count and read past the end of the built-in font array.
> Clamp the index to the actual glyph count before computing the address.
>
> This fixes a global out-of-bounds read reported by syzbot.
>
> Reported-by: syzbot+793cf822d213be1a74f2(a)syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=793cf822d213be1a74f2
> Tested-by: syzbot+793cf822d213be1a74f2(a)syzkaller.appspotmail.com
> Signed-off-by: Junjie Cao <junjie.cao(a)intel.com>
This commit was applied to v5.10.247 and causes a regression: when
switching VT with ctrl-alt-f2, the screen is blank or completely filled
with angle characters, and new text does not appear (or is not visible).
The commit was found with git bisect from v5.10.246 to v5.10.247:
0998a6cb232674408a03e8561dc15aa266b2f53b is the first bad commit
commit 0998a6cb232674408a03e8561dc15aa266b2f53b
Author: Junjie Cao <junjie.cao(a)intel.com>
AuthorDate: 2025-10-20 21:47:01 +0800
Commit: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
CommitDate: 2025-12-07 06:08:07 +0900
fbdev: bitblit: bound-check glyph index in bit_putcs*
commit 18c4ef4e765a798b47980555ed665d78b71aeadf upstream.
bit_putcs_aligned()/unaligned() derived the glyph pointer from the
character value masked by 0xff/0x1ff, which may exceed the actual font's
glyph count and read past the end of the built-in font array.
Clamp the index to the actual glyph count before computing the address.
This fixes a global out-of-bounds read reported by syzbot.
Reported-by: syzbot+793cf822d213be1a74f2(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=793cf822d213be1a74f2
Tested-by: syzbot+793cf822d213be1a74f2(a)syzkaller.appspotmail.com
Signed-off-by: Junjie Cao <junjie.cao(a)intel.com>
Reviewed-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Signed-off-by: Helge Deller <deller(a)gmx.de>
Cc: stable(a)vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
drivers/video/fbdev/core/bitblit.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
A minimal reproducer from the command line, after the kernel has booted:
date >/dev/tty2; chvt 2
and the date does not appear.
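
For reference, the bound check in the patch quoted in this message can be sketched as a standalone function. demo_glyph_offset() is a hypothetical name; the logic mirrors the "if (ch >= charcnt) ch = 0" check and nothing more.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the byte offset of a glyph inside the font data. Indices at or
 * beyond charcount fall back to glyph 0.
 */
static size_t demo_glyph_offset(uint16_t raw, uint16_t charmask,
				unsigned int charcount, unsigned int cellsize)
{
	uint16_t ch = raw & charmask;

	if (ch >= charcount)
		ch = 0;
	return (size_t)ch * cellsize;
}
```

One property of this logic on its own: if charcount is 0, every character maps to glyph 0. Whether that is what happens on v5.10.247 is not established here.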
Thanks,
#regzbot introduced: 0998a6cb232674408a03e8561dc15aa266b2f53b
> ---
> v1: https://lore.kernel.org/linux-fbdev/5d237d1a-a528-4205-a4d8-71709134f1e1@su…
> v1 -> v2:
> - Fix indentation and add blank line after declarations with the .pl helper
> - No functional changes
>
> drivers/video/fbdev/core/bitblit.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/video/fbdev/core/bitblit.c b/drivers/video/fbdev/core/bitblit.c
> index 9d2e59796c3e..085ffb44c51a 100644
> --- a/drivers/video/fbdev/core/bitblit.c
> +++ b/drivers/video/fbdev/core/bitblit.c
> @@ -79,12 +79,16 @@ static inline void bit_putcs_aligned(struct vc_data *vc, struct fb_info *info,
> struct fb_image *image, u8 *buf, u8 *dst)
> {
> u16 charmask = vc->vc_hi_font_mask ? 0x1ff : 0xff;
> + unsigned int charcnt = vc->vc_font.charcount;
> u32 idx = vc->vc_font.width >> 3;
> u8 *src;
>
> while (cnt--) {
> - src = vc->vc_font.data + (scr_readw(s++)&
> - charmask)*cellsize;
> + u16 ch = scr_readw(s++) & charmask;
> +
> + if (ch >= charcnt)
> + ch = 0;
> + src = vc->vc_font.data + (unsigned int)ch * cellsize;
>
> if (attr) {
> update_attr(buf, src, attr, vc);
> @@ -112,14 +116,18 @@ static inline void bit_putcs_unaligned(struct vc_data *vc,
> u8 *dst)
> {
> u16 charmask = vc->vc_hi_font_mask ? 0x1ff : 0xff;
> + unsigned int charcnt = vc->vc_font.charcount;
> u32 shift_low = 0, mod = vc->vc_font.width % 8;
> u32 shift_high = 8;
> u32 idx = vc->vc_font.width >> 3;
> u8 *src;
>
> while (cnt--) {
> - src = vc->vc_font.data + (scr_readw(s++)&
> - charmask)*cellsize;
> + u16 ch = scr_readw(s++) & charmask;
> +
> + if (ch >= charcnt)
> + ch = 0;
> + src = vc->vc_font.data + (unsigned int)ch * cellsize;
>
> if (attr) {
> update_attr(buf, src, attr, vc);
> --
> 2.48.1
>
When a multi-core LoongArch virtual machine is running on a LoongArch
physical machine, loading a livepatch on the physical machine produces an
error similar to the following:
[ 411.686289] livepatch: klp_try_switch_task: CPU 31/KVM:3116 has an
unreliable stack
The specific test steps are as follows:
1. Start a multi-core virtual machine on a physical machine.
2. Enter the following command on the physical machine to turn on the debug
   switch:
   echo "file kernel/livepatch/transition.c +p" > /sys/kernel/debug/\
   dynamic_debug/control
3. Load the livepatch:
   modprobe livepatch-sample
After these steps, messages like the one above can be seen in dmesg.
The cause is that kvm_loongarch_env_init() copies the code of
kvm_exc_entry() to a new location. When the CPU needs to execute
kvm_exc_entry(), it jumps to the copied address. ORC cannot recognize the
new address of kvm_exc_entry(), so arch_stack_walk_reliable() returns an
error and the message above is printed.
To solve this, compile switch.S directly into the kernel instead of the
module. That way kvm_exc_entry() no longer needs to be copied.
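
Why a copied function defeats the unwinder can be pictured with a toy address lookup. This is a loose model only: struct demo_unwind_range and demo_pc_is_known() are hypothetical, and the real ORC tables are more involved than a list of ranges.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One address range that the unwinder has metadata for. */
struct demo_unwind_range {
	uintptr_t start;
	uintptr_t end;	/* exclusive */
};

/* Return true if pc lies in a range the unwinder knows about. */
static bool demo_pc_is_known(const struct demo_unwind_range *tbl, size_t n,
			     uintptr_t pc)
{
	for (size_t i = 0; i < n; i++)
		if (pc >= tbl[i].start && pc < tbl[i].end)
			return true;
	return false;
}
```

A program counter inside the original, compiled-in range is found; the same instruction executed from a copy at another address is not, so a reliable stack walk has to give up.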
Changelog:
V2 -> V3:
1. Declare the exported symbols with the EXPORT_SYMBOL_FOR_KVM macro
   instead of EXPORT_SYMBOL
2. Add some comments in kvm_enter_guest
3. Place the correct pc address in era
4. Move .p2align after .text
V1 -> V2:
1. Roll back the changes to the parameter types of functions such as
   kvm_save_fpu. The asm-prototypes.h header file includes only the
   parameter types it depends on
Cc: Huacai Chen <chenhuacai(a)kernel.org>
Cc: WANG Xuerui <kernel(a)xen0n.name>
Cc: Tianrui Zhao <zhaotianrui(a)loongson.cn>
Cc: Bibo Mao <maobibo(a)loongson.cn>
Cc: Charlie Jenkins <charlie(a)rivosinc.com>
Cc: Xianglai Li <lixianglai(a)loongson.cn>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Cc: Tiezhu Yang <yangtiezhu(a)loongson.cn>
Xianglai Li (2):
LoongArch: KVM: Compile the switch.S file directly into the kernel
LoongArch: KVM: fix "unreliable stack" issue
arch/loongarch/Kbuild | 2 +-
arch/loongarch/include/asm/asm-prototypes.h | 21 +++++++++++++
arch/loongarch/include/asm/kvm_host.h | 3 --
arch/loongarch/kvm/Makefile | 2 +-
arch/loongarch/kvm/main.c | 35 ++-------------------
arch/loongarch/kvm/switch.S | 32 ++++++++++++++++---
6 files changed, 53 insertions(+), 42 deletions(-)
base-commit: 8f0b4cce4481fb22653697cced8d0d04027cb1e8
--
2.39.1
This reverts commit ec9fd499b9c60a187ac8d6414c3c343c77d32e42.
While this fake hotplugging was a nice idea, it has shown that this feature
does not handle PCIe switches correctly:
pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43
pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them
pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44
pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them
pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45
pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them
pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46
pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41])
pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them
pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46
pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41])
pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them
pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46
During the initial scan, PCI core doesn't see the switch and since the Root
Port is not hot plug capable, the secondary bus number gets assigned as the
subordinate bus number. This means, the PCI core assumes that only one bus
will appear behind the Root Port since the Root Port is not hot plug
capable.
This works perfectly fine for PCIe endpoints connected to the Root Port,
since they don't extend the bus. However, if a PCIe switch is connected,
there is a problem when the downstream buses start showing up, because
the PCI core doesn't extend the subordinate bus number after the initial
scan during boot.
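
The "can not insert" messages above boil down to a containment test on bus number ranges. The sketch below is a simplified model with hypothetical names (struct demo_bus_range, demo_bus_range_fits()); the kernel's resource insertion has more rules than this.

```c
#include <stdbool.h>

/* Inclusive bus number range, for example [bus 42-46]. */
struct demo_bus_range {
	unsigned int start;
	unsigned int end;
};

/* A child range can be inserted only if it lies fully inside the parent. */
static bool demo_bus_range_fits(struct demo_bus_range parent,
				struct demo_bus_range child)
{
	if (parent.start > parent.end || child.start > child.end)
		return false;	/* malformed range such as [bus 43-41] */
	return child.start >= parent.start && child.end <= parent.end;
}
```

With the parent fixed at [bus 41], the switch's [bus 42-46] does not fit; it would fit only if the subordinate bus number had been extended so that the parent covered [bus 41-46].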
The long term plan is to migrate this driver to the pwrctrl framework,
once it adds proper support for powering up and enumerating PCIe switches.
Cc: stable(a)vger.kernel.org
Suggested-by: Manivannan Sadhasivam <mani(a)kernel.org>
Acked-by: Shawn Lin <shawn.lin(a)rock-chips.com>
Tested-by: Shawn Lin <shawn.lin(a)rock-chips.com>
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
drivers/pci/controller/dwc/pcie-dw-rockchip.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
index 8c1c92208802..ca808d8f7975 100644
--- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
+++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
@@ -601,7 +601,6 @@ static int rockchip_pcie_configure_rc(struct platform_device *pdev,
pp = &rockchip->pci.pp;
pp->ops = &rockchip_pcie_host_ops;
- pp->use_linkup_irq = true;
ret = dw_pcie_host_init(pp);
if (ret) {
--
2.52.0
The commit referenced in the Fixes tag enabled loadable module support
for the driver under the assumption that it would be the sole user of the
Cadence Host and Endpoint library APIs. This assumption guarantees that
we won't end up in a case where the driver is built-in and the library
support is built as a loadable module.
With the introduction of [1], this assumption is no longer valid. The
SG2042 driver could be built as a loadable module, implying that the
Cadence Host library is also selected as a loadable module. However, the
pci-j721e.c driver could be built-in as indicated by CONFIG_PCI_J721E=y
due to which the Cadence Endpoint library is built-in. Despite the
library drivers being built as specified by their respective consumers,
since the 'pci-j721e.c' driver has references to the Cadence Host
library APIs as well, we run into a build error as reported at [0].
Fix this by adding config guards as a temporary workaround. The proper
fix is to split the 'pci-j721e.c' driver into independent Host and
Endpoint drivers as aligned at [2].
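
How such config guards behave can be sketched with a preprocessor trick modelled on the kernel's IS_ENABLED(). The DEMO_* macros and demo_* functions are hypothetical, and the model only covers "defined as 1", whereas IS_ENABLED() is also true for options built as modules.

```c
/* Evaluates to 1 if the option macro is defined as 1, otherwise to 0. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED_1(x)
#define DEMO_IS_DEFINED_1(val) DEMO_IS_DEFINED_2(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED_2(arg1_or_junk) DEMO_TAKE_SECOND(arg1_or_junk 1, 0)

/* Hypothetical configuration: host support on, endpoint support off. */
#define DEMO_CONFIG_HOST 1
/* DEMO_CONFIG_EP is intentionally left undefined. */

static int demo_host_setup(void) { return 1; }
static int demo_ep_setup(void) { return 2; }

/* Return a bitmask of which setup routines ran. */
static int demo_probe(void)
{
	int ran = 0;

	if (DEMO_IS_DEFINED(DEMO_CONFIG_HOST))
		ran |= demo_host_setup();
	if (DEMO_IS_DEFINED(DEMO_CONFIG_EP))
		ran |= demo_ep_setup();	/* constant-false branch */
	return ran;
}
```

Because each guard is a compile-time constant, the disabled call sits in a branch the compiler can discard, while the code is still parsed and type-checked, unlike with #ifdef.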
Fixes: a2790bf81f0f ("PCI: j721e: Add support to build as a loadable module")
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202511111705.MZ7ls8Hm-lkp@intel.com/
Cc: <stable(a)vger.kernel.org>
[0]: https://lore.kernel.org/r/202511111705.MZ7ls8Hm-lkp@intel.com/
[1]: commit 1c72774df028 ("PCI: sg2042: Add Sophgo SG2042 PCIe driver")
[2]: https://lore.kernel.org/r/37f6f8ce-12b2-44ee-a94c-f21b29c98821@app.fastmail…
Suggested-by: Arnd Bergmann <arnd(a)arndb.de>
Signed-off-by: Siddharth Vadapalli <s-vadapalli(a)ti.com>
---
drivers/pci/controller/cadence/pci-j721e.c | 43 +++++++++++++---------
1 file changed, 26 insertions(+), 17 deletions(-)
diff --git a/drivers/pci/controller/cadence/pci-j721e.c b/drivers/pci/controller/cadence/pci-j721e.c
index 5bc5ab20aa6d..67c5e02afccf 100644
--- a/drivers/pci/controller/cadence/pci-j721e.c
+++ b/drivers/pci/controller/cadence/pci-j721e.c
@@ -628,10 +628,12 @@ static int j721e_pcie_probe(struct platform_device *pdev)
gpiod_set_value_cansleep(gpiod, 1);
}
- ret = cdns_pcie_host_setup(rc);
- if (ret < 0) {
- clk_disable_unprepare(pcie->refclk);
- goto err_pcie_setup;
+ if (IS_ENABLED(CONFIG_PCI_J721E_HOST)) {
+ ret = cdns_pcie_host_setup(rc);
+ if (ret < 0) {
+ clk_disable_unprepare(pcie->refclk);
+ goto err_pcie_setup;
+ }
}
break;
@@ -642,9 +644,11 @@ static int j721e_pcie_probe(struct platform_device *pdev)
goto err_get_sync;
}
- ret = cdns_pcie_ep_setup(ep);
- if (ret < 0)
- goto err_pcie_setup;
+ if (IS_ENABLED(CONFIG_PCI_J721E_EP)) {
+ ret = cdns_pcie_ep_setup(ep);
+ if (ret < 0)
+ goto err_pcie_setup;
+ }
break;
}
@@ -669,10 +673,11 @@ static void j721e_pcie_remove(struct platform_device *pdev)
struct cdns_pcie_ep *ep;
struct cdns_pcie_rc *rc;
- if (pcie->mode == PCI_MODE_RC) {
+ if (IS_ENABLED(CONFIG_PCI_J721E_HOST) &&
+ pcie->mode == PCI_MODE_RC) {
rc = container_of(cdns_pcie, struct cdns_pcie_rc, pcie);
cdns_pcie_host_disable(rc);
- } else {
+ } else if (IS_ENABLED(CONFIG_PCI_J721E_EP)) {
ep = container_of(cdns_pcie, struct cdns_pcie_ep, pcie);
cdns_pcie_ep_disable(ep);
}
@@ -739,10 +744,12 @@ static int j721e_pcie_resume_noirq(struct device *dev)
gpiod_set_value_cansleep(pcie->reset_gpio, 1);
}
- ret = cdns_pcie_host_link_setup(rc);
- if (ret < 0) {
- clk_disable_unprepare(pcie->refclk);
- return ret;
+ if (IS_ENABLED(CONFIG_PCI_J721E_HOST)) {
+ ret = cdns_pcie_host_link_setup(rc);
+ if (ret < 0) {
+ clk_disable_unprepare(pcie->refclk);
+ return ret;
+ }
}
/*
@@ -752,10 +759,12 @@ static int j721e_pcie_resume_noirq(struct device *dev)
for (enum cdns_pcie_rp_bar bar = RP_BAR0; bar <= RP_NO_BAR; bar++)
rc->avail_ib_bar[bar] = true;
- ret = cdns_pcie_host_init(rc);
- if (ret) {
- clk_disable_unprepare(pcie->refclk);
- return ret;
+ if (IS_ENABLED(CONFIG_PCI_J721E_HOST)) {
+ ret = cdns_pcie_host_init(rc);
+ if (ret) {
+ clk_disable_unprepare(pcie->refclk);
+ return ret;
+ }
}
}
--
2.51.1