Hi Palmer,

On 2025/6/11 06:25, Palmer Dabbelt wrote:
> On Wed, 28 May 2025 07:28:19 PDT (-0700), wangjingwei@iscas.ac.cn wrote:
>> The riscv_hwprobe vDSO data is populated by init_hwprobe_vdso_data(), an arch_initcall_sync. However, the underlying data for some keys, like RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF, is determined asynchronously.
>>
>> Specifically, the per_cpu(vector_misaligned_access, cpu) values are set by the vec_check_unaligned_access_speed_all_cpus kthread. This kthread is spawned by an earlier arch_initcall (check_unaligned_access_all_cpus) and may complete its benchmark *after* init_hwprobe_vdso_data() has already populated the vDSO with default/stale values.
>
> IIUC there's another race here: we don't ensure these probes complete before letting userspace see the values, so if they took long enough that userspace started making hwprobe() calls before they finished, we'd be providing the wrong answer.
>
> Unless I'm just missing something, though -- I thought we'd looked at that case?
Thanks for the review. You're right, my current patch doesn't fix the race with userspace.

The robust solution here is to use the kernel's `completion` mechanism. I've tested this approach: the async probing thread calls `complete()` when it finishes, and init_hwprobe_vdso_data() blocks on `wait_for_completion()`. This guarantees the vDSO data is finalized before userspace can access it.
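In sketch form, the handshake for a single async probe looks like this (illustrative names only; the actual patch below generalizes it with a refcount so several probes can share one completion):

    #include <linux/completion.h>
    #include <linux/init.h>

    static DECLARE_COMPLETION(probe_done);

    /* async prober (e.g. a kthread): signal once its results are final */
    static int __init async_prober(void *unused)
    {
            /* ... run the benchmark, fill in the per-CPU results ... */
            complete(&probe_done);
            return 0;
    }

    /* consumer initcall: block until the prober has signalled */
    static int __init consumer_init(void)
    {
            wait_for_completion(&probe_done);
            /* ... results are final here; safe to snapshot into the vDSO ... */
            return 0;
    }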
>> So, refresh the vDSO data for specified keys (e.g., MISALIGNED_VECTOR_PERF), ensuring it reflects the final boot-time values.
>>
>> Test by comparing vDSO and syscall results for affected keys (e.g., MISALIGNED_VECTOR_PERF), which now match their final boot-time values.
>
> Wouldn't all the other keys we probe via workqueue be racy as well?
The completion mechanism is easily reusable: pending_boot_probes is a refcount-style atomic counter, so any number of asynchronous probes can register before init_hwprobe_vdso_data() waits. If this approach is accepted, I will identify the other asynchronously probed keys and integrate them into the same synchronization logic.
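For illustration, wiring another asynchronous probe site into this framework would look like the following (hypothetical call site; the register/complete helpers are the ones added by the patch below):

    /* at the launch site, before spawning the worker */
    riscv_hwprobe_register_async_probe();
    if (IS_ERR(kthread_run(other_probe_fn, NULL, "other_probe")))
            riscv_hwprobe_complete_async_probe(); /* drop the pending ref on failure */

    /* ...and at the end of other_probe_fn(), once its results are final: */
    riscv_hwprobe_complete_async_probe();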
And here is my tested code:
diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 7fe0a379474ae2c6..87af186d92e75ddb 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -40,5 +40,11 @@ static inline bool riscv_hwprobe_pair_cmp(struct riscv_hwprobe *pair,
 
 	return pair->value == other_pair->value;
 }
-
+#ifdef CONFIG_MMU
+void riscv_hwprobe_register_async_probe(void);
+void riscv_hwprobe_complete_async_probe(void);
+#else
+static inline void riscv_hwprobe_register_async_probe(void) {}
+static inline void riscv_hwprobe_complete_async_probe(void) {}
+#endif
 #endif
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 0b170e18a2beba57..96ce1479e835534e 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -5,6 +5,8 @@
  * more details.
  */
 #include <linux/syscalls.h>
+#include <linux/completion.h>
+#include <linux/atomic.h>
 #include <asm/cacheflush.h>
 #include <asm/cpufeature.h>
 #include <asm/hwprobe.h>
@@ -467,6 +469,32 @@ static int do_riscv_hwprobe(struct riscv_hwprobe __user *pairs,
 
 #ifdef CONFIG_MMU
 
+/* Framework for synchronizing asynchronous boot-time probes */
+static DECLARE_COMPLETION(boot_probes_done);
+static atomic_t pending_boot_probes = ATOMIC_INIT(1);
+
+void riscv_hwprobe_register_async_probe(void)
+{
+	atomic_inc(&pending_boot_probes);
+}
+
+void riscv_hwprobe_complete_async_probe(void)
+{
+	if (atomic_dec_and_test(&pending_boot_probes))
+		complete(&boot_probes_done);
+}
+
+static void __init wait_for_all_boot_probes(void)
+{
+	if (atomic_dec_and_test(&pending_boot_probes))
+		return;
+
+	pr_info("riscv: waiting for hwprobe asynchronous probes to complete...\n");
+	wait_for_completion(&boot_probes_done);
+	pr_info("riscv: hwprobe asynchronous probes completed.\n");
+}
+
+
 static int __init init_hwprobe_vdso_data(void)
 {
 	struct vdso_arch_data *avd = vdso_k_arch_data;
@@ -474,6 +502,8 @@ static int __init init_hwprobe_vdso_data(void)
 	struct riscv_hwprobe pair;
 	int key;
 
+	wait_for_all_boot_probes();
+
 	/*
 	 * Initialize vDSO data with the answers for the "all CPUs" case, to
 	 * save a syscall in the common case.
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index ae2068425fbcd207..57e4169ab58fb9bc 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -379,6 +380,7 @@ static void check_vector_unaligned_access(struct work_struct *work __always_unus
 static int __init vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
 {
 	schedule_on_each_cpu(check_vector_unaligned_access);
+	riscv_hwprobe_complete_async_probe();
 
 	return 0;
 }
@@ -473,8 +475,12 @@ static int __init check_unaligned_access_all_cpus(void)
 			per_cpu(vector_misaligned_access, cpu) = unaligned_vector_speed_param;
 	} else if (!check_vector_unaligned_access_emulated_all_cpus() &&
 		   IS_ENABLED(CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS)) {
-		kthread_run(vec_check_unaligned_access_speed_all_cpus,
-			    NULL, "vec_check_unaligned_access_speed_all_cpus");
+		riscv_hwprobe_register_async_probe();
+		if (IS_ERR(kthread_run(vec_check_unaligned_access_speed_all_cpus,
+				       NULL, "vec_check_unaligned_access_speed_all_cpus"))) {
+			pr_warn("Failed to create vec_unalign_check kthread\n");
+			riscv_hwprobe_complete_async_probe();
+		}
 	}
 
 	/*
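A minimal userspace check for the vDSO-vs-syscall comparison could look like this (a sketch: it assumes glibc >= 2.40, whose <sys/hwprobe.h> provides __riscv_hwprobe() and routes it through the vDSO when available):

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <sys/hwprobe.h> /* glibc >= 2.40: __riscv_hwprobe(), struct riscv_hwprobe */

    int main(void)
    {
            struct riscv_hwprobe vdso_pair = {
                    .key = RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF,
            };
            struct riscv_hwprobe sys_pair = vdso_pair;

            /* glibc wrapper: served from the vDSO data page when possible */
            __riscv_hwprobe(&vdso_pair, 1, 0, NULL, 0);
            /* force the real syscall to get the kernel's current answer */
            syscall(__NR_riscv_hwprobe, &sys_pair, 1, 0, NULL, 0);

            printf("vdso=%llu syscall=%llu -> %s\n",
                   (unsigned long long)vdso_pair.value,
                   (unsigned long long)sys_pair.value,
                   vdso_pair.value == sys_pair.value ? "match" : "MISMATCH");
            return 0;
    }

With the patch applied, both paths should report the same final boot-time value.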