The riscv_hwprobe vDSO data is populated by init_hwprobe_vdso_data(), an arch_initcall_sync. However, underlying data for some keys, like RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF, is determined asynchronously.
Specifically, the per_cpu(vector_misaligned_access, cpu) values are set by the vec_check_unaligned_access_speed_all_cpus kthread. This kthread is spawned by an earlier arch_initcall (check_unaligned_access_all_cpus) and may complete its benchmark *after* init_hwprobe_vdso_data() has already populated the vDSO with default/stale values.
So, refresh the vDSO data for specified keys (e.g., MISALIGNED_VECTOR_PERF) ensuring it reflects the final boot-time values.
Test by comparing vDSO and syscall results for affected keys (e.g., MISALIGNED_VECTOR_PERF), which now match their final boot-time values.
Reported-by: Tsukasa OI <research_trasio@irq.a4lg.com>
Closes: https://lore.kernel.org/linux-riscv/760d637b-b13b-4518-b6bf-883d55d44e7f@irq...
Fixes: e7c9d66e313b ("RISC-V: Report vector unaligned access speed hwprobe")
Cc: stable@vger.kernel.org
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Reviewed-by: Jesse Taube <jesse@rivosinc.com>
Signed-off-by: Jingwei Wang <wangjingwei@iscas.ac.cn>
---
Changes in v3:
- Retained existing blank line.

Changes in v2:
- Addressed feedback from Yixun regarding #ifdef CONFIG_MMU usage.
- Updated commit message to provide a high-level summary.
- Added Fixes tag for commit e7c9d66e313b.
v1: https://lore.kernel.org/linux-riscv/20250521052754.185231-1-wangjingwei@isca...
 arch/riscv/include/asm/hwprobe.h           |  6 ++++++
 arch/riscv/kernel/sys_hwprobe.c            | 16 ++++++++++++++++
 arch/riscv/kernel/unaligned_access_speed.c |  1 +
 3 files changed, 23 insertions(+)

diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 1f690fea0e03de6a..58dc847d86c7f2b0 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -40,4 +40,10 @@ static inline bool riscv_hwprobe_pair_cmp(struct riscv_hwprobe *pair,
 	return pair->value == other_pair->value;
 }
 
+#ifdef CONFIG_MMU
+void riscv_hwprobe_vdso_sync(__s64 sync_key);
+#else
+static inline void riscv_hwprobe_vdso_sync(__s64 sync_key) { };
+#endif /* CONFIG_MMU */
+
 #endif
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 249aec8594a92a80..2e3e612b7ac6fd57 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -17,6 +17,7 @@
 #include <asm/vector.h>
 #include <asm/vendor_extensions/thead_hwprobe.h>
 #include <vdso/vsyscall.h>
+#include <vdso/datapage.h>
 
 static void hwprobe_arch_id(struct riscv_hwprobe *pair,
@@ -500,6 +501,21 @@ static int __init init_hwprobe_vdso_data(void)
 
 arch_initcall_sync(init_hwprobe_vdso_data);
 
+void riscv_hwprobe_vdso_sync(__s64 sync_key)
+{
+	struct vdso_arch_data *avd = vdso_k_arch_data;
+	struct riscv_hwprobe pair;
+
+	pair.key = sync_key;
+	hwprobe_one_pair(&pair, cpu_online_mask);
+	/*
+	 * Update vDSO data for the given key.
+	 * Currently for non-ID key updates (e.g. MISALIGNED_VECTOR_PERF),
+	 * so 'homogeneous_cpus' is not re-evaluated here.
+	 */
+	avd->all_cpu_hwprobe_values[sync_key] = pair.value;
+}
+
 #endif /* CONFIG_MMU */
 
 SYSCALL_DEFINE5(riscv_hwprobe, struct riscv_hwprobe __user *, pairs,
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index 585d2dcf2dab1ccb..a2cdb396ff823720 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -375,6 +375,7 @@ static void check_vector_unaligned_access(struct work_struct *work __always_unus
 static int __init vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
 {
 	schedule_on_each_cpu(check_vector_unaligned_access);
+	riscv_hwprobe_vdso_sync(RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF);
 
 	return 0;
 }
On Wed, 28 May 2025 07:28:19 PDT (-0700), wangjingwei@iscas.ac.cn wrote:
The riscv_hwprobe vDSO data is populated by init_hwprobe_vdso_data(), an arch_initcall_sync. However, underlying data for some keys, like RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF, is determined asynchronously.
Specifically, the per_cpu(vector_misaligned_access, cpu) values are set by the vec_check_unaligned_access_speed_all_cpus kthread. This kthread is spawned by an earlier arch_initcall (check_unaligned_access_all_cpus) and may complete its benchmark *after* init_hwprobe_vdso_data() has already populated the vDSO with default/stale values.
IIUC there's another race here: we don't ensure these complete before allowing userspace to see the values, so if these took so long to probe that userspace started to make hwprobe() calls before they got scheduled, we'd be providing the wrong answer.
Unless I'm just missing something, though -- I thought we'd looked at that case?
So, refresh the vDSO data for specified keys (e.g., MISALIGNED_VECTOR_PERF) ensuring it reflects the final boot-time values.
Test by comparing vDSO and syscall results for affected keys (e.g., MISALIGNED_VECTOR_PERF), which now match their final boot-time values.
Wouldn't all the other keys we probe via workqueue be racy as well?
Reported-by: Tsukasa OI <research_trasio@irq.a4lg.com>
Closes: https://lore.kernel.org/linux-riscv/760d637b-b13b-4518-b6bf-883d55d44e7f@irq...
Fixes: e7c9d66e313b ("RISC-V: Report vector unaligned access speed hwprobe")
Cc: stable@vger.kernel.org
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Reviewed-by: Jesse Taube <jesse@rivosinc.com>
Signed-off-by: Jingwei Wang <wangjingwei@iscas.ac.cn>
Changes in v3:
- Retained existing blank line.
Changes in v2:
- Addressed feedback from Yixun regarding #ifdef CONFIG_MMU usage.
- Updated commit message to provide a high-level summary.
- Added Fixes tag for commit e7c9d66e313b.
v1: https://lore.kernel.org/linux-riscv/20250521052754.185231-1-wangjingwei@isca...
 arch/riscv/include/asm/hwprobe.h           |  6 ++++++
 arch/riscv/kernel/sys_hwprobe.c            | 16 ++++++++++++++++
 arch/riscv/kernel/unaligned_access_speed.c |  1 +
 3 files changed, 23 insertions(+)

diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 1f690fea0e03de6a..58dc847d86c7f2b0 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -40,4 +40,10 @@ static inline bool riscv_hwprobe_pair_cmp(struct riscv_hwprobe *pair,
 	return pair->value == other_pair->value;
 }
 
+#ifdef CONFIG_MMU
+void riscv_hwprobe_vdso_sync(__s64 sync_key);
+#else
+static inline void riscv_hwprobe_vdso_sync(__s64 sync_key) { };
+#endif /* CONFIG_MMU */
+
 #endif
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 249aec8594a92a80..2e3e612b7ac6fd57 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -17,6 +17,7 @@
 #include <asm/vector.h>
 #include <asm/vendor_extensions/thead_hwprobe.h>
 #include <vdso/vsyscall.h>
+#include <vdso/datapage.h>
 
 static void hwprobe_arch_id(struct riscv_hwprobe *pair,
@@ -500,6 +501,21 @@ static int __init init_hwprobe_vdso_data(void)
 
 arch_initcall_sync(init_hwprobe_vdso_data);
 
+void riscv_hwprobe_vdso_sync(__s64 sync_key)
+{
+	struct vdso_arch_data *avd = vdso_k_arch_data;
+	struct riscv_hwprobe pair;
+
+	pair.key = sync_key;
+	hwprobe_one_pair(&pair, cpu_online_mask);
+	/*
+	 * Update vDSO data for the given key.
+	 * Currently for non-ID key updates (e.g. MISALIGNED_VECTOR_PERF),
+	 * so 'homogeneous_cpus' is not re-evaluated here.
+	 */
+	avd->all_cpu_hwprobe_values[sync_key] = pair.value;
+}
+
 #endif /* CONFIG_MMU */
 
 SYSCALL_DEFINE5(riscv_hwprobe, struct riscv_hwprobe __user *, pairs,
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index 585d2dcf2dab1ccb..a2cdb396ff823720 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -375,6 +375,7 @@ static void check_vector_unaligned_access(struct work_struct *work __always_unus
 static int __init vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
 {
 	schedule_on_each_cpu(check_vector_unaligned_access);
+	riscv_hwprobe_vdso_sync(RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF);
 
 	return 0;
 }
Hi Palmer,

On 2025/6/11 06:25, Palmer Dabbelt wrote:
On Wed, 28 May 2025 07:28:19 PDT (-0700), wangjingwei@iscas.ac.cn wrote:
The riscv_hwprobe vDSO data is populated by init_hwprobe_vdso_data(), an arch_initcall_sync. However, underlying data for some keys, like RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF, is determined asynchronously.
Specifically, the per_cpu(vector_misaligned_access, cpu) values are set by the vec_check_unaligned_access_speed_all_cpus kthread. This kthread is spawned by an earlier arch_initcall (check_unaligned_access_all_cpus) and may complete its benchmark *after* init_hwprobe_vdso_data() has already populated the vDSO with default/stale values.
IIUC there's another race here: we don't ensure these complete before allowing userspace to see the values, so if these took so long to probe that userspace started to make hwprobe() calls before they got scheduled, we'd be providing the wrong answer.
Unless I'm just missing something, though -- I thought we'd looked at that case?
Thanks for the review. You're right, my current patch doesn't fix the race condition with userspace.
The robust solution here is to use the kernel's `completion`. I've tested this approach: the async probing thread calls `complete()` when finished, and `init_hwprobe_vdso_data()` blocks on `wait_for_completion()`. This guarantees the vDSO data is finalized before userspace can access it.
So, refresh the vDSO data for specified keys (e.g., MISALIGNED_VECTOR_PERF) ensuring it reflects the final boot-time values.
Test by comparing vDSO and syscall results for affected keys (e.g., MISALIGNED_VECTOR_PERF), which now match their final boot-time values.
Wouldn't all the other keys we probe via workqueue be racy as well?
The completion mechanism is easily reusable. If this approach is accepted, I will then identify other potential probe keys and integrate them into this synchronization logic.
And here is my tested code:
diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 7fe0a379474ae2c6..87af186d92e75ddb 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -40,5 +40,11 @@ static inline bool riscv_hwprobe_pair_cmp(struct riscv_hwprobe *pair,
 	return pair->value == other_pair->value;
 }
-
+#ifdef CONFIG_MMU
+void riscv_hwprobe_register_async_probe(void);
+void riscv_hwprobe_complete_async_probe(void);
+#else
+inline void riscv_hwprobe_register_async_probe(void) {}
+inline void riscv_hwprobe_complete_async_probe(void) {}
+#endif
 #endif
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 0b170e18a2beba57..96ce1479e835534e 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -5,6 +5,8 @@
  * more details.
  */
 #include <linux/syscalls.h>
+#include <linux/completion.h>
+#include <linux/atomic.h>
 #include <asm/cacheflush.h>
 #include <asm/cpufeature.h>
 #include <asm/hwprobe.h>
@@ -467,6 +469,32 @@ static int do_riscv_hwprobe(struct riscv_hwprobe __user *pairs,
 
 #ifdef CONFIG_MMU
 
+/* Framework for synchronizing asynchronous boot-time probes */
+static DECLARE_COMPLETION(boot_probes_done);
+static atomic_t pending_boot_probes = ATOMIC_INIT(1);
+
+void riscv_hwprobe_register_async_probe(void)
+{
+	atomic_inc(&pending_boot_probes);
+}
+
+void riscv_hwprobe_complete_async_probe(void)
+{
+	if (atomic_dec_and_test(&pending_boot_probes))
+		complete(&boot_probes_done);
+}
+
+static void __init wait_for_all_boot_probes(void)
+{
+	if (atomic_dec_and_test(&pending_boot_probes))
+		return;
+
+	pr_info("riscv: waiting for hwprobe asynchronous probes to complete...\n");
+	wait_for_completion(&boot_probes_done);
+	pr_info("riscv: hwprobe asynchronous probes completed.\n");
+}
+
+
 static int __init init_hwprobe_vdso_data(void)
 {
 	struct vdso_arch_data *avd = vdso_k_arch_data;
@@ -474,6 +502,8 @@ static int __init init_hwprobe_vdso_data(void)
 	struct riscv_hwprobe pair;
 	int key;
 
+	wait_for_all_boot_probes();
+
 	/*
 	 * Initialize vDSO data with the answers for the "all CPUs" case, to
 	 * save a syscall in the common case.
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index ae2068425fbcd207..57e4169ab58fb9bc 100644
@@ -379,6 +380,7 @@ static void check_vector_unaligned_access(struct work_struct *work __always_unus
 static int __init vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
 {
 	schedule_on_each_cpu(check_vector_unaligned_access);
+	riscv_hwprobe_complete_async_probe();
 
 	return 0;
 }
@@ -473,8 +475,12 @@ static int __init check_unaligned_access_all_cpus(void)
 			per_cpu(vector_misaligned_access, cpu) = unaligned_vector_speed_param;
 	} else if (!check_vector_unaligned_access_emulated_all_cpus() &&
 		   IS_ENABLED(CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS)) {
-		kthread_run(vec_check_unaligned_access_speed_all_cpus,
-			    NULL, "vec_check_unaligned_access_speed_all_cpus");
+		riscv_hwprobe_register_async_probe();
+		if (IS_ERR(kthread_run(vec_check_unaligned_access_speed_all_cpus,
+				       NULL, "vec_check_unaligned_access_speed_all_cpus"))) {
+			pr_warn("Failed to create vec_unalign_check kthread\n");
+			riscv_hwprobe_complete_async_probe();
+		}
 	}
/*
Hi Jingwei, Palmer,
On 6/20/25 10:43, Jingwei Wang wrote:
Hi Palmer,

On 2025/6/11 06:25, Palmer Dabbelt wrote:
On Wed, 28 May 2025 07:28:19 PDT (-0700), wangjingwei@iscas.ac.cn wrote:
The riscv_hwprobe vDSO data is populated by init_hwprobe_vdso_data(), an arch_initcall_sync. However, underlying data for some keys, like RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF, is determined asynchronously.
Specifically, the per_cpu(vector_misaligned_access, cpu) values are set by the vec_check_unaligned_access_speed_all_cpus kthread. This kthread is spawned by an earlier arch_initcall (check_unaligned_access_all_cpus) and may complete its benchmark *after* init_hwprobe_vdso_data() has already populated the vDSO with default/stale values.
IIUC there's another race here: we don't ensure these complete before allowing userspace to see the values, so if these took so long to probe that userspace started to make hwprobe() calls before they got scheduled, we'd be providing the wrong answer.
Unless I'm just missing something, though -- I thought we'd looked at that case?
Thanks for the review. You're right, my current patch doesn't fix the race condition with userspace.
I don't think there could be a race since all initcalls are executed sequentially, meaning userspace won't be up before the arch_initcall level is finished.
But that means that this patch in its current form will probably slow down the whole boot process. To avoid that (and in addition to this patch), can we move init_hwprobe_vdso_data() to late_initcall()?
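For illustration, that could be as small as changing the initcall level in sys_hwprobe.c (just a sketch, untested):

-arch_initcall_sync(init_hwprobe_vdso_data);
+/* The vDSO copy is only consumed by userspace, so it can be populated late. */
+late_initcall(init_hwprobe_vdso_data);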
The robust solution here is to use the kernel's `completion`. I've tested this approach: the async probing thread calls `complete()` when finished, and `init_hwprobe_vdso_data()` blocks on `wait_for_completion()`. This guarantees the vDSO data is finalized before userspace can access it.
So, refresh the vDSO data for specified keys (e.g., MISALIGNED_VECTOR_PERF) ensuring it reflects the final boot-time values.
Test by comparing vDSO and syscall results for affected keys (e.g., MISALIGNED_VECTOR_PERF), which now match their final boot-time values.
Wouldn't all the other keys we probe via workqueue be racy as well?
The completion mechanism is easily reusable. If this approach is accepted, I will then identify other potential probe keys and integrate them into this synchronization logic.
Yes, I'd say that's the right way to do it; there isn't a lot of asynchronous initialization stuff, so we can handle that when new ones land.
Thanks,
Alex
And here is my tested code:
diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 7fe0a379474ae2c6..87af186d92e75ddb 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -40,5 +40,11 @@ static inline bool riscv_hwprobe_pair_cmp(struct riscv_hwprobe *pair,
 	return pair->value == other_pair->value;
 }
-
+#ifdef CONFIG_MMU
+void riscv_hwprobe_register_async_probe(void);
+void riscv_hwprobe_complete_async_probe(void);
+#else
+inline void riscv_hwprobe_register_async_probe(void) {}
+inline void riscv_hwprobe_complete_async_probe(void) {}
+#endif
 #endif
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 0b170e18a2beba57..96ce1479e835534e 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -5,6 +5,8 @@
  * more details.
  */
 #include <linux/syscalls.h>
+#include <linux/completion.h>
+#include <linux/atomic.h>
 #include <asm/cacheflush.h>
 #include <asm/cpufeature.h>
 #include <asm/hwprobe.h>
@@ -467,6 +469,32 @@ static int do_riscv_hwprobe(struct riscv_hwprobe __user *pairs,
 
 #ifdef CONFIG_MMU
 
+/* Framework for synchronizing asynchronous boot-time probes */
+static DECLARE_COMPLETION(boot_probes_done);
+static atomic_t pending_boot_probes = ATOMIC_INIT(1);
+
+void riscv_hwprobe_register_async_probe(void)
+{
+	atomic_inc(&pending_boot_probes);
+}
+
+void riscv_hwprobe_complete_async_probe(void)
+{
+	if (atomic_dec_and_test(&pending_boot_probes))
+		complete(&boot_probes_done);
+}
+
+static void __init wait_for_all_boot_probes(void)
+{
+	if (atomic_dec_and_test(&pending_boot_probes))
+		return;
+
+	pr_info("riscv: waiting for hwprobe asynchronous probes to complete...\n");
+	wait_for_completion(&boot_probes_done);
+	pr_info("riscv: hwprobe asynchronous probes completed.\n");
+}
+
+
 static int __init init_hwprobe_vdso_data(void)
 {
 	struct vdso_arch_data *avd = vdso_k_arch_data;
@@ -474,6 +502,8 @@ static int __init init_hwprobe_vdso_data(void)
 	struct riscv_hwprobe pair;
 	int key;
 
+	wait_for_all_boot_probes();
+
 	/*
 	 * Initialize vDSO data with the answers for the "all CPUs" case, to
 	 * save a syscall in the common case.
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index ae2068425fbcd207..57e4169ab58fb9bc 100644
@@ -379,6 +380,7 @@ static void check_vector_unaligned_access(struct work_struct *work __always_unus
 static int __init vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
 {
 	schedule_on_each_cpu(check_vector_unaligned_access);
+	riscv_hwprobe_complete_async_probe();
 
 	return 0;
 }
@@ -473,8 +475,12 @@ static int __init check_unaligned_access_all_cpus(void)
 			per_cpu(vector_misaligned_access, cpu) = unaligned_vector_speed_param;
 	} else if (!check_vector_unaligned_access_emulated_all_cpus() &&
 		   IS_ENABLED(CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS)) {
-		kthread_run(vec_check_unaligned_access_speed_all_cpus,
-			    NULL, "vec_check_unaligned_access_speed_all_cpus");
+		riscv_hwprobe_register_async_probe();
+		if (IS_ERR(kthread_run(vec_check_unaligned_access_speed_all_cpus,
+				       NULL, "vec_check_unaligned_access_speed_all_cpus"))) {
+			pr_warn("Failed to create vec_unalign_check kthread\n");
+			riscv_hwprobe_complete_async_probe();
+		}
 	}
/*
Hi Alexandre, Palmer,
On 2025/6/23 20:15, Alexandre Ghiti wrote:
Hi Jingwei, Palmer,
On 6/20/25 10:43, Jingwei Wang wrote:
Hi Palmer,

On 2025/6/11 06:25, Palmer Dabbelt wrote:
On Wed, 28 May 2025 07:28:19 PDT (-0700), wangjingwei@iscas.ac.cn wrote:
The riscv_hwprobe vDSO data is populated by init_hwprobe_vdso_data(), an arch_initcall_sync. However, underlying data for some keys, like RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF, is determined asynchronously.
Specifically, the per_cpu(vector_misaligned_access, cpu) values are set by the vec_check_unaligned_access_speed_all_cpus kthread. This kthread is spawned by an earlier arch_initcall (check_unaligned_access_all_cpus) and may complete its benchmark *after* init_hwprobe_vdso_data() has already populated the vDSO with default/stale values.
IIUC there's another race here: we don't ensure these complete before allowing userspace to see the values, so if these took so long to probe that userspace started to make hwprobe() calls before they got scheduled, we'd be providing the wrong answer.
Unless I'm just missing something, though -- I thought we'd looked at that case?
Thanks for the review. You're right, my current patch doesn't fix the race condition with userspace.
I don't think there could be a race since all initcalls are executed sequentially, meaning userspace won't be up before the arch_initcall level is finished.
Yes, the initcall sequence itself provides a guarantee before userspace starts. The theoretical race condition I'm concerned about comes from kthread_run, which hands off the probe's execution to the scheduler. This introduces an uncertainty in timing that we shouldn't rely on. The completion mechanism is intended to replace this timing assumption with an explicit, guaranteed synchronization point.
But that means that this patch in its current form will probably slow down the whole boot process. To avoid that (and in addition to this patch), can we move init_hwprobe_vdso_data() to late_initcall()?
That's a great point. To check the actual impact, I ran a test on a 64-core QEMU setup where I deliberately increased the probe time (from ~2ms to ~256ms by adjusting JIFFIES). Even in this worst-case scenario, the probe consistently finished during the subsystem_initcalls. This suggests the real-world slowdown from waiting is quite minimal.
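For anyone who wants to reproduce the comparison, something along these lines should work (a rough sketch, not the exact test program; it assumes recent uapi headers that define RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF and a glibc new enough to export __riscv_hwprobe() from <sys/hwprobe.h>, which goes through the vDSO entry when the kernel provides one):

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/hwprobe.h>	/* __riscv_hwprobe(), struct riscv_hwprobe */
#include <asm/hwprobe.h>	/* RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF */

int main(void)
{
	struct riscv_hwprobe vdso_pair = {
		.key = RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF,
	};
	struct riscv_hwprobe sys_pair = vdso_pair;

	/* Fast path: the glibc wrapper reads the vDSO data page when available. */
	if (__riscv_hwprobe(&vdso_pair, 1, 0, NULL, 0) != 0)
		return 1;

	/* Slow path: force the real syscall, bypassing the vDSO data page. */
	if (syscall(__NR_riscv_hwprobe, &sys_pair, 1, 0, NULL, 0) != 0)
		return 1;

	printf("vdso=%llu syscall=%llu -> %s\n",
	       (unsigned long long)vdso_pair.value,
	       (unsigned long long)sys_pair.value,
	       vdso_pair.value == sys_pair.value ? "match" : "MISMATCH");
	return 0;
}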
However, I completely agree with your suggestion. Combining completion with a later initcall is the ideal solution to guarantee both correctness and performance. Since the vDSO data is only consumed by userspace, moving its initialization to late_initcall is perfectly safe and makes the most sense.
The robust solution here is to use the kernel's `completion`. I've tested this approach: the async probing thread calls `complete()` when finished, and `init_hwprobe_vdso_data()` blocks on `wait_for_completion()`. This guarantees the vDSO data is finalized before userspace can access it.
So, refresh the vDSO data for specified keys (e.g., MISALIGNED_VECTOR_PERF) ensuring it reflects the final boot-time values.
Test by comparing vDSO and syscall results for affected keys (e.g., MISALIGNED_VECTOR_PERF), which now match their final boot-time values.
Wouldn't all the other keys we probe via workqueue be racy as well?
The completion mechanism is easily reusable. If this approach is accepted, I will then identify other potential probe keys and integrate them into this synchronization logic.
Yes, I'd say that's the right way to do it; there isn't a lot of asynchronous initialization stuff, so we can handle that when new ones land.
Thank you. I'll prepare a new version of the patch that implements this completion + late_initcall approach. I've also had a look, and for now, the asynchronous probe for vector misalignment seems to be a special case. The completion framework can be easily extended if similar ones land in the future.
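For the record, a future asynchronous probe would only need to bracket its work with the two helpers. The snippet below is purely hypothetical (the function and thread names are invented for illustration):

#include <linux/err.h>
#include <linux/kthread.h>
#include <asm/hwprobe.h>

static int some_future_probe_fn(void *unused)
{
	/* ...measure and fill in the per-CPU data for the new key... */
	riscv_hwprobe_complete_async_probe();
	return 0;
}

static int __init some_future_probe_init(void)
{
	/* Register before spawning the worker... */
	riscv_hwprobe_register_async_probe();
	/* ...and complete on the error path too, so init_hwprobe_vdso_data() cannot block forever. */
	if (IS_ERR(kthread_run(some_future_probe_fn, NULL, "some_future_probe")))
		riscv_hwprobe_complete_async_probe();
	return 0;
}
arch_initcall(some_future_probe_init);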
Thanks,
Jingwei