On Mon, Aug 26 2024 at 10:01, Christophe Leroy wrote:
On 26/08/2024 at 09:50, Jason A. Donenfeld wrote:
But tglx pointed out in that thread that this actually isn't necessary:
| All of this is pointless because if a 32-bit application runs on a
| 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
| we need magic here for a 32-bit kernel?
|
| Just use u64 for both and spare all this voodoo. We're seriously not
| "optimizing" for 32-bit kernels.
|
| All what happens on a 32-bit kernel is that the RNG will store the
| unsigned long (32bit) generation into a 64bit variable:
|
|	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
|
| As the upper 32bit are always zero, there is no issue vs. load store
| tearing at all. So there is zero benefit for this aside of slightly
| "better" user space code when running on a 32-bit kernel. Who cares?
So I just got rid of it and used a u64 as he suggested.
However, there's also an additional reason why it's not worth churning further over this - because VM_DROPPABLE is 64-bit only (due to flags in vma bits), likely so is vDSO getrandom() for the time being. So I think it makes more sense to retool this series to be ppc64, and then if you really really want 32-bit and can convince folks it matters, then all of these parts (for example, here, the fact that the smp helper doesn't want to tear) can be fixed up in a separate series.
So yes I really really want it on ppc32 because this is the only type of boards I have and this is really where we need getrandom() to be optimised,
For nostalgic reasons?
indeed ppc64 was the cherry on the cake in my series, I just added it because it was easy to do after doing ppc32.
The rng problem for ppc32 seems to be:
smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
right?
Your proposed type change creates inconsistency for 32-bit userspace running on 64-bit kernels because the data struct layout changes.
As explained before, there is no problem with store or load tearing on 32bit systems because the generation counter is only 32bit wide. So the obvious solution is to only update 32 bits on a 32bit kernel:
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -282,7 +282,7 @@ static void crng_reseed(struct work_stru
 	 * is ordered with the write above to base_crng.generation. Pairs with
 	 * the smp_rmb() before the syscall in the vDSO code.
 	 */
-	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
+	smp_store_release((unsigned long *)&_vdso_rng_data.generation, next_gen + 1);
 #endif
 	if (!static_branch_likely(&crng_is_ready))
 		crng_init = CRNG_READY;
Which is perfectly fine on 32-bit independent of endianness because the user space side does READ_ONCE(data->generation) and the read value is solely used for comparison so it does not matter at all whether the generation information is in the upper or the lower 32bit of the u64.
No?
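
To illustrate the endianness argument, here is a small user-space sketch, not kernel code. It assumes a hypothetical struct fake_rng_data standing in for the vDSO data, uses memcpy() to model a 32-bit kernel writing only one half of the u64, and uses a plain load where the real vDSO code would use READ_ONCE(). The helper names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the vDSO data; only the field of interest. */
struct fake_rng_data {
	uint64_t generation;
};

/*
 * Model a 32-bit kernel's store: only 4 of the 8 bytes are written.
 * half == 0 writes the first 4 bytes in memory, half == 1 the last 4,
 * so both possible placements within the u64 can be exercised on one host.
 */
static void store_gen32(struct fake_rng_data *d, uint32_t gen, int half)
{
	assert(half == 0 || half == 1);
	memcpy((unsigned char *)&d->generation + 4 * half, &gen, sizeof(gen));
}

/* Model the user-space side: read the full 64 bits (READ_ONCE() in the vDSO). */
static uint64_t read_gen(const struct fake_rng_data *d)
{
	return d->generation;
}

/*
 * Return 1 if every bump of the 32-bit counter is observed as a change
 * by a reader that only compares the full 64-bit value, else 0.
 */
static int changes_detected(int half)
{
	struct fake_rng_data d = { 0 };
	uint64_t seen = read_gen(&d);
	uint32_t gen;

	for (gen = 1; gen <= 1000; gen++) {
		uint64_t now;

		store_gen32(&d, gen, half);
		now = read_gen(&d);
		if (now == seen)
			return 0;
		seen = now;
	}
	return 1;
}
```

Calling changes_detected(0) and changes_detected(1) covers both placements of the 32-bit value inside the u64. The reader never interprets the number, it only checks whether it differs from the last value it saw, which is why the placement does not matter.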
But that's a trivial fix compared to making VM_DROPPABLE work on 32-bit correctly. :)
Thanks,
tglx