On 7/28/2023 4:38 PM, Linus Torvalds wrote:
On Fri, 28 Jul 2023 at 14:01, Limonciello, Mario <mario.limonciello@amd.com> wrote:
That's exactly why I was asking in the kernel bugzilla if something similar gets tripped up by RDRAND.
So that would sound very unlikely, but who knows... Microcode can obviously do pretty much anything at all, but at least the original fTPM issues _seemed_ to be about BIOS doing truly crazy things like SPI flash accesses.
I can easily imagine a BIOS fTPM code using some absolutely horrid global "EFI synchronization" lock or whatever, which could then cause random problems just based on some entirely unrelated activity.
I would not be surprised, for example, if it wasn't the fTPM hwrng code itself that decided to read some random number from SPI, but that it simply got serialized with something else that the BIOS was involved with. It's not like BIOS people are famous for their scalable code that is entirely parallel...
And I'd be _very_ surprised if CPU microcode does anything even remotely like that. Not impossible - HP famously screwed with the time stamp counter with SMIs, and I could imagine them - or others - doing the same with rdrand.
But it does sound pretty damn unlikely, compared to "EFI BIOS uses a one big lock approach".
So rdrand (and rdseed in particular) can be rather slow, but I think we're talking hundreds of CPU cycles (maybe low thousands). Nothing like the stuttering reports we've seen from fTPM.
Linus
Your theory sounds totally plausible, and it would explain why this system is tripping similar behavior even though it has the fixes from the original issue.
Given that RDRAND is on the same SoC, I think there's a pretty good argument to stop feeding the hwrng entropy from *anything* that's not a dTPM.
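
As a rough illustration of the "one big lock" theory (a toy sketch, not a model of any real BIOS or fTPM code), here is how a cheap operation can stall for as long as an unrelated slow one when both take the same global lock. All names (big_lock, slow_flash_access, fast_rng_read, measure) are hypothetical, and the sleep merely stands in for something slow like an SPI flash access.

```python
import threading
import time

# Hypothetical firmware-wide lock shared by unrelated services.
big_lock = threading.Lock()


def slow_flash_access(hold_s, started):
    """Stand-in for a slow operation (e.g. SPI flash) done under the big lock."""
    with big_lock:
        started.set()        # signal that the lock is now held
        time.sleep(hold_s)   # simulated slow work while holding the lock


def fast_rng_read():
    """Stand-in for a cheap RNG read; returns how long the call took."""
    t0 = time.perf_counter()
    with big_lock:           # the read itself is trivial, the wait is not
        _value = 4
    return time.perf_counter() - t0


def measure(hold_s=0.2):
    """Return (uncontended, contended) latency of fast_rng_read in seconds."""
    uncontended = fast_rng_read()

    started = threading.Event()
    worker = threading.Thread(target=slow_flash_access, args=(hold_s, started))
    worker.start()
    started.wait()           # make sure the slow path owns the lock first
    contended = fast_rng_read()
    worker.join()
    return uncontended, contended


if __name__ == "__main__":
    u, c = measure()
    print(f"uncontended: {u * 1e3:.3f} ms, contended: {c * 1e3:.3f} ms")
```

The fast path does nothing slow itself; its latency is dictated entirely by whoever else happens to hold the lock, which is the kind of stutter that would show up regardless of fixes to the RNG code path.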