Hi, this is your Linux kernel regression tracker. Top-posting for once, to make this easily accessible to everyone.
Jarkko (or James), what is needed to get this regression resolved? More people showed up that are apparently affected by this. Sure, 6.2 is out, but it's a regression in 6.1, so it would be good to fix it sooner rather than later. Ideally this week, if you ask me.
FWIW, the latest version of this patch is here, but it hasn't received any replies since it was posted last Tuesday (and the mail quoted below is just one day younger):
https://lore.kernel.org/all/20230220180729.23862-1-mario.limonciello@amd.com...
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
#regzbot poke
On 22.02.23 00:10, Limonciello, Mario wrote:
[Public]
-----Original Message-----
From: Jarkko Sakkinen <jarkko@kernel.org>
Sent: Tuesday, February 21, 2023 16:53
To: Limonciello, Mario <Mario.Limonciello@amd.com>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>; James Bottomley <James.Bottomley@hansenpartnership.com>; Jason@zx2c4.com; linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org; stable@vger.kernel.org; Linus Torvalds <torvalds@linux-foundation.org>; Linux kernel regressions list <regressions@lists.linux.dev>
Subject: Re: [PATCH 1/1] tpm: disable hwrng for fTPM on some AMD designs
On Fri, Feb 17, 2023 at 08:25:56PM -0600, Limonciello, Mario wrote:
On 2/17/2023 16:05, Jarkko Sakkinen wrote:
Perhaps tpm_amd_* ?
When Jason first proposed this patch, I feel the intent was that it could cover multiple deficiencies. But as this is the only one for now, sure, renaming it is fine.
Also, just a question: is there any legit use for fTPMs that are not updated? I.e., why would one want tpm_crb to initialize with dysfunctional firmware? I.e., the existential question is: is it better to work around the issue and let it pass through, or to make the user aware that the firmware really needs an update?
On 2/17/2023 16:35, Jarkko Sakkinen wrote:
Hmm, no reply since Mario posted this.
Jarkko, James, what's your stance on this? Does the patch look fine from your point of view? And does the situation justify merging this on the last minute for 6.2? Or should we merge it early for 6.3 and then backport to stable?
Ciao, Thorsten
As I stated in an earlier response: do we want to forbid tpm_crb in this case, or do we want to pass through with faulty firmware?
I'm not weighing either choice here; I just don't see any motivating points in the commit message for picking either, that's all.
BR, Jarkko
Even if you're not using RNG functionality you can still do plenty of other things with the TPM. The RNG functionality is what tripped up this issue though. All of these issues were only raised because the kernel started using it by default for RNG and userspace wants random numbers all the time.
If the firmware was easily updatable from all the OEMs I would lean on trying to encourage people to update. But alas this has been available for over a year and a sizable number of OEMs haven't distributed a fix.
The major issue I see with forbidding tpm_crb is that users may have been using the fTPM for something and taking it away in an update could lead to a no-boot scenario if they're (for example) tying a policy to PCR values and can no longer access those.
If the consensus were to go that direction instead I would want to see a module parameter that lets users turn on the fTPM even knowing this problem exists so they could recover. That all seems pretty expensive to me for this problem.
I agree with the last argument.
FYI, I did send out a v2 and folded in this argument to the commit message and adjusted for your feedback. You might not have found it in your inbox yet.
I re-read the commit message and https://www.amd.com/en/support/kb/faq/pa-410.
Why is this scoped down to only the RNG? Should TPM2_CC_GET_RANDOM also be blocked from /dev/tpm0?
The only reason that this commit was created is because the kernel utilized the fTPM for hwrng which triggered the problem. If that never happened this probably wouldn't have been exposed either.
Yes; I would agree that if someone were to do other fTPM operations over an extended period of time, it's plausible they could cause the problem too.
But picking and choosing functionality to block seems quite arbitrary to me.