On Wed Sep 25, 2024 at 12:51 AM EEST, James Bottomley wrote:
On Wed, 2024-09-25 at 00:35 +0300, Jarkko Sakkinen wrote:
On Tue Sep 24, 2024 at 9:40 PM EEST, James Bottomley wrote:
On Tue, 2024-09-24 at 21:07 +0300, Jarkko Sakkinen wrote:
On Tue Sep 24, 2024 at 4:43 PM EEST, James Bottomley wrote:
On Sat, 2024-09-21 at 15:08 +0300, Jarkko Sakkinen wrote:
Instead of flushing and reloading the auth session for every single transaction, keep the session open unless /dev/tpm0 is used. In practice this means applying TPM2_SA_CONTINUE_SESSION to the session attributes. Always flush the session when /dev/tpm0 is written.
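The policy in the description above can be sketched as a small state model. This is only an illustration under stated assumptions: struct toy_chip and the toy_* helpers are hypothetical names, not the kernel's data structures or functions, and the flag is redefined locally (continueSession is bit 0 of the TPM 2.0 session attributes, which the kernel names TPM2_SA_CONTINUE_SESSION).

```c
#include <stdbool.h>
#include <stdint.h>

/* continueSession is bit 0 of TPMA_SESSION; redefined here so the
 * sketch builds standalone (the kernel calls it TPM2_SA_CONTINUE_SESSION). */
#define SA_CONTINUE_SESSION (1u << 0)

/* Hypothetical toy model, not the kernel's struct tpm_chip. */
struct toy_chip {
	bool session_open;    /* is an auth session resident in the TPM? */
	unsigned int starts;  /* number of sessions started */
	unsigned int flushes; /* number of explicit flushes */
};

/* In-kernel command: reuse the resident session, start one if none exists. */
static void toy_kernel_cmd(struct toy_chip *chip, uint8_t attrs)
{
	if (!chip->session_open) {
		chip->starts++;
		chip->session_open = true;
	}

	/* ... the session-protected command would be sent here ... */

	/* Without continueSession the TPM drops the session once the
	 * command succeeds, so the next command has to start a new one. */
	if (!(attrs & SA_CONTINUE_SESSION))
		chip->session_open = false;
}

/* Write through /dev/tpm0: always flush the kernel's session first. */
static void toy_user_write(struct toy_chip *chip)
{
	if (chip->session_open) {
		chip->flushes++;
		chip->session_open = false;
	}
}
```

In this model, two consecutive toy_kernel_cmd() calls with SA_CONTINUE_SESSION set start only one session, and a toy_user_write() in between forces the next kernel command to start a fresh one.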
Patch looks fine but this description is way too terse to explain how it works.
I would suggest:
Boot time elongation as a result of adding sessions has been reported as an issue in https://bugzilla.kernel.org/show_bug.cgi?id=219229
The root cause is the addition of session overhead to tpm2_pcr_extend(). This overhead can be reduced by not creating and destroying a session for each invocation of the function. Do this by keeping a session resident in the TPM for reuse by any session-based TPM command. The current flow of TPM commands in the kernel supports this: tpm2_end_session() is only called on TPM errors, because most commands don't continue the session and expect it to be flushed on success. Thus we can add the continue session flag at session creation to ensure the session won't be flushed except on error, which is a rare case.
I need to disagree on this, as I don't even have PCR extends in my boot sequence and it still adds overhead. Have you verified this with the reporter?
There's a bunch of things that use the auth session, like trusted keys. Claiming that PCR extend is the reason is nonsense.
Well, the bug report does say it's the commit adding sessions to the PCR extends that causes the delay:
https://bugzilla.kernel.org/show_bug.cgi?id=219229#c5
I don't know what else to tell you.
As far as I've tested this bug, I've been able to generate similar costs with anything using HMAC encryption. The PCR extend op itself should have the same cost with or without encryption AFAIK.
That's true, but the only significant TPM operation in the secure boot path is the PCR extend for IMA. The RNG stuff is there a bit, but there are other significant delays in seeding the entropy pool. During boot with IMA enabled, you can do hundreds of binary measurements, hence the slowdown.
I guess I need to provide benchmarks on this to prove that PCR extend is not the only site that is affected.
Well, on the per-operation figures, it's obviously not: a standard TPM operation gets a significant overhead because of sessions. However, it is the only site that causes a large boot slowdown, because of the number of measurements IMA does on boot.
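To make the scaling argument concrete, here is a deliberately simplified count of the extra TPM commands needed for n session-protected operations. The function names are hypothetical, and the model charges only one command per session start; real session setup costs more than that, so treat it as a lower bound rather than a measurement.

```c
#include <stdint.h>

/* Hypothetical model: extra TPM commands sent on top of the n
 * operations themselves. */

/* A new session per operation: one extra session start every time. */
static uint64_t toy_extra_per_op(uint64_t n)
{
	return n;
}

/* One resident session reused: a single start, if anything runs at all. */
static uint64_t toy_extra_resident(uint64_t n)
{
	return n > 0 ? 1 : 0;
}
```

With hundreds of IMA measurements the first figure grows linearly with n while the second stays constant, which is why the per-operation overhead only shows up as a large boot delay on the PCR extend path.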
Fair enough. I can buy this.
I'll phrase it so that (since it was mentioned in the bugzilla comment) in the bug in question the root cause is in PCR extend, but since in my own tests I got overhead from trusted keys, I'll also mention that it affects those and tpm2_get_random() as well.
Regards,
James
BR, Jarkko