Hi,
On 05.02.21 14:05, Jason Gunthorpe wrote:
Commit fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>") already introduced function tpm_devs_release() to release the extra reference but did not implement the required put on chip->devs that results in the call of this function.
Seems wonky, the devs is just supposed to be a side thing, nothing should be using it as a primary reference count for a tpm.
The bug here is only that tpm_common_open() did not get a kref on the chip before putting it in priv and linking it to the fd. See the comment before tpm_try_get_ops() indicating the caller must already have taken care to ensure the chip is valid.
This should be all you need to fix the oops:
diff --git a/drivers/char/tpm/tpm-dev-common.c b/drivers/char/tpm/tpm-dev-common.c index 1784530b8387bb..1b738dca7fffb5 100644 --- a/drivers/char/tpm/tpm-dev-common.c +++ b/drivers/char/tpm/tpm-dev-common.c @@ -105,6 +105,7 @@ static void tpm_timeout_work(struct work_struct *work) void tpm_common_open(struct file *file, struct tpm_chip *chip, struct file_priv *priv, struct tpm_space *space) {
get_device(&priv->chip.dev); priv->chip = chip; priv->space = space; priv->response_read = true;
This is racy, isnt it? The time between we open the file and we want to grab the reference in common_open() the chip can already be unregistered and freed.
As a matter of fact this solution was the first thing that came into my mind, too, until I noticed the possible race condition. I can only guess that this was what James had in mind when he chose to take the extra reference to chip->dev in tpm_chip_alloc() instead of common_open().
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c index ddaeceb..3ace199 100644 +++ b/drivers/char/tpm/tpm-chip.c @@ -360,8 +360,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev, * while cdevs is in use. The corresponding put * is in the tpm_devs_release (TPM2 only) */
- if (chip->flags & TPM_CHIP_FLAG_TPM2)
get_device(&chip->dev);
- get_device(&chip->dev);
if (chip->dev_num == 0) chip->dev.devt = MKDEV(MISC_MAJOR, TPM_MINOR); @@ -422,8 +421,21 @@ struct tpm_chip *tpmm_chip_alloc(struct device *pdev, rc = devm_add_action_or_reset(pdev, (void (*)(void *)) put_device, &chip->dev);
- if (rc)
- if (rc) {
return ERR_PTR(rc);put_device(&chip->devs);
This isn't right read what 'or_reset' does
In case of failure installing the action handler devm_add_action_or_reset() puts chip->dev for us. But we also have put chip->devs since we have retrieved a reference to both chip->dev and chip->devs. Or do I miss something here?
Jason
Regards, Lino