On Sat, Oct 18, 2025 at 12:07:58AM +0800, Tzung-Bi Shih wrote:
From this perspective your example is malformed. Resources should not become revoked concurrently *while a driver is bound*. The driver should be unbound, call misc_deregister_sync()/etc, and return from remove() guaranteeing it no longer touches any resources.
Imagining:
- Driver X provides the res1.
- Driver Y provides the res2.
- Driver Z provides the chardev /dev/zzz. The file operations use res1 and res2.
- A userspace program opened /dev/zzz.
Yes, then you have a mess and it is pretty hard to deal with.
We don't usually build things like that, and I'm not aware of any cases where killing the whole char dev is the right answer. Usually it's something like 'res1' is critical and 'res2' is discovered optionally.
Making up elaborate fictional use cases is not a way to justify an over complex design.
If it ends up call misc_deregister_sync(), should the synchronous function wait for the userspace program to close the FD?
Some subsystems do this, it isn't great.
The design behind revocable: the driver X waits via synchronize_srcu(), and then it is free to go. The subsequent accesses to res1 will get NULL, and should fail gracefully.
Yes, I understand what it is for, I am saying it is not required here.
For this specific cros_ec driver it's "res" is this:
struct cros_ec_dev *ec = dev_get_drvdata(pdev->dev.parent); struct cros_ec_platform *ec_platform = dev_get_platdata(ec->dev);
In fact, no, the "res" we are concerning is struct cros_ec_device, e.g. [1]. (I knew the naming cros_ec_device vs. cros_ec_dev is somehowg easy to confuse.)
struct cros_ec_dev *ec = dev_get_drvdata(mdev->parent); struct cros_ec_device *ec_dev = ec->ec_dev;
It's all the same stuff. The parent needs to ensure it remains bound only while it's ec->ec_dev is valid. That is its job.
This is already properly lifetime controlled!
It *HAS* to be, and even your patches are assuming it by blindly reaching into the parent's memory!
- misc->rps[0] = ec->ec_dev->revocable_provider;
If the parent driver has been racily unbound at this point the ec->ec_dev is already a UAF!
Not really, it uses the fact that the caller is from probe(). I think the driver can't be unbound when it is still in probe().
Right, but that's my point you are already relying on driver binding lifetime rules to make your access valid. You should continue to rely on that and fix the lack of synchronous remove to fix the bug.
To be clear, I'm using misc as an example which is also the real crash we observed. If the file operations use other resources provided by a hot-pluggable device, it'd need to use revocable APIs to prevent the UAFs.
I understand, but it is a very poor use of this new construct. Come with a driver that actually has two resources and needs something so complicated first.
Improve miscdev as Laurent suggested to fix this specific driver bug.
Do not mess up char dev by trying to link it to lists of recovable and build some wild auto-revoking thing nobody needs.
Jason