Tetsuo Handa wrote:
On 2024/07/13 8:49, Dan Williams wrote:
- /* Synchronize with dev_uevent() */
- synchronize_rcu();
this synchronize_rcu(), in order to make sure that READ_ONCE(dev->driver) in dev_uevent() observes NULL?
No, this synchronize_rcu() is to make sure that if dev_uevent() wins the race and observes that dev->driver is not NULL, it is still safe to dereference that result because the 'struct device_driver' object is still live.
I don't understand what the pair of rcu_read_lock()/rcu_read_unlock() in dev_uevent() and synchronize_rcu() in module_remove_driver() is for.
It is to extend the lifetime of @driver if dev_uevent() observes non-NULL @dev->driver.
I think that the below race is possible. Please explain how "/* Synchronize with module_remove_driver() */" works.
It is for this race:
Thread1:                                Thread2:
dev_uevent(...)                         delete_module()
  driver = dev->driver;                   mod->exit()
  if (driver)                               driver_unregister()
                                              driver_detach() // <-- @dev->driver marked NULL
                                              module_remove_driver()
                                          free_module() // <-- @driver object destroyed
    add_uevent_var(env, "DRIVER=%s", driver->name); // <-- use after free of @driver
If driver_detach() happens before Thread1 reads dev->driver, then there is no use-after-free risk.
The previous attempt to fix this held device_lock() over dev_uevent(), which prevents driver_detach() from even starting. That approach causes lockdep issues and is even more heavy-handed than the synchronize_rcu() delay. RCU makes sure that @driver stays alive between reading @dev->driver and reading @driver->name.
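
To make the ordering concrete, here is a minimal single-threaded userspace model of that lifetime guarantee. It is only a sketch and not the kernel code: every model_* name and replay_race() is hypothetical, a plain reader counter stands in for rcu_read_lock()/rcu_read_unlock(), and model_try_free() stands in for the point after synchronize_rcu() by refusing to free while a reader is active (the real call blocks, and it waits only for readers that started before it was called).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for struct device_driver and struct device. */
struct model_driver { char name[16]; };
struct model_device { struct model_driver *driver; };

/* Counts readers inside a read-side section (models rcu_read_lock()). */
static int model_readers;

static void model_read_lock(void)   { model_readers++; }
static void model_read_unlock(void) { model_readers--; }

/*
 * Models the point after synchronize_rcu(): the driver may be freed only
 * once no reader is still running. The real call blocks; this model just
 * reports failure and lets the caller retry later.
 */
static bool model_try_free(struct model_driver **drv)
{
	if (model_readers != 0)
		return false;
	free(*drv);
	*drv = NULL;
	return true;
}

/* Replays the Thread1/Thread2 interleaving; true if the read was safe. */
static bool replay_race(char *out, size_t len)
{
	struct model_driver *drv = malloc(sizeof(*drv));
	struct model_device dev = { .driver = drv };
	struct model_driver *seen;
	bool freed;

	if (!drv)
		return false;
	strcpy(drv->name, "demo");

	model_read_lock();            /* Thread1: dev_uevent() starts */
	seen = dev.driver;            /* Thread1: observes non-NULL */

	dev.driver = NULL;            /* Thread2: driver_detach() */
	freed = model_try_free(&drv); /* Thread2: refused, reader active */

	if (seen && !freed)           /* Thread1: @driver is still live */
		snprintf(out, len, "DRIVER=%s", seen->name);
	model_read_unlock();          /* Thread1: done with @driver */

	if (freed)
		return false;
	return model_try_free(&drv);  /* Thread2: no readers left, free */
}
```

Calling replay_race() with a 32-byte buffer fills it with the DRIVER= string and returns true, because the free is deferred until the reader has left its read-side section. Dropping the model_read_lock()/model_read_unlock() pair lets the first model_try_free() succeed while Thread1 still holds the pointer, which is the use after free in the diagram.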