On Mon, 2020-04-06 at 17:51 +0530, Sumit Garg wrote:
A race condition leading to a kernel crash is observed during invocation of ieee80211_register_hw() on a dragonboard410c device having wcn36xx driver built as a loadable module along with a wifi manager in user-space waiting for a wifi device (wlanX) to be active.
Sequence diagram for a particular kernel crash scenario:
user-space ieee80211_register_hw() RX IRQ +++++++++++++++++++++++++++++++++++++++++++++ | | | |<---wlan0---wiphy_register() | |----start wlan0---->| | | |<---IRQ---(RX packet) | Kernel crash | | due to unallocated | | workqueue. | | | | | alloc_ordered_workqueue() | | | | | Misc wiphy init. | | | | | ieee80211_if_add() | | | |
As evident from above sequence diagram, this race condition isn't specific to a particular wifi driver but rather the initialization sequence in ieee80211_register_hw() needs to be fixed.
Indeed, oops.
So re-order the initialization sequence and the updated sequence diagram would look like:
user-space ieee80211_register_hw() RX IRQ +++++++++++++++++++++++++++++++++++++++++++++ | | | | alloc_ordered_workqueue() | | | | | Misc wiphy init. | | | | |<---wlan0---wiphy_register() | |----start wlan0---->| | | |<---IRQ---(RX packet) | | | | ieee80211_if_add() | | | |
Makes sense.
@@ -1254,6 +1250,14 @@ int ieee80211_register_hw(struct ieee80211_hw *hw) local->sband_allocated |= BIT(band); }
- rtnl_unlock();
- result = wiphy_register(local->hw.wiphy);
- if (result < 0)
goto fail_wiphy_register;
- rtnl_lock();
I'm a bit worried about this unlock/relock here though.
I think we only need the RTNL for the call to ieee80211_init_rate_ctrl_alg() and then later ieee80211_if_add(), so perhaps we can move that a little closer?
All the stuff between is really just setting up local stuff, so doesn't really need to worry?
johannes