Currently frame registrations are not purged, even when changing the interface type. This can lead to potentially weird / dangerous situations where frames possibly not relevant to a given interface type remain registered and mgmt_frame_register is not called for the no-longer-relevant frame types.
The kernel currently relies on userspace apps to actually purge the registrations themselves, e.g. by closing the nl80211 socket associated with those frames. However, this requires multiple nl80211 sockets to be open by the userspace app, and for userspace to be aware of all state changes. This is not something that the kernel should rely on.
This commit adds a call to cfg80211_mlme_purge_registrations() to forcefully remove any registrations left over prior to switching the iftype.
Cc: stable@vger.kernel.org
Signed-off-by: Denis Kenzior denkenz@gmail.com --- net/wireless/util.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/wireless/util.c b/net/wireless/util.c index c99939067bb0..3fa092b78e62 100644 --- a/net/wireless/util.c +++ b/net/wireless/util.c @@ -964,6 +964,7 @@ int cfg80211_change_iface(struct cfg80211_registered_device *rdev, }
cfg80211_process_rdev_events(rdev); + cfg80211_mlme_purge_registrations(dev->ieee80211_ptr); }
err = rdev_change_virtual_intf(rdev, dev, ntype, params);
On Wed, 2019-08-28 at 16:11 -0500, Denis Kenzior wrote:
Currently frame registrations are not purged, even when changing the interface type. This can lead to potentially weird / dangerous situations where frames possibly not relevant to a given interface type remain registered and mgmt_frame_register is not called for the no-longer-relevant frame types.
I'd argue really just "weird and non-working", hardly dangerous. Even in the mac80211 design where we want to not let you intercept e.g. AUTH frames in client mode - if you did, then you'd just end up with a non- working interface. Not sure I see any "dangerous situation". Not really an all that important distinction though.
Depending on the design, it may also just be that those registrations are *ignored*, because e.g. firmware intercepts the AUTH frame already, which would just (maybe) confuse userspace - but that seems unlikely since it switched interface type and has no real need for those frames then.
The kernel currently relies on userspace apps to actually purge the registrations themselves, e.g. by closing the nl80211 socket associated with those frames. However, this requires multiple nl80211 sockets to be open by the userspace app, and for userspace to be aware of all state changes. This is not something that the kernel should rely on.
I tend to agree with that the kernel shouldn't rely on it.
This commit adds a call to cfg80211_mlme_purge_registrations() to forcefully remove any registrations left over prior to switching the iftype.
However, I do wonder if we should make this more transactional, and hang on to them if the type switching fails. We're not notifying userspace that the registrations have disappeared, so if type switching fails and it continues to work with the old type rather than throwing its hands up and quitting or something, it'd make a possibly bigger mess to just silently have removed them already.
I *think* it should be safe to just move this after the switching succeeds, since the switching can pretty much only be done at a point where nothing is happening on the interface anyway, though that might confuse the driver when the remove happens.
Also, perhaps it'd be better to actually hang on to those registrations that *are* still possible afterwards? But to not confuse the driver I guess that might require unregister/re-register to happen, all of which requires hanging on to the list and going through it after the type switch completed?
What do you think?
johannes
Hi Johannes,
On 8/30/19 3:53 AM, Johannes Berg wrote:
On Wed, 2019-08-28 at 16:11 -0500, Denis Kenzior wrote:
Currently frame registrations are not purged, even when changing the interface type. This can lead to potentially weird / dangerous situations where frames possibly not relevant to a given interface type remain registered and mgmt_frame_register is not called for the no-longer-relevant frame types.
I'd argue really just "weird and non-working", hardly dangerous. Even in the mac80211 design where we want to not let you intercept e.g. AUTH frames in client mode - if you did, then you'd just end up with a non- working interface. Not sure I see any "dangerous situation". Not really an all that important distinction though.
Fair enough, I'm happy to drop / reword this language. It seemed fishy to me since the unregistration operation was not called at all, and the driver does go to some lengths to set up the valid frame registration types.
Depending on the design, it may also just be that those registrations are *ignored*, because e.g. firmware intercepts the AUTH frame already, which would just (maybe) confuse userspace - but that seems unlikely since it switched interface type and has no real need for those frames then.
There might be corner cases where userspace gets confused and doesn't update the frame registrations properly. For example, wpa_s/hostap does not listen to SET_INTERFACE events that I can tell. So if some external app sets the mode (particularly on a 'live' interface) then all kinds of unexpected things might happen. This is one of the motivations for restricting certain NL80211 commands to interface SOCKET_OWNER.
So really this patch is intended more as a hot-fix / backport to stable to make sure the older kernels can deal with some of these situations.
The kernel currently relies on userspace apps to actually purge the registrations themselves, e.g. by closing the nl80211 socket associated with those frames. However, this requires multiple nl80211 sockets to be open by the userspace app, and for userspace to be aware of all state changes. This is not something that the kernel should rely on.
I tend to agree with that the kernel shouldn't rely on it.
This commit adds a call to cfg80211_mlme_purge_registrations() to forcefully remove any registrations left over prior to switching the iftype.
However, I do wonder if we should make this more transactional, and hang on to them if the type switching fails. We're not notifying userspace that the registrations have disappeared, so if type switching fails and it continues to work with the old type rather than throwing its hands up and quitting or something, it'd make a possibly bigger mess to just silently have removed them already.
I do like that idea, not sure how to go about implementing it though? The failure case is a bit hard to deal with. Something like NL80211_EXT_FEATURE_LIVE_IFTYPE_CHANGE would help, particularly if nl80211/cfg80211 actually checked it prior to doing anything (e.g. disconnecting, etc). That would then take care of the majority of the 'typical' failure paths. I didn't add such checking in the other patch set since I felt you might find it overly intrusive on userspace. But maybe we really should do this?
So playing devil's advocate, another argument might be that by the time we got here, we've already tore down a bunch of state. E.g. disconnected the station, stopped AP, etc. So we've already side-effected state in a bunch of ways, what's one more?
I *think* it should be safe to just move this after the switching succeeds, since the switching can pretty much only be done at a point where nothing is happening on the interface anyway, though that might confuse the driver when the remove happens.
I would concur as that is what happens today. But should it?
Also, perhaps it'd be better to actually hang on to those registrations that *are* still possible afterwards? But to not confuse the driver I guess that might require unregister/re-register to happen, all of which requires hanging on to the list and going through it after the type switch completed?
Yes, I had those exact thoughts as well.
It isn't currently clear to me if there are any guarantees on the driver operation call sequence that cfg80211 provides. E.g. can the driver expect rdev_change_virtual_intf to be called only once all the old registrations are purged and the new registrations are performed after the fact? Or should it expect things to just happen in any order?
What do you think?
A big part of me thinks that just wiping the slate clean and having userspace set it up from scratch isn't that much to ask and it might want to do that anyway. It might (a big maybe?) also make the driver's life easier if it can rely on certain guarantees from cfg80211. E.g. that all invalid registrations are purged.
I have seen wpa_s perform a bunch of register commands which bounce off with an -EALREADY. So it may already be erring on the side of caution and assuming that it needs to reset the state fully? Not sure.
But if the kernel wants to be nice and spends some cycles figuring out which frame registrations to keep and re-register them, that is also fine with me.
Regards, -Denis
Hi,
On Fri, 2019-08-30 at 01:32 -0500, Denis Kenzior wrote:
Hi Johannes,
On 8/30/19 3:53 AM, Johannes Berg wrote:
On Wed, 2019-08-28 at 16:11 -0500, Denis Kenzior wrote:
Currently frame registrations are not purged, even when changing the interface type. This can lead to potentially weird / dangerous situations where frames possibly not relevant to a given interface type remain registered and mgmt_frame_register is not called for the no-longer-relevant frame types.
I'd argue really just "weird and non-working", hardly dangerous.
I think I may just have found a way that's sort of "dangerous" in the sense of breaking all of our tests, but hey.
Even in the mac80211 design where we want to not let you intercept e.g. AUTH frames in client mode - if you did, then you'd just end up with a non- working interface. Not sure I see any "dangerous situation". Not really an all that important distinction though.
Fair enough, I'm happy to drop / reword this language. It seemed fishy to me since the unregistration operation was not called at all, and the driver does go to some lengths to set up the valid frame registration types.
Sure.
However, I do wonder if we should make this more transactional, and hang on to them if the type switching fails. We're not notifying userspace that the registrations have disappeared, so if type switching fails and it continues to work with the old type rather than throwing its hands up and quitting or something, it'd make a possibly bigger mess to just silently have removed them already.
I do like that idea, not sure how to go about implementing it though? The failure case is a bit hard to deal with. Something like NL80211_EXT_FEATURE_LIVE_IFTYPE_CHANGE would help, particularly if nl80211/cfg80211 actually checked it prior to doing anything (e.g. disconnecting, etc). That would then take care of the majority of the 'typical' failure paths. I didn't add such checking in the other patch set since I felt you might find it overly intrusive on userspace. But maybe we really should do this?
As I just said on the other patch, I think we probably should do that there, if just to be able to advertise a correct set of interface types that you can switch between there. I don't see how it'd be more intrusive to userspace than failing later? :-)
So playing devil's advocate, another argument might be that by the time we got here, we've already tore down a bunch of state. E.g. disconnected the station, stopped AP, etc. So we've already side-effected state in a bunch of ways, what's one more?
True, fair point.
I *think* it should be safe to just move this after the switching succeeds, since the switching can pretty much only be done at a point where nothing is happening on the interface anyway, though that might confuse the driver when the remove happens.
I would concur as that is what happens today. But should it?
Well, dunno, what should happen? If you ask drivers they might want to remove & re-register after, for those registrations that are still possible.
It isn't currently clear to me if there are any guarantees on the driver operation call sequence that cfg80211 provides. E.g. can the driver expect rdev_change_virtual_intf to be called only once all the old registrations are purged and the new registrations are performed after the fact? Or should it expect things to just happen in any order?
Well, evidently it cannot rely on anything today, and for the most part I guess this is implemented in the software paths where it doesn't really matter (the same way that mac80211 implements it).
But it probably should be defined better.
What do you think?
A big part of me thinks that just wiping the slate clean and having userspace set it up from scratch isn't that much to ask and it might want to do that anyway. It might (a big maybe?) also make the driver's life easier if it can rely on certain guarantees from cfg80211. E.g. that all invalid registrations are purged.
Yeah, fair point.
I have seen wpa_s perform a bunch of register commands which bounce off with an -EALREADY. So it may already be erring on the side of caution and assuming that it needs to reset the state fully? Not sure.
I'm pretty sure that it does in fact go through a full reset (re-setup) after switching things around.
But if the kernel wants to be nice and spends some cycles figuring out which frame registrations to keep and re-register them, that is also fine with me.
Let's not then.
I've applied this patch now.
johannes
Hi Johannes,
'typical' failure paths. I didn't add such checking in the other patch set since I felt you might find it overly intrusive on userspace. But maybe we really should do this?
As I just said on the other patch, I think we probably should do that there, if just to be able to advertise a correct set of interface types that you can switch between there. I don't see how it'd be more intrusive to userspace than failing later? :-)
What I was worried about was that all the fullmac drivers would have had to be updated to set the feature bit, and it would have caused wpa_s/hostapd to no longer be able to do the whole set_iftype -> ebusy -> ifdown & set_iftype retry logic until all were updated.
I would concur as that is what happens today. But should it?
Well, dunno, what should happen? If you ask drivers they might want to remove & re-register after, for those registrations that are still possible.
I would think you'd want to define a clear order of operations that cfg80211 / mac80211 would enforce :)
Let's not then.
I've applied this patch now.
Great, thanks.
Regards, -Denis
linux-stable-mirror@lists.linaro.org