On 23.04.22 08:21, Kalle Valo wrote:
Alexander Wetzel <alexander@wetzel-home.de> writes:
Using non-existent queues can panic the kernel with rtl8180/rtl8185 cards. Ignore the skb priority for those cards; they only have one tx queue.
Cc: stable@vger.kernel.org
Reported-by: pa@panix.com
Tested-by: pa@panix.com
Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
Pierre Asselin (pa@panix.com) reported a kernel crash in the Gentoo forum: https://forums.gentoo.org/viewtopic-t-1147832-postdays-0-postorder-asc-start... He also confirmed that this patch fixes the issue.
In summary, this happened: After updating wpa_supplicant from 2.9 to 2.10, the kernel crashed with a "divide error: 0000" when connecting to an AP. wpa_supplicant started to use control port tx in 2.10, and control port tx now tries to use IEEE80211_AC_VO for the priority.
Since only the rtl8187se part of the driver supports QoS, the priority of the skb is set to IEEE80211_AC_BE (2) by mac80211 for rtl8180/rtl8185 cards.
rtl8180 then unconditionally reads out the priority and, without this patch, finally crashes in drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c line 544:

    idx = (ring->idx + skb_queue_len(&ring->queue)) % ring->entries
"ring->entries" is zero for rtl8180/rtl8185 cards because tx_ring[2] never got initialized.
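The failure mode described above can be sketched in plain userspace C. This is only a simplified model under stated assumptions, not the driver code and not the submitted patch: `struct tx_ring_model`, `select_ring()`, `next_slot()` and `NUM_TX_RINGS` are hypothetical names made up for illustration, and `queued` merely stands in for `skb_queue_len(&ring->queue)`.

```c
#include <assert.h>

#define NUM_TX_RINGS 4 /* hypothetical: one ring per access category */

/* Simplified stand-in for the driver's tx ring bookkeeping. */
struct tx_ring_model {
	unsigned int idx;     /* first in-flight slot */
	unsigned int entries; /* ring size; 0 means "never initialized" */
	unsigned int queued;  /* models skb_queue_len(&ring->queue) */
};

/*
 * Pick the ring for a frame. Cards without QoS only set up ring 0,
 * so the skb priority is ignored for them.
 */
unsigned int select_ring(int has_qos, unsigned int prio)
{
	assert(prio < NUM_TX_RINGS);
	return has_qos ? prio : 0;
}

/*
 * Compute the next free slot. Returns -1 for an uninitialized ring
 * instead of performing the modulo by zero that caused the
 * "divide error".
 */
int next_slot(const struct tx_ring_model *ring)
{
	if (ring->entries == 0)
		return -1;
	return (int)((ring->idx + ring->queued) % ring->entries);
}
```

With only ring 0 initialized, a frame tagged with priority 2 lands on ring 0 once the priority is ignored, while indexing ring 2 directly would hit the zero-sized ring.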
All of this after the "---" line is very useful information, but the actual commit log is just two sentences. I would copy all of it to the commit log. We don't need to limit the size of the commit log; on the contrary, we should include all the information in it.
I see what you mean, fine with me. If you prefer, I can also send an update, but feel free to handle that at your convenience. If you see a better way to do it, for example, drop the patch and simply submit your version.
While I spent some time figuring out how QoS is intended to work, and I'm pretty sure I finally got the outline of it, I'm still wondering why we never set the priority for skbs on the normal transmit path.
Obviously the idea is to keep the queue chosen by whoever set it before us and to overwrite it only with good reason.
I plan to look a bit more into that, especially since Pierre's system was working when wpa_supplicant was not using the control port. Thus skb_get_queue_mapping() must return zero - or at most one - on that path. That only makes sense if the network subsystem knows that QoS is not supported and does not bother to set the queue. (Or if we mapped zero to IEEE80211_AC_BE, but we are not handling it that way.)
It basically boils down to the fact that we only call _ieee80211_select_queue() on the normal tx path for drivers supporting wake_tx_queue. I would have expected that call to be made for all drivers. (Or at least for all drivers supporting QoS.)
So there is either a strange bug or - so far more likely - some serious gap in my still evolving understanding of QoS.