diff options
Diffstat (limited to 'package/kernel/mac80211/patches/390-mac80211-fix-a-race-between-restart-and-CSA-flows.patch')
-rw-r--r-- | package/kernel/mac80211/patches/390-mac80211-fix-a-race-between-restart-and-CSA-flows.patch | 86 |
1 files changed, 86 insertions, 0 deletions
diff --git a/package/kernel/mac80211/patches/390-mac80211-fix-a-race-between-restart-and-CSA-flows.patch b/package/kernel/mac80211/patches/390-mac80211-fix-a-race-between-restart-and-CSA-flows.patch new file mode 100644 index 0000000000..db2aa0443f --- /dev/null +++ b/package/kernel/mac80211/patches/390-mac80211-fix-a-race-between-restart-and-CSA-flows.patch @@ -0,0 +1,86 @@ +From: Emmanuel Grumbach <emmanuel.grumbach@intel.com> +Date: Fri, 31 Aug 2018 11:31:06 +0300 +Subject: [PATCH] mac80211: fix a race between restart and CSA flows + +We hit a problem with iwlwifi that was caused by a bug in +mac80211. A bug in iwlwifi caused the firwmare to crash in +certain cases in channel switch. Because of that bug, +drv_pre_channel_switch would fail and trigger the restart +flow. +Now we had the hw restart worker which runs on the system's +workqueue and the csa_connection_drop_work worker that runs +on mac80211's workqueue that can run together. This is +obviously problematic since the restart work wants to +reconfigure the connection, while the csa_connection_drop_work +worker does the exact opposite: it tries to disconnect. + +Fix this by cancelling the csa_connection_drop_work worker +in the restart worker. + +Note that this can sound racy: we could have: + +driver iface_work CSA_work restart_work ++++++++++++++++++++++++++++++++++++++++++++++ + | + <--drv_cs ---| +<FW CRASH!> +-CS FAILED--> + | | + | cancel_work(CSA) + schedule | + CSA work | + | | + Race between those 2 + +But this is not possible because we flush the workqueue +in the restart worker before we cancel the CSA worker. +That would be bullet proof if we could guarantee that +we schedule the CSA worker only from the iface_work +which runs on the workqueue (and not on the system's +workqueue), but unfortunately we do have an instance +in which we schedule the CSA work outside the context +of the workqueue (ieee80211_chswitch_done). + +Note also that we should probably cancel other workers +like beacon_connection_loss_work and possibly others +for different types of interfaces, at the very least, +IBSS should suffer from the exact same problem, but for +now, do the minimum to fix the actual bug that was actually +experienced and reproduced. + +Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> +Signed-off-by: Luca Coelho <luciano.coelho@intel.com> +Signed-off-by: Johannes Berg <johannes.berg@intel.com> +--- + +--- a/net/mac80211/main.c ++++ b/net/mac80211/main.c +@@ -255,8 +255,27 @@ static void ieee80211_restart_work(struc + + flush_work(&local->radar_detected_work); + rtnl_lock(); +- list_for_each_entry(sdata, &local->interfaces, list) ++ list_for_each_entry(sdata, &local->interfaces, list) { ++ /* ++ * XXX: there may be more work for other vif types and even ++ * for station mode: a good thing would be to run most of ++ * the iface type's dependent _stop (ieee80211_mg_stop, ++ * ieee80211_ibss_stop) etc... ++ * For now, fix only the specific bug that was seen: race ++ * between csa_connection_drop_work and us. ++ */ ++ if (sdata->vif.type == NL80211_IFTYPE_STATION) { ++ /* ++ * This worker is scheduled from the iface worker that ++ * runs on mac80211's workqueue, so we can't be ++ * scheduling this worker after the cancel right here. ++ * The exception is ieee80211_chswitch_done. ++ * Then we can have a race... ++ */ ++ cancel_work_sync(&sdata->u.mgd.csa_connection_drop_work); ++ } + flush_delayed_work(&sdata->dec_tailroom_needed_wk); ++ } + ieee80211_scan_cancel(local); + + /* make sure any new ROC will consider local->in_reconfig */ |