Performance Evaluation of Multi-Armed Bandit Algorithms for Wi-Fi Channel Access
By: Miguel Casasnovas, Francesc Wilhelmi, Richard Combes, et al.
Potential Business Impact:
Makes Wi-Fi faster by learning how to use channels.
The adoption of dynamic, self-learning solutions for real-time wireless network optimization has recently gained significant attention due to the limited adaptability of existing protocols. This paper investigates multi-armed bandit (MAB) strategies as a data-driven approach to decentralized, online channel access optimization in Wi-Fi, targeting three dynamic channel-access settings: primary channel selection, channel width, and contention window (CW) adjustment. Key design aspects are examined, including the adoption of joint versus factorial action spaces, the inclusion of contextual information, and the nature of the action-selection strategy (optimism-driven, unimodal, or randomized). State-of-the-art algorithms and a proposed lightweight contextual approach, E-RLB, are evaluated through simulations. Results show that contextual and optimism-driven strategies consistently achieve the highest performance and fastest adaptation under recurrent conditions. Unimodal structures require careful graph construction to ensure that the unimodality assumption holds. Randomized exploration, as adopted in the proposed E-RLB, can induce disruptive parameter reallocations, especially in multi-player settings. Decomposing the action space across several specialized agents accelerates convergence but increases sensitivity to randomized exploration and demands coordination under shared rewards to avoid correlated learning. Finally, despite the inherent inefficiencies of epsilon-greedy exploration, E-RLB demonstrates effective adaptation and learning, highlighting its potential as a viable low-complexity solution for realistic dynamic deployments.
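To make the randomized-exploration idea concrete, the sketch below shows a generic epsilon-greedy bandit loop of the kind E-RLB builds on. This is not the paper's algorithm: the arm set, reward function, and all names here are illustrative assumptions, with each arm standing in for one channel-access configuration (e.g. a primary channel / CW combination).

```python
import random

def epsilon_greedy_bandit(reward_fn, n_arms, n_rounds, epsilon=0.1, seed=0):
    """Generic epsilon-greedy MAB loop (illustrative, not E-RLB itself).

    Each arm could represent one Wi-Fi channel-access configuration,
    e.g. a (primary channel, channel width, CW) tuple in a joint
    action space. `reward_fn(arm)` stands in for observed throughput.
    """
    rng = random.Random(seed)
    counts = [0] * n_arms    # times each arm was played
    values = [0.0] * n_arms  # running mean reward per arm
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Randomized exploration: pick a uniformly random arm.
            # In multi-player settings this is the step that can cause
            # disruptive parameter reallocations.
            arm = rng.randrange(n_arms)
        else:
            # Exploit the empirically best arm so far.
            arm = max(range(n_arms), key=lambda a: values[a])
        r = reward_fn(arm)
        counts[arm] += 1
        values[arm] += (r - values[arm]) / counts[arm]  # incremental mean
    return values, counts
```

A factorial decomposition, as discussed above, would run one such agent per setting (primary channel, width, CW) over smaller per-setting arm sets instead of one agent over their Cartesian product, which speeds convergence but couples the agents through the shared reward.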
Similar Papers
Learning-Based Channel Access in Wi-Fi: A Multi-Armed Bandit Approach
Networking and Internet Architecture
Makes Wi-Fi faster by learning how to share.
Decentralized Asynchronous Multi-player Bandits
Machine Learning (CS)
Helps devices share wireless channels without colliding.
Non-Stationary Restless Multi-Armed Bandits with Provable Guarantee
Machine Learning (CS)
Helps computers learn when things change.