Meta-Learning Multi-armed Bandits for Beam Tracking in 5G and 6G Networks
By: Alexander Mattick, George Yammine, Georgios Kontes and more
Beamforming-capable antenna arrays with many elements enable higher data rates in next-generation 5G and 6G networks. In current practice, analog beamforming uses a codebook of pre-configured beams, each radiating in a specific direction, and a beam management function continuously selects optimal beams for moving user equipments (UEs). However, large codebooks and the effects of beam reflections and blockages make optimal beam selection challenging. Previous work and standardization efforts opt for supervised learning, training classifiers to predict the next best beam from previously selected beams. In contrast, we formulate the problem as a partially observable Markov decision process (POMDP) and model the environment as the codebook itself. At each time step, we select a candidate beam conditioned on the belief state of the unobservable optimal beam and on previously probed beams. This frames beam selection as an online search procedure that locates the moving optimal beam. Unlike previous work, our method handles new or unforeseen trajectories and changes in the physical environment, and outperforms previous work by orders of magnitude.
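The idea of probing a beam conditioned on a belief over the unobservable optimal beam can be illustrated with a small discrete Bayes filter. This is a minimal sketch and not the authors' meta-learned bandit policy: the circular codebook layout, the random-walk motion model (`stay_prob`), the Gaussian-shaped gain pattern (`gain_width`), the noise levels, and the greedy "probe the most likely beam" rule are all assumptions made for illustration, and the function names `predict`, `update`, and `track` are hypothetical.

```python
import numpy as np


def predict(belief, stay_prob=0.6):
    """Motion step: the optimal beam stays put or drifts to a neighbouring index.

    Assumes a circular codebook, so index 0 and index n-1 are neighbours.
    """
    move = (1.0 - stay_prob) / 2.0
    spread = (stay_prob * belief
              + move * np.roll(belief, 1)
              + move * np.roll(belief, -1))
    return spread / spread.sum()


def update(belief, probed, rss, gain_width=1.5, noise_std=0.1):
    """Bayes update after probing one beam and measuring a normalised signal strength."""
    n = belief.size
    idx = np.arange(n)
    # Circular distance between each hypothesised optimal beam and the probed beam.
    dist = np.minimum(np.abs(idx - probed), n - np.abs(idx - probed))
    # Assumed gain pattern: the measurement decays with distance from the optimum.
    expected = np.exp(-0.5 * (dist / gain_width) ** 2)
    likelihood = np.exp(-0.5 * ((rss - expected) / noise_std) ** 2)
    posterior = belief * likelihood + 1e-12  # guard against an all-zero posterior
    return posterior / posterior.sum()


def track(true_beams, n_beams=16, seed=0):
    """Greedy online search: probe the currently most likely beam at each step."""
    rng = np.random.default_rng(seed)
    belief = np.full(n_beams, 1.0 / n_beams)
    probes = []
    for true_beam in true_beams:
        belief = predict(belief)
        probed = int(np.argmax(belief))
        # Simulated measurement from the same assumed gain pattern plus noise.
        d = min(abs(true_beam - probed), n_beams - abs(true_beam - probed))
        rss = np.exp(-0.5 * (d / 1.5) ** 2) + rng.normal(0.0, 0.05)
        belief = update(belief, probed, rss)
        probes.append(probed)
    return probes, belief


if __name__ == "__main__":
    probes, belief = track([5, 5, 6, 6, 7, 7, 8])
    print("probed beams:", probes)
    print("most likely optimal beam:", int(np.argmax(belief)))
```

A single measurement only reveals how far the probed beam is from the optimum, not in which direction, so a greedy rule like this one can take several probes to resolve the ambiguity. A learned probing policy is one way to choose more informative probes than this hand-written rule does.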
Similar Papers
Online Learning-based Adaptive Beam Switching for 6G Networks: Enhancing Efficiency and Resilience
Networking and Internet Architecture
Makes wireless internet faster and more reliable.
Environment-Aware Transfer Reinforcement Learning for Sustainable Beam Selection
Machine Learning (CS)
Makes wireless signals faster and use less power.
Physics-Informed Parametric Bandits for Beam Alignment in mmWave Communications
Machine Learning (CS)
Finds best signal for faster wireless internet.